In the fall of 2017 I started my first internship at ISO New England, New England's non-profit electric grid operator. ISO-NE is responsible for operating the bulk electric grid for the six New England states, designing and running the wholesale energy markets, and planning to ensure electricity needs will be met over the next ten years. The team I worked on, IT-EMS Day-Ahead Support, develops and maintains software that the operators running the power grid use every minute. This includes software for accepting bids from energy market participants, solving for the most economical way to schedule power generation, and forecasting electricity demand, wind generation, and solar generation.
My work was mostly dedicated to short-term energy demand forecasting. Power system operators need as accurate a picture as possible of how much generation must be scheduled over the upcoming week. The forecast for the next day is the most critical, as it feeds into our energy markets to make sure enough generation is purchased and available to meet upcoming electric demand. An accurate forecast is critical for both ensuring grid reliability and keeping costs down for the people of New England.
I developed a new machine learning system to forecast energy demand for the next seven days. The solution currently uses LightGBM as the underlying ML framework, but also employs several other tricks to squeeze out as much accuracy as possible, such as correcting errors temporally (similar to a moving average) and upweighting certain critical instances. The project also included a real software engineering component: building a system that integrates with our existing databases and runs reliably day after day.
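To make those two tricks concrete, here is a minimal sketch of how upweighting critical instances and a moving-average residual correction could sit on top of a LightGBM model. The function names, the 5x weight, and the 24-hour window are illustrative assumptions, not the production values.

```python
import numpy as np
import pandas as pd
import lightgbm as lgb


def train_forecaster(X: pd.DataFrame, y: pd.Series,
                     critical: np.ndarray) -> lgb.LGBMRegressor:
    """Fit a LightGBM regressor, upweighting rows flagged as critical
    (e.g. peak-load hours) so their errors count more during training."""
    weights = np.where(critical, 5.0, 1.0)  # 5x is an arbitrary illustrative choice
    model = lgb.LGBMRegressor(n_estimators=500, learning_rate=0.05)
    model.fit(X, y, sample_weight=weights)
    return model


def corrected_forecast(model: lgb.LGBMRegressor, X_new: pd.DataFrame,
                       recent_actuals: np.ndarray, recent_preds: np.ndarray,
                       window: int = 24) -> np.ndarray:
    """Shift the raw forecast by the moving average of recent residuals,
    so a persistent bias over the last `window` hours gets corrected."""
    residuals = np.asarray(recent_actuals) - np.asarray(recent_preds)
    bias = residuals[-window:].mean()
    return model.predict(X_new) + bias
```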
I created a web app using RShiny to make it as easy as possible to view our data and forecasts, quickly analyze their relative performance, and provide operational metrics. It includes views highlighting how much of our error comes from solar PV forecasts, along with interactive demos of how several of the techniques used in the model impact the final output.
I also worked on several experimental projects to streamline or improve our processes. One was quantifying the uncertainty in our forecasts. As with any other ML system, the load forecast will never be 100% accurate. But could we derive bounds or intervals that the forecast can be expected to fall within most of the time, or at least classify a day as likely to have high error? To answer this question I worked on a project that broke errors down into two components: errors that come from the difference between the forecasted weather and the weather that actually materializes, and errors inherent to the model itself. The motivation is that days with very hot temperatures are likely to have higher errors than days with mild temperatures. So to estimate model error, the general idea was to condition on the new sample and look at the errors of similar historical samples. Combining these two error estimates gives the desired metric, which can be used to compare instances to each other and flag upcoming load forecasts that power system operators should be more wary of.
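Here is a minimal sketch of that conditioning step, assuming the history is a table of weather features plus the realized absolute model errors. The k-nearest-neighbor conditioning, the column names, and the 80% interval are simplified illustrations of the idea, not the production implementation.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import NearestNeighbors


def conditional_error_interval(history: pd.DataFrame,
                               new_day: pd.Series,
                               feature_cols: list[str],
                               k: int = 30) -> tuple[float, float]:
    """Estimate the model-error component for a new day from the errors
    of the k historical days with the most similar weather."""
    X = history[feature_cols].to_numpy()
    # Standardize so features on different scales (temperature, dew point,
    # cloud cover) contribute comparably to the distance metric.
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    nn = NearestNeighbors(n_neighbors=k).fit((X - mu) / sigma)

    query = ((new_day[feature_cols].to_numpy() - mu) / sigma).reshape(1, -1)
    _, idx = nn.kneighbors(query)

    similar_errors = history["abs_error_mw"].to_numpy()[idx[0]]
    # An 80% interval: most weather-similar days saw errors inside these bounds.
    return np.quantile(similar_errors, 0.1), np.quantile(similar_errors, 0.9)
```

Paired with an estimate of the weather-forecast error component, an interval like this gives a day-over-day comparable signal for when the upcoming forecast deserves extra scrutiny.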