Batch reinforcement learning for network-safe demand response in unknown electric grids



We formulate a batch reinforcement learning-based demand response approach to prevent distribution network constraint violations in unknown grids. We use fitted Q-iteration to compute a network-safe policy from historical measurements for thermostatically controlled load aggregations providing frequency regulation. We test our approach in a numerical case study based on real load profiles from Austin, TX, and compare its performance to a greedy, grid-aware approach and a standard, grid-agnostic approach. The average tracking root mean square error is 0.0932 for our approach, compared with 0.0600 and 0.0614 for the grid-aware and grid-agnostic implementations, respectively. The case study shows that our approach reduces the total number of rounds with at least one constraint violation by 95%, on average, relative to the grid-agnostic approach. Working under limited information, our approach thus offers lower but acceptable setpoint tracking performance while ensuring safer distribution network operations.
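As a rough illustration of the fitted Q-iteration idea mentioned in the abstract, the sketch below learns a Q-function from a fixed batch of transitions by repeatedly regressing Bellman targets. It is not the authors' implementation: the two-state toy problem, the penalty of -1 for landing in a "violation" state, the one-hot features, and the least-squares regressor are all assumptions chosen to keep the example small, and the names `features` and `fitted_q_iteration` are hypothetical.

```python
import numpy as np

def features(s, a, n_states, n_actions):
    """One-hot encoding of a (state, action) pair (illustrative assumption)."""
    phi = np.zeros(n_states * n_actions)
    phi[s * n_actions + a] = 1.0
    return phi

def fitted_q_iteration(batch, n_states, n_actions, gamma=0.9, n_iters=50):
    """Learn Q from a fixed batch of (s, a, r, s_next) tuples.

    Each iteration builds the targets r + gamma * max_a' Q_k(s_next, a')
    and refits the regressor (here, ordinary least squares) on them.
    """
    X = np.array([features(s, a, n_states, n_actions) for s, a, _, _ in batch])
    r = np.array([t[2] for t in batch], dtype=float)
    s_next = np.array([t[3] for t in batch])
    w = np.zeros(n_states * n_actions)  # Q_0 = 0 everywhere
    for _ in range(n_iters):
        q = w.reshape(n_states, n_actions)
        y = r + gamma * q[s_next].max(axis=1)      # Bellman targets
        w, *_ = np.linalg.lstsq(X, y, rcond=None)  # regression step
    return w.reshape(n_states, n_actions)

# Hypothetical toy data: state 0 = "safe", state 1 = "violation".
# Action 0 keeps the current state, action 1 switches to the other state.
# Landing in state 1 is penalized with a reward of -1.
batch = [
    (0, 0, 0.0, 0),
    (0, 1, -1.0, 1),
    (1, 0, -1.0, 1),
    (1, 1, 0.0, 0),
]

Q = fitted_q_iteration(batch, n_states=2, n_actions=2)
policy = Q.argmax(axis=1)  # greedy action for each state
print(policy)
```

In this toy setting the greedy policy stays in the safe state and leaves the violation state, using only the recorded transitions and no model of the underlying system. A practical version would replace the one-hot least-squares fit with a regressor suited to continuous measurements.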

14 pages

This cahier was revised in March 2022



Batch Reinforcement Learning for Network-Safe Demand Response in Unknown Electric Grids
22nd Power Systems Computation Conference (PSCC 2022), Porto, Portugal, 2022


G2155R.pdf (600 KB)