Energy Data Analysis with Pandas

ENTSO-E, the association of Europe's power grid operators, provides an open-data transparency platform with a wealth of data about the state of the power grid in its member countries. Among other uses, this data also powers websites like

In order to access the REST API, one needs to register a user account on the site and request API access via an email to the support help-desk following the instructions here.

There is also a Python client for this API, which converts the raw XML data into Pandas dataframes. Pandas is a Swiss army knife for dataset manipulation and one of the reasons why Python is so popular among data scientists.
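As a small illustration of that kind of manipulation, here is a minimal sketch that resamples quarter-hourly power readings to hourly means. The numbers are made up and do not come from the ENTSO-E API, and the variable names are hypothetical. Because each resampled bucket is exactly one hour long, the mean power in MW is numerically equal to the energy in MWh for that hour.

```python
import pandas as pd

# Hypothetical quarter-hourly power readings in MW (synthetic, not ENTSO-E data)
index = pd.date_range('2022-01-01 00:00', periods=8, freq='15min', tz='UTC')
power_mw = pd.Series([100.0, 200.0, 300.0, 400.0, 0.0, 0.0, 50.0, 150.0],
                     index=index)

# Average the four readings of each hour: the result is the mean power in MW
# and, since every bucket spans exactly one hour, also the energy in MWh
hourly = power_mw.resample('60min').mean()

# Summing the hourly values "integrates" power into total energy (MWh)
energy_mwh = hourly.sum()
print(hourly)
print('Total energy: %.1f MWh' % energy_mwh)
```

Integrating the raw readings directly (each value times 0.25 h) gives the same total, which is why uniform hourly means are a convenient intermediate form.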

The following code shows how to do a simple ad-hoc analysis with the granular time-series data returned by the API. The example is a very simplistic back-of-the-envelope estimate of the idealized storage that would be needed to align variable solar and wind energy production with the fluctuation of demand over the same period. It is used in the blog post on Moving variable renewables from "pay-as-produced" to "pay-as-needed".

from entsoe import EntsoePandasClient
import pandas as pd

client = EntsoePandasClient(api_key='<api key>')

start = pd.Timestamp('20220101', tz='Europe/Madrid')
end = pd.Timestamp('20230101', tz='Europe/Madrid')
country = 'ES'
country_name = 'Spain'

# query day-ahead wind & solar forecast & actual load timeseries
forecast = client.query_wind_and_solar_forecast(country, start=start, end=end)
load = client.query_load(country, start=start, end=end)

# Resample time-series into uniform hourly resolution for easy
# "integration" from power to energy (MW / MWh)
forecast = forecast.resample('H').agg('mean')
load = load.resample('H').agg('mean')

forecast['combined'] = forecast['Solar'].add(forecast['Wind Onshore'])

# Averages of all the metrics over the whole period
load_mean = load['Actual Load'].mean()
solar_mean = forecast['Solar'].mean()
wind_mean = forecast['Wind Onshore'].mean()
combined_mean = forecast['combined'].mean()

# "Simulate" storage to reshape generation into desired output
sim = pd.DataFrame()
# Instant generation power is set to hourly renewable forecast
sim['gen_power'] = forecast['combined']
# Set target output to a constant fraction of the actual load that corresponds
# to the yearly share of energy production potential
sim['out_power'] = load['Actual Load'] * (combined_mean / load_mean)
# Compute surplus/shortfall power for each hour as the flow in and out of storage
sim['charge_discharge_power'] = sim['gen_power'].sub(sim['out_power'])
# Storage "state of charge" as the running tab of charge & discharge contributions
sim['storage_soc'] = sim['charge_discharge_power'].cumsum()

total_energy = sim['out_power'].sum()
# Storage connection power needed is the highest charge
# or discharge movement for any of the time intervals
storage_power = max(abs(sim['charge_discharge_power'].max()),
                    abs(sim['charge_discharge_power'].min()))
# Storage capacity is the delta between highest and lowest
# fill-levels of storage state of charge
storage_capacity = sim['storage_soc'].max() - sim['storage_soc'].min()
# Energy cycled through storage is the "integral" over the inflows
# (hours with surplus generation) only
stored_energy = sim[sim['charge_discharge_power'] > 0]['charge_discharge_power'].sum()
storage_duration = storage_capacity / storage_power
storage_ratio = storage_capacity / total_energy
stored_energy_ratio = stored_energy / total_energy

print('VRES generation potential for %s: %.2f%% of load (%.2f%% Solar / %.2f%% Wind)'
      % (country_name, combined_mean / load_mean * 100,
         solar_mean / load_mean * 100, wind_mean / load_mean * 100))
print('Energy output %.3f TWh' % (total_energy / 1000000,))
print('Storage dimensions: %.3f TWh @ %.3f GW (%d h duration)'
      % (storage_capacity / 1000000, storage_power / 1000, storage_duration))
print('Storage capacity relative to produced energy %.2f%%'
      % (storage_ratio * 100,))
print('Share of energy cycled through storage %.2f%%'
      % (stored_energy_ratio * 100,))

Producing the following output:

VRES generation potential for Spain: 38.29% of load (13.28% Solar / 25.01% Wind)
Energy output 90.318 TWh
Storage dimensions: 3.306 TWh @ 17.575 GW (188 h duration)
Storage capacity relative to produced energy 3.66%
Share of energy cycled through storage 18.75%
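
To see what the storage arithmetic above does on numbers small enough to check by hand, here is a toy sketch with six made-up hourly values. This is synthetic data, not ENTSO-E output, and the `toy` frame and its `load` column are hypothetical names chosen for illustration.

```python
import pandas as pd

# Synthetic hourly values in MW (made-up numbers, not ENTSO-E data)
toy = pd.DataFrame({
    'gen_power': [0.0, 4.0, 8.0, 4.0, 0.0, 2.0],
    'load': [6.0, 6.0, 6.0, 12.0, 12.0, 6.0],
})

# Scale the load shape so the target output carries exactly as much
# energy over the whole period as the generation does
toy['out_power'] = toy['load'] * (toy['gen_power'].mean() / toy['load'].mean())
# Positive values charge the storage, negative values discharge it
toy['charge_discharge_power'] = toy['gen_power'] - toy['out_power']
# Running sum of hourly flows gives the fill level relative to the start
toy['storage_soc'] = toy['charge_discharge_power'].cumsum()

flows = toy['charge_discharge_power']
storage_power = max(abs(flows.max()), abs(flows.min()))
storage_capacity = toy['storage_soc'].max() - toy['storage_soc'].min()
stored_energy = flows[flows > 0].sum()

print(toy)
print('Power %.2f MW, capacity %.2f MWh, cycled %.2f MWh'
      % (storage_power, storage_capacity, stored_energy))
```

In this toy case the state of charge dips below zero in the first hour, which only means the storage would have to start partly filled. The capacity estimate uses the spread between the highest and lowest level, so it does not depend on that starting point. Since output and generation carry the same total energy, the state of charge returns to zero at the end and the charged energy equals the discharged energy.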