Skip to main content

Linear Programming with Energy Data

In the previous post, we made some back of the envelope estimation of what an optimal solar, wind & storage based power grid could look like. For the analysis we were using a standard linear programming model using open-data from the ENTSO-E transparency platform as well as a Python based linear programming (LP) framework.

For the LP framework we chose PuLP for its beginner friendly fluent and natural sounding model definition. See this tutorial for a more detailed introduction into LP with PuLP or this this  comparison of some popular Python LP frameworks. Understanding and debugging the output of an LP solver is non-obvious for casual users. Hence, being able to translate the mathematical model into what looks like natural python code can help to prevent mistakes during model setup.

The PuLP framework can interface with different external solver backends out of which the CBC solver is packaged with the PuLP distribution.

As input data, we can get time-series data with hourly granularity for a given year for the total load (electricity consumption) as well as the total forecasted production potential for solar, onshore and offshore wind with an hourly granularity as well. This grid-level granularity data essentially represents the demand and production capacity of a given grid-zone (typically a country) in aggregate.

The key question to ask the model is by how much would the existing capacity for each of the renewable source categories need to be scaled up such that it could satisfy the current demand in combination with an assumed storage system.

The objective of the model is to minimize the annual cost which can satisfy the operational balancing constraints during each hourly interval.

Since the forecasting data provides estimates of electricity production volume for each hour, we are using the levelized cost of electricity in Euro per MWh for each of the source types to determine to total cost of electricity produced, without knowing what would be installed generation capacity required. We are making the assumption that the current production profile or a country or region could be scaled up linearly while maintaining the same hourly distribution of generation over time.

For the storage model we are going to estimate the charge & discharge power and storage capacity respectively that would need to be installed to satisfy the constraints and are using and capital and operating cost estimate to translate these installed capacities into an annualised cost assuming  a 3% interest rate over 15 or 25 year amortisation periods for the capital cost. The storage system is broken up into  lithium-ion battery inspired short-duration storage (SDS) and H2 inspired long-duration storage (LDS) respectively. Which means the SDS systems has a single, shared charge and discharge capacity, while the LDS can be charged and discharged independently using different power conversion systems.

Which all this results in the following hard-coded cost assumptions:

Sources = enum.Enum('Sources', ['PV', 'ONW', 'OFFW'])
StorageConfig = namedtuple('StorageConfig', 'name cp_cost, dp_cost, c_loss, d_loss, capacity_cost')

# LCOE for renewable energy sources in Euro per MWh
# ( and
lcoe = {
    Sources.PV : 55,
    Sources.ONW : 60,
    Sources.OFFW : 90

# Annualised cost for storage technologies in Euro per MW / MWh respectively
# (
# / Appendix C)
inverter_cost = amount.calculate_amortization_amount(100000, 0.03, 15)
battery_cost = amount.calculate_amortization_amount(125000, 0.03, 15)
sds = StorageConfig('Li-ion', inverter_cost/2, inverter_cost/2, 0.05, 0.05,

h2_electrolyzer_cost = amount.calculate_amortization_amount(450000, 0.03, 25) + 9000
h2_ccgt_cost = amount.calculate_amortization_amount(750000, 0.03, 25) + 15000
h2_storage_cost = amount.calculate_amortization_amount(2000, 0.03, 25)
lds = StorageConfig('H2 P2X2P', h2_electrolyzer_cost, h2_ccgt_cost, 0.2, 0.4,

To express the cost function requires the definition of a few global variables for things like the global capacity and cost allocations, as well as time-series indexed variables to express constraints for each of the 8760 hourly intervals in a year.

The system constraints are relatively simple: the power balance requires that for each interval, the sum of all power generation and consumption needs to balance out. For each storage system the SOC constraint requires that the state of charge in the next interval is the current state of charge adjusted for any charge and discharge - taking into account losses.

At any time, capacity constraints need to be respected, i.e. state of charge cannot exceed the storage capacity or charge and discarge power cannot exceed the respective power capacity limits.

Storage should be a closed, cyclical system over the year which is arbitrarily enforced by making the state of charge after the last interval to be the same as the original one.

Finally the availability or coverage constraint defines what percentage of overall demand must be satisfied by the system.

model = pulp.LpProblem('RE_deployment', pulp.LpMinimize)

source_factors = pulp.LpVariable.dicts('source_factors', list(Sources), 0)
sds_power = pulp.LpVariable('sds_power', 0)
sds_capacity = pulp.LpVariable('sds_capacity', 0)
lds_c_power = pulp.LpVariable('lds_c_power', 0)
lds_d_power = pulp.LpVariable('lds_d_power', 0)
lds_capacity = pulp.LpVariable('lds_capacity', 0)
total_curtailment = pulp.LpVariable('total_curtailment', 0)
total_shortfall = pulp.LpVariable('total_shortfall', 0)

# Expressing the cost function to be minimised as the sum of all annual costs or energy production
# from all types of renewable sources as well as the storage capacity costs.
# Note the small bonus term for curtailed surplus production to avoid storage over-utilization
# artifacts.
model += (pulp.lpSum(source_factors[src] * source_total[src] * lcoe[src] for src in Sources)
          + sds_power * (sds.cp_cost + sds.dp_cost) + sds_capacity * sds.capacity_cost
          + lds_c_power * lds.cp_cost + lds_d_power * lds.dp_cost + lds_capacity * lds.capacity_cost
          - 10e-6 * total_curtailment
          ), 'minimize yearly cost'

# enumerate all the value of the time-series index
hours = list(range(ts_size))

# Create time-series indexed variables
sds_charge = pulp.LpVariable.dicts('sds_charge', hours, 0)
sds_discharge = pulp.LpVariable.dicts('sds_discharge', hours, 0)
sds_soc = pulp.LpVariable.dicts('sds_soc', list(range(ts_size + 1)), 0)
lds_charge = pulp.LpVariable.dicts('lds_charge', hours, 0)
lds_discharge = pulp.LpVariable.dicts('lds_discharge', hours, 0)
lds_soc = pulp.LpVariable.dicts('lds_soc', list(range(ts_size + 1)), 0)
curtailment = pulp.LpVariable.dicts('curtailment', hours, 0)
shortfall = pulp.LpVariable.dicts('shortfall', hours, 0)

for i in hours:
    # primary power balance constraint
    model += (pulp.lpSum(source_factors[src] * source_ts[src][i] for src in Sources)
              + sds_discharge[i] - sds_charge[i]
              + lds_discharge[i] - lds_charge[i]
              - curtailment[i] + shortfall[i] == load_ts[i])

    # storage charge/discharge power limit
    model += (sds_discharge[i] + sds_charge[i] <= sds_power)
    model += (lds_charge[i] <= lds_c_power)
    model += (lds_discharge[i] <= lds_d_power)

    # State of charge constraints
    model += (sds_soc[i] - sds_discharge[i] * (1 + sds.d_loss) + sds_charge[i] * (1 - sds.c_loss) == sds_soc[i + 1])
    model += (lds_soc[i] - lds_discharge[i] * (1 + lds.d_loss) + lds_charge[i] * (1 - lds.c_loss) == lds_soc[i + 1])

    # State of charge cannot exceed storage capacity
    model += (sds_soc[i] <= sds_capacity)
    model += (lds_soc[i] <= lds_capacity)

# Yearly cycle should represent a closed loop
model += (sds_soc[0] == sds_soc[ts_size])
model += (lds_soc[0] == lds_soc[ts_size])

# constraint for availability. I.e. how much of the demand should be met by this system.
model += (pulp.lpSum(shortfall[i] for i in hours) <= load_total * (1 - availability_factor))

# Convenience rollup from ts into total values for curtailment and shortfall
model += (pulp.lpSum(curtailment[i] for i in hours) - total_curtailment == 0)
model += (pulp.lpSum(shortfall[i] for i in hours) - total_shortfall == 0)

# solve
results = model.solve()

The load and production forecast data which are used as the constant parameters in the linear inequality constraints for each interval in the time-series can be obtained from ENTSO-E API. We resample the time series from potentially 15min granularity into hourly to facilitate an easy conversion between power levels and energy (1 MW for 1h -> 1MWh).

client = EntsoePandasClient(api_key='<api-key>')

country = 'NL'
start = pd.Timestamp('20220401', tz='Europe/Berlin')
end = pd.Timestamp('20230401', tz='Europe/Berlin')

forecast = client.query_wind_and_solar_forecast(country, start=start, end=end)
load = client.query_load(country, start=start, end=end)

# Align with hourly resolution for easy conversion from power to energy (MH / MWh)
forecast = forecast.resample('H').agg('mean').fillna(0)
load = load.resample('H').agg('mean').fillna(0)

load_ts = load['Actual Load']
load_total = load_ts.sum()

# create an empty/zero-valued time-series for missing source types
forecast['default'] = 0.0

source_ts = {
    Sources.PV: forecast.get('Solar', forecast['default']),
    Sources.ONW: forecast.get('Wind Onshore', forecast['default']),
    Sources.OFFW: forecast.get('Wind Offshore', forecast['default'])

source_total =dict([(source, source_ts[source].sum()) for source in Sources])

ts_size = min([load_ts.size] + [ts.size for ts in source_ts.values()])