energy_analysis_toolbox.thermosensitivity.thermosensitivity module
Process the thermosensitivity data.
# Available classes
## ThermoSensitivity
Class to compute the thermosensitivity of a building. Needs a time series of energy consumption and outdoor temperature.
## CategoricalThermoSensitivity
Class to compute the thermosensitivity of a building with labeled periods. Needs a time series of energy consumption, outdoor temperature, and labeled periods.
The labeled periods are resampled to the same frequency as the energy and temperature data by taking the most common category in the period.
Currently, the class only calibrates one base temperature for all the categories aggregated.
# Implementation details
## Resampling frequency
The energy and temperature data are resampled at a given frequency. The degree days are computed at the same frequency.
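As a minimal sketch of this resampling step, the snippet below uses plain pandas rather than the toolbox's own helpers. It assumes energy is summed and temperature is averaged over each period, as described for the aggregated data further down; the hourly series and their values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly data over three days (values are illustrative only).
index = pd.date_range("2023-01-01", periods=72, freq="h")
energy = pd.Series(1.0, index=index, name="energy")  # 1 kWh per hour
temperature = pd.Series(np.tile(np.arange(24.0), 3), index=index, name="temperature")

frequency = "1D"
daily = pd.DataFrame(
    {
        # Energy is an extensive quantity: total it over each period.
        "energy": energy.resample(frequency).sum(),
        # Temperature is an intensive quantity: average it over each period.
        "temperature": temperature.resample(frequency).mean(),
    }
).dropna()  # drop periods with missing values
print(daily)
```

Summing energy and averaging temperature keeps both columns physically meaningful at the coarser frequency.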
## Thermo Sensitivity
The thermo-sensitivity is modelled as a linear regression between the energy consumption and the degree days.
\[E \sim E_0 + TS \times DegreeDays\]
The degree days are computed from the temperature data and the base temperature.
\[DegreeDays = \int \max(0, BaseTemperature - T(t)) \, dt\]
Different methods are available to compute the degree days:
- Integral: sum the positive differences between the base temperature and the temperature at each time step.
\[DegreeDays = \sum_{t=0}^{N} \max(0, BaseTemperature - T(t))\]
- Mean: take the difference between the base temperature and the mean temperature.
\[DegreeDays = \max(0, BaseTemperature - \bar{T})\]
- MinMax: take the difference between the base temperature and the mean temperature, computed as the mean of the minimum and maximum temperatures.
\[DegreeDays = \max\left(0, BaseTemperature - \frac{T_{min} + T_{max}}{2}\right)\]
See the dd_compute function in the energy_analysis_toolbox.weather.degree_days module.
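The three formulas above can be sketched with plain pandas as follows. This is an illustration, not the dd_compute implementation: the function name `heating_degree_days` is hypothetical, the integral variant assumes regularly sampled data and weights each sample by its duration in days, and the "pro" method listed in the class signatures is not covered.

```python
import pandas as pd


def heating_degree_days(temperature: pd.Series, base: float, method: str = "integral") -> pd.Series:
    """Daily heating degree days from a regularly sampled temperature series."""
    if method == "integral":
        # Weight each sample by its duration in days so the sum approximates the integral.
        step_days = (temperature.index[1] - temperature.index[0]) / pd.Timedelta("1D")
        deficit = (base - temperature).clip(lower=0)
        return deficit.resample("1D").sum() * step_days
    daily = temperature.resample("1D")
    if method == "mean":
        return (base - daily.mean()).clip(lower=0)
    if method == "min_max":
        return (base - (daily.min() + daily.max()) / 2).clip(lower=0)
    raise ValueError(f"unknown method: {method!r}")


# Hypothetical day: 18 hours at 10 °C followed by 6 hours at 22 °C.
index = pd.date_range("2023-01-01", periods=24, freq="h")
temperature = pd.Series([10.0] * 18 + [22.0] * 6, index=index)

results = {
    m: heating_degree_days(temperature, base=18.0, method=m).iloc[0]
    for m in ("integral", "mean", "min_max")
}
print(results)
```

On this asymmetric day the three methods disagree noticeably (about 6, 5 and 2 degree days respectively), which shows why the computation method matters when temperatures cross the base temperature during the period.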
Over a long period, the data can present a thermosensitivity with different types of degree days:
- Heating: the energy consumption increases when the temperature decreases. Usually, the base temperature is around 18°C.
- Cooling: the energy consumption increases when the temperature increases. Usually, the base temperature is around 24°C.
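To make the two regimes concrete, here is a small sketch that fits a model with both heating and cooling degree days. It assumes cooling degree days are the mirror image of heating ones, max(0, T - BaseTemperature), and it uses numpy least squares instead of the statsmodels regression used by the class. The data and coefficients are synthetic.

```python
import numpy as np

# Hypothetical daily mean temperatures (°C) spanning winter to summer.
temperature = np.array([2.0, 8.0, 14.0, 20.0, 26.0, 30.0])

# Typical base temperatures quoted above: 18 °C for heating, 24 °C for cooling.
hdd = np.clip(18.0 - temperature, 0, None)  # heating degree days
cdd = np.clip(temperature - 24.0, 0, None)  # cooling degree days (assumed mirror formula)

# Synthetic consumption: base load 10, heating sensitivity 5, cooling sensitivity 3.
energy = 10.0 + 5.0 * hdd + 3.0 * cdd

# Least-squares fit of E ~ E0 + TS_heating * HDD + TS_cooling * CDD.
X = np.column_stack([np.ones_like(temperature), hdd, cdd])
coef, *_ = np.linalg.lstsq(X, energy, rcond=None)
e0, ts_heating, ts_cooling = coef
print(e0, ts_heating, ts_cooling)
```

Because the synthetic data are noiseless, the fit recovers the base load and both sensitivities up to floating-point error.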
## Auto-calibration
Two aspects of the thermosensitivity can be automatically detected:
- The degree days type: heating, cooling, or both.
- The base temperature.
### Degree days type
Depending on the installed systems, a building can have different thermosensitivity types:
- use of heating systems (heating degree days, during the winter)
- use of cooling systems (cooling degree days, during the summer)
- use of both systems (heating and cooling degree days)
The degree days type can be automatically detected by setting the degree_days_type
parameter to "auto". The method will compute the Spearman correlation between the
energy and the temperature for the periods with the mean temperature below and above
the interseason mean temperature (interseason_mean_temperature, default is 20°C).
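The detection idea can be sketched as below. The function name, the 0.5 correlation threshold, the minimum sample count, and the decision rule are all assumptions made for illustration; the source only states that Spearman correlations are computed on each side of the interseason temperature. Spearman correlation is computed here as the Pearson correlation of the ranks.

```python
from typing import Optional

import numpy as np
import pandas as pd


def spearman(x: pd.Series, y: pd.Series) -> float:
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return x.rank().corr(y.rank())


def detect_degree_days_type(
    energy: pd.Series,
    temperature: pd.Series,
    interseason_mean_temperature: float = 20.0,
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the library
) -> Optional[str]:
    cold = temperature < interseason_mean_temperature
    warm = ~cold
    # Heating: energy rises as temperature falls during cold periods.
    heating = bool(cold.sum() > 2 and spearman(energy[cold], temperature[cold]) < -threshold)
    # Cooling: energy rises as temperature rises during warm periods.
    cooling = bool(warm.sum() > 2 and spearman(energy[warm], temperature[warm]) > threshold)
    if heating and cooling:
        return "both"
    if heating:
        return "heating"
    if cooling:
        return "cooling"
    return None


# Synthetic building with heating only: a small alternating wobble stands in for noise.
temperature = pd.Series(np.linspace(0.0, 30.0, 61))
wobble = np.where(np.arange(61) % 2 == 0, 0.1, -0.1)
energy = 10.0 + 5.0 * (16.0 - temperature).clip(lower=0) + wobble

detected = detect_degree_days_type(energy, temperature)
print(detected)
```

A rank correlation is a natural choice for this step because it only tests for a monotonic relationship and does not require the base temperature to be known beforehand.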
### Base temperature
The base temperature can be calibrated by minimizing the mean squared error between the data and the model.
Each degree days type has a specific base temperature that is determined by analyzing the data over the corresponding periods. The heating (resp. cooling) base temperature is calibrated by minimizing the mean squared error between the energy and the heating (resp. cooling) degree days for the periods with the mean temperature below (resp. above) the interseason mean temperature.
The optimization is done with the scipy.optimize.minimize_scalar function with the bounded method.
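The calibration described above can be illustrated with the following sketch. It is not the class's loss_function: the helper `mse_for_base`, the search bounds of 10 to 20°C, and the synthetic daily data are assumptions chosen for the example. Only the use of scipy.optimize.minimize_scalar with the bounded method and an xatol tolerance comes from the text.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Synthetic daily data with a true heating base temperature of 16 °C.
temperature = np.linspace(0.0, 25.0, 251)
energy = 10.0 + 5.0 * np.clip(16.0 - temperature, 0, None)


def mse_for_base(base: float) -> float:
    """MSE of the best linear fit E ~ E0 + TS * DD for a candidate base temperature."""
    dd = np.clip(base - temperature, 0, None)
    X = np.column_stack([np.ones_like(dd), dd])
    coef, *_ = np.linalg.lstsq(X, energy, rcond=None)
    residuals = energy - X @ coef
    return float(np.mean(residuals**2))


# Bounded scalar minimization; the bounds are illustrative assumptions.
result = minimize_scalar(
    mse_for_base,
    bounds=(10.0, 20.0),
    method="bounded",
    options={"xatol": 0.1},  # absolute tolerance on the base temperature, in °C
)
print(f"calibrated base temperature: {result.x:.2f} °C")
```

The inner regression is refitted for every candidate base temperature, so the optimizer only has to search over a single scalar.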
- class energy_analysis_toolbox.thermosensitivity.thermosensitivity.CategoricalThermoSensitivity(energy_data: Series, temperature_data: Series, categories: Series, frequency: str = '1D', degree_days_type: Literal['heating', 'cooling', 'both', 'auto'] = 'heating', degree_days_base_temperature: dict | None = None, degree_days_computation_method: Literal['min_max', 'mean', 'integral', 'pro'] = 'integral', interseason_mean_temperature: float = 20, base_logger_name: str | None = None, min_logger_level_stdout: int | str = 40)
Bases: ThermoSensitivity
Class to compute the thermosensitivity of a building with labeled periods.
Based on the ThermoSensitivity class.
- property categories: Series
The categories of the periods.
- categories_name = 'category'
- property resampled_categories: Series
Resample categories at the specified frequency.
This method resamples the categorical data (self.categories) at the given frequency (self.frequency) and computes the most common category within each resampled period. If a period contains no data, it returns None for that period. The method uses an efficient aggregation approach by leveraging agg() and value_counts() to find the most frequent category.
The result is a pd.Series with the same frequency as the resampling, containing the most common category for each resampled time window.
This property is cached to avoid recomputing it multiple times, improving performance in case of repeated access.
- Returns:
pd.Series – A Series where each value corresponds to the most common category in the resampled period, indexed by the resampled time index.
Example
>>> self.frequency = '1D'  # Daily resampling
>>> self.categories = pd.Series(['A', 'B', 'A', 'A', 'B'], index=pd.date_range('2023-01-01', periods=5))
>>> resampled = self.resampled_categories
>>> print(resampled)
2023-01-01    A
2023-01-02    B
2023-01-03    A
2023-01-04    A
2023-01-05    B
Freq: D, dtype: object
Note
In case there are multiple categories for one resampled period, the category assigned to the resampled period is the most common one.
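A minimal sketch of the "most common category" aggregation, written with plain pandas. The helper name `most_common` and the 8-hourly sample labels are hypothetical; the property itself may be implemented differently, although the description above mentions agg() and value_counts().

```python
import pandas as pd

# Hypothetical labels every 8 hours over two days.
index = pd.date_range("2023-01-01", periods=6, freq="8h")
categories = pd.Series(["A", "A", "B", "A", "B", "B"], index=index)


def most_common(values: pd.Series):
    """Return the most frequent label, or None when the period is empty."""
    counts = values.value_counts()  # sorted by decreasing count
    return counts.index[0] if len(counts) else None


daily = categories.resample("1D").agg(most_common)
print(daily)
```

When two labels are equally frequent within a period, this sketch returns whichever one value_counts() lists first, so ties should not be relied upon.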
- property resampled_energy_temperature_category: DataFrame
The resampled energy, temperature and category data.
The DataFrame contains the resampled energy, temperature and category data. Periods with missing values are removed.
- class energy_analysis_toolbox.thermosensitivity.thermosensitivity.ThermoSensitivity(energy_data: Series, temperature_data: Series, frequency: str = '1D', degree_days_type: Literal['heating', 'cooling', 'both', 'auto'] = 'heating', degree_days_base_temperature: dict | None = None, degree_days_computation_method: Literal['min_max', 'mean', 'integral', 'pro'] = 'integral', interseason_mean_temperature: float = 20, base_logger_name: str | None = None, min_logger_level_stdout: int | str = 40)
Bases: object
Class to compute the thermosensitivity of a building.
Examples
>>> from energy_analysis_toolbox.thermosensitivity import ThermoSensitivity
>>> import pandas as pd
>>> import numpy as np
>>> np.random.seed(0)
>>> temperature_data = 15 + 2*pd.Series(np.random.rand(366),
...     index=pd.date_range("2020-01-01", "2020-12-31", freq="D"))
>>> energy_data = 10 + (16 - temperature_data).clip(0) * 5 + np.random.rand(366)
>>> ts = ThermoSensitivity(energy_data, temperature_data, degree_days_type="auto")
>>> ts.fit()
>>> ts
ThermoSensitivity(frequency=1D, degree_days_type=heating, degree_days_base_temperature={"heating": 15.98}, degree_days_computation_method="integral", interseason_mean_temperature=20)
                            OLS Regression Results
==============================================================================
Dep. Variable:                 energy   R-squared:                       0.969
Model:                            OLS   Adj. R-squared:                  0.969
No. Observations:                 366   F-statistic:                 1.137e+04
Covariance Type:            nonrobust   Prob (F-statistic):          1.25e-276
=======================================================================================
                          coef    std err          t      P>|t|      [0.025      0.975]
---------------------------------------------------------------------------------------
heating_degree_days     5.1177      0.048    106.638      0.000       5.023       5.212
Intercept              10.5120      0.019    539.733      0.000      10.474      10.550
=======================================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
As you can see from the example above:
- the type of thermosensitivity is automatically detected (heating in this case)
- the base temperature is calibrated to 15.98 (true value is 16)
- the model is fitted with an R-squared of 0.969
- the heating degree days coefficient is 5.1177 (true value is 5)
- the intercept is 10.5120 (the true constant is 10, plus about 0.5 for the mean of the uniform noise added with np.random.rand)
- property aggregated_data: DataFrame
The aggregated data used to fit the model.
The data is a DataFrame resampled at the provided frequency with the following columns:
- “energy”: the total energy at the frequency.
- “temperature”: the mean temperature at the frequency.
- (Optional) “heating_degree_days”: the heating degree days at the frequency.
- (Optional) “cooling_degree_days”: the cooling degree days at the frequency.
- Raises:
ValueError – If the data is not aggregated. Use the fit method to aggregate the data.
- calibrate_base_temperature(dd_type: Literal['heating', 'cooling', 'both', 'auto'] = 'heating', t0: float | None = None, xatol: float = 0.1) -> float
Calibrate the base temperature for the specified degree days type.
This method optimizes the base temperature used to compute the degree days by minimizing the mean squared error between the energy consumption data and the degree days model. The optimization is done using the scipy.optimize.minimize_scalar function with a bounded method.
- Parameters:
dd_type (str, optional) – The type of degree days to calibrate. Must be one of the following:
- “heating”: to calibrate the base temperature for heating degree days.
- “cooling”: to calibrate the base temperature for cooling degree days.
The default is “heating”.
t0 (float, optional) – The initial guess for the base temperature. If not provided, the default initial guess is 16°C for heating or 24°C for cooling.
xatol (float, optional) – The absolute error tolerance for the optimization. This controls how precise the optimized base temperature needs to be. Default is 1e-1 (0.1°C).
- Returns:
float – The optimized base temperature for the specified degree days type.
- Raises:
ValueError – If the dd_type is invalid (not one of “heating” or “cooling”).
Example
>>> ts.calibrate_base_temperature(dd_type="heating", t0=15, xatol=0.05)
- calibrate_base_temperatures(t0_heating: float | None = None, t0_cooling: float | None = None, xatol: float = 0.1) -> None
Calibrate the base temperatures for both heating and cooling degree days.
This method optimizes the base temperatures for heating and/or cooling degree days by minimizing the mean squared error between the energy consumption data and the degree days model. The method will calibrate the base temperatures based on the detected or specified degree_days_type.
If the degree_days_type is “heating”, only the heating base temperature is calibrated. If it is “cooling”, only the cooling base temperature is calibrated. If it is “both”, both base temperatures are calibrated.
- Parameters:
t0_heating (float, optional) – The initial guess for the heating base temperature. If not provided, the default initial guess is 16°C.
t0_cooling (float, optional) – The initial guess for the cooling base temperature. If not provided, the default initial guess is 24°C.
xatol (float, optional) – The absolute error tolerance for the optimization. This controls how precise the optimized base temperatures need to be. Default is 1e-1 (0.1°C).
- Returns:
None – The method updates the degree_days_base_temperature attribute with the optimized base temperatures for heating and/or cooling.
- Raises:
ValueError – If the degree_days_type is invalid.
Example
>>> ts.calibrate_base_temperatures(t0_heating=15, t0_cooling=25, xatol=0.05)
- property energy_data: Series
The energy data of the building.
The property is immutable. To change the energy data, create a new object.
- fit() -> ThermosensitivityInstance
Train the model.
This method will:
- Calibrate the base temperature if it is not set. See calibrate_base_temperature.
- Aggregate the data. This consists of resampling the energy and temperature data and computing the degree days. See _aggregate_data.
- Fit the thermosensitivity model. See _fit_thermosensitivity.
- property frequency: str
The frequency of the resampled data.
The property is immutable. To change the frequency, create a new object.
- loss_function(t0: float, dd_type: Literal['heating', 'cooling'], resampled_energy: Series, raw_temperature: Series, frequency: str, mask: Series | None = None, degree_days_computation_method: Literal['min_max', 'mean', 'integral', 'pro'] = 'integral') -> float
Loss function for the optimization of the base temperature.
Compute the mean squared error (MSE) between the observed energy data and the energy model based on degree days. This function is used for calibrating the base temperature in the thermosensitivity model by finding the base temperature that minimizes this error.
The degree days are calculated based on the input temperature data, resampled to the desired frequency, and compared against the resampled energy data. The MSE is used as the objective function to optimize the base temperature.
- Parameters:
t0 (float) – Base temperature used to compute degree days.
dd_type (literal_dd_types) – Type of degree days to compute. Must be one of:
- “heating”: degree days when temperatures are below the base temperature.
- “cooling”: degree days when temperatures are above the base temperature.
resampled_energy (pd.Series) – The resampled time series of energy data to be modeled.
raw_temperature (pd.Series) – The original time series of temperature data, which is used to compute degree days based on the base temperature.
frequency (str) – The frequency to resample the data (e.g., “1D” for daily, “7D” for weekly).
mask (pd.Series or None, optional) – A boolean mask to filter the data before fitting the model. This allows for focusing on specific periods (e.g., only heating or cooling periods). Default is None.
degree_days_computation_method (literal_computation_dd_types, optional) – The method used to compute degree days. Must be one of:
- “integral”: sum the difference between the base temperature and actual temperature.
- “mean”: use the mean temperature for degree day calculations.
- “min_max”: use the average of daily minimum and maximum temperatures.
Default is “integral”.
- Returns:
float – The mean squared error (MSE) of the residuals between the observed energy data and the modeled energy data based on degree days.
Example
>>> loss = loss_function(
...     t0=18.0,
...     dd_type="heating",
...     resampled_energy=energy_data,
...     raw_temperature=temperature_data,
...     frequency="1D",
...     degree_days_computation_method="integral",
... )
>>> print(f"Calculated loss: {loss}")
- property model: RegressionResults
The thermosensitivity model.
A statsmodels.regression.linear_model.RegressionResults object.
- Raises:
ValueError – If the model is not fitted. Use the fit method to train the model.
- property resampled_energy: Series
The energy data resampled at the given frequency.
Uses the to_freq function from the energy_analysis_toolbox.energy.resample module to convert the energy data to the desired frequency.
This property is cached to avoid recomputing it multiple times.
- property resampled_energy_temperature: DataFrame
The resampled energy and temperature data.
The DataFrame contains the resampled energy and temperature data. Periods with missing values are removed.
- property resampled_temperature: Series
The temperature data resampled at the given frequency.
Average the temperature data over the given frequency.
This property is cached to avoid recomputing it multiple times.
- target_name = 'energy'
- property temperature_data: Series
The outdoor temperature data.
The property is immutable. To change the temperature data, create a new object.
- temperature_name = 'temperature'