energy_analysis_toolbox.power.overconsumption

Module detecting unusual consumption.

This subpackage contains the functionality dedicated to locating overconsumption periods in a power timeseries. It is divided into three submodules, documented below:

energy_analysis_toolbox.power.overconsumption.find

Finds the overconsumption from a power series and a threshold.

energy_analysis_toolbox.power.overconsumption.find.from_power_threshold(power_series: Series, overshoot_tshd: Series | float, reference_energy_tshd: Series | float | None = None) -> DataFrame

Return a table of the overconsumption intervals during which power_series is above overshoot_tshd.

The function also computes the overshoot energy, defined as the energy of the difference between power_series and reference_energy_tshd over each overshoot interval.

Parameters:
  • power_series (pd.Series) – A timeseries of power measurements in (W). Each element is the average power over [ti, ti+1[, where ti and ti+1 are the index values of that element and of the next one.

  • overshoot_tshd (pd.Series or float) – The threshold in (W) over which the power is considered as over-consumption. In case a series is given, it should have the same index as power_series.

  • reference_energy_tshd (pd.Series or float or None) – A power in (W) to be subtracted from the power series in order to compute an “overshoot energy” for each interval. The default is None in which case overshoot_tshd is used. In case a series is given, it should have the same index as power_series.

Returns:

pd.DataFrame – A table of overconsumption with the following columns:

  • start : timestamp of the first instant of an overshoot interval;

  • end : timestamp of the first instant after an overshoot interval;

  • duration : duration in (s) of the overshoot interval. This is the difference in (s) between the start and end bounds.

  • energy : the energy associated with the difference between power_series and reference_energy_tshd during the [start, end[ interval.

Note

Why use two thresholds?

When looking for overconsumption, the threshold that defines the “anomaly” of the power may not be the reference against which the overconsumption is computed. For example, the overconsumption may be computed relative to the average consumption, while the abnormal overconsumption is identified using variability-related thresholds.

This function provides both location and “overshoot-energy” computation for convenience.
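The role of the two thresholds can be illustrated with a minimal plain-Python sketch. It is not the library's implementation: the helper name find_overshoots, the use of lists instead of pd.Series, the extra closing timestamp in times and the energy expressed in joules (W × s) are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def find_overshoots(times, powers, overshoot_tshd, reference_tshd=None):
    """Locate intervals where the power exceeds overshoot_tshd (illustrative).

    Assumption: times has one more element than powers, and powers[i] is the
    average power in W over [times[i], times[i + 1][.
    """
    if reference_tshd is None:
        reference_tshd = overshoot_tshd  # same default as documented above
    intervals = []
    current = None
    for i, power in enumerate(powers):
        step = (times[i + 1] - times[i]).total_seconds()
        if power > overshoot_tshd:
            if current is None:
                current = {"start": times[i], "energy": 0.0}
            # Energy above the reference, in joules for this sketch (W x s).
            current["energy"] += (power - reference_tshd) * step
            current["end"] = times[i + 1]
        elif current is not None:
            intervals.append(current)
            current = None
    if current is not None:
        intervals.append(current)
    for row in intervals:
        row["duration"] = (row["end"] - row["start"]).total_seconds()
    return intervals


t0 = datetime(2024, 1, 1)
times = [t0 + timedelta(minutes=10 * k) for k in range(7)]
powers = [100.0, 500.0, 600.0, 100.0, 100.0, 700.0]
# Detection uses 400 W, but the energy is counted above a 200 W reference.
found = find_overshoots(times, powers, overshoot_tshd=400.0, reference_tshd=200.0)
```

Here the detection threshold decides which intervals are reported, while the lower reference only changes how much energy is attributed to each of them.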

energy_analysis_toolbox.power.overconsumption.transform

Functions to transform and merge overconsumption intervals after they are located.

Provides functions for transforming overconsumption intervals after they have been located, including merging nearby intervals and recalculating their duration and energy contributions.

energy_analysis_toolbox.power.overconsumption.transform.merge_by_proximity(intervals_overshoot: DataFrame, min_interval: float = 600) -> DataFrame

Return a table in which overconsumption events that are too close to each other are merged.

Parameters:
  • intervals_overshoot (pd.DataFrame) – A table of overshoot intervals with at least ‘start’, ‘end’ and ‘energy’ columns.

  • min_interval (float, optional) – The minimum duration in (s) to be imposed between two consecutive overshoot intervals. All intervals separated by a duration under this threshold are merged. Default is 600 seconds, i.e. 10 minutes.

Returns:

overconsumption (pd.DataFrame) – A table of overconsumption with ‘start’, ‘end’ and ‘energy’ obtained after merging the time-neighboring overconsumption in intervals_overshoot.

Note

The energy of an interval resulting from a merge is the sum of the energies of the merged intervals.

Notes

The function proceeds as follows:

  • [1] Flatten the overconsumption intervals to a table of timeseries, one per variable. Fill all values between intervals with zeros.

  • [2] Recompute the durations so that the “duration” variable has the right value between the overshoot intervals.

  • [3] Drop all the rows for which the duration is under the defined threshold and the energy is 0. By construction, overshoot intervals have a non-zero energy (they are overshoots!) while the other rows were filled with 0. Overshoots which were closer than the threshold are now contiguous in the timeseries.

  • [4] Re-extract the periods during which the energy is > 0: the contiguous intervals are merged by this process.

  • [5] Recompute the duration and energy for the new intervals.

Some limit cases are managed :

  • empty data is returned directly

  • contiguous overconsumption intervals (which are a limit case) are managed by dropping the “interstitial” row with a duplicate index which appears in the flattened table. This case should not be encountered.
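The effect of the merge can be sketched in plain Python. Note that this sketch deliberately uses a simpler sequential pass over sorted intervals rather than the flatten and re-extract procedure described above; the name merge_close_intervals and the list-of-dicts input are assumptions made for the example.

```python
from datetime import datetime, timedelta


def merge_close_intervals(intervals, min_interval=600.0):
    """Merge intervals separated by less than min_interval seconds."""
    merged = []
    for row in sorted(intervals, key=lambda r: r["start"]):
        if merged:
            gap = (row["start"] - merged[-1]["end"]).total_seconds()
            if gap < min_interval:
                # Extend the previous interval and add up the energies.
                merged[-1]["end"] = max(merged[-1]["end"], row["end"])
                merged[-1]["energy"] += row["energy"]
                continue
        merged.append(dict(row))  # copy so the input rows are left untouched
    for row in merged:
        row["duration"] = (row["end"] - row["start"]).total_seconds()
    return merged


t0 = datetime(2024, 1, 1)
events = [
    {"start": t0, "end": t0 + timedelta(minutes=20), "energy": 100.0},
    {"start": t0 + timedelta(minutes=25), "end": t0 + timedelta(minutes=40), "energy": 50.0},
    {"start": t0 + timedelta(hours=2), "end": t0 + timedelta(hours=3), "energy": 30.0},
]
# The first two events are 5 minutes apart and are merged; the third is kept as is.
result = merge_close_intervals(events, min_interval=600.0)
```

An empty input list is returned as an empty list, which mirrors the first limit case listed above.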

energy_analysis_toolbox.power.overconsumption.select

Selects overconsumption according to various criteria.

The possible criteria to select overconsumption are the following:

  • by_individual_proportion selects those whose energy content (over their reference) is beyond a certain proportion of the total overconsumption energy. It can be used to keep only “big enough” overconsumptions.

  • by_cumulated_proportion selects the minimum set of overconsumption which are necessary to explain a certain proportion of the total overconsumption energy. It can be used when one wants to keep all the overconsumption which explain “most of the overconsumption”, but can include small non-significant overshoots in certain cases.

  • by_combined_proportions is the combination of both previous approaches. It enables the selection of the most significant overconsumption to explain “most of the overconsumption”.
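The difference between the first two criteria can be seen on a small numerical example. The sketch below works on a bare list of energies and is only an illustration of the selection rules as described here, not the library's code; keep_individual and keep_cumulated are hypothetical helper names.

```python
def _total(energies, energy_reference):
    # Default reference: the sum of the interval energies.
    return sum(energies) if energy_reference is None else energy_reference


def keep_individual(energies, proportion_tshd=0.05, energy_reference=None):
    """Keep the energies that each reach proportion_tshd of the reference."""
    total = _total(energies, energy_reference)
    ranked = sorted(energies, reverse=True)
    return [e for e in ranked if e / total >= proportion_tshd]


def keep_cumulated(energies, proportion_tshd=0.8, energy_reference=None):
    """Keep the largest energies until their sum reaches proportion_tshd."""
    total = _total(energies, energy_reference)
    kept, cumulated = [], 0.0
    for energy in sorted(energies, reverse=True):
        if cumulated / total >= proportion_tshd:
            break  # the target proportion is already explained
        kept.append(energy)
        cumulated += energy
    return kept


energies = [70.0] + [3.0] * 10  # one large overshoot and ten small ones
big_enough = keep_individual(energies)
most_energy = keep_cumulated(energies)
```

With these values, the individual criterion keeps only the 70-unit overshoot, whereas the cumulated criterion also has to pick four of the 3-unit overshoots to get past 80% of the total, which is the situation the combined criterion is meant to avoid.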

energy_analysis_toolbox.power.overconsumption.select.by_combined_proportions(intervals_overshoot: DataFrame, proportion_tshd: float = 0.8, proportion_indiv_tshd: float = 0.05, energy_reference: float | None = None) -> DataFrame

Return the overshoot intervals selected by both their cumulated and their individual energy proportions.

Parameters:
  • intervals_overshoot (pd.DataFrame) – A table of overshoot intervals with at least ‘start’, ‘end’ and ‘energy’ columns.

  • proportion_tshd (float, optional) – Proportion (in [0,1]) of the total energy of the intervals_overshoot. Intervals are sorted by decreasing order of energy and conserved until this proportion - at least - of energy_reference is reached. Default is 0.8, meaning that the minimum set of intervals which represent at least 80% of the total overshoot energy is conserved.

  • proportion_indiv_tshd (float, optional) – Proportion (in [0,1]) of the total energy of the intervals_overshoot. An interval is conserved by the function only if it represents at least this proportion of the energy_reference. Default is 5%.

  • energy_reference (float or None, optional) – The total energy (in the same unit as the values in the "energy" column), relative to which the proportions are computed. Default is None, in which case the sum of the column values is used.

Returns:

pd.DataFrame – A copy of the input dataframe with additional “proportion” and “cum_energy_prop” columns, where only the overconsumption which represent together and individually at least the specified proportions of the total energy are conserved. The returned overconsumption are sorted by decreasing order of overshoot energy.
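A minimal sketch of how the two conditions may combine is given below. It assumes that an interval is kept only when it belongs to the cumulated selection and also reaches the individual proportion, and that “cum_energy_prop” is the cumulated proportion up to and including the row; both points are interpretations of the description above, and select_combined is a hypothetical name.

```python
def select_combined(energies, proportion_tshd=0.8, proportion_indiv_tshd=0.05,
                    energy_reference=None):
    """Keep the largest energies that pass both proportion criteria."""
    total = sum(energies) if energy_reference is None else energy_reference
    selected, cumulated = [], 0.0
    for energy in sorted(energies, reverse=True):
        if cumulated / total >= proportion_tshd:
            break  # the cumulated target is already reached
        cumulated += energy
        if energy / total >= proportion_indiv_tshd:
            selected.append({
                "energy": energy,
                "proportion": energy / total,
                # Assumed meaning: cumulated share including this row.
                "cum_energy_prop": cumulated / total,
            })
    return selected


energies = [50.0, 20.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 3.0, 3.0]
picked = select_combined(energies)
picked_energies = [row["energy"] for row in picked]
```

In this example the cumulated criterion alone would need three of the 4-unit intervals to exceed 80%, but each of them is below 5% of the total, so only the 50-unit and 20-unit intervals remain.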

energy_analysis_toolbox.power.overconsumption.select.by_cumulated_proportion(intervals_overshoot: DataFrame, proportion_tshd: float | None = 0.8, energy_reference: float | None = None) -> DataFrame

Return the overconsumption intervals whose cumulated energy reaches a given proportion of the total overshoot energy.

Parameters:
  • intervals_overshoot (pd.DataFrame) – A table of overshoot intervals with at least ‘start’, ‘end’ and ‘energy’ columns.

  • proportion_tshd (float, optional) – Proportion (in [0,1]) of the total energy of the intervals_overshoot. Intervals are sorted by decreasing order of energy and conserved until this proportion - at least (>=) - of the total energy is reached. Default is 0.8, meaning that the minimum set of overconsumption which represent at least 80% of the total overshoot energy is conserved.

  • energy_reference (float or None, optional) – The total energy (in the same unit as the values in the "energy" column), relative to which the proportions are computed. Default is None, in which case the sum of the column values is used.

Returns:

pd.DataFrame – A copy of the input dataframe with an additional “cum_energy_prop” column, where only the overconsumption which represent together at least the specified proportion of the total energy are conserved. The returned overconsumption are sorted by decreasing order of overshoot energy.

energy_analysis_toolbox.power.overconsumption.select.by_individual_proportion(intervals_overshoot: DataFrame, proportion_tshd: float = 0.05, energy_reference: float | None = None) -> DataFrame

Return the overconsumption intervals whose overshoot energy is at least a given proportion of the total.

Parameters:
  • intervals_overshoot (pd.DataFrame) – A table of overshoot intervals with at least ‘start’, ‘end’ and ‘energy’ columns.

  • proportion_tshd (float, optional) – Proportion (in [0,1]) of the total energy of the intervals_overshoot. An interval is conserved by the function only if it represents at least (>=) this proportion of the total. Default is 5%.

  • energy_reference (float or None, optional) – The total energy (in the same unit as the values in the "energy" column), relative to which the proportions are computed. Default is None, in which case the sum of the column values is used.

Returns:

pd.DataFrame – A copy of the input dataframe with an additional “proportion” column, where only the overconsumption which represent at least the specified proportion of the total energy are conserved. The returned overconsumption are sorted by decreasing order of overshoot energy.