featuretools.primitives.PercentChange
- class featuretools.primitives.PercentChange(periods=1, fill_method='pad', limit=None, freq=None)
Determines the percent difference between values in a list.
- Description:
Given a list of numbers, return the percent difference between each subsequent number. Percentages are shown in decimal form (not multiplied by 100). Uses pandas’ pct_change function.
- Parameters:
periods (int) – Periods to shift for calculating percent change. Default is 1.
fill_method (str) – Method for filling gaps in reindexed Series. Valid options are backfill, bfill, pad, ffill. pad / ffill: fill gap with last valid observation. backfill / bfill: fill gap with next valid observation. Default is pad.
limit (int) – The max number of consecutive NaN values in a gap that can be filled. Default is None.
freq (DateOffset, timedelta, or offset alias string) –
If freq is specified, instead of calculating the change between subsequent points, PercentChange calculates the change between points separated by a fixed interval in their date indices. freq defines that interval. When freq is used, the resulting index will also be filled to include any missing dates from the specified interval.
If freq is used and the index is not date/datetime, a NotImplementedError is raised.
If freq is None, no changes will be applied. Default is None.
Examples
>>> from featuretools.primitives import PercentChange
>>> percent_change = PercentChange()
>>> percent_change([2, 5, 15, 3, 3, 9, 4.5]).to_list()
[nan, 1.5, 2.0, -0.8, 0.0, 2.0, -0.5]
We can control the number of periods to return the percent difference between points further from one another.
>>> percent_change_2 = PercentChange(periods=2)
>>> percent_change_2([2, 5, 15, 3, 3, 9, 4.5]).to_list()
[nan, nan, 6.5, -0.4, -0.8, 2.0, 0.5]
We can control the method used to handle gaps in data.
>>> percent_change = PercentChange()
>>> percent_change([2, 4, 8, None, 16, None, 32, None]).to_list()
[nan, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
>>> percent_change_backfill = PercentChange(fill_method='backfill')
>>> percent_change_backfill([2, 4, 8, None, 16, None, 32, None]).to_list()
[nan, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, nan]
We can also control the maximum number of NaN values to fill in a gap.
>>> percent_change = PercentChange()
>>> percent_change([2, None, None, None, 4]).to_list()
[nan, 0.0, 0.0, 0.0, 1.0]
>>> percent_change_limited = PercentChange(limit=2)
>>> percent_change_limited([2, None, None, None, 4]).to_list()
[nan, 0.0, 0.0, nan, nan]
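The gap-handling results can be reproduced with plain pandas by filling first and then taking the percent change. The following is a minimal sketch, not the primitive's actual implementation: `percent_change_sketch` is a hypothetical helper name, and the sketch assumes that filling gaps before calling `pct_change` with `fill_method=None` matches what the primitive does internally.

```python
import pandas as pd


def percent_change_sketch(values, periods=1, fill_method="pad", limit=None):
    """Hypothetical helper: fill gaps, then take the percent change with pandas."""
    series = pd.Series(values, dtype="float64")  # None becomes NaN
    if fill_method in ("pad", "ffill"):
        filled = series.ffill(limit=limit)  # carry the last valid value forward
    elif fill_method in ("backfill", "bfill"):
        filled = series.bfill(limit=limit)  # pull the next valid value backward
    else:
        raise ValueError(f"unsupported fill_method: {fill_method!r}")
    # Gaps are already filled, so pct_change itself should not fill again.
    return filled.pct_change(periods=periods, fill_method=None)


gappy = [2, 4, 8, None, 16, None, 32, None]
default = percent_change_sketch(gappy)
backfilled = percent_change_sketch(gappy, fill_method="backfill")
limited = percent_change_sketch([2, None, None, None, 4], limit=2)
```

With `limit=2` only the first two missing values are filled, so the fourth position stays NaN and both the fourth and fifth results are NaN.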
Finally, we can specify a date frequency on which to calculate percent change.
>>> import pandas as pd
>>> dates = pd.DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-05'])
>>> x_indexed = pd.Series([1, 2, 3, 4], index=dates)
>>> percent_change = PercentChange()
>>> percent_change(x_indexed).to_list()
[nan, 1.0, 0.5, 0.33333333333333326]
>>> date_offset = pd.tseries.offsets.DateOffset(days=1)
>>> percent_change_freq = PercentChange(freq=date_offset)
>>> percent_change_freq(x_indexed).to_list()
[nan, 1.0, 0.5, nan]
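To see why the last value becomes NaN when a one-day frequency is used, it may help to do the date alignment by hand. The sketch below uses only pandas and is an illustration of the idea rather than the primitive's code path; the names `previous` and `manual` are hypothetical.

```python
import pandas as pd

dates = pd.DatetimeIndex(["2018-01-01", "2018-01-02", "2018-01-03", "2018-01-05"])
x_indexed = pd.Series([1.0, 2.0, 3.0, 4.0], index=dates)

# Move every observation one day forward, so each date holds the value
# that was observed exactly one day earlier.
previous = x_indexed.copy()
previous.index = previous.index + pd.Timedelta(days=1)

# Keep only the dates present in the original series. 2018-01-05 has no
# observation on 2018-01-04 to compare against, so it becomes NaN.
previous = previous.reindex(x_indexed.index)

manual = x_indexed / previous - 1
```

Here `manual` is NaN on 2018-01-01 and 2018-01-05, and holds 1.0 and 0.5 on the two dates in between, which agrees with the `freq` example output.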
Methods
__init__([periods, fill_method, limit, freq])
flatten_nested_input_types(input_types) – Flattens nested column schema inputs into a single list.
generate_name(base_feature_names)
generate_names(base_feature_names)
get_args_string()
get_arguments()
get_description(input_column_descriptions[, ...])
get_filepath(filename)
get_function()
Attributes
base_of
base_of_exclude
commutative
default_value – Default value this feature returns if no data found.
description_template
input_types – woodwork.ColumnSchema types of inputs
max_stack_depth
name – Name of the primitive
number_output_features – Number of columns in feature matrix associated with this feature
return_type – ColumnSchema type of return
stack_on
stack_on_exclude
stack_on_self
uses_calc_time
uses_full_dataframe