featuretools.primitives.NMostCommonFrequency

class featuretools.primitives.NMostCommonFrequency(n=3, skipna=True)

Determines the frequency of the n most common items.

Parameters:
  • n (int) – defines “n” in “n most common”. Defaults to 3.

  • skipna (bool) – Determines whether NA/null values are skipped when counting. Defaults to True, which skips NA/null values.

Description:

Given a list, find the n most common items and return a series containing the frequency of each of those items, ordered from most to least frequent. If the list has fewer than n unique values, the resulting series is padded with NaN.

Examples

>>> n_most_common_frequency = NMostCommonFrequency()
>>> n_most_common_frequency([1, 1, 1, 2, 2, 3, 4, 4]).to_list()
[3, 2, 2]

We can increase n to include more items.

>>> n_most_common_frequency = NMostCommonFrequency(4)
>>> n_most_common_frequency([1, 1, 1, 2, 2, 3, 4, 4]).to_list()
[3, 2, 2, 1]

NaNs are skipped by default.

>>> n_most_common_frequency = NMostCommonFrequency(3)
>>> n_most_common_frequency([1, 1, 1, 2, 2, 3, 4, 4, None, None, None]).to_list()
[3, 2, 2]

However, the way NaNs are treated can be controlled.

>>> n_most_common_frequency = NMostCommonFrequency(3, skipna=False)
>>> n_most_common_frequency([1, 1, 1, 2, 2, 3, 4, 4, None, None, None]).to_list()
[3, 3, 2]
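
The examples above do not show the NaN padding mentioned in the description. The following is a minimal standard-library sketch of the documented behavior, not the featuretools implementation. The helper names `n_most_common_frequency_sketch` and `_is_missing` are hypothetical, and the sketch returns a plain list rather than a series.

```python
import math
from collections import Counter


def _is_missing(value):
    # Hypothetical helper: treat None and float NaN as NA/null.
    return value is None or (isinstance(value, float) and math.isnan(value))


def n_most_common_frequency_sketch(values, n=3, skipna=True):
    """Illustrative stand-in for the documented behavior (an assumption,
    not the library's code): counts of the n most common items, padded
    with NaN when there are fewer than n unique values."""
    if skipna:
        # Drop NA/null values before counting.
        values = [v for v in values if not _is_missing(v)]
    else:
        # Map every NA/null value to None so they are counted as one item.
        values = [None if _is_missing(v) else v for v in values]
    counts = [count for _, count in Counter(values).most_common(n)]
    # Pad with NaN up to length n.
    return counts + [math.nan] * (n - len(counts))


# Only two unique values are present, so the third slot is padded.
print(n_most_common_frequency_sketch([1, 1, 2]))  # [2, 1, nan]
```

Called on the inputs from the examples above, the sketch yields the same counts, for instance `[3, 3, 2]` when `skipna=False` and three `None` values are present.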

__init__(n=3, skipna=True)

Methods

__init__([n, skipna])

flatten_nested_input_types(input_types)

Flattens nested column schema inputs into a single list.

generate_name(base_feature_names, ...)

generate_names(base_feature_names, ...)

get_args_string()

get_arguments()

get_description(input_column_descriptions[, ...])

get_filepath(filename)

get_function()

Attributes

base_of

base_of_exclude

commutative

default_value

Default value this feature returns if no data found.

description_template

input_types

woodwork.ColumnSchema types of inputs

max_stack_depth

name

Name of the primitive

number_output_features

Number of columns in feature matrix associated with this feature

return_type

ColumnSchema type of return

stack_on

stack_on_exclude

stack_on_self

uses_calc_time