API Reference¶
Demo Datasets¶
|
Returns the retail entityset example. |
|
Returns dataframes of mock customer data. |
|
Download, clean, and filter flight data from 2017. |
|
Load the Australian daily-min-temperatures weather dataset. |
Deep Feature Synthesis¶
|
Calculates a feature matrix and features given a dictionary of dataframes and a list of relationships. |
|
Returns two lists of primitives (transform and aggregation) containing primitives that can be applied to the specific target dataframe to create features. |
Timedelta¶
|
Represents differences in time. |
Time utils¶
|
Makes a set of equally spaced cutoff times prior to a set of input cutoffs and instance ids. |
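The idea of equally spaced cutoff times can be sketched in plain Python. This is an illustration only, not the Featuretools implementation: the helper name `equally_spaced_cutoffs`, its parameters, and the choice to include the input cutoff itself as the last window are all assumptions made for this example.

```python
from datetime import datetime, timedelta


def equally_spaced_cutoffs(cutoff, num_windows, window_size):
    """Return num_windows times ending at `cutoff`, spaced window_size apart.

    Hypothetical helper: including `cutoff` itself is an assumption.
    """
    return [cutoff - window_size * i for i in range(num_windows - 1, -1, -1)]


cutoffs = equally_spaced_cutoffs(
    cutoff=datetime(2017, 1, 10),
    num_windows=3,
    window_size=timedelta(days=2),
)
for ts in cutoffs:
    print(ts.date())
```

Applying the helper once per instance id would give one set of cutoffs per instance.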
Feature Primitives¶
A list of all Featuretools primitives can be obtained by visiting primitives.featurelabs.com.
Primitive Types¶
Feature for a dataframe that is based on one or more other features in that dataframe. |
|
Aggregation Primitives¶
|
Determines the total number of values, excluding NaN. |
|
Computes the average for a list of values. |
|
Calculates the total addition, ignoring NaN. |
|
Calculates the smallest value, ignoring NaN values. |
|
Calculates the highest value, ignoring NaN values. |
|
Computes the dispersion relative to the mean value, ignoring NaN. |
|
Determines the middlemost number in a list of values. |
|
Determines the most commonly repeated value. |
|
Computes the average number of seconds between consecutive events. |
|
Calculates the time elapsed since the last datetime (default in seconds). |
|
Calculates the time elapsed since the first datetime (in seconds). |
Determines the number of distinct values, ignoring NaN values. |
|
Determines the percent of True values. |
|
|
Calculates if all values are 'True' in a list. |
|
Determines if any value is 'True' in a list. |
|
Determines the first value in a list. |
|
Determines the last value in a list. |
|
Computes the extent to which a distribution differs from a normal distribution. |
|
Calculates the trend of a column over time. |
|
Calculates the entropy for a categorical column |
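As a rough illustration of what several of these aggregations compute, the following standard-library sketch drops NaN values before aggregating. It does not use Featuretools; the helper names `drop_nan` and `entropy` are hypothetical, and the natural-log base for the entropy is an assumption.

```python
import math
from collections import Counter
from statistics import mean, median


def drop_nan(values):
    """Remove float NaN entries so the aggregations below ignore them."""
    return [v for v in values if not (isinstance(v, float) and math.isnan(v))]


def entropy(categories):
    """Shannon entropy of a categorical list (natural log is an assumption)."""
    counts = Counter(categories)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


amounts = [10.0, float("nan"), 30.0, 20.0, 30.0]
clean = drop_nan(amounts)

summary = {
    "count": len(clean),
    "sum": sum(clean),
    "mean": mean(clean),
    "min": min(clean),
    "max": max(clean),
    "median": median(clean),
    "mode": Counter(clean).most_common(1)[0][0],
    "num_unique": len(set(clean)),
}
print(summary)
print(entropy(["a", "a", "b", "b"]))
```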
Transform Primitives¶
Combine features¶
|
Determines whether a value is present in a provided list. |
|
Element-wise logical AND of two lists. |
|
Element-wise logical OR of two lists. |
|
Negates a boolean value. |
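The element-wise behavior described above can be sketched with list comprehensions. The function names here are hypothetical stand-ins chosen for the example, not the primitive classes themselves.

```python
def is_in(values, allowed):
    """True where a value is present in the provided list."""
    return [v in allowed for v in values]


def logical_and(left, right):
    """Element-wise AND of two equally long boolean lists."""
    return [a and b for a, b in zip(left, right)]


def logical_or(left, right):
    """Element-wise OR of two equally long boolean lists."""
    return [a or b for a, b in zip(left, right)]


def negate(values):
    """Negate each boolean value."""
    return [not v for v in values]


in_list = is_in(["US", "FR", "DE"], allowed=["US", "DE"])
is_large = [True, True, False]

print(logical_and(in_list, is_large))
print(logical_or(in_list, is_large))
print(negate(in_list))
```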
General Transform Primitives¶
|
Computes the absolute value of a number. |
Computes the square root of a number. |
|
Computes the natural logarithm of a number. |
|
|
Computes the sine of a number. |
|
Computes the cosine of a number. |
|
Computes the tangent of a number. |
Determines the percentile rank for each value in a list. |
|
|
Calculates time from a value to a specified cutoff datetime. |
Datetime Transform Primitives¶
|
Determines the seconds value of a datetime. |
|
Determines the minutes value of a datetime. |
|
Determines the day of the week from a datetime. |
Determines the is_leap_year attribute of a datetime column. |
|
Determines the is_month_end attribute of a datetime column. |
|
Determines the is_month_start attribute of a datetime column. |
|
Determines the is_quarter_end attribute of a datetime column. |
|
Determines the is_quarter_start attribute of a datetime column. |
|
Determines if a date falls on a weekend. |
|
Determines if a date falls on the end of a year. |
|
Determines if a date falls on the start of a year. |
|
|
Determines the hour value of a datetime. |
|
Determines the day of the month from a datetime. |
Determines the ordinal day of the year from the given datetime |
|
Determines the day of the month from a datetime. |
|
|
Determines the week of the year from a datetime. |
|
Determines the month value of a datetime. |
Determines the part of day of a datetime. |
|
|
Determines the quarter a datetime column falls into (1, 2, 3, 4) |
|
Determines the year value of a datetime. |
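Most of the datetime attributes listed above can be derived from a single timestamp. The sketch below uses only the standard library; the `datetime_parts` helper and its dictionary keys are hypothetical names for this example, and the weekday numbering (Monday as 0) follows Python's `datetime.weekday()`, which may differ from the numbering the primitives use.

```python
import calendar
from datetime import datetime


def datetime_parts(ts):
    """Derive calendar attributes from one datetime (hypothetical helper)."""
    last_day = calendar.monthrange(ts.year, ts.month)[1]
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "minute": ts.minute,
        "second": ts.second,
        "weekday": ts.weekday(),  # Monday == 0 in Python's convention
        "day_of_year": ts.timetuple().tm_yday,
        "quarter": (ts.month - 1) // 3 + 1,
        "is_weekend": ts.weekday() >= 5,
        "is_leap_year": calendar.isleap(ts.year),
        "is_month_start": ts.day == 1,
        "is_month_end": ts.day == last_day,
        "is_quarter_end": ts.month in (3, 6, 9, 12) and ts.day == last_day,
        "is_year_start": ts.month == 1 and ts.day == 1,
        "is_year_end": ts.month == 12 and ts.day == 31,
    }


parts = datetime_parts(datetime(2020, 2, 29, 13, 45, 10))
print(parts)
```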
Rolling Transform Primitives¶
|
Determines a rolling count of events over a given window. |
|
Determines the maximum of entries over a given window. |
|
Calculates the mean of entries over a given window. |
|
Determines the minimum of entries over a given window. |
|
Calculates the standard deviation of entries over a given window. |
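A rolling computation applies an aggregation to a sliding window of entries. The sketch below is a simplified illustration and not the library's implementation: it assumes a window counted in rows (rather than in time), emits `None` until a full window is available, and uses the sample standard deviation. The `rolling` helper is a hypothetical name.

```python
from statistics import mean, stdev


def rolling(values, window, func):
    """Apply func to each trailing window of `window` entries.

    Positions without a full window yield None (an assumption of this sketch).
    """
    results = []
    for i in range(len(values)):
        if i + 1 < window:
            results.append(None)
        else:
            results.append(func(values[i + 1 - window : i + 1]))
    return results


readings = [4, 2, 7, 1, 5]
rolling_count = rolling(readings, 3, len)
rolling_max = rolling(readings, 3, max)
rolling_min = rolling(readings, 3, min)
rolling_mean = rolling(readings, 3, mean)
rolling_std = rolling(readings, 3, stdev)

print(rolling_max)
print(rolling_min)
print(rolling_mean)
```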
NaturalLanguage Transform Primitives¶
Calculates the number of characters in a string. |
|
|
Determines the number of words in a string by counting the spaces. |
Location Transform Primitives¶
|
Calculates the distance between points in a city road grid. |
Determines the geographic center of two coordinates. |
|
|
Calculates the approximate haversine distance between two LatLong columns. |
|
Determines if coordinates are inside a box defined by two corner coordinate points. |
|
Returns the first tuple value in a list of LatLong tuples. |
Returns the second tuple value in a list of LatLong tuples. |
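For reference, the haversine distance and the bounding-box check can be sketched as follows. This is a generic implementation of the standard haversine formula, not code taken from the library; the Earth radius constant (6371 km), the function names, and the simple arithmetic midpoint are assumptions of this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius for this sketch


def haversine(p1, p2, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, p1)
    lat2, lon2 = map(math.radians, p2)
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * radius * math.asin(math.sqrt(a))


def midpoint(p1, p2):
    """Arithmetic mean of two coordinates (a simplification, not geodesic)."""
    return ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2)


def is_in_geobox(point, corner1, corner2):
    """True if point lies inside the box spanned by two corner coordinates."""
    lat_lo, lat_hi = sorted((corner1[0], corner2[0]))
    lon_lo, lon_hi = sorted((corner1[1], corner2[1]))
    return lat_lo <= point[0] <= lat_hi and lon_lo <= point[1] <= lon_hi


boston = (42.36, -71.06)
new_york = (40.71, -74.01)
print(haversine(boston, new_york))
print(midpoint(boston, new_york))
print(is_in_geobox(boston, (40.0, -75.0), (43.0, -70.0)))
```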
Cumulative Transform Primitives¶
|
Computes the difference between the value in a list and the previous value in that list. |
|
Computes the time since the previous entry in a list. |
|
Calculates the cumulative count. |
|
Calculates the cumulative sum. |
|
Calculates the cumulative mean. |
|
Calculates the cumulative minimum. |
|
Calculates the cumulative maximum. |
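The cumulative calculations above map directly onto `itertools.accumulate`. The following is a standard-library sketch of the arithmetic, not the library's own code; representing the first difference as `None` is an assumption of this example.

```python
from itertools import accumulate

values = [3, 1, 4, 1, 5]

cum_count = list(range(1, len(values) + 1))
cum_sum = list(accumulate(values))
cum_min = list(accumulate(values, min))
cum_max = list(accumulate(values, max))
cum_mean = [total / n for total, n in zip(cum_sum, cum_count)]

# Difference from the previous value; the first entry has no predecessor.
diff = [None] + [curr - prev for prev, curr in zip(values, values[1:])]

print(cum_sum)
print(cum_min)
print(cum_max)
print(cum_mean)
print(diff)
```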
Natural Language Processing Primitives¶
Natural Language Processing primitives create features for textual data. For more information on how to use and install these primitives, see here.
Primitives in standard install¶
|
Determines how many times a given string shows up in a text field. |
Calculates the overall complexity of the text based on the total |
|
|
Calculates the Latent Semantic Analysis values of NaturalLanguage input.
Determines the mean number of characters per word. |
|
|
Determines the median word length. |
|
Calculates the number of unique separators. |
|
Determines the number of common words in a string. |
Calculates the occurrences of each different part of speech.
|
Calculates the polarity of a text on a scale from -1 (negative) to 1 (positive) |
|
Determines the number of punctuation characters in a string. |
|
Determines the number of stopwords in a string. |
|
Determines the number of title words in a string. |
|
|
Determines the total word length. |
Calculates the number of upper case letters in text. |
|
Calculates the number of whitespace characters in a string. |
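Several of the simpler text statistics above can be approximated with built-in string methods. The sketch below is illustrative only: the `text_stats` helper and its keys are hypothetical, words are split on whitespace, and punctuation is stripped before measuring word length, none of which is guaranteed to match the primitives' exact tokenization.

```python
import string


def text_stats(text):
    """Compute simple character- and word-level counts for one string."""
    words = text.split()
    word_lengths = [len(w.strip(string.punctuation)) for w in words]
    return {
        "punctuation_count": sum(ch in string.punctuation for ch in text),
        "whitespace_count": sum(ch.isspace() for ch in text),
        "upper_case_count": sum(ch.isupper() for ch in text),
        "title_word_count": sum(w.istitle() for w in words),
        "total_word_length": sum(word_lengths),
        "mean_characters_per_word": (
            sum(word_lengths) / len(words) if words else 0.0
        ),
    }


stats = text_stats("Hello World, this is Featuretools!")
print(stats)
```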
Primitives that require installing tensorflow¶
|
Transforms a sentence or short paragraph using deep contextualized language representations. |
Transforms a sentence or short paragraph to a vector using [tfhub model](https://tfhub.dev/google/universal-sentence-encoder/2) |
Feature methods¶
|
Renames the feature and returns a copy. |
|
Returns depth of feature |
Feature calculation¶
|
Calculates a matrix for a given set of instance ids and calculation times. |
Feature descriptions¶
|
Generates an English language description of a feature. |
Feature visualization¶
|
Generates a feature lineage graph for the given feature |
Feature encoding¶
|
Encode categorical features |
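Encoding a categorical feature typically means replacing it with one indicator column per category. The sketch below shows plain one-hot encoding with the standard library; it is a generic illustration, and the `one_hot_encode` helper, its output layout, and the sorted category order are assumptions rather than the library's behavior.

```python
def one_hot_encode(values):
    """Map a list of categories to {category: 0/1 indicator list}."""
    categories = sorted(set(values))
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


encoded = one_hot_encode(["web", "mobile", "web", "tablet"])
for category, column in encoded.items():
    print(category, column)
```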
Feature Selection¶
|
Select features that have at least 2 unique values and that are not all null |
|
Removes columns in feature matrix that are highly correlated with another column. |
|
Removes columns from a feature matrix that have higher than a set threshold of null values. |
|
Removes columns in feature matrix where all the values are the same. |
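Two of the selection rules above can be sketched on a feature matrix stored as a dictionary of columns. This is a simplified stand-in, not the library's functions: the helper names, the dictionary representation, the explicit `threshold` argument, and the choice to ignore nulls when counting distinct values are assumptions of this example.

```python
import math


def is_null(value):
    """Treat None and float NaN as null."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def remove_highly_null(matrix, threshold):
    """Drop columns whose null fraction exceeds threshold (columns non-empty)."""
    return {
        name: col
        for name, col in matrix.items()
        if sum(is_null(v) for v in col) / len(col) <= threshold
    }


def remove_single_value(matrix):
    """Drop columns where all non-null values are the same."""
    return {
        name: col
        for name, col in matrix.items()
        if len({v for v in col if not is_null(v)}) > 1
    }


matrix = {
    "amount": [10.0, 20.0, None, 5.0],
    "country": ["US", "US", "US", "US"],
    "notes": [None, None, None, "x"],
}
selected = remove_single_value(remove_highly_null(matrix, threshold=0.5))
print(list(selected))
```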
Feature Matrix utils¶
|
Replace all |
Saving and Loading Features¶
|
Saves the features list as JSON to a specified filepath/S3 path, writes to an open file, or returns the serialized features as a JSON string. |
|
Loads the features from a filepath, S3 path, URL, an open file, or a JSON formatted string. |
EntitySet, Relationship¶
Constructors¶
|
Stores all actual data and typing information for an entityset |
|
Class to represent a relationship between dataframes |
EntitySet load and prepare data¶
|
Add a DataFrame to the EntitySet with Woodwork typing information. |
Find or set interesting values for categorical columns, to be used to generate "where" clauses.
|
Calculates the last time index values for each dataframe (the last time an instance or children of that instance were observed). |
|
|
Add a new relationship between dataframes in the entityset. |
|
Add multiple new relationships to an entityset. |
|
Combine entityset with another to create a new entityset with the combined data of both entitysets. |
|
Create a new dataframe and relationship from unique values of an existing column. |
Set the secondary time index for a dataframe in the EntitySet using its dataframe name. |
|
|
Replace the internal dataframe of an EntitySet table, keeping Woodwork typing information the same. |
EntitySet serialization¶
|
Read entityset from disk, S3 path, or URL. |
|
Write entityset to disk in the csv format, location specified by path. |
|
Write entityset in the pickle format, location specified by path. |
|
Write entityset to disk in the parquet format, location specified by path. |
EntitySet query methods¶
|
Get dataframe instance from entityset |
Generator which yields all backward paths between a start and goal dataframe. |
|
Generator which yields all forward paths between a start and goal dataframe. |
|
|
Get dataframes that are in a forward relationship with dataframe |
|
Get dataframes that are in a backward relationship with dataframe |
|
Query instances that have column with given value |
EntitySet visualization¶
|
Create a UML diagram-ish graph of the EntitySet. |
Relationship attributes¶
Column in parent dataframe |
|
Column in child dataframe |
|
Parent dataframe object |
|
Child dataframe object |
Data Type Util Methods¶
Returns a dataframe describing all of the available Logical Types. |
|
Returns a dataframe describing all of the common semantic tags. |
Primitive Util Methods¶
Returns a DataFrame that lists and describes each built-in primitive. |
|
Returns a metrics summary DataFrame of all primitives found in list_primitives. |