API Reference¶
Demo Datasets¶
  | 
Returns the retail entityset example.  | 
  | 
Returns dataframes of mock customer data.  | 
  | 
Download, clean, and filter flight data from 2017.  | 
  | 
Load the Australian daily-min-temperatures weather dataset.  | 
Deep Feature Synthesis¶
  | 
Calculates a feature matrix and features given a dictionary of dataframes and a list of relationships.  | 
  | 
Returns two lists of primitives (transform and aggregation) containing primitives that can be applied to the specific target dataframe to create features.  | 
Timedelta¶
  | 
Represents differences in time.  | 
Time utils¶
  | 
Makes a set of equally spaced cutoff times prior to a set of input cutoffs and instance ids.  | 
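As a rough illustration of "equally spaced cutoff times prior to an input cutoff", the standard-library sketch below builds such a series for one instance id. The function name `spaced_cutoffs` and its parameters are hypothetical, not the library's signature. It also assumes that the input cutoff itself is the final time in the series.

```python
from datetime import datetime, timedelta


def spaced_cutoffs(instance_id, cutoff, window_size, num_windows):
    """Return (instance_id, time) pairs spaced window_size apart, oldest first.

    Assumption of this sketch: the input cutoff is the last time in the series.
    """
    return [
        (instance_id, cutoff - window_size * steps_back)
        for steps_back in range(num_windows - 1, -1, -1)
    ]


cutoffs = spaced_cutoffs(1, datetime(2020, 1, 10), timedelta(days=1), 3)
for instance_id, time in cutoffs:
    print(instance_id, time.date())
```

Each returned pair can be read as "compute features for this instance using only data available up to this time".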
Feature Primitives¶
A list of all Featuretools primitives can be obtained by visiting primitives.featurelabs.com.
Primitive Types¶
Feature for a dataframe that is based on one or more other features in that dataframe.  | 
|
Aggregation Primitives¶
  | 
Determines the total number of values, excluding NaN.  | 
  | 
Computes the average for a list of values.  | 
  | 
Calculates the sum of all values, ignoring NaN.  | 
  | 
Calculates the smallest value, ignoring NaN values.  | 
  | 
Calculates the highest value, ignoring NaN values.  | 
  | 
Computes the dispersion relative to the mean value, ignoring NaN.  | 
  | 
Determines the middlemost number in a list of values.  | 
  | 
Determines the most commonly repeated value.  | 
  | 
Computes the average number of seconds between consecutive events.  | 
  | 
Calculates the time elapsed since the last datetime (default in seconds).  | 
  | 
Calculates the time elapsed since the first datetime (in seconds).  | 
Determines the number of distinct values, ignoring NaN values.  | 
|
Determines the percent of True values.  | 
|
  | 
Calculates if all values are 'True' in a list.  | 
  | 
Determines if any value is 'True' in a list.  | 
  | 
Determines the first value in a list.  | 
  | 
Determines the last value in a list.  | 
  | 
Computes the extent to which a distribution differs from a normal distribution.  | 
  | 
Calculates the trend of a column over time.  | 
  | 
Calculates the entropy for a categorical column.  | 
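To make a few of the aggregation descriptions above concrete, here is a plain-Python sketch of what "ignoring NaN" and "entropy for a categorical column" can mean for a single list of child values. These helper functions are illustrative only and are not the library's implementation. The use of the natural logarithm for entropy is an assumption of this sketch.

```python
import math
from collections import Counter


def drop_nan(values):
    """Return the values with float NaN entries removed."""
    return [v for v in values if not (isinstance(v, float) and math.isnan(v))]


def count(values):
    """Number of values, excluding NaN."""
    return len(drop_nan(values))


def mean(values):
    """Average of the non-NaN values, or NaN if none remain."""
    kept = drop_nan(values)
    return sum(kept) / len(kept) if kept else math.nan


def mode(values):
    """Most commonly repeated non-NaN value, or None if none remain."""
    kept = drop_nan(values)
    return Counter(kept).most_common(1)[0][0] if kept else None


def entropy(categories):
    """Shannon entropy in nats (natural log is an assumption of this sketch)."""
    counts = Counter(categories)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


amounts = [1.0, math.nan, 3.0]
print(count(amounts), mean(amounts))
print(mode(["a", "b", "a"]), entropy(["a", "b", "a", "b"]))
```

An aggregation primitive collapses the many child rows that belong to one parent instance into a single value, which is why each helper takes a list and returns a scalar.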
Transform Primitives¶
Combine features¶
  | 
Determines whether a value is present in a provided list.  | 
  | 
Element-wise logical AND of two lists.  | 
  | 
Element-wise logical OR of two lists.  | 
  | 
Negates a boolean value.  | 
General Transform Primitives¶
  | 
Computes the absolute value of a number.  | 
Computes the square root of a number.  | 
|
Computes the natural logarithm of a number.  | 
|
  | 
Computes the sine of a number.  | 
  | 
Computes the cosine of a number.  | 
  | 
Computes the tangent of a number.  | 
Determines the percentile rank for each value in a list.  | 
|
  | 
Calculates time from a value to a specified cutoff datetime.  | 
Datetime Transform Primitives¶
  | 
Determines the seconds value of a datetime.  | 
  | 
Determines the minutes value of a datetime.  | 
  | 
Determines the day of the week from a datetime.  | 
Determines the is_leap_year attribute of a datetime column.  | 
|
Determines the is_month_end attribute of a datetime column.  | 
|
Determines the is_month_start attribute of a datetime column.  | 
|
Determines the is_quarter_end attribute of a datetime column.  | 
|
Determines the is_quarter_start attribute of a datetime column.  | 
|
Determines if a date falls on a weekend.  | 
|
Determines if a date falls on the end of a year.  | 
|
Determines if a date falls on the start of a year.  | 
|
  | 
Determines the hour value of a datetime.  | 
  | 
Determines the day of the month from a datetime.  | 
Determines the ordinal day of the year from the given datetime  | 
|
Determines the day of the month from a datetime.  | 
|
  | 
Determines the week of the year from a datetime.  | 
  | 
Determines the month value of a datetime.  | 
Determines the part of day of a datetime.  | 
|
  | 
Determines the quarter a datetime column falls into (1, 2, 3, 4)  | 
  | 
Determines the year value of a datetime.  | 
Rolling Transform Primitives¶
  | 
Determines a rolling count of events over a given window.  | 
  | 
Determines the maximum of entries over a given window.  | 
  | 
Calculates the mean of entries over a given window.  | 
  | 
Determines the minimum of entries over a given window.  | 
  | 
Calculates the standard deviation of entries over a given window.  | 
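The idea of computing a statistic "over a given window" can be sketched in plain Python as follows. The function `rolling_mean` is a hypothetical helper, and the sketch makes two simplifying assumptions: the window is a fixed number of rows that includes the current row, and positions without a full window yield `None`. It does not attempt to model any further options the library's rolling primitives may accept.

```python
def rolling_mean(values, window):
    """Mean of the last `window` entries at each position (None until the window fills)."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            # Not enough history yet to fill the window.
            result.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            result.append(sum(chunk) / window)
    return result


print(rolling_mean([1, 2, 3, 4, 5], 3))
```

Swapping `sum(chunk) / window` for `max(chunk)`, `min(chunk)`, or `len(chunk)` gives the analogous rolling maximum, minimum, and count.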
NaturalLanguage Transform Primitives¶
Calculates the number of characters in a string.  | 
|
  | 
Determines the number of words in a string by counting the spaces.  | 
Location Transform Primitives¶
  | 
Calculates the distance between points in a city road grid.  | 
Determines the geographic center of two coordinates.  | 
|
  | 
Calculates the approximate haversine distance between two LatLong columns.  | 
  | 
Determines if coordinates are inside a box defined by two corner coordinate points.  | 
  | 
Returns the first tuple value in a list of LatLong tuples.  | 
Returns the second tuple value in a list of LatLong tuples.  | 
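For readers unfamiliar with the distance measures named above, the sketch below shows the standard haversine formula for two (latitude, longitude) pairs, plus one plausible reading of a "city road grid" distance as the sum of a north-south leg and an east-west leg. The function names, the kilometre unit, and the Earth radius constant are assumptions of this sketch, not the library's definitions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def cityblock_km(a, b):
    """Grid-style distance: a north-south leg plus an east-west leg."""
    north_south = haversine_km(a, (b[0], a[1]))
    east_west = haversine_km(a, (a[0], b[1]))
    return north_south + east_west


print(haversine_km((0.0, 0.0), (1.0, 1.0)))
print(cityblock_km((0.0, 0.0), (1.0, 1.0)))
```

The grid distance is never shorter than the great-circle distance, which matches the intuition that travelling along streets is at least as long as the direct path.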
Cumulative Transform Primitives¶
  | 
Compute the difference between the value in a list and the previous value in that list.  | 
  | 
Compute the time since the previous entry in a list.  | 
  | 
Calculates the cumulative count.  | 
  | 
Calculates the cumulative sum.  | 
  | 
Calculates the cumulative mean.  | 
  | 
Calculates the cumulative minimum.  | 
  | 
Calculates the cumulative maximum.  | 
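The cumulative primitives above each produce one output per input row, where row i depends only on rows 0 through i. A minimal standard-library sketch of that behaviour, using `itertools.accumulate` on a made-up list, looks like this. It illustrates the computation only and is not the library's implementation. Using `None` for the first difference is an assumption of the sketch.

```python
from itertools import accumulate

values = [3, 1, 4, 1, 5]

cum_sum = list(accumulate(values))          # running total
cum_min = list(accumulate(values, min))     # smallest value seen so far
cum_max = list(accumulate(values, max))     # largest value seen so far
cum_count = list(range(1, len(values) + 1)) # number of rows seen so far
cum_mean = [s / n for s, n in zip(cum_sum, cum_count)]

# Difference from the previous value; the first row has no predecessor.
diff = [None] + [b - a for a, b in zip(values, values[1:])]

print(cum_sum, cum_min, cum_max, cum_count, cum_mean, diff, sep="\n")
```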
Natural Language Processing Primitives¶
Natural Language Processing primitives create features for textual data. For more information on how to use and install these primitives, see here.
Primitives in standard install¶
  | 
Determines how many times a given string shows up in a text field.  | 
Calculates the overall complexity of the text based on the total  | 
|
  | 
Calculates the Latent Semantic Analysis Values of NaturalLanguage Input  | 
Determines the mean number of characters per word.  | 
|
  | 
Determines the median word length.  | 
  | 
Calculates the number of unique separators.  | 
  | 
Determines the number of common words in a string.  | 
Calculates the occurrences of each different part of speech.  | 
|
Calculates the polarity of a text on a scale from -1 (negative) to 1 (positive)  | 
|
Determines number of punctuation characters in a string.  | 
|
Determines number of stopwords in a string.  | 
|
Determines the number of title words in a string.  | 
|
  | 
Determines the total word length.  | 
Calculates the number of upper case letters in text.  | 
|
Calculates number of whitespaces in a string.  | 
Primitives that require installing tensorflow¶
  | 
Transforms a sentence or short paragraph using deep contextualized language representations.  | 
Transforms a sentence or short paragraph to a vector using [tfhub model](https://tfhub.dev/google/universal-sentence-encoder/2)  | 
Feature methods¶
  | 
Renames the feature and returns a copy.  | 
  | 
Returns depth of feature  | 
Feature calculation¶
  | 
Calculates a matrix for a given set of instance ids and calculation times.  | 
Feature descriptions¶
  | 
Generates an English language description of a feature.  | 
Feature visualization¶
  | 
Generates a feature lineage graph for the given feature  | 
Feature encoding¶
  | 
Encode categorical features  | 
Feature Selection¶
  | 
Select features that have at least 2 unique values and that are not all null  | 
  | 
Removes columns in feature matrix that are highly correlated with another column.  | 
  | 
Removes columns from a feature matrix that have higher than a set threshold of null values.  | 
  | 
Removes columns in feature matrix where all the values are the same.  | 
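Two of the selection rules above are simple enough to sketch without any library: dropping columns whose share of nulls exceeds a threshold, and dropping columns that hold only one distinct value. The sketch below represents a feature matrix as a dict of column lists and uses `None` for null; the function names, the `threshold` parameter, and its default are hypothetical. Ignoring nulls when counting distinct values is also an assumption.

```python
def null_fraction(values):
    """Share of entries that are None."""
    return sum(v is None for v in values) / len(values)


def drop_highly_null(columns, threshold=0.5):
    """Keep columns whose null fraction does not exceed the threshold."""
    return {name: vals for name, vals in columns.items() if null_fraction(vals) <= threshold}


def drop_single_value(columns):
    """Keep columns with at least two distinct non-null values."""
    return {
        name: vals
        for name, vals in columns.items()
        if len({v for v in vals if v is not None}) > 1
    }


matrix = {
    "constant": [1, 1, 1, 1],
    "varied": [1, 2, 3, None],
    "mostly_null": [None, None, None, 5],
}
print(list(drop_highly_null(matrix)))
print(list(drop_single_value(matrix)))
```

Both helpers return a new dict and leave the input untouched, so the rules can be chained in either order.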
Feature Matrix utils¶
  | 
Replace all   | 
Saving and Loading Features¶
  | 
Saves the features list as JSON to a specified filepath/S3 path, writes to an open file, or returns the serialized features as a JSON string.  | 
  | 
Loads the features from a filepath, S3 path, URL, an open file, or a JSON formatted string.  | 
EntitySet, Relationship¶
Constructors¶
  | 
Stores all actual data and typing information for an entityset  | 
  | 
Class to represent a relationship between dataframes  | 
EntitySet load and prepare data¶
  | 
Add a DataFrame to the EntitySet with Woodwork typing information.  | 
Find or set interesting values for categorical columns, to be used to generate "where" clauses  | 
|
Calculates the last time index values for each dataframe (the last time an instance or children of that instance were observed).  | 
|
  | 
Add a new relationship between dataframes in the entityset.  | 
  | 
Add multiple new relationships to an entityset.  | 
  | 
Combine entityset with another to create a new entityset with the combined data of both entitysets.  | 
  | 
Create a new dataframe and relationship from unique values of an existing column.  | 
Set the secondary time index for a dataframe in the EntitySet using its dataframe name.  | 
|
  | 
Replace the internal dataframe of an EntitySet table, keeping Woodwork typing information the same.  | 
EntitySet serialization¶
  | 
Read entityset from disk, S3 path, or URL.  | 
  | 
Write entityset to disk in the csv format, location specified by path.  | 
  | 
Write entityset in the pickle format, location specified by path.  | 
  | 
Write entityset to disk in the parquet format, location specified by path.  | 
EntitySet query methods¶
  | 
Get dataframe instance from entityset  | 
Generator which yields all backward paths between a start and goal dataframe.  | 
|
Generator which yields all forward paths between a start and goal dataframe.  | 
|
  | 
Get dataframes that are in a forward relationship with dataframe  | 
  | 
Get dataframes that are in a backward relationship with dataframe  | 
  | 
Query instances that have column with given value  | 
EntitySet visualization¶
  | 
Create a UML diagram-ish graph of the EntitySet.  | 
Relationship attributes¶
Column in parent dataframe  | 
|
Column in child dataframe  | 
|
Parent dataframe object  | 
|
Child dataframe object  | 
Data Type Util Methods¶
Returns a dataframe describing all of the available Logical Types.  | 
|
Returns a dataframe describing all of the common semantic tags.  | 
Primitive Util Methods¶
Returns a DataFrame that lists and describes each built-in primitive.  | 
|
Returns a metrics summary DataFrame of all primitives found in list_primitives.  |