NOTICE

The upcoming release of Featuretools 1.0.0 contains several breaking changes. Users are encouraged to test this version prior to release by installing from GitHub:

pip install https://github.com/alteryx/featuretools/archive/woodwork-integration.zip

For details on migrating to the new version, refer to Transitioning to Featuretools Version 1.0. Please report any issues in the Featuretools GitHub repo or by messaging in Alteryx Open Source Slack.


featuretools.EntitySet

class featuretools.EntitySet(id=None, dataframes=None, relationships=None)[source]

Stores all actual data and typing information for an entityset

Attributes:

id
dataframe_dict
relationships
time_type

Properties:

metadata

__init__(id=None, dataframes=None, relationships=None)[source]

Creates EntitySet

Parameters
  • id (str) – Unique identifier to associate with this instance

  • dataframes (dict[str -> tuple(DataFrame, str, str, dict[str -> str/Woodwork.LogicalType], dict[str->str/set], boolean)]) – Dictionary of DataFrames. Entries take the format {dataframe name -> (dataframe, index column, time_index, logical_types, semantic_tags, make_index)}. Note that only the dataframe is required. If a Woodwork DataFrame is supplied, any other parameters will be ignored.

  • relationships (list[(str, str, str, str)]) – List of relationships between dataframes. List items are a tuple with the format (parent dataframe name, parent column, child dataframe name, child column).

Example

dataframes = {
    "cards" : (card_df, "id"),
    "transactions" : (transactions_df, "id", "transaction_time")
}

relationships = [("cards", "id", "transactions", "card_id")]

ft.EntitySet("my-entity-set", dataframes, relationships)

Methods

__init__([id, dataframes, relationships])

Creates EntitySet

add_dataframe(dataframe[, dataframe_name, …])

Add a DataFrame to the EntitySet with Woodwork typing information.

add_interesting_values([max_values, …])

Find or set interesting values for categorical columns, to be used to generate “where” clauses

add_last_time_indexes([updated_dataframes])

Calculates the last time index values for each dataframe (the last time an instance or children of that instance were observed).

add_relationship([parent_dataframe_name, …])

Add a new relationship between dataframes in the entityset.

add_relationships(relationships)

Add multiple new relationships to an entityset.

concat(other[, inplace])

Combine this entityset with another to create a new entityset containing the combined data of both.

find_backward_paths(start_dataframe_name, …)

Generator which yields all backward paths between a start and goal dataframe.

find_forward_paths(start_dataframe_name, …)

Generator which yields all forward paths between a start and goal dataframe.

get_backward_dataframes(dataframe_name[, deep])

Get dataframes that are in a backward relationship with dataframe

get_backward_relationships(dataframe_name)

Get relationships where dataframe “dataframe_name” is the parent.

get_forward_dataframes(dataframe_name[, deep])

Get dataframes that are in a forward relationship with dataframe

get_forward_relationships(dataframe_name)

Get relationships where dataframe “dataframe_name” is the child

has_unique_forward_path(…)

Is the forward path from start to end unique?

normalize_dataframe(base_dataframe_name, …)

Create a new dataframe and relationship from unique values of an existing column.

plot([to_file])

Create a UML diagram-ish graph of the EntitySet.

query_by_values(dataframe_name, instance_vals)

Query instances that have a column with the given value.

replace_dataframe(dataframe_name, df[, …])

Replace the internal dataframe of an EntitySet table, keeping Woodwork typing information the same.

reset_data_description()

set_secondary_time_index(dataframe_name, …)

Set the secondary time index for a dataframe in the EntitySet using its dataframe name.

to_csv(path[, sep, encoding, engine, …])

Write entityset to disk in CSV format, location specified by path.

to_dictionary()

to_parquet(path[, engine, compression, …])

Write entityset to disk in the parquet format, location specified by path.

to_pickle(path[, compression, profile_name])

Write entityset in the pickle format, location specified by path.

Attributes

dataframe_type

String specifying the library used for the dataframes.

dataframes

metadata

Returns the metadata for this EntitySet.