featuretools.EntitySet
Stores all actual data and typing information for an EntitySet.
id
dataframe_dict
relationships
time_type
metadata
__init__
Creates EntitySet
id (str) – Unique identifier to associate with this instance
dataframes (dict[str -> tuple(DataFrame, str, str, dict[str -> str/Woodwork.LogicalType], dict[str->str/set], boolean)]) – Dictionary of DataFrames. Entries take the format {dataframe name -> (dataframe, index column, time_index, logical_types, semantic_tags, make_index)}. Note that only the dataframe is required. If a Woodwork DataFrame is supplied, any other parameters will be ignored.
relationships (list[(str, str, str, str)]) – List of relationships between dataframes. List items are a tuple with the format (parent dataframe name, parent column, child dataframe name, child column).
Example
dataframes = {
    "cards": (card_df, "id"),
    "transactions": (transactions_df, "id", "transaction_time"),
}
relationships = [("cards", "id", "transactions", "card_id")]
ft.EntitySet("my-entity-set", dataframes, relationships)
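The tuple format described for `dataframes` can be read as positional fields in which everything after the dataframe is optional. The following is a minimal, library-free sketch of that reading. `DataFrameSpec` and `unpack_spec` are hypothetical names used only for illustration, the defaults for omitted fields are assumptions, and plain lists stand in for real DataFrames. It is not how featuretools parses the argument internally.

```python
from typing import Any, NamedTuple, Optional


class DataFrameSpec(NamedTuple):
    """Hypothetical container mirroring the documented tuple order."""
    dataframe: Any
    index: Optional[str] = None
    time_index: Optional[str] = None
    logical_types: Optional[dict] = None
    semantic_tags: Optional[dict] = None
    make_index: bool = False  # assumed default, for illustration only


def unpack_spec(entry: tuple) -> DataFrameSpec:
    """Map a 1- to 6-item tuple onto named fields; only the first item is required."""
    if not 1 <= len(entry) <= 6:
        raise ValueError("expected between 1 and 6 items")
    return DataFrameSpec(*entry)


# Stand-ins for real DataFrames (lists of row dicts).
card_df = [{"id": 1}]
transactions_df = [{"id": 10, "card_id": 1, "transaction_time": "2024-01-01"}]

dataframes = {
    "cards": (card_df, "id"),
    "transactions": (transactions_df, "id", "transaction_time"),
}
specs = {name: unpack_spec(entry) for name, entry in dataframes.items()}
```

Here `specs["cards"]` carries an index column but no time index, while `specs["transactions"]` carries both, matching the two entries in the example.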
Methods
__init__([id, dataframes, relationships])
add_dataframe(dataframe[, dataframe_name, …])
Add a DataFrame to the EntitySet with Woodwork typing information.
add_interesting_values([max_values, …])
Find or set interesting values for categorical columns, to be used to generate “where” clauses.
add_last_time_indexes([updated_dataframes])
Calculates the last time index values for each dataframe (the last time an instance or children of that instance were observed).
add_relationship([parent_dataframe_name, …])
Add a new relationship between dataframes in the entityset.
add_relationships(relationships)
Add multiple new relationships to an EntitySet.
concat(other[, inplace])
Combine entityset with another to create a new entityset with the combined data of both entitysets.
find_backward_paths(start_dataframe_name, …)
Generator which yields all backward paths between a start and goal dataframe.
find_forward_paths(start_dataframe_name, …)
Generator which yields all forward paths between a start and goal dataframe.
get_backward_dataframes(dataframe_name[, deep])
Get dataframes that are in a backward relationship with the given dataframe.
get_backward_relationships(dataframe_name)
Get relationships where dataframe “dataframe_name” is the parent.
get_forward_dataframes(dataframe_name[, deep])
Get dataframes that are in a forward relationship with the given dataframe.
get_forward_relationships(dataframe_name)
Get relationships where dataframe “dataframe_name” is the child.
has_unique_forward_path(…)
Is the forward path from start to end unique?
normalize_dataframe(base_dataframe_name, …)
Create a new dataframe and relationship from unique values of an existing column.
plot([to_file])
Create a UML diagram-ish graph of the EntitySet.
query_by_values(dataframe_name, instance_vals)
Query instances that have a column with the given value.
replace_dataframe(dataframe_name, df[, …])
Replace the internal dataframe of an EntitySet table, keeping Woodwork typing information the same.
reset_data_description()
set_secondary_time_index(dataframe_name, …)
Set the secondary time index for a dataframe in the EntitySet using its dataframe name.
to_csv(path[, sep, encoding, engine, …])
Write entityset to disk in the csv format, location specified by path.
to_dictionary()
to_parquet(path[, engine, compression, …])
Write entityset to disk in the parquet format, location specified by path.
to_pickle(path[, compression, profile_name])
Write entityset in the pickle format, location specified by path.
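Several of the methods listed here describe traversal over the relationship graph. As a rough, library-free illustration of the terminology (forward runs from child to parent, backward from parent to child, following the descriptions of get_forward_relationships and get_backward_relationships), the sketch below works directly on the documented 4-tuple relationship format. The helper functions only loosely mirror the method names; they are hypothetical, not featuretools' implementation, and the "customers" dataframe is invented for the example.

```python
# Relationships as (parent dataframe, parent column, child dataframe, child column).
relationships = [
    ("customers", "id", "cards", "customer_id"),  # invented for illustration
    ("cards", "id", "transactions", "card_id"),
]


def forward_relationships(name):
    """Relationships in which `name` is the child."""
    return [r for r in relationships if r[2] == name]


def backward_relationships(name):
    """Relationships in which `name` is the parent."""
    return [r for r in relationships if r[0] == name]


def forward_paths(start, goal, seen=frozenset()):
    """Yield every child-to-parent path from `start` to `goal` as a list of relationships."""
    if start == goal:
        yield []
        return
    for rel in forward_relationships(start):
        parent = rel[0]
        if parent in seen:  # guard against cycles
            continue
        for rest in forward_paths(parent, goal, seen | {start}):
            yield [rel] + rest


def unique_forward_path_exists(start, end):
    """True when exactly one forward path leads from `start` to `end` (sketch only)."""
    return len(list(forward_paths(start, end))) == 1


paths = list(forward_paths("transactions", "customers"))
```

With these two relationships, the only forward path from "transactions" to "customers" passes through "cards", and no forward path exists in the opposite direction because forward traversal never moves from a parent to its children.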
Attributes
dataframe_type
String specifying the library used for the dataframes.
dataframes
metadata
Returns the metadata for this EntitySet.