featuretools.EntitySet.normalize_dataframe

EntitySet.normalize_dataframe(base_dataframe_name, new_dataframe_name, index, additional_columns=None, copy_columns=None, make_time_index=None, make_secondary_time_index=None, new_dataframe_time_index=None, new_dataframe_secondary_time_index=None)

Create a new dataframe and relationship from unique values of an existing column.

Parameters
  • base_dataframe_name (str) – Name of the dataframe from which to split.

  • new_dataframe_name (str) – Name of the new dataframe.

  • index (str) – Column in the base dataframe that will become the index of the new dataframe. The relationship is created across this column.

  • additional_columns (list[str]) – List of column names to remove from base_dataframe and move to new dataframe.

  • copy_columns (list[str]) – List of column names to copy from the base dataframe to the new dataframe.

  • make_time_index (bool or str, optional) – Create time index for new dataframe based on time index in base_dataframe, optionally specifying which column in base_dataframe to use for time_index. If specified as True without a specific column name, uses the primary time index. Defaults to True if base dataframe has a time index.

  • make_secondary_time_index (dict[str -> list[str]], optional) – Create a secondary time index from key. Values of dictionary are the columns to associate with a secondary time index. Only one secondary time index is allowed. If None, only associate the time index.

  • new_dataframe_time_index (str, optional) – Rename new dataframe time index.

  • new_dataframe_secondary_time_index (str, optional) – Rename new dataframe secondary time index.
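
The effect of index, additional_columns, and copy_columns can be illustrated with a small pandas-only sketch. This is not the featuretools implementation, only an approximation of the split described above, and it ignores time indexes entirely. The dataframe and all column names (transactions, session_id, device, customer_id) are hypothetical.

```python
import pandas as pd

# Hypothetical base dataframe; names and values are illustrative only.
transactions = pd.DataFrame({
    "transaction_id": [1, 2, 3, 4],
    "session_id": [10, 10, 20, 20],
    "device": ["mobile", "mobile", "desktop", "desktop"],
    "customer_id": [100, 100, 200, 200],
    "amount": [5.0, 7.5, 3.0, 9.0],
})

index = "session_id"                # becomes the index of the new dataframe
additional_columns = ["device"]     # moved out of the base dataframe
copy_columns = ["customer_id"]      # copied; also kept in the base dataframe

# New dataframe: one row per unique value of the index column.
sessions = (
    transactions[[index] + additional_columns + copy_columns]
    .drop_duplicates(subset=index)
    .reset_index(drop=True)
)

# Moved columns leave the base dataframe; the index column stays behind
# and acts as the link between the two dataframes.
transactions = transactions.drop(columns=additional_columns)
```

Assuming an EntitySet es that already contains a transactions dataframe with these columns, the corresponding call would be es.normalize_dataframe(base_dataframe_name="transactions", new_dataframe_name="sessions", index="session_id", additional_columns=["device"], copy_columns=["customer_id"]).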