NOTICE

The upcoming release of Featuretools 1.0.0 contains several breaking changes. Users are encouraged to test this version prior to release by installing from GitHub:

pip install https://github.com/alteryx/featuretools/archive/woodwork-integration.zip

For details on migrating to the new version, refer to Transitioning to Featuretools Version 1.0. Please report any issues in the Featuretools GitHub repo or in the Alteryx Open Source Slack.


featuretools.selection.remove_highly_null_features

featuretools.selection.remove_highly_null_features(feature_matrix, features=None, pct_null_threshold=0.95)

Removes columns from a feature matrix whose share of null values exceeds a set threshold.

Parameters
  • feature_matrix (pd.DataFrame) – DataFrame whose columns are feature names and rows are instances.

  • features (list[featuretools.FeatureBase] or list[str], optional) – List of features to select. If provided, a feature list is returned alongside the feature matrix.

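  • pct_null_threshold (float) – Threshold on the share of null values, expressed as a fraction rather than a percentage (for example, 0.95 means 95%). If the fraction of NaN values in an input feature exceeds this amount, that feature is considered highly null. Defaults to 0.95.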

Returns

The feature matrix and the list of feature definitions, in the same form as the output of dfs. If no feature list is provided as input, only the feature matrix is returned.

Return type

pd.DataFrame, list[FeatureBase]
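
The thresholding rule and the two return shapes described above can be illustrated with a small pandas-only sketch. This is not the Featuretools implementation: `drop_highly_null` is a hypothetical stand-in written for illustration, it assumes features are passed as column-name strings, and its handling of a null fraction exactly equal to the threshold follows the "exceeds" wording above rather than any verified library behavior.

```python
import pandas as pd


def drop_highly_null(feature_matrix, features=None, pct_null_threshold=0.95):
    """Hypothetical stand-in for illustration; not the Featuretools code."""
    if not 0 <= pct_null_threshold <= 1:
        raise ValueError("pct_null_threshold must be between 0 and 1")

    # Fraction of null values in each column, between 0.0 and 1.0.
    null_fraction = feature_matrix.isnull().mean()

    # Keep columns whose null fraction does not exceed the threshold.
    # (Treatment of exact ties is an assumption based on the wording above.)
    keep = [name for name in feature_matrix.columns
            if null_fraction[name] <= pct_null_threshold]
    filtered_matrix = feature_matrix[keep]

    # Without a feature list, only the filtered matrix is returned.
    if features is None:
        return filtered_matrix

    # Assumption: features are given as column-name strings.
    kept_features = [f for f in features if f in keep]
    return filtered_matrix, kept_features


feature_matrix = pd.DataFrame({
    "mostly_null": [None, None, None, 1.0],   # 75% null
    "quarter_null": [1.0, None, 2.0, 3.0],    # 25% null
    "complete": [1.0, 2.0, 3.0, 4.0],         # 0% null
})

# Only the matrix comes back when no feature list is passed.
matrix_only = drop_highly_null(feature_matrix, pct_null_threshold=0.5)
print(list(matrix_only.columns))  # ['quarter_null', 'complete']

# Passing a feature list returns a (matrix, features) pair.
matrix, kept = drop_highly_null(
    feature_matrix,
    features=list(feature_matrix.columns),
    pct_null_threshold=0.5,
)
print(kept)  # ['quarter_null', 'complete']
```

With Featuretools installed, the corresponding call under the signature documented above would be featuretools.selection.remove_highly_null_features(feature_matrix, pct_null_threshold=0.5).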