featuretools.selection.remove_highly_null_features

featuretools.selection.remove_highly_null_features(feature_matrix, features=None, pct_null_threshold=0.95)

Removes columns from a feature matrix whose proportion of null values exceeds a set threshold.

Parameters:
  • feature_matrix (pd.DataFrame) – DataFrame whose columns are feature names and rows are instances.

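  • features (list[featuretools.FeatureBase] or list[str], optional) – Feature definitions corresponding to the columns of feature_matrix. If provided, the list of remaining features is returned alongside the matrix.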

  • pct_null_threshold (float) – Proportion of null values, expressed as a fraction between 0 and 1. If the share of NaN values in an input feature exceeds this amount, that feature is considered highly null and is removed. Defaults to 0.95.

Returns:

The feature matrix with highly null columns removed and, if a feature list was provided, the list of feature definitions that remain. This matches the form of dfs output. If no feature list is provided as input, only the feature matrix is returned.

Return type:

pd.DataFrame, list[FeatureBase]
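
Example:

The following is a minimal sketch of the idea behind pct_null_threshold, written with plain pandas rather than with featuretools itself. It is not the library's implementation. The column names and the threshold value of 0.6 are made up for illustration, and "exceeds" is read here as strictly greater than the threshold, which is an assumption; the library's treatment of a column that sits exactly at the threshold may differ.

```python
import numpy as np
import pandas as pd

# Toy feature matrix: columns are features, rows are instances.
# Column names are hypothetical.
feature_matrix = pd.DataFrame(
    {
        "mostly_null": [np.nan, np.nan, np.nan, 1.0],  # 3 of 4 values null
        "half_null": [np.nan, 2.0, np.nan, 4.0],       # 2 of 4 values null
        "complete": [1.0, 2.0, 3.0, 4.0],              # no null values
    }
)

pct_null_threshold = 0.6

# Fraction of null values in each column (a value between 0 and 1).
null_fraction = feature_matrix.isnull().mean()

# Assumption: a column is "highly null" when its null fraction is
# strictly greater than the threshold.
highly_null = null_fraction[null_fraction > pct_null_threshold].index
reduced_matrix = feature_matrix.drop(columns=highly_null)

print(list(reduced_matrix.columns))  # ['half_null', 'complete']
```

With featuretools installed, the corresponding call would be remove_highly_null_features(feature_matrix, pct_null_threshold=0.6), using the signature documented above.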