featuretools.selection.remove_highly_null_features

featuretools.selection.remove_highly_null_features(feature_matrix, features=None, pct_null_threshold=0.95)

Removes columns from a feature matrix whose proportion of null values exceeds a set threshold.

Parameters
  • feature_matrix (pd.DataFrame) – DataFrame whose columns are feature names and rows are instances.

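  • features (list[featuretools.FeatureBase] or list[str], optional) – List of feature definitions corresponding to the columns of feature_matrix. If provided, the list is filtered along with the matrix and returned.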

  • pct_null_threshold (float) – Fraction of null values, between 0 and 1, above which a feature is considered highly null and removed. Despite the parameter name, the value is a fraction rather than a percentage. Defaults to 0.95.

Returns

The filtered feature matrix and the list of feature definitions for the remaining columns, in the same form as dfs output. If no feature list is provided as input, only the feature matrix is returned.

Return type

pd.DataFrame, list[FeatureBase]
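
Example

The threshold behavior described above can be illustrated with plain pandas. This is a minimal sketch, not the library's implementation: the helper name drop_highly_null and the sample data are hypothetical, and it assumes a column is removed only when its null fraction is strictly greater than the threshold.

```python
import pandas as pd


def drop_highly_null(feature_matrix, pct_null_threshold=0.95):
    """Hypothetical helper: keep columns whose null fraction is at or below the threshold."""
    # isnull() marks missing cells; mean() over each column gives its null fraction.
    null_fraction = feature_matrix.isnull().mean()
    # Assumption: a column exactly at the threshold is kept.
    keep = null_fraction[null_fraction <= pct_null_threshold].index
    return feature_matrix[keep]


nan = float("nan")
feature_matrix = pd.DataFrame(
    {
        "all_null": [nan, nan, nan, nan],     # null fraction 1.00
        "mostly_null": [nan, nan, nan, 1.0],  # null fraction 0.75
        "half_null": [nan, 2.0, nan, 4.0],    # null fraction 0.50
        "no_null": [1, 2, 3, 4],              # null fraction 0.00
    }
)

filtered = drop_highly_null(feature_matrix, pct_null_threshold=0.5)
print(list(filtered.columns))
```

With the default threshold of 0.95, only the all_null column in this sample would be dropped. The equivalent library call, following the signature above, would be remove_highly_null_features(feature_matrix, pct_null_threshold=0.5).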