featuretools.selection.remove_highly_null_features
- featuretools.selection.remove_highly_null_features(feature_matrix, features=None, pct_null_threshold=0.95)
Removes columns from a feature matrix whose proportion of null values exceeds a set threshold.
- Parameters:
feature_matrix (pd.DataFrame) – DataFrame whose columns are feature names and rows are instances.
features (list[featuretools.FeatureBase] or list[str], optional) – List of features to select.
pct_null_threshold (float) – If the percentage of NaN values in an input feature exceeds this amount, that feature will be considered highly-null. Defaults to 0.95.
- Returns:
The feature matrix with the highly-null columns removed and, if a feature list was provided as input, the list of feature definitions, in the same form as the output of dfs. If no feature list is provided as input, the feature list will not be returned.
- Return type:
pd.DataFrame, list[FeatureBase]
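
The thresholding described above can be illustrated with plain pandas. This is a minimal sketch of the idea, not the featuretools implementation: the helper `drop_highly_null` and the sample columns are hypothetical, and the way the library treats a column whose null fraction equals the threshold exactly is not covered here.

```python
import numpy as np
import pandas as pd

# Hypothetical feature matrix: rows are instances, columns are features.
feature_matrix = pd.DataFrame(
    {
        "age": [31, 45, 27, 52],                             # no nulls
        "income": [52000.0, np.nan, 61000.0, 48000.0],       # 1 of 4 null
        "referral_code": [np.nan, np.nan, np.nan, np.nan],   # all null
    }
)


def drop_highly_null(df, pct_null_threshold=0.95):
    """Illustrative helper (an assumption, not the library's code).

    Keeps the columns whose fraction of null values does not exceed
    pct_null_threshold.
    """
    # isnull().mean() gives the fraction of null entries per column.
    null_fraction = df.isnull().mean()
    keep = null_fraction[null_fraction <= pct_null_threshold].index
    return df[keep]


filtered = drop_highly_null(feature_matrix, pct_null_threshold=0.5)
print(list(filtered.columns))
```

With featuretools installed, the corresponding call would be `remove_highly_null_features(feature_matrix, pct_null_threshold=0.5)`; passing `features` as well returns the filtered feature list alongside the matrix, as described under Returns.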