featuretools.primitives.NumberOfHashtags
- class featuretools.primitives.NumberOfHashtags
- Determines the number of hashtags in a string.
- Description:
- Given a list of strings, determine the number of hashtags in each string.
- A hashtag is defined as a string that meets the following criteria:
- Starts with a ‘#’ character, followed by a sequence of alphanumeric characters containing at least one alphabetic character 
- Present at the start of a string or after whitespace 
- Terminated by the end of the string, a whitespace, or a punctuation character other than ‘#’
- e.g. The string ‘#yes-no’ contains a valid hashtag (‘#yes’) 
- e.g. The string ‘#yes#’ does not contain a valid hashtag 
 
 
 
- This implementation handles Unicode characters.
- This implementation does not impose any character limit on hashtags.
- If a string is missing, return NaN.
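The criteria above can be approximated with a single regular expression. The following is a minimal sketch written only from the documented rules; it is not the library's own implementation. The names `HASHTAG` and `count_hashtags` are hypothetical, and the sketch assumes that an underscore counts as punctuation rather than as an alphanumeric character, which may differ from how the primitive treats it.

```python
import math
import re

# Hypothetical pattern derived from the documented criteria,
# not the regex used inside featuretools.
HASHTAG = re.compile(
    r"(?<!\S)"              # at the start of the string or right after whitespace
    r"#"                    # the hashtag marker
    r"(?=[^\W_]*[^\W\d_])"  # the alphanumeric run must contain at least one letter
    r"[^\W_]+"              # the alphanumeric run (Unicode-aware, underscore excluded)
    r"(?![^\W_]|#)"         # ends at end of string, whitespace, or punctuation other than '#'
)


def count_hashtags(text):
    """Return the number of hashtags as a float, or NaN for a missing value."""
    if text is None or (isinstance(text, float) and math.isnan(text)):
        return float("nan")
    return float(len(HASHTAG.findall(text)))


samples = ["#regular #expression", "this is a string", "###__regular#1and_0#expression"]
print([count_hashtags(s) for s in samples])  # [2.0, 0.0, 0.0]
```

Under these assumptions, `count_hashtags("#yes-no")` gives 1.0 and `count_hashtags("#yes#")` gives 0.0, matching the two examples in the criteria list.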
- Examples
>>> from featuretools.primitives import NumberOfHashtags
>>> x = ['#regular #expression', 'this is a string', '###__regular#1and_0#expression']
>>> number_of_hashtags = NumberOfHashtags()
>>> number_of_hashtags(x).tolist()
[2.0, 0.0, 0.0]
- Methods
- __init__()
- flatten_nested_input_types(input_types): Flattens nested column schema inputs into a single list.
- generate_name(base_feature_names)
- generate_names(base_feature_names)
- get_args_string()
- get_arguments()
- get_description(input_column_descriptions[, ...])
- get_filepath(filename)
- get_function()
- process_text(text)
- Attributes
- base_of
- base_of_exclude
- commutative
- default_value: Default value this feature returns if no data found.
- description_template
- input_types: woodwork.ColumnSchema types of inputs
- max_stack_depth
- name: Name of the primitive
- number_output_features: Number of columns in feature matrix associated with this feature
- return_type: ColumnSchema type of return
- stack_on
- stack_on_exclude
- stack_on_self
- uses_calc_time
- uses_full_dataframe