featuretools.primitives.NumUniqueSeparators

class featuretools.primitives.NumUniqueSeparators(separators=[' ', '.', ',', '!', '?', ';', '\n'])

Calculates the number of unique separators.

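Description: Given a string and a list of separators, determine how many distinct separators from the list occur in each string. A separator that appears several times in the same string is counted once. If a string is null as determined by pd.isnull, return pd.NA.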

Parameters: separators (list, optional) – List of separator characters to count. Defaults to [" ", ".", ",", "!", "?", ";", "\n"].

Examples

>>> from featuretools.primitives import NumUniqueSeparators
>>> x = ["First. Line.", "This. is the second, line!", "notinlist@#$%^%&"]
>>> num_unique_separators = NumUniqueSeparators([".", ",", "!"])
>>> num_unique_separators(x).tolist()
[1, 3, 0]
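
The counting rule described above can be sketched in plain pandas. This is only an illustration of the documented behavior, not the featuretools implementation: the helper name `count_unique_separators` and the constant `DEFAULT_SEPARATORS` are hypothetical, and the dtype of the returned Series is an assumption that may differ from what the primitive returns.

```python
import pandas as pd

# Default separators, copied from the signature documented on this page.
DEFAULT_SEPARATORS = [" ", ".", ",", "!", "?", ";", "\n"]


def count_unique_separators(values, separators=None):
    """Hypothetical helper (not part of featuretools).

    For each string, count how many distinct separators from
    `separators` occur in it. Null inputs yield pd.NA.
    """
    if separators is None:
        separators = DEFAULT_SEPARATORS
    separator_set = set(separators)

    def count_one(text):
        # Null values (None, NaN, pd.NA) are passed through as pd.NA.
        if pd.isnull(text):
            return pd.NA
        # Intersecting with a string compares against its characters,
        # so a separator repeated in the text is counted only once.
        return len(separator_set.intersection(text))

    # Object dtype keeps the plain ints and pd.NA exactly as computed.
    return pd.Series([count_one(v) for v in values], dtype="object")


x = ["First. Line.", "This. is the second, line!", "notinlist@#$%^%&"]
print(count_unique_separators(x, [".", ",", "!"]).tolist())  # [1, 3, 0]
```

With the default separator list, the space character is also counted, so the same input gives different counts than with the custom list `[".", ",", "!"]`.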

__init__(separators=[' ', '.', ',', '!', '?', ';', '\n'])

Methods

`__init__`([separators])
`flatten_nested_input_types`(input_types)	Flattens nested column schema inputs into a single list.
`generate_name`(base_feature_names)
`generate_names`(base_feature_names)
`get_args_string`()
`get_arguments`()
`get_description`(input_column_descriptions[, ...])
`get_filepath`(filename)
`get_function`()

Attributes

`base_of`
`base_of_exclude`
`commutative`
`default_value`	Default value this feature returns if no data is found.
`description_template`
`input_types`	woodwork.ColumnSchema types of the inputs
`max_stack_depth`
`name`	Name of the primitive
`number_output_features`	Number of columns in the feature matrix associated with this feature
`return_type`	ColumnSchema type of the return value
`stack_on`
`stack_on_exclude`
`stack_on_self`
`uses_calc_time`
`uses_full_dataframe`