featuretools.primitives.NumberOfUniqueWords

class featuretools.primitives.NumberOfUniqueWords(case_insensitive=False)

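Determines the number of unique words in a string.

Description: Determines the number of unique words in a given string. Matching is case-sensitive by default and can optionally be made case-insensitive.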

Parameters:

case_insensitive (bool, optional) – Specifies whether to ignore case when searching for unique words. For example, setting this to True means "WORD word" is treated as having one unique word. Defaults to False.

Examples

>>> from featuretools.primitives import NumberOfUniqueWords
>>> x = ['Word word Word', 'This is a SENTENCE.', 'green red green']
>>> number_of_unique_words = NumberOfUniqueWords()
>>> number_of_unique_words(x).tolist()
[2, 4, 2]

>>> x = ['word WoRD WORD worD', 'dog dog dog', 'catt CAT caT']
>>> number_of_unique_words = NumberOfUniqueWords(case_insensitive=True)
>>> number_of_unique_words(x).tolist()
[1, 1, 2]
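
The counting behavior shown in the examples above can be approximated in plain Python. This is a minimal sketch, not the library's implementation: the helper name `count_unique_words` is hypothetical, and the tokenization rule (split on whitespace, ignore punctuation at either end of a word) is an assumption chosen because it reproduces the outputs listed above.

```python
import string


def count_unique_words(text, case_insensitive=False):
    """Count distinct words in ``text`` (illustrative helper, not part of featuretools)."""
    if case_insensitive:
        # Fold case first so that "WORD" and "word" collapse into one entry.
        text = text.lower()
    # Assumption: words are separated by whitespace, and punctuation attached
    # to either end of a word (such as a trailing period) is ignored.
    words = (token.strip(string.punctuation) for token in text.split())
    # A set keeps one copy of each word; empty tokens are dropped.
    return len({word for word in words if word})


case_sensitive = [
    count_unique_words(s)
    for s in ['Word word Word', 'This is a SENTENCE.', 'green red green']
]
case_folded = [
    count_unique_words(s, case_insensitive=True)
    for s in ['word WoRD WORD worD', 'dog dog dog', 'catt CAT caT']
]
print(case_sensitive)  # [2, 4, 2]
print(case_folded)     # [1, 1, 2]
```

Lowercasing before splitting is what makes "WORD word" count as a single word; without it the two spellings stay distinct, which is why the first example reports 2 for 'Word word Word'.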

__init__(case_insensitive=False)

Methods

`__init__`([case_insensitive])
`flatten_nested_input_types`(input_types)	Flattens nested column schema inputs into a single list.
`generate_name`(base_feature_names)
`generate_names`(base_feature_names)
`get_args_string`()
`get_arguments`()
`get_description`(input_column_descriptions[, ...])
`get_filepath`(filename)
`get_function`()

Attributes

`base_of`
`base_of_exclude`
`commutative`
`default_value`	Default value this feature returns if no data is found.
`description_template`
`input_types`	woodwork.ColumnSchema types of the inputs
`max_stack_depth`
`name`	Name of the primitive
`number_output_features`	Number of columns in the feature matrix associated with this feature
`return_type`	ColumnSchema type of the return value
`stack_on`
`stack_on_exclude`
`stack_on_self`
`uses_calc_time`
`uses_full_dataframe`
