TextSimilarity usecases

class previsionio.text_similarity.DescriptionsColumnConfig(content_column, id_column)

Bases: previsionio.usecase_config.UsecaseConfig

Description Column configuration for starting a usecase: this object defines the role of specific columns in the dataset.

Parameters:
  • content_column (str, required) – Name of the content column in the description dataset
  • id_column (str, optional) – Name of the id column in the description dataset
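
As a minimal sketch, the two column names could be checked against the dataset header before the configuration is built. The helpers `description_config_kwargs` and `build_description_config`, and the column names `item_id` and `item_desc`, are hypothetical; only the `content_column` and `id_column` keyword names come from the signature above. The `previsionio` import is deferred so the validation step runs without the SDK installed.

```python
from typing import Dict, Sequence


def description_config_kwargs(
    columns: Sequence[str], content_column: str, id_column: str
) -> Dict[str, str]:
    """Check that both columns exist in the dataset header and return
    keyword arguments matching the DescriptionsColumnConfig signature."""
    missing = [name for name in (content_column, id_column) if name not in columns]
    if missing:
        raise ValueError(f"columns not found in description dataset: {missing}")
    return {"content_column": content_column, "id_column": id_column}


def build_description_config(columns, content_column, id_column):
    # Deferred import: this call requires the previsionio package.
    from previsionio.text_similarity import DescriptionsColumnConfig

    return DescriptionsColumnConfig(
        **description_config_kwargs(columns, content_column, id_column)
    )


# Hypothetical header of a description dataset.
header = ["item_id", "item_desc", "category"]
kwargs = description_config_kwargs(
    header, content_column="item_desc", id_column="item_id"
)
```

Validating the names locally is a design choice, not something the SDK is documented to require; it only surfaces a misspelled column before a training is started.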

class previsionio.text_similarity.ModelsParameters(model_embedding='tf_idf', preprocessing=<previsionio.text_similarity.Preprocessing object>, models=['brute_force'])

Bases: previsionio.usecase_config.UsecaseConfig

Training configuration that holds the relevant data for a usecase description: the desired feature engineering, the selected models, the training speed…

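Parameters:
  • model_embedding (str, optional) – Embedding model to use (default: 'tf_idf')
  • preprocessing (Preprocessing, optional) – Text preprocessing configuration (default: a Preprocessing object)
  • models (list, optional) – Models to train (default: ['brute_force'])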

class previsionio.text_similarity.QueriesColumnConfig(queries_dataset_content_column, queries_dataset_matching_id_description_column, queries_dataset_id_column=None)

Bases: previsionio.usecase_config.UsecaseConfig

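Queries column configuration for starting a usecase: this object defines the role of specific columns in the queries dataset.

Parameters:
  • queries_dataset_content_column (str, required) – Name of the content column in the queries dataset
  • queries_dataset_matching_id_description_column (str, required) – Name of the column in the queries dataset that holds the id of the matching description
  • queries_dataset_id_column (str, optional) – Name of the id column in the queries dataset (default: None)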
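
The sketch below shows one way to assemble the keyword arguments for a `QueriesColumnConfig` and to check that every query points at a known description. It assumes that the matching-id column of the queries dataset stores ids taken from the description dataset; that reading is inferred from the parameter name, not confirmed by this page. The helper names, the sample rows and the column names are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Optional


def queries_config_kwargs(
    content_column: str,
    matching_id_column: str,
    id_column: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Return keyword arguments named after the QueriesColumnConfig signature."""
    return {
        "queries_dataset_content_column": content_column,
        "queries_dataset_matching_id_description_column": matching_id_column,
        "queries_dataset_id_column": id_column,
    }


def unmatched_queries(
    queries: Iterable[Dict[str, Any]],
    description_ids: Iterable[Any],
    matching_id_column: str,
) -> List[Dict[str, Any]]:
    """List the query rows whose matching id is absent from the descriptions."""
    known = set(description_ids)
    return [row for row in queries if row[matching_id_column] not in known]


# Hypothetical sample data.
description_ids = [1, 2, 3]
queries = [
    {"query": "red running shoes", "match_id": 2},
    {"query": "wireless keyboard", "match_id": 7},
]

config_kwargs = queries_config_kwargs("query", "match_id")
orphans = unmatched_queries(queries, description_ids, "match_id")
```

The resulting dictionary can be unpacked into the constructor as `QueriesColumnConfig(**config_kwargs)` once `previsionio` is available.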

class previsionio.text_similarity.TextSimilarity(**usecase_info)

Bases: previsionio.usecase_version.BaseUsecaseVersion

new_version(description: str = None, dataset: previsionio.dataset.Dataset = None, description_column_config: previsionio.text_similarity.DescriptionsColumnConfig = None, metric: previsionio.metrics.TextSimilarity = None, top_k: int = None, lang: str = 'auto', queries_dataset: previsionio.dataset.Dataset = None, queries_column_config: Optional[previsionio.text_similarity.QueriesColumnConfig] = None, models_parameters: previsionio.text_similarity.ListModelsParameters = None, **kwargs) → previsionio.text_similarity.TextSimilarity

Start a text similarity usecase training to create a new version of the usecase on the platform. The training configuration is copied from the current version and then overridden by the given parameters.

Parameters:
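  • description (str, optional) – Additional description of the version
  • dataset (Dataset, DatasetImages, optional) – Reference to the dataset object to use as the training dataset
  • description_column_config (DescriptionsColumnConfig, optional) – Column configuration for the usecase (see the documentation of the ColumnConfig resource for more details on each possible column type)
  • metric (metrics.TextSimilarity, optional) – Specific metric to use for the usecase (default: None)
  • top_k (int, optional) – Number of top-ranked results to consider (default: None)
  • lang (str, optional) – Language of the text (default: 'auto')
  • queries_dataset (Dataset, optional) – Reference to a dataset object to use as the queries dataset (default: None)
  • queries_column_config (QueriesColumnConfig, optional) – Column configuration for the queries dataset (default: None)
  • models_parameters (ListModelsParameters, optional) – Specific configuration of the models to train (default: None)
  • holdout_dataset (Dataset, optional) – Reference to a dataset object to use as a holdout dataset (default: None)
  • training_config (TrainingConfig, optional) – Specific training configuration (see the documentation of the TrainingConfig resource for more details on all the parameters)

Returns: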

Newly created text similarity usecase version object (new version)

Return type:

TextSimilarity
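
Because the configuration of the current version is copied and only the given parameters are overridden, a caller can pass just the values that should change. The following is a minimal sketch of that pattern: `version_overrides`, `retrain` and the `_RecordingVersion` stand-in are hypothetical helpers, and the sketch assumes that leaving a parameter at `None` is equivalent to omitting it, as the `None` defaults in the signature suggest.

```python
from typing import Any, Dict


def version_overrides(**params: Any) -> Dict[str, Any]:
    """Keep only the parameters that were explicitly set (not None)."""
    return {name: value for name, value in params.items() if value is not None}


def retrain(usecase_version: Any, **params: Any) -> Any:
    """Call new_version on an existing version with the explicit overrides only."""
    return usecase_version.new_version(**version_overrides(**params))


class _RecordingVersion:
    """Stand-in for a TextSimilarity version; it records the arguments
    it receives instead of contacting the platform."""

    def __init__(self) -> None:
        self.calls = []

    def new_version(self, **kwargs: Any) -> "_RecordingVersion":
        self.calls.append(kwargs)
        return self


stub = _RecordingVersion()
new_stub = retrain(stub, description="wider search", top_k=20, metric=None)
```

With a real `TextSimilarity` object in place of the stand-in, the same call would forward `description` and `top_k` to `new_version` and leave every other setting to be inherited from the current version.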