Base usecase

class previsionio.usecase.BaseUsecase(**usecase_info)

Bases: previsionio.api_resource.ApiResource

Base parent class for all usecases objects.

best_model

Get the model with the best predictive performance over all models (including Blend models), where the best performance corresponds to a minimal loss.

Returns:Model with the best performance in the usecase, or None if no model matched the search filter.
Return type:(Model, None)
best_single

Get the model with the best predictive performance over single models (excluding Blend models), where the best performance corresponds to a minimal loss.

Returns:Single (non-blend) model with the best performance in the usecase, or None if no model matched the search filter.
Return type:(Model, None)
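
Example: a minimal sketch of retrieving the best models, assuming uc is a usecase object fetched beforehand (e.g. via from_id):

    best = uc.best_model        # may be a Blend model
    if best is None:
        print('No model matched the search filter')
    single = uc.best_single     # best non-blend model, may also be None
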
correlation_matrix

Get the correlation matrix of the features (i.e. the features of the dataset the usecase was trained on).

Returns:Correlation matrix as a pandas dataframe
Return type:pd.DataFrame
delete()

Delete a usecase from the current [client] workspace.

Returns:Deletion process results
Return type:dict
delete_prediction(prediction_id)

Delete a prediction of the current usecase from the [client] workspace.

Parameters:prediction_id (str) – Unique id of the prediction to delete
Returns:Deletion process results
Return type:dict
delete_predictions()

Delete all predictions of the current usecase from the [client] workspace.

Returns:Deletion process results
Return type:dict
drop_list

Get the list of drop columns in the usecase.

Returns:Names of the columns dropped from the dataset
Return type:list(str)
fastest_model

Get the model that predicts with the lowest response time.

Returns:Fastest model of the usecase
Return type:Model
fe_selected_list

Get the list of selected feature engineering modules in the usecase.

Returns:Names of the feature engineering modules selected for the usecase
Return type:list(str)
feature_stats

Get the general description of the usecase’s features, such as:

  • feature types distribution
  • feature information list
  • list of dropped features
Returns:General features information
Return type:dict
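
Example: a quick look at the feature statistics (a sketch; uc is assumed to be an existing usecase object):

    stats = uc.feature_stats
    # general description dictionary (feature types distribution,
    # feature information list, dropped features)
    print(stats.keys())
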
classmethod from_id(_id, version=1)

Get a usecase from the platform by its unique id.

Parameters:
  • _id (str) – Unique id of the usecase to retrieve
  • version (int, optional) – Specific version of the usecase to retrieve (default: 1)
Returns:

Fetched usecase

Return type:

BaseUsecase

Raises:

PrevisionException – Any error while fetching data from the platform or parsing the result
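
Example: fetching a usecase by id (a sketch; '<usecase_id>' is a placeholder for a real id from your workspace, and the previsionio client is assumed to be already initialized):

    from previsionio.usecase import BaseUsecase

    # placeholder id; replace with an id taken from your workspace
    uc = BaseUsecase.from_id('<usecase_id>')
    print(uc.score)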

get_cv(use_best_single=False) → pandas.core.frame.DataFrame

Get the cross validation dataset from the best model of the usecase.

Parameters:use_best_single (bool, optional) – Whether to use the best single model instead of the best model overall (default: False)
Returns:Cross validation dataset
Return type:pd.DataFrame
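
Example: retrieving the cross validation predictions of the best single model (a sketch; uc is an existing usecase object):

    cv_df = uc.get_cv(use_best_single=True)
    print(cv_df.shape)
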
get_feature_info(feature_name=None)

Return some information about the given feature, such as:

  • name: the name of the feature as it was given in the feature_name parameter

  • type: linear, categorical, ordinal…

  • stats: some basic statistics, such as the number of missing values and the (non-missing) values count, plus additional information depending on the feature type:

    • for a linear feature: min, max, mean, std and median
    • for a categorical/textual feature: modalities/words frequencies, list of the most frequent tokens
  • role: whether or not the feature is a target/fold/weight or id feature (and, for time series usecases, whether or not it is a group/apriori feature; check Prevision.io’s time series documentation)

  • importance_value: scores reflecting the importance of the given feature

Parameters:feature_name (str) – Name of the feature to get information about

Warning

The feature_name parameter is case-sensitive, so “age” and “Age” are different features!

Returns:

Dictionary containing the feature information

Return type:

dict

Raises:

PrevisionException – If the given feature name does not match any feature
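
Example: inspecting a feature (a sketch; the feature name 'Age' is purely illustrative, and names are case-sensitive):

    info = uc.get_feature_info(feature_name='Age')
    print(info['type'])   # e.g. linear, categorical, ordinal...
    print(info['stats'])  # basic statistics for the feature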

get_model_from_id(id)

Get a model of the usecase by its unique id.

Note

The model is only searched through the models that are done training.

Parameters:id (str) – Unique id of the model resource to search for
Returns:Matching model resource, or None if none with the given id could be found
Return type:(Model, None)
get_model_from_name(model_name=None)

Get a model of the usecase by its name.

Note

The model is only searched through the models that are done training.

Parameters:model_name (str) – Name of the model resource to search for
Returns:Matching model resource, or None if none with the given name could be found
Return type:(Model, None)
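
Example: looking up a trained model (a sketch; both the id and the name below are placeholders):

    model = uc.get_model_from_id('<model_id>')
    if model is None:
        model = uc.get_model_from_name('<model name>')
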
get_predictions(full=False)

Get the list of predictions for the current usecase from the [client] workspace (with the full prediction objects if necessary).

Parameters:full (bool, optional) – If True, return full prediction objects; otherwise return only their metadata (default: False)
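
Example: listing the predictions of the usecase (a sketch; the exact content of each entry depends on the platform):

    preds = uc.get_predictions(full=False)   # metadata only
    print(len(preds))
    # uc.delete_predictions()                # would remove all predictions of the usecase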

lite_models_list

Get the list of selected lite models in the usecase.

Returns:Names of the lite models selected for the usecase
Return type:list(str)
models

Get the list of models generated for the current usecase. Only the models that are done training are retrieved.

Returns:List of models found by the platform for the usecase
Return type:list(Model)
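
Example: iterating over the trained models (a sketch; uc is an existing usecase object):

    for model in uc.models:
        # only models that are done training are listed
        print(model)
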
normal_models_list

Get the list of selected normal models in the usecase.

Returns:Names of the normal models selected for the usecase
Return type:list(str)
predict(df, confidence=False, use_best_single=False) → pandas.core.frame.DataFrame

Get the predictions for a dataset stored in the current active [client] workspace using the best model of the usecase with a Scikit-learn style blocking prediction mode.

Warning

For large dataframes and complex (blend) models, this can be slow (up to 1-2 hours). Prefer using this method for simple models and small dataframes, or use the use_best_single=True option.

Parameters:
  • df (pd.DataFrame) – pandas DataFrame containing the test data
  • confidence (bool, optional) – Whether to predict with confidence values (default: False)
  • use_best_single (bool, optional) – Whether to use the best single model instead of the best model overall (default: False)
Returns:

Prediction data (as pandas dataframe) and prediction job ID.

Return type:

tuple(pd.DataFrame, str)
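
Example: a scikit-learn style prediction on a small dataframe (a sketch; the column names are placeholders for the usecase's actual feature columns):

    import pandas as pd

    test_df = pd.DataFrame({'feature_1': [1.0, 2.0], 'feature_2': ['a', 'b']})
    # documented above as returning the prediction data and the prediction job id
    preds, job_id = uc.predict(test_df, use_best_single=True)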

predict_from_dataset(dataset, use_best_single=False, confidence=False, dataset_folder=None) → pandas.core.frame.DataFrame

Get the predictions for a dataset stored in the current active [client] workspace using the best model of the usecase.

Parameters:
  • dataset (Dataset) – Reference to the dataset object to make predictions for
  • use_best_single (bool, optional) – Whether to use the best single model instead of the best model overall (default: False)
  • confidence (bool, optional) – Whether to predict with confidence values (default: False)
  • dataset_folder (Dataset) – Matching folder dataset for the predictions, if necessary
Returns:

Predictions as a pandas dataframe

Return type:

pd.DataFrame
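
Example: predicting on a dataset already registered in the workspace (a sketch; test_dataset is assumed to be a previsionio Dataset object fetched or created beforehand through the Dataset API):

    preds_df = uc.predict_from_dataset(test_dataset, confidence=True)
    print(preds_df.head())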

predict_proba(df, confidence=False, use_best_single=False) → pandas.core.frame.DataFrame

Get the predictions for a dataset stored in the current active [client] workspace using the best model of the usecase with a Scikit-learn style blocking prediction mode, and returns the probabilities.

Warning

For large dataframes and complex (blend) models, this can be slow (up to 1-2 hours). Prefer using this method for simple models and small dataframes, or use the use_best_single=True option.

Parameters:
  • df (pd.DataFrame) – pandas DataFrame containing the test data
  • confidence (bool, optional) – Whether to predict with confidence values (default: False)
  • use_best_single (bool, optional) – Whether to use the best single model instead of the best model overall (default: False)
Returns:

Prediction probabilities data (as pandas dataframe) and prediction job ID.

Return type:

tuple(pd.DataFrame, str)

predict_single(use_best_single=False, confidence=False, explain=False, **predict_data)

Get a prediction on a single instance using the best model of the usecase.

Parameters:
  • use_best_single (bool, optional) – Whether to use the best single model instead of the best model overall (default: False)
  • confidence (bool, optional) – Whether to predict with confidence values (default: False)
  • explain (bool, optional) – Whether to explain the prediction (default: False)
Returns:

Dictionary containing the prediction.

Note

The format of the predictions dictionary depends on the problem type (regression, classification…)

Return type:

dict
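
Example: predicting a single instance (a sketch; the keyword arguments stand for the usecase's feature columns and are purely illustrative):

    pred = uc.predict_single(confidence=True,
                             feature_1=42,
                             feature_2='a')
    print(pred)   # dictionary whose format depends on the problem type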

print_info()

Print all info on the usecase.

running

Get a flag indicating whether or not the usecase is currently running.

Returns:Running status
Return type:bool
schema

Get the data schema of the usecase.

Returns:Usecase schema
Return type:pd.DataFrame
score

Get the current score of the usecase (i.e. the score of the model that is currently considered the best performance-wise for this usecase).

Returns:Usecase score (or infinity if not available).
Return type:float
share(with_users)

Share a usecase in the current [client] workspace with other users (specified via their emails).

Parameters:with_users (list(str)) – List of emails of the users to share the usecase with
simple_models_list

Get the list of selected simple models in the usecase.

Returns:Names of the simple models selected for the usecase
Return type:list(str)
stop()

Stop a usecase (stopping all nodes currently in progress).

train_dataset

Get the Dataset object corresponding to the training dataset of the usecase.

Returns:Associated training dataset
Return type:Dataset
unshare(from_users)

Unshare a usecase in the current [client] workspace from other users (specified via their emails).

Parameters:from_users (list(str)) – List of emails of the users to unshare the usecase from
update_status()

Get an update on the status of a resource.

Parameters:specific_url (str, optional) – Specific (already parametrized) url to fetch the resource from (otherwise the url is built from the resource type and unique _id)
Returns:Updated status info
Return type:dict
versions

Get the list of all versions for the current usecase.

Returns:List of the usecase versions (as JSON metadata)
Return type:list(dict)
wait_until(condition, raise_on_error=True, timeout=10800)

Wait until condition is fulfilled, then break.

Parameters:
  • condition (func: (BaseUsecase) -> bool) – Function used to check the break condition
  • raise_on_error (bool, optional) – If true then the function will stop on error, otherwise it will continue waiting (default: True)
  • timeout (float, optional) – Maximum amount of time to wait before forcing exit (default: 10800)
Raises:PrevisionException – If the resource could not be fetched or there was a timeout.
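
Example: blocking until the usecase has trained at least three models (a sketch; the condition function receives the usecase object itself):

    uc.wait_until(lambda usecase: len(usecase.models) >= 3, timeout=3600)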