External experiments¶

class previsionio.external_experiment_version.ExternalExperimentVersion(**experiment_version_info)¶

Bases: previsionio.experiment_version.ClassicExperimentVersion

best_model¶

Get the model with the best predictive performance over all models, where the best performance corresponds to a minimal loss.

Returns:	Model with the best performance in the experiment, or `None` if no model matched the search filter.
Return type:	(`Model`, None)
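
The selection rule described above (lowest loss wins, `None` when nothing matches) can be illustrated with a small stand-alone sketch. `StubModel` and `pick_best` are hypothetical names used only for illustration and are not part of `previsionio`; on a real experiment version you would read the `best_model` property instead.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class StubModel:
    """Hypothetical stand-in for a platform model (not a previsionio class)."""
    name: str
    loss: float


def pick_best(models: Iterable[StubModel]) -> Optional[StubModel]:
    """Return the model with the smallest loss, or None if there are no models."""
    return min(models, key=lambda m: m.loss, default=None)


candidates = [StubModel("lr", 0.42), StubModel("xgb", 0.31), StubModel("nn", 0.37)]
best = pick_best(candidates)
print(best.name if best else None)
```

Because the property may be `None`, calling code should check for that case before using the result.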

correlation_matrix¶

Get the correlation matrix of the features that make up the dataset on which the experiment was trained.

Returns:	Correlation matrix as a `pandas` dataframe
Return type:	`pd.DataFrame`

dataset¶

Get the Dataset object corresponding to the training dataset of this experiment version.

Returns:	Associated training dataset
Return type:	`Dataset`

delete()¶

Delete an experiment version from the current [client] workspace.

Raises:
`PrevisionException` – If the experiment version does not exist
`requests.exceptions.ConnectionError` – Error processing the request

delete_prediction(prediction_id: str)¶

Delete a prediction from the list of predictions for the current experiment in the current [client] workspace.

Parameters:	prediction_id (str) – Unique id of the prediction to delete
Returns:	Deletion process results
Return type:	dict

delete_predictions()¶

Delete all predictions in the list for the current experiment from the current [client] workspace.

Returns:	Deletion process results
Return type:	dict

done¶

Get a flag indicating whether or not the experiment is currently done.

Returns:	done status
Return type:	bool

fastest_model¶

Get the model that predicts with the lowest response time.

Returns:	Model object corresponding to the fastest model

features¶

Get the general description of the experiment’s features, such as:

feature types distribution
feature information list
list of dropped features

Returns:	General features information
Return type:	dict

features_stats¶

Get the general description of the experiment’s features, such as:

feature types distribution
feature information list
list of dropped features

Returns:	General features information
Return type:	dict

get_feature_info(feature_name: str) → Dict¶

Return some information about the given feature, such as:

name: the name of the feature as it was given in the feature_name parameter
type: linear, categorical, ordinal…
stats: some basic statistics such as number of missing values, (non missing) values count, plus additional information depending on the feature type:
- for a linear feature: min, max, mean, std and median
- for a categorical/textual feature: modalities/words frequencies, list of the most frequent tokens
role: whether or not the feature is a target/fold/weight or id feature (and for time series experiments, whether or not it is a group/apriori feature - check the Prevision.io’s timeseries documentation)
importance_value: scores reflecting the importance of the given feature

Parameters:	feature_name (str) – Name of the feature to get information about. The `feature_name` is case-sensitive, so “age” and “Age” are different features!
Returns:	Dictionary containing the feature information
Return type:	dict
Raises:	`PrevisionException` – If the given feature name does not match any feature
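
The case-sensitivity rule and the error on an unknown name can be pictured with a plain dictionary lookup. This is only an illustrative sketch: `FeatureNotFound` and `lookup_feature` are hypothetical stand-ins (for `PrevisionException` and `get_feature_info` respectively), and the feature data is made up.

```python
class FeatureNotFound(Exception):
    """Hypothetical stand-in for PrevisionException."""


def lookup_feature(features: dict, feature_name: str) -> dict:
    """Exact-match (case-sensitive) lookup; raise if the name is unknown."""
    try:
        return features[feature_name]
    except KeyError:
        raise FeatureNotFound(f"no feature named {feature_name!r}") from None


# Made-up feature table keyed by the exact feature name.
features = {"Age": {"name": "Age", "type": "linear"}}

info = lookup_feature(features, "Age")   # matches
print(info["type"])

try:
    lookup_feature(features, "age")      # different case, so no match
except FeatureNotFound as err:
    print(err)
```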

get_holdout_predictions(full: bool = False)¶

Retrieve the list of holdout predictions for the current experiment from the client workspace (with the full prediction objects if necessary).

Parameters:	full (bool) – If true, return full holdout prediction objects (else only metadata)

get_predictions(full: bool = False)¶

Retrieve the list of predictions for the current experiment from the client workspace (with the full prediction objects if necessary).

Parameters:	full (bool) – If true, return full prediction objects (else only metadata)

holdout_dataset¶

Get the Dataset object corresponding to the holdout dataset of this experiment version.

Returns:	Associated holdout dataset
Return type:	`Dataset`

models¶

Get the list of models generated for the current experiment version. Only the models that are done training are retrieved.

Returns:	List of models found by the platform for the experiment
Return type:	list(`Model`)

new_version()¶

Create a new external experiment version from this version (on the platform). The external_models parameter is mandatory. All other parameters are copied from the current version, and any that you provide override the copied values.

Parameters:
external_models (list(tuple)) – The external models to add to the experiment version being created. Each tuple contains 3 items describing an external model, in this order:
- The name you want to give to the model
- The path to the model in ONNX format
- The path to a YAML file containing metadata about the model
holdout_dataset (`Dataset`, optional) – Reference to the dataset object to use as holdout dataset
target_column (str, optional) – The name of the target column for this experiment version
metric (metrics.Enum, optional) – Specific metric to use for the experiment version
dataset (`Dataset`, optional) – Reference to the dataset object that has been used to train the model (default: `None`)
description (str, optional) – The description of this experiment version (default: `None`)
Returns:	Newly created external experiment object (new version)
Return type:	`ExternalExperimentVersion`
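
As a rough sketch of the `external_models` structure described above, the snippet below builds a list of 3-item tuples and checks their shape before use. The helper `check_external_models`, the model name, and the file paths are all hypothetical placeholders; the final commented call assumes an existing `ExternalExperimentVersion` object named `version`.

```python
from typing import List, Tuple

# (model name, path to ONNX file, path to YAML metadata file)
ExternalModel = Tuple[str, str, str]


def check_external_models(external_models: List[ExternalModel]) -> List[ExternalModel]:
    """Hypothetical helper: verify each entry is a tuple of 3 non-empty strings."""
    for entry in external_models:
        if len(entry) != 3:
            raise ValueError(f"expected (name, onnx_path, yaml_path), got {entry!r}")
        if not all(isinstance(item, str) and item for item in entry):
            raise ValueError(f"all items must be non-empty strings: {entry!r}")
    return external_models


# Placeholder name and paths, for illustration only.
external_models = check_external_models(
    [("churn_onnx", "models/churn.onnx", "models/churn.yaml")]
)
print(external_models)

# Assumed usage with a real version object (not executed here):
# new_version = version.new_version(external_models=external_models)
```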

predict(df, prediction_dataset_name=None) → pandas.core.frame.DataFrame¶

Get the predictions for a dataset stored in the current active [client] workspace using the best model of the experiment with a Scikit-learn style blocking prediction mode.

Warning

For large dataframes and complex (blend) models, this can be slow (up to 1-2 hours). Prefer using this for simple models and small dataframes, or use option use_best_single = True.

Parameters:	df (`pd.DataFrame`) – `pandas` DataFrame containing the test data
Returns:	Prediction data (as `pandas` dataframe) and prediction job ID.
Return type:	tuple(pd.DataFrame, str)

predict_from_dataset(dataset, dataset_folder=None) → pandas.core.frame.DataFrame¶

Get the predictions for a dataset stored in the current active [client] workspace using the best model of the experiment.

Parameters:
dataset (`Dataset`) – Reference to the dataset object to make predictions for
dataset_folder (`Dataset`) – Matching folder dataset for the predictions, if necessary
Returns:	Predictions as a `pandas` dataframe
Return type:	`pd.DataFrame`

print_info()¶

Print all info on the experiment.

running¶

Get a flag indicating whether or not the experiment is currently running.

Returns:	Running status
Return type:	bool

schema¶

Get the data schema of the experiment.

Returns:	Experiment schema
Return type:	dict

score¶

Get the current score of the experiment (i.e. the score of the model that is currently considered the best performance-wise for this experiment).

Returns:	Experiment score (or infinity if not available).
Return type:	float

status¶

Get a flag indicating whether or not the experiment is currently running.

Returns:	Running status
Return type:	bool

stop()¶

Stop an experiment (stopping all nodes currently in progress).

update_status()¶

Get an update on the status of a resource.

Parameters:	specific_url (str, optional) – Specific (already parametrized) url to fetch the resource from (otherwise the url is built from the resource type and unique `_id`)
Returns:	Updated status info
Return type:	dict

wait_until(condition, raise_on_error: bool = True, timeout: float = 3600.0)¶

Wait until condition is fulfilled, then break.

Parameters:
condition (func: (`BaseExperimentVersion`) -> bool) – Function to use to check the break condition
raise_on_error (bool, optional) – If true then the function will stop on error, otherwise it will continue waiting (default: `True`)
timeout (float, optional) – Maximal amount of time to wait before forcing exit

Example:

experiment.wait_until(lambda experimentv: len(experimentv.models) > 3)

Raises:	`PrevisionException` – If the resource could not be fetched or there was a timeout.
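
The general pattern behind a call like the one above is a polling loop with a deadline. The sketch below is a generic illustration, not the library's actual implementation: `wait_until_sketch`, `StubVersion`, and the `poll_interval` argument are hypothetical, and a built-in `TimeoutError` stands in for `PrevisionException`.

```python
import time
from typing import Any, Callable


def wait_until_sketch(resource: Any,
                      condition: Callable[[Any], bool],
                      timeout: float = 3600.0,
                      poll_interval: float = 1.0) -> None:
    """Poll `condition(resource)` until it is true or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition(resource):
            return
        time.sleep(poll_interval)
    raise TimeoutError("condition not fulfilled before timeout")


class StubVersion:
    """Hypothetical stand-in: reports one more finished model on each access."""

    def __init__(self) -> None:
        self._polls = 0

    @property
    def models(self) -> list:
        self._polls += 1
        return ["model"] * self._polls


version = StubVersion()
wait_until_sketch(version, lambda v: len(v.models) > 3,
                  timeout=5.0, poll_interval=0.001)
print(version._polls)
```

The condition receives the object being watched, so any property on it (such as `models`, `done`, or `running`) can drive the exit test.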