hypermodel.hml package

Submodules

hypermodel.hml.model_container module

class hypermodel.hml.model_container.ModelContainer(name: str, project_name: str, features_numeric: List[str], features_categorical: List[str], target: str, services: hypermodel.platform.abstract.services.PlatformServicesBase)

Bases: object

The ModelContainer class wraps a machine learning model. It records the model’s features (numeric and categorical), the distributions of the feature columns and, optionally, a reference to the current version of the model’s .joblib file.

analyze_distributions(data_frame: pandas.core.frame.DataFrame)

Given a dataframe, find all the unique values for categorical features and the distribution of all the numerical features and store them within this object.

Parameters: data_frame (pd.DataFrame) – The dataframe to analyze
Returns: A reference to self
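
The analysis described above can be sketched with plain pandas. This is a minimal illustration, not the library's implementation: the standalone function, its return shape, and the use of describe() to summarise numeric columns are all assumptions made for the example.

```python
import pandas as pd


def analyze_distributions(data_frame, features_numeric, features_categorical):
    """Hypothetical stand-in for ModelContainer.analyze_distributions."""
    # Unique values per categorical feature, sorted for a stable order.
    feature_uniques = {
        name: sorted(data_frame[name].dropna().unique().tolist())
        for name in features_categorical
    }
    # Summary statistics per numeric feature (count, mean, std, quartiles...).
    # Using describe() is an assumption; the real class may store other values.
    feature_summaries = {
        name: data_frame[name].describe().to_dict()
        for name in features_numeric
    }
    return feature_uniques, feature_summaries


df = pd.DataFrame({"age": [20, 30, 40], "colour": ["red", "blue", "red"]})
uniques, summaries = analyze_distributions(df, ["age"], ["colour"])
```

Sorting the unique values gives each category a fixed position, which matters when the same values are later used to lay out one-hot columns.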
bind_model(model)
build_training_matrix(data_frame: pandas.core.frame.DataFrame)

Convert the provided data_frame to a matrix after one-hot encoding all the categorical features, using the currently cached feature_uniques.

Parameters: data_frame (pd.DataFrame) – The pandas dataframe to encode
Returns: A numpy array of the encoded data
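
One way to one-hot encode against a cached set of unique values is sketched below. The function signature, the column order (numeric features first) and the handling of unseen categories are assumptions for illustration; the actual method may differ.

```python
import numpy as np
import pandas as pd


def build_training_matrix(data_frame, features_numeric,
                          features_categorical, feature_uniques):
    """Hypothetical stand-in for ModelContainer.build_training_matrix."""
    # Numeric features are passed through unchanged as float columns.
    columns = [data_frame[name].to_numpy(dtype=float) for name in features_numeric]
    # Each cached unique value becomes one indicator column.
    for name in features_categorical:
        for value in feature_uniques[name]:
            columns.append((data_frame[name] == value).to_numpy(dtype=float))
    return np.column_stack(columns)


df = pd.DataFrame({"age": [20, 30], "colour": ["red", "green"]})
cached_uniques = {"colour": ["blue", "red"]}  # e.g. captured at training time
matrix = build_training_matrix(df, ["age"], ["colour"], cached_uniques)
```

Encoding against the cached values rather than the values present in data_frame keeps the column layout identical between training and prediction. In this sketch a category that was never cached ("green") encodes as all zeros.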
create_merge_request(reference, description='New models!')
dump_distributions()

Write information about the distributions of features to the local filesystem.

Returns: The path to the file that was written
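
As a rough sketch of writing distributions to disk and returning the path, the example below serialises them as JSON. The file name pattern, the JSON layout and the function arguments are hypothetical; the source does not state the on-disk format.

```python
import json
import os
import tempfile


def dump_distributions(feature_uniques, feature_summaries, directory, name):
    """Hypothetical stand-in for ModelContainer.dump_distributions."""
    # The "<name>-distributions.json" naming is an assumption for this sketch.
    file_path = os.path.join(directory, f"{name}-distributions.json")
    payload = {
        "feature_uniques": feature_uniques,
        "feature_summaries": feature_summaries,
    }
    with open(file_path, "w") as handle:
        json.dump(payload, handle, indent=2)
    return file_path


with tempfile.TemporaryDirectory() as workdir:
    written_path = dump_distributions(
        {"colour": ["blue", "red"]}, {"age": {"mean": 30.0}}, workdir, "demo"
    )
    # Read the file back to confirm the round trip.
    with open(written_path) as handle:
        restored = json.load(handle)
```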
dump_model()
dump_reference(reference)
get_bucket_path(filename)
get_local_path(filename)
load(reference_file=None)

Given the provided reference file, look up the location of the model in the DataLake and load it into memory. This loads the .joblib file, as well as any distributions and unique values associated with this model reference.

Parameters: reference_file (str) – The path of the reference json file
Returns: None
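
The lookup described above can be approximated locally as follows. This sketch reads everything from the local filesystem instead of a DataLake, and the reference keys "model_path" and "distributions_path" are hypothetical names chosen for the example, not the library's actual schema.

```python
import json
import os
import tempfile

import joblib


def load_from_reference(reference_file):
    """Hypothetical local analogue of ModelContainer.load."""
    with open(reference_file) as handle:
        reference = json.load(handle)
    # Key names below are assumptions made for this sketch.
    model = joblib.load(reference["model_path"])
    with open(reference["distributions_path"]) as handle:
        distributions = json.load(handle)
    return model, distributions


with tempfile.TemporaryDirectory() as workdir:
    model_path = os.path.join(workdir, "model.joblib")
    distributions_path = os.path.join(workdir, "distributions.json")
    reference_path = os.path.join(workdir, "reference.json")

    # A plain dict stands in for a trained model here.
    joblib.dump({"coef": [0.5, 1.5]}, model_path)
    with open(distributions_path, "w") as handle:
        json.dump({"feature_uniques": {"colour": ["blue", "red"]}}, handle)
    with open(reference_path, "w") as handle:
        json.dump({"model_path": model_path,
                   "distributions_path": distributions_path}, handle)

    loaded_model, loaded_distributions = load_from_reference(reference_path)
```

Unlike the documented method, which returns None and stores the results on the container, this sketch returns the loaded objects so that they are easy to inspect.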
load_distributions(file_path: str)
load_model()
publish()

Publish the model (as a .joblib file).

Module contents