hypermodel.hml package

Submodules

hypermodel.hml.decorators module

hypermodel.hml.hml_app module

hypermodel.hml.hml_container_op module

class hypermodel.hml.hml_container_op.HmlContainerOp(func, kwargs)

Bases: object

HmlContainerOp defines the base functionality for a Kubeflow Pipeline Operation which is executed as a simple command line application (assuming that the package has been installed and has a script-based entrypoint)

invoke()

Actually invoke the function that this ContainerOp refers to (for testing / execution in the container)

Returns: A reference to the current HmlContainerOp (self)

with_command(container_command: str, container_args: List[str]) → Optional[hypermodel.hml.hml_container_op.HmlContainerOp]

Set the command / arguments to execute within the container as a part of this job.

Parameters:
  • container_command (str) – The command to execute
  • container_args (List[str]) – The arguments to pass to the executable
Returns:

A reference to the current HmlContainerOp (self)

with_empty_dir(name: str, mount_path: str) → Optional[hypermodel.hml.hml_container_op.HmlContainerOp]

Create an empty, writable volume with the given name, mounted at the specified mount_path

Parameters:
  • name (str) – The name of the volume to mount
  • mount_path (str) – The path to mount the empty volume
Returns:

A reference to the current HmlContainerOp (self)

with_env(variable_name, value) → Optional[hypermodel.hml.hml_container_op.HmlContainerOp]

Bind an environment variable named variable_name to the specified value

Parameters:
  • variable_name (str) – The name of the environment variable
  • value (str) – The value to bind to the variable
Returns:

A reference to the current HmlContainerOp (self)

with_gcp_auth(secret_name: str) → Optional[hypermodel.hml.hml_container_op.HmlContainerOp]

Use the secret given in secret_name as the service account for GCP-related SDK API calls (the secret is mounted to a path, and the environment variable GOOGLE_APPLICATION_CREDENTIALS is bound to point at that path)

Parameters: secret_name (str) – The name of the secret containing the Google Service Account JSON file.
Returns: A reference to the current HmlContainerOp (self)

with_image(container_image_url: str) → Optional[hypermodel.hml.hml_container_op.HmlContainerOp]

Set the container image used to execute this operation.

Parameters:
  • container_image_url (str) – The URL and tag identifying the container image
Returns:

A reference to the current HmlContainerOp (self)

with_secret(secret_name: str, mount_path: str) → Optional[hypermodel.hml.hml_container_op.HmlContainerOp]

Bind a secret given by secret_name to the local path defined in mount_path

Parameters:
  • secret_name (str) – The name of the secret (in the same namespace)
  • mount_path (str) – The path to mount the secret locally
Returns:

A reference to the current HmlContainerOp (self)
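
Because each with_* method returns the op itself, configuration chains fluently. A minimal sketch of configuring an op inside a deployment callback (the image URL, command, secret name and paths here are all hypothetical):

    from hypermodel.hml.hml_container_op import HmlContainerOp

    def configure_op(op: HmlContainerOp) -> HmlContainerOp:
        # Each with_* call returns the same op, so the calls chain.
        return (
            op.with_image("gcr.io/my-project/my-pipeline:latest")
            .with_command("my-pipeline", ["pipelines", "run"])
            .with_env("ENVIRONMENT", "production")
            .with_empty_dir("scratch", "/tmp/scratch")
            .with_secret("my-secret", "/secrets")
            .with_gcp_auth("gcp-service-account")
        )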

hypermodel.hml.hml_inference_app module

class hypermodel.hml.hml_inference_app.HmlInferenceApp(name: str, cli: click.core.Group, image_url: str, package_entrypoint: str, port, k8s_namespace)

Bases: object

Hosts the Flask app used to serve predictions for models

apply_deployment(k8s_deployment: kubernetes.client.models.extensions_v1beta1_deployment.ExtensionsV1beta1Deployment)
apply_service(k8s_service: kubernetes.client.models.v1_service.V1Service)
cli_inference_group = <click.core.Group object>
deploy()
get_model(name: str)

Get a reference to the model with the given name, returning None if it cannot be found.

Parameters: name (str) – The name of the model
Returns: The ModelContainer for the model if it can be found, or None otherwise.
on_deploy(func: Callable[[hypermodel.hml.hml_inference_deployment.HmlInferenceDeployment], None])
on_init(func: Callable)
register_model(model_container: hypermodel.hml.model_container.ModelContainer)

Load the Model (its .joblib file and summary statistics) using an empty ModelContainer object, and bind it to our internal dictionary of models.

Parameters: model_container (ModelContainer) – The container wrapping the model
Returns: The model container passed in, having been loaded.
start_dev()

Start the Flask App in development mode

start_prod()

Start the Flask App in Production mode (via Waitress)
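
A minimal sketch of standing up the inference app locally; the name, image URL, entrypoint and namespace below are hypothetical, and the ModelContainer registration is elided:

    import click
    from hypermodel.hml.hml_inference_app import HmlInferenceApp

    @click.group()
    def cli():
        pass

    app = HmlInferenceApp(
        name="my-model-app",  # hypothetical
        cli=cli,
        image_url="gcr.io/my-project/my-model-app:latest",
        package_entrypoint="my-model-app",
        port=8000,
        k8s_namespace="kubeflow",
    )

    # app.register_model(model_container)  # a configured ModelContainer (see below)
    app.start_dev()  # Flask development server; use start_prod() for Waitress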

hypermodel.hml.hml_inference_deployment module

class hypermodel.hml.hml_inference_deployment.HmlInferenceDeployment(name: str, image_url: str, package_entrypoint: str, port, k8s_namespace)

Bases: object

The HmlInferenceDeployment class provides functionality for managing deployments of the HmlInferenceApp to Kubernetes. It builds and configures the required Kubernetes Deployment (Pods & Containers), along with a NodePort Service suitable for use with an Ingress (which this class does not create).

get_yaml()

Get the YAML-like definition of the K8s Deployment and Service

with_empty_dir(name: str, mount_path: str) → Optional[hypermodel.hml.hml_inference_deployment.HmlInferenceDeployment]

Create an empty, writable volume with the given name, mounted at the specified mount_path

Parameters:
  • name (str) – The name of the volume to mount
  • mount_path (str) – The path to mount the empty volume
Returns:

A reference to the current HmlInferenceDeployment (self)

with_env(variable_name, value) → Optional[hypermodel.hml.hml_inference_deployment.HmlInferenceDeployment]

Bind an environment variable named variable_name to the specified value

Parameters:
  • variable_name (str) – The name of the environment variable
  • value (str) – The value to bind to the variable
Returns:

A reference to the current HmlInferenceDeployment (self)

with_gcp_auth(secret_name: str) → Optional[hypermodel.hml.hml_inference_deployment.HmlInferenceDeployment]

Use the secret given in secret_name as the service account for GCP-related SDK API calls (the secret is mounted to a path, and the environment variable GOOGLE_APPLICATION_CREDENTIALS is bound to point at that path)

Parameters: secret_name (str) – The name of the secret containing the Google Service Account JSON file.
Returns: A reference to the current HmlInferenceDeployment (self)

with_resources(limit_cpu: str, limit_memory: str, request_cpu: str, request_memory: str) → Optional[hypermodel.hml.hml_inference_deployment.HmlInferenceDeployment]

Set the Resource Limits and Requests for the Container running the HmlInferenceApp

Parameters:
  • limit_cpu (str) – Maximum amount of CPU to use
  • limit_memory (str) – Maximum amount of Memory to use
  • request_cpu (str) – The desired amount of CPU to reserve
  • request_memory (str) – The desired amount of Memory to reserve
Returns:

A reference to the current HmlInferenceDeployment (self)
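
As with HmlContainerOp, these methods return self and so chain. A hedged sketch of tuning a deployment and inspecting the generated Kubernetes objects (all values illustrative):

    from hypermodel.hml.hml_inference_deployment import HmlInferenceDeployment

    deployment = HmlInferenceDeployment(
        name="my-model-app",
        image_url="gcr.io/my-project/my-model-app:latest",
        package_entrypoint="my-model-app",
        port=8000,
        k8s_namespace="kubeflow",
    )

    deployment.with_resources(
        limit_cpu="1", limit_memory="1Gi",
        request_cpu="500m", request_memory="512Mi",
    ).with_env("ENVIRONMENT", "production").with_gcp_auth("gcp-service-account")

    # Print the Deployment + NodePort Service definitions for review.
    print(deployment.get_yaml())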

hypermodel.hml.hml_pipeline module

class hypermodel.hml.hml_pipeline.HmlPipeline(cli: click.core.Group, pipeline_func: Callable, image_url: str, package_entrypoint: str, op_builders: List[Callable[[hypermodel.hml.hml_container_op.HmlContainerOp], hypermodel.hml.hml_container_op.HmlContainerOp]])

Bases: object

apply_deploy_options(func)
Bind additional command line arguments for the deployment step, including:

  • --host: Endpoint of the KFP API service to use
  • --client-id: Client ID for an IAP protected endpoint
  • --namespace: The Kubernetes namespace to deploy to

Parameters: func (Callable) – The Click decorated function to bind options to
Returns: The current HmlPipeline (self)
get_dag()

Get the calculated Argo Workflow Directed Acyclic Graph created by the Kubeflow Pipeline.

Returns: The “dag” object from the Argo workflow template.
run_all(**kwargs)

Run all the steps in the pipeline

run_task(task_name: str, run_log: Dict[str, bool], kwargs)

Execute the Kubeflow Operation for real, and mark the task as executed in the run_log dict so that we don’t re-execute tasks that have already been executed.

Parameters:
  • task_name (str) – The name of the task/op to execute
  • run_log (Dict[str, bool]) – A dictionary of all the tasks/ops we have already run
  • kwargs – Additional keyword arguments to pass into the execution of the task
Returns:

None
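
For example, several calls can share one run_log so that common upstream tasks execute only once. A sketch, assuming pipeline is an HmlPipeline constructed elsewhere, with hypothetical task names, and taking kwargs (per the signature above) to be a plain dict of extra arguments:

    run_log = {}

    # Marks "transform" (and anything it triggers) as executed in run_log.
    pipeline.run_task("transform", run_log, {})

    # A later task that shares upstream steps will skip anything already
    # recorded in run_log rather than re-executing it.
    pipeline.run_task("train-model", run_log, {})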

with_cron(cron: str) → Optional[hypermodel.hml.hml_pipeline.HmlPipeline]

Bind a cron expression to the Pipeline, telling Kubeflow to execute the Pipeline on the specified schedule

Parameters: cron (str) – The crontab expression that schedules execution
Returns: The current HmlPipeline (self)
with_experiment(experiment: str) → Optional[hypermodel.hml.hml_pipeline.HmlPipeline]

Bind execution jobs to the specified experiment (a pipeline may be bound to only one experiment).

Parameters: experiment (str) – The name of the experiment
Returns: The current HmlPipeline (self)
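
A sketch of combining the two, assuming pipeline is an HmlPipeline (the schedule and experiment name are illustrative):

    # Execute nightly at 02:00, recording runs against the "demo" experiment.
    pipeline.with_cron("0 2 * * *").with_experiment("demo")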

hypermodel.hml.hml_pipeline_app module

class hypermodel.hml.hml_pipeline_app.HmlPipelineApp(name: str, cli: click.core.Group, image_url: str, package_entrypoint: str)

Bases: object

on_deploy(func: Callable[[hypermodel.hml.hml_container_op.HmlContainerOp], hypermodel.hml.hml_container_op.HmlContainerOp])

Registers a function to be called for each ContainerOp defined in the Pipeline, so that each Operation can be configured with secrets, environment variables and whatever else may be required.

Parameters: func (Callable) – The function (accepting a HmlContainerOp as its only parameter) which configures the supplied HmlContainerOp
register_pipeline(pipeline_func, cron: str, experiment: str)

Register a Kubeflow Pipeline (e.g. a function decorated with @hml.pipeline)

Parameters:
  • pipeline_func (Callable) – The function defining the pipeline
  • cron (str) – A cron expression for the default job executing this pipeline
  • experiment (str) – The Kubeflow experiment to deploy the job to
Returns:

None
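
A hedged sketch of both calls together, assuming pipeline_app is an HmlPipelineApp and my_pipeline is a function decorated with @hml.pipeline (all names illustrative):

    def configure_ops(op):
        # Runs once per ContainerOp so every step gets the same configuration.
        return op.with_env("ENVIRONMENT", "production").with_gcp_auth("gcp-service-account")

    pipeline_app.on_deploy(configure_ops)
    pipeline_app.register_pipeline(my_pipeline, cron="0 2 * * *", experiment="demo")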

hypermodel.hml.model_container module

class hypermodel.hml.model_container.ModelContainer(name: str, project_name: str, features_numeric: List[str], features_categorical: List[str], target: str, services: hypermodel.platform.abstract.services.PlatformServicesBase)

Bases: object

The ModelContainer class provides a wrapper for a Machine Learning model, detailing information about Features (numeric & categorical), information about the distributions of feature columns and potentially a reference to the current version of the model’s .joblib file.

analyze_distributions(data_frame: pandas.core.frame.DataFrame)

Given a dataframe, find all the unique values for categorical features and the distribution of all the numerical features and store them within this object.

Parameters: data_frame (pd.DataFrame) – The dataframe to analyze
Returns: A reference to self
bind_model(model)
build_training_matrix(data_frame: pandas.core.frame.DataFrame)

Convert the provided data_frame to a matrix after one-hot encoding all the categorical features, using the currently cached feature_uniques

Parameters: data_frame (pd.DataFrame) – The pandas dataframe to encode
Returns: A numpy array of the encoded data
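
A sketch of the analyze-then-encode flow, assuming model is a ModelContainer constructed with features_categorical=["suburb"] and features_numeric=["rooms"] (the dataframe is illustrative):

    import pandas as pd

    df = pd.DataFrame({
        "suburb": ["Richmond", "Carlton"],
        "rooms": [3, 2],
        "price": [1.2, 0.9],
    })

    # Cache unique categorical values and numeric distributions on the container,
    # then one-hot encode the categorical features using those cached uniques.
    model.analyze_distributions(df)
    X = model.build_training_matrix(df)
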
create_merge_request(reference, description='New models!')
dump_distributions()

Write information about the distributions of features to the local filesystem

Returns: The path to the file that was written
dump_model()
dump_reference(reference)
get_bucket_path(filename)
get_local_path(filename)
load(reference_file=None)

Given the provided reference file, look up the location of the model in the DataLake and load it into memory. This will load the .joblib file, as well as any distributions / unique values associated with this model reference.

Parameters: reference_file (str) – The path of the reference JSON file
Returns: None
load_distributions(file_path: str)
load_model()
publish()

Publish the model (as a Joblib)

Module contents