hypermodel.platform.gcp package

Submodules

hypermodel.platform.gcp.config module

class hypermodel.platform.gcp.config.GooglePlatformConfig

Bases: hypermodel.platform.abstract.platform_config.PlatformConfig

hypermodel.platform.gcp.data_lake module

class hypermodel.platform.gcp.data_lake.DataLake(config: hypermodel.platform.gcp.config.GooglePlatformConfig)

Bases: hypermodel.platform.abstract.data_lake.DataLakeBase

download(bucket_path: str, destination_local_path: str, bucket_name: str = None) → bool
upload(bucket_path: str, local_path: str, bucket_name: str = None) → bool
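
The two signatures above can be exercised together as a simple round trip. The following is a minimal sketch, not the library's own code: `LakeLike` and `round_trip` are hypothetical names introduced for illustration, and the sketch assumes (the reference does not say) that a `False` return signals failure. `bucket_name` is left at its default of `None`; what that default resolves to is not stated in this reference.

```python
from typing import Optional, Protocol


class LakeLike(Protocol):
    """Hypothetical structural type matching the two documented DataLake methods."""

    def upload(self, bucket_path: str, local_path: str,
               bucket_name: Optional[str] = None) -> bool: ...

    def download(self, bucket_path: str, destination_local_path: str,
                 bucket_name: Optional[str] = None) -> bool: ...


def round_trip(lake: LakeLike, local_path: str, bucket_path: str, copy_path: str) -> bool:
    """Upload local_path to bucket_path, then download it again to copy_path."""
    # Assumption: a False return value means the transfer did not succeed.
    if not lake.upload(bucket_path, local_path):
        return False
    return lake.download(bucket_path, copy_path)
```

Any object with matching `upload` and `download` methods can be passed in, which keeps the helper testable without Google Cloud credentials.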

hypermodel.platform.gcp.data_warehouse module

class hypermodel.platform.gcp.data_warehouse.DataWarehouse(config: hypermodel.platform.gcp.config.GooglePlatformConfig)

Bases: hypermodel.platform.abstract.data_warehouse.DataWarehouseBase

dataframe_from_query(query: str) → pandas.core.frame.DataFrame
dataframe_from_table(dataset: str, table: str) → pandas.core.frame.DataFrame
dry_run(query: str) → List[hypermodel.model.table_schema.SqlColumn]
import_csv(bucket_path: str, dataset: str, table: str) → bool
select_into(query: str, output_dataset: str, output_table: str) → bool
table_schema(dataset: str, table: str) → hypermodel.model.table_schema.SqlTable
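
One way the methods above might be combined is to dry-run a query before writing its result to a table. This is a sketch under stated assumptions: `materialise_if_valid` is a hypothetical helper, and it assumes `dry_run` returns the columns the query would produce, which the signature suggests but this reference does not confirm. How `dry_run` reports an invalid query (for example, by raising) is also not documented here, so the sketch does not handle that case.

```python
from typing import Any


def materialise_if_valid(warehouse: Any, query: str, dataset: str, table: str) -> bool:
    """Dry-run a query and, if it yields columns, write its result to a table."""
    # Assumption: an empty column list means there is nothing worth writing.
    columns = warehouse.dry_run(query)
    if not columns:
        return False
    # select_into is documented as returning bool; pass that result through.
    return warehouse.select_into(query, dataset, table)
```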

hypermodel.platform.gcp.gcp_base_op module

class hypermodel.platform.gcp.gcp_base_op.GcpBaseOp(config: hypermodel.platform.gcp.config.GooglePlatformConfig, pipeline_name: str, op_name: str)

Bases: object

GcpBaseOp defines the base functionality for a Kubeflow Pipeline Operation, providing a convenient wrapper over Kubeflow’s ContainerOp for use within Google Kubernetes Engine (GKE) on Google Cloud Platform.

bind_env(variable_name: str, value: str)

Create an environment variable for the container with the given value.

Parameters:
  • variable_name (str) – The name of the variable in the container
  • value (str) – The value to bind to the variable
Returns:

A reference to the current GcpBaseOp (for chaining)
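
Returning a reference "for chaining" means each bind call hands back the same object, so calls can be strung together. A minimal sketch of that pattern follows; `ChainableOp` is a hypothetical stand-in written for illustration, not the hypermodel class, and the real GcpBaseOp may store its values differently. The variable names and values are placeholders.

```python
from typing import Dict


class ChainableOp:
    """Hypothetical stand-in for illustration; not the hypermodel class."""

    def __init__(self) -> None:
        # Environment variables collected so far, keyed by variable name.
        self.env: Dict[str, str] = {}

    def bind_env(self, variable_name: str, value: str) -> "ChainableOp":
        self.env[variable_name] = value
        # Returning self is what makes the next call chainable.
        return self


# Two bindings applied in one expression.
op = ChainableOp().bind_env("GCP_PROJECT", "my-project").bind_env("GCP_ZONE", "us-central1-a")
```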

bind_gcp_auth(gcp_auth_secret: str)

Bind the gcp_auth_secret that contains the Service Account that this container should use to authenticate and authorise itself.

Parameters: gcp_auth_secret (str) – The name of the secret containing the service account this container should use
Returns: A reference to the current GcpBaseOp (for chaining)

bind_output_artifact_path(name: str, path: str)

Add an artifact to the Kubeflow Pipeline Operation under the given name, with its content taken from the given path.

Parameters:
  • name (str) – The name of the output artifact
  • path (str) – The path to find the content for the artifact
Returns:

A reference to the current GcpBaseOp (for chaining)

bind_output_file_path(name, path)

Add an output file to the Kubeflow Pipeline Operation under the given name, with its content taken from the given path.

Parameters:
  • name (str) – The name of the output file
  • path (str) – The path to find the content for the file
Returns:

A reference to the current GcpBaseOp (for chaining)

bind_secret(secret_name: str, mount_path: str)

Bind a secret with the name secret_name from Kubernetes (in the same namespace as the container) to the specified mount_path.

Parameters:
  • secret_name (str) – The name of the secret to mount
  • mount_path (str) – The path to mount the secret to
Returns:

A reference to the current GcpBaseOp (for chaining)

get(key: str)

Get the value of a variable bound to this Operation, returning None if the key is not found.

Parameters: key (str) – The key to get the value of
Returns: The value of the given key, or None if the key is not found in the currently bound values.

op(overrides={})

Generate a ContainerOp object from all the configuration stored as a part of this Op.

Parameters: overrides (Dict[str,str]) – Override the bound variables with these values
Returns: ContainerOp using settings from this op
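
One plausible reading of overrides is that they are layered on top of the values already bound to the op. The sketch below illustrates that reading with plain dictionaries and no Kubeflow dependency; `resolve_values` is a hypothetical helper, not part of hypermodel, and the library's actual merge logic is not shown in this reference. The sketch defaults `overrides` to `None` rather than `{}`, which is the usual Python practice for avoiding a shared mutable default.

```python
from typing import Dict, Optional


def resolve_values(bound: Dict[str, str],
                   overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return the bound values with overrides applied on top.

    Neither input dictionary is modified.
    """
    merged = dict(bound)            # copy so the caller's dict stays untouched
    merged.update(overrides or {})  # override keys win over bound keys
    return merged
```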
with_container(container_image_url: str, container_command: str, container_args: List[str])

Set information about which container to use, and the command in that container to execute as a part of this job.

Parameters:
  • container_image_url (str) – The url and tags for where we can find the container
  • container_command (str) – The command to execute
  • container_args (List[str]) – The arguments to pass the executable
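
The methods documented for GcpBaseOp can be combined roughly as follows. This is a sketch under assumptions: `configure_training_op` is a hypothetical helper, and the secret name, environment variable, and output path are placeholder values. Because no return value is documented for with_container, it is called on its own line rather than chained.

```python
from typing import Any, List


def configure_training_op(op: Any, image_url: str, args: List[str]) -> Any:
    """Apply container, auth, and output settings to a GcpBaseOp-like object."""
    # with_container documents no return value, so it is not chained.
    op.with_container(image_url, "python", args)
    # The bind_* methods are documented as returning the op, so they chain.
    (op.bind_gcp_auth("svc-acct-credentials")        # placeholder secret name
       .bind_env("STAGE", "training")                # placeholder variable
       .bind_output_file_path("model", "/tmp/model.joblib"))  # placeholder path
    return op
```

The helper accepts any object exposing these four methods, so it can be checked against a simple recording stub before being pointed at a real op.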

hypermodel.platform.gcp.services module

class hypermodel.platform.gcp.services.GooglePlatformServices

Bases: hypermodel.platform.abstract.services.PlatformServicesBase

Services related to our Google Platform / GitLab technology stack, including:

config

An object containing configuration information

Type: GooglePlatformConfig

lake

A reference to DataLake functionality, implemented through Google Cloud Storage.

Type: DataLake

warehouse

A reference to DataWarehouse functionality, implemented through BigQuery.

Type: DataWarehouse

config
git
lake
warehouse

Module contents