WANNA Docker

Multiple resources created by WANNA rely on Docker containers. We make it easy for you to build your images either locally or using GCP Cloud Build.

Types of docker images

We currently support three types of docker images:

  • provided_image - you supply a link to a docker image that already exists in the registry. We don't build anything; we only pass the link on to GCP.
  • local_build_image - you supply a Dockerfile with a context directory and additional information. We build the image for you on your machine or in the cloud.
  • notebook_ready_image - you supply a list of pip requirements to install in your Jupyter Notebook. This is useful if you want to start a notebook with custom libraries but don't want to maintain a Dockerfile.
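
The local_build_image type is shown in the referencing example in this document; the other two types could be declared along the following lines. This is a minimal sketch that uses only the fields listed in the parameter reference. The image names, the image URL, and the requirements file path are hypothetical placeholders.

```yaml
docker:
  images:
    # Nothing is built here; the link is only passed on to GCP.
    - build_type: provided_image
      name: prebuilt-base-image            # hypothetical name
      image_url: europe-west1-docker.pkg.dev/your-project/wanna-samples/your-image:latest  # hypothetical URL
    # Built from a list of pip requirements, no Dockerfile needed.
    - build_type: notebook_ready_image
      name: notebook-with-custom-libs      # hypothetical name
      requirements_txt: requirements.txt   # hypothetical path
  repository: wanna-samples
```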

Referencing docker images

Each docker image must have a name. You use this name to reference the image in a resource configuration, usually as docker_image_ref.

Example:

docker:
  images:
    - build_type: local_build_image
      name: custom-notebook-container-julia
      context_dir: .
      dockerfile: Dockerfile.notebook
  repository: wanna-samples
  cloud_build: true

notebooks:
  - name: wanna-notebook-julia
    environment:
      docker_image_ref: custom-notebook-container-julia

Local build vs GCP Cloud Build

By default, all docker images are built locally on your machine and then pushed to the registry. For a faster testing cycle, you can build images directly with GCP Cloud Build. The only change needed is to set cloud_build: true in the docker section of the WANNA yaml config, or to set the environment variable WANNA_DOCKER_BUILD_IN_CLOUD=true (the environment variable takes precedence).

Building in the cloud is generally faster because the images end up in the registry directly, so there is no need to push them over the network. That makes it suitable for fast testing. However, building images in the cloud is not allowed for production.

Build configuration

When building locally, you can set additional build parameters. These parameters must be specified in a separate yaml file at the path given by WANNA_DOCKER_BUILD_CONFIG. If this is not set, the path defaults to dockerbuild.yaml in the working directory.

These parameters correspond to standard docker build parameters.

One use case is cloning an internal git repository during the docker build.

In the dockerbuild.yaml:

ssh: github=~/.ssh/id_rsa

In the Dockerfile:

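# Prepare root's SSH directory and record the host key of the git server used below
RUN mkdir -m 700 /root/.ssh; \
  touch /root/.ssh/known_hosts; \
  chmod 600 /root/.ssh/known_hosts; \
  ssh-keyscan git.your.company.com > /root/.ssh/known_hosts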

RUN --mount=type=ssh,id=github git clone git@git.your.company.com:your_profile/your_repo.git

Parameters for docker section

class wanna.core.models.docker.DockerModel(*, images=[], repository=None, registry=None, cloud_build_timeout=12000, cloud_build=False, cloud_build_workerpool=None, cloud_build_workerpool_location=None, cloud_build_kaniko_version='latest', cloud_build_kaniko_flags=['--cache=true', '--compressed-caching=false', '--cache-copy-layers=true'])
  • images - [List[Union[LocalBuildImageModel, ProvidedImageModel, NotebookReadyImageModel]]] Docker images that will be used in wanna-ml resources
  • repository - [str] (optional) GCP Artifact Registry repository for pushing images
  • registry - [str] (optional) GCP Artifact Registry host; when not set, it defaults to {gcp_profile.region}-docker.pkg.dev
  • cloud_build_timeout - [int] (optional) number of seconds before the Cloud Build times out, defaults to 12000
  • cloud_build - [bool] (optional) false (default) to build locally, true to use GCP Cloud Build
  • cloud_build_workerpool - [str] (optional) Name of the GCP Cloud Build workerpool if you want to use one
  • cloud_build_workerpool_location - [str] (optional) Location of the GCP Cloud Build workerpool. Must be specified if cloud_build_workerpool is set.
  • cloud_build_kaniko_version - [str] (optional) which kaniko version to use (https://github.com/GoogleContainerTools/kaniko/), defaults to latest
  • cloud_build_kaniko_flags - [List[str]] (optional) which kaniko flags to use (https://github.com/GoogleContainerTools/kaniko/)
class wanna.core.models.docker.ProvidedImageModel(*, name, build_type, image_url)
  • build_type - [str] "provided_image"
  • name - [str] This will later be used in docker_image_ref in other resources
  • image_url - [str] URL link to the image
class wanna.core.models.docker.LocalBuildImageModel(*, name, build_type, build_args=None, context_dir, dockerfile)
  • build_type - [str] "local_build_image"
  • name - [str] This will later be used in docker_image_ref in other resources
  • build_args - [Dict[str, str]] (optional) docker build args
  • context_dir - [Path] Path to the docker build context directory
  • dockerfile - [Path] Path to the Dockerfile
class wanna.core.models.docker.NotebookReadyImageModel(*, name, build_type, build_args=None, base_image='gcr.io/deeplearning-platform-release/base-cpu', requirements_txt)
  • build_type - [str] "notebook_ready_image"
  • name - [str] This will later be used in docker_image_ref in other resources
  • build_args - [Dict[str, str]] (optional) docker build args
  • base_image - [str] (optional) base notebook docker image. Available images are listed at https://cloud.google.com/deep-learning-vm/docs/images. When not set, it defaults to the standard base CPU notebook image.
  • requirements_txt - [Path] Path to the requirements.txt file containing the python packages that will be installed
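
To show how these parameters fit together, here is a sketch of a docker section that builds one image in GCP Cloud Build on a private workerpool. It uses only the parameters documented above. The build arg, the workerpool name, the region, and the shortened timeout are hypothetical values chosen for illustration, and the kaniko settings simply repeat the documented defaults.

```yaml
docker:
  images:
    - build_type: local_build_image
      name: custom-notebook-container-julia
      context_dir: .
      dockerfile: Dockerfile.notebook
      build_args:
        JULIA_VERSION: "1.9"                     # hypothetical build arg
  repository: wanna-samples
  registry: europe-west1-docker.pkg.dev          # assumed region; otherwise derived from the GCP profile region
  cloud_build: true
  cloud_build_timeout: 1800                      # hypothetical value, in seconds
  cloud_build_workerpool: my-private-pool        # hypothetical workerpool name
  cloud_build_workerpool_location: europe-west1  # must be set whenever a workerpool is set
  cloud_build_kaniko_version: latest             # documented default
  cloud_build_kaniko_flags:                      # documented defaults
    - "--cache=true"
    - "--compressed-caching=false"
    - "--cache-copy-layers=true"
```

Leaving out every cloud_build_* key gives the default behaviour: a local build followed by a push to the registry.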

Roles and permissions

Permissions and suggested roles (following the principle of least privilege) required for working with docker images:

| WANNA action | Permissions | Suggested roles |
|---|---|---|
| build in Cloud Build | cloudbuild.builds.create and more | roles/cloudbuild.builds.builder |
| push | artifactregistry.repositories.uploadArtifacts, artifactregistry.tags.create, artifactregistry.tags.update | roles/artifactregistry.writer |

To build docker images locally, you need permission to push to GCP as described above and a running local Docker daemon. You also have to authenticate docker with GCP; the GCP documentation describes this in detail. In most cases, running the following is enough:

gcloud auth login

gcloud auth configure-docker europe-west1-docker.pkg.dev # Add more comma-separated repository hostnames if you wish

See the GCP documentation for the full list of available roles and permissions.