WANNA Docker#
Multiple resources created by WANNA rely on Docker containers. We make it easy for you to build your images either locally or using GCP Cloud Build.
Types of docker images#
We currently support three types of docker images:
- provided_image - you supply a link to the docker image in the registry. We don't build anything, just redirect this link to GCP.
- local_build_image - you supply a Dockerfile with a context directory and additional information. We build the image for you on your machine or in the cloud.
- notebook_ready_image - you supply a list of pip requirements to install in your Jupyter Notebook. This is useful if you want to start a notebook with custom libraries, but you don't want to handle Dockerfile information.
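As a rough sketch, a docker section declaring one image of each type could look like the following. The field names are taken from the parameter reference further down this page; the image names, the image URL, the file paths, and the repository are placeholders chosen for illustration, not values from a real project.

```yaml
docker:
  images:
    # Existing image in a registry; nothing is built (image_url is a placeholder)
    - build_type: provided_image
      name: serving-base
      image_url: europe-west1-docker.pkg.dev/your-project/your-repo/serving:latest
    # Built from your own Dockerfile and context directory
    - build_type: local_build_image
      name: training-container
      context_dir: .
      dockerfile: Dockerfile
    # Built from a pip requirements file only, no Dockerfile needed
    - build_type: notebook_ready_image
      name: notebook-with-libs
      requirements_txt: requirements.txt
  repository: your-repository   # placeholder Artifact Registry repository
```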
Referencing docker images#
Each docker image must have a name. By this name, you can later reference it in resource configuration, usually as docker_image_ref.
Example:
docker:
  images:
    - build_type: local_build_image
      name: custom-notebook-container-julia
      context_dir: .
      dockerfile: Dockerfile.notebook
  repository: wanna-samples
  cloud_build: true
notebooks:
  - name: wanna-notebook-julia
    environment:
      docker_image_ref: custom-notebook-container-julia
Local build vs GCP Cloud Build#
By default, all docker images are built locally on your machine and then pushed to the registry.
For a faster testing lifecycle, you can build images directly using GCP Cloud Build.
The only change needed is to set cloud_build: true in the docker section of the WANNA yaml config, or to set WANNA_DOCKER_BUILD_IN_CLOUD=true (the environment variable takes precedence).
Building in the cloud is generally faster because the built images land in the registry directly, so there is no need to push them over the network from your machine. That makes it suitable for fast testing. However, building images in the cloud is not allowed for production.
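A minimal sketch of a config that switches to Cloud Build might look like this. The image name, repository, and timeout value are illustrative assumptions; only the keys themselves come from this page.

```yaml
docker:
  images:
    - build_type: local_build_image
      name: quick-test-image      # hypothetical image name
      context_dir: .
      dockerfile: Dockerfile
  repository: your-repository     # placeholder
  cloud_build: true               # build with GCP Cloud Build instead of locally
  cloud_build_timeout: 3600       # seconds; illustrative value, the default is 12000
```

Since the environment variable takes precedence, setting WANNA_DOCKER_BUILD_IN_CLOUD=true should enable Cloud Build even if the file says cloud_build: false.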
Build configuration#
When building locally, you can set additional build parameters. These parameters must be specified in a separate yaml file whose path is given by WANNA_DOCKER_BUILD_CONFIG. If this is not set, it defaults to dockerbuild.yaml in the working directory.
These parameters refer to standard docker build parameters.
One example use case can be when you want to git clone your internal repository during the docker build.
In the dockerbuild.yaml:
ssh: github=~/.ssh/id_rsa
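In the Dockerfile:
# Prepare /root/.ssh and record the host key of the git server
RUN mkdir -m 700 /root/.ssh; \
    touch /root/.ssh/known_hosts; \
    chmod 600 /root/.ssh/known_hosts; \
    ssh-keyscan git.your.company.com > /root/.ssh/known_hosts
# Clone using the SSH key forwarded under the id "github"
RUN --mount=type=ssh,id=github git clone git@git.your.company.com:your_profile/your_repo.git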
Parameters for docker section#
wanna.core.models.docker.DockerModel(*, images=[], repository=None, registry=None, cloud_build_timeout=12000, cloud_build=False, cloud_build_workerpool=None, cloud_build_workerpool_location=None, cloud_build_kaniko_version='latest', cloud_build_kaniko_flags=['--cache=true', '--compressed-caching=false', '--cache-copy-layers=true'])
- images - [List[Union[LocalBuildImageModel, ProvidedImageModel, NotebookReadyImageModel]]] Docker images that will be used in wanna-ml resources
- repository - [str] (optional) GCP Artifact Registry repository for pushing images
- registry - [str] (optional) GCP Artifact Registry; when not set, it defaults to {gcp_profile.region}-docker.pkg.dev
- cloud_build_timeout - [int] (optional) number of seconds before Cloud Build times out; defaults to 12000
- cloud_build - [bool] (optional) false (default) to build locally, true to use GCP Cloud Build
- cloud_build_workerpool - [str] (optional) Name of the GCP Cloud Build workerpool if you want to use one
- cloud_build_workerpool_location - [str] (optional) Location of the GCP Cloud Build workerpool. Must be specified if cloud_build_workerpool is set.
- cloud_build_kaniko_version - [str] (optional) which kaniko (https://github.com/GoogleContainerTools/kaniko/) version to use; defaults to latest
- cloud_build_kaniko_flags - [List[str]] (optional) which kaniko (https://github.com/GoogleContainerTools/kaniko/) flags to use
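To illustrate how these parameters fit together, here is a hedged sketch of a docker section that builds in a private Cloud Build workerpool. The pool name, location, and repository are hypothetical placeholders, and the kaniko values simply repeat the defaults listed above.

```yaml
docker:
  images: []                                        # image definitions omitted for brevity
  repository: your-repository                       # placeholder
  registry: europe-west1-docker.pkg.dev             # follows the {region}-docker.pkg.dev pattern
  cloud_build: true
  cloud_build_workerpool: your-private-pool         # hypothetical pool name
  cloud_build_workerpool_location: europe-west1     # must be set once a pool is given
  cloud_build_kaniko_version: latest
  cloud_build_kaniko_flags:
    - "--cache=true"
    - "--compressed-caching=false"
    - "--cache-copy-layers=true"
```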
wanna.core.models.docker.ProvidedImageModel(*, name, build_type, image_url)
- build_type - [str] "provided_image"
- name - [str] This will later be used in docker_image_ref in other resources
- image_url - [str] URL link to the image
wanna.core.models.docker.LocalBuildImageModel(*, name, build_type, build_args=None, context_dir, dockerfile)
- build_type - [str] "local_build_image"
- name - [str] This will later be used in docker_image_ref in other resources
- build_args - [Dict[str, str]] (optional) docker build args
- context_dir - [Path] Path to the docker build context directory
- dockerfile - [Path] Path to the Dockerfile
wanna.core.models.docker.NotebookReadyImageModel(*, name, build_type, build_args=None, base_image='gcr.io/deeplearning-platform-release/base-cpu', requirements_txt)
- build_type - [str] "notebook_ready_image"
- name - [str] This will later be used in docker_image_ref in other resources
- build_args - [Dict[str, str]] (optional) docker build args
- base_image - [str] (optional) base notebook docker image; available images are listed at https://cloud.google.com/deep-learning-vm/docs/images. When not set, it defaults to the standard base CPU notebook image.
- requirements_txt - [Path] Path to the requirements.txt file containing python packages that will be installed
Roles and permissions#
Permissions and suggested roles (applying the principle of least privilege) required for manipulating docker images:
WANNA action | Permissions | Suggested Roles |
---|---|---|
build in Cloud Build | cloudbuild.builds.create and more | roles/cloudbuild.builds.builder |
push | artifactregistry.repositories.uploadArtifacts, artifactregistry.tags.create, artifactregistry.tags.update | roles/artifactregistry.writer |
To build the docker images locally, you need permission to push to GCP as described above and a running local Docker daemon. You also have to authenticate docker with GCP; the GCP documentation covers this in detail. Generally, you should be fine with running:
gcloud auth login
gcloud auth configure-docker europe-west1-docker.pkg.dev # Add more comma-separated repository hostnames if you wish