Container Image Specification

The container image specification in this SDK is built around a fluent, immutable API that lets you define environment requirements programmatically. Instead of writing Dockerfiles by hand, you construct an Image object by selecting a base and layering modifications on top of it.

The Fluent API Pattern

The Image class (defined in src/flyte/_image.py) follows a two-step construction pattern:

  1. Base Selection: Start with a from_* class method to establish the foundation.
  2. Customization: Use with_* methods to add layers. Each with_* call returns a new, immutable Image instance by calling the internal clone() method.

    from flyte import Image

    image = (
        Image.from_debian_base(python_version=(3, 12))
        .with_apt_packages("curl", "git")
        .with_pip_packages("pandas", "numpy")
        .with_env_vars({"APP_STAGE": "prod"})
    )

Immutability and Cloning

The Image class is designed to be immutable. When you call a customization method such as with_pip_packages, the SDK uses Image.clone() to create a copy of the image whose _layers tuple contains the existing layers plus the new Layer. The original instance is left untouched, so a base image definition can be reused across different tasks without side effects.
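
The copy-on-modify pattern can be illustrated without the SDK. The following is a minimal sketch, assuming a frozen dataclass that keeps its layers in a tuple; SketchImage and its fields are hypothetical stand-ins and do not reproduce the actual Image implementation.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class SketchImage:
    """Hypothetical stand-in for Image; not the SDK's actual class."""

    base: str
    layers: Tuple[str, ...] = ()

    def clone(self, extra_layer: str) -> "SketchImage":
        # Build a new instance; the original tuple is never mutated.
        return replace(self, layers=self.layers + (extra_layer,))

    def with_pip_packages(self, *packages: str) -> "SketchImage":
        return self.clone("pip:" + ",".join(packages))


base = SketchImage(base="debian")
training = base.with_pip_packages("pandas", "numpy")
serving = base.with_pip_packages("httpx")
```

Both training and serving derive from the same base, and base itself still has no layers afterwards, which is the property that makes shared base definitions safe.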

Base Image Constructors

The SDK provides several entry points for creating an image:

  • from_debian_base(): The recommended starting point. It uses a Flyte-optimized Debian-based image. You can specify python_version, flyte_version, and whether to install_flyte (defaults to True).
  • from_base(image_uri: str): Starts from any pre-existing image URI (e.g., "python:3.11-slim").
  • from_dockerfile(file, registry, name): Uses a custom Dockerfile. Note: Images created this way are not extendable (you cannot use with_* methods) because the SDK does not parse the Dockerfile to determine its internal state.
  • from_uv_script(script, name, ...): A specialized constructor that parses a uv-compatible script header to determine dependencies and Python version.
  • from_ref_name(name): References an image defined in the Flyte configuration or CLI arguments.

Image Layers and Customization

Customizations are represented by the Layer class. Every modification, whether it adds packages, files, or environment variables, is stored as a layer in the Image._layers attribute.

The "No Lists" Rule

A critical implementation detail in Layer (and its subclasses like PipPackages and AptPackages) is that they do not accept lists. All collections must be passed as tuples or separate arguments. This is enforced in Layer.__post_init__ to ensure that the layer objects are hashable, which is required for generating consistent image tags.

# Correct: Using separate arguments or tuples
image.with_pip_packages("pandas", "scikit-learn")

# Incorrect: This will raise a TypeError
# image.with_pip_packages(["pandas", "scikit-learn"])
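
The reasoning behind the rule can be sketched with a plain frozen dataclass. This is an illustration only: SketchLayer is a hypothetical class, and the check in the real Layer.__post_init__ may be implemented differently.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass(frozen=True)
class SketchLayer:
    """Hypothetical layer; the real Layer class may validate differently."""

    packages: Optional[Tuple[str, ...]] = None

    def __post_init__(self) -> None:
        # Reject lists on any field so that instances stay hashable.
        for f in fields(self):
            if isinstance(getattr(self, f.name), list):
                raise TypeError(f"{f.name} must be a tuple, not a list")


a = SketchLayer(packages=("pandas", "scikit-learn"))
b = SketchLayer(packages=("pandas", "scikit-learn"))

try:
    SketchLayer(packages=["pandas"])  # type: ignore[arg-type]
    rejected = False
except TypeError:
    rejected = True
```

Because every field is a tuple or a scalar, two layers with the same contents compare equal and hash identically, which is what a content-derived tag depends on.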

Common Customization Methods

  • with_pip_packages(*packages, ...): Adds a PipPackages layer. Supports index_url, extra_index_urls, and pre (for pre-releases).
  • with_apt_packages(*packages, ...): Adds an AptPackages layer for system-level dependencies.
  • with_requirements(file): Adds a Requirements layer, which is a specialized PipPackages layer that reads from a .txt file.
  • with_source_folder(src, dst): Copies a local directory into the image using a CopyConfig layer.
  • with_commands(commands): Runs arbitrary shell commands during the build process.

Hashing and Tagging

The SDK automatically manages image tagging through a hashing mechanism. The Image._get_hash_digest() method calculates an MD5 hash based on:

  1. The base_image URI.
  2. The contents of the dockerfile (if provided).
  3. The properties and contents of every Layer in the _layers tuple.

Each layer implements update_hash(hasher, ignore). For example, the Requirements layer includes the hash of the actual requirements file content:

    # src/flyte/_image.py

    class Requirements(PipPackages):
        file: Path

        def update_hash(self, hasher: hashlib._Hash, ignore: Optional[Any] = None):
            from ._utils import filehash_update
            super().update_hash(hasher, ignore=ignore)
            filehash_update(self.file, hasher)

This ensures that if you change a single dependency in your requirements.txt, the resulting image URI will change, triggering a new build.
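
The overall flow can be sketched with hashlib alone. This is a simplified illustration under stated assumptions: sketch_digest is a hypothetical function, layers are reduced to plain strings, and the byte layout the SDK actually feeds into the hasher is not reproduced here.

```python
import hashlib
from typing import Optional, Sequence


def sketch_digest(
    base_image: str,
    layers: Sequence[str],
    dockerfile_text: Optional[str] = None,
) -> str:
    """Fold the three inputs into one MD5 hex digest (illustrative only)."""
    # MD5 is used here for tagging, not for security.
    hasher = hashlib.md5(usedforsecurity=False)
    hasher.update(base_image.encode("utf-8"))
    if dockerfile_text is not None:
        hasher.update(dockerfile_text.encode("utf-8"))
    for layer in layers:
        # Stand-in for each layer's update_hash(hasher) call.
        hasher.update(layer.encode("utf-8"))
    return hasher.hexdigest()


before = sketch_digest("python:3.11-slim", ["pip:pandas==2.2.0"])
after = sketch_digest("python:3.11-slim", ["pip:pandas==2.2.1"])
same = sketch_digest("python:3.11-slim", ["pip:pandas==2.2.0"])
```

Identical inputs produce the same digest, so an unchanged image keeps its tag, while changing a single pinned version yields a different digest and therefore a different tag.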

Secrets and Private Dependencies

For builds requiring access to private repositories or registries, the SDK supports secret_mounts. These can be used with with_pip_packages, with_apt_packages, and with_commands.

    from flyte import Image, Secret

    image = Image.from_debian_base().with_pip_packages(
        "private-pkg",
        secret_mounts=[Secret(key="GITHUB_PAT", as_env_var="GITHUB_PAT")],
    )

When the image is built, these secrets are mounted into the build context, allowing tools like pip to authenticate against private indexes.

Integration with Tasks

Images are typically associated with tasks via the TaskEnvironment. Decorating a function with the environment's task decorator binds it to that environment, so the task runs in the container built from the specified image.

    # examples/image/base_image.py

    import flyte
    from flyte import Image

    image = Image.from_debian_base().with_pip_packages("httpx")
    env = flyte.TaskEnvironment(name="my_env", image=image)

    @env.task
    async def my_task(data: str) -> str:
        import httpx  # added to the image by with_pip_packages above
        # Task logic here
        return data

When you call flyte.build(image), the SDK resolves the final URI (e.g., registry/name:hash) and prepares the image for deployment.