Execution Environments

Execution environments serve as the primary mechanism for defining and grouping infrastructure requirements. By encapsulating settings like container images, compute resources, and secrets into reusable objects, the SDK allows developers to separate the "what" (business logic) from the "how" (infrastructure configuration).

The Environment Hierarchy

The SDK implements a hierarchical approach to infrastructure management, rooted in the base Environment class found in src/flyte/_environment.py. This base class defines the settings common to every execution context:

  • Infrastructure Metadata: Name, description, and deployment dependencies (depends_on).
  • Resource Allocation: CPU, memory, GPU, and disk requirements via the Resources class.
  • Security and Context: Secrets and environment variables.
  • Containerization: Docker image specifications and Kubernetes PodTemplate references.

This design lets the SDK layer specialized behavior on top of the shared base for two execution patterns: batch processing via TaskEnvironment and long-running services via AppEnvironment.

Task Environments and Grouping

The TaskEnvironment class (in src/flyte/_task_environment.py) is designed for batch-oriented workflows. Its most significant design choice is the @env.task decorator, which binds each function to the environment it runs in.

Namespace and Identity

When a task is defined using @env.task, its identity is inextricably linked to the environment. The fully-qualified name (FQN) of a task is constructed as <env_name>.<function_name>. For example:

import flyte

env = flyte.TaskEnvironment(name="analytics_env")

@env.task
async def process_data():
    ...

In this case, the task's FQN becomes analytics_env.process_data. This naming convention enforces organizational discipline, ensuring that tasks are logically grouped by their infrastructure needs rather than just their Python module path.
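The naming rule can be illustrated without the SDK. The following is a minimal sketch, assuming a hypothetical ToyTaskEnvironment class that only records names; it is not the SDK's TaskEnvironment and does not reflect its internals.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ToyTaskEnvironment:
    """Hypothetical stand-in used only to illustrate naming; not the SDK class."""

    name: str
    tasks: Dict[str, Callable] = field(default_factory=dict)

    def task(self, fn: Callable) -> Callable:
        # Build the fully-qualified name as <env_name>.<function_name>.
        fqn = f"{self.name}.{fn.__name__}"
        self.tasks[fqn] = fn
        return fn


env = ToyTaskEnvironment(name="analytics_env")


@env.task
async def process_data() -> None:
    ...


# Names registered under this environment.
registered = sorted(env.tasks)
```

Here registered holds the single name analytics_env.process_data, independent of the module in which process_data is defined.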

Configuration Overrides

The SDK implements a three-tier override system to balance consistency with flexibility:

  1. Environment Level: Sets defaults for all tasks (e.g., a shared base image).
  2. Decorator Level: Overrides specific settings for one task (e.g., retries or timeout).
  3. Invocation Level: task.override() changes settings at the moment the task is called within a workflow.
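The three tiers can be pictured as a layered merge. The sketch below is an illustration only: resolve_settings is a hypothetical helper, the settings are modelled as plain dictionaries, and the precedence (invocation over decorator over environment) is an assumption read from the ordering of the list above.

```python
from typing import Any, Dict, Optional


def resolve_settings(
    env_defaults: Dict[str, Any],
    decorator_overrides: Optional[Dict[str, Any]] = None,
    invocation_overrides: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge three tiers of settings; later tiers win over earlier ones."""
    resolved = dict(env_defaults)                 # tier 1: environment defaults
    resolved.update(decorator_overrides or {})    # tier 2: per-task decorator
    resolved.update(invocation_overrides or {})   # tier 3: per-call override
    return resolved


settings = resolve_settings(
    env_defaults={"image": "my-custom-image:v1", "retries": 0},
    decorator_overrides={"retries": 3},
    invocation_overrides={"timeout": 600},
)
```

In this toy model the shared image comes from the environment, the retry count from the decorator tier, and the timeout from the call site.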

Optimization via Container Reuse

A key performance feature of TaskEnvironment is the reusable parameter, which is configured with a ReusePolicy. It keeps containers "warm" between task executions, so high-frequency tasks avoid paying the container startup cost on every run.

However, this optimization introduces specific architectural constraints visible in src/flyte/_task_environment.py:

  • Async Requirement: Reusable environments with a concurrency greater than 1 are restricted to async tasks. The SDK raises a ValueError if a synchronous function is decorated in a high-concurrency reusable environment, as sync functions would block the shared container process.
  • Override Restrictions: When reusable is enabled, certain infrastructure settings like resources, env_vars, and secrets cannot be overridden at invocation time unless the user explicitly passes reusable="off". This prevents runtime changes that would require a container restart, which would defeat the purpose of the reuse policy.
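Both constraints amount to simple validation rules. The sketch below is a standalone illustration, not the code in src/flyte/_task_environment.py: the helper names check_reusable_task and check_override are hypothetical, and the restricted set contains only the three settings named above.

```python
import inspect
from typing import Any, Callable, Dict

# Only the settings named in the text; the SDK's actual list may be longer.
RESTRICTED_WHEN_REUSABLE = frozenset({"resources", "env_vars", "secrets"})


def check_reusable_task(fn: Callable, concurrency: int) -> None:
    """Reject sync functions when a reusable container runs tasks concurrently."""
    if concurrency > 1 and not inspect.iscoroutinefunction(fn):
        raise ValueError(
            f"{fn.__name__} must be async when concurrency is {concurrency}"
        )


def check_override(reusable_enabled: bool, overrides: Dict[str, Any]) -> None:
    """Reject restricted overrides unless reuse is switched off for the call."""
    if not reusable_enabled or overrides.get("reusable") == "off":
        return
    blocked = RESTRICTED_WHEN_REUSABLE.intersection(overrides)
    if blocked:
        raise ValueError(f"cannot override {sorted(blocked)} while reusable is on")


async def fetch() -> None:
    ...


def crunch() -> None:
    ...


check_reusable_task(fetch, concurrency=4)    # async task: accepted
check_reusable_task(crunch, concurrency=1)   # sync task, no concurrency: accepted
check_override(True, {"reusable": "off", "secrets": ["db"]})  # reuse disabled: accepted
```

Calling check_reusable_task(crunch, concurrency=4) or check_override(True, {"resources": "big"}) raises ValueError in this model, mirroring the two rules described above.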

Long-Running App Environments

For services that need to persist beyond a single execution, such as APIs or dashboards, the SDK provides AppEnvironment (in src/flyte/app/_app_environment.py). It inherits the core infrastructure settings of the base Environment but shifts the focus toward networking and lifecycle management.

Serving Logic

Unlike TaskEnvironment, which uses decorators to define discrete units of work, AppEnvironment uses decorators like @app_env.server to define the entry point of a service. It also manages:

  • Port Management: Defaulting to 8080, with strict validation to prevent the use of reserved system ports (e.g., 8012, 9090).
  • Scaling: Integration with a Scaling object to control replicas and autoscaling behavior.
  • Authentication: A requires_auth toggle that defaults to True, ensuring services are secure by default.
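Port validation of this kind could look roughly like the following. This is a sketch under stated assumptions: validate_port is a hypothetical helper, and the reserved set holds only the two example ports mentioned above, since the full list is not given here.

```python
DEFAULT_PORT = 8080
# Only the two examples from the text; the SDK's full reserved list is not shown.
RESERVED_PORTS = frozenset({8012, 9090})


def validate_port(port: int = DEFAULT_PORT) -> int:
    """Return the port if usable, otherwise raise ValueError."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port {port} is outside the valid TCP range 1-65535")
    if port in RESERVED_PORTS:
        raise ValueError(f"port {port} is reserved")
    return port


default_port = validate_port()      # falls back to 8080
custom_port = validate_port(3000)   # any non-reserved port passes
```

Rejecting reserved ports at definition time surfaces the conflict before deployment rather than when the service fails to bind.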

The fserve Runtime

A critical implementation detail of AppEnvironment is its reliance on the fserve runtime. When parameters are defined for an app, the SDK automatically constructs a container command starting with fserve. This runtime is responsible for materializing parameters and managing the app's lifecycle. If a developer provides a custom command that bypasses fserve while still trying to use parameters, the SDK raises a ValueError to prevent silent configuration failures.
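The guard described above can be modelled in a few lines. This is a toy illustration only: resolve_command is a hypothetical helper, the real fserve arguments are not described here so none are invented, and the pass-through behavior for apps without parameters is an assumption.

```python
from typing import List, Optional, Sequence

RUNTIME = "fserve"


def resolve_command(
    parameters: Sequence[str], command: Optional[List[str]] = None
) -> List[str]:
    """Toy model: pick the container command for an app."""
    if not parameters:
        # Assumption: without parameters, a custom command passes through as-is.
        return list(command or [])
    if not command:
        # Parameters present, no custom command: start with the runtime.
        # Real arguments are omitted because the text does not describe them.
        return [RUNTIME]
    if command[0] != RUNTIME:
        raise ValueError(
            "parameters need the fserve runtime, but the custom command bypasses it"
        )
    return list(command)


auto_command = resolve_command(["model_path"])
plain_command = resolve_command([], ["python", "server.py"])
```

In this model, resolve_command(["model_path"], ["python", "server.py"]) raises ValueError, which is the failure mode the text describes: parameters that would otherwise never be materialized.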

Environment Evolution via Cloning

To avoid repetitive configuration, both TaskEnvironment and AppEnvironment implement a clone_with method. This pattern allows developers to define a "base" environment and then create variations for specific use cases:

base_env = flyte.TaskEnvironment(
    name="base",
    image="my-custom-image:v1",
    resources=flyte.Resources(cpu="2", memory="4Gi"),
)

# Create a high-memory variant for a specific task
high_mem_env = base_env.clone_with(
    name="high-mem",
    resources=flyte.Resources(memory="16Gi"),
)

This approach ensures that infrastructure remains "DRY" (Don't Repeat Yourself) while still allowing for the fine-grained tuning required by complex production workloads.
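The cloning pattern resembles copy-with-changes on an immutable record. The sketch below models it with dataclasses.replace from the standard library; ToyEnvironment and ToyResources are hypothetical stand-ins, and the SDK's clone_with may behave differently.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class ToyResources:
    cpu: Optional[str] = None
    memory: Optional[str] = None


@dataclass(frozen=True)
class ToyEnvironment:
    name: str
    image: str
    resources: ToyResources

    def clone_with(self, name: str, **changes: Any) -> "ToyEnvironment":
        # Copy every field, then apply the requested changes to the copy.
        return replace(self, name=name, **changes)


base_env = ToyEnvironment(
    name="base",
    image="my-custom-image:v1",
    resources=ToyResources(cpu="2", memory="4Gi"),
)

high_mem_env = base_env.clone_with(
    name="high-mem",
    resources=ToyResources(memory="16Gi"),
)
```

In this toy model the image is inherited from the base, but the resources object is replaced wholesale, so high_mem_env.resources.cpu is None rather than "2". Whether the SDK merges individual resource fields is not stated here, so restating every field you rely on in the clone is the cautious choice.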