Dynamic Code Sandboxes

Dynamic Code Sandboxes allow you to execute arbitrary code strings or shell commands in isolated, ephemeral environments. This is useful for running dynamically generated logic (e.g., from an LLM), executing scripts with specific system dependencies, or running non-Python binaries without pre-building Docker images.

Creating Sandboxes with flyte.sandbox.create

The flyte.sandbox.create function is the primary entry point for creating containerized sandboxes. It supports three execution modes: Auto-IO (the default), Verbatim, and Command.

Auto-IO Mode (Default)

In Auto-IO mode, you provide a Python code snippet. Flyte automatically generates an argparse preamble to inject your declared inputs as local variables and an epilogue to collect your declared scalar outputs.

import flyte.sandbox
import datetime

# Define the sandbox
sandbox = flyte.sandbox.create(
    name="compute-stats",
    code="""
import numpy as np
nums = np.array([float(v) for v in values.split(",")])
mean = float(np.mean(nums))
window_end = dt + delta
""",
    inputs={
        "values": str,
        "dt": datetime.datetime,
        "delta": datetime.timedelta,
    },
    outputs={
        "mean": float,
        "window_end": datetime.datetime,
    },
    packages=["numpy"],
)

# Execute the sandbox
mean, window_end = await sandbox.run.aio(
    values="10,20,30",
    dt=datetime.datetime(2024, 1, 1),
    delta=datetime.timedelta(days=7),
)
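To make the preamble and epilogue idea concrete, here is a minimal plain-Python sketch of the pattern. It is not the code Flyte generates: the run_with_auto_io helper, the PARSERS table, and the string encodings chosen for datetime and timedelta are all assumptions made for illustration.

```python
import argparse
import datetime

# Hypothetical converters from CLI strings to declared input types.
# Illustration only; this is not Flyte's actual serialization format.
PARSERS = {
    int: int,
    float: float,
    str: str,
    datetime.datetime: datetime.datetime.fromisoformat,
    datetime.timedelta: lambda s: datetime.timedelta(seconds=float(s)),
}


def run_with_auto_io(code, inputs, outputs, argv):
    """Parse argv into declared inputs, run the snippet, collect declared outputs."""
    # "Preamble": one --flag per declared input, converted to its declared type.
    parser = argparse.ArgumentParser()
    for name, typ in inputs.items():
        parser.add_argument(f"--{name}", type=PARSERS[typ], required=True)
    namespace = dict(vars(parser.parse_args(argv)))

    # Body: the snippet sees the inputs as ordinary variables.
    exec(code, namespace)

    # "Epilogue": pick up variables whose names match the declared outputs.
    missing = [name for name in outputs if name not in namespace]
    if missing:
        raise NameError(f"snippet did not assign outputs: {missing}")
    return tuple(namespace[name] for name in outputs)


mean, window_end = run_with_auto_io(
    code="""
nums = [float(v) for v in values.split(",")]
mean = sum(nums) / len(nums)
window_end = dt + delta
""",
    inputs={"values": str, "dt": datetime.datetime, "delta": datetime.timedelta},
    outputs={"mean": float, "window_end": datetime.datetime},
    argv=["--values", "10,20,30", "--dt", "2024-01-01T00:00:00", "--delta", "604800"],
)
```

The sketch also shows why output names matter: the epilogue can only find a value if the snippet assigned a variable with exactly the declared name.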

Verbatim Mode

If you need full control over the Python script (e.g., manual file handling or custom CLI argument parsing), set auto_io=False. Flyte then runs your script as written: inputs are still forwarded as CLI arguments, but no preamble or epilogue is generated, so the script itself must read its input files from /var/inputs/ and write its outputs to /var/outputs/.

from flyte.io import File
import flyte.sandbox

etl_sandbox = flyte.sandbox.create(
    name="manual-etl",
    code="""
import json, pathlib
# Inputs are materialized in /var/inputs/
payload = json.loads(pathlib.Path("/var/inputs/payload").read_text())
total = sum(payload["values"])

# Outputs must be written to /var/outputs/
pathlib.Path("/var/outputs/total").write_text(str(total))
""",
    inputs={"payload": File},
    outputs={"total": int},
    auto_io=False,
)
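Because a verbatim script hard-codes /var/inputs/ and /var/outputs/, it can be awkward to try out locally. One possible approach, sketched below, is to keep the script body in a function that takes the two directories as parameters. The run_etl name and the temporary-directory layout are hypothetical and not part of Flyte.

```python
import json
import pathlib
import tempfile


def run_etl(inputs_dir, outputs_dir):
    """Body of a verbatim-mode script, with the I/O roots passed in for testing."""
    payload = json.loads((inputs_dir / "payload").read_text())
    total = sum(payload["values"])
    outputs_dir.mkdir(parents=True, exist_ok=True)
    (outputs_dir / "total").write_text(str(total))


# Local dry run against temporary directories instead of /var/inputs and /var/outputs.
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "inputs").mkdir()
    (root / "inputs" / "payload").write_text(json.dumps({"values": [1, 2, 3]}))
    run_etl(root / "inputs", root / "outputs")
    written = (root / "outputs" / "total").read_text()
```

Inside the sandbox, the same function would be called with pathlib.Path("/var/inputs") and pathlib.Path("/var/outputs").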

Command Mode

Command mode allows you to run arbitrary shell commands or binaries. This is useful for running tools like pytest or custom C++ binaries.

from flyte.io import File
import flyte.sandbox

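test_runner = flyte.sandbox.create(
    name="pytest-sandbox",
    # Run the tests, then write pytest's exit status to the declared output file
    command=[
        "/bin/bash",
        "-c",
        "pytest /var/inputs/tests.py -q; echo $? > /var/outputs/exit_code",
    ],
    inputs={"tests.py": File},
    outputs={"exit_code": str},
)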
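The example declares exit_code as a str output, so something has to produce that value. Assuming command mode collects outputs from /var/outputs/<name> the same way verbatim mode does, the sketch below shows the run-and-record pattern in plain Python. The run_and_record helper is hypothetical, and a stand-in command replaces pytest so the snippet runs anywhere.

```python
import pathlib
import subprocess
import sys
import tempfile


def run_and_record(command, outputs_dir):
    """Run a command and record its exit code in a text file named exit_code."""
    completed = subprocess.run(command)
    outputs_dir.mkdir(parents=True, exist_ok=True)
    (outputs_dir / "exit_code").write_text(str(completed.returncode))
    return completed.returncode


with tempfile.TemporaryDirectory() as tmp:
    out = pathlib.Path(tmp) / "outputs"
    # Stand-in command that exits with status 3, in place of a real test runner.
    run_and_record([sys.executable, "-c", "import sys; sys.exit(3)"], out)
    recorded = (out / "exit_code").read_text()
```

In a shell command the equivalent is to append something like `; echo $? > /var/outputs/exit_code` after the tool invocation.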

Configuring the Runtime Environment

Sandboxes automatically build a Docker image at runtime based on your requirements. You can customize this environment using the packages, system_packages, additional_commands, and image_config arguments.

from flyte.sandbox import ImageConfig
import flyte.sandbox

config = ImageConfig(
    registry="my-registry.io/project",
    python_version=(3, 11),
)

sandbox = flyte.sandbox.create(
    name="custom-env",
    code="import pandas; print(pandas.__version__)",
    packages=["pandas==2.1.0"],
    system_packages=["libpq-dev", "gcc"],
    additional_commands=["RUN echo 'Custom build step'"],
    image_config=config,
)
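As a mental model for how these arguments could shape the image, the sketch below renders an illustrative Dockerfile from them. The base image, the apt-get and pip invocations, and the ordering of layers are all assumptions. Flyte's image builder may produce something quite different, and render_dockerfile is a hypothetical helper.

```python
def render_dockerfile(python_version, packages=(), system_packages=(), additional_commands=()):
    """Render an illustrative Dockerfile. The layout is an assumption, not Flyte's output."""
    major, minor = python_version
    lines = [f"FROM python:{major}.{minor}-slim"]
    if system_packages:
        # OS-level dependencies, assumed here to be installed with apt-get.
        lines.append(
            "RUN apt-get update && apt-get install -y --no-install-recommends "
            + " ".join(system_packages)
        )
    if packages:
        # Python dependencies, assumed here to be installed with pip.
        lines.append("RUN pip install --no-cache-dir " + " ".join(packages))
    # Extra build steps are appended as written.
    lines.extend(additional_commands)
    return "\n".join(lines)


dockerfile = render_dockerfile(
    (3, 11),
    packages=["pandas==2.1.0"],
    system_packages=["libpq-dev", "gcc"],
    additional_commands=["RUN echo 'Custom build step'"],
)
```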

Image Configuration Fields

The ImageConfig class (imported from flyte.sandbox, as in the example above; defined in flyte.sandbox._code_sandbox) supports:

  • registry: The Docker registry to push the built image to.
  • registry_secret: The name of the Flyte secret containing registry credentials.
  • python_version: A tuple specifying the Python version (e.g., (3, 10)).

Dynamic Orchestration with orchestrator_from_str

For scenarios where you need to orchestrate multiple Flyte tasks using a dynamic expression (e.g., logic generated by an LLM), use orchestrator_from_str. This uses the Monty runtime for microsecond-startup, side-effect-free execution.

import flyte.sandbox
from my_tasks import add, multiply

# Create a task template from a code string
pipeline = flyte.sandbox.orchestrator_from_str(
    source="multiply(add(x, y), 2)",
    inputs={"x": int, "y": int},
    output=int,
    tasks=[add, multiply],
)

# Run the dynamic pipeline
result = flyte.run(pipeline, x=10, y=5)  # Returns 30

The CodeTaskTemplate returned by orchestrator_from_str behaves like any other Flyte task and can be passed to flyte.run().
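To illustrate what evaluating a side-effect-free expression over a fixed set of tasks can look like, here is a plain-Python sketch built on the standard ast module. It is not the Monty runtime and makes no claim about how orchestrator_from_str works internally. The evaluate_expression helper and the local add and multiply functions are stand-ins.

```python
import ast


def evaluate_expression(source, inputs, tasks):
    """Evaluate calls to allow-listed functions over named inputs; reject everything else."""
    allowed = {fn.__name__: fn for fn in tasks}

    def visit(node):
        # Literal constants such as 2 or "abc".
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float, str, bool)):
            return node.value
        # Bare names must refer to a declared input.
        if isinstance(node, ast.Name) and node.id in inputs:
            return inputs[node.id]
        # Calls must target an allow-listed function by its plain name.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in allowed
            and not node.keywords
        ):
            return allowed[node.func.id](*(visit(arg) for arg in node.args))
        raise ValueError(f"unsupported expression element: {ast.dump(node)}")

    return visit(ast.parse(source, mode="eval").body)


def add(a, b):
    return a + b


def multiply(a, b):
    return a * b


result = evaluate_expression("multiply(add(x, y), 2)", {"x": 10, "y": 5}, [add, multiply])
```

Anything outside the allow-list, such as attribute access or a call to `__import__`, raises ValueError instead of executing.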

Deploying Sandboxes as Tasks

If you want to register a sandbox as a permanent part of your Flyte project rather than running it one-shot, use the .as_task() method to convert it into a ContainerTask.

import flyte.sandbox

sandbox = flyte.sandbox.create(
    name="reusable-task",
    code="result = x + 1",
    inputs={"x": int},
    outputs={"result": int},
)

# Convert to a deployable ContainerTask
container_task = await sandbox.as_task.aio()

# Now container_task can be used in a @workflow
@flyte.workflow
def my_workflow(val: int) -> int:
    return container_task(x=val)

Troubleshooting and Constraints

Supported I/O Types

Sandboxes support only the following input and output types:

  • Primitives: int, float, str, bool
  • Date/Time: datetime.datetime, datetime.timedelta
  • IO Handles: flyte.io.File, flyte.io.Dir
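A declaration can be checked against this list before a sandbox is created. The sketch below is one way to do that. The File and Dir classes are local stand-ins for flyte.io.File and flyte.io.Dir so the snippet runs without Flyte installed, and unsupported_entries is a hypothetical helper.

```python
import datetime


class File:
    """Stand-in for flyte.io.File (assumption, used only for this sketch)."""


class Dir:
    """Stand-in for flyte.io.Dir (assumption, used only for this sketch)."""


SUPPORTED_TYPES = (
    int, float, str, bool,
    datetime.datetime, datetime.timedelta,
    File, Dir,
)


def unsupported_entries(declared):
    """Return the subset of an inputs/outputs dict whose types are not supported."""
    return {name: typ for name, typ in declared.items() if typ not in SUPPORTED_TYPES}


bad = unsupported_entries({"values": str, "payload": File, "rows": list})
```

Here bad contains only the "rows" entry, since list is not in the supported set. A collection would have to be passed some other way, for example serialized into a str or a File.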

Mutually Exclusive Arguments

In flyte.sandbox.create, the code and command arguments are mutually exclusive. You must provide exactly one.
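The "exactly one" rule can be expressed as a small guard. This sketch only illustrates the constraint: check_code_or_command is a hypothetical helper, and the error type and message Flyte actually raises may differ.

```python
def check_code_or_command(code=None, command=None):
    """Return which mode applies, or raise unless exactly one argument is supplied."""
    # True when both are given or both are missing.
    if (code is None) == (command is None):
        raise ValueError("provide exactly one of 'code' or 'command'")
    return "code" if code is not None else "command"


mode = check_code_or_command(command=["/bin/bash", "-c", "echo hi"])
```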

Output Collection in Auto-IO

In Auto-IO mode, scalar outputs (like int or float) must be assigned to a Python variable that exactly matches the key in the outputs dictionary. For File and Dir outputs, your code must write the data to /var/outputs/<name> regardless of the auto_io setting.

Network Access

By default, sandboxes have network access. To isolate the container completely, pass block_network=True. This applies network_mode=none in local Docker and a NetworkPolicy on-cluster.
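For readers unfamiliar with Kubernetes NetworkPolicy, the sketch below builds a generic deny-all policy manifest as a Python dict. It shows the standard Kubernetes pattern of listing both policy types with no allow rules. Whether Flyte generates exactly this manifest is an assumption, and the policy name and pod labels are hypothetical.

```python
def deny_all_policy(name, pod_labels):
    """Build a NetworkPolicy manifest that blocks all ingress and egress for matching pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name},
        "spec": {
            # Selects the pods the policy applies to (labels are hypothetical).
            "podSelector": {"matchLabels": dict(pod_labels)},
            # Listing both types with no ingress/egress rules denies all traffic.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


policy = deny_all_policy("sandbox-deny-all", {"app": "sandbox"})
```

For local Docker, the comparable isolation is what `docker run --network none` provides.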