Resource Management

To allocate compute resources (CPU, memory, GPUs, TPUs) to your tasks, pass a Resources object to a TaskEnvironment, or override the resources for an individual task call.

import flyte

# Define resources for all tasks in an environment
env = flyte.TaskEnvironment(
    name="ml-env",
    resources=flyte.Resources(
        cpu=2,
        memory="4Gi",
        gpu="A100:1",
        shm="auto"
    ),
)

@env.task
async def train_model():
    ...

# Or override resources for a single call, for example from a parent task
@env.task
async def main():
    await train_model.override(
        resources=flyte.Resources(cpu=4, memory="16Gi", gpu="A100:2")
    )()

Specifying CPU and Memory

The Resources class (defined in src/flyte/_resources.py) allows you to specify CPU and Memory using either single values or request/limit ranges.

CPU: Accepts int, float, or Kubernetes-style strings (e.g., "500m").
Memory: Accepts strings with Kubernetes units (e.g., "1Gi", "512Mi").
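To make the Kubernetes-style units concrete, here is a minimal sketch that converts them to plain numbers. The helpers cpu_to_cores and memory_to_bytes are hypothetical names used only for illustration; they are not part of Flyte, and Flyte's own handling of these values may differ.

```python
from decimal import Decimal

# Binary (power-of-two) and decimal suffixes used by Kubernetes quantities.
_MEMORY_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def cpu_to_cores(value) -> float:
    """Convert 2, 0.5, or "500m" into a number of CPU cores (hypothetical helper)."""
    if isinstance(value, (int, float)):
        return float(value)
    if value.endswith("m"):  # millicores: 1000m == 1 core
        return int(value[:-1]) / 1000
    return float(value)


def memory_to_bytes(value: str) -> int:
    """Convert "512Mi" or "1Gi" into a byte count (hypothetical helper)."""
    # Check two-letter suffixes before one-letter ones so "Mi" is not read as "M".
    for suffix in sorted(_MEMORY_SUFFIXES, key=len, reverse=True):
        if value.endswith(suffix):
            return int(Decimal(value[: -len(suffix)]) * _MEMORY_SUFFIXES[suffix])
    return int(value)  # no suffix: a plain byte count
```

In these units "500m" is half a core, and "Gi" is a power-of-two unit, so "1Gi" is slightly larger than "1G".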

To set separate requests and limits, provide a tuple:

# Request 1 CPU (limit 2) and 2Gi memory (limit 4Gi)
flyte.Resources(
    cpu=(1, 2),
    memory=("2Gi", "4Gi")
)
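The two-element rule for request/limit tuples can be sketched in plain Python. split_request_limit is a hypothetical helper, not Flyte's implementation. Treating a single value as a request with no explicit limit is an assumption of this sketch, not documented Flyte behavior.

```python
from typing import Optional, Tuple, Union

Scalar = Union[int, float, str]


def split_request_limit(
    value: Union[Scalar, Tuple[Scalar, ...]],
) -> Tuple[Scalar, Optional[Scalar]]:
    """Return (request, limit) for a single value or a two-element tuple."""
    if isinstance(value, tuple):
        if len(value) != 2:
            raise ValueError(
                f"expected (request, limit), got {len(value)} elements"
            )
        return value[0], value[1]
    # Assumption of this sketch: a single value is a request with no explicit limit.
    return value, None
```

For example, split_request_limit(("2Gi", "4Gi")) yields the request and limit separately, while a three-element tuple raises ValueError.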

Allocating Accelerators (GPU, TPU, Neuron)

You can allocate accelerators using three different formats for the gpu parameter:

1. Simple Count

Pass an int to request that many GPUs without naming a device type.

flyte.Resources(gpu=1)

2. String Format (Type and Quantity)

Pass a string in the format "Type:Quantity". The type must match one of the supported Accelerators literals in src/flyte/_resources.py (e.g., T4, L4, A100, H100, V100).

flyte.Resources(gpu="A100 80G:8")
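The "Type:Quantity" shape can be illustrated with a small parser. This is a sketch only: parse_accelerator is a hypothetical helper, it does not check the type against Flyte's Accelerators literal, and the minimum quantity of 1 is an assumption carried over from the Device constraint. Splitting on the last colon keeps type names that contain spaces, such as "A100 80G", intact.

```python
from typing import Tuple


def parse_accelerator(spec: str) -> Tuple[str, int]:
    """Split "Type:Quantity" into (device_type, quantity) (hypothetical helper)."""
    device, sep, count = spec.rpartition(":")
    if not sep or not device:
        raise ValueError(f"expected 'Type:Quantity', got {spec!r}")
    quantity = int(count)  # raises ValueError for a non-numeric count
    if quantity < 1:
        # Assumption: mirror the "at least 1" rule stated for Device quantities.
        raise ValueError("quantity must be at least 1")
    return device, quantity
```

With this sketch, "A100 80G:8" splits into the type "A100 80G" and the quantity 8, while a string without a colon is rejected.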

3. Advanced Device Configuration

For complex requirements like MIG partitioning or TPU slices, use the Device helper functions: GPU, TPU, Neuron, AMD_GPU, or HABANA_GAUDI.

# A100 with MIG partitioning (1g.5gb slice)
flyte.Resources(
    gpu=flyte.GPU(device="A100", quantity=1, partition="1g.5gb")
)

# Google Cloud TPU v5p with a specific slice
flyte.Resources(
    gpu=flyte.TPU(device="V5P", partition="2x2x1")
)

Shared Memory and Disk

Disk: Use the disk parameter to request ephemeral storage.
Shared Memory (shm): Useful for ML data loading. Setting shm="auto" automatically requests the maximum shared memory available on the node.

flyte.Resources(
    disk="100Gi",
    shm="16Gi"  # Or "auto"
)

Troubleshooting and Constraints

GPU Quantity: When using the Device class or GPU() helper directly, the quantity must be at least 1.
Validation: Resources validates that CPU and Memory tuples contain exactly two elements.
String Literals: If using the string format for GPUs (e.g., "H100:1"), the device name must exactly match the supported types defined in the Accelerators literal in src/flyte/_resources.py.
OOM Recovery: You can use .override() inside a try/except block to retry a task with more memory if it fails with an OOMError.

try:
    await my_task()
except flyte.errors.OOMError:
    # Retry with more memory
    await my_task.override(resources=flyte.Resources(memory="16Gi"))()
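The single retry above can be generalized to a loop that doubles the memory request after each failure. The sketch below runs without Flyte: FakeOOMError stands in for flyte.errors.OOMError, fake_task stands in for a real task, and run_with_escalation, the doubling policy, and the 32Gi cap are all hypothetical choices made for illustration.

```python
import asyncio


class FakeOOMError(Exception):
    """Stand-in for flyte.errors.OOMError so the sketch runs without Flyte."""


async def fake_task(memory_gi: int) -> str:
    # Hypothetical workload that needs at least 16Gi to finish.
    if memory_gi < 16:
        raise FakeOOMError(f"out of memory at {memory_gi}Gi")
    return f"succeeded with {memory_gi}Gi"


async def run_with_escalation(start_gi: int = 4, max_gi: int = 32) -> str:
    """Double the memory request after each OOM failure, up to max_gi."""
    memory_gi = start_gi
    while True:
        try:
            # With Flyte, this call would be replaced by something like:
            # await my_task.override(
            #     resources=flyte.Resources(memory=f"{memory_gi}Gi")
            # )()
            return await fake_task(memory_gi)
        except FakeOOMError:
            if memory_gi * 2 > max_gi:
                raise  # give up once doubling would exceed the cap
            memory_gi *= 2


if __name__ == "__main__":
    print(asyncio.run(run_with_escalation()))
```

Capping the escalation keeps a task with a genuine memory leak from requesting ever-larger nodes.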
