Resource Management

To allocate compute resources such as CPU, Memory, GPU, and TPU to your tasks, use the Resources class within a TaskEnvironment or as an override during task execution.

import flyte

# Define resources for all tasks in an environment
env = flyte.TaskEnvironment(
    name="ml-env",
    resources=flyte.Resources(
        cpu=2,
        memory="4Gi",
        gpu="A100:1",
        shm="auto",
    ),
)

@env.task
async def train_model():
    ...

# Or override resources for a specific task call.
# The await has to run inside an async function, here a parent task.
@env.task
async def main():
    await train_model.override(
        resources=flyte.Resources(cpu=4, memory="16Gi", gpu="A100:2")
    )()

Specifying CPU and Memory

The Resources class (defined in src/flyte/_resources.py) allows you to specify CPU and Memory using either single values or request/limit ranges.

  • CPU: Accepts int, float, or Kubernetes-style strings (e.g., "500m").
  • Memory: Accepts strings with Kubernetes units (e.g., "1Gi", "512Mi").
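To make these unit conventions concrete, here is a minimal, stdlib-only sketch of how Kubernetes-style quantities map to plain numbers. The helper names cpu_to_cores and memory_to_bytes are hypothetical and are not part of Flyte. The sketch handles only the millicore suffix and the common binary memory suffixes.

```python
# Hypothetical helpers for illustration only; not part of the flyte API.
_BINARY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def cpu_to_cores(cpu):
    """Convert an int, float, or millicore string such as "500m" to cores."""
    if isinstance(cpu, str) and cpu.endswith("m"):
        return int(cpu[:-1]) / 1000
    return float(cpu)


def memory_to_bytes(memory):
    """Convert a binary-suffixed string such as "512Mi" to bytes."""
    for suffix, factor in _BINARY_UNITS.items():
        if memory.endswith(suffix):
            return int(memory[: -len(suffix)]) * factor
    return int(memory)  # a plain byte count such as "1024"
```

In these terms, "500m" is half a core and "1Gi" is 2**30 bytes.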

To set separate requests and limits, provide a tuple:

# Request 1 CPU (limit 2) and 2Gi memory (limit 4Gi)
flyte.Resources(
    cpu=(1, 2),
    memory=("2Gi", "4Gi"),
)
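The tuple form reads as (request, limit). The sketch below uses a hypothetical split_request_limit helper, not anything from Flyte, to show one way of normalizing both forms and rejecting tuples that do not have exactly two elements. How Flyte treats a single value internally is not covered here, so the sketch assumes no separate limit and returns None for it.

```python
def split_request_limit(value):
    """Return (request, limit) for a single value or a 2-tuple.

    Hypothetical helper for illustration only; not part of the flyte API.
    """
    if isinstance(value, tuple):
        if len(value) != 2:
            raise ValueError(
                f"expected (request, limit), got {len(value)} element(s)"
            )
        request, limit = value
        return request, limit
    # Assumption of this sketch: a single value has no separate limit.
    return value, None
```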

Allocating Accelerators (GPU, TPU, Neuron)

You can allocate accelerators using three different formats for the gpu parameter:

1. Simple Count

Pass an int to request a generic GPU.

flyte.Resources(gpu=1)

2. String Format (Type and Quantity)

Pass a string in the format "Type:Quantity". The type must match one of the values of the Accelerators literal in src/flyte/_resources.py (e.g., T4, L4, A100, H100, V100). As the example below shows, a device name may contain a space.

flyte.Resources(gpu="A100 80G:8")
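Because a device name can contain a space, the quantity is easiest to read as everything after the last colon. The following sketch shows that split with a hypothetical parse_accelerator helper. It is not Flyte's own parser and does not check the device name against the Accelerators literal. Rejecting quantities below 1 is an assumption carried over from the Device rules.

```python
def parse_accelerator(spec):
    """Split a "Type:Quantity" string into (device, quantity).

    Hypothetical helper for illustration only; not part of the flyte API.
    """
    # rpartition splits on the LAST colon, so "A100 80G:8" keeps its space.
    device, sep, count = spec.rpartition(":")
    if not sep or not device:
        raise ValueError(f"expected 'Type:Quantity', got {spec!r}")
    quantity = int(count)  # raises ValueError for non-numeric counts
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    return device, quantity
```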

3. Advanced Device Configuration

For complex requirements like MIG partitioning or TPU slices, use the Device helper functions: GPU, TPU, Neuron, AMD_GPU, or HABANA_GAUDI.

# A100 with MIG partitioning (1g.5gb slice)
flyte.Resources(
    gpu=flyte.GPU(device="A100", quantity=1, partition="1g.5gb")
)

# Google Cloud TPU v5p with a specific slice
flyte.Resources(
    gpu=flyte.TPU(device="V5P", partition="2x2x1")
)

Shared Memory and Disk

  • Disk: Use the disk parameter to request ephemeral storage.
  • Shared Memory (shm): Useful for ML data loading. Setting shm="auto" automatically requests the maximum shared memory available on the node.

flyte.Resources(
    disk="100Gi",
    shm="16Gi",  # Or "auto"
)

Troubleshooting and Constraints

  • GPU Quantity: When using the Device class or GPU() helper directly, the quantity must be at least 1.
  • Validation: Resources validates that CPU and Memory tuples contain exactly two elements.
  • String Literals: If using the string format for GPUs (e.g., "H100:1"), the device name must exactly match the supported types defined in the Accelerators literal in src/flyte/_resources.py.
  • OOM Recovery: Inside a parent task, you can wrap a task call in a try/except block and use .override() to retry it with more memory if it fails with an OOMError.

# Inside an async parent task:
try:
    await my_task()
except flyte.errors.OOMError:
    # Retry with more memory
    await my_task.override(resources=flyte.Resources(memory="16Gi"))()