LLM Agent Framework

The LLM Agent Framework in this codebase provides a bridge between Flyte's structured task execution and the autonomous capabilities of Large Language Models (LLMs). By leveraging Flyte's type engine and task templates, the framework allows developers to expose Flyte tasks as tools that agents can invoke to perform complex, multi-step operations.

The framework is primarily implemented across two provider-specific packages:

flyteplugins.anthropic.agents._function_tools
flyteplugins.gemini.agents._function_tools

Core Concepts

The framework revolves around three main components: the FunctionTool, the Agent configuration, and the run_agent execution loop.

Bridging Tasks and Tools with `function_tool`

The function_tool utility is the entry point for creating agent-compatible tools. It accepts either a standard Python callable or a Flyte task (represented as an AsyncFunctionTaskTemplate).

A key feature of this implementation is its integration with the Flyte type engine. When a Flyte task is passed to function_tool, it uses the task's existing json_schema (derived from NativeInterface). This ensures that complex types like dataclasses, enums, and FlyteFile are correctly represented in the JSON schema sent to the LLM.

# From plugins/anthropic/src/flyteplugins/anthropic/agents/_function_tools.py
def function_tool(
    func: AsyncFunctionTaskTemplate | typing.Callable | None = None,
    *,
    name: str | None = None,
    description: str | None = None,
) -> "FunctionTool | partial[FunctionTool]":
    # ... logic to extract schema and metadata ...
    return FunctionTool(
        name=tool_name,
        description=tool_description.strip(),
        input_schema=input_schema,
        func=actual_func,
        task=task,
        # ...
    )

Agent Configuration

The Agent class serves as a configuration container for the LLM's behavior. While both Anthropic and Gemini implementations share similar structures, they differ in their model-specific parameters.

Anthropic Agent (flyteplugins.anthropic.agents._function_tools.Agent):

model: Defaults to "claude-sonnet-4-20250514".
max_tokens: Controls the response length.
instructions: The system prompt.

Gemini Agent (flyteplugins.gemini.agents._function_tools.Agent):

model: Defaults to "gemini-2.5-flash".
max_output_tokens: Controls the response length (note the naming difference from Anthropic).
instructions: The system prompt.

Both classes include a max_iterations attribute (defaulting to 10) to prevent infinite tool-call loops.

The Execution Loop: `run_agent`

The run_agent function implements the core conversation loop. It manages the state of the conversation, handles the back-and-forth between the LLM and tool execution, and returns the final response.

The loop follows this general flow:

Send the current message history to the LLM.
If the LLM returns a text response, return it as the final result.
If the LLM requests tool calls:
- Iterate through requested tools.
- Execute tools using FunctionTool.execute().
- Append tool results to the conversation history.
Repeat until a final response is reached or max_iterations is exceeded.

Tool Execution Logic

The FunctionTool.execute method is designed to work seamlessly within a Flyte environment. If the tool wraps a Flyte task, it uses task.aio(**kwargs). In a live Flyte execution context, this routes the call through the Flyte controller, providing full observability and tracking for each tool invocation.

# From plugins/anthropic/src/flyteplugins/anthropic/agents/_function_tools.py
async def execute(self, **kwargs) -> typing.Any:
    if self.task is not None:
        # Routes through Flyte controller in task context
        return await self.task.aio(**kwargs)
    if self.is_async:
        return await self.func(**kwargs)
    # Sync functions are run in a thread to avoid blocking
    return await asyncio.to_thread(self.func, **kwargs)

Integration with Flyte Workflows

Agents are typically executed within a Flyte task. This allows the agent to access secrets (like ANTHROPIC_API_KEY or GOOGLE_API_KEY) and benefit from Flyte's resource management.

The following example from examples/genai/anthropic_pbj_agent.py demonstrates how multiple Flyte tasks are converted into tools and used by an agent:

@agent_env.task
async def sandwich_agent(goals: list[str]) -> list[str]:
    # Create tools from Flyte tasks
    tools = [
        function_tool(get_bread),
        function_tool(get_peanut_butter),
        function_tool(get_jelly),
        function_tool(spread_ingredient),
        function_tool(assemble_sandwich),
        function_tool(eat_sandwich),
    ]

    async def run_single_goal(goal: str, index: int) -> str:
        # flyte.group provides logical grouping in the UI
        with flyte.group(f"sandwich-maker-{index}"):
            result = await run_agent(
                prompt=goal,
                tools=tools,
                system="You are a sandwich-making assistant.",
                model="claude-sonnet-4-20250514",
            )
            return result

    tasks = [run_single_goal(goal, idx) for idx, goal in enumerate(goals, start=1)]
    results = await asyncio.gather(*tasks)
    return list(results)

Key Integration Points

Secret Management: API keys are retrieved from environment variables (ANTHROPIC_API_KEY, GOOGLE_API_KEY) which can be populated via Flyte's Secret mechanism or TaskEnvironment.
Observability: By using flyte.group, tool calls made by the agent are grouped together in the Flyte console, making it easier to debug the agent's decision-making process.
Type Safety: The use of function_tool ensures that the LLM receives accurate schemas for Flyte-native types, reducing the likelihood of malformed tool arguments.

Core Concepts​

Bridging Tasks and Tools with function_tool​

Agent Configuration​

The Execution Loop: run_agent​

Tool Execution Logic​

Integration with Flyte Workflows​

Key Integration Points​