App Serving

Flyte's serving infrastructure deploys and manages long-running applications such as web servers, model endpoints, and dashboards. A single interface runs them both on your local machine for development and on a remote Flyte cluster for production.

In this tutorial, you will learn how to define an application, serve it locally to verify its behavior, and then configure it for remote deployment.

Prerequisites

To follow this tutorial, you need the flyte SDK installed and a basic understanding of AppEnvironment. To run the web server examples, you also need fastapi and uvicorn installed; the snippets that call the running app use httpx.

1. Define an App Environment

The first step is to define what your application looks like using an AppEnvironment. For this tutorial, we will use the FastAPIAppEnvironment helper, which simplifies running FastAPI applications.

from flyte.app.extras import FastAPIAppEnvironment

# Define the application environment
app_env = FastAPIAppEnvironment(
    name="hello-world-app",
    port=8080,
)

@app_env.server
def run_server():
    from fastapi import FastAPI
    import uvicorn

    app = FastAPI()

    @app.get("/")
    def read_root():
        return {"message": "Hello from Flyte App!"}

    uvicorn.run(app, host="0.0.0.0", port=8080)

The @app_env.server decorator marks the function that Flyte will execute to start your application.
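To make the registration idea concrete, here is a minimal sketch of how a decorator can record a start-up function on an object. MiniAppEnvironment and its start method are hypothetical stand-ins written for this illustration; they are not Flyte's implementation, which may work differently.

```python
from typing import Callable, Optional


class MiniAppEnvironment:
    """Hypothetical stand-in for illustration; not Flyte's AppEnvironment."""

    def __init__(self, name: str, port: int) -> None:
        self.name = name
        self.port = port
        self._server_fn: Optional[Callable[[], object]] = None

    def server(self, fn: Callable[[], object]) -> Callable[[], object]:
        # Remember the function so the environment can call it later,
        # and return it unchanged so it stays usable under its own name.
        self._server_fn = fn
        return fn

    def start(self) -> object:
        # Hypothetical entry point: run whatever function was registered.
        if self._server_fn is None:
            raise RuntimeError(f"no server function registered for {self.name!r}")
        return self._server_fn()


env = MiniAppEnvironment(name="hello-world-app", port=8080)


@env.server
def run_server() -> str:
    # A real entry point would block here while serving requests.
    return f"serving on port {env.port}"


print(env.start())
```

The decorator only stores a reference; nothing runs until the environment decides to start the app.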

2. Serve Locally for Development

Before deploying to a cluster, you can serve your app locally. This starts the application in a background thread or subprocess on your machine.

import flyte
import httpx

# Serve the app locally
handle = flyte.serve(app_env)

# Wait for the app to become active
handle.activate(wait=True)

print(f"App is active at: {handle.endpoint}")

# Verify the app is running
response = httpx.get(handle.endpoint)
print(response.json())

# Stop the app
handle.deactivate(wait=True)

flyte.serve(app_env) returns an AppHandle, your primary interface for managing the application's lifecycle. The activate(wait=True) call blocks until the server has started and is passing health checks. The health check path defaults to /health (see step 4) and this minimal app defines only /, so if activation times out, add a /health route as the complete example does.
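Waiting for an app to pass health checks amounts to polling a URL until it answers successfully or a deadline passes. The sketch below shows that idea using only the standard library. HealthHandler and wait_until_healthy are illustrative names, and the loop is an assumption about the general technique, not a description of what Flyte does internally.

```python
import threading
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Tiny stand-in app: 200 on /health, 404 everywhere else."""

    def do_GET(self) -> None:
        status = 200 if self.path == "/health" else 404
        self.send_response(status)
        self.end_headers()
        self.wfile.write(b"ok" if status == 200 else b"not found")

    def log_message(self, format: str, *args: object) -> None:
        pass  # keep the example's output quiet


# Bypass any system proxy settings so localhost requests go direct.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_until_healthy(url: str, timeout: float = 5.0, interval: float = 0.1) -> bool:
    """Poll url until it returns HTTP 200 or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with _opener.open(url, timeout=1.0) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not reachable or not healthy yet; retry below
        time.sleep(interval)
    return False


# Start the stand-in app on a free port in a background thread.
server = ThreadingHTTPServer(("127.0.0.1", 0), HealthHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

healthy = wait_until_healthy(f"http://127.0.0.1:{port}/health")
missing = wait_until_healthy(f"http://127.0.0.1:{port}/missing", timeout=0.5)
server.shutdown()
server.server_close()
print(healthy, missing)
```

The second call shows why the health check path matters: a server can be up and still never count as healthy if the polled route does not exist.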

3. Manage Lifecycle with Context Managers

Manually calling activate and deactivate can be error-prone. The AppHandle provides an ephemeral_ctx_sync context manager (and an async ephemeral_ctx) that automatically handles the startup and shutdown for you.

import flyte
import httpx

handle = flyte.serve(app_env)

# Use the context manager for automatic lifecycle management
with handle.ephemeral_ctx_sync():
    print(f"App is ready at {handle.endpoint}")
    response = httpx.get(handle.endpoint)
    assert response.status_code == 200

# The app is automatically deactivated after the block ends
assert handle.is_deactivated()

This pattern is particularly useful for integration tests where you need a running instance of your app for the duration of a test suite.
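The value of the context manager is that shutdown still happens when the body raises, as a failing test would. The following sketch demonstrates that guarantee with a hypothetical FakeHandle and a hand-written ephemeral helper; both are stand-ins for illustration and make no claim about how ephemeral_ctx_sync is implemented.

```python
from contextlib import contextmanager
from typing import Iterator


class FakeHandle:
    """Hypothetical stand-in that only tracks whether it is active."""

    def __init__(self) -> None:
        self.active = False

    def activate(self, wait: bool = True) -> None:
        self.active = True

    def deactivate(self, wait: bool = True) -> None:
        self.active = False


@contextmanager
def ephemeral(handle: FakeHandle) -> Iterator[FakeHandle]:
    handle.activate(wait=True)
    try:
        yield handle
    finally:
        # Runs on normal exit and also when the with-block raises.
        handle.deactivate(wait=True)


handle = FakeHandle()
try:
    with ephemeral(handle):
        assert handle.active
        raise ValueError("simulated test failure")
except ValueError:
    pass

print(handle.active)  # prints False: the handle was deactivated despite the error
```

With manual activate and deactivate calls, the same failure would skip the deactivate line and leave the app running.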

4. Configure Advanced Serving Options

For production deployments or specific local testing scenarios, you may need to override environment variables, set timeouts, or specify a cluster pool. You can do this using with_servecontext.

import flyte

# Configure a custom serve context
serve_ctx = flyte.with_servecontext(
    mode="local",
    env_vars={"DEBUG": "True", "API_KEY": "secret-123"},
    activate_timeout=30.0,
    health_check_path="/health",
)

# Serve the app with the custom configuration
handle = serve_ctx.serve(app_env)

The with_servecontext function returns a _Serve object that holds your configuration. Common parameters include:

  • env_vars: A dictionary of environment variables to inject into the app.
  • activate_timeout: How long to wait (in seconds) for the app to pass health checks.
  • health_check_path: The URL path Flyte should poll to determine if the app is healthy (defaults to /health).
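Injected environment variables arrive in the app as strings, so a value like "True" has to be parsed before use. As a minimal sketch of the receiving side, the hypothetical read_settings helper below reads the DEBUG and API_KEY variables from the example above; the helper and its accepted truthy spellings are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional, TypedDict


class Settings(TypedDict):
    debug: bool
    api_key: Optional[str]


def read_settings(environ: Mapping[str, str] = os.environ) -> Settings:
    # Environment variables are always strings, so "True" must be parsed.
    debug = environ.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"}
    # None when the variable was not injected.
    api_key = environ.get("API_KEY")
    return {"debug": debug, "api_key": api_key}


# Pass a plain dict to exercise the parsing without touching os.environ.
settings = read_settings({"DEBUG": "True", "API_KEY": "secret-123"})
print(settings)
```

Inside the server function you would call read_settings() with no arguments so that it reads the process environment.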

5. Deploy Remotely

To deploy your application to the Flyte backend, set mode to "remote". Flyte then builds a code bundle, containerizes your application, and deploys it to the configured cluster.

import flyte

# Deploy to the Flyte backend
remote_handle = flyte.with_servecontext(
    mode="remote",
    project="my-flyte-project",
    domain="development",
    cluster_pool="standard-pool",
    env_vars={"STAGE": "prod"},
).serve(app_env)

print(f"App deployed! Monitor it at: {remote_handle.url}")
print(f"Public endpoint: {remote_handle.endpoint}")

Because AppHandle is a protocol, the code you wrote to interact with handle.endpoint or handle.activate() works exactly the same way for a remote deployment as it did for your local server.
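A protocol describes the methods and attributes an object must have, without requiring a shared base class. The sketch below illustrates this with Python's typing.Protocol. HandleLike, LocalStub, and RemoteStub are hypothetical, reduced stand-ins; the real AppHandle has more members and may be declared differently.

```python
from typing import Protocol


class HandleLike(Protocol):
    """Hypothetical, reduced interface; the real AppHandle may differ."""

    @property
    def endpoint(self) -> str: ...

    def activate(self, wait: bool = False) -> None: ...


class LocalStub:
    """Stand-in for a locally served app."""

    def __init__(self) -> None:
        self.active = False

    @property
    def endpoint(self) -> str:
        return "http://localhost:8080"

    def activate(self, wait: bool = False) -> None:
        self.active = True


class RemoteStub:
    """Stand-in for a remotely deployed app (placeholder URL)."""

    def __init__(self) -> None:
        self.active = False

    @property
    def endpoint(self) -> str:
        return "https://example.invalid/hello-world-app"

    def activate(self, wait: bool = False) -> None:
        self.active = True


def start_and_describe(handle: HandleLike) -> str:
    # Works with any object that has the right shape; no inheritance needed.
    handle.activate(wait=True)
    return f"ready at {handle.endpoint}"


print(start_and_describe(LocalStub()))
print(start_and_describe(RemoteStub()))
```

Neither stub inherits from HandleLike, yet both satisfy it, which is why client code written against the handle's interface does not need to know where the app runs.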

Complete Example

Here is the complete code for defining, configuring, and serving an application locally using the patterns described above:

import flyte
import httpx
from flyte.app.extras import FastAPIAppEnvironment

# 1. Define
app_env = FastAPIAppEnvironment(name="tutorial-app", port=8080)

@app_env.server
def run_server():
    from fastapi import FastAPI
    import uvicorn

    app = FastAPI()

    @app.get("/health")
    def health():
        return {"status": "ok"}

    @app.get("/")
    def index():
        return {"data": "success"}

    uvicorn.run(app, host="0.0.0.0", port=8080)

# 2. Configure and Serve
if __name__ == "__main__":
    handle = flyte.with_servecontext(
        mode="local",
        env_vars={"APP_VERSION": "1.0.0"},
        health_check_path="/health",
    ).serve(app_env)

    # 3. Use
    with handle.ephemeral_ctx_sync():
        print(f"Calling {handle.endpoint}...")
        resp = httpx.get(handle.endpoint)
        print(f"Response: {resp.json()}")

    print("App shut down successfully.")

Next Steps

  • Explore flyte.app.AppEnvironment to learn how to define dependencies and resource requirements for your apps.
  • Check the flyte.app.ctx module to see how your application can access Flyte-specific context like raw_data_path during execution.