How to Build a Serverless Data API with DuckDB and FastAPI

In the modern data stack, speed and cost-efficiency are paramount. CTOs and engineering leaders are continuously seeking to eliminate management overhead without sacrificing performance. This article details a high-concurrency, low-cost pattern for serving analytical data by combining the in-process speed of DuckDB with the asynchronous power of FastAPI, deployed in a fully serverless environment.

We will move beyond trivial in-memory examples to build a production-grade, read-only analytical API. The core architectural challenge we solve is running a stateful, in-process database (DuckDB) in a stateless, ephemeral compute environment (serverless).

The solution involves treating the DuckDB database file as an immutable, read-only artifact, built by an upstream process and consumed at scale by our API.

State in a Stateless World

A serverless function (like AWS Lambda or Google Cloud Run) is ephemeral. Its local filesystem is temporary and not shared between concurrent instances.

  • Using :memory:: duckdb.connect(':memory:') creates an in-memory database. This is useless for a serverless API, as the database is empty and will be destroyed the moment the function terminates.
  • Using local.db: duckdb.connect('local.db') writes to the function's ephemeral disk. This is also useless, as the data is not persistent and, critically, not shared. Every concurrent execution of your function would have a different, empty database.
  • The "Concurrent Write" Fallacy: One might attempt to mount a shared file system (like EFS) and have all functions connect to shared/data.duckdb. This will fail. DuckDB is an in-process database; it uses a write-ahead log (WAL) and file-level locks to manage writes. Multiple processes writing to the same file simultaneously will lead to database corruption.

The correct serverless pattern, therefore, is to decouple the data-writing process from the data-reading API.

  1. Data Preparation (The "Write" Path): A separate, scheduled process (e.g., a dbt run, a GitHub Action, or a CI/CD pipeline) is responsible for data ingestion. It runs, gathers data from sources, and builds a new DuckDB database file (analytics.db).
  2. Artifact Storage: This file is then uploaded to cloud object storage (e.g., AWS S3 or Google Cloud Storage).
  3. Serverless API (The "Read" Path): Our FastAPI application, running on Lambda or Cloud Run, is configured to connect to this S3/GCS file in read-only mode.

This architecture allows for massive read concurrency. One million concurrent users are handled by one million separate serverless instances, each with its own read-only handle to the database file. There is no central database bottleneck.
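
To make the "write" path concrete, here is a minimal sketch of a scheduled build job. The bucket name, source CSV, and table name are placeholders, and the upload uses boto3, which assumes AWS credentials are available in the build environment:

# build_analytics.py -- run by a scheduled job (cron, GitHub Action, CI/CD)
import duckdb
import boto3

# Build a fresh DuckDB file from the source data
con = duckdb.connect("analytics.db")
con.execute(
    "CREATE OR REPLACE TABLE orders AS "
    "SELECT * FROM read_csv_auto('exports/orders.csv')"
)
con.close()

# Upload the finished artifact to object storage for the read-only API
boto3.client("s3").upload_file("analytics.db", "my-data-bucket", "analytics.db")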

Step 1: The FastAPI Application Code

Our FastAPI application will be simple. The key is where and how it connects to DuckDB. We will use DuckDB's httpfs extension, which allows it to directly read from S3 and GCS.

First, let's define the project structure:

/fastapi-duckdb
|-- app/
|   |-- main.py
|-- Dockerfile
|-- requirements.txt

requirements.txt

We need FastAPI, an ASGI server (Uvicorn), DuckDB, and pandas (the to_df() conversion used in our endpoints depends on it). For AWS Lambda, we also need mangum to adapt the ASGI app.

fastapi
uvicorn
duckdb
pandas  # Needed for the to_df() result conversion
mangum  # Required only for AWS Lambda

app/main.py

This code initializes the DuckDB connection on application startup. The lifespan startup hook is key: it lets the connection be established once when the serverless instance boots (a "cold start") and reused for all subsequent "warm" requests. Because the database file lives in object storage, the connection first loads the httpfs extension and configures credentials, then attaches the remote file in read-only mode.

import os
import duckdb
from fastapi import FastAPI, Depends
from contextlib import asynccontextmanager

# Retrieve the database path from environment variables
# Example: 's3://my-data-bucket/analytics.db'
# Example: 'gs://my-data-bucket/analytics.db'
DB_PATH = os.environ.get("DUCKDB_PATH")
READ_ONLY_MODE = True

class DuckDBConnection:
    """
    A class to manage the DuckDB connection lifecycle.
    
    On startup it loads the httpfs extension, configures S3/GCS credentials
    from environment variables, and attaches the remote database file in
    read-only mode. On shutdown it closes the connection.
    """
    def __init__(self):
        self.connection = None

    def connect(self):
        if not DB_PATH:
            raise RuntimeError("The DUCKDB_PATH environment variable is not set")

        print(f"Connecting to DuckDB at: {DB_PATH}")
        # Start from a local in-memory connection so httpfs and credentials
        # can be configured BEFORE the remote database file is opened.
        self.connection = duckdb.connect(database=":memory:")

        # Install and load the httpfs extension to read from S3/GCS
        self.connection.install_extension("httpfs")
        self.connection.load_extension("httpfs")

        # Configure credentials for GCS (HMAC keys for the S3-compatible API)
        if DB_PATH.startswith("gs://"):
            gcs_key_id = os.environ.get("GCS_KEY_ID")
            gcs_secret = os.environ.get("GCS_SECRET")
            if gcs_key_id and gcs_secret:
                print("Setting GCS credentials...")
                self.connection.execute(
                    f"CREATE SECRET gcs_secret (TYPE GCS, "
                    f"KEY_ID '{gcs_key_id}', SECRET '{gcs_secret}')"
                )

        # Configure credentials for S3
        elif DB_PATH.startswith("s3://"):
            aws_key_id = os.environ.get("AWS_ACCESS_KEY_ID")
            aws_secret = os.environ.get("AWS_SECRET_ACCESS_KEY")
            aws_session_token = os.environ.get("AWS_SESSION_TOKEN")
            aws_region = os.environ.get("AWS_REGION", "us-east-1")

            if aws_key_id and aws_secret:
                print(f"Setting S3 credentials for region: {aws_region}")
                self.connection.execute(f"SET s3_region='{aws_region}'")
                self.connection.execute(f"SET s3_access_key_id='{aws_key_id}'")
                self.connection.execute(f"SET s3_secret_access_key='{aws_secret}'")
                # Lambda execution roles expose temporary credentials that
                # also require a session token.
                if aws_session_token:
                    self.connection.execute(f"SET s3_session_token='{aws_session_token}'")
            else:
                print("No explicit S3 credentials found; attempting anonymous access.")

        # Attach the prepared database file and make it the default catalog
        attach_mode = "(READ_ONLY)" if READ_ONLY_MODE else ""
        self.connection.execute(f"ATTACH '{DB_PATH}' AS analytics {attach_mode}")
        self.connection.execute("USE analytics")

        print("DuckDB connection established.")

    def close(self):
        if self.connection:
            self.connection.close()
            print("DuckDB connection closed.")

    def get_connection(self):
        return self.connection

# Initialize the connection manager
db_manager = DuckDBConnection()

@asynccontextmanager
async def lifespan(app: FastAPI):
    # On startup
    db_manager.connect()
    yield
    # On shutdown
    db_manager.close()

app = FastAPI(lifespan=lifespan)

# Dependency to get the DB connection
def get_db():
    return db_manager.get_connection()


@app.get("/")
async def root():
    return {"message": "Serverless DuckDB API is running"}

@app.get("/query")
async def query_db(sql: str, db=Depends(get_db)):
    """
    Execute a SQL query.
    WARNING: In production, NEVER pass raw SQL from a user.
    This is for demonstration only. Build specific, parameterized endpoints
    (see the sketch after this code block).
    """
    try:
        # Execute the query and fetch results as a dictionary
        result = db.sql(sql).to_df().to_dict('records')
        return {"data": result}
    except Exception as e:
        return {"error": str(e)}

@app.get("/tables")
async def show_tables(db=Depends(get_db)):
    """A safe endpoint to list all tables."""
    try:
        tables = db.sql("SHOW TABLES").to_df().to_dict('records')
        return {"tables": tables}
    except Exception as e:
        return {"error": str(e)}

# --- AWS Lambda Specific ---
# If deploying to Lambda, wrap the app with Mangum
# This part is ignored by Uvicorn/Cloud Run
try:
    from mangum import Mangum
    handler = Mangum(app)
except ImportError:
    pass
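
In practice, replace the open /query endpoint with purpose-built endpoints that bind user input as query parameters. A minimal sketch, assuming the prepared analytics.db contains an orders table with a status column (both hypothetical names):

# Example of a purpose-built, parameterized endpoint. The `orders` table and
# its `status` column are hypothetical; adjust them to the schema of your
# analytics.db artifact.
@app.get("/orders/count")
async def count_orders(status: str, db=Depends(get_db)):
    """Count orders for a given status using a bound parameter, not raw SQL."""
    try:
        row = db.execute(
            "SELECT COUNT(*) AS total FROM orders WHERE status = ?", [status]
        ).fetchone()
        return {"status": status, "total": row[0]}
    except Exception as e:
        return {"error": str(e)}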

Step 2: Deployment to Google Cloud Run

Cloud Run is the simplest environment for this stack. It builds a container and runs it, handling scaling from zero to N.

Dockerfile

# Use an official Python runtime as a parent image
FROM python:3.11-slim

# Set the working directory in the container
WORKDIR /code

# Copy the requirements file and install dependencies
COPY ./requirements.txt /code/requirements.txt
RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

# Copy the application code
COPY ./app /code/app

# Command to run the application using uvicorn
# Uvicorn will look for the 'app' variable in /code/app/main.py
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8080"]

Deployment Steps:

  1. Grant Permissions and Generate HMAC Keys: Ensure your Cloud Run service account has "Storage Object Viewer" permissions on the GCS bucket containing your DuckDB file, then create HMAC keys for it. DuckDB's httpfs extension reads from GCS through the S3-compatible interoperability API, so it needs HMAC credentials rather than the service account's OAuth token.
  2. Set Environment Variables: When deploying the Cloud Run service, set the following:
    • DUCKDB_PATH: gs://your-bucket-name/analytics.db
    • GCS_KEY_ID and GCS_SECRET: the HMAC key ID and secret generated above.

Build and Deploy:

gcloud run deploy fastapi-duckdb-api \
  --source . \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars="DUCKDB_PATH=gs://your-bucket-name/analytics.db,GCS_KEY_ID=your-hmac-key-id,GCS_SECRET=your-hmac-secret"

Cloud Run will automatically build the container, push it to the Artifact Registry, and deploy it.
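
Once the service is live, you can smoke-test it with a couple of requests. A minimal sketch using the requests library; the base URL is a placeholder for the URL that gcloud prints at the end of the deploy:

import requests

# Placeholder for the URL printed by `gcloud run deploy`
BASE_URL = "https://fastapi-duckdb-api-xxxxxxxx-uc.a.run.app"

# List the tables exposed by the read-only analytics.db artifact
print(requests.get(f"{BASE_URL}/tables").json())

# Run a demo query through the /query endpoint
print(requests.get(f"{BASE_URL}/query", params={"sql": "SELECT 42 AS answer"}).json())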

Step 3: Deployment to AWS Lambda

Deploying to Lambda is more complex due to its runtime environment. DuckDB ships as a compiled native library, and the wheel you pip install on a Mac (ARM64 macOS) will not run on Lambda (x86_64 or ARM64 Linux).

You have two primary solutions:

  1. Container Image Deployment (Recommended): This is the modern, preferred method. You use a Dockerfile nearly identical to the Cloud Run example, adjusted for the Lambda base image and the Mangum handler (shown below).
  2. ZIP Deployment with Lambda Layers (Traditional): If you cannot use containers, you must use a pre-compiled DuckDB Lambda Layer.

Solution A: Lambda Container Deployment

The overall flow is: adapt the Dockerfile for the Lambda runtime and push the resulting image to ECR (both shown after the list below), then complete the following console steps.

  1. Create Lambda Function:
    • In the AWS Console, create a new Lambda function.
    • Choose "Container image" as the source.
    • Browse ECR and select your fastapi-duckdb-api:latest image.
    • Permissions: Attach an IAM role to the function that has s3:GetObject permissions on your analytics.db file.
    • Environment Variables: Set DUCKDB_PATH to s3://your-bucket-name/analytics.db.
  2. Create API Gateway:
    • Create a new HTTP API in API Gateway.
    • Create a route (e.g., GET / and a proxy route {proxy+}).
    • Configure an "Integration" to point to your new Lambda function.

Build and Push to ECR:

# Authenticate Docker with ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <YOUR_AWS_ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com

# Create the ECR repository
aws ecr create-repository --repository-name fastapi-duckdb-api

# Build, tag, and push
docker build -t fastapi-duckdb-api .
docker tag fastapi-duckdb-api:latest <YOUR_AWS_ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/fastapi-duckdb-api:latest
docker push <YOUR_AWS_ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/fastapi-duckdb-api:latest

Modify Dockerfile: For Lambda, switch to the AWS-provided base image and point the command at the Mangum handler.

# Use an AWS-provided base image for Python Lambda
FROM public.ecr.aws/lambda/python:3.11

# Set the working directory
WORKDIR ${LAMBDA_TASK_ROOT}

# Copy requirements and install
COPY ./requirements.txt ./requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY ./app ./app

# Set the command to run the Mangum handler
# This tells Lambda to execute 'handler' in 'app.main'
CMD ["app.main.handler"]

Solution B: Lambda ZIP Deployment (with Layers)

If you must use a ZIP deployment, you cannot simply pip install duckdb. You must use a pre-compiled layer.

  1. Find a Layer: Search for a public DuckDB Lambda Layer compatible with your Python runtime (e.g., awslambda-layer-duckdb).
  2. Add Layer ARN: In your Lambda function configuration, add the Layer ARN.
  3. Package Code: Install the remaining dependencies (fastapi, mangum, pandas) into a package directory with pip install -t, and zip them together with your app/ directory, leaving duckdb out of requirements.txt (it is provided by the layer).
  4. Upload: Upload the ZIP file as your Lambda's code source.
  5. Configure: Set the handler to app.main.handler and configure API Gateway and environment variables as in the container solution.

Performance and Cost Implications

This architecture presents clear, C-level trade-offs:

  • Cost: Extremely low. You pay only for object storage (pennies per GB) and serverless execution time (milliseconds). There is no idle database server cost.
  • Read Scalability: Near-infinite. Your read throughput scales with your serverless concurrency limit, which is typically thousands of concurrent requests.
  • Performance:
    • Cold Starts: The first request to a new instance will have latency. The function must initialize, and DuckDB must establish its remote connection to S3/GCS. This can take 1-3 seconds.
    • Warm Reads: Once warm, the API is exceptionally fast. Queries are executed by the in-process DuckDB engine, often in single-digit milliseconds.
  • Data Freshness: This is the primary trade-off. The data is read-only and only as fresh as your last data preparation build. This pattern is unsuitable for real-time transactional data but perfect for dashboards, business intelligence (BI), and analytical endpoints where data freshness of a few minutes or hours is acceptable.

A New Standard for Analytical APIs

By combining FastAPI, DuckDB, and serverless compute, we have engineered a highly scalable, cost-effective, and operationally simple solution for serving analytical data. The key is to embrace the immutable, read-only artifact pattern. This architecture decouples data ingestion from data serving, allowing each to scale independently and effectively, eliminating the need for a costly, always-on analytical database server for many common use cases.

FAQs

How does the combination of DuckDB and FastAPI facilitate a serverless architecture?

By pairing DuckDB with FastAPI, developers can create a high-concurrency analytical API without managing a dedicated database server. DuckDB operates as an in-process SQL engine that allows the FastAPI application to read data directly from cloud object storage (like AWS S3 or Google Cloud Storage) using the httpfs extension. This setup treats the database file as a read-only artifact, enabling the API to scale infinitely across serverless instances (such as AWS Lambda or Cloud Run) while maintaining low costs and high read performance.

What is the solution for managing stateful data in a stateless serverless environment?

The core challenge in serverless environments is that compute instances are ephemeral and do not share local storage. To address this, the "read-only artifact" pattern is used. Instead of writing to a local or in-memory database, the DuckDB database file is generated by an upstream process (e.g., a CI/CD pipeline) and uploaded to immutable object storage. The serverless API then connects to this remote file in read-only mode, decoupling the data storage from the compute layer and ensuring consistency across concurrent executions.

What are the performance and cost trade-offs of using this serverless data API pattern?

This architecture offers significant cost savings, as you only pay for storage and actual execution time, eliminating the expense of idle database servers. While "warm" requests provide exceptionally fast responses (often in single-digit milliseconds), "cold starts" may introduce a few seconds of latency while the connection to object storage is established. Additionally, because the database is read-only and updated via batch processes, this solution is ideal for analytical dashboards (OLAP) rather than real-time transactional (OLTP) systems requiring instant data freshness.
