How to Build a Serverless Data API with DuckDB and FastAPI
In the modern data stack, speed and cost-efficiency are paramount. CTOs and engineering leaders are continuously seeking to eliminate management overhead without sacrificing performance. This article details a high-concurrency, low-cost pattern for serving analytical data by combining the in-process speed of DuckDB with the asynchronous power of FastAPI, deployed in a fully serverless environment.
We will move beyond trivial in-memory examples to build a production-grade, read-only analytical API. The core architectural challenge we solve is running a stateful, in-process database (DuckDB) in a stateless, ephemeral compute environment (serverless).
The solution involves treating the DuckDB database file as an immutable, read-only artifact, built by an upstream process and consumed at scale by our API.
State in a Stateless World
A serverless function (like AWS Lambda or Google Cloud Run) is ephemeral. Its local filesystem is temporary and not shared between concurrent instances.
- Using :memory:, i.e. duckdb.connect(':memory:'), creates an in-memory database. This is useless for a serverless API, as the database is empty and is destroyed the moment the function terminates.
- Using local.db, i.e. duckdb.connect('local.db'), writes to the function's ephemeral disk. This is also useless, as the data is not persistent and, critically, not shared. Every concurrent execution of your function would have a different, empty database.
- The "Concurrent Write" Fallacy: One might attempt to mount a shared file system (like EFS) and have all functions connect to shared/data.duckdb. This will fail. DuckDB is an in-process database; it uses a write-ahead log (WAL) and file-level locks to manage writes. Multiple processes writing to the same file simultaneously will lead to database corruption.
The correct serverless pattern, therefore, is to decouple the data-writing process from the data-reading API.
- Data Preparation (The "Write" Path): A separate, scheduled process (e.g., a dbt run, a GitHub Action, or a CI/CD pipeline) is responsible for data ingestion. It runs, gathers data from sources, and builds a new DuckDB database file (analytics.db). A minimal build-and-upload sketch follows this list.
- Artifact Storage: This file is then uploaded to cloud object storage (e.g., AWS S3 or Google Cloud Storage).
- Serverless API (The "Read" Path): Our FastAPI application, running on Lambda or Cloud Run, is configured to connect to this S3/GCS file in read-only mode.
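As a concrete illustration of the write path, here is a minimal build-job sketch. The bucket name, source file, and table layout are hypothetical, and it assumes boto3 is installed with AWS credentials available in the pipeline environment:

# build_analytics.py - hypothetical nightly job that rebuilds the artifact
import duckdb
import boto3

DB_FILE = "analytics.db"
BUCKET = "my-data-bucket"             # hypothetical bucket name
SOURCE = "raw/orders_latest.parquet"  # hypothetical source extract

# Build a fresh database file from the source data
con = duckdb.connect(DB_FILE)
con.execute(f"CREATE OR REPLACE TABLE orders AS SELECT * FROM read_parquet('{SOURCE}')")
con.close()

# Publish the finished file as the new read-only artifact
boto3.client("s3").upload_file(DB_FILE, BUCKET, "analytics.db")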
This architecture allows for massive read concurrency. Concurrent requests are served by separate serverless instances, each holding its own read-only handle to the database file, so there is no central database bottleneck.
Step 1: The FastAPI Application Code
Our FastAPI application will be simple. The key is where and how it connects to DuckDB. We will use DuckDB's httpfs extension, which allows it to directly read from S3 and GCS.
First, let's define the project structure:
/fastapi-duckdb
|-- app/
| |-- main.py
|-- Dockerfile
|-- requirements.txt
requirements.txt
We need FastAPI, an ASGI server (Uvicorn), DuckDB, and pandas (the endpoints below convert query results to DataFrames before serializing them). For AWS Lambda, we also need mangum to adapt the ASGI app.
fastapi
uvicorn
duckdb
pandas # Needed for the .to_df() conversion in the endpoints
mangum # Required only for AWS Lambda
app/main.py
This code initializes the DuckDB connection when the application starts. FastAPI's lifespan hook is key, as it allows the connection to be established once when the serverless instance boots (a "cold start") and reused for all subsequent "warm" requests.
import os
import duckdb
from fastapi import FastAPI, Depends
from contextlib import asynccontextmanager

# Retrieve the database path from environment variables
# Example: 's3://my-data-bucket/analytics.db'
# Example: 'gs://my-data-bucket/analytics.db'
DB_PATH = os.environ.get("DUCKDB_PATH")
READ_ONLY_MODE = True  # Remote DuckDB files can only be attached read-only


class DuckDBConnection:
    """
    A class to manage the DuckDB connection lifecycle.
    It initializes the connection on startup and closes it on shutdown.
    It installs and loads the httpfs extension, configures S3/GCS
    credentials if provided via environment variables, and attaches
    the remote database file in read-only mode.
    """

    def __init__(self):
        self.connection = None

    def connect(self):
        print(f"Connecting to DuckDB at: {DB_PATH}")
        # Start with a local in-memory connection; the remote database file
        # is attached only after httpfs and credentials are configured.
        self.connection = duckdb.connect(database=":memory:")

        # Install and load the httpfs extension to read from S3/GCS
        self.connection.install_extension("httpfs")
        self.connection.load_extension("httpfs")

        # Configure credentials for GCS (HMAC keys)
        if DB_PATH.startswith("gs://"):
            gcs_key_id = os.environ.get("GCS_KEY_ID")
            gcs_secret = os.environ.get("GCS_SECRET")
            if gcs_key_id and gcs_secret:
                print("Setting GCS credentials...")
                # CREATE SECRET does not accept bound parameters,
                # so the values are interpolated directly.
                self.connection.execute(
                    f"CREATE SECRET gcs_secret (TYPE GCS, "
                    f"KEY_ID '{gcs_key_id}', SECRET '{gcs_secret}')"
                )
        # Configure credentials for S3
        elif DB_PATH.startswith("s3://"):
            aws_key_id = os.environ.get("AWS_ACCESS_KEY_ID")
            aws_secret = os.environ.get("AWS_SECRET_ACCESS_KEY")
            aws_session_token = os.environ.get("AWS_SESSION_TOKEN")
            aws_region = os.environ.get("AWS_REGION", "us-east-1")
            if aws_key_id and aws_secret:
                print(f"Setting S3 credentials for region: {aws_region}")
                self.connection.execute(f"SET s3_region='{aws_region}'")
                self.connection.execute(f"SET s3_access_key_id='{aws_key_id}'")
                self.connection.execute(f"SET s3_secret_access_key='{aws_secret}'")
                if aws_session_token:
                    # Required for temporary credentials (e.g. a Lambda execution role)
                    self.connection.execute(f"SET s3_session_token='{aws_session_token}'")
            else:
                print("Using IAM role for S3 access.")

        # Attach the remote database file and make it the default database
        attach_options = " (READ_ONLY)" if READ_ONLY_MODE else ""
        self.connection.execute(f"ATTACH '{DB_PATH}' AS analytics{attach_options}")
        self.connection.execute("USE analytics")
        print("DuckDB connection established.")

    def close(self):
        if self.connection:
            self.connection.close()
            print("DuckDB connection closed.")

    def get_connection(self):
        return self.connection


# Initialize the connection manager
db_manager = DuckDBConnection()


@asynccontextmanager
async def lifespan(app: FastAPI):
    # On startup
    db_manager.connect()
    yield
    # On shutdown
    db_manager.close()


app = FastAPI(lifespan=lifespan)


# Dependency to get the DB connection
def get_db():
    return db_manager.get_connection()


@app.get("/")
async def root():
    return {"message": "Serverless DuckDB API is running"}


@app.get("/query")
async def query_db(sql: str, db=Depends(get_db)):
    """
    Execute a SQL query.
    WARNING: In production, NEVER pass raw SQL from a user.
    This is for demonstration only; build specific, parameterized
    endpoints instead (see the sketch after this listing).
    """
    try:
        # Execute the query and fetch results as a list of dicts
        result = db.sql(sql).to_df().to_dict("records")
        return {"data": result}
    except Exception as e:
        return {"error": str(e)}


@app.get("/tables")
async def show_tables(db=Depends(get_db)):
    """A safe endpoint to list all tables."""
    try:
        tables = db.sql("SHOW TABLES").to_df().to_dict("records")
        return {"tables": tables}
    except Exception as e:
        return {"error": str(e)}


# --- AWS Lambda Specific ---
# If deploying to Lambda, wrap the app with Mangum
# This part is ignored by Uvicorn/Cloud Run
try:
    from mangum import Mangum

    handler = Mangum(app)
except ImportError:
    pass
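As the docstring warns, the raw /query passthrough exists only for demonstration. A safer pattern is to expose narrow, parameterized endpoints. The sketch below assumes a hypothetical orders table with order_date and amount columns, which is not part of the listing above:

# A purpose-built endpoint with bound parameters (sketch; the `orders`
# table and its columns are hypothetical).
@app.get("/orders/summary")
async def orders_summary(year: int, db=Depends(get_db)):
    try:
        row = db.execute(
            "SELECT COUNT(*) AS order_count, SUM(amount) AS revenue "
            "FROM orders WHERE date_part('year', order_date) = ?",
            [year],
        ).fetchone()
        return {"year": year, "order_count": row[0], "revenue": row[1]}
    except Exception as e:
        return {"error": str(e)}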
Step 2: Deployment to Google Cloud Run
Cloud Run is the simplest environment for this stack. It builds a container and runs it, handling scaling from zero to N.
Dockerfile
# Use an official Python runtime as a parent image
FROM python:3.11-slim
# Set the working directory in the container
WORKDIR /code
# Copy the requirements file and install dependencies
COPY ./requirements.txt /code/requirements.txt
RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
# Copy the application code
COPY ./app /code/app
# Command to run the application using uvicorn
# Uvicorn will look for the 'app' variable in /code/app/main.py
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8080"]
Deployment Steps:
- Grant Permissions: Ensure your Cloud Run service account has "Storage Object Viewer" permissions on the GCS bucket containing your DuckDB file.
- Set Environment Variables: When deploying the Cloud Run service, set the following:
  - DUCKDB_PATH: gs://your-bucket-name/analytics.db
  - (Optional) If not using the service account's default permissions, set GCS_KEY_ID and GCS_SECRET.
Build and Deploy:
gcloud run deploy fastapi-duckdb-api \
--source . \
--platform managed \
--region us-central1 \
--allow-unauthenticated \
--set-env-vars="DUCKDB_PATH=gs://your-bucket-name/analytics.db"
Cloud Run will automatically build the container, push it to the Artifact Registry, and deploy it.
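Once the service is live, a quick functional check confirms the API can reach the database file. This is just a sketch: the service URL below is a placeholder for the one printed by gcloud run deploy, and the requests library is assumed to be installed.

import requests

# Replace with the URL printed by `gcloud run deploy`
BASE_URL = "https://fastapi-duckdb-api-xxxxxxxx-uc.a.run.app"

print(requests.get(f"{BASE_URL}/", timeout=30).json())
print(requests.get(f"{BASE_URL}/tables", timeout=30).json())

# Demonstration-only raw query endpoint (not for production use)
resp = requests.get(f"{BASE_URL}/query", params={"sql": "SELECT 42 AS answer"}, timeout=30)
print(resp.json())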
Step 3: Deployment to AWS Lambda
Deploying to Lambda is more complex due to its runtime environment. DuckDB ships as a compiled extension module, and the macOS wheel you pip install on a Mac will not run on Lambda's Linux runtime (x86_64 or ARM64).
You have two primary solutions:
- Container Image Deployment (Recommended): This is the modern, preferred method. You use a Dockerfile nearly identical to the Cloud Run example, swapping in the AWS Lambda base image and the Mangum handler.
- ZIP Deployment with Lambda Layers (Traditional): If you cannot use containers, you must use a pre-compiled DuckDB Lambda Layer.
Solution A: Lambda Container Deployment
First, adapt the Dockerfile and push the image to ECR (both shown below), then wire everything together in AWS:
- Create Lambda Function:
  - In the AWS Console, create a new Lambda function.
  - Choose "Container image" as the source.
  - Browse ECR and select your fastapi-duckdb-api:latest image.
- Permissions: Attach an IAM role to the function that has s3:GetObject permissions on your analytics.db file.
- Environment Variables: Set DUCKDB_PATH to s3://your-bucket-name/analytics.db (a scripted alternative is sketched after this list).
- Create API Gateway:
  - Create a new HTTP API in API Gateway.
  - Create a route (e.g., GET / and a proxy route {proxy+}).
  - Configure an "Integration" to point to your new Lambda function.
Build and Push to ECR:
# Authenticate Docker with ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <YOUR_AWS_ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com
# Create the ECR repository
aws ecr create-repository --repository-name fastapi-duckdb-api
# Build, tag, and push
docker build -t fastapi-duckdb-api .
docker tag fastapi-duckdb-api:latest <YOUR_AWS_ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/fastapi-duckdb-api:latest
docker push <YOUR_AWS_ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/fastapi-duckdb-api:latest
Modify the Dockerfile: switch to the AWS Lambda base image and point the command at the Mangum handler.
# Use an AWS-provided base image for Python Lambda
FROM public.ecr.aws/lambda/python:3.11
# Set the working directory
WORKDIR ${LAMBDA_TASK_ROOT}
# Copy requirements and install
COPY ./requirements.txt ./requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code
COPY ./app ./app
# Set the command to run the Mangum handler
# This tells Lambda to execute 'handler' in 'app.main'
CMD ["app.main.handler"]
Solution B: Lambda ZIP Deployment (with Layers)
If you must use a ZIP deployment, you cannot simply pip install duckdb. You must use a pre-compiled layer.
- Find a Layer: Search for a public DuckDB Lambda Layer compatible with your Python runtime (e.g., awslambda-layer-duckdb).
- Add Layer ARN: In your Lambda function configuration, add the Layer ARN.
- Package Code: Zip your app/ directory without duckdb in requirements.txt (it is provided by the layer). A minimal packaging sketch follows this list.
- Upload: Upload the ZIP file as your Lambda's code source.
- Configure: Set the handler to app.main.handler and configure API Gateway and environment variables as in the container solution.
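To make the "Package Code" step concrete, here is a minimal packaging sketch. It vendors fastapi and mangum into the ZIP (duckdb comes from the layer, and uvicorn is not needed on Lambda); the file and directory names are assumptions, not a prescribed layout.

# package_lambda.py - hypothetical helper that builds lambda.zip
import sys
import shutil
import subprocess
from pathlib import Path

build_dir = Path("build")
if build_dir.exists():
    shutil.rmtree(build_dir)
build_dir.mkdir()

# Vendor the pure-Python dependencies; duckdb is provided by the Lambda Layer
subprocess.run(
    [sys.executable, "-m", "pip", "install", "fastapi", "mangum", "-t", str(build_dir)],
    check=True,
)

# Copy the application code alongside the dependencies
shutil.copytree("app", build_dir / "app")

# Produce lambda.zip, ready to upload as the function's code source
shutil.make_archive("lambda", "zip", root_dir=build_dir)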
Performance and Cost Implications
This architecture presents clear, C-level trade-offs:
- Cost: Extremely low. You pay only for object storage (pennies per GB) and serverless execution time (milliseconds). There is no idle database server cost.
- Read Scalability: Very high. Your read throughput scales with your serverless concurrency limit, which is typically thousands of concurrent requests.
- Performance:
- Cold Starts: The first request to a new instance will have latency. The function must initialize, and DuckDB must establish its remote connection to S3/GCS. This can take 1-3 seconds.
- Warm Reads: Once warm, the API is exceptionally fast. Queries are executed by the in-process DuckDB engine, often in single-digit milliseconds (a quick way to check both figures is sketched after this list).
- Data Freshness: This is the primary trade-off. The data is read-only and only as fresh as your last data preparation build. This pattern is unsuitable for real-time transactional data but perfect for dashboards, business intelligence (BI), and analytical endpoints where data freshness of a few minutes or hours is acceptable.
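To sanity-check the cold-start and warm-read figures above against your own deployment, a simple client-side timing loop is enough. The URL below is a placeholder and the requests library is assumed to be installed.

import time
import requests

BASE_URL = "https://fastapi-duckdb-api-xxxxxxxx-uc.a.run.app"  # placeholder

def timed_get(path: str) -> float:
    """Return the round-trip latency of a GET request in milliseconds."""
    start = time.perf_counter()
    requests.get(f"{BASE_URL}{path}", timeout=30).raise_for_status()
    return (time.perf_counter() - start) * 1000

# The first call typically lands on a cold instance; the rest should be warm.
latencies = [timed_get("/tables") for _ in range(5)]
print(f"first call: {latencies[0]:.0f} ms")
print("subsequent calls:", [f"{ms:.0f} ms" for ms in latencies[1:]])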
A New Standard for Analytical APIs
By combining FastAPI, DuckDB, and serverless compute, we have engineered a highly scalable, cost-effective, and operationally simple solution for serving analytical data. The key is to embrace the immutable, read-only artifact pattern. This architecture decouples data ingestion from data serving, allowing each to scale independently and eliminating the need for a costly, always-on analytical database server for many common use cases.
FAQs
How does the combination of DuckDB and FastAPI facilitate a serverless architecture?
By pairing DuckDB with FastAPI, developers can create a high-concurrency analytical API without managing a dedicated database server. DuckDB operates as an in-process SQL engine that allows the FastAPI application to read data directly from cloud object storage (like AWS S3 or Google Cloud Storage) using the httpfs extension. This setup treats the database file as a read-only artifact, enabling the API to scale horizontally across serverless instances (such as AWS Lambda or Cloud Run) while maintaining low costs and high read performance.
What is the solution for managing stateful data in a stateless serverless environment?
The core challenge in serverless environments is that compute instances are ephemeral and do not share local storage. To address this, the "read-only artifact" pattern is used. Instead of writing to a local or in-memory database, the DuckDB database file is generated by an upstream process (e.g., a CI/CD pipeline) and uploaded to immutable object storage. The serverless API then connects to this remote file in read-only mode, decoupling the data storage from the compute layer and ensuring consistency across concurrent executions.
What are the performance and cost trade-offs of using this serverless data API pattern?
This architecture offers significant cost savings, as you only pay for storage and actual execution time, eliminating the expense of idle database servers. While "warm" requests provide exceptionally fast responses (often in single-digit milliseconds), "cold starts" may introduce a few seconds of latency while the connection to object storage is established. Additionally, because the database is read-only and updated via batch processes, this solution is ideal for analytical dashboards (OLAP) rather than real-time transactional (OLTP) systems requiring instant data freshness.