FastAPI LangGraph Agent Production-Ready Template
A production-ready FastAPI template for building AI agent applications with LangGraph integration. This template provides a robust foundation for building scalable, secure, and maintainable AI agent services.
A production-ready template for building AI agent backends with FastAPI and LangGraph. It handles the hard parts (stateful conversations, long-term memory, tool calling, observability, rate limiting, auth) so you can focus on your agent logic. The project is written primarily in Python, distributed under the MIT License, and was first published in 2025. It has 2,339 stars and 544 forks on GitHub. Key topics include agent, agentic-ai, docker, fastapi, and fastapi-template.
FastAPI LangGraph Agent Template
Built for AI engineers who want a solid foundation, not a tutorial project.
What's included
- LangGraph stateful agent with checkpointing, tool calling, and human-in-the-loop support
- Long-term memory via mem0 + pgvector — semantic search per user, cache-backed
- LLM service with circular model fallback, exponential backoff retries, and total timeout budget
- Langfuse tracing on all LLM calls; Prometheus metrics + Grafana dashboards
- JWT auth with session management; rate limiting via slowapi
- Alembic migrations; optional Valkey/Redis cache layer
- Structured logging with request/session/user context on every line
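As a rough illustration of the last item, the sketch below attaches request/session/user context to every log line using only the standard library (`contextvars` plus a `logging.Filter`). The variable names, the logger name, and the JSON layout are assumptions made for this example; the template's own logging setup may use a different library and different field names.

```python
import contextvars
import io
import json
import logging

# Hypothetical context variables; the template's own names may differ.
request_id_var = contextvars.ContextVar("request_id", default="-")
session_id_var = contextvars.ContextVar("session_id", default="-")
user_id_var = contextvars.ContextVar("user_id", default="-")


class ContextFilter(logging.Filter):
    """Copy the current request/session/user IDs onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        record.session_id = session_id_var.get()
        record.user_id = user_id_var.get()
        return True  # never drop the record, only enrich it


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "request_id": record.request_id,
            "session_id": record.session_id,
            "user_id": record.user_id,
        })


def build_logger(stream) -> logging.Logger:
    """Create a logger whose handler enriches and JSON-formats each record."""
    logger = logging.getLogger("agent_sketch")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(ContextFilter())
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


# Usage: a middleware would set these values once per incoming request.
buffer = io.StringIO()
log = build_logger(buffer)
request_id_var.set("req-1")
session_id_var.set("sess-9")
user_id_var.set("user-42")
log.info("chat request received")
```

Because the IDs live in context variables rather than function arguments, handlers deeper in the call stack can log without threading the context through every call.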
Quickstart
    git clone <repo-url> my-agent && cd my-agent
    cp .env.example .env.development   # fill in your keys
    make install
    make docker-up                     # starts API + PostgreSQL
Open http://localhost:8000/docs to see the interactive API.
For local development without Docker see docs/getting-started.md.
Documentation
| Guide | What it covers |
|---|---|
| Getting Started | Prerequisites, local setup, first API call |
| Architecture | System design, request flow, component diagrams |
| Configuration | All environment variables with defaults |
| Authentication | JWT flow, sessions, endpoint reference |
| Database & Migrations | Schema, Alembic migrations, pgvector |
| LLM Service | Models, retries, fallback, timeout budget |
| Memory | mem0 long-term memory, cache layer |
| Observability | Langfuse, structured logging, Prometheus, profiling |
| Evaluation | Eval framework, custom metrics, reports |
| Docker | Docker, Compose, full monitoring stack |
Project structure
app/
  api/v1/          # Route handlers
  core/
    langgraph/     # Agent graph + tools
    prompts/       # System prompt template
    cache.py       # Valkey/Redis + in-memory fallback
    config.py      # Settings
    middleware.py  # Metrics, logging context, profiling
    limiter.py     # Rate limiting
  models/          # SQLModel ORM models
  schemas/         # Pydantic request/response schemas
  services/        # LLM, database, memory services
alembic/           # Database migrations
evals/             # LLM evaluation framework
Contributing
PRs welcome. Please read docs/getting-started.md to get your environment set up, then follow the coding conventions in AGENTS.md.
Report security issues privately — see SECURITY.md.
License
See LICENSE.
FAQ
General
What is this template?
A production-ready foundation for AI agent backends built on FastAPI + LangGraph. It bundles the components you'd otherwise wire up by hand: stateful conversations, long-term memory, tool calling, observability, rate limiting, and JWT auth.
How does this differ from a basic LangGraph setup?
The base LangGraph quickstart stops at "agent runs locally". This template adds Alembic migrations, mem0 + pgvector long-term memory, Langfuse tracing, Prometheus + Grafana dashboards, JWT sessions, slowapi rate limiting, structured logging with per-request context, and a circular-fallback LLM service — production concerns you'd otherwise build separately.
Setup & Configuration
Do I need Docker?
Recommended but not required. make docker-up starts the API + PostgreSQL together. For local-only setup see docs/getting-started.md.
Which LLM providers are supported?
Today: OpenAI only via the LLMRegistry in app/services/llm/registry.py. Multi-provider support (Anthropic, Google, OpenRouter) via LangChain's init_chat_model is planned — see #51. Configure your model via DEFAULT_LLM_MODEL in .env.development.
How do I configure long-term memory?
Long-term memory is self-hosted: mem0 runs in-process and persists into your existing PostgreSQL via pgvector — there is no separate mem0 cloud account or API key. You only need a working OPENAI_API_KEY (used for fact extraction + embeddings) and the pgvector extension enabled. See docs/memory.md for details.
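To make the moving parts concrete, here is a minimal sketch of a mem0-style configuration dict that points the vector store at PostgreSQL via pgvector and uses OpenAI for the LLM and embedder. The `LONG_TERM_MEMORY_MODEL` and `LONG_TERM_MEMORY_EMBEDDER_MODEL` names come from this README; the `POSTGRES_*` variable names, the collection name, and the helper `build_memory_config` are assumptions for illustration, and the template's actual wiring in its memory service may differ.

```python
import os


def build_memory_config(env=os.environ) -> dict:
    """Assemble a mem0-style config that stores vectors in PostgreSQL via pgvector."""
    return {
        "vector_store": {
            "provider": "pgvector",
            "config": {
                # POSTGRES_* names and the collection name are assumptions.
                "host": env.get("POSTGRES_HOST", "localhost"),
                "port": int(env.get("POSTGRES_PORT", "5432")),
                "dbname": env.get("POSTGRES_DB", "postgres"),
                "user": env.get("POSTGRES_USER", "postgres"),
                "password": env.get("POSTGRES_PASSWORD", ""),
                "collection_name": "longterm_memory",
            },
        },
        # A missing model variable raises KeyError, which surfaces
        # misconfiguration at startup instead of at first use.
        "llm": {
            "provider": "openai",
            "config": {"model": env["LONG_TERM_MEMORY_MODEL"]},
        },
        "embedder": {
            "provider": "openai",
            "config": {"model": env["LONG_TERM_MEMORY_EMBEDDER_MODEL"]},
        },
    }


# Usage with example values; the model names here are placeholders.
example_env = {
    "LONG_TERM_MEMORY_MODEL": "gpt-4o-mini",
    "LONG_TERM_MEMORY_EMBEDDER_MODEL": "text-embedding-3-small",
}
config = build_memory_config(example_env)
```

A dict of this shape is what mem0's `Memory.from_config` is documented to accept, though it is worth checking the key names against the mem0 version you have installed.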
Development
How do I add a custom tool?
Drop a LangChain @tool-decorated function in app/core/langgraph/tools/ and register it in the tools list exported from that package. The agent picks it up on next start; no graph changes needed.
How does the LLM service handle failures?
Two layers: (1) per-call exponential-backoff retry via tenacity, (2) circular fallback — if the active model exhausts its retries, the service rotates to the next model in LLMRegistry and continues. A total timeout budget caps the whole call so latency stays bounded. See docs/llm-service.md.
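The interaction of the two layers with the timeout budget can be sketched in plain Python. This is not the template's code: it swaps tenacity for a hand-written retry loop, and `call_with_fallback`, `AllModelsFailed`, and all parameter names and defaults are assumptions chosen for the example.

```python
import time


class AllModelsFailed(Exception):
    """Raised when every model is exhausted or the budget runs out."""


def call_with_fallback(models, call_model, prompt, *, start_index=0,
                       max_attempts=3, base_delay=0.5, budget_seconds=30.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Try each model in circular order, retrying with exponential backoff."""
    deadline = clock() + budget_seconds
    for offset in range(len(models)):
        # Circular rotation: start at the active model and wrap around.
        model = models[(start_index + offset) % len(models)]
        for attempt in range(max_attempts):
            if clock() >= deadline:
                raise AllModelsFailed("total timeout budget exhausted")
            try:
                return model, call_model(model, prompt)
            except Exception:
                if attempt == max_attempts - 1:
                    break  # retries exhausted: rotate to the next model
                delay = base_delay * (2 ** attempt)  # 0.5s, 1s, 2s, ...
                remaining = deadline - clock()
                if remaining <= 0:
                    raise AllModelsFailed("total timeout budget exhausted")
                sleep(min(delay, remaining))  # never sleep past the deadline
    raise AllModelsFailed("every model exhausted its retries")


# Usage: the first model always fails, so the call rotates to the second.
calls = []


def fake_call(model, prompt):
    calls.append(model)
    if model == "model-a":
        raise RuntimeError("simulated provider outage")
    return f"{model} answered: {prompt}"


used, reply = call_with_fallback(
    ["model-a", "model-b"], fake_call, "ping", sleep=lambda seconds: None
)
```

Passing `sleep` and `clock` as parameters keeps the function testable without real delays; the single deadline is what bounds latency across all retries and fallbacks combined.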
Can I use this without Langfuse?
Yes. Set LANGFUSE_TRACING_ENABLED=false (or omit the Langfuse keys). The agent runs unchanged; structured logs still capture request/session/user context.
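The toggle described above amounts to a small check at startup. The sketch below is one way to express it, assuming the flag defaults to enabled and that the keys are read from `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` (the Langfuse SDK's standard variable names). The helper name and the accepted "false" spellings are assumptions, not the template's implementation.

```python
import os

# Spellings treated as "disabled" in this sketch (an assumption).
_FALSEY = {"0", "false", "no", "off", ""}


def langfuse_enabled(env=os.environ) -> bool:
    """Return True only if the flag is not false and both keys are present."""
    flag = env.get("LANGFUSE_TRACING_ENABLED", "true").strip().lower()
    if flag in _FALSEY:
        return False
    has_public = bool(env.get("LANGFUSE_PUBLIC_KEY"))
    has_secret = bool(env.get("LANGFUSE_SECRET_KEY"))
    return has_public and has_secret


# Usage: explicitly disabled, even though keys are configured.
disabled = langfuse_enabled({
    "LANGFUSE_TRACING_ENABLED": "false",
    "LANGFUSE_PUBLIC_KEY": "pk-example",
    "LANGFUSE_SECRET_KEY": "sk-example",
})
```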
Troubleshooting
The API won't start
- Ensure PostgreSQL is running (make docker-up brings it up alongside the API)
- Confirm .env.development exists; if not, copy it from .env.example and fill in the required keys
- Apply migrations: make migrate
Memory / semantic search returns nothing
- Verify the pgvector extension is enabled in your PostgreSQL instance
- Confirm OPENAI_API_KEY is valid (mem0 calls OpenAI for fact extraction + embeddings)
- Check that LONG_TERM_MEMORY_MODEL and LONG_TERM_MEMORY_EMBEDDER_MODEL are set in .env.development
Rate limiting is too aggressive
Limits are defined in app/core/limiter.py (slowapi). Adjust per-route decorators or the default rate in that file. See docs/configuration.md for the related env vars.