🚀 Building a FastAPI Backend for Your AI Application with ChromaDB, FAISS, and LangChain


📘 Introduction

FastAPI is a modern, high-performance Python web framework, ideal for building RESTful APIs for AI and data-intensive applications. Combined with ChromaDB, FAISS, LangChain, and LLMs like DeepSeek or OpenAI, FastAPI can serve as a scalable, secure, and flexible backend powering multiple AI frontends, including Streamlit, React, mobile apps, and bots.


This guide walks you through setting up a FastAPI backend that exposes endpoints for:

  • Chat with Retrieval-Augmented Generation (RAG)

  • File/document ingestion

  • Vector search using ChromaDB + FAISS

  • User session and chat history

  • Multimodal support: PDF, audio, images

  • Optional: JWT-based user authentication

✅ Table of Contents

  1. Why Use FastAPI for AI Backends?

  2. Project Structure

  3. Setup and Installation

  4. Defining the Vector Store Logic

  5. Creating Endpoints with FastAPI

  6. Uploading and Parsing Documents

  7. Querying with LLM + LangChain

  8. Adding Multimodal Inputs

  9. Optional: Auth + Session History

  10. Deploying with Uvicorn, Docker, or Cloud

  11. Connecting Streamlit Frontend to FastAPI

  12. Use Cases and Extensions

  13. Final Thoughts + GitHub Template

1. 🧠 Why Use FastAPI for AI Backends?

  • High Performance (based on Starlette + Pydantic)

  • Built-in OpenAPI and Swagger UI

  • Asynchronous (async/await) for scalability

  • Auto-generated docs

  • Easily integrates with LLMs, LangChain, FAISS, ChromaDB

  • Works well with CORS, OAuth2, JWT (see the CORS sketch below)
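
That last point matters as soon as a browser-based frontend calls the API from another origin. A minimal sketch of enabling CORS, assuming the frontend runs on a different port (the origin below is an assumption; list your real frontend URLs):

python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Allow the frontend origin to call this API from the browser
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:8501"],  # e.g. a local Streamlit app
    allow_methods=["*"],
    allow_headers=["*"],
)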

2. 📂 Project Structure

text
ai-backend/
├── app/
│   ├── main.py
│   ├── models/
│   │   ├── faiss_store.py
│   │   ├── chroma_store.py
│   │   ├── embeddings.py
│   │   └── langchain_agent.py
│   ├── routes/
│   │   ├── ingest.py
│   │   ├── query.py
│   │   └── auth.py
│   ├── utils/
│   │   ├── pdf.py
│   │   └── audio.py
├── requirements.txt
├── Dockerfile
└── README.md

3. ⚙️ Setup and Installation

Create a virtual environment:

bash
python -m venv venv
source venv/bin/activate

Install dependencies:

bash
pip install fastapi uvicorn langchain chromadb faiss-cpu sentence-transformers pydantic pypdf python-multipart

Optional for Whisper/Image support:

bash
pip install openai-whisper transformers pillow

4. 🧬 Vector Store Logic (ChromaDB + FAISS)

models/chroma_store.py

python
import chromadb
from sentence_transformers import SentenceTransformer

client = chromadb.Client()
# get_or_create avoids an error if the collection already exists
collection = client.get_or_create_collection(name="docs")
embedder = SentenceTransformer("all-MiniLM-L6-v2")

def add_to_chroma(text, doc_id, metadata=None):
    vec = embedder.encode([text]).tolist()  # Chroma expects plain lists
    collection.add(
        documents=[text],
        ids=[doc_id],
        embeddings=vec,
        metadatas=[metadata or {}]
    )

def query_chroma(query):
    # Embed the query with the same model used at ingest time
    vec = embedder.encode([query]).tolist()
    return collection.query(query_embeddings=vec, n_results=3)
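
A quick sanity check of these helpers, e.g. from a Python shell with the module imported (the sample text and document id are illustrative):

python
add_to_chroma("FastAPI is a modern Python web framework.", doc_id="doc-1")
results = query_chroma("What is FastAPI?")
print(results["documents"][0])  # up to 3 matching documents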

models/faiss_store.py

python
import faiss
import numpy as np

dimension = 384  # embedding size of all-MiniLM-L6-v2
index = faiss.IndexFlatL2(dimension)
memory = []  # keeps the raw texts so search indices can be mapped back

def add_to_faiss(text, embedder):
    vec = embedder.encode([text])
    memory.append(text)
    index.add(np.array(vec).astype('float32'))  # add only the new vector

def query_faiss(text, embedder):
    vec = embedder.encode([text])
    distances, indices = index.search(np.array(vec).astype('float32'), k=3)
    return [memory[i] for i in indices[0] if i != -1]
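
Both stores can share the same SentenceTransformer instance. A short usage sketch (the sample text is illustrative):

python
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
add_to_faiss("FastAPI is a modern Python web framework.", embedder)
print(query_faiss("What is FastAPI?", embedder))  # top matching texts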

5. 🧾 Creating FastAPI Endpoints

app/main.py

python
from fastapi import FastAPI
from app.routes import ingest, query

app = FastAPI(title="AI API Backend")

app.include_router(ingest.router, prefix="/ingest")
app.include_router(query.router, prefix="/query")

@app.get("/")
def read_root():
    return {"message": "Welcome to the AI Backend"}

6. 📄 Document Upload + Parsing

routes/ingest.py

python
import io

from fastapi import APIRouter, UploadFile, File
from app.models.chroma_store import add_to_chroma
from pypdf import PdfReader  # matches the pypdf package installed earlier

router = APIRouter()

@router.post("/upload")
async def upload_file(file: UploadFile = File(...)):
    contents = await file.read()
    # wrap the bytes in a fresh stream; reading the upload consumed file.file
    reader = PdfReader(io.BytesIO(contents))
    text = "\n".join([page.extract_text() or "" for page in reader.pages])
    add_to_chroma(text, doc_id=file.filename)
    return {"status": "success", "filename": file.filename}
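
To exercise the endpoint without a frontend, here is a small sketch using requests (sample.pdf is a placeholder path; the server is assumed to be running locally on port 8000):

python
import requests

with open("sample.pdf", "rb") as f:
    res = requests.post(
        "http://localhost:8000/ingest/upload",
        files={"file": ("sample.pdf", f, "application/pdf")},
    )
print(res.json())  # {"status": "success", "filename": "sample.pdf"}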

7. 🧠 LLM Query Endpoint

routes/query.py

python
from fastapi import APIRouter
from pydantic import BaseModel
from app.models.chroma_store import query_chroma
from langchain.llms import OpenAI  # requires OPENAI_API_KEY in the environment

router = APIRouter()

class QueryRequest(BaseModel):
    query: str

llm = OpenAI(model_name="gpt-3.5-turbo")

@router.post("/ask")
def ask_q(req: QueryRequest):
    docs = query_chroma(req.query)
    context = "\n".join(docs["documents"][0])
    prompt = f"Context:\n{context}\n\nQuestion: {req.query}"
    response = llm(prompt)
    return {"answer": response}
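
If you prefer to let LangChain manage retrieval and prompting, the manual steps above can be collapsed into a RetrievalQA chain. A sketch using the same legacy langchain imports as the endpoint above (newer releases move these classes into langchain-community and langchain-openai):

python
from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import OpenAI
from langchain.vectorstores import Chroma

from app.models.chroma_store import client  # reuse the same Chroma client

# Point LangChain at the same "docs" collection and embedding model
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = Chroma(collection_name="docs", embedding_function=embeddings, client=client)

qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
)
answer = qa.run("What does the uploaded document cover?")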

8. 🖼️ Multimodal Upload (Optional)

python
@router.post("/audio")async def transcribe_audio(file: UploadFile = File(...)):    import whisper
    model = whisper.load_model("base")
    result = model.transcribe(file.file.name)    return {"transcription": result["text"]}
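
The same pattern extends to images. A minimal sketch that validates the upload with Pillow; a vision model (for example, a transformers image-to-text pipeline) could be plugged in where indicated:

python
import io

from PIL import Image

@router.post("/image")
async def describe_image(file: UploadFile = File(...)):
    contents = await file.read()
    image = Image.open(io.BytesIO(contents))
    # A captioning or OCR model could run here on `image`
    return {"filename": file.filename, "size": image.size, "mode": image.mode}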

9. 🔐 Optional Auth and History

Use fastapi-users or JWT tokens for user authentication and session tracking. Save chat history in SQLite or Redis for scalable, stateless deployments.
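
A minimal sketch of JWT protection with PyJWT (the secret, token URL, and payload shape are assumptions; fastapi-users gives you this plus user management out of the box):

python
import jwt  # PyJWT
from fastapi import Depends, HTTPException
from fastapi.security import OAuth2PasswordBearer

SECRET_KEY = "change-me"  # assumption: HS256 shared secret, load from env in practice
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="auth/token")

def get_current_user(token: str = Depends(oauth2_scheme)) -> str:
    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
        return payload["sub"]  # assumption: user id stored in the "sub" claim
    except jwt.PyJWTError:
        raise HTTPException(status_code=401, detail="Invalid or expired token")

# Any route can now require a valid token:
# @router.post("/ask")
# def ask_q(req: QueryRequest, user: str = Depends(get_current_user)): ...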

10. 🚀 Deployment

Run locally:

bash
uvicorn app.main:app --reload --port 8000

Create a Dockerfile:

dockerfile
FROM python:3.10
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

Deploy on:

  • HuggingFace Spaces

  • Fly.io

  • Render

  • Railway

  • AWS/GCP/Azure with API Gateway + Lambda

11. 🌐 Connect to Streamlit Frontend

In streamlit/app.py:

python
import requests
import streamlit as st

query = st.text_input("Ask your question")
if st.button("Submit"):
    res = requests.post("http://localhost:8000/query/ask", json={"query": query})
    st.write(res.json()["answer"])

Upload PDF:

python
file = st.file_uploader("Upload PDF")
if file:
    res = requests.post("http://localhost:8000/ingest/upload", files={"file": file})
    st.success(f"Uploaded {file.name}")

12. 💼 Use Cases

Use Case          Description
Chatbot API       Power any frontend with unified AI logic
Education         Students upload notes, query content
Healthcare        Doctors upload clinical docs + voice notes
Enterprise RAG    Vector search over manuals, emails
Multilingual AI   Extend API to support different locales

13. ✅ Final Thoughts

A FastAPI backend gives your AI application:

  • Scalability: Serve multiple frontends

  • Security: Add JWT, rate limiting

  • Maintainability: Separation of concerns

  • Extensibility: Add PDF, image, audio support

  • Speed: Async + optimized I/O

When paired with Streamlit, LangChain, and DeepSeek, you get a production-grade RAG architecture ready to scale.

🧩 Bonus: Sample GitHub Repo Structure

text
ai-rag-fastapi/
├── app/
│   ├── main.py
│   ├── routes/
│   ├── models/
│   └── utils/
├── streamlit_ui/
│   └── app.py
├── requirements.txt
├── Dockerfile
└── README.md
