🚀 Building a FastAPI Backend for Your AI Application with ChromaDB, FAISS, and LangChain


📘 Introduction

FastAPI is a modern, high-performance Python web framework, ideal for building RESTful APIs for AI and data-intensive applications. Combined with ChromaDB, FAISS, LangChain, and LLMs like DeepSeek or OpenAI, FastAPI can serve as a scalable, secure, and flexible backend powering multiple AI frontends, including Streamlit, React, mobile apps, and bots.


This guide walks you through setting up a FastAPI backend that exposes endpoints for:

  • Chat with Retrieval-Augmented Generation (RAG)

  • File/document ingestion

  • Vector search using ChromaDB + FAISS

  • User session and chat history

  • Multimodal support: PDF, audio, images

  • Optional: JWT-based user authentication

✅ Table of Contents

  1. Why Use FastAPI for AI Backends?

  2. Project Structure

  3. Setup and Installation

  4. Defining the Vector Store Logic

  5. Creating Endpoints with FastAPI

  6. Uploading and Parsing Documents

  7. Querying with LLM + LangChain

  8. Adding Multimodal Inputs

  9. Optional: Auth + Session History

  10. Deploying with Uvicorn, Docker, or Cloud

  11. Connecting Streamlit Frontend to FastAPI

  12. Use Cases and Extensions

  13. Final Thoughts + GitHub Template

1. 🧠 Why Use FastAPI for AI Backends?

  • High Performance (based on Starlette + Pydantic)

  • Built-in OpenAPI and Swagger UI

  • Asynchronous (async/await) for scalability

  • Auto-generated docs

  • Easily integrates with LLMs, LangChain, FAISS, ChromaDB

  • Works well with CORS, OAuth2, JWT (see the CORS sketch below)
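
That last point matters as soon as a browser-based frontend calls the API from another origin. A minimal sketch of enabling CORS, assuming the frontend runs on a different port (the origin below is an assumption; list your real frontend URLs):

python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Allow the frontend origin to call this API from the browser
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:8501"],  # e.g. a local Streamlit app
    allow_methods=["*"],
    allow_headers=["*"],
)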

2. 📂 Project Structure

text
ai-backend/
├── app/
│   ├── main.py
│   ├── models/
│   │   ├── faiss_store.py
│   │   ├── chroma_store.py
│   │   ├── embeddings.py
│   │   └── langchain_agent.py
│   ├── routes/
│   │   ├── ingest.py
│   │   ├── query.py
│   │   └── auth.py
│   ├── utils/
│   │   ├── pdf.py
│   │   └── audio.py
├── requirements.txt
├── Dockerfile
└── README.md

3. ⚙️ Setup and Installation

Create a virtual environment:

bash
python -m venv venv
source venv/bin/activate

Install dependencies:

bash
pip install fastapi uvicorn langchain chromadb faiss-cpu sentence-transformers pydantic pypdf python-multipart

Optional for Whisper/Image support:

bash
pip install openai-whisper transformers pillow

4. 🧬 Vector Store Logic (ChromaDB + FAISS)

models/chroma_store.py

python
import chromadb
from sentence_transformers import SentenceTransformer

client = chromadb.Client()
# get_or_create avoids an error if the collection already exists
collection = client.get_or_create_collection(name="docs")
embedder = SentenceTransformer("all-MiniLM-L6-v2")

def add_to_chroma(text, doc_id, metadata=None):
    vec = embedder.encode([text]).tolist()  # Chroma expects plain lists
    collection.add(
        documents=[text],
        ids=[doc_id],
        embeddings=vec,
        metadatas=[metadata or {}]
    )

def query_chroma(query):
    # Embed the query with the same model used at ingest time
    vec = embedder.encode([query]).tolist()
    return collection.query(query_embeddings=vec, n_results=3)
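
A quick sanity check of these helpers, e.g. from a Python shell with the module imported (the sample text and document id are illustrative):

python
add_to_chroma("FastAPI is a modern Python web framework.", doc_id="doc-1")
results = query_chroma("What is FastAPI?")
print(results["documents"][0])  # up to 3 matching documents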

models/faiss_store.py

python
import faiss
import numpy as np

dimension = 384  # embedding size of all-MiniLM-L6-v2
index = faiss.IndexFlatL2(dimension)
memory = []  # keeps the raw texts so search indices can be mapped back

def add_to_faiss(text, embedder):
    vec = embedder.encode([text])
    memory.append(text)
    index.add(np.array(vec).astype('float32'))  # add only the new vector

def query_faiss(text, embedder):
    vec = embedder.encode([text])
    distances, indices = index.search(np.array(vec).astype('float32'), k=3)
    return [memory[i] for i in indices[0] if i != -1]
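
Both stores can share the same SentenceTransformer instance. A short usage sketch (the sample text is illustrative):

python
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
add_to_faiss("FastAPI is a modern Python web framework.", embedder)
print(query_faiss("What is FastAPI?", embedder))  # top matching texts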

5. 🧾 Creating FastAPI Endpoints

app/main.py

python
from fastapi import FastAPI
from app.routes import ingest, query

app = FastAPI(title="AI API Backend")

app.include_router(ingest.router, prefix="/ingest")
app.include_router(query.router, prefix="/query")

@app.get("/")
def read_root():
    return {"message": "Welcome to the AI Backend"}

6. 📄 Document Upload + Parsing

routes/ingest.py

python
import io

from fastapi import APIRouter, UploadFile, File
from app.models.chroma_store import add_to_chroma
from pypdf import PdfReader  # matches the pypdf package installed earlier

router = APIRouter()

@router.post("/upload")
async def upload_file(file: UploadFile = File(...)):
    contents = await file.read()
    # wrap the bytes in a fresh stream; reading the upload consumed file.file
    reader = PdfReader(io.BytesIO(contents))
    text = "\n".join([page.extract_text() or "" for page in reader.pages])
    add_to_chroma(text, doc_id=file.filename)
    return {"status": "success", "filename": file.filename}
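
To exercise the endpoint without a frontend, here is a small sketch using requests (sample.pdf is a placeholder path; the server is assumed to be running locally on port 8000):

python
import requests

with open("sample.pdf", "rb") as f:
    res = requests.post(
        "http://localhost:8000/ingest/upload",
        files={"file": ("sample.pdf", f, "application/pdf")},
    )
print(res.json())  # {"status": "success", "filename": "sample.pdf"}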

7. 🧠 LLM Query Endpoint

routes/query.py

python
from fastapi import APIRouter
from pydantic import BaseModel
from app.models.chroma_store import query_chroma
from langchain.llms import OpenAI  # requires OPENAI_API_KEY in the environment

router = APIRouter()

class QueryRequest(BaseModel):
    query: str

llm = OpenAI(model_name="gpt-3.5-turbo")

@router.post("/ask")
def ask_q(req: QueryRequest):
    docs = query_chroma(req.query)
    context = "\n".join(docs["documents"][0])
    prompt = f"Context:\n{context}\n\nQuestion: {req.query}"
    response = llm(prompt)
    return {"answer": response}
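
If you prefer to let LangChain manage retrieval and prompting, the manual steps above can be collapsed into a RetrievalQA chain. A sketch using the same legacy langchain imports as the endpoint above (newer releases move these classes into langchain-community and langchain-openai):

python
from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import OpenAI
from langchain.vectorstores import Chroma

from app.models.chroma_store import client  # reuse the same Chroma client

# Point LangChain at the same "docs" collection and embedding model
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = Chroma(collection_name="docs", embedding_function=embeddings, client=client)

qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
)
answer = qa.run("What does the uploaded document cover?")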

8. 🖼️ Multimodal Upload (Optional)

python
@router.post("/audio")async def transcribe_audio(file: UploadFile = File(...)):    import whisper
    model = whisper.load_model("base")
    result = model.transcribe(file.file.name)    return {"transcription": result["text"]}
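
The same pattern extends to images. A minimal sketch that validates the upload with Pillow; a vision model (for example, a transformers image-to-text pipeline) could be plugged in where indicated:

python
import io

from PIL import Image

@router.post("/image")
async def describe_image(file: UploadFile = File(...)):
    contents = await file.read()
    image = Image.open(io.BytesIO(contents))
    # A captioning or OCR model could run here on `image`
    return {"filename": file.filename, "size": image.size, "mode": image.mode}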

9. 🔐 Optional Auth and History

Use fastapi-users or JWT tokens for user authentication and session tracking. Save chat history in SQLite or Redis for scalable, stateless deployments.
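
A minimal sketch of JWT protection with PyJWT (the secret, token URL, and payload shape are assumptions; fastapi-users gives you this plus user management out of the box):

python
import jwt  # PyJWT
from fastapi import Depends, HTTPException
from fastapi.security import OAuth2PasswordBearer

SECRET_KEY = "change-me"  # assumption: HS256 shared secret, load from env in practice
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="auth/token")

def get_current_user(token: str = Depends(oauth2_scheme)) -> str:
    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
        return payload["sub"]  # assumption: user id stored in the "sub" claim
    except jwt.PyJWTError:
        raise HTTPException(status_code=401, detail="Invalid or expired token")

# Any route can now require a valid token:
# @router.post("/ask")
# def ask_q(req: QueryRequest, user: str = Depends(get_current_user)): ...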

10. 🚀 Deployment

Run locally:

bash
uvicorn app.main:app --reload --port 8000

Create a Dockerfile:

dockerfile
FROM python:3.10
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

Deploy on:

  • HuggingFace Spaces

  • Fly.io

  • Render

  • Railway

  • AWS/GCP/Azure with API Gateway + Lambda

11. 🌐 Connect to Streamlit Frontend

In streamlit/app.py:

python
import requests
import streamlit as st

query = st.text_input("Ask your question")
if st.button("Submit"):
    res = requests.post("http://localhost:8000/query/ask", json={"query": query})
    st.write(res.json()["answer"])

Upload PDF:

python
file = st.file_uploader("Upload PDF")
if file:
    res = requests.post("http://localhost:8000/ingest/upload", files={"file": file})
    st.success(f"Uploaded {file.name}")

12. 💼 Use Cases

Use Case          Description
Chatbot API       Power any frontend with unified AI logic
Education         Students upload notes, query content
Healthcare        Doctors upload clinical docs + voice notes
Enterprise RAG    Vector search over manuals, emails
Multilingual AI   Extend API to support different locales

13. ✅ Final Thoughts

A FastAPI backend gives your AI application:

  • Scalability: Serve multiple frontends

  • Security: Add JWT, rate limiting

  • Maintainability: Separation of concerns

  • Extensibility: Add PDF, image, audio support

  • Speed: Async + optimized I/O

When paired with Streamlit, LangChain, and DeepSeek, you get a production-grade RAG architecture ready to scale.

🧩 Bonus: Sample GitHub Repo Structure

text
ai-rag-fastapi/
├── app/
│   ├── main.py
│   ├── routes/
│   ├── models/
│   └── utils/
├── streamlit_ui/
│   └── app.py
├── requirements.txt
├── Dockerfile
└── README.md
