Ghaymah Docs RAG API

This project implements a Retrieval-Augmented Generation (RAG) API, built with FastAPI, that answers questions about the Ghaymah Cloud documentation. Retrieval runs in two stages: an initial vector search gathers a broad set of candidate documents, and a re-ranking step then reorders those candidates by relevance so that the most pertinent ones are used for the answer.

Key Features

  • FastAPI Backend: Serves the RAG pipeline over an HTTP API.
  • Two-Stage Retrieval:
    1. Initial Search: Uses sentence-transformers to perform a broad vector search and retrieve an initial set of candidate documents.
    2. Re-ranking: Employs a CrossEncoder model to re-rank the initial candidates for greater relevance and precision.
  • Dockerized: Comes with a Dockerfile for easy, repeatable deployment on any platform that supports containers.
  • Visualization: Includes a rerank_test.html page to visually compare the results before and after the re-ranking step.
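The two-stage retrieval described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function name two_stage_retrieve, the parameters top_k and top_n, and the toy scoring functions are all assumptions made for the example. In the real pipeline a sentence-transformers embedding model would take the place of toy_embed and a CrossEncoder model would take the place of toy_cross_score.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def two_stage_retrieve(
    query: str,
    docs: Sequence[str],
    embed: Callable[[str], Vector],
    cross_score: Callable[[str, str], float],
    top_k: int = 3,
    top_n: int = 2,
) -> List[str]:
    """Hypothetical helper: vector search for top_k candidates, then re-rank to top_n."""
    # Stage 1: broad vector search over all documents.
    q_vec = embed(query)
    candidates = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)[:top_k]
    # Stage 2: score each (query, candidate) pair jointly and keep the best top_n.
    return sorted(candidates, key=lambda d: cross_score(query, d), reverse=True)[:top_n]


# Toy stand-ins (assumptions) so the sketch runs without downloading any model.
VOCAB = ["deploy", "docker", "git", "billing", "container"]


def toy_embed(text: str) -> Vector:
    """Bag-of-words counts over a tiny fixed vocabulary."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def toy_cross_score(query: str, doc: str) -> float:
    """Fraction of distinct query words that also appear in the document."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0


DOCS = [
    "deploy a docker container",
    "git push triggers deploy",
    "billing and invoices",
    "docker container logs",
]
QUERY = "deploy with git"

results = two_stage_retrieve(QUERY, DOCS, toy_embed, toy_cross_score)
print(results)
```

The first stage is cheap because each document is embedded independently of the query, so it can scan the whole collection. The second stage scores each query and document pair together, which is typically slower and is therefore applied only to the short candidate list.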

Getting Started

Prerequisites

  • Docker
  • A Git client

Deployment

This application is designed to be deployed as a Docker container. It can be deployed via a Git-based workflow on a platform like Ghaymah Cloud.

  1. Push to Git: Push the code to a GitHub or GitLab repository.
  2. Connect Platform: Connect your cloud platform to the Git repository.
  3. Build and Deploy: The platform will use the included Dockerfile to automatically build and deploy the application.

Configuration

The application requires the following environment variables, which must be set in the deployment environment:

  • GITPASHA_HOST: The URL for the remote vector store (GitPasha).
  • OPENAI_API_KEY: Your API key for the LLM provider (e.g., OpenAI).
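As a rough sketch of how these two variables might be validated at startup, the snippet below fails fast when either one is missing. The helper name load_settings and the REQUIRED_VARS tuple are hypothetical and not taken from the project's code; only the variable names GITPASHA_HOST and OPENAI_API_KEY come from this README.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names taken from the Configuration section above.
REQUIRED_VARS = ("GITPASHA_HOST", "OPENAI_API_KEY")


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Hypothetical helper: return the required settings or raise if any is unset or empty."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: source[name] for name in REQUIRED_VARS}
```

Calling load_settings() once when the application starts would surface a missing variable at deploy time rather than on the first request. Passing a plain dictionary instead of relying on os.environ makes the check easy to exercise in tests.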