This project implements a Retrieval-Augmented Generation (RAG) API, built with FastAPI, that answers questions about Ghaymah Cloud documentation. Retrieval runs in two stages: an initial vector search gathers a set of candidate documents, and a re-ranking step then reorders those candidates by relevance so that the most relevant passages inform the answer.

Key Features

FastAPI Backend: Serves the RAG pipeline over an HTTP API.
Two-Stage Retrieval:
1. Initial Search: Uses sentence-transformers to perform a broad vector search and retrieve an initial set of candidate documents.
2. Re-ranking: Employs a CrossEncoder model to re-rank the initial candidates for greater relevance and precision.
Dockerized: Comes with a Dockerfile for easy, repeatable deployment on any platform that supports containers.
Visualization: Includes a rerank_test.html page to visually compare the results before and after the re-ranking step.
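
The two-stage flow described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: toy_embed and toy_pair_score are hypothetical stand-ins for a sentence-transformers encoder and a CrossEncoder, and the top_k and top_n parameters are assumed names chosen for the example.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a sentence-transformers encoder."""
    return Counter(text.lower().split())


def toy_pair_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a CrossEncoder: share of query terms in doc."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def two_stage_retrieve(query, docs, embed=toy_embed,
                       pair_score=toy_pair_score, top_k=3, top_n=2):
    """Stage 1: broad vector search. Stage 2: re-rank the candidates."""
    q_vec = embed(query)
    # Stage 1: keep the top_k documents by embedding similarity.
    candidates = sorted(
        docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True
    )[:top_k]
    # Stage 2: score each (query, candidate) pair jointly and reorder.
    reranked = sorted(
        candidates, key=lambda d: pair_score(query, d), reverse=True
    )
    return reranked[:top_n]


if __name__ == "__main__":
    sample_docs = [
        "deploy a docker container on the platform",
        "docker docker docker docker",
        "billing and invoices overview",
        "how to deploy a container",
    ]
    for doc in two_stage_retrieve("deploy docker container", sample_docs):
        print(doc)
```

In this toy data the keyword-stuffed second document ranks well in the vector stage but drops after re-ranking, which is the kind of difference the rerank_test.html page is meant to show. In a real pipeline the two stand-in scorers would be replaced by model calls.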

Getting Started

Prerequisites

Docker
A Git client

Deployment

This application is designed to be deployed as a Docker container. It can be deployed via a Git-based workflow on a platform like Ghaymah Cloud.

1. Push to Git: Push the code to a GitHub or GitLab repository.
2. Connect Platform: Connect your cloud platform to the Git repository.
3. Build and Deploy: The platform will use the included Dockerfile to automatically build and deploy the application.

Configuration

The application requires the following environment variables to be set in the deployment environment:

GITPASHA_HOST: The URL for the remote vector store (GitPasha).
OPENAI_API_KEY: Your API key for the LLM provider (e.g., OpenAI).
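
One way to fail fast when a variable is missing is to validate both at startup. The sketch below is an assumption about how that could look, not the project's actual loader; load_settings and REQUIRED_VARS are hypothetical names introduced for illustration.

```python
import os
from typing import Mapping

# The two variables named in the Configuration section.
REQUIRED_VARS = ("GITPASHA_HOST", "OPENAI_API_KEY")


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return the required settings, or raise if any is unset or blank."""
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling load_settings() once when the application starts surfaces a misconfigured deployment immediately, rather than on the first request that reaches the vector store or the LLM provider.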