Deployment Guide

This guide describes how to deploy the application to a new server.

Prerequisites

- Linux server (Ubuntu 22.04+ recommended).
- NVIDIA GPU: required for the translation, embeddings, and NER services.
- NVIDIA Container Toolkit: must be installed so that Docker can access the GPU.
- Docker and Docker Compose: latest versions.
- Git: to clone the repository.
- External service: an instance of AllTalk running externally or on the host (port 7851 by default).
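
The list above can be spot-checked before going further. The following is a minimal sketch, not part of the project: it only reports which tools are on the PATH and assumes Compose is installed as the `docker compose` plugin, which is the form used by the commands later in this guide. The helper name `have` is made up for this example.

```sh
#!/bin/sh
# Report which prerequisite tools are on PATH. Nothing is installed or changed.

# have CMD: succeeds if the shell can resolve CMD (hypothetical helper).
have() {
    command -v "$1" >/dev/null 2>&1
}

for tool in git docker nvidia-smi; do
    if have "$tool"; then
        echo "ok:      $tool"
    else
        echo "missing: $tool"
    fi
done

# Compose v2 is a Docker CLI plugin, so it is probed through docker itself.
if have docker && docker compose version >/dev/null 2>&1; then
    echo "ok:      docker compose"
else
    echo "missing: docker compose"
fi
```

A common smoke test for the NVIDIA Container Toolkit is `docker run --rm --gpus all ubuntu nvidia-smi`; if it prints the GPU table, containers can reach the GPU.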

Clone the Repository

```
git clone <your-repo-url>
cd <your-repo-name>
```

Configure Environment Variables

Copy the example configuration file:
```
cp .env.example .env
```
Edit .env and set secure passwords and configuration:
```
nano .env
```
- Change POSTGRES_PASSWORD and DB_PASS to a strong unique password.
- Change SECRET_KEY to a long random string.
- Verify ALLTALK_URL points to your AllTalk instance (default assumes host machine access).
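
One possible way to produce the values above is to read random bytes from /dev/urandom and hex-encode them, which avoids characters that need quoting in .env. This is a sketch under assumptions: `gen_secret` is a hypothetical helper, the lengths are arbitrary, and POSTGRES_PASSWORD and DB_PASS are assumed to hold the same password.

```sh
#!/bin/sh
# gen_secret N: print N random bytes from /dev/urandom as 2*N hex characters.
gen_secret() {
    head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

db_password=$(gen_secret 24)

# Assumption: both variables carry the one password the app uses for Postgres.
echo "POSTGRES_PASSWORD=$db_password"
echo "DB_PASS=$db_password"
echo "SECRET_KEY=$(gen_secret 32)"
```

The script only prints the three lines; paste them into .env by hand so that existing settings such as ALLTALK_URL are left untouched.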

Start the Services

Run the following command to build and start the application:
```
docker compose up -d --build
```

Database Initialization

The database will automatically initialize on the first run using the scripts in init-db/. This may take a few minutes. Check logs with:
```
docker compose logs -f db
```
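
Instead of watching the logs by hand, a script can poll until the database answers. The sketch below defines a generic `retry` helper (a hypothetical name). The commented example assumes the `db` service runs an image that ships `pg_isready`, as the official postgres image does; this guide does not confirm which image is used.

```sh
#!/bin/sh
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, sleeping DELAY
# seconds between tries. Returns 0 on first success, 1 if every try fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        n=$((n + 1))
        # Sleep only when another attempt will follow.
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (not run here): wait up to about five minutes for Postgres.
# retry 60 5 docker compose exec -T db pg_isready && echo "database is ready"
```

Treat a successful probe as a hint only: it may be reported before the scripts in init-db/ have finished, so the logs remain the authoritative signal.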

Verify Deployment

Access the application at http://<your-server-ip>:8001.
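
The same check can be made from a terminal with curl. This is a minimal sketch: `app_url` and `check_app` are hypothetical helpers, and it assumes the application answers plain HTTP on the root path with a non-error status.

```sh
#!/bin/sh
# app_url HOST [PORT]: build the application URL; PORT defaults to 8001.
app_url() {
    echo "http://$1:${2:-8001}"
}

# check_app HOST [PORT]: succeed if the server replies without an HTTP error.
# -f makes curl exit non-zero on HTTP 400 and above; --max-time bounds the wait.
check_app() {
    curl -fsS --max-time 10 -o /dev/null "$(app_url "$@")"
}

# Example (not run here; 203.0.113.10 is a placeholder documentation address):
# check_app 203.0.113.10 && echo "application is up"
```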

- Models: The application mounts ./models and ./hf_cache to persist AI models. On the first run, it will attempt to download the necessary models (NLLB, BERT, etc.), which requires significant bandwidth and time.
- Data Persistence: Database data is stored in ./pgdata (mapped in the Docker Compose file). Ensure this directory is backed up.
- Security: Docker publishes ports 5432 (Postgres) and 6379 (Redis) on the host. Ensure they are firewall-protected and not reachable from the public internet unless that is intended.
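
To see how the Postgres and Redis ports are actually bound on the host, the listening sockets can be inspected. The sketch below assumes the `ss` utility from iproute2 is available, as it is on a default Ubuntu install; `is_public_bind` is a hypothetical helper. It reports only sockets bound to all interfaces and says nothing about firewall rules in front of them.

```sh
#!/bin/sh
# is_public_bind ADDR: succeed if a local listen address such as
# "0.0.0.0:5432" is bound to every interface rather than only to loopback.
is_public_bind() {
    case "$1" in
        0.0.0.0:*|\*:*|\[::\]:*|:::*) return 0 ;;
        *) return 1 ;;
    esac
}

# Column 4 of `ss -ltnH` is the local address:port of each listening socket.
if command -v ss >/dev/null 2>&1; then
    ss -ltnH | awk '{print $4}' | while read -r addr; do
        case "$addr" in
            *:5432|*:6379)
                if is_public_bind "$addr"; then
                    echo "listening on all interfaces: $addr"
                fi
                ;;
        esac
    done
fi
```

If either port is reported, one option is to publish it on loopback only in the Compose file (for example `127.0.0.1:5432:5432`). Host firewall front ends such as ufw may not filter ports published by Docker, so it is worth testing reachability from another machine as well.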