📄 Local Contract Triage (Multi-Pass AI)
A privacy-first, fully local web application designed to automatically scan, extract, and audit legal contracts for potential red flags. Built with Python and Streamlit, this tool leverages open-source Large Language Models (LLMs) to ensure your sensitive legal documents never leave your machine.
🌟 Overview
Reviewing contracts is often tedious and error-prone, but uploading sensitive agreements to cloud-based AI tools poses a massive security risk. Local Contract Triage solves this by running powerful AI models entirely on your local hardware.
By utilizing a Multi-Pass AI Architecture, the application minimizes hallucinations and ensures that any identified “Red Flags” are backed by exact, verbatim quotes from the original document.
✨ Key Features
- Privacy First: 100% local processing. No data is sent to external APIs (OpenAI, Anthropic, etc.).
- Multi-Pass AI Architecture: Splits the AI’s workload into smaller, specialized tasks (Extraction vs. Auditing) for significantly higher accuracy.
- Verbatim Sourcing: The AI is constrained to provide exact quotes from the contract to prove its risk analysis.
- Instant PDF Parsing: Rapidly extracts raw text from complex PDF documents using PyMuPDF.
- Clean UI: An intuitive, responsive interface built with Streamlit.
🧠 The Multi-Pass Architecture
Instead of asking an LLM to read a massive contract and find issues in one go (which often leads to missed clauses or hallucinations), this application uses a two-step pipeline:
-
Pass 1: Pure Extraction (The Gatherer)
- Objective: Act as an objective data extractor.
- Action: Scans the text strictly for clauses related to automatic renewals, early termination, and limits of liability.
- Output: Returns only the verbatim text. No analysis.
-
Pass 2: Risk Evaluation (The Auditor)
- Objective: Act as a senior corporate paralegal.
- Action: Reviews the narrowed-down clauses from Pass 1 and evaluates them for predatory risks or hidden costs.
- Output: A formatted Red Flag report detailing the risk category, an analysis of why it’s harmful, and the verbatim quote as proof.
🛠️ Tech Stack
| Component | Technology |
|---|---|
| Language | Python |
| Frontend | Streamlit |
| PDF Parsing | PyMuPDF (fitz) |
| Local AI Engine | Ollama |
| LLM Model | qwen2.5-coder:14b (Customizable based on your local hardware) |
🚀 Getting Started
Prerequisites
- Python 3.8+ installed on your machine.
- Ollama installed and running in the background. Download Ollama here
Installation
-
Clone the repository:
git clone https://github.com/sarun1220/ai-contract-analyzer.git cd ai-contract-analyzer -
Install the required Python packages:
pip install streamlit pymupdf ollama -
Pull the Local LLM via Ollama: Note: This downloads the 14-billion parameter Qwen coder model. Ensure you have sufficient RAM/VRAM.
ollama pull qwen2.5-coder:14b
Running the App
Start the Streamlit server:
streamlit run app.py
Navigate to the localhost URL provided in your terminal, upload a PDF contract, and click Scan Contract to see the magic happen!
Built as a Proof of Concept (PoC) to demonstrate secure, locally hosted AI workflows in the legal tech space.