What We're Building
In this guide, you'll build an AI-powered contract analyzer that helps legal professionals and businesses understand complex legal documents quickly. The application accepts PDF uploads, extracts their text, sends it to Claude for analysis, and returns a structured report of clauses, risks, and recommendations.
Your contract analyzer will include:
- Drag-and-drop file upload - Intuitive interface for uploading PDF contracts
- PDF text extraction - Robust parsing that handles multi-page documents
- Intelligent chunking - Smart document splitting for large contracts that exceed token limits
- Clause extraction - Automatic identification of key clauses (indemnification, termination, liability, etc.)
- Risk identification - AI-powered detection of potentially problematic terms
- Summary reports - Clear, actionable summaries with highlighted concerns
- Contract storage - Persistent storage of uploaded documents and analysis results
Claude is a good fit for legal document analysis because of its strong reasoning, its sensitivity to context, and its ability to follow complex instructions. The analysis code in this guide targets Claude 3 Opus, which can take lengthy documents in a single request and return detailed analysis.
Prerequisites
Before starting this guide, make sure you have the following:
- Node.js 18+ installed on your machine
- Anthropic API key with access to Claude 3 Opus (get one at console.anthropic.com)
- Supabase account for database and file storage (free tier works)
- Basic knowledge of React, Next.js, and TypeScript
- Understanding of REST APIs and async/await patterns
Tech Stack Specification
Here's the technology stack we'll use for this build:
| Layer | Technology | Why This Choice |
|---|---|---|
| Frontend | Next.js 14, Tailwind | Document UI with rapid styling for legal interfaces |
| Backend | Next.js API Routes | Serverless document processing pipeline |
| Database | Supabase | Contract storage with full-text search capabilities |
| AI | Claude API | Superior document analysis and legal reasoning |
| PDF Processing | pdf-parse | Reliable PDF text extraction for document processing |
| Hosting | Vercel | Zero-config deployment with edge functions |
AI Agent Workflow
Here's how to leverage AI tools throughout this build to maximize productivity:
Project Scaffolding with Claude Code
Use Claude Code to generate the initial project structure, including the Next.js setup, Supabase configuration, and API route boilerplate. This saves hours of manual configuration.
# Example prompt for Claude Code
Create a Next.js 14 project with TypeScript for a contract analyzer:
- App Router with /upload and /analysis/[id] routes
- Supabase client configuration
- API routes for /api/upload and /api/analyze
- Environment variable setup for ANTHROPIC_API_KEY and SUPABASE_URL
- Tailwind CSS with a dark theme
- File upload component with drag-and-drop
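The environment variables named in the prompt above, plus SUPABASE_SERVICE_KEY (which the Phase 2 upload route reads), can be checked once at startup. The following is a minimal sketch, not part of the generated scaffold; the helper names findMissingEnv and assertEnv are hypothetical.

```typescript
// Variable names taken from this guide; adjust the list to match your own .env.local.
const REQUIRED_ENV: readonly string[] = [
  'ANTHROPIC_API_KEY',
  'SUPABASE_URL',
  'SUPABASE_SERVICE_KEY',
];

type EnvSource = Record<string, string | undefined>;

// Returns the names of required variables that are missing or blank.
export function findMissingEnv(
  env: EnvSource,
  required: readonly string[] = REQUIRED_ENV
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}

// Throws a single error listing every missing variable, so a misconfigured
// deployment fails at startup rather than on the first API call.
export function assertEnv(
  env: EnvSource,
  required: readonly string[] = REQUIRED_ENV
): void {
  const missing = findMissingEnv(env, required);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
}
```

In a Next.js project you would typically call assertEnv(process.env) from a server-only module.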
UI Generation with v0.dev
Generate the contract analysis dashboard UI with v0.dev. Request a professional legal-themed interface with document preview, risk indicators, and clause highlighting.
When prompting v0.dev, specify "legal tech" or "document analysis" aesthetics. Ask for risk severity indicators (red/yellow/green), expandable clause sections, and a professional color scheme appropriate for legal professionals.
Development with Cursor
Use Cursor's AI features for implementing the PDF parsing logic and Claude API integration. The inline code completion excels at TypeScript types for API responses and error handling patterns.
Step-by-Step Build Guide
Phase 1: PDF Upload and Text Extraction
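Start by creating a drag-and-drop upload interface. This component validates that the file is a PDF under 10MB, shows a spinner while the upload is in flight, and passes the file to an onUpload callback; text extraction itself happens on the server in Phase 2. Note that the progress state is a placeholder: nothing in this snippet updates it, so the label stays at 0% until you wire in real progress events.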
// components/ContractUploader.tsx
import { useCallback, useState } from 'react';
import { useDropzone } from 'react-dropzone';
interface ContractUploaderProps {
onUpload: (file: File) => Promise<void>;
}
export function ContractUploader({ onUpload }: ContractUploaderProps) {
const [uploading, setUploading] = useState(false);
const [progress, setProgress] = useState(0);
const onDrop = useCallback(async (acceptedFiles: File[]) => {
const file = acceptedFiles[0];
if (!file || file.type !== 'application/pdf') {
alert('Please upload a PDF file');
return;
}
setUploading(true);
try {
await onUpload(file);
} finally {
setUploading(false);
setProgress(0);
}
}, [onUpload]);
const { getRootProps, getInputProps, isDragActive } = useDropzone({
onDrop,
accept: { 'application/pdf': ['.pdf'] },
maxFiles: 1,
maxSize: 10 * 1024 * 1024, // 10MB limit
});
return (
<div
{...getRootProps()}
className={`
border-2 border-dashed rounded-xl p-12 text-center
transition-colors cursor-pointer
${isDragActive ? 'border-blue-500 bg-blue-500/10' : 'border-gray-600 hover:border-gray-500'}
`}
>
<input {...getInputProps()} />
{uploading ? (
<div className="space-y-4">
<div className="animate-spin w-8 h-8 border-2 border-blue-500 border-t-transparent rounded-full mx-auto" />
<p>Uploading... {progress}%</p>
</div>
) : (
<div className="space-y-4">
<div className="text-4xl">📄</div>
<p className="text-lg">
{isDragActive ? 'Drop your contract here' : 'Drag & drop a PDF contract, or click to browse'}
</p>
<p className="text-sm text-gray-500">Maximum file size: 10MB</p>
</div>
)}
</div>
);
}
Phase 2: Contract Parsing with Claude
Use pdf-parse to extract text from uploaded PDFs, then store the file and its extracted text in Supabase with a pending_analysis status. This phase establishes the core document processing pipeline that feeds Claude in the later phases.
// lib/pdf-parser.ts
import pdf from 'pdf-parse';
export interface ParsedContract {
text: string;
pageCount: number;
metadata: {
title?: string;
author?: string;
creationDate?: string;
};
}
export async function parseContractPDF(buffer: Buffer): Promise<ParsedContract> {
const data = await pdf(buffer);
return {
text: data.text,
pageCount: data.numpages,
metadata: {
title: data.info?.Title,
author: data.info?.Author,
creationDate: data.info?.CreationDate,
},
};
}
// API Route: /api/upload/route.ts
import { NextRequest, NextResponse } from 'next/server';
import { parseContractPDF } from '@/lib/pdf-parser';
import { createClient } from '@supabase/supabase-js';
export async function POST(request: NextRequest) {
const formData = await request.formData();
const file = formData.get('file') as File;
if (!file) {
return NextResponse.json({ error: 'No file provided' }, { status: 400 });
}
const buffer = Buffer.from(await file.arrayBuffer());
const parsed = await parseContractPDF(buffer);
// Store in Supabase
const supabase = createClient(
process.env.SUPABASE_URL!,
process.env.SUPABASE_SERVICE_KEY!
);
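// Upload PDF to storage and stop early if the upload fails
const fileName = `contracts/${Date.now()}-${file.name}`;
const { error: uploadError } = await supabase.storage
.from('contracts')
.upload(fileName, buffer, { contentType: 'application/pdf' });
if (uploadError) {
return NextResponse.json({ error: uploadError.message }, { status: 500 });
}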
// Save metadata to database
const { data, error } = await supabase
.from('contracts')
.insert({
filename: file.name,
storage_path: fileName,
page_count: parsed.pageCount,
text_content: parsed.text,
status: 'pending_analysis',
})
.select()
.single();
if (error) {
return NextResponse.json({ error: error.message }, { status: 500 });
}
return NextResponse.json({ contractId: data.id, pageCount: parsed.pageCount });
}
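The route above trusts whatever the client sends, so the PDF-only and 10MB checks from ContractUploader are worth repeating on the server. The sketch below shows one way to do that, plus a sanitized storage key; validateUpload, toStorageKey, and the chosen status codes are illustrative assumptions, not part of the original route.

```typescript
// Mirrors the client-side limits from ContractUploader (PDF only, 10MB).
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024;

// Structural subset of the browser File type, so the check is easy to unit test.
interface UploadLike {
  name: string;
  type: string;
  size: number;
}

type UploadCheck =
  | { ok: true }
  | { ok: false; status: number; error: string };

// Returns either ok or an HTTP status and message the route can send back.
export function validateUpload(file: UploadLike | null): UploadCheck {
  if (!file) {
    return { ok: false, status: 400, error: 'No file provided' };
  }
  if (file.type !== 'application/pdf') {
    return { ok: false, status: 415, error: 'Only PDF files are accepted' };
  }
  if (file.size === 0) {
    return { ok: false, status: 400, error: 'File is empty' };
  }
  if (file.size > MAX_UPLOAD_BYTES) {
    return { ok: false, status: 413, error: 'File exceeds the 10MB limit' };
  }
  return { ok: true };
}

// Replaces spaces, slashes, and other unusual characters so the original
// filename cannot introduce extra path segments into the storage key.
export function toStorageKey(originalName: string, now: number): string {
  const safeName = originalName.replace(/[^a-zA-Z0-9._-]/g, '_');
  return `contracts/${now}-${safeName}`;
}
```

In the route, a failed check would translate into NextResponse.json({ error }, { status }) before any parsing or storage work starts.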
Phase 3: Key Clause Identification
Large contracts can exceed Claude's context window. Implement chunking that breaks at paragraph or sentence boundaries and overlaps adjacent chunks slightly, so clauses are less likely to be cut mid-sentence before extraction.
// lib/chunker.ts
export interface DocumentChunk {
index: number;
content: string;
startPage?: number;
endPage?: number;
}
const MAX_CHUNK_SIZE = 100000; // ~25k tokens, safe for Claude
const OVERLAP_SIZE = 500; // Overlap to maintain context
export function chunkDocument(text: string): DocumentChunk[] {
// If document fits in one chunk, return as-is
if (text.length <= MAX_CHUNK_SIZE) {
return [{ index: 0, content: text }];
}
const chunks: DocumentChunk[] = [];
let currentPosition = 0;
let chunkIndex = 0;
while (currentPosition < text.length) {
let endPosition = currentPosition + MAX_CHUNK_SIZE;
// Try to break at paragraph or section boundary
if (endPosition < text.length) {
const breakPoints = [
text.lastIndexOf('\n\n', endPosition), // Paragraph break
text.lastIndexOf('.\n', endPosition), // Sentence + newline
text.lastIndexOf('. ', endPosition), // Sentence break
];
const bestBreak = breakPoints
.filter(bp => bp > currentPosition + MAX_CHUNK_SIZE * 0.5)
.sort((a, b) => b - a)[0];
if (bestBreak) {
endPosition = bestBreak + 1;
}
}
chunks.push({
index: chunkIndex,
content: text.slice(
Math.max(0, currentPosition - (chunkIndex > 0 ? OVERLAP_SIZE : 0)),
Math.min(text.length, endPosition)
),
});
currentPosition = endPosition;
chunkIndex++;
}
return chunks;
}
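The "~25k tokens" comment on MAX_CHUNK_SIZE follows from a rule of thumb of about 4 characters per token. The sketch below makes that arithmetic explicit; the ratio is an approximation for English prose rather than a real tokenizer, and the helper names are hypothetical.

```typescript
// Rough assumption for English prose; real token counts vary with the text.
const CHARS_PER_TOKEN = 4;

// Same value as MAX_CHUNK_SIZE in lib/chunker.ts.
const CHUNK_SIZE_CHARS = 100000;

// Estimates how many tokens a piece of text will use.
export function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Lower bound on the number of chunks for a document of the given length.
// The chunker may cut earlier than the maximum to land on a paragraph or
// sentence break, so the real count can be higher.
export function estimateMinChunkCount(
  textLength: number,
  chunkSize: number = CHUNK_SIZE_CHARS
): number {
  return Math.max(1, Math.ceil(textLength / chunkSize));
}
```

An estimate like this is useful for showing the user roughly how many API calls an analysis will need before starting it.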
Phase 4: Risk Assessment Scoring
The core of your analyzer is a carefully structured prompt that guides Claude through clause extraction, risk assessment, and severity scoring. Asking for a fixed JSON structure makes the results easier to parse and compare across contracts, and gives every risk an actionable rating.
// lib/contract-analyzer.ts
import Anthropic from '@anthropic-ai/sdk';
import { chunkDocument } from './chunker';
const anthropic = new Anthropic({
apiKey: process.env.ANTHROPIC_API_KEY,
});
export interface ContractAnalysis {
summary: string;
clauses: ExtractedClause[];
risks: IdentifiedRisk[];
recommendations: string[];
}
export interface ExtractedClause {
type: string;
title: string;
content: string;
location: string;
}
export interface IdentifiedRisk {
severity: 'high' | 'medium' | 'low';
title: string;
description: string;
clauseReference: string;
recommendation: string;
}
const LEGAL_ANALYSIS_PROMPT = `You are an expert legal analyst specializing in contract review.
Analyze the following contract text and provide a comprehensive analysis.
Your analysis must include:
1. **EXECUTIVE SUMMARY** (2-3 paragraphs)
- Contract type and parties involved
- Key terms and obligations
- Overall assessment
2. **KEY CLAUSES** - Extract and categorize important clauses:
- Indemnification clauses
- Limitation of liability
- Termination provisions
- Confidentiality/NDA terms
- Intellectual property rights
- Payment terms
- Governing law and jurisdiction
- Force majeure
- Assignment restrictions
- Non-compete/non-solicitation
3. **RISK ANALYSIS** - Identify potential issues with severity ratings:
- HIGH: Terms that could cause significant financial or legal exposure
- MEDIUM: Terms that are unfavorable but manageable
- LOW: Minor concerns or areas for negotiation
4. **RECOMMENDATIONS** - Actionable suggestions for:
- Terms to negotiate or modify
- Missing protections to add
- Clarifications needed
Respond in valid JSON format matching this structure:
{
"summary": "Executive summary text...",
"clauses": [
{
"type": "indemnification",
"title": "Clause title",
"content": "Relevant clause text",
"location": "Section X.X"
}
],
"risks": [
{
"severity": "high|medium|low",
"title": "Risk title",
"description": "Detailed description",
"clauseReference": "Section X.X",
"recommendation": "Suggested action"
}
],
"recommendations": ["Recommendation 1", "Recommendation 2"]
}
CONTRACT TEXT:
`;
export async function analyzeContract(contractText: string): Promise<ContractAnalysis> {
const chunks = chunkDocument(contractText);
if (chunks.length === 1) {
// Single chunk - analyze directly
return analyzeSingleChunk(chunks[0].content);
}
// Multiple chunks - analyze each, then synthesize
const chunkAnalyses = await Promise.all(
chunks.map(chunk => analyzeChunk(chunk.content, chunk.index, chunks.length))
);
return synthesizeAnalyses(chunkAnalyses);
}
async function analyzeSingleChunk(text: string): Promise<ContractAnalysis> {
const response = await anthropic.messages.create({
model: 'claude-3-opus-20240229',
max_tokens: 4096,
messages: [{
role: 'user',
content: LEGAL_ANALYSIS_PROMPT + text,
}],
});
const content = response.content[0];
if (content.type !== 'text') {
throw new Error('Unexpected response type');
}
// Parse JSON from response
const jsonMatch = content.text.match(/\{[\s\S]*\}/);
if (!jsonMatch) {
throw new Error('Failed to parse analysis response');
}
return JSON.parse(jsonMatch[0]);
}
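The analyzeChunk and synthesizeAnalyses helpers called in analyzeContract are not shown in this guide. As one possible shape for the synthesis step, the sketch below merges per-chunk results locally and drops duplicates that arise from the chunk overlap. The function name mergeChunkAnalyses and the deduplication keys are assumptions; a second Claude call that rewrites the combined summary would be an equally reasonable design.

```typescript
interface ExtractedClause {
  type: string;
  title: string;
  content: string;
  location: string;
}

interface IdentifiedRisk {
  severity: 'high' | 'medium' | 'low';
  title: string;
  description: string;
  clauseReference: string;
  recommendation: string;
}

interface ContractAnalysis {
  summary: string;
  clauses: ExtractedClause[];
  risks: IdentifiedRisk[];
  recommendations: string[];
}

// Keeps the first item seen for each key, preserving the original order.
function dedupeBy<T>(items: T[], keyOf: (item: T) => string): T[] {
  const seen = new Set<string>();
  return items.filter((item) => {
    const key = keyOf(item);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Merges per-chunk analyses into one result without calling the API again.
export function mergeChunkAnalyses(parts: ContractAnalysis[]): ContractAnalysis {
  const clauses = parts.reduce<ExtractedClause[]>((all, p) => all.concat(p.clauses), []);
  const risks = parts.reduce<IdentifiedRisk[]>((all, p) => all.concat(p.risks), []);
  const recommendations = parts.reduce<string[]>((all, p) => all.concat(p.recommendations), []);

  return {
    // Chunk summaries are concatenated in document order.
    summary: parts
      .map((p) => p.summary.trim())
      .filter((s) => s.length > 0)
      .join('\n\n'),
    clauses: dedupeBy(clauses, (c) => `${c.type}|${c.location}`.toLowerCase()),
    risks: dedupeBy(risks, (r) => `${r.title}|${r.clauseReference}`.toLowerCase()),
    recommendations: dedupeBy(recommendations, (r) => r.trim().toLowerCase()),
  };
}
```

Keying clauses on type plus location is a heuristic: two chunks that both see "Section 9" termination text are treated as the same clause, and the first version wins.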
Phase 5: Contract Comparison
Enable side-by-side comparison of multiple contracts to identify differences, missing clauses, and variations in terms. This feature is invaluable for contract negotiations and due diligence.
// lib/contract-comparison.ts
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({
apiKey: process.env.ANTHROPIC_API_KEY,
});
export interface ContractDifference {
clauseType: string;
contractA: string | null;
contractB: string | null;
significance: 'critical' | 'important' | 'minor';
analysis: string;
}
export interface ComparisonResult {
summary: string;
differences: ContractDifference[];
missingInA: string[];
missingInB: string[];
recommendation: string;
}
const COMPARISON_PROMPT = `You are a legal contract analyst. Compare these two contracts
and identify all meaningful differences in terms, conditions, and protections.
Focus on:
1. Terms that differ between contracts
2. Clauses present in one but missing in the other
3. Variations in liability limits, indemnification, termination rights
4. Payment terms and conditions differences
Rate each difference as: critical, important, or minor.
CONTRACT A:
{contractA}
CONTRACT B:
{contractB}
Respond in JSON format:
{
"summary": "Brief overview of key differences",
"differences": [...],
"missingInA": ["clauses in B but not A"],
"missingInB": ["clauses in A but not B"],
"recommendation": "Which contract is more favorable and why"
}`;
export async function compareContracts(
contractA: string,
contractB: string
): Promise<ComparisonResult> {
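// Replacer functions insert the text literally; a plain string replacement
// would interpret "$" sequences such as "$&" that can appear in contract text.
const prompt = COMPARISON_PROMPT
.replace('{contractA}', () => contractA)
.replace('{contractB}', () => contractB);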
const response = await anthropic.messages.create({
model: 'claude-sonnet-4-20250514',
max_tokens: 4096,
messages: [{ role: 'user', content: prompt }],
});
const content = response.content[0];
if (content.type !== 'text') {
throw new Error('Unexpected response type');
}
const jsonMatch = content.text.match(/\{[\s\S]*\}/);
if (!jsonMatch) {
throw new Error('Failed to parse comparison response');
}
return JSON.parse(jsonMatch[0]);
}
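The comparison prompt leaves the shape of each entry in "differences" unspecified ([...]), so the parsed JSON may not match ContractDifference even when it parses cleanly. A runtime check before returning the result is one way to catch that. The guard below is a sketch under that assumption; isComparisonResult is a hypothetical helper, not part of the original module.

```typescript
interface ContractDifference {
  clauseType: string;
  contractA: string | null;
  contractB: string | null;
  significance: 'critical' | 'important' | 'minor';
  analysis: string;
}

interface ComparisonResult {
  summary: string;
  differences: ContractDifference[];
  missingInA: string[];
  missingInB: string[];
  recommendation: string;
}

const SIGNIFICANCE_LEVELS = ['critical', 'important', 'minor'];

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === 'string');
}

function isNullableString(value: unknown): value is string | null {
  return value === null || typeof value === 'string';
}

function isDifference(value: unknown): value is ContractDifference {
  if (typeof value !== 'object' || value === null) return false;
  const d = value as Record<string, unknown>;
  return (
    typeof d.clauseType === 'string' &&
    isNullableString(d.contractA) &&
    isNullableString(d.contractB) &&
    typeof d.significance === 'string' &&
    SIGNIFICANCE_LEVELS.indexOf(d.significance) !== -1 &&
    typeof d.analysis === 'string'
  );
}

// True only when every field has the type the TypeScript interface promises.
export function isComparisonResult(value: unknown): value is ComparisonResult {
  if (typeof value !== 'object' || value === null) return false;
  const r = value as Record<string, unknown>;
  return (
    typeof r.summary === 'string' &&
    Array.isArray(r.differences) &&
    r.differences.every(isDifference) &&
    isStringArray(r.missingInA) &&
    isStringArray(r.missingInB) &&
    typeof r.recommendation === 'string'
  );
}
```

If the guard fails often in practice, spelling out the fields of a difference entry in the prompt's JSON example is likely the more durable fix.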
Phase 6: Summary Report Generation
Generate a professional, actionable summary report that presents the analysis in a clear, digestible format for stakeholders.
// components/AnalysisReport.tsx
import { ContractAnalysis, IdentifiedRisk } from '@/lib/contract-analyzer';
interface AnalysisReportProps {
analysis: ContractAnalysis;
contractName: string;
}
export function AnalysisReport({ analysis, contractName }: AnalysisReportProps) {
const riskCounts = {
high: analysis.risks.filter(r => r.severity === 'high').length,
medium: analysis.risks.filter(r => r.severity === 'medium').length,
low: analysis.risks.filter(r => r.severity === 'low').length,
};
return (
<div className="space-y-8">
{/* Header */}
<div className="border-b border-gray-700 pb-6">
<h1 className="text-2xl font-bold">Contract Analysis Report</h1>
<p className="text-gray-400 mt-2">{contractName}</p>
<p className="text-sm text-gray-500">Generated on {new Date().toLocaleDateString()}</p>
</div>
{/* Risk Overview */}
<div className="grid grid-cols-3 gap-4">
<RiskBadge count={riskCounts.high} severity="high" />
<RiskBadge count={riskCounts.medium} severity="medium" />
<RiskBadge count={riskCounts.low} severity="low" />
</div>
{/* Executive Summary */}
<section>
<h2 className="text-xl font-semibold mb-4">Executive Summary</h2>
<div className="bg-gray-800 rounded-lg p-6">
<p className="text-gray-300 whitespace-pre-wrap">{analysis.summary}</p>
</div>
</section>
{/* Identified Risks */}
<section>
<h2 className="text-xl font-semibold mb-4">Identified Risks</h2>
<div className="space-y-4">
{analysis.risks.map((risk, index) => (
<RiskCard key={index} risk={risk} />
))}
</div>
</section>
{/* Key Clauses */}
<section>
<h2 className="text-xl font-semibold mb-4">Key Clauses</h2>
<div className="space-y-4">
{analysis.clauses.map((clause, index) => (
<ClauseCard key={index} clause={clause} />
))}
</div>
</section>
{/* Recommendations */}
<section>
<h2 className="text-xl font-semibold mb-4">Recommendations</h2>
<ul className="space-y-2">
{analysis.recommendations.map((rec, index) => (
<li key={index} className="flex items-start gap-3">
<span className="text-blue-400">→</span>
<span className="text-gray-300">{rec}</span>
</li>
))}
</ul>
</section>
</div>
);
}
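// Typing severity as the union (not string) keeps colors[severity] type-safe under strict mode
function RiskBadge({ count, severity }: { count: number; severity: IdentifiedRisk['severity'] }) {
const colors: Record<IdentifiedRisk['severity'], string> = {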
high: 'bg-red-500/20 text-red-400 border-red-500/30',
medium: 'bg-yellow-500/20 text-yellow-400 border-yellow-500/30',
low: 'bg-green-500/20 text-green-400 border-green-500/30',
};
return (
<div className={`rounded-lg border p-4 text-center ${colors[severity]}`}>
<div className="text-3xl font-bold">{count}</div>
<div className="text-sm capitalize">{severity} Risk</div>
</div>
);
}
// ClauseCard is used in AnalysisReport but is not defined in the original guide.
// This minimal version is an assumption modeled on RiskCard.
function ClauseCard({ clause }: { clause: ContractAnalysis['clauses'][number] }) {
return (
<div className="bg-gray-800 rounded-lg p-4">
<div className="flex items-center justify-between mb-2">
<h3 className="font-semibold">{clause.title}</h3>
<span className="text-xs uppercase px-2 py-1 rounded bg-gray-700">
{clause.type}
</span>
</div>
<p className="text-gray-400 text-sm mb-3 whitespace-pre-wrap">{clause.content}</p>
<div className="text-xs text-gray-500">{clause.location}</div>
</div>
);
}
function RiskCard({ risk }: { risk: IdentifiedRisk }) {
const severityColors = {
high: 'border-l-red-500',
medium: 'border-l-yellow-500',
low: 'border-l-green-500',
};
return (
<div className={`bg-gray-800 rounded-lg p-4 border-l-4 ${severityColors[risk.severity]}`}>
<div className="flex items-center justify-between mb-2">
<h3 className="font-semibold">{risk.title}</h3>
<span className="text-xs uppercase px-2 py-1 rounded bg-gray-700">
{risk.severity}
</span>
</div>
<p className="text-gray-400 text-sm mb-3">{risk.description}</p>
<div className="text-xs text-gray-500 mb-2">{risk.clauseReference}</div>
<div className="bg-blue-500/10 text-blue-400 text-sm p-2 rounded">
💡 {risk.recommendation}
</div>
</div>
);
}
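The report renders risks in whatever order the analysis returned them. If you would rather show the most severe items first, a small sorting helper can run before the map over analysis.risks. This is an optional sketch; sortRisksBySeverity is a hypothetical name.

```typescript
type Severity = 'high' | 'medium' | 'low';

// Structural subset of IdentifiedRisk, so the helper works on any risk-like object.
interface RiskLike {
  severity: Severity;
  title: string;
}

// Lower rank sorts first.
const SEVERITY_RANK: Record<Severity, number> = { high: 0, medium: 1, low: 2 };

// Returns a new array ordered high, medium, low. Risks with the same severity
// keep their original relative order because Array.prototype.sort is stable
// in modern runtimes.
export function sortRisksBySeverity<T extends RiskLike>(risks: T[]): T[] {
  // Copy first so the analysis object passed in as a prop is not mutated.
  return [...risks].sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity]
  );
}
```

In AnalysisReport this would replace analysis.risks.map(...) with sortRisksBySeverity(analysis.risks).map(...).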
Common Issues & Solutions
Here are some common issues you might encounter and how to solve them:
Some PDFs (especially scanned documents) may not extract text properly. Consider adding OCR support with Tesseract.js for image-based PDFs, or validate that extracted text has sufficient content before analysis.
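A simple way to validate extracted text is to compare the amount of non-whitespace text against the page count. The sketch below flags documents that probably need OCR; the 200 characters per page threshold is an assumed starting point to tune against your own contracts, and looksLikeScannedPdf is a hypothetical helper.

```typescript
// Assumed threshold: a typical text page has far more than this, while a
// scanned page usually yields little or no extractable text.
const MIN_CHARS_PER_PAGE = 200;

// Returns true when the extracted text is too sparse for the page count,
// which suggests an image-based (scanned) PDF.
export function looksLikeScannedPdf(
  text: string,
  pageCount: number,
  minCharsPerPage: number = MIN_CHARS_PER_PAGE
): boolean {
  // Ignore whitespace so page breaks and blank lines do not count as content.
  const meaningfulChars = text.replace(/\s+/g, '').length;
  if (pageCount <= 0) {
    return meaningfulChars === 0;
  }
  return meaningfulChars / pageCount < minCharsPerPage;
}
```

With the ParsedContract from Phase 2, the call would be looksLikeScannedPdf(parsed.text, parsed.pageCount), and a true result could set a status other than pending_analysis.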
If you encounter "context length exceeded" errors, adjust the MAX_CHUNK_SIZE in the chunker. Start with smaller chunks (80,000 characters) and increase if the analysis seems fragmented.
Claude occasionally outputs malformed JSON. Implement retry logic with a more explicit prompt, or use a JSON repair library like jsonrepair to fix common issues.
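Retry logic needs a parse step that reports failure instead of throwing. The sketch below wraps the extract-and-parse step in a result type so the caller can decide whether to retry, repair, or give up; tryParseJsonObject is a hypothetical helper, and it does not itself repair anything.

```typescript
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string };

// Extracts the outermost {...} span from a model response and parses it.
// Returns a failure result rather than throwing, so callers can retry.
export function tryParseJsonObject(responseText: string): ParseResult {
  const start = responseText.indexOf('{');
  const end = responseText.lastIndexOf('}');
  if (start === -1 || end <= start) {
    return { ok: false, error: 'No JSON object found in response' };
  }
  try {
    return { ok: true, value: JSON.parse(responseText.slice(start, end + 1)) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

A caller might loop a small fixed number of times, re-sending the request with the parse error appended to the prompt whenever ok is false.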
Additional troubleshooting tips:
- Rate limiting: Implement exponential backoff for API calls to handle rate limits gracefully
- Timeout errors: Large documents may time out on Vercel's free tier. Consider upgrading or using background jobs
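- Memory issues: pdf-parse can be memory-intensive because the route hands it the whole file as a Buffer. Keep the upload size limit enforced on the server, and consider a streaming approach for very large files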
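For the rate limiting tip, exponential backoff can be sketched as a small generic wrapper. Everything here is illustrative: withRetry, backoffDelayMs, and the shouldRetry predicate are hypothetical names, and the base delay, cap, and attempt count are assumed defaults rather than recommended values.

```typescript
// Delay doubles with each attempt (500ms, 1s, 2s, ...) up to a fixed cap.
export function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 30000
): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs operation, retrying with jittered exponential backoff while
// shouldRetry(error) returns true and attempts remain.
export async function withRetry<T>(
  operation: () => Promise<T>,
  shouldRetry: (error: unknown) => boolean,
  maxAttempts: number = 5,
  sleep: (ms: number) => Promise<void> = defaultSleep
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      const isLastAttempt = attempt === maxAttempts - 1;
      if (isLastAttempt || !shouldRetry(error)) {
        throw error;
      }
      // Random jitter spreads out retries from concurrent requests.
      await sleep(Math.random() * backoffDelayMs(attempt));
    }
  }
  // Only reached when maxAttempts is zero or negative.
  throw lastError;
}
```

The shouldRetry predicate is where you would decide which failures count as rate limits, for example by inspecting an HTTP status of 429 on the error your client throws.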
Next Steps
Congratulations on building your AI contract analyzer! Here are some enhancements to consider:
- Add user authentication - Implement Supabase Auth for secure multi-user access
- Contract comparison UI - Surface the Phase 5 comparison results in a side-by-side view
- Custom clause templates - Let users define their own clause detection patterns
- Export functionality - Generate PDF or Word reports of the analysis
- Batch processing - Analyze multiple contracts simultaneously
- Historical analysis - Track changes in contract terms over time
- Integration with DocuSign - Connect to e-signature workflows
This tool is designed to assist with contract review, not replace professional legal advice. Always have important contracts reviewed by a qualified attorney before signing.