Build an AI Contract Analyzer

What We're Building

In this guide, you'll build an AI-powered contract analyzer that helps legal professionals and businesses get through complex legal documents faster. The application accepts PDF uploads, extracts the text, sends it to Claude for analysis, and returns a summary, the key clauses, and a list of flagged risks.

Your contract analyzer will include:

Drag-and-drop file upload - Intuitive interface for uploading PDF contracts
PDF text extraction - Robust parsing that handles multi-page documents
Intelligent chunking - Smart document splitting for large contracts that exceed token limits
Clause extraction - Automatic identification of key clauses (indemnification, termination, liability, etc.)
Risk identification - AI-powered detection of potentially problematic terms
Summary reports - Clear, actionable summaries with highlighted concerns
Contract storage - Persistent storage of uploaded documents and analysis results

Why Claude for Legal Analysis?

Claude is a good fit for legal document analysis because it reasons well over long text, picks up on context, and follows detailed instructions. The analysis code in this guide calls Claude 3 Opus, while the comparison code in Phase 5 calls Claude Sonnet 4. Either works, as long as your API key has access to the model you choose and you use the same model ID consistently.

Prerequisites

Before starting this guide, make sure you have the following:

Node.js 18+ installed on your machine
Anthropic API key with access to Claude 3 Opus (get one at console.anthropic.com)
Supabase account for database and file storage (free tier works)
Basic knowledge of React, Next.js, and TypeScript
Understanding of REST APIs and async/await patterns

Tech Stack Specification

Here's the technology stack we'll use for this build:

Layer	Technology	Why This Choice
Frontend	Next.js 14, Tailwind	Document UI with rapid styling for legal interfaces
Backend	Next.js API Routes	Serverless document processing pipeline
Database	Supabase	Contract storage with full-text search capabilities
AI	Claude API	Superior document analysis and legal reasoning
PDF	pdf-parse	Reliable PDF text extraction for document processing
Hosting	Vercel	Zero-config deployment with edge functions

AI Agent Workflow

Here's how to leverage AI tools throughout this build to maximize productivity:

Project Scaffolding with Claude Code

Use Claude Code to generate the initial project structure, including the Next.js setup, Supabase configuration, and API route boilerplate. This saves hours of manual configuration.

# Example prompt for Claude Code
Create a Next.js 14 project with TypeScript for a contract analyzer:
- App Router with /upload and /analysis/[id] routes
- Supabase client configuration
- API routes for /api/upload and /api/analyze
- Environment variable setup for ANTHROPIC_API_KEY, SUPABASE_URL, and SUPABASE_SERVICE_KEY
- Tailwind CSS with a dark theme
- File upload component with drag-and-drop

UI Generation with v0.dev

Generate the contract analysis dashboard UI with v0.dev. Request a professional legal-themed interface with document preview, risk indicators, and clause highlighting.

Pro Tip

When prompting v0.dev, specify "legal tech" or "document analysis" aesthetics. Ask for risk severity indicators (red/yellow/green), expandable clause sections, and a professional color scheme appropriate for legal professionals.

Development with Cursor

Use Cursor's AI features for implementing the PDF parsing logic and Claude API integration. The inline code completion excels at TypeScript types for API responses and error handling patterns.

Step-by-Step Build Guide

Phase 1: PDF Upload and Text Extraction

Start by creating a drag-and-drop upload component. It checks that the file is a PDF, enforces a 10MB size limit, shows a spinner while the upload is in flight, and hands the file to an onUpload callback. Text extraction itself happens on the server in Phase 2.

// components/ContractUploader.tsx
'use client'; // Needed in the App Router: this component uses hooks and browser events

import { useCallback, useState } from 'react';
import { useDropzone } from 'react-dropzone';

interface ContractUploaderProps {
  onUpload: (file: File) => Promise<void>;
}

export function ContractUploader({ onUpload }: ContractUploaderProps) {
  const [uploading, setUploading] = useState(false);

  const onDrop = useCallback(async (acceptedFiles: File[]) => {
    const file = acceptedFiles[0];
    if (!file || file.type !== 'application/pdf') {
      alert('Please upload a PDF file');
      return;
    }

    setUploading(true);
    try {
      await onUpload(file);
    } finally {
      setUploading(false);
    }
  }, [onUpload]);

  const { getRootProps, getInputProps, isDragActive } = useDropzone({
    onDrop,
    accept: { 'application/pdf': ['.pdf'] },
    maxFiles: 1,
    maxSize: 10 * 1024 * 1024, // 10MB limit
  });

  return (
    <div
      {...getRootProps()}
      className={`
        border-2 border-dashed rounded-xl p-12 text-center
        transition-colors cursor-pointer
        ${isDragActive ? 'border-blue-500 bg-blue-500/10' : 'border-gray-600 hover:border-gray-500'}
      `}
    >
      <input {...getInputProps()} />
      {uploading ? (
        <div className="space-y-4">
          <div className="animate-spin w-8 h-8 border-2 border-blue-500 border-t-transparent rounded-full mx-auto" />
          {/* onUpload reports no progress, so show an indeterminate spinner instead of a percentage */}
          <p>Uploading...</p>
        </div>
      ) : (
        <div className="space-y-4">
          <div className="text-4xl">📄</div>
          <p className="text-lg">
            {isDragActive ? 'Drop your contract here' : 'Drag & drop a PDF contract, or click to browse'}
          </p>
          <p className="text-sm text-gray-500">Maximum file size: 10MB</p>
        </div>
      )}
    </div>
  );
}

Phase 2: Contract Parsing with Claude

Use pdf-parse to extract text from uploaded PDFs, then store the file and its text in Supabase so that later phases can send the content to Claude. This phase establishes the core document processing pipeline.

// lib/pdf-parser.ts
import pdf from 'pdf-parse';

export interface ParsedContract {
  text: string;
  pageCount: number;
  metadata: {
    title?: string;
    author?: string;
    creationDate?: string;
  };
}

export async function parseContractPDF(buffer: Buffer): Promise<ParsedContract> {
  const data = await pdf(buffer);

  return {
    text: data.text,
    pageCount: data.numpages,
    metadata: {
      title: data.info?.Title,
      author: data.info?.Author,
      creationDate: data.info?.CreationDate,
    },
  };
}

// app/api/upload/route.ts (App Router route handler for POST /api/upload)
import { NextRequest, NextResponse } from 'next/server';
import { parseContractPDF } from '@/lib/pdf-parser';
import { createClient } from '@supabase/supabase-js';

export async function POST(request: NextRequest) {
  const formData = await request.formData();
  const file = formData.get('file') as File;

  if (!file) {
    return NextResponse.json({ error: 'No file provided' }, { status: 400 });
  }

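  const buffer = Buffer.from(await file.arrayBuffer());

  // pdf-parse throws on corrupt or non-PDF input, so report that as a client error
  let parsed: Awaited<ReturnType<typeof parseContractPDF>>;
  try {
    parsed = await parseContractPDF(buffer);
  } catch {
    return NextResponse.json({ error: 'Could not read PDF' }, { status: 400 });
  }

  // Store in Supabase
  const supabase = createClient(
    process.env.SUPABASE_URL!,
    process.env.SUPABASE_SERVICE_KEY!
  );

  // Upload PDF to storage, and stop if the upload fails
  const fileName = `contracts/${Date.now()}-${file.name}`;
  const { error: uploadError } = await supabase.storage
    .from('contracts')
    .upload(fileName, buffer, { contentType: 'application/pdf' });

  if (uploadError) {
    return NextResponse.json({ error: uploadError.message }, { status: 500 });
  }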

  // Save metadata to database
  const { data, error } = await supabase
    .from('contracts')
    .insert({
      filename: file.name,
      storage_path: fileName,
      page_count: parsed.pageCount,
      text_content: parsed.text,
      status: 'pending_analysis',
    })
    .select()
    .single();

  if (error) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }

  return NextResponse.json({ contractId: data.id, pageCount: parsed.pageCount });
}

Phase 3: Key Clause Identification

Large contracts can exceed Claude's context window. The chunker below splits long text at paragraph or sentence boundaries and overlaps neighboring chunks slightly, so a clause is less likely to be cut in half before Claude extracts it.

// lib/chunker.ts
export interface DocumentChunk {
  index: number;
  content: string;
  startPage?: number;
  endPage?: number;
}

const MAX_CHUNK_SIZE = 100000; // ~25k tokens, safe for Claude
const OVERLAP_SIZE = 500; // Overlap to maintain context

export function chunkDocument(text: string): DocumentChunk[] {
  // If document fits in one chunk, return as-is
  if (text.length <= MAX_CHUNK_SIZE) {
    return [{ index: 0, content: text }];
  }

  const chunks: DocumentChunk[] = [];
  let currentPosition = 0;
  let chunkIndex = 0;

  while (currentPosition < text.length) {
    let endPosition = currentPosition + MAX_CHUNK_SIZE;

    // Try to break at paragraph or section boundary
    if (endPosition < text.length) {
      const breakPoints = [
        text.lastIndexOf('\n\n', endPosition),  // Paragraph break
        text.lastIndexOf('.\n', endPosition),   // Sentence + newline
        text.lastIndexOf('. ', endPosition),    // Sentence break
      ];

      const bestBreak = breakPoints
        .filter(bp => bp > currentPosition + MAX_CHUNK_SIZE * 0.5)
        .sort((a, b) => b - a)[0];

      if (bestBreak) {
        endPosition = bestBreak + 1;
      }
    }

    chunks.push({
      index: chunkIndex,
      content: text.slice(
        Math.max(0, currentPosition - (chunkIndex > 0 ? OVERLAP_SIZE : 0)),
        Math.min(text.length, endPosition)
      ),
    });

    currentPosition = endPosition;
    chunkIndex++;
  }

  return chunks;
}

Phase 4: Risk Assessment Scoring

The core of your analyzer is a structured prompt that asks Claude for an executive summary, the key clauses, risks rated high, medium, or low, and recommendations, all in a fixed JSON shape. A fixed shape makes results easier to parse and to compare across contracts, although the output should still be validated before use.

// lib/contract-analyzer.ts
import Anthropic from '@anthropic-ai/sdk';
import { chunkDocument } from './chunker';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

export interface ContractAnalysis {
  summary: string;
  clauses: ExtractedClause[];
  risks: IdentifiedRisk[];
  recommendations: string[];
}

export interface ExtractedClause {
  type: string;
  title: string;
  content: string;
  location: string;
}

export interface IdentifiedRisk {
  severity: 'high' | 'medium' | 'low';
  title: string;
  description: string;
  clauseReference: string;
  recommendation: string;
}

const LEGAL_ANALYSIS_PROMPT = `You are an expert legal analyst specializing in contract review.
Analyze the following contract text and provide a comprehensive analysis.

Your analysis must include:

1. **EXECUTIVE SUMMARY** (2-3 paragraphs)
   - Contract type and parties involved
   - Key terms and obligations
   - Overall assessment

2. **KEY CLAUSES** - Extract and categorize important clauses:
   - Indemnification clauses
   - Limitation of liability
   - Termination provisions
   - Confidentiality/NDA terms
   - Intellectual property rights
   - Payment terms
   - Governing law and jurisdiction
   - Force majeure
   - Assignment restrictions
   - Non-compete/non-solicitation

3. **RISK ANALYSIS** - Identify potential issues with severity ratings:
   - HIGH: Terms that could cause significant financial or legal exposure
   - MEDIUM: Terms that are unfavorable but manageable
   - LOW: Minor concerns or areas for negotiation

4. **RECOMMENDATIONS** - Actionable suggestions for:
   - Terms to negotiate or modify
   - Missing protections to add
   - Clarifications needed

Respond in valid JSON format matching this structure:
{
  "summary": "Executive summary text...",
  "clauses": [
    {
      "type": "indemnification",
      "title": "Clause title",
      "content": "Relevant clause text",
      "location": "Section X.X"
    }
  ],
  "risks": [
    {
      "severity": "high|medium|low",
      "title": "Risk title",
      "description": "Detailed description",
      "clauseReference": "Section X.X",
      "recommendation": "Suggested action"
    }
  ],
  "recommendations": ["Recommendation 1", "Recommendation 2"]
}

CONTRACT TEXT:
`;

export async function analyzeContract(contractText: string): Promise<ContractAnalysis> {
  const chunks = chunkDocument(contractText);

  if (chunks.length === 1) {
    // Single chunk - analyze directly
    return analyzeSingleChunk(chunks[0].content);
  }

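  // Multiple chunks - analyze each, then synthesize
  // Note: analyzeChunk and synthesizeAnalyses are not defined in this guide; you need to supply them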
  const chunkAnalyses = await Promise.all(
    chunks.map(chunk => analyzeChunk(chunk.content, chunk.index, chunks.length))
  );

  return synthesizeAnalyses(chunkAnalyses);
}

async function analyzeSingleChunk(text: string): Promise<ContractAnalysis> {
  const response = await anthropic.messages.create({
    model: 'claude-3-opus-20240229',
    max_tokens: 4096,
    messages: [{
      role: 'user',
      content: LEGAL_ANALYSIS_PROMPT + text,
    }],
  });

  const content = response.content[0];
  if (content.type !== 'text') {
    throw new Error('Unexpected response type');
  }

  // Parse JSON from response
  const jsonMatch = content.text.match(/\{[\s\S]*\}/);
  if (!jsonMatch) {
    throw new Error('Failed to parse analysis response');
  }

  return JSON.parse(jsonMatch[0]);
}
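
analyzeContract above calls analyzeChunk and synthesizeAnalyses, neither of which is shown. One plausible shape for the synthesis step, offered only as a sketch, is to merge the per-chunk results locally: drop duplicates introduced by the chunk overlap and order risks by severity. The name mergeChunkAnalyses, the dedupeBy helper, and the dedupe keys are all assumptions, and the types are repeated from lib/contract-analyzer.ts so the block stands alone.

```typescript
// Types repeated from lib/contract-analyzer.ts so this sketch is self-contained
interface ExtractedClause { type: string; title: string; content: string; location: string }
interface IdentifiedRisk {
  severity: 'high' | 'medium' | 'low';
  title: string;
  description: string;
  clauseReference: string;
  recommendation: string;
}
interface ContractAnalysis {
  summary: string;
  clauses: ExtractedClause[];
  risks: IdentifiedRisk[];
  recommendations: string[];
}

const SEVERITY_ORDER = { high: 0, medium: 1, low: 2 } as const;

// Hypothetical helper: keep the first item seen for each case-insensitive key
function dedupeBy<T>(items: T[], key: (item: T) => string): T[] {
  const seen = new Set<string>();
  return items.filter(item => {
    const k = key(item).trim().toLowerCase();
    if (seen.has(k)) return false;
    seen.add(k);
    return true;
  });
}

// Hypothetical local stand-in for synthesizeAnalyses; it makes no extra API call
function mergeChunkAnalyses(parts: ContractAnalysis[]): ContractAnalysis {
  const risks = dedupeBy(parts.flatMap(p => p.risks), r => `${r.clauseReference}|${r.title}`);
  return {
    summary: parts.map(p => p.summary).join('\n\n'),
    clauses: dedupeBy(parts.flatMap(p => p.clauses), c => `${c.type}|${c.location}`),
    // Array.prototype.sort is stable, so risks of equal severity keep document order
    risks: [...risks].sort((a, b) => SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity]),
    recommendations: dedupeBy(parts.flatMap(p => p.recommendations), r => r),
  };
}
```

A local merge is cheap and deterministic, but the combined summary is only a concatenation. If you want a single coherent summary, a follow-up Claude call over the merged result is an alternative worth considering.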

Phase 5: Contract Comparison

Enable side-by-side comparison of two contracts to identify differing terms, clauses present in one but missing from the other, and variations in liability, indemnification, and payment terms. This is useful in contract negotiations and due diligence.

// lib/contract-comparison.ts
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

export interface ContractDifference {
  clauseType: string;
  contractA: string | null;
  contractB: string | null;
  significance: 'critical' | 'important' | 'minor';
  analysis: string;
}

export interface ComparisonResult {
  summary: string;
  differences: ContractDifference[];
  missingInA: string[];
  missingInB: string[];
  recommendation: string;
}

const COMPARISON_PROMPT = `You are a legal contract analyst. Compare these two contracts
and identify all meaningful differences in terms, conditions, and protections.

Focus on:
1. Terms that differ between contracts
2. Clauses present in one but missing in the other
3. Variations in liability limits, indemnification, termination rights
4. Payment terms and conditions differences

Rate each difference as: critical, important, or minor.

CONTRACT A:
{contractA}

CONTRACT B:
{contractB}

Respond in JSON format:
{
  "summary": "Brief overview of key differences",
  "differences": [...],
  "missingInA": ["clauses in B but not A"],
  "missingInB": ["clauses in A but not B"],
  "recommendation": "Which contract is more favorable and why"
}`;

export async function compareContracts(
  contractA: string,
  contractB: string
): Promise<ComparisonResult> {
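  // Use replacer functions so "$" sequences in contract text (such as "$&" or "$$")
  // are inserted literally instead of being read as replacement patterns
  const prompt = COMPARISON_PROMPT
    .replace('{contractA}', () => contractA)
    .replace('{contractB}', () => contractB);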

  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 4096,
    messages: [{ role: 'user', content: prompt }],
  });

  const content = response.content[0];
  if (content.type !== 'text') {
    throw new Error('Unexpected response type');
  }

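  // Fail with a clear error when the reply contains no JSON object
  const jsonMatch = content.text.match(/\{[\s\S]*\}/);
  if (!jsonMatch) {
    throw new Error('Failed to parse comparison response');
  }

  return JSON.parse(jsonMatch[0]);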
}

Phase 6: Summary Report Generation

Generate a professional, actionable summary report that presents the analysis in a clear, digestible format for stakeholders.

// components/AnalysisReport.tsx
import { ContractAnalysis, IdentifiedRisk } from '@/lib/contract-analyzer';

interface AnalysisReportProps {
  analysis: ContractAnalysis;
  contractName: string;
}

export function AnalysisReport({ analysis, contractName }: AnalysisReportProps) {
  const riskCounts = {
    high: analysis.risks.filter(r => r.severity === 'high').length,
    medium: analysis.risks.filter(r => r.severity === 'medium').length,
    low: analysis.risks.filter(r => r.severity === 'low').length,
  };

  return (
    <div className="space-y-8">
      {/* Header */}
      <div className="border-b border-gray-700 pb-6">
        <h1 className="text-2xl font-bold">Contract Analysis Report</h1>
        <p className="text-gray-400 mt-2">{contractName}</p>
        <p className="text-sm text-gray-500">Generated on {new Date().toLocaleDateString()}</p>
      </div>

      {/* Risk Overview */}
      <div className="grid grid-cols-3 gap-4">
        <RiskBadge count={riskCounts.high} severity="high" />
        <RiskBadge count={riskCounts.medium} severity="medium" />
        <RiskBadge count={riskCounts.low} severity="low" />
      </div>

      {/* Executive Summary */}
      <section>
        <h2 className="text-xl font-semibold mb-4">Executive Summary</h2>
        <div className="bg-gray-800 rounded-lg p-6">
          <p className="text-gray-300 whitespace-pre-wrap">{analysis.summary}</p>
        </div>
      </section>

      {/* Identified Risks */}
      <section>
        <h2 className="text-xl font-semibold mb-4">Identified Risks</h2>
        <div className="space-y-4">
          {analysis.risks.map((risk, index) => (
            <RiskCard key={index} risk={risk} />
          ))}
        </div>
      </section>

      {/* Key Clauses */}
      <section>
        <h2 className="text-xl font-semibold mb-4">Key Clauses</h2>
        <div className="space-y-4">
          {analysis.clauses.map((clause, index) => (
            <ClauseCard key={index} clause={clause} />
          ))}
        </div>
      </section>

      {/* Recommendations */}
      <section>
        <h2 className="text-xl font-semibold mb-4">Recommendations</h2>
        <ul className="space-y-2">
          {analysis.recommendations.map((rec, index) => (
            <li key={index} className="flex items-start gap-3">
              <span className="text-blue-400">→</span>
              <span className="text-gray-300">{rec}</span>
            </li>
          ))}
        </ul>
      </section>
    </div>
  );
}

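// Typing severity as the union (not string) lets colors[severity] type-check in strict mode
function RiskBadge({ count, severity }: { count: number; severity: IdentifiedRisk['severity'] }) {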
  const colors = {
    high: 'bg-red-500/20 text-red-400 border-red-500/30',
    medium: 'bg-yellow-500/20 text-yellow-400 border-yellow-500/30',
    low: 'bg-green-500/20 text-green-400 border-green-500/30',
  };

  return (
    <div className={`rounded-lg border p-4 text-center ${colors[severity]}`}>
      <div className="text-3xl font-bold">{count}</div>
      <div className="text-sm capitalize">{severity} Risk</div>
    </div>
  );
}

function RiskCard({ risk }: { risk: IdentifiedRisk }) {
  const severityColors = {
    high: 'border-l-red-500',
    medium: 'border-l-yellow-500',
    low: 'border-l-green-500',
  };

  return (
    <div className={`bg-gray-800 rounded-lg p-4 border-l-4 ${severityColors[risk.severity]}`}>
      <div className="flex items-center justify-between mb-2">
        <h3 className="font-semibold">{risk.title}</h3>
        <span className="text-xs uppercase px-2 py-1 rounded bg-gray-700">
          {risk.severity}
        </span>
      </div>
      <p className="text-gray-400 text-sm mb-3">{risk.description}</p>
      <div className="text-xs text-gray-500 mb-2">{risk.clauseReference}</div>
      <div className="bg-blue-500/10 text-blue-400 text-sm p-2 rounded">
        💡 {risk.recommendation}
      </div>
    </div>
  );
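}

// Minimal ClauseCard sketch (an assumption: AnalysisReport renders it, but its
// layout is not specified), styled to mirror RiskCard
function ClauseCard({ clause }: { clause: ContractAnalysis['clauses'][number] }) {
  return (
    <div className="bg-gray-800 rounded-lg p-4">
      <div className="flex items-center justify-between mb-2">
        <h3 className="font-semibold">{clause.title}</h3>
        <span className="text-xs uppercase px-2 py-1 rounded bg-gray-700">{clause.type}</span>
      </div>
      <p className="text-gray-400 text-sm mb-3 whitespace-pre-wrap">{clause.content}</p>
      <div className="text-xs text-gray-500">{clause.location}</div>
    </div>
  );
}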

Common Issues & Solutions

Here are some common issues you might encounter and how to solve them:

PDF Text Extraction Issues

Some PDFs (especially scanned documents) may not extract text properly. Consider adding OCR support with Tesseract.js for image-based PDFs, or validate that extracted text has sufficient content before analysis.
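
The "sufficient content" check mentioned above could look roughly like the following. This is a heuristic sketch, not a reliable scanned-PDF detector: the function name looksLikeExtractedText and both thresholds are assumptions to tune against your own documents, and the letter test only counts ASCII letters, so it would need adjusting for non-English contracts.

```typescript
// Heuristic check that a PDF yielded real text rather than a scanned image.
// Both thresholds are assumptions, not values taken from pdf-parse.
const MIN_CHARS_PER_PAGE = 200;
const MIN_LETTER_RATIO = 0.5;

function looksLikeExtractedText(text: string, pageCount: number): boolean {
  const trimmed = text.trim();
  if (trimmed.length === 0 || pageCount <= 0) return false;

  // Scanned PDFs usually yield little or no text per page
  if (trimmed.length / pageCount < MIN_CHARS_PER_PAGE) return false;

  // Garbled extraction tends to be mostly symbols, so require a share of letters
  const letters = (trimmed.match(/[A-Za-z]/g) ?? []).length;
  return letters / trimmed.length >= MIN_LETTER_RATIO;
}
```

In the upload route, a call such as looksLikeExtractedText(parsed.text, parsed.pageCount) could decide whether to continue to analysis or to tell the user that the document needs OCR first.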

Token Limit Exceeded

If you encounter "context length exceeded" errors, adjust the MAX_CHUNK_SIZE in the chunker. Start with smaller chunks (80,000 characters) and increase if the analysis seems fragmented.
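
To pick a MAX_CHUNK_SIZE deliberately rather than by trial and error, you can work backwards from a token budget. The sketch below reuses the rough ratio of 4 characters per token implied by the chunker's comment (100,000 characters for about 25k tokens). The function names and parameters are assumptions, and real token counts depend on the tokenizer and the text, so leave some headroom.

```typescript
// Rough sizing helpers; 4 characters per token matches the estimate in lib/chunker.ts
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Characters available for contract text once the prompt and the reply are reserved
function maxChunkChars(contextTokens: number, promptTokens: number, replyTokens: number): number {
  const availableTokens = contextTokens - promptTokens - replyTokens;
  return Math.max(0, availableTokens) * CHARS_PER_TOKEN;
}
```

For example, maxChunkChars(30000, 1000, 4096) reserves 1,000 tokens for the instructions and 4,096 for the reply (the max_tokens used in this guide), leaving room for 99,616 characters of contract text.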

JSON Parsing Failures

Claude occasionally outputs malformed JSON. Implement retry logic with a more explicit prompt, or use a JSON repair library like jsonrepair to fix common issues.
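
Before reaching for a repair library, it can help to extract and validate the JSON more carefully. The greedy pattern /\{[\s\S]*\}/ used in this guide spans from the first "{" to the last "}" in the reply, so any trailing commentary that contains braces breaks the parse. The sketch below scans for the first balanced object instead and returns null rather than throwing, so the caller can decide whether to retry. Both function names are assumptions.

```typescript
// Pull the first balanced {...} object out of a model reply and parse it.
// Returns null instead of throwing so the caller can decide whether to retry.
function extractJsonObject(reply: string): unknown {
  const start = reply.indexOf('{');
  if (start === -1) return null;

  let depth = 0;
  let inString = false;
  let escaped = false;

  for (let i = start; i < reply.length; i++) {
    const ch = reply[i];
    if (inString) {
      // Braces inside string values must not affect the depth count
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === '{') depth++;
    else if (ch === '}' && --depth === 0) {
      try {
        return JSON.parse(reply.slice(start, i + 1));
      } catch {
        return null;
      }
    }
  }
  return null; // Unbalanced braces, for example a reply cut off by max_tokens
}

// Minimal shape check for the fields the report component reads
function isContractAnalysisShape(value: unknown): boolean {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.summary === 'string'
    && Array.isArray(v.clauses)
    && Array.isArray(v.risks)
    && Array.isArray(v.recommendations);
}
```

If extractJsonObject returns null or the shape check fails, that is a reasonable point to retry the request with a stricter instruction, or to pass the raw text to a repair library.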

Additional troubleshooting tips:

Rate limiting: Implement exponential backoff for API calls to handle rate limits gracefully
Timeout errors: Large documents may time out on Vercel's free tier. Consider upgrading or using background jobs
Memory issues: pdf-parse can be memory-intensive. Use streaming for very large files
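
The exponential backoff mentioned in the rate limiting tip can be sketched as a small wrapper around any async call. The names withRetry, backoffDelayMs, and isTransient, the default delays, and the assumption that errors carry a numeric status field are all illustrative. The Anthropic SDK client also accepts a maxRetries option, so check what it already retries before layering this on top.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

// Assumption: HTTP 429 and 5xx are transient, read from a numeric `status` field on the error
function isTransient(error: unknown): boolean {
  const status = (error as { status?: number } | null)?.status;
  return status === 429 || (typeof status === 'number' && status >= 500);
}

// Retry an async call while shouldRetry says the error is transient
async function withRetry<T>(
  fn: () => Promise<T>,
  shouldRetry: (error: unknown) => boolean = isTransient,
  maxAttempts = 4,
  baseMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const isLastAttempt = attempt >= maxAttempts - 1;
      if (isLastAttempt || !shouldRetry(error)) throw error;
      // Random jitter of up to 25% spreads out retries from concurrent chunk requests
      const delay = backoffDelayMs(attempt, baseMs);
      await sleep(delay + Math.random() * delay * 0.25);
    }
  }
}
```

A call site might look like withRetry(() => analyzeSingleChunk(text)). Because analyzeContract sends all chunks at once with Promise.all, the jitter helps keep the retries from arriving in lockstep.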

Next Steps

Congratulations on building your AI contract analyzer! Here are some enhancements to consider:

Add user authentication - Implement Supabase Auth for secure multi-user access
Contract comparison UI - Build a side-by-side view on top of the Phase 5 comparison logic
Custom clause templates - Let users define their own clause detection patterns
Export functionality - Generate PDF or Word reports of the analysis
Batch processing - Analyze multiple contracts simultaneously
Historical analysis - Track changes in contract terms over time
Integration with DocuSign - Connect to e-signature workflows

Legal Disclaimer

This tool is designed to assist with contract review, not replace professional legal advice. Always have important contracts reviewed by a qualified attorney before signing.
