What We're Building
In this guide, you'll build a sophisticated content recommendation system that uses vector embeddings to understand and match content with user preferences. Unlike traditional recommendation systems that rely on simple tags or collaborative filtering alone, our AI Content Curator uses semantic understanding to find truly relevant content.
- **Semantic Search**: find content based on meaning, not just keywords
- **User Preferences**: track and learn from user interactions
- **AI Summaries**: generate personalized content descriptions
- **Real-time Feed**: a dynamic, personalized content stream
The system works by converting content (articles, videos, podcasts) into vector embeddings using OpenAI's embedding model, storing them in Supabase with pgvector for efficient similarity search, and using Claude to generate personalized summaries and explanations.
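In miniature, the flow is: ingest content, embed it, store the vector, then search and summarize. The tiny helper below mirrors what Phase 3 actually embeds (title plus description); the names here are illustrative, not part of the build:

```typescript
// Illustrative shape of a content item in the pipeline described above
type ContentItem = { title: string; description: string };

// Phase 3 embeds the title and description joined into one string
function textToEmbed(item: ContentItem): string {
  return `${item.title}. ${item.description}`;
}
```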
Prerequisites
Before starting this guide, make sure you have the following:
- Node.js 18+ installed on your machine
- OpenAI API key for generating embeddings ($0.0001 per 1K tokens)
- Anthropic API key for Claude-powered summaries
- Supabase account (free tier works fine for development)
- Basic understanding of React and TypeScript
- Familiarity with SQL and database concepts
Tech Stack Specification
Here's the technology stack we'll use for this build:
| Layer | Technology | Why This Choice |
|---|---|---|
| Frontend | Next.js 14, Tailwind CSS | Fast, SEO-friendly with App Router and server components |
| Backend | Next.js API Routes | Serverless functions with zero config deployment |
| Database | Supabase + pgvector | Postgres with native vector search, built-in auth |
| AI | OpenAI Embeddings + Claude | Best-in-class embeddings for similarity, Claude for personalization |
| Hosting | Vercel | Zero config, automatic preview deployments |
pgvector enables vector similarity search directly in PostgreSQL. This means you can join vector searches with regular SQL queries, filter by metadata, and leverage existing Postgres tooling - all without managing a separate vector database.
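With the `vector_cosine_ops` operator class, pgvector's `<=>` operator returns cosine *distance*, so `1 - distance` gives cosine similarity — the score reported by the `match_content` function later in this guide. As a rough sketch of the underlying math in TypeScript:

```typescript
// Cosine similarity: dot product divided by the product of vector norms.
// pgvector's `<=>` computes 1 minus this value (cosine distance).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Identical vectors score 1.0, orthogonal vectors score 0 — which is why the similarity thresholds later in this guide sit in the 0.7–0.8 range.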
AI Agent Workflow
Here's how to leverage AI tools throughout this build to maximize productivity:
Project Scaffolding with Claude Code
Start by using Claude Code to scaffold the entire project structure. Claude can generate the database schema, API routes, and React components in one go.
```text
# Prompt for Claude Code:
Create a Next.js 14 content recommendation system with:
- Supabase database with pgvector for content embeddings
- API routes for: content ingestion, embedding generation,
  similarity search, user preference tracking
- Components: ContentCard, PersonalizedFeed, SearchBar
- Use TypeScript, App Router, and Tailwind CSS
- Include database migration SQL for pgvector setup
```
UI Generation with v0.dev
Use v0.dev to rapidly prototype the feed UI. Describe the Netflix-style content grid layout and let v0 generate responsive Tailwind components.
When prompting v0.dev, be specific about interactive states: "Create a content card that shows a hover preview, loading skeleton state, and smooth fade-in animation when content loads."
Development with Cursor
Use Cursor's AI features to write and debug the embedding pipeline. Cursor excels at understanding the OpenAI and Supabase SDK patterns, helping you write type-safe database queries and handle edge cases in the embedding process.
Step-by-Step Build Guide
Phase 1: Project Setup
Initialize your Next.js project with all required dependencies for embeddings and vector search.
```bash
# Create Next.js project
npx create-next-app@latest content-curator --typescript --tailwind --app
cd content-curator

# Install dependencies
npm install @supabase/supabase-js openai @anthropic-ai/sdk
npm install -D @types/node

# Set up environment variables
cat > .env.local << 'EOF'
NEXT_PUBLIC_SUPABASE_URL=your_supabase_url
NEXT_PUBLIC_SUPABASE_ANON_KEY=your_anon_key
SUPABASE_SERVICE_ROLE_KEY=your_service_role_key
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
EOF
```
Phase 2: Database Setup with Vector Column
Set up your Supabase database with the pgvector extension and create tables for content and user preferences.
```sql
-- Enable pgvector extension (run in Supabase SQL Editor)
CREATE EXTENSION IF NOT EXISTS vector;

-- Content table with embedding column
CREATE TABLE content (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  title TEXT NOT NULL,
  description TEXT,
  content_type TEXT CHECK (content_type IN ('article', 'video', 'podcast')),
  source_url TEXT,
  thumbnail_url TEXT,
  metadata JSONB DEFAULT '{}',
  embedding vector(1536), -- OpenAI ada-002 dimension
  created_at TIMESTAMPTZ DEFAULT now(),
  updated_at TIMESTAMPTZ DEFAULT now()
);

-- User preferences table
CREATE TABLE user_preferences (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES auth.users(id) ON DELETE CASCADE,
  content_id UUID REFERENCES content(id) ON DELETE CASCADE,
  interaction_type TEXT CHECK (interaction_type IN ('view', 'like', 'save', 'dismiss')),
  created_at TIMESTAMPTZ DEFAULT now(),
  UNIQUE(user_id, content_id, interaction_type)
);

-- Create index for fast vector similarity search
CREATE INDEX content_embedding_idx ON content
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);

-- Function for similarity search
CREATE OR REPLACE FUNCTION match_content(
  query_embedding vector(1536),
  match_threshold FLOAT DEFAULT 0.78,
  match_count INT DEFAULT 10
)
RETURNS TABLE (
  id UUID,
  title TEXT,
  description TEXT,
  content_type TEXT,
  thumbnail_url TEXT,
  similarity FLOAT
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY SELECT
    content.id,
    content.title,
    content.description,
    content.content_type,
    content.thumbnail_url,
    1 - (content.embedding <=> query_embedding) AS similarity
  FROM content
  WHERE 1 - (content.embedding <=> query_embedding) > match_threshold
  ORDER BY content.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;
```
Phase 3: Content Embedding Pipeline
Create the API route that generates embeddings for content and stores them in your database.
```typescript
import { NextResponse } from 'next/server';
import { createClient } from '@supabase/supabase-js';
import OpenAI from 'openai';

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export async function POST(request: Request) {
  const { title, description, contentType, sourceUrl, thumbnailUrl } =
    await request.json();

  // Generate embedding from title + description
  const textToEmbed = `${title}. ${description}`;
  const embeddingResponse = await openai.embeddings.create({
    model: 'text-embedding-ada-002',
    input: textToEmbed,
  });
  const embedding = embeddingResponse.data[0].embedding;

  // Store content with embedding in Supabase
  const { data, error } = await supabase
    .from('content')
    .insert({
      title,
      description,
      content_type: contentType,
      source_url: sourceUrl,
      thumbnail_url: thumbnailUrl,
      embedding,
    })
    .select()
    .single();

  if (error) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }

  return NextResponse.json({ content: data });
}
```
Phase 4: Similarity-Based Recommendations
Build the recommendation engine that finds similar content based on user interactions and query embeddings.
```typescript
import { NextResponse } from 'next/server';
import { createClient } from '@supabase/supabase-js';
import OpenAI from 'openai';

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export async function POST(request: Request) {
  const { query, userId, limit = 10 } = await request.json();

  let queryEmbedding: number[];

  if (query) {
    // Search by text query
    const embeddingResponse = await openai.embeddings.create({
      model: 'text-embedding-ada-002',
      input: query,
    });
    queryEmbedding = embeddingResponse.data[0].embedding;
  } else if (userId) {
    // Generate embedding from user's liked content
    const { data: likedContent } = await supabase
      .from('user_preferences')
      .select('content:content_id(embedding)')
      .eq('user_id', userId)
      .eq('interaction_type', 'like')
      .limit(5);

    if (!likedContent?.length) {
      // Return trending content for new users
      const { data } = await supabase
        .from('content')
        .select('id, title, description, content_type, thumbnail_url')
        .order('created_at', { ascending: false })
        .limit(limit);
      return NextResponse.json({ recommendations: data });
    }

    // Average embeddings of liked content (pgvector columns may come
    // back as JSON-formatted strings over the API, so normalize first)
    const embeddings = likedContent.map((c: any) => {
      const e = c.content.embedding;
      return typeof e === 'string' ? (JSON.parse(e) as number[]) : e;
    });
    queryEmbedding = averageEmbeddings(embeddings);
  } else {
    return NextResponse.json(
      { error: 'Query or userId required' },
      { status: 400 }
    );
  }

  // Find similar content using pgvector
  const { data, error } = await supabase.rpc('match_content', {
    query_embedding: queryEmbedding,
    match_threshold: 0.75,
    match_count: limit,
  });

  if (error) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }

  return NextResponse.json({ recommendations: data });
}

function averageEmbeddings(embeddings: number[][]): number[] {
  const dims = embeddings[0].length;
  const avg = new Array(dims).fill(0);
  for (const emb of embeddings) {
    for (let i = 0; i < dims; i++) {
      avg[i] += emb[i];
    }
  }
  return avg.map((v) => v / embeddings.length);
}
```
Phase 5: AI-Powered Content Summaries
Add Claude-powered personalized summaries that explain why content was recommended and generate engaging descriptions.
```typescript
import { NextResponse } from 'next/server';
import Anthropic from '@anthropic-ai/sdk';
import { createClient } from '@supabase/supabase-js';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

export async function POST(request: Request) {
  const { contentId, userId } = await request.json();

  // Get content details
  const { data: content } = await supabase
    .from('content')
    .select('title, description, content_type, metadata')
    .eq('id', contentId)
    .single();

  if (!content) {
    return NextResponse.json({ error: 'Content not found' }, { status: 404 });
  }

  // Get user's recent likes for context
  const { data: userLikes } = await supabase
    .from('user_preferences')
    .select('content:content_id(title, content_type)')
    .eq('user_id', userId)
    .eq('interaction_type', 'like')
    .limit(5);

  const likedTitles = userLikes?.map((l: any) => l.content.title) || [];

  const message = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 300,
    messages: [{
      role: 'user',
      content: `You are a content curator. Generate a personalized 2-3 sentence summary explaining why this content would interest the user.

Content:
- Title: ${content.title}
- Type: ${content.content_type}
- Description: ${content.description}

User has recently liked: ${likedTitles.join(', ') || '(new user)'}

Write in second person ("you'll love this because..."). Be specific and engaging.`,
    }],
  });

  const summary = message.content[0].type === 'text'
    ? message.content[0].text
    : '';

  return NextResponse.json({ summary });
}
```
Phase 6: Personalized Feed UI
Build the frontend components for displaying the personalized content feed with loading states and interactions.
```tsx
'use client';

import { useState, useEffect } from 'react';
import { ContentCard } from './ContentCard';

interface Content {
  id: string;
  title: string;
  description: string;
  content_type: string;
  thumbnail_url: string;
  similarity?: number;
}

export function PersonalizedFeed({ userId }: { userId?: string }) {
  const [content, setContent] = useState<Content[]>([]);
  const [loading, setLoading] = useState(true);
  const [searchQuery, setSearchQuery] = useState('');

  const fetchRecommendations = async (query?: string) => {
    setLoading(true);
    try {
      const response = await fetch('/api/recommend', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          query,
          userId,
          limit: 12,
        }),
      });
      const data = await response.json();
      setContent(data.recommendations || []);
    } catch (error) {
      console.error('Failed to fetch recommendations:', error);
    } finally {
      setLoading(false);
    }
  };

  useEffect(() => {
    fetchRecommendations();
  }, [userId]);

  const handleSearch = (e: React.FormEvent) => {
    e.preventDefault();
    fetchRecommendations(searchQuery);
  };

  const handleInteraction = async (
    contentId: string,
    type: 'like' | 'save' | 'dismiss'
  ) => {
    await fetch('/api/interaction', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ userId, contentId, type }),
    });
    // Refresh recommendations after interaction
    fetchRecommendations(searchQuery || undefined);
  };

  return (
    <div className="space-y-8">
      {/* Search Bar */}
      <form onSubmit={handleSearch} className="flex gap-4">
        <input
          type="text"
          value={searchQuery}
          onChange={(e) => setSearchQuery(e.target.value)}
          placeholder="Search by topic, mood, or description..."
          className="flex-1 px-4 py-3 rounded-lg bg-gray-800
            border border-gray-700 focus:border-purple-500"
        />
        <button
          type="submit"
          className="px-6 py-3 bg-purple-600 rounded-lg
            hover:bg-purple-700 transition-colors"
        >
          Search
        </button>
      </form>

      {/* Content Grid */}
      {loading ? (
        <div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-6">
          {[...Array(6)].map((_, i) => (
            <div key={i} className="animate-pulse">
              <div className="bg-gray-800 aspect-video rounded-lg" />
              <div className="h-4 bg-gray-800 rounded mt-4 w-3/4" />
              <div className="h-3 bg-gray-800 rounded mt-2 w-1/2" />
            </div>
          ))}
        </div>
      ) : (
        <div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-6">
          {content.map((item) => (
            <ContentCard
              key={item.id}
              content={item}
              userId={userId}
              onInteraction={handleInteraction}
            />
          ))}
        </div>
      )}
    </div>
  );
}
```
Common Issues & Solutions
Here are some common issues you might encounter and how to solve them:
If you see "extension 'vector' does not exist", go to Supabase Dashboard > Database > Extensions and enable the "vector" extension. This is required before running the migration SQL.
OpenAI's text-embedding-ada-002 outputs 1536 dimensions. If you change to a different model (like text-embedding-3-small at 1536 or text-embedding-3-large at 3072), update the vector column dimension accordingly.
For larger datasets (10K+ rows), build the IVFFlat index after the table already contains data - pgvector computes the list centroids from existing rows, so an index built on an empty table performs poorly. You may also need to tune the `lists` parameter; use sqrt(n) as a starting point, where n is your row count.
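The sqrt(n) rule of thumb can be captured in a tiny helper (`suggestLists` is a hypothetical name, not part of the build):

```typescript
// Suggest an IVFFlat `lists` value from the table's row count,
// using the sqrt(n) rule of thumb, floored at 1
function suggestLists(rowCount: number): number {
  return Math.max(1, Math.round(Math.sqrt(rowCount)));
}
```

For example, 10,000 rows suggests lists = 100, which matches the value used in the migration SQL above.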
Other tips for a smooth build:
- Rate Limiting: OpenAI's embedding API has rate limits. For bulk imports, batch requests and add delays between calls.
- Caching: Cache embedding API calls to avoid regenerating embeddings for the same content. Use Redis or a simple in-memory cache for development.
- Cold Start: New users with no interaction history need a fallback strategy. Consider showing trending or recent content until you have preference data.
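The batching advice above can be sketched as a small utility. `embedBatch` is a placeholder you would wire to `openai.embeddings.create` (which accepts an array of inputs per request); the batch size and delay are illustrative, not tuned to any specific rate limit:

```typescript
// Split an array into fixed-size batches; the last batch holds the remainder
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(() => r(), ms));

// Embed texts in batches with a pause between calls to stay under rate limits
async function embedAll(
  texts: string[],
  embedBatch: (batch: string[]) => Promise<number[][]>,
  batchSize = 100,
  delayMs = 1000
): Promise<number[][]> {
  const results: number[][] = [];
  for (const batch of chunk(texts, batchSize)) {
    results.push(...(await embedBatch(batch)));
    await sleep(delayMs);
  }
  return results;
}
```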
Next Steps
Congratulations on building your AI Content Curator! Here are ways to extend and improve your system:
- Hybrid Recommendations: Combine vector similarity with collaborative filtering for even better results
- Real-time Updates: Add Supabase Realtime to push new recommendations as users interact
- A/B Testing: Implement different recommendation strategies and measure engagement metrics
- Multi-modal Embeddings: Use CLIP or similar models to embed images and match visual preferences
- Diversity Controls: Add logic to ensure recommendation variety and avoid filter bubbles
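One common approach to diversity controls is Maximal Marginal Relevance (MMR): re-rank candidates by trading relevance against similarity to items already selected, so near-duplicates don't crowd the feed. The sketch below is illustrative - `mmrRerank`, the `Scored` shape, and the `lambda` default are our assumptions, not part of the build above:

```typescript
interface Scored {
  id: string;
  relevance: number; // e.g. the similarity score from match_content
}

// Greedy MMR: at each step pick the candidate maximizing
// lambda * relevance - (1 - lambda) * (max similarity to already-picked items).
// lambda = 1 means pure relevance; lambda = 0 means pure diversity.
function mmrRerank(
  candidates: Scored[],
  pairSim: (a: string, b: string) => number,
  lambda = 0.7,
  k = 5
): Scored[] {
  const picked: Scored[] = [];
  const pool = [...candidates];
  while (picked.length < k && pool.length > 0) {
    let bestIdx = 0;
    let bestScore = -Infinity;
    for (let i = 0; i < pool.length; i++) {
      const maxSim = picked.length
        ? Math.max(...picked.map((p) => pairSim(pool[i].id, p.id)))
        : 0;
      const score = lambda * pool[i].relevance - (1 - lambda) * maxSim;
      if (score > bestScore) {
        bestScore = score;
        bestIdx = i;
      }
    }
    picked.push(pool.splice(bestIdx, 1)[0]);
  }
  return picked;
}
```

In practice `pairSim` would compute cosine similarity between stored content embeddings.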
For production deployments, consider using OpenAI's batch embedding API for cost efficiency, and implement a job queue (like Inngest or QStash) for processing large content libraries asynchronously.
Follow the Vibe Coding Enthusiast
Follow JD — product updates on LinkedIn, personal takes on X.