Intermediate ~3 hours

Build an AI Content Curator

Create an intelligent content recommendation system using vector embeddings and semantic search. Learn to build personalized feeds that understand user preferences and deliver relevant entertainment content.

What We're Building

In this guide, you'll build a sophisticated content recommendation system that uses vector embeddings to understand and match content with user preferences. Unlike traditional recommendation systems that rely on simple tags or collaborative filtering alone, our AI Content Curator uses semantic understanding to find truly relevant content.

Semantic Search

Find content based on meaning, not just keywords

User Preferences

Track and learn from user interactions

AI Summaries

Generate personalized content descriptions

Real-time Feed

Dynamic personalized content stream

The system works by converting content (articles, videos, podcasts) into vector embeddings using OpenAI's embedding model, storing them in Supabase with pgvector for efficient similarity search, and using Claude to generate personalized summaries and explanations.
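
Under the hood, "similarity" here means cosine similarity between embedding vectors: content about the same topic produces vectors pointing in roughly the same direction. pgvector computes this inside the database, but a minimal TypeScript sketch helps build intuition for what the comparison actually does:

```typescript
// Cosine similarity: 1 = same direction, 0 = orthogonal, -1 = opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings" (real ones have 1536 dimensions).
console.log(cosineSimilarity([1, 2, 3], [1, 2, 3])); // → 1
console.log(cosineSimilarity([1, 0, 0], [0, 1, 0])); // → 0
```

The `<=>` operator you'll see in the SQL below is pgvector's cosine *distance*, which is why the queries compute `1 - (embedding <=> query_embedding)` to recover a similarity score.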

Prerequisites

Before starting this guide, make sure you have the following:

  • Node.js 18+ installed on your machine
  • OpenAI API key for generating embeddings ($0.0001 per 1K tokens)
  • Anthropic API key for Claude-powered summaries
  • Supabase account (free tier works fine for development)
  • Basic understanding of React and TypeScript
  • Familiarity with SQL and database concepts

Tech Stack Specification

Here's the technology stack we'll use for this build:

Layer      Technology                  Why This Choice
Frontend   Next.js 14, Tailwind CSS    Fast, SEO-friendly with App Router and server components
Backend    Next.js API Routes          Serverless functions with zero-config deployment
Database   Supabase + pgvector         Postgres with native vector search, built-in auth
AI         OpenAI Embeddings + Claude  Best-in-class embeddings for similarity, Claude for personalization
Hosting    Vercel                      Zero config, automatic preview deployments

Why pgvector?

pgvector enables vector similarity search directly in PostgreSQL. This means you can join vector searches with regular SQL queries, filter by metadata, and leverage existing Postgres tooling - all without managing a separate vector database.

AI Agent Workflow

Here's how to leverage AI tools throughout this build to maximize productivity:

Project Scaffolding with Claude Code

Start by using Claude Code to scaffold the entire project structure. Claude can generate the database schema, API routes, and React components in one go.

Claude Code Prompt
# Prompt for Claude Code:

Create a Next.js 14 content recommendation system with:
- Supabase database with pgvector for content embeddings
- API routes for: content ingestion, embedding generation,
  similarity search, user preference tracking
- Components: ContentCard, PersonalizedFeed, SearchBar
- Use TypeScript, App Router, and Tailwind CSS
- Include database migration SQL for pgvector setup

UI Generation with v0.dev

Use v0.dev to rapidly prototype the feed UI. Describe the Netflix-style content grid layout and let v0 generate responsive Tailwind components.

Pro Tip

When prompting v0.dev, be specific about interactive states: "Create a content card that shows a hover preview, loading skeleton state, and smooth fade-in animation when content loads."

Development with Cursor

Use Cursor's AI features to write and debug the embedding pipeline. Cursor excels at understanding the OpenAI and Supabase SDK patterns, helping you write type-safe database queries and handle edge cases in the embedding process.

Step-by-Step Build Guide

Phase 1: Project Setup

Initialize your Next.js project with all required dependencies for embeddings and vector search.

Terminal
# Create Next.js project
npx create-next-app@latest content-curator --typescript --tailwind --app

cd content-curator

# Install dependencies
npm install @supabase/supabase-js openai @anthropic-ai/sdk
npm install -D @types/node

# Set up environment variables
cat > .env.local << 'EOF'
NEXT_PUBLIC_SUPABASE_URL=your_supabase_url
NEXT_PUBLIC_SUPABASE_ANON_KEY=your_anon_key
SUPABASE_SERVICE_ROLE_KEY=your_service_role_key
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
EOF

Phase 2: Database Setup with Vector Column

Set up your Supabase database with the pgvector extension and create tables for content and user preferences.

SQL
-- Enable pgvector extension (run in Supabase SQL Editor)
CREATE EXTENSION IF NOT EXISTS vector;

-- Content table with embedding column
CREATE TABLE content (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  title TEXT NOT NULL,
  description TEXT,
  content_type TEXT CHECK (content_type IN ('article', 'video', 'podcast')),
  source_url TEXT,
  thumbnail_url TEXT,
  metadata JSONB DEFAULT '{}',
  embedding vector(1536), -- OpenAI ada-002 dimension
  created_at TIMESTAMPTZ DEFAULT now(),
  updated_at TIMESTAMPTZ DEFAULT now()
);

-- User preferences table
CREATE TABLE user_preferences (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES auth.users(id) ON DELETE CASCADE,
  content_id UUID REFERENCES content(id) ON DELETE CASCADE,
  interaction_type TEXT CHECK (interaction_type IN ('view', 'like', 'save', 'dismiss')),
  created_at TIMESTAMPTZ DEFAULT now(),
  UNIQUE(user_id, content_id, interaction_type)
);

-- Create index for fast vector similarity search
CREATE INDEX content_embedding_idx ON content
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);

-- Function for similarity search
CREATE OR REPLACE FUNCTION match_content(
  query_embedding vector(1536),
  match_threshold FLOAT DEFAULT 0.78,
  match_count INT DEFAULT 10
)
RETURNS TABLE (
  id UUID,
  title TEXT,
  description TEXT,
  content_type TEXT,
  thumbnail_url TEXT,
  similarity FLOAT
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY SELECT
    content.id,
    content.title,
    content.description,
    content.content_type,
    content.thumbnail_url,
    1 - (content.embedding <=> query_embedding) AS similarity
  FROM content
  WHERE 1 - (content.embedding <=> query_embedding) > match_threshold
  ORDER BY content.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;

Phase 3: Content Embedding Pipeline

Create the API route that generates embeddings for content and stores them in your database.

TypeScript - app/api/embed/route.ts
import { NextResponse } from 'next/server';
import { createClient } from '@supabase/supabase-js';
import OpenAI from 'openai';

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

export async function POST(request: Request) {
  const { title, description, contentType, sourceUrl, thumbnailUrl } =
    await request.json();

  // Generate embedding from title + description
  const textToEmbed = `${title}. ${description}`;

  const embeddingResponse = await openai.embeddings.create({
    model: 'text-embedding-ada-002',
    input: textToEmbed,
  });

  const embedding = embeddingResponse.data[0].embedding;

  // Store content with embedding in Supabase
  const { data, error } = await supabase
    .from('content')
    .insert({
      title,
      description,
      content_type: contentType,
      source_url: sourceUrl,
      thumbnail_url: thumbnailUrl,
      embedding,
    })
    .select()
    .single();

  if (error) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }

  return NextResponse.json({ content: data });
}
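
One edge case the route above doesn't handle: text-embedding-ada-002 caps input at 8,191 tokens, so a very long description will make the API call fail. A rough character-based guard is often enough for a first version (accurate counting would use a tokenizer such as tiktoken; the 8,000-character budget here is a conservative placeholder, not an exact limit):

```typescript
// Truncate the embedding input to a conservative character budget.
// Characters are a crude proxy for tokens (~4 chars per token in English).
function buildEmbeddingInput(
  title: string,
  description: string,
  maxChars = 8000
): string {
  const text = `${title}. ${description}`;
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}

console.log(buildEmbeddingInput('Short', 'description')); // → "Short. description"
```

You would call `buildEmbeddingInput(title, description)` in place of the inline template literal before passing the text to `openai.embeddings.create`.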

Phase 4: Similarity-Based Recommendations

Build the recommendation engine that finds similar content based on user interactions and query embeddings.

TypeScript - app/api/recommend/route.ts
import { NextResponse } from 'next/server';
import { createClient } from '@supabase/supabase-js';
import OpenAI from 'openai';

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

export async function POST(request: Request) {
  const { query, userId, limit = 10 } = await request.json();

  let queryEmbedding: number[];

  if (query) {
    // Search by text query
    const embeddingResponse = await openai.embeddings.create({
      model: 'text-embedding-ada-002',
      input: query,
    });
    queryEmbedding = embeddingResponse.data[0].embedding;
  } else if (userId) {
    // Generate embedding from user's liked content
    const { data: likedContent } = await supabase
      .from('user_preferences')
      .select('content:content_id(embedding)')
      .eq('user_id', userId)
      .eq('interaction_type', 'like')
      .limit(5);

    if (!likedContent?.length) {
      // Return trending content for new users
      const { data } = await supabase
        .from('content')
        .select('id, title, description, content_type, thumbnail_url')
        .order('created_at', { ascending: false })
        .limit(limit);
      return NextResponse.json({ recommendations: data });
    }

    // Average embeddings of liked content. Note: Supabase returns
    // pgvector columns as strings, so parse them back into arrays.
    const embeddings = likedContent.map((c: any) => {
      const emb = c.content.embedding;
      return typeof emb === 'string' ? JSON.parse(emb) : emb;
    });
    queryEmbedding = averageEmbeddings(embeddings);
  } else {
    return NextResponse.json(
      { error: 'Query or userId required' },
      { status: 400 }
    );
  }

  // Find similar content using pgvector
  const { data, error } = await supabase
    .rpc('match_content', {
      query_embedding: queryEmbedding,
      match_threshold: 0.75,
      match_count: limit,
    });

  if (error) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }

  return NextResponse.json({ recommendations: data });
}

function averageEmbeddings(embeddings: number[][]): number[] {
  const dims = embeddings[0].length;
  const avg = new Array(dims).fill(0);

  for (const emb of embeddings) {
    for (let i = 0; i < dims; i++) {
      avg[i] += emb[i];
    }
  }

  return avg.map(v => v / embeddings.length);
}
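
To see what `averageEmbeddings` actually produces, here it is applied to toy vectors (the helper is repeated so the snippet is self-contained):

```typescript
// Component-wise mean of equal-length vectors: the user's "taste centroid".
function averageEmbeddings(embeddings: number[][]): number[] {
  const dims = embeddings[0].length;
  const avg = new Array(dims).fill(0);

  for (const emb of embeddings) {
    for (let i = 0; i < dims; i++) {
      avg[i] += emb[i];
    }
  }

  return avg.map(v => v / embeddings.length);
}

// Two liked items pull the centroid to the midpoint between them.
console.log(averageEmbeddings([[1, 2], [3, 4]])); // → [2, 3]
```

Because `match_content` ranks by cosine distance, only the centroid's direction matters, not its magnitude, so there's no need to re-normalize the averaged vector.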

Phase 5: AI-Powered Content Summaries

Add Claude-powered personalized summaries that explain why content was recommended and generate engaging descriptions.

TypeScript - app/api/summarize/route.ts
import { NextResponse } from 'next/server';
import Anthropic from '@anthropic-ai/sdk';
import { createClient } from '@supabase/supabase-js';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

export async function POST(request: Request) {
  const { contentId, userId } = await request.json();

  // Get content details
  const { data: content } = await supabase
    .from('content')
    .select('title, description, content_type, metadata')
    .eq('id', contentId)
    .single();

  if (!content) {
    return NextResponse.json({ error: 'Content not found' }, { status: 404 });
  }

  // Get user's recent likes for context
  const { data: userLikes } = await supabase
    .from('user_preferences')
    .select('content:content_id(title, content_type)')
    .eq('user_id', userId)
    .eq('interaction_type', 'like')
    .limit(5);

  const likedTitles = userLikes?.map((l: any) => l.content.title) || [];

  const message = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 300,
    messages: [{
      role: 'user',
      content: `You are a content curator. Generate a personalized 2-3 sentence summary explaining why this content would interest the user.

Content:
- Title: ${content.title}
- Type: ${content.content_type}
- Description: ${content.description}

User has recently liked: ${likedTitles.join(', ') || '(new user)'}

Write in second person ("you'll love this because..."). Be specific and engaging.`
    }]
  });

  const summary = message.content[0].type === 'text'
    ? message.content[0].text
    : '';

  return NextResponse.json({ summary });
}
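
If you want to iterate on prompt wording without hitting the API, the prompt assembly can be factored into a pure helper you can unit-test (a refactor sketch; the route above builds the same string inline):

```typescript
interface CuratedContent {
  title: string;
  content_type: string;
  description: string;
}

// Builds the curator prompt from content details and the user's recent likes.
function buildCuratorPrompt(
  content: CuratedContent,
  likedTitles: string[]
): string {
  return `You are a content curator. Generate a personalized 2-3 sentence summary explaining why this content would interest the user.

Content:
- Title: ${content.title}
- Type: ${content.content_type}
- Description: ${content.description}

User has recently liked: ${likedTitles.join(', ') || '(new user)'}

Write in second person ("you'll love this because..."). Be specific and engaging.`;
}
```

The route would then pass `buildCuratorPrompt(content, likedTitles)` as the message content, and your tests can assert on the string directly.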

Phase 6: Personalized Feed UI

Build the frontend components for displaying the personalized content feed with loading states and interactions.

TypeScript - components/PersonalizedFeed.tsx
'use client';

import { useState, useEffect } from 'react';
import { ContentCard } from './ContentCard';

interface Content {
  id: string;
  title: string;
  description: string;
  content_type: string;
  thumbnail_url: string;
  similarity?: number;
}

export function PersonalizedFeed({ userId }: { userId?: string }) {
  const [content, setContent] = useState<Content[]>([]);
  const [loading, setLoading] = useState(true);
  const [searchQuery, setSearchQuery] = useState('');

  const fetchRecommendations = async (query?: string) => {
    setLoading(true);
    try {
      const response = await fetch('/api/recommend', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          query,
          userId,
          limit: 12
        }),
      });
      const data = await response.json();
      setContent(data.recommendations || []);
    } catch (error) {
      console.error('Failed to fetch recommendations:', error);
    } finally {
      setLoading(false);
    }
  };

  useEffect(() => {
    fetchRecommendations();
  }, [userId]);

  const handleSearch = (e: React.FormEvent) => {
    e.preventDefault();
    fetchRecommendations(searchQuery);
  };

  const handleInteraction = async (
    contentId: string,
    type: 'like' | 'save' | 'dismiss'
  ) => {
    await fetch('/api/interaction', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ userId, contentId, type }),
    });
    // Refresh recommendations after interaction
    fetchRecommendations(searchQuery || undefined);
  };

  return (
    <div className="space-y-8">
      {/* Search Bar */}
      <form onSubmit={handleSearch} className="flex gap-4">
        <input
          type="text"
          value={searchQuery}
          onChange={(e) => setSearchQuery(e.target.value)}
          placeholder="Search by topic, mood, or description..."
          className="flex-1 px-4 py-3 rounded-lg bg-gray-800
                     border border-gray-700 focus:border-purple-500"
        />
        <button
          type="submit"
          className="px-6 py-3 bg-purple-600 rounded-lg
                     hover:bg-purple-700 transition-colors"
        >
          Search
        </button>
      </form>

      {/* Content Grid */}
      {loading ? (
        <div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-6">
          {[...Array(6)].map((_, i) => (
            <div key={i} className="animate-pulse">
              <div className="bg-gray-800 aspect-video rounded-lg" />
              <div className="h-4 bg-gray-800 rounded mt-4 w-3/4" />
              <div className="h-3 bg-gray-800 rounded mt-2 w-1/2" />
            </div>
          ))}
        </div>
      ) : (
        <div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-6">
          {content.map((item) => (
            <ContentCard
              key={item.id}
              content={item}
              userId={userId}
              onInteraction={handleInteraction}
            />
          ))}
        </div>
      )}
    </div>
  );
}

Common Issues & Solutions

Here are some common issues you might encounter and how to solve them:

pgvector Extension Not Found

If you see "extension 'vector' does not exist", go to Supabase Dashboard > Database > Extensions and enable the "vector" extension. This is required before running the migration SQL.

Embedding Dimension Mismatch

OpenAI's text-embedding-ada-002 outputs 1536 dimensions. If you switch models (text-embedding-3-small is also 1536 dimensions; text-embedding-3-large is 3072), update the vector column dimension accordingly, and re-embed existing rows: embeddings from different models live in different vector spaces and aren't comparable.
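
A cheap way to catch a mismatch before it reaches the database is to validate the vector length at insert time. A small guard you could drop into the embed route (the 1536 default matches the schema in Phase 2):

```typescript
// Throws if the embedding length doesn't match the vector column's dimension.
function assertEmbeddingDimension(
  embedding: number[],
  expected = 1536
): number[] {
  if (embedding.length !== expected) {
    throw new Error(
      `Embedding has ${embedding.length} dimensions, expected ${expected}. ` +
      `Did the embedding model change without a schema migration?`
    );
  }
  return embedding;
}
```

Calling `assertEmbeddingDimension(embedding)` before the Supabase insert turns a confusing Postgres error into an actionable one.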

Slow Vector Search Performance

For larger datasets (10K+ items), ensure the IVFFlat index is created properly. pgvector's guidance is to set `lists` to roughly rows / 1000 for tables up to about 1M rows, and sqrt(rows) beyond that. Also note that IVFFlat indexes cluster existing data when built, so for best recall create (or rebuild) the index after your bulk import, not on an empty table.
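
The pgvector README's sizing guidance (rows / 1000 up to ~1M rows, sqrt(rows) beyond that) can be encoded as a small helper; the floor of 10 lists is my own assumption to avoid degenerate tiny indexes, not part of the official advice:

```typescript
// Suggested `lists` value for an IVFFlat index, per pgvector's guidance.
function suggestIvfflatLists(rowCount: number): number {
  if (rowCount <= 1_000_000) {
    return Math.max(10, Math.round(rowCount / 1000));
  }
  return Math.round(Math.sqrt(rowCount));
}

console.log(suggestIvfflatLists(100_000));   // → 100
console.log(suggestIvfflatLists(4_000_000)); // → 2000
```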

Other tips for a smooth build:

  • Rate Limiting: OpenAI's embedding API has rate limits. For bulk imports, batch requests and add delays between calls.
  • Caching: Cache embedding API calls to avoid regenerating embeddings for the same content. Use Redis or a simple in-memory cache for development.
  • Cold Start: New users with no interaction history need a fallback strategy. Consider showing trending or recent content until you have preference data.
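
For the rate-limiting point above, a simple pattern is to chunk the content list and pause between batches. A sketch (the batch size of 100 and 1-second delay are arbitrary starting points, not OpenAI-documented limits; `embedBatch` stands in for a real call to `openai.embeddings.create`, which accepts an array of inputs):

```typescript
// Split an array into fixed-size chunks.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// Embed texts in batches, pausing between calls to stay under rate limits.
async function embedAll(
  texts: string[],
  embedBatch: (batch: string[]) => Promise<number[][]>,
  batchSize = 100,
  delayMs = 1000
): Promise<number[][]> {
  const results: number[][] = [];
  for (const batch of chunk(texts, batchSize)) {
    results.push(...(await embedBatch(batch)));
    await sleep(delayMs);
  }
  return results;
}

console.log(chunk([1, 2, 3, 4, 5], 2)); // → [[1, 2], [3, 4], [5]]
```

Sending inputs as arrays also cuts request overhead compared to embedding one item per call.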

Next Steps

Congratulations on building your AI Content Curator! Here are ways to extend and improve your system:

  • Hybrid Recommendations: Combine vector similarity with collaborative filtering for even better results
  • Real-time Updates: Add Supabase Realtime to push new recommendations as users interact
  • A/B Testing: Implement different recommendation strategies and measure engagement metrics
  • Multi-modal Embeddings: Use CLIP or similar models to embed images and match visual preferences
  • Diversity Controls: Add logic to ensure recommendation variety and avoid filter bubbles

Production Tip

For production deployments, consider using OpenAI's batch embedding API for cost efficiency, and implement a job queue (like Inngest or QStash) for processing large content libraries asynchronously.