Link Previews - nself-org/nchat GitHub Wiki
Complete implementation of rich link previews with Open Graph metadata extraction, Twitter Cards, and platform-specific handlers.
- ✅ Automatic URL detection in message content
- ✅ Extracts up to 3 URLs per message (configurable)
- ✅ Real-time preview fetching as messages are sent
- ✅ Smart URL pattern matching
- ✅ Open Graph protocol support (title, description, images, videos)
- ✅ Twitter Card metadata
- ✅ Fallback to basic HTML meta tags
- ✅ Favicon extraction
- ✅ Theme color detection
- ✅ Author and publication date
- ✅ YouTube video previews with thumbnails
- ✅ GitHub repository and issue/PR previews
- ✅ Twitter/X post previews
- ✅ Spotify track/album/playlist previews
- ✅ Code snippet previews (Gist, CodePen, CodeSandbox)
- ✅ Direct image and video URL handling
- ✅ SSRF protection (blocks private IPs and localhost)
- ✅ Rate limiting per domain (10 requests/minute)
- ✅ Client rate limiting (30 requests/minute)
- ✅ Server-side caching (1 hour TTL)
- ✅ Request timeout (10 seconds)
- ✅ Response size limits (5MB max)
- ✅ Blocked ports protection
- ✅ Privacy mode to hide/show previews
- ✅ Dismiss individual previews
- ✅ Disable auto-unfurl option
- ✅ Loading states with skeletons
- ✅ Error handling with fallback display
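The URL detection step in the list above could look roughly like the following sketch. The function name `extractUrls`, the regular expression, and the default limit of 3 are illustrative assumptions, not nchat's actual implementation.

```typescript
// Hypothetical helper: the name, pattern, and default limit are illustrative.
const URL_PATTERN = /https?:\/\/[^\s<>"')\]]+/g

function extractUrls(content: string, maxUrls: number = 3): string[] {
  const urls: string[] = []
  const candidates = content.match(URL_PATTERN) || []
  for (const raw of candidates) {
    if (urls.length >= maxUrls) break
    // Trailing punctuation usually belongs to the sentence, not the URL.
    const candidate = raw.replace(/[.,;:!?]+$/, '')
    let href: string
    try {
      // The URL constructor rejects malformed candidates and normalizes the rest.
      href = new URL(candidate).href
    } catch {
      continue
    }
    // Skip duplicates so the same link is only previewed once per message.
    if (urls.indexOf(href) === -1) urls.push(href)
  }
  return urls
}
```

Deduplicating before applying the limit means a message that repeats one link does not use up the preview budget.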
┌─────────────────────────────────────────────────────────────┐
│ Message Component │
├─────────────────────────────────────────────────────────────┤
│ MessageContent │
│ ├── Text Content │
│ └── LinkPreview Component │
│ ├── Auto-detect URLs │
│ ├── Fetch previews via API │
│ └── Display LinkPreviewCard(s) │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ API Route (/api/unfurl) │
├─────────────────────────────────────────────────────────────┤
│ 1. Rate Limiting │
│ 2. URL Validation & SSRF Check │
│ 3. Cache Check │
│ 4. Fetch URL with timeout │
│ 5. Parse HTML │
│ 6. Extract Metadata │
│ 7. Cache Result │
│ 8. Return Preview Data │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ URL Unfurler Service │
├─────────────────────────────────────────────────────────────┤
│ unfurlUrl() │
│ ├── Security checks (SSRF, blocked domains) │
│ ├── Rate limit enforcement │
│ ├── Cache management │
│ ├── HTTP fetch with protection │
│ └── HTML parsing │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Parser Services │
├─────────────────────────────────────────────────────────────┤
│ Open Graph Parser │
│ └── parseOpenGraph() - Extract og: meta tags │
│ Twitter Card Parser │
│ └── parseTwitterCard() - Extract twitter: meta tags │
│ Basic Metadata Parser │
│ └── parseBasicMetadata() - Extract <title>, etc. │
│ Domain Handlers │
│ ├── YouTube handler │
│ ├── GitHub handler │
│ ├── Twitter handler │
│ └── Spotify handler │
└─────────────────────────────────────────────────────────────┘
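The parser layer at the bottom of the diagram can be pictured as a fallback chain. The sketch below is a simplified assumption of how that might work: `collectMetaTags` and `resolvePreview` are hypothetical names, the precedence (Open Graph, then Twitter Card, then basic tags) is assumed, and a regex stands in for a real HTML parser.

```typescript
interface BasicPreview {
  title?: string
  description?: string
  image?: string
}

// Collects <meta property|name="..." content="..."> pairs.
// Assumption: property/name appears before content; production code should
// use a proper HTML parser rather than a regular expression.
function collectMetaTags(html: string): Record<string, string> {
  const tags: Record<string, string> = {}
  const pattern = /<meta\s+(?:property|name)=["']([^"']+)["']\s+content=(["'])(.*?)\2[^>]*>/gi
  let match: RegExpExecArray | null
  while ((match = pattern.exec(html)) !== null) {
    const key = match[1].toLowerCase()
    // Keep the first occurrence of each tag.
    if (!(key in tags)) tags[key] = match[3]
  }
  return tags
}

// Resolves each field by trying Open Graph, then Twitter Card, then basic tags.
function resolvePreview(html: string): BasicPreview {
  const meta = collectMetaTags(html)
  const titleTag = /<title[^>]*>([^<]*)<\/title>/i.exec(html)
  return {
    title: meta['og:title'] || meta['twitter:title'] || (titleTag ? titleTag[1].trim() : undefined),
    description: meta['og:description'] || meta['twitter:description'] || meta['description'],
    image: meta['og:image'] || meta['twitter:image'],
  }
}
```

Resolving each field independently lets a page that only sets `og:title` still pick up its description from a plain `description` meta tag.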
src/
├── components/chat/
│   ├── LinkPreview.tsx           # Main preview component
│   ├── link-preview-card.tsx     # (Integrated in LinkPreview.tsx)
│   └── message-content.tsx       # Updated with preview support
│
├── lib/
│   ├── link-preview/
│   │   ├── preview-types.ts      # TypeScript types
│   │   ├── unfurl.ts             # Main unfurl logic
│   │   ├── og-parser.ts          # Open Graph parser
│   │   ├── twitter-parser.ts     # Twitter Card parser
│   │   ├── domain-handlers.ts    # Platform-specific handlers
│   │   ├── preview-sanitizer.ts  # Security sanitization
│   │   ├── preview-cache.ts      # Cache management
│   │   └── index.ts              # Public API
│   │
│   ├── social/
│   │   ├── url-unfurler.ts       # Server-side unfurl service
│   │   └── og-parser.ts          # Complete OG implementation
│   │
│   └── messages/
│       └── link-preview.ts       # Client-side helpers
│
└── app/api/
    ├── unfurl/
    │   └── route.ts              # Main unfurl API endpoint
    └── link-preview/
        └── route.ts              # Alternative endpoint
The LinkPreview component automatically detects and displays previews for URLs in messages:
import { MessageContent } from '@/components/chat/message-content'

<MessageContent
  content="Check out this article: https://example.com/article"
  showLinkPreviews={true} // Default: true
/>

You can also render previews manually:
import { LinkPreview } from '@/components/chat/LinkPreview'

<LinkPreview
  content="https://github.com/vercel/next.js"
  maxPreviews={3}
  autoFetch={true}
  allowDismiss={true}
  onDismiss={(url) => console.log('Dismissed:', url)}
/>

Fetch preview data directly:
// Single URL
const response = await fetch('/api/unfurl?url=' + encodeURIComponent(url))
const result = await response.json()
if (result.success) {
  console.log('Title:', result.data.title)
  console.log('Image:', result.data.image)
  console.log('Description:', result.data.description)
}

// Batch URLs (variables renamed so both snippets can share one scope)
const batchResponse = await fetch('/api/unfurl', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    urls: ['https://example.com', 'https://github.com/vercel/next.js'],
    forceRefresh: false,
  }),
})
const batchResult = await batchResponse.json()
// batchResult.results contains a map of URL -> preview data

Use the unfurler service directly in API routes or server components:
import { unfurlUrl } from '@/lib/social/url-unfurler'

const result = await unfurlUrl('https://example.com', {
  forceRefresh: false,
  timeout: 10000,
  cacheTtl: 3600000, // 1 hour
})
if (result.success) {
  console.log('Preview:', result.data)
}

# Optional: Custom user agent
UNFURL_USER_AGENT="nchat-bot/1.0 (+https://nchat.app/bot)"
# Optional: Custom timeout (milliseconds)
UNFURL_TIMEOUT=10000
# Optional: Cache duration (milliseconds)
UNFURL_CACHE_DURATION=3600000

Default limits (configurable in code):
// Per domain
RATE_LIMIT_PER_DOMAIN: 10 requests/minute
// Per client (IP)
CLIENT_RATE_LIMIT: 30 requests/minute

Automatically blocks:
- Private IP ranges (10.x.x.x, 172.16-31.x.x, 192.168.x.x)
- Loopback addresses (127.x.x.x, ::1)
- Link-local addresses (169.254.x.x)
- Cloud metadata services
- Blocked ports (22, 23, 25, 3389, 5432, 3306, 6379, 27017)
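A minimal sketch of how the blocklist above might be checked before a fetch is shown below. The names `isPrivateIPv4` and `isBlockedTarget` are hypothetical, and the check only looks at the literal hostname. A real SSRF guard would also need to validate the address the hostname resolves to, and to re-check after redirects.

```typescript
// Port list copied from the blocklist above.
const BLOCKED_PORTS: number[] = [22, 23, 25, 3389, 5432, 3306, 6379, 27017]

// Returns true for private, loopback, and link-local IPv4 literals.
function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host)
  if (!m) return false
  const a = Number(m[1])
  const b = Number(m[2])
  return (
    a === 10 || // 10.x.x.x
    a === 127 || // loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16-31.x.x
    (a === 192 && b === 168) || // 192.168.x.x
    (a === 169 && b === 254) // link-local, including 169.254.169.254
  )
}

// Hypothetical pre-flight check: true means the URL must not be fetched.
function isBlockedTarget(rawUrl: string): boolean {
  let url: URL
  try {
    url = new URL(rawUrl)
  } catch {
    return true // unparseable input is rejected outright
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return true
  if (url.port !== '' && BLOCKED_PORTS.indexOf(Number(url.port)) !== -1) return true
  const host = url.hostname.toLowerCase()
  if (host === 'localhost' || host === '[::1]') return true
  return isPrivateIPv4(host)
}
```

Parsing with the `URL` constructor first is deliberate: it normalizes the hostname and port before the checks run.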
interface LinkPreviewData {
  // Core fields
  url: string
  type: 'generic' | 'article' | 'video' | 'image' | 'audio' | ...
  status: 'pending' | 'loading' | 'success' | 'error'
  domain: string
  isSecure: boolean

  // Content
  title?: string
  description?: string
  siteName?: string
  favicon?: string

  // Media
  image?: string
  imageWidth?: number
  imageHeight?: number
  imageAlt?: string
  video?: string
  audio?: string

  // Metadata
  author?: string
  publishedTime?: string
  modifiedTime?: string
  locale?: string
  themeColor?: string

  // Cache
  fetchedAt: number
  expiresAt: number

  // Error
  error?: {
    code: string
    message: string
    retryable: boolean
  }
}

YouTube:
- Video ID extraction
- Thumbnail URLs (hqdefault)
- Embed URLs
- Channel information (when available)

GitHub:
- Repository previews with stars/forks
- Issue and PR previews with state
- User/organization information
- OpenGraph images from GitHub CDN

Twitter/X:
- Tweet ID and author extraction
- oEmbed support
- User verification badges
- Media attachments

Spotify:
- Track, album, playlist, artist previews
- Embed URLs for playback
- Album artwork
- Artist information

Code snippets:
- GitHub Gist
- CodePen
- CodeSandbox
- Embed URLs for live preview
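As an illustration of what a platform handler does, here is a sketch of the YouTube case: extract the video ID, then derive the `hqdefault` thumbnail and embed URLs. The function name `parseYouTubeUrl`, the set of recognized URL shapes, and the 11-character ID check are assumptions made for this example.

```typescript
interface YouTubePreview {
  videoId: string
  thumbnailUrl: string
  embedUrl: string
}

// Hypothetical handler: returns null when the URL is not a recognizable video link.
function parseYouTubeUrl(rawUrl: string): YouTubePreview | null {
  let url: URL
  try {
    url = new URL(rawUrl)
  } catch {
    return null
  }
  const host = url.hostname.replace(/^(www\.|m\.)/, '')
  let videoId: string | null = null
  if (host === 'youtu.be') {
    // Short links carry the ID as the first path segment.
    videoId = url.pathname.slice(1).split('/')[0]
  } else if (host === 'youtube.com') {
    if (url.pathname === '/watch') {
      videoId = url.searchParams.get('v')
    } else {
      const m = /^\/(?:embed|shorts)\/([^/]+)/.exec(url.pathname)
      videoId = m ? m[1] : null
    }
  }
  // Assumption: IDs are 11 characters from [A-Za-z0-9_-], as they are today.
  if (!videoId || !/^[A-Za-z0-9_-]{11}$/.test(videoId)) return null
  return {
    videoId,
    thumbnailUrl: `https://img.youtube.com/vi/${videoId}/hqdefault.jpg`,
    embedUrl: `https://www.youtube.com/embed/${videoId}`,
  }
}
```

Because the thumbnail and embed URLs are derived from the ID alone, a handler like this can build a basic preview without fetching the page.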
The system handles various error conditions gracefully:
Network errors:
- Timeout (10-second limit)
- DNS resolution failures
- Connection refused
- SSL/TLS errors

Content errors:
- Response too large (5MB limit)
- Non-HTML content type
- Missing metadata
- Malformed HTML

Security errors:
- Private IP addresses
- Blocked domains
- SSRF attempts
- Blocked ports

Rate limiting:
- Per-domain rate limits
- Per-client rate limits
- Automatic Retry-After headers
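The per-domain and per-client limits could be enforced with a sliding-window counter along these lines. This is a sketch under assumptions: the class name `SlidingWindowLimiter` is hypothetical, state is held in memory for a single process, and the limits of 10 and 30 per minute are the defaults quoted in this document.

```typescript
interface LimitDecision {
  allowed: boolean
  retryAfterSeconds: number // value for a Retry-After header when blocked
}

// Hypothetical in-memory limiter; one instance per limit type.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>()
  private limit: number
  private windowMs: number

  constructor(limit: number, windowMs: number = 60000) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // `now` is injectable so the behavior can be tested without real waiting.
  check(key: string, now: number = Date.now()): LimitDecision {
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.hits.get(key) || []).filter((t) => now - t < this.windowMs)
    if (recent.length >= this.limit) {
      this.hits.set(key, recent)
      // The oldest hit leaving the window frees the next slot.
      const waitMs = this.windowMs - (now - recent[0])
      return { allowed: false, retryAfterSeconds: Math.ceil(waitMs / 1000) }
    }
    recent.push(now)
    this.hits.set(key, recent)
    return { allowed: true, retryAfterSeconds: 0 }
  }
}

// Defaults quoted in this document: 10/minute per domain, 30/minute per client.
const perDomainLimiter = new SlidingWindowLimiter(10)
const perClientLimiter = new SlidingWindowLimiter(30)
```

A route handler would typically call `perClientLimiter.check(clientIp)` and `perDomainLimiter.check(hostname)` and return HTTP 429 with `Retry-After` when either is blocked. With several server instances, the counters would need shared storage.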
Server-side cache:
- In-memory cache (use Redis in production)
- 1 hour default TTL
- LRU eviction (max 1000 entries)
- Automatic cache size management

Browser cache:
- Browser cache via Cache-Control headers
- 1 hour max-age
- Stale-while-revalidate for 24 hours

Client-side cache:
- Per-message preview caching
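The server-side cache behavior (TTL plus LRU eviction) can be sketched with a `Map`, which preserves insertion order. `PreviewCache` is a hypothetical class name; the defaults of 1000 entries and a 1-hour TTL are the figures quoted above.

```typescript
// Hypothetical in-memory cache combining a TTL with LRU eviction.
class PreviewCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>()
  private maxEntries: number
  private ttlMs: number

  constructor(maxEntries: number = 1000, ttlMs: number = 3600000) {
    this.maxEntries = maxEntries
    this.ttlMs = ttlMs
  }

  // `now` is injectable so expiry can be tested deterministically.
  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (entry.expiresAt <= now) {
      this.entries.delete(key) // expired: drop it and report a miss
      return undefined
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key)
    this.entries.set(key, entry)
    return entry.value
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.delete(key)
    this.entries.set(key, { value, expiresAt: now + this.ttlMs })
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value
      if (oldest !== undefined) this.entries.delete(oldest)
    }
  }

  get size(): number {
    return this.entries.size
  }
}
```

The same interface could be backed by Redis in production, with the TTL handled by key expiry.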
- Concurrent preview fetching (max 5 parallel)
- Lazy image loading
- Progressive preview rendering
- Cache hits avoid network requests
- Average fetch time: 200-500ms
- Cache hit rate: ~80% (typical)
- Memory usage: ~10KB per cached preview
- Network bandwidth: ~20-50KB per preview
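The "max 5 parallel" limit on concurrent preview fetching could be implemented with a small worker pool such as the sketch below. The helper name `mapWithConcurrency` is hypothetical, and the sketch assumes the worker handles its own errors, since one rejected promise rejects the whole batch.

```typescript
// Hypothetical helper: runs `worker` over `items` with at most `limit` in flight.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length)
  let next = 0

  // Each lane pulls the next unclaimed index until none remain.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++
      results[index] = await worker(items[index])
    }
  }

  const lanes: Promise<void>[] = []
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    lanes.push(runLane())
  }
  await Promise.all(lanes)
  return results // same order as `items`
}
```

A caller might use it as `await mapWithConcurrency(urls, 5, (url) => fetchPreview(url))`, where `fetchPreview` stands for whatever function requests a single preview.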
# Test single URL
curl "http://localhost:3000/api/unfurl?url=https://github.com/vercel/next.js"
# Test with force refresh
curl "http://localhost:3000/api/unfurl?url=https://example.com&refresh=true"
# Test batch
curl -X POST http://localhost:3000/api/unfurl \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://github.com/vercel/next.js", "https://youtube.com/watch?v=dQw4w9WgXcQ"]}'

import { unfurlUrl } from '@/lib/social/url-unfurler'
import { parseOpenGraph } from '@/lib/social/og-parser'

describe('URL Unfurler', () => {
  it('should unfurl a valid URL', async () => {
    const result = await unfurlUrl('https://example.com')
    expect(result.success).toBe(true)
    expect(result.data).toBeDefined()
  })

  it('should block private IPs', async () => {
    const result = await unfurlUrl('http://192.168.1.1')
    expect(result.success).toBe(false)
    expect(result.errorCode).toBe('BLOCKED')
  })
})

Scaling:
- Replace in-memory cache with Redis
- Use CDN for preview images
- Implement distributed rate limiting
- Add monitoring and logging

Security:
- Regular security audits
- Update SSRF protection rules
- Monitor for abuse patterns
- Implement IP allowlists

Performance:
- Enable HTTP/2 for parallel fetches
- Optimize image processing
- Implement preview pre-caching
- Use worker threads for parsing
Previews not loading:
- Check network tab for API errors
- Verify URL is accessible
- Check rate limits
- Inspect browser console for errors

URL blocked:
- Verify URL is not a private IP
- Check blocked domains list
- Ensure port is allowed

Slow previews:
- Check cache hit rate
- Verify network latency
- Monitor API response times
- Check concurrent request limits

Missing metadata:
- Verify site has Open Graph tags
- Check robots.txt for bot blocking
- Verify User-Agent is allowed
- Test with different parsers
- oEmbed protocol support
- Video preview with inline player
- Audio waveform visualization
- PDF preview thumbnails
- RSS feed detection
- Schema.org structured data
- Machine learning for preview quality scoring
- WebSocket for real-time preview updates
- Preview history and analytics
- Custom domain handlers registry