Data Export - nself-org/nchat GitHub Wiki
Complete GDPR-compliant data export system for nself-chat.
The Data Export & Backup system allows users to export their chat data in multiple formats for backup purposes or GDPR compliance. The system supports background processing, real-time progress tracking, and automatic expiry for security.
- JSON - Structured data with complete metadata
- CSV - Spreadsheet-friendly flat format
- HTML - Styled, printable format
- PDF - Professional document format (server-side generation required)
- All Messages: Complete chat history
- Direct Messages: Private conversations only
- Specific Channels: Selected channels
- User Data: Profile and settings
- Files: Include or exclude file attachments
- Reactions: Include emoji reactions
- Threads: Include thread replies
- Edit History: Include message edit history
- Anonymization: GDPR-compliant data masking
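The scope, format, and content options above can be pictured as a single options object. The following is a minimal sketch, assuming the field names used in the request examples on this page. The scope values `specific_channels` and `user_data`, the `channelIds` field, and the `validateExportOptions` helper are hypothetical and may not match the actual `types.ts`.

```typescript
// Sketch only: 'specific_channels', 'user_data', channelIds and the validator
// are assumptions for illustration, not the project's confirmed types.
type ExportScope = 'all_messages' | 'direct_messages' | 'specific_channels' | 'user_data'
type ExportFormat = 'json' | 'csv' | 'html' | 'pdf'

interface ExportOptions {
  scope: ExportScope
  format: ExportFormat
  fromDate?: Date
  toDate?: Date
  channelIds?: string[] // hypothetical: which channels to export
  includeFiles?: boolean
  includeReactions?: boolean
  includeThreads?: boolean
  includeEdits?: boolean
  anonymize?: boolean
  includeUserData?: boolean
  includeChannelData?: boolean
}

// Returns a list of human-readable problems; an empty list means the options look usable.
function validateExportOptions(options: ExportOptions): string[] {
  const errors: string[] = []
  if (
    options.fromDate &&
    options.toDate &&
    options.fromDate.getTime() > options.toDate.getTime()
  ) {
    errors.push('fromDate must not be after toDate')
  }
  if (options.scope === 'specific_channels' && (options.channelIds ?? []).length === 0) {
    errors.push('specific_channels scope requires at least one channel id')
  }
  return errors
}
```

Validating before the request is queued keeps obviously empty or contradictory exports out of the background worker.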
Navigate to: Settings → Privacy & Security → Data Export
- Select Scope: Choose what data to export
- Choose Format: Select output format (JSON, CSV, HTML, PDF)
- Set Date Range (Optional): Filter by date
- Configure Options: Toggle files, reactions, threads, edits
- Enable Anonymization (Optional): For GDPR compliance
- Create Export: Start background processing
View all past export requests with:
- Status (Pending, Processing, Completed, Failed, Expired, Cancelled)
- Real-time progress (percentage and items processed)
- Download links for completed exports
- Cancel option for pending/processing exports
- Expiry information
┌────────────────────┐
│ User UI            │
└─────────┬──────────┘
          │
          ↓
┌────────────────────┐
│ API Route          │  /api/export
│ (route.ts)         │
└─────────┬──────────┘
          │
          ↓
┌────────────────────┐
│ Background         │
│ Worker             │
└─────────┬──────────┘
          │
          ↓
┌────────────────────┐
│ DataExporter       │  Fetch data from GraphQL
│ (data-exporter.ts) │
└─────────┬──────────┘
          │
          ↓
┌────────────────────┐
│ Formatters         │  Convert to JSON/CSV/HTML/PDF
│ (formatters.ts)    │
└─────────┬──────────┘
          │
          ↓
┌────────────────────┐
│ File Storage       │  S3/MinIO (production)
└─────────┬──────────┘
          │
          ↓
┌────────────────────┐
│ Email              │  Notify user
│ Notification       │
└────────────────────┘
- `/src/lib/export/types.ts` - TypeScript types
- `/src/lib/export/data-exporter.ts` - Core export engine
- `/src/lib/export/formatters.ts` - Format converters
- `/src/lib/export/index.ts` - Public API
- `/src/app/api/export/route.ts` - API endpoint
- `/src/components/settings/DataExport.tsx` - UI component
- Request Creation
  - User configures export options
  - POST to `/api/export` creates request
  - Request queued for background processing
- Background Processing
  - DataExporter fetches data from GraphQL
  - Processes messages, channels, users in batches
  - Updates progress in real-time
  - Applies anonymization if requested
- File Generation
  - Formatter converts data to selected format
  - File stored in S3/MinIO (production) or memory (demo)
  - Download URL generated
- User Notification
  - Email sent when export completes
  - User can download from Export History
- Automatic Cleanup
  - Exports expire after 7 days
  - Files automatically deleted
  - Cleanup runs hourly
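The background processing step (batches plus real-time progress) can be sketched as follows. This is an illustration under stated assumptions, not the actual `DataExporter`: `processInBatches` is a hypothetical helper, it works synchronously over an in-memory array rather than paging through GraphQL, and the progress fields borrow the names `progress`, `itemsProcessed`, and `itemsTotal` from the status response documented on this page.

```typescript
// Hypothetical helper: walks items in fixed-size batches and reports progress
// after each batch. The real exporter would fetch each batch asynchronously.
interface ExportProgress {
  progress: number // percentage, 0-100
  itemsProcessed: number
  itemsTotal: number
}

function processInBatches<T>(
  items: T[],
  batchSize: number,
  handleBatch: (batch: T[]) => void,
  onProgress: (p: ExportProgress) => void
): void {
  if (batchSize <= 0) throw new Error('batchSize must be positive')
  const itemsTotal = items.length
  for (let start = 0; start < itemsTotal; start += batchSize) {
    const batch = items.slice(start, start + batchSize)
    handleBatch(batch)
    const itemsProcessed = start + batch.length
    onProgress({
      progress: Math.round((itemsProcessed / itemsTotal) * 100),
      itemsProcessed,
      itemsTotal,
    })
  }
}
```

Reporting once per batch rather than once per item keeps progress updates cheap while still giving the UI a steadily moving percentage.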
The export system supports these GDPR rights:
- Right to Access (Article 15)
  - Users can export all their personal data
  - Includes messages, files, reactions, profile
- Right to Data Portability (Article 20)
  - Machine-readable formats (JSON, CSV)
  - Structured, commonly used formats
  - Easy to transfer to other services
- Right to Erasure (Article 17)
  - Users can export before requesting deletion
  - Provides backup of their data
When enabled:
- User names → "User 1", "User 2", etc.
- Email addresses → [email protected]
- Avatar URLs → Removed
- Message content → Preserved (user's data)
- Timestamps → Preserved
- Channel names → Preserved
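The masking rules above can be sketched as a small mapping function. This is a minimal illustration, not the project's implementation: the `ExportedUser` shape and `anonymizeUsers` helper are hypothetical, user ids are assumed to be kept so messages can still be linked to their (now numbered) authors, and the placeholder email address is an assumption because the actual masked value is not given here.

```typescript
// Hypothetical user shape for illustration.
interface ExportedUser {
  id: string
  displayName: string
  email: string
  avatarUrl?: string
}

// Returns anonymized copies, numbering users in the order they are given.
function anonymizeUsers(users: ExportedUser[]): ExportedUser[] {
  return users.map((user, index) => ({
    id: user.id, // assumption: ids are kept so messages stay attributable
    displayName: `User ${index + 1}`,
    // Assumed placeholder; the ".invalid" TLD is reserved and never resolves.
    email: `user${index + 1}@anonymized.invalid`,
    // avatarUrl is deliberately left out, which removes it from the export
  }))
}
```

Message content, timestamps, and channel names are not touched by this function, which matches the "Preserved" entries in the list.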
- Authentication: Only authenticated users can export
- Authorization: Users can only export their accessible data
- Automatic Expiry: Files deleted after 7 days
- Rate Limiting: Caps the number of export requests per day to prevent abuse
- Audit Logging: All exports logged
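As one way to picture the rate-limiting item, here is a minimal in-memory sketch of a per-user daily cap. `DailyExportLimiter` is a hypothetical class, the per-user granularity and UTC day boundary are assumptions, and a multi-instance deployment would need a shared store instead of a local `Map`.

```typescript
// Hypothetical limiter: counts export requests per user per UTC day.
// Old day keys are never purged here; a real implementation would expire them.
class DailyExportLimiter {
  private readonly counts = new Map<string, number>()
  private readonly limitPerDay: number

  constructor(limitPerDay: number) {
    this.limitPerDay = limitPerDay
  }

  // Returns true and records the request if the user is still under the limit.
  tryAcquire(userId: string, now: Date = new Date()): boolean {
    const day = now.toISOString().slice(0, 10) // YYYY-MM-DD in UTC
    const key = `${userId}:${day}`
    const used = this.counts.get(key) ?? 0
    if (used >= this.limitPerDay) return false
    this.counts.set(key, used + 1)
    return true
  }
}
```

The limit itself could come from a setting such as `EXPORT_RATE_LIMIT_PER_DAY`, which appears in the configuration listed on this page.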
POST /api/export
Content-Type: application/json
{
"options": {
"scope": "all_messages",
"format": "json",
"fromDate": "2024-01-01T00:00:00.000Z",
"toDate": "2024-12-31T23:59:59.999Z",
"includeFiles": true,
"includeReactions": true,
"includeThreads": true,
"includeEdits": false,
"anonymize": false,
"includeUserData": true,
"includeChannelData": true
},
"userId": "user-id"
}
Response:
{
"success": true,
"exportId": "uuid",
"message": "Export request created. Processing in background.",
"estimatedTime": "5-10 minutes"
}
GET /api/export?id={exportId}&action=status
Response:
{
"success": true,
"export": {
"id": "uuid",
"status": "processing",
"progress": 75,
"itemsProcessed": 750,
"itemsTotal": 1000,
"createdAt": "2024-01-01T00:00:00.000Z",
"completedAt": null,
"expiresAt": "2024-01-08T00:00:00.000Z",
"downloadUrl": null,
"fileName": null,
"fileSize": null,
"errorMessage": null
}
}
GET /api/export?id={exportId}&action=download
Response: File download with appropriate Content-Type
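The create, status, and download calls above can be assembled on the client with a few small helpers. This is a sketch under stated assumptions: `HttpRequest`, `createExportRequest`, and `exportActionRequest` are hypothetical names, and only the paths, query parameters, and body shape come from the examples on this page.

```typescript
// Hypothetical request description, independent of any HTTP library.
interface HttpRequest {
  method: 'GET' | 'POST'
  url: string
  headers?: Record<string, string>
  body?: string
}

type ExportAction = 'status' | 'download'

// Builds the POST /api/export request with the documented { options, userId } body.
function createExportRequest(options: Record<string, unknown>, userId: string): HttpRequest {
  return {
    method: 'POST',
    url: '/api/export',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ options, userId }),
  }
}

// Builds GET /api/export?id=...&action=status|download, escaping the id.
function exportActionRequest(exportId: string, action: ExportAction): HttpRequest {
  const query = `id=${encodeURIComponent(exportId)}&action=${action}`
  return { method: 'GET', url: `/api/export?${query}` }
}
```

A request built this way can be handed to the standard Fetch API, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Polling the status request until `status` is no longer `pending` or `processing` is one simple way to follow progress.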
DELETE /api/export?id={exportId}
Response:
{
"success": true,
"message": "Export cancelled/deleted"
}
- `pending` - Queued, not started yet
- `processing` - Currently being processed
- `completed` - Ready for download
- `failed` - Processing error occurred
- `cancelled` - User cancelled the export
- `expired` - Download link expired (7 days)
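The status values lend themselves to a few small guard functions. The sketch below is illustrative: the helper names are hypothetical, the cancel rule follows the "pending/processing" wording used for Export History, and treating a completed export as expired once `expiresAt` has passed is an assumption about how expiry is applied.

```typescript
type ExportStatus = 'pending' | 'processing' | 'completed' | 'failed' | 'cancelled' | 'expired'

// Cancellation is only offered while the export is queued or running.
function canCancel(status: ExportStatus): boolean {
  return status === 'pending' || status === 'processing'
}

// Assumption: a completed export counts as expired once expiresAt is reached.
function effectiveStatus(status: ExportStatus, expiresAt: Date, now: Date): ExportStatus {
  if (status === 'completed' && now.getTime() >= expiresAt.getTime()) return 'expired'
  return status
}

// A download link is only usable for completed, unexpired exports.
function canDownload(status: ExportStatus, expiresAt: Date, now: Date): boolean {
  return effectiveStatus(status, expiresAt, now) === 'completed'
}
```

Deriving `expired` from the timestamp at read time means a stale `completed` record never exposes a download link, even between hourly cleanup runs.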
| Feature | JSON | CSV | HTML | PDF |
|---|---|---|---|---|
| Machine Readable | ✅ | ✅ | ❌ | |
| Human Readable | ✅ | ✅ | | |
| Preserves Structure | ✅ | ❌ | ✅ | ✅ |
| Includes Metadata | ✅ | ❌ | ✅ | ✅ |
| Spreadsheet Compatible | ❌ | ✅ | ❌ | ❌ |
| Printable | ❌ | ❌ | ✅ | ✅ |
| File Size | Medium | Small | Large | Large |
| Processing Speed | Fast | Fast | Medium | Slow |
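To make the CSV column of the table concrete, here is a minimal sketch of a flat formatter: one row per message, with nested structure and metadata dropped. The `MessageRow` shape, column names, and function names are hypothetical and not taken from `formatters.ts`. Quoting follows RFC 4180: fields containing commas, double quotes, or line breaks are wrapped in double quotes, and embedded quotes are doubled.

```typescript
// Hypothetical flat row; threads, reactions and other nested data do not fit here.
interface MessageRow {
  id: string
  channel: string
  author: string
  content: string
  createdAt: string
}

// Quotes a field only when it contains a comma, double quote, or line break.
function escapeCsvField(value: string): string {
  if (/[",\r\n]/.test(value)) {
    return `"${value.replace(/"/g, '""')}"`
  }
  return value
}

// Produces a header line plus one CRLF-separated line per message.
function messagesToCsv(rows: MessageRow[]): string {
  const header = ['id', 'channel', 'author', 'content', 'createdAt']
  const lines = rows.map((row) =>
    [row.id, row.channel, row.author, row.content, row.createdAt]
      .map(escapeCsvField)
      .join(',')
  )
  return [header.join(','), ...lines].join('\r\n')
}
```

Because every message becomes a single flat row, this format is small and spreadsheet-friendly, at the cost of the structure that JSON keeps.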
// Full export: all messages as JSON, including files, reactions, and threads
const options: ExportOptions = {
scope: 'all_messages',
format: 'json',
includeFiles: true,
includeReactions: true,
includeThreads: true,
includeUserData: true,
includeChannelData: true,
}

// Direct messages only, as CSV without files, reactions, or threads
const options: ExportOptions = {
scope: 'direct_messages',
format: 'csv',
includeFiles: false,
includeReactions: false,
includeThreads: false,
}

// Anonymized export with edit history, as JSON
const options: ExportOptions = {
scope: 'all_messages',
format: 'json',
includeFiles: true,
includeReactions: true,
includeThreads: true,
includeEdits: true,
anonymize: true,
includeUserData: true,
includeChannelData: true,
}

// Date-range export of 2024 messages, as HTML
const options: ExportOptions = {
scope: 'all_messages',
format: 'html',
fromDate: new Date('2024-01-01'),
toDate: new Date('2024-12-31'),
includeFiles: true,
includeReactions: true,
includeThreads: true,
}
- Batching: Fetch 100-1000 messages per batch
- Streaming: Stream large exports to avoid memory limits
- Compression: Gzip exports before storage
- Caching: Cache frequently accessed data
- Indexing: Database indexes on key fields
- Maximum Messages: 1,000,000 per export
- Maximum File Size: 500 MB (configurable)
- Maximum Files: 10,000 attachments
- Batch Size: 100-1000 messages
| Data Volume | Estimated Time |
|---|---|
| < 1,000 messages | 1-2 minutes |
| 1,000-10,000 messages | 2-5 minutes |
| 10,000-100,000 messages | 5-15 minutes |
| > 100,000 messages | 15-60 minutes |
- Check background worker status
- Review server logs for errors
- Verify database connectivity
- Check queue system health
- Create a new export request
- Download within 7 days
- Consider setting up automatic backups
- Verify user has access to channels
- Check date range filters
- Ensure proper permissions
- Review export scope settings
- Use CSV instead of JSON/HTML
- Exclude file attachments
- Use date range filters
- Export specific channels
- Message Queue: Bull/BullMQ
- File Storage: AWS S3 or MinIO
- Email Service: SendGrid or AWS SES
- Database: PostgreSQL
- Redis: For queue management
STORAGE_ENDPOINT=https://s3.amazonaws.com
STORAGE_BUCKET=exports
STORAGE_ACCESS_KEY=...
STORAGE_SECRET_KEY=...
REDIS_URL=redis://localhost:6379
[email protected]
SENDGRID_API_KEY=...
EXPORT_EXPIRY_DAYS=7
EXPORT_MAX_SIZE_MB=500
EXPORT_RATE_LIMIT_PER_DAY=5
Track these metrics:
- Export request rate
- Processing duration
- File sizes
- Error rates
- Queue depth
- Storage usage
- Download rates
- Scheduled recurring exports
- Export to cloud storage (Google Drive, Dropbox)
- Multi-format archives (ZIP with all formats)
- Incremental exports (changes since last export)
- Advanced filtering (by keyword, user, type)
- Export templates
- Webhook notifications
- Compression options
For issues or questions:
- GitHub Issues: nself-chat/issues
- Documentation: docs.nself.org
- Email: [email protected]