
📄 FlashGenie File Formats Guide

A guide to every import and export format FlashGenie supports. It covers each format's structure, its configuration options, and the commands for converting between formats.

🎯 Supported Formats Overview

FlashGenie supports multiple file formats for maximum compatibility:

Import Formats

  • CSV - Comma-separated values (most common)
  • TSV - Tab-separated values
  • TXT - Plain text with structured format
  • JSON - JavaScript Object Notation (FlashGenie native)
  • XML - Extensible Markup Language
  • APKG - Anki package format (via plugin)
  • XLSX - Excel spreadsheet format

Export Formats

  • JSON - FlashGenie native format (recommended)
  • CSV - Universal compatibility
  • TSV - Tab-separated for Excel
  • TXT - Human-readable text
  • XML - Structured markup
  • HTML - Web-ready format
  • PDF - Printable format (via plugin)
  • APKG - Anki package format (via plugin)

📊 CSV Format (Comma-Separated Values)

Basic CSV Structure

The most common and recommended import format:

question,answer,tags,difficulty
"What is Python?","A programming language","programming,basics",1.0
"Define API","Application Programming Interface","programming,concepts",1.5
"What is ML?","Machine Learning","AI,concepts",2.0

Advanced CSV with Metadata

question,answer,tags,difficulty,category,source,notes
"What is Python?","A programming language","programming,basics",1.0,"Programming","Course 101","Fundamental concept"
"Define API","Application Programming Interface","programming,concepts",1.5,"Programming","Documentation","Important for web dev"
"What is ML?","Machine Learning","AI,concepts",2.0,"AI","Research Paper","Growing field"

CSV Configuration Options

{
  "csv_import": {
    "delimiter": ",",                    // Field separator
    "quote_char": "\"",                  // Quote character
    "escape_char": "\\",                 // Escape character
    "encoding": "utf-8",                 // File encoding
    "skip_header": true,                 // Skip first row
    "question_column": "question",       // Question column name
    "answer_column": "answer",           // Answer column name
    "tags_column": "tags",               // Tags column name
    "difficulty_column": "difficulty",   // Difficulty column name
    "tag_separator": ",",                // Tag separator within cell
    "null_values": ["", "NULL", "null"]  // Values treated as null
  }
}
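To make the options above concrete, here is a minimal sketch of how a reader might apply them with Python's standard `csv` module. It is not FlashGenie's own importer: `CSV_CONFIG` and `parse_csv_cards` are hypothetical names, and the handling of `tag_separator` and `null_values` is an assumption based on the option descriptions.

```python
import csv
import io

# Hypothetical settings mirroring a subset of the "csv_import" keys above.
CSV_CONFIG = {
    "delimiter": ",",
    "quote_char": '"',
    "question_column": "question",
    "answer_column": "answer",
    "tags_column": "tags",
    "difficulty_column": "difficulty",
    "tag_separator": ",",
    "null_values": ["", "NULL", "null"],
}


def parse_csv_cards(text, config=CSV_CONFIG):
    """Parse CSV text into a list of card dicts (illustrative only)."""
    # DictReader consumes the first row as the header.
    reader = csv.DictReader(
        io.StringIO(text),
        delimiter=config["delimiter"],
        quotechar=config["quote_char"],
    )
    nulls = set(config["null_values"])
    cards = []
    for row in reader:
        def cell(key):
            # Look up a column by its configured name; map null markers to None.
            value = (row.get(config[key]) or "").strip()
            return None if value in nulls else value

        tags = cell("tags_column")
        difficulty = cell("difficulty_column")
        cards.append({
            "question": cell("question_column"),
            "answer": cell("answer_column"),
            "tags": [t.strip() for t in tags.split(config["tag_separator"])] if tags else [],
            "difficulty": float(difficulty) if difficulty is not None else None,
        })
    return cards


sample = (
    "question,answer,tags,difficulty\n"
    '"What is Python?","A programming language","programming,basics",1.0\n'
    '"Define API","Application Programming Interface","programming,concepts",NULL\n'
)
cards = parse_csv_cards(sample)
print(cards[0]["tags"], cards[1]["difficulty"])
```

The second sample row uses `NULL` for the difficulty to show how a configured null marker could become a missing value rather than a parse error.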

CSV Import Examples

# Basic CSV import
flashgenie import my_cards.csv --deck "My Deck"

# CSV with custom delimiter
flashgenie import data.csv --delimiter ";" --deck "European Data"

# CSV with custom column mapping
flashgenie import cards.csv --question-col "Q" --answer-col "A" --deck "Custom Format"

# CSV with encoding specification
flashgenie import cards.csv --encoding "latin1" --deck "Legacy Data"

📝 TXT Format (Plain Text)

Standard TXT Format

Simple question-answer pairs:

Q: What is Python?
A: A programming language

Q: Define API
A: Application Programming Interface

Q: What is ML?
A: Machine Learning

Extended TXT Format with Metadata

Q: What is Python?
A: A programming language
Tags: programming, basics
Difficulty: 1.0
Category: Programming

---

Q: Define API
A: Application Programming Interface
Tags: programming, concepts
Difficulty: 1.5
Category: Programming

---

Q: What is ML?
A: Machine Learning
Tags: AI, concepts
Difficulty: 2.0
Category: Artificial Intelligence

TXT Configuration

{
  "txt_import": {
    "question_prefix": "Q:",             // Question line prefix
    "answer_prefix": "A:",               // Answer line prefix
    "tags_prefix": "Tags:",              // Tags line prefix
    "difficulty_prefix": "Difficulty:",  // Difficulty line prefix
    "separator": "---",                  // Card separator
    "encoding": "utf-8",                 // File encoding
    "strip_whitespace": true,            // Remove extra whitespace
    "ignore_case": true                  // Ignore case for prefixes
  }
}
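The prefix and separator options can be pictured with a small parser. This is a sketch under assumptions, not FlashGenie's implementation: `parse_txt_cards` is a hypothetical helper, a new question line is assumed to start a new card, and lines with unconfigured prefixes (such as `Category:`) are simply ignored.

```python
# Hypothetical settings mirroring a subset of the "txt_import" keys above.
TXT_CONFIG = {
    "question_prefix": "Q:",
    "answer_prefix": "A:",
    "tags_prefix": "Tags:",
    "difficulty_prefix": "Difficulty:",
    "separator": "---",
}

# Maps output field names to the config keys holding their prefixes.
FIELDS = [
    ("question", "question_prefix"),
    ("answer", "answer_prefix"),
    ("tags", "tags_prefix"),
    ("difficulty", "difficulty_prefix"),
]


def parse_txt_cards(text, config=TXT_CONFIG):
    """Parse Q:/A: text into card dicts (illustrative only)."""
    cards, current = [], {}

    def flush():
        # Keep a card only if both required fields were seen.
        if "question" in current and "answer" in current:
            cards.append(dict(current))
        current.clear()

    for raw_line in text.splitlines():
        line = raw_line.strip()  # corresponds to strip_whitespace
        if not line:
            continue
        if line == config["separator"]:
            flush()
            continue
        for field, key in FIELDS:
            prefix = config[key]
            # Case-insensitive prefix match, corresponding to ignore_case.
            if line.lower().startswith(prefix.lower()):
                value = line[len(prefix):].strip()
                if field == "question":
                    flush()  # a new question starts a new card
                if field == "tags":
                    value = [t.strip() for t in value.split(",") if t.strip()]
                elif field == "difficulty":
                    value = float(value)
                current[field] = value
                break
    flush()
    return cards


sample = (
    "Q: What is Python?\nA: A programming language\n"
    "Tags: programming, basics\nDifficulty: 1.0\nCategory: Programming\n"
    "\n---\n\n"
    "q: Define API\nA: Application Programming Interface\n"
)
txt_cards = parse_txt_cards(sample)
print(len(txt_cards), txt_cards[0]["tags"])
```

Because a question line also closes the previous card, the same function handles the standard format, where cards are separated only by blank lines.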

🔧 JSON Format (Native)

FlashGenie Native JSON

The most complete format preserving all metadata:

{
  "deck": {
    "id": "deck_12345",
    "name": "Programming Basics",
    "description": "Fundamental programming concepts",
    "created_at": "2023-01-01T00:00:00Z",
    "updated_at": "2023-01-15T12:30:00Z",
    "version": "1.5.0",
    "metadata": {
      "author": "John Doe",
      "source": "CS101 Course",
      "difficulty_level": "beginner"
    }
  },
  "flashcards": [
    {
      "id": "card_001",
      "question": "What is Python?",
      "answer": "A programming language",
      "tags": ["programming", "basics"],
      "difficulty": 1.0,
      "created_at": "2023-01-01T00:00:00Z",
      "updated_at": "2023-01-01T00:00:00Z",
      "review_count": 5,
      "correct_count": 4,
      "last_reviewed": "2023-01-10T14:30:00Z",
      "next_review": "2023-01-12T14:30:00Z",
      "ease_factor": 2.5,
      "interval": 2,
      "metadata": {
        "source_line": 1,
        "confidence_level": 0.8
      }
    }
  ],
  "statistics": {
    "total_cards": 1,
    "total_reviews": 5,
    "average_difficulty": 1.0,
    "success_rate": 0.8
  }
}
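The `statistics` block in this example is consistent with values derived from the per-card fields. The sketch below shows one plausible derivation; the formulas (for instance, `success_rate` as total correct answers divided by total reviews) are an assumption that happens to match the sample numbers, and `deck_statistics` is a hypothetical helper rather than a FlashGenie function.

```python
def deck_statistics(flashcards):
    """Derive deck-level statistics from per-card review fields (illustrative)."""
    total_reviews = sum(card.get("review_count", 0) for card in flashcards)
    total_correct = sum(card.get("correct_count", 0) for card in flashcards)
    difficulties = [card["difficulty"] for card in flashcards if "difficulty" in card]
    return {
        "total_cards": len(flashcards),
        "total_reviews": total_reviews,
        # Guard against empty decks and decks that were never reviewed.
        "average_difficulty": sum(difficulties) / len(difficulties) if difficulties else 0.0,
        "success_rate": total_correct / total_reviews if total_reviews else 0.0,
    }


stats = deck_statistics([
    {"id": "card_001", "difficulty": 1.0, "review_count": 5, "correct_count": 4},
])
print(stats)
```

With the single sample card, 4 correct answers out of 5 reviews gives the `success_rate` of 0.8 shown in the JSON above.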

Simplified JSON Format

For basic use cases:

{
  "cards": [
    {
      "question": "What is Python?",
      "answer": "A programming language",
      "tags": ["programming", "basics"]
    },
    {
      "question": "Define API",
      "answer": "Application Programming Interface",
      "tags": ["programming", "concepts"]
    }
  ]
}
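Since the native and simplified layouts differ mainly in their top-level key (`flashcards` versus `cards`), a reader can normalize both. The following is a minimal sketch of that idea, assuming a hypothetical `load_cards` helper; FlashGenie's actual loader may behave differently.

```python
import json


def load_cards(text):
    """Read cards from either the native or the simplified JSON layout."""
    data = json.loads(text)
    # Native files use "flashcards"; simplified files use "cards".
    raw_cards = data.get("flashcards") or data.get("cards") or []
    cards = []
    for index, raw in enumerate(raw_cards):
        missing = [field for field in ("question", "answer") if not raw.get(field)]
        if missing:
            raise ValueError(f"card {index} is missing: {', '.join(missing)}")
        cards.append({
            "question": raw["question"],
            "answer": raw["answer"],
            "tags": list(raw.get("tags", [])),
            "difficulty": raw.get("difficulty"),  # None when not provided
        })
    return cards


simplified = '{"cards": [{"question": "Define API", "answer": "Application Programming Interface", "tags": ["programming", "concepts"]}]}'
native = '{"deck": {"name": "Demo"}, "flashcards": [{"question": "What is Python?", "answer": "A programming language", "difficulty": 1.0}]}'
print(load_cards(simplified)[0]["tags"], load_cards(native)[0]["difficulty"])
```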

📋 XML Format

Standard XML Structure

<?xml version="1.0" encoding="UTF-8"?>
<flashcards>
  <deck>
    <name>Programming Basics</name>
    <description>Fundamental programming concepts</description>
    <created>2023-01-01T00:00:00Z</created>
  </deck>
  <cards>
    <card id="card_001">
      <question>What is Python?</question>
      <answer>A programming language</answer>
      <tags>
        <tag>programming</tag>
        <tag>basics</tag>
      </tags>
      <difficulty>1.0</difficulty>
      <metadata>
        <created>2023-01-01T00:00:00Z</created>
        <reviews>5</reviews>
        <correct>4</correct>
      </metadata>
    </card>
  </cards>
</flashcards>
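A document with this structure can be read with Python's standard `xml.etree.ElementTree` module. The sketch below assumes the element names shown above and uses a hypothetical `parse_xml_cards` helper; it is an illustration, not FlashGenie's XML importer.

```python
import xml.etree.ElementTree as ET


def parse_xml_cards(xml_bytes):
    """Extract card dicts from the <flashcards> structure (illustrative)."""
    root = ET.fromstring(xml_bytes)
    cards = []
    for card in root.findall("./cards/card"):
        difficulty = card.findtext("difficulty")
        cards.append({
            "id": card.get("id"),
            "question": card.findtext("question", default="").strip(),
            "answer": card.findtext("answer", default="").strip(),
            "tags": [tag.text.strip() for tag in card.findall("./tags/tag") if tag.text],
            "difficulty": float(difficulty) if difficulty else None,
        })
    return cards


sample_xml = b"""<?xml version="1.0" encoding="UTF-8"?>
<flashcards>
  <cards>
    <card id="card_001">
      <question>What is Python?</question>
      <answer>A programming language</answer>
      <tags><tag>programming</tag><tag>basics</tag></tags>
      <difficulty>1.0</difficulty>
    </card>
  </cards>
</flashcards>
"""
xml_cards = parse_xml_cards(sample_xml)
print(xml_cards[0])
```

The sample is passed as bytes so that the parser honors the encoding named in the XML declaration.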

📈 Excel/XLSX Format

Excel Spreadsheet Structure

| Question | Answer | Tags | Difficulty | Category | Notes |
|----------|--------|------|------------|----------|-------|
| What is Python? | A programming language | programming,basics | 1.0 | Programming | Core concept |
| Define API | Application Programming Interface | programming,concepts | 1.5 | Programming | Important |

Excel Import Configuration

{
  "xlsx_import": {
    "sheet_name": "Sheet1",              // Worksheet name
    "header_row": 1,                     // Header row number
    "data_start_row": 2,                 // First data row
    "question_column": "A",              // Question column
    "answer_column": "B",                // Answer column
    "tags_column": "C",                  // Tags column
    "difficulty_column": "D",            // Difficulty column
    "skip_empty_rows": true,             // Skip empty rows
    "max_rows": null                     // Maximum rows to read
  }
}
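The column letters and row numbers above can be applied to rows once a spreadsheet library (for example openpyxl) has read them into tuples. The sketch below covers only that mapping step using the standard library; `column_index` and `rows_to_cards` are hypothetical helpers, and the interpretation of `data_start_row` as a 1-based worksheet row is an assumption.

```python
# Hypothetical settings mirroring a subset of the "xlsx_import" keys above.
XLSX_CONFIG = {
    "data_start_row": 2,
    "question_column": "A",
    "answer_column": "B",
    "tags_column": "C",
    "difficulty_column": "D",
    "skip_empty_rows": True,
    "max_rows": None,
}


def column_index(letters):
    """Convert a column letter such as "A" or "AB" to a zero-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def rows_to_cards(rows, config=XLSX_CONFIG):
    """Map worksheet rows (tuples, rows[0] being row 1) to card dicts."""
    cols = {
        name: column_index(config[f"{name}_column"])
        for name in ("question", "answer", "tags", "difficulty")
    }
    data = rows[config["data_start_row"] - 1:]
    if config["max_rows"] is not None:
        data = data[: config["max_rows"]]
    cards = []
    for row in data:
        if config["skip_empty_rows"] and all(v in (None, "") for v in row):
            continue
        values = {n: (row[i] if i < len(row) else None) for n, i in cols.items()}
        cards.append({
            "question": values["question"],
            "answer": values["answer"],
            "tags": [t.strip() for t in str(values["tags"] or "").split(",") if t.strip()],
            "difficulty": values["difficulty"],
        })
    return cards


rows = [
    ("Question", "Answer", "Tags", "Difficulty", "Category", "Notes"),
    ("What is Python?", "A programming language", "programming,basics", 1.0, "Programming", "Core concept"),
    (None, None, None, None, None, None),
    ("Define API", "Application Programming Interface", "programming,concepts", 1.5, "Programming", "Important"),
]
xlsx_cards = rows_to_cards(rows)
print(len(xlsx_cards), column_index("AB"))
```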

🎴 Anki Format (APKG)

Anki Package Support

FlashGenie can import/export Anki packages with the Anki plugin:

# Install Anki plugin
flashgenie plugin install anki-importer

# Import Anki deck
flashgenie import my_deck.apkg --format anki

# Export to Anki format
flashgenie export "My Deck" my_export.apkg --format anki

Anki Text Format

What is Python?	A programming language	programming,basics
Define API	Application Programming Interface	programming,concepts
What is ML?	Machine Learning	AI,concepts
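The text format above is tab-separated, with the question, the answer, and comma-separated tags in that order. A minimal sketch of reading it with the standard `csv` module follows; `parse_anki_text` is a hypothetical helper, and treating the third column as tags is an assumption drawn from the sample rows.

```python
import csv
import io


def parse_anki_text(text):
    """Parse tab-separated question/answer/tags lines (illustrative)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    cards = []
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        raw_tags = row[2] if len(row) > 2 else ""
        cards.append({
            "question": row[0],
            "answer": row[1],
            "tags": [t.strip() for t in raw_tags.split(",") if t.strip()],
        })
    return cards


anki_sample = (
    "What is Python?\tA programming language\tprogramming,basics\n"
    "What is ML?\tMachine Learning\n"
)
anki_cards = parse_anki_text(anki_sample)
print(anki_cards)
```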

🌐 HTML Export Format

HTML Study Cards

<!DOCTYPE html>
<html>
<head>
    <title>FlashGenie Study Cards</title>
    <style>
        .card { border: 1px solid #ccc; margin: 10px; padding: 15px; }
        .question { font-weight: bold; color: #333; }
        .answer { color: #666; margin-top: 10px; }
        .tags { font-size: 0.8em; color: #999; }
    </style>
</head>
<body>
    <h1>Programming Basics</h1>
    <div class="card">
        <div class="question">What is Python?</div>
        <div class="answer">A programming language</div>
        <div class="tags">Tags: programming, basics</div>
    </div>
</body>
</html>
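Markup like the page above could be generated from card data along the following lines. This is a sketch, not FlashGenie's exporter: `render_html` is a hypothetical function, the stylesheet is omitted for brevity, and escaping with `html.escape` is added here as a precaution for card text that contains characters such as `<`.

```python
from html import escape


def render_html(deck_name, cards):
    """Render cards as a simple HTML page (illustrative only)."""
    lines = [
        "<!DOCTYPE html>",
        "<html>",
        "<head>",
        "    <title>FlashGenie Study Cards</title>",
        "</head>",
        "<body>",
        f"    <h1>{escape(deck_name)}</h1>",
    ]
    for card in cards:
        tags = ", ".join(card.get("tags", []))
        lines += [
            '    <div class="card">',
            f'        <div class="question">{escape(card["question"])}</div>',
            f'        <div class="answer">{escape(card["answer"])}</div>',
            f'        <div class="tags">Tags: {escape(tags)}</div>',
            "    </div>",
        ]
    lines += ["</body>", "</html>"]
    return "\n".join(lines)


page = render_html(
    "Programming Basics",
    [{"question": "Is 1 < 2?", "answer": "Yes", "tags": ["math", "basics"]}],
)
print(page)
```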

🔄 Format Conversion

Converting Between Formats

# Convert CSV to JSON
flashgenie convert input.csv output.json --from csv --to json

# Convert JSON to Anki format
flashgenie convert deck.json output.apkg --from json --to anki

# Batch convert multiple files
flashgenie convert *.csv --to json --output-dir converted/

# Convert with custom settings
flashgenie convert data.csv output.json --csv-delimiter ";" --encoding "latin1"

Conversion Configuration

{
  "conversion": {
    "preserve_metadata": true,           // Keep all metadata
    "validate_output": true,             // Validate converted data
    "backup_original": true,             // Backup original files
    "overwrite_existing": false,         // Overwrite existing files
    "batch_size": 1000,                  // Process in batches
    "error_handling": "skip"             // Error handling: "skip", "stop", "log"
  }
}
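The three `error_handling` modes can be illustrated with a small CSV-to-JSON conversion. The semantics below are an assumption: "skip" drops bad rows silently, "log" drops them and records a message, and "stop" aborts on the first bad row. `convert_csv_to_json` is a hypothetical function, not the code behind `flashgenie convert`.

```python
import csv
import io
import json


def convert_csv_to_json(csv_text, error_handling="skip"):
    """Convert CSV text to simplified JSON; returns (json_text, errors)."""
    cards, errors = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Data starts on line 2 because line 1 is the header
    # (assumes no cell spans multiple lines).
    for line_no, row in enumerate(reader, start=2):
        try:
            question = (row.get("question") or "").strip()
            answer = (row.get("answer") or "").strip()
            if not question or not answer:
                raise ValueError("missing question or answer")
            cards.append({"question": question, "answer": answer})
        except ValueError as exc:
            if error_handling == "stop":
                raise ValueError(f"line {line_no}: {exc}") from exc
            if error_handling == "log":
                errors.append(f"line {line_no}: {exc}")
            # "skip": drop the row without recording anything
    return json.dumps({"cards": cards}, indent=2), errors


csv_text = (
    "question,answer\n"
    "What is Python?,A programming language\n"
    ",orphan answer\n"
)
json_text, logged = convert_csv_to_json(csv_text, error_handling="log")
print(json_text)
print(logged)
```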

📋 Format Validation

Validation Rules

FlashGenie validates imported data against the following rules:

{
  "validation": {
    "required_fields": ["question", "answer"],
    "max_question_length": 1000,
    "max_answer_length": 2000,
    "max_tags_per_card": 20,
    "difficulty_range": [0.1, 5.0],
    "allowed_characters": "unicode",
    "duplicate_detection": true,
    "empty_field_handling": "warn"
  }
}
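As a rough picture of how such rules might be checked, the sketch below applies a subset of them to a list of card dicts. `validate_cards` is a hypothetical function, and details such as case-insensitive duplicate detection are assumptions rather than documented FlashGenie behavior.

```python
# Hypothetical rules mirroring a subset of the "validation" keys above.
RULES = {
    "required_fields": ["question", "answer"],
    "max_question_length": 1000,
    "max_answer_length": 2000,
    "max_tags_per_card": 20,
    "difficulty_range": [0.1, 5.0],
    "duplicate_detection": True,
}


def validate_cards(cards, rules=RULES):
    """Return a list of (card_index, problem) tuples (illustrative)."""
    problems = []
    seen = set()
    low, high = rules["difficulty_range"]
    for index, card in enumerate(cards):
        for field in rules["required_fields"]:
            if not (card.get(field) or "").strip():
                problems.append((index, f"missing {field}"))
        if len(card.get("question") or "") > rules["max_question_length"]:
            problems.append((index, "question too long"))
        if len(card.get("answer") or "") > rules["max_answer_length"]:
            problems.append((index, "answer too long"))
        if len(card.get("tags") or []) > rules["max_tags_per_card"]:
            problems.append((index, "too many tags"))
        difficulty = card.get("difficulty")
        if difficulty is not None and not low <= difficulty <= high:
            problems.append((index, "difficulty out of range"))
        # Assumption: duplicates are matched on the normalized question text.
        key = (card.get("question") or "").strip().lower()
        if rules["duplicate_detection"] and key:
            if key in seen:
                problems.append((index, "duplicate question"))
            seen.add(key)
    return problems


problems = validate_cards([
    {"question": "What is Python?", "answer": "A programming language", "difficulty": 1.0},
    {"question": "what is python?", "answer": "", "difficulty": 9.0},
])
print(problems)
```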

Validation Commands

# Validate file before import
flashgenie validate input.csv --format csv

# Validate with strict rules
flashgenie validate input.json --strict

# Get validation report
flashgenie validate input.csv --report validation_report.txt

# Auto-fix common issues
flashgenie validate input.csv --auto-fix --output fixed_input.csv

🛠️ Custom Format Support

Creating Custom Importers

from flashgenie.data.importers.base_importer import BaseImporter

class CustomImporter(BaseImporter):
    """Custom format importer."""
    
    supported_extensions = ['.custom']
    
    def import_from_file(self, file_path, config=None):
        """Import from custom format."""
        cards = []
        
        with open(file_path, 'r', encoding='utf-8') as f:
            # Custom parsing logic
            content = f.read()
            cards = self.parse_custom_format(content)
        
        return cards
    
    def parse_custom_format(self, content):
        """Parse custom format content."""
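        # Replace with real parsing logic that returns a list of cards.
        # Raising here avoids silently returning None from import_from_file.
        raise NotImplementedError("parse_custom_format must be implemented")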
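For a concrete picture, here is a self-contained sketch of an importer for an invented one-line-per-card format (`question => answer`). Everything in it is hypothetical: the `BaseImporter` class is a stand-in defined locally so the example runs without FlashGenie installed, and the cards are plain dicts, whereas the real base class may expect a different return type.

```python
import os
import tempfile


class BaseImporter:
    """Local stand-in for FlashGenie's base class (assumption, not the real API)."""

    supported_extensions = []


class ArrowImporter(BaseImporter):
    """Importer for an invented format: one 'question => answer' per line."""

    supported_extensions = [".custom"]

    def import_from_file(self, file_path, config=None):
        with open(file_path, "r", encoding="utf-8") as f:
            return self.parse_custom_format(f.read())

    def parse_custom_format(self, content):
        cards = []
        for line in content.splitlines():
            question, sep, answer = line.partition("=>")
            # Skip lines without the separator or with an empty side.
            if sep and question.strip() and answer.strip():
                cards.append({"question": question.strip(), "answer": answer.strip()})
        return cards


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.custom")
    with open(path, "w", encoding="utf-8") as f:
        f.write("What is Python? => A programming language\nnot a card\n")
    imported = ArrowImporter().import_from_file(path)

print(imported)
```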

Registering Custom Formats

# Register custom importer
flashgenie plugin register custom-importer

# Use custom format
flashgenie import data.custom --format custom

📊 Format Comparison

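| Format | Import | Export | Metadata | Compatibility | File Size |
|--------|--------|--------|----------|---------------|-----------|
| JSON | ✅ | ✅ | Full | FlashGenie | Medium |
| CSV | ✅ | ✅ | Limited | Universal | Small |
| TXT | ✅ | ✅ | Basic | Universal | Small |
| XML | ✅ | ✅ | Full | Good | Large |
| XLSX | ✅ | ❌ | Good | Excel | Medium |
| APKG | ✅* | ✅* | Good | Anki | Medium |
| HTML | ❌ | ✅ | Visual | Web | Large |
| PDF | ❌ | ✅* | Visual | Print | Large |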

*Requires plugin


🎯 Best Practices

Choosing the Right Format

  1. For FlashGenie: Use JSON for full feature support
  2. For Compatibility: Use CSV for universal support
  3. For Anki Users: Use APKG format
  4. For Sharing: Use TXT for human readability
  5. For Printing: Use HTML or PDF export

Data Quality Tips

  1. Consistent Formatting: Use consistent question/answer formats
  2. Meaningful Tags: Use descriptive, hierarchical tags
  3. Appropriate Difficulty: Set realistic difficulty levels
  4. Clean Data: Remove duplicates and empty entries
  5. Backup Regularly: Keep backups in multiple formats


Need help with a specific format? Check our FAQ or see the format examples repository for more samples! 🧞‍♂️✨
