Canvas Backup Explorer
Overview
The Canvas Backup Explorer is an AI-powered tool developed by Rick Anderson as part of the VIDA project ecosystem. It analyzes and searches Canvas course backups, so educators can explore and understand their course materials through concept-based search and automated content processing.
Purpose
The Canvas Backup Explorer addresses several key challenges faced by educators:
- Course Content Analysis: Provides deep insights into course structure and content
- Comprehensive Search: Enables concept-based searching across all course materials
- Content Backup: Creates accessible backups of Canvas course data
- Content Organization: Provides systematic organization and cataloging of course materials
- Course Optimization: Helps identify content gaps, duplications, and improvement opportunities
Current Features
File Processing & Analysis
- Multi-format Support: Processes diverse file types including PDFs, images, documents, and multimedia
- Automatic OCR: Extracts text from images and scanned documents using DocLing
- Content Conversion: Converts all materials to searchable markdown format
- Intelligent Filtering: Automatically skips system files (CSS, JS, XML, ZIP) while processing educational content
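The filtering step above can be pictured as a simple extension check. This is a minimal sketch, not the tool's actual implementation: the function names are hypothetical, and the skip list only mirrors the examples given on this page.

```python
from pathlib import PurePosixPath

# Assumption: this skip list mirrors the examples on this page (CSS, JS, XML, ZIP).
# The real tool may use a different or longer list.
SKIPPED_EXTENSIONS = {".css", ".js", ".xml", ".zip"}


def is_educational_content(path: str) -> bool:
    """Return True when a file should be processed rather than skipped."""
    suffix = PurePosixPath(path).suffix.lower()
    return suffix not in SKIPPED_EXTENSIONS


def partition_files(paths: list[str]) -> tuple[list[str], list[str]]:
    """Split archive paths into (files to process, files to skip)."""
    keep = [p for p in paths if is_educational_content(p)]
    skip = [p for p in paths if not is_educational_content(p)]
    return keep, skip


if __name__ == "__main__":
    keep, skip = partition_files(
        ["syllabus.pdf", "web_resources/style.css", "week1/notes.docx"]
    )
    print("process:", keep)
    print("skip:", skip)
```

Matching on the lowercased suffix keeps the check case-insensitive, so `STYLE.CSS` and `style.css` are treated the same way.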
Advanced Search Capabilities
- Concept-based Search: Goes beyond keyword matching to find conceptually related content
- Multi-type Queries: Searches across text, images, and other content types simultaneously
- Relevance Scoring: Color-coded match quality indicators (green/yellow/red)
- Content Chunking: Breaks down large documents into searchable segments
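To make chunking and color-coded scoring concrete, here is a small sketch. The chunk size, overlap, and score thresholds are illustrative assumptions, as are the function names; this page does not state the values the Explorer uses.

```python
def chunk_text(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into word-based chunks, repeating `overlap` words between neighbors."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("require max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def relevance_band(score: float, high: float = 0.75, low: float = 0.4) -> str:
    """Map a similarity score in [0, 1] to a color band (thresholds are assumed)."""
    if score >= high:
        return "green"
    if score >= low:
        return "yellow"
    return "red"


if __name__ == "__main__":
    sample = " ".join(f"word{i}" for i in range(10))
    for chunk in chunk_text(sample, max_words=4, overlap=1):
        print(chunk)
    print(relevance_band(0.82), relevance_band(0.5), relevance_band(0.1))
```

Overlapping chunks reduce the chance that a sentence split across a boundary is missed by a query that would have matched it whole.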
Course Analysis
- Content Mapping: Identifies and catalogs course elements (assignments, quizzes, discussions)
- Media Detection: Locates charts, graphs, images, and multimedia content
- Metadata Extraction: Finds due dates, schedules, contact information, and structural elements
- Raw Data Access: Provides direct access to the underlying course database
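As one way to picture metadata extraction, the sketch below pulls dates and email addresses out of converted markdown with regular expressions. The patterns and the `extract_metadata` name are assumptions for illustration; the Explorer may rely on its AI pipeline rather than fixed patterns.

```python
import re

# Assumption: only two date styles are covered here ("March 3, 2025" and "2025-03-03").
MONTHS = (
    "January|February|March|April|May|June|July|August|"
    "September|October|November|December"
)
DATE_RE = re.compile(rf"\b(?:{MONTHS})\s+\d{{1,2}},\s+\d{{4}}\b|\b\d{{4}}-\d{{2}}-\d{{2}}\b")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_metadata(markdown: str) -> dict[str, list[str]]:
    """Collect date strings and email addresses in document order."""
    return {
        "dates": DATE_RE.findall(markdown),
        "emails": EMAIL_RE.findall(markdown),
    }


if __name__ == "__main__":
    page = (
        "Essay 1 is due March 3, 2025. The final project is due 2025-05-01. "
        "Contact instructor@example.edu with questions."
    )
    print(extract_metadata(page))
```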
Technical Architecture
Core Technology Stack
- Retrieval Augmented Generation (RAG): Advanced AI search and analysis system
- DocLing Integration: Automated document processing and OCR
- Search Indexing: Comprehensive content indexing for fast retrieval
- Markdown Conversion: Standardized content format for analysis
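The retrieval half of a RAG system can be sketched in a few lines. Note the substitution: this example ranks chunks with term-frequency vectors and cosine similarity, which only matches shared words, whereas a concept-based search would normally use learned embeddings. All names here are hypothetical, and the prompt wording is an assumption.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Stand-in for an embedding: a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 3) -> list[tuple[float, str]]:
    """Return the top_k (score, chunk) pairs, best match first."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def build_prompt(query: str, chunks: list[str], top_k: int = 3) -> str:
    """Assemble retrieved context and the question into one prompt string."""
    hits = [c for score, c in retrieve(query, chunks, top_k) if score > 0]
    context = "\n\n".join(hits)
    return f"Answer using only this course content:\n\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    course_chunks = [
        "Quiz 2 covers cell biology and mitosis.",
        "The final essay rubric weighs thesis and evidence.",
        "Office hours are Tuesdays at 3pm.",
    ]
    print(build_prompt("essay rubric", course_chunks))
```

The generation step, in which a language model answers from the assembled prompt, is omitted because it depends on which model service is used.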
Processing Pipeline
- Course Import: Upload Canvas course backup files
- Content Analysis: Process and categorize all course materials
- Index Building: Create searchable database of course content
- Search Interface: Enable queries across all processed content
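The index-building and search stages above can be illustrated with a small inverted index over already-converted markdown. This is a sketch under stated assumptions: the `build_index` and `search` names are hypothetical, and it covers only exact keyword lookup, not the concept-based search described elsewhere on this page.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each term to the set of document names that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, text in documents.items():
        for term in tokenize(text):
            index[term].add(name)
    return dict(index)


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


if __name__ == "__main__":
    docs = {
        "syllabus.md": "Late policy and grading rubric",
        "week1.md": "Reading quiz on grading",
    }
    idx = build_index(docs)
    print(sorted(search(idx, "grading")))
    print(sorted(search(idx, "grading rubric")))
```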
Search Capabilities
Comprehensive Content Discovery
- Assignment Analysis: Locate all assignments, rubrics, and grading criteria
- Assessment Content: Find quizzes, exams, and assessment materials
- Discussion Topics: Identify discussion prompts and collaborative activities
- Multimedia Resources: Discover images, charts, graphs, and visual content
- Course Structure: Map modules, lessons, and learning pathways
- Administrative Elements: Find due dates, schedules, contact information, and policies
Advanced Search Features
- Concept-based Queries: Search by educational concepts rather than just keywords
- Multi-format Analysis: Simultaneous search across text, images, and documents
- Relevance Scoring: Intelligent ranking of search results with quality indicators
- Content Type Filtering: Focus searches on specific material types
- Cross-reference Analysis: Find relationships between different course elements
Typical Processing Capacity
- Course Scale: Handles full course backups (400+ files)
- Content Analysis: Processes ~70% of educational materials successfully
- Format Support: Analyzes text documents, PDFs, images, and web content
- Intelligent Filtering: Automatically excludes system files while preserving educational content
Potential Use Cases
For Faculty
- Course Review: Comprehensive analysis of course content and structure
- Content Discovery: Find specific elements quickly across large courses
- Accessibility Audit: Identify content that may need accessibility improvements
- Course Mapping: Visualize course structure and content relationships
For Instructional Designers
- Quality Assurance: Systematic review of course materials
- Content Gap Analysis: Identify missing or inconsistent content
- Template Development: Extract successful patterns for reuse
- Course Structure Analysis: Understand content organization and flow
For Administrators
- Course Backup: Reliable backup and archival of course content
- Content Analytics: Understand content patterns across courses
- Resource Management: Identify commonly used resources and materials
Planned Enhancements
Short-term Development
- Enhanced Search Accuracy: Refinement of concept-based search algorithms
- Improved User Interface: More intuitive interface for search and analysis results
- Content Export Features: Better tools for extracting and sharing analysis results
Future Vision
- Chat Agent Interface: Conversational queries using Claude AI integration
- Code Agent Capabilities: System that can write its own queries for advanced analysis
- Course Editing: Direct editing capabilities through the course database
- Collaborative Features: Sharing and collaboration tools for instructional teams
- ALLY Integration: Potential future integration with accessibility analysis tools
VIDA Project Alignment
Guiding Pillars Integration
- Faculty Empowerment & Agency: Provides powerful analysis tools without requiring technical expertise
- Educational Pedagogy Integration: Course concept mapping supports evidence-based instructional design
- Augmentation, Not Replacement: Enhances rather than replaces human analysis and decision-making
- Knowledge Sharing and Community Building: Enables sharing of course analysis insights and best practices
Development Framework
- Built using VIDA's rapid development framework
- Follows VIDA's standardized deployment processes
- Integrates with VIDA's AI service pipelines
- Maintains VIDA's accessibility and usability standards
Current Status
Development Phase: Active exploration and refinement
Primary Developer: Rick Anderson
Status: Core functionality implemented, search capabilities being refined
Next Steps: Collaboration with instructional designers to define use cases and requirements
Getting Started
Prerequisites
- Canvas course backup file
- Access to the VIDA development environment
- Understanding of course content structure
Basic Workflow
- Export Course: Download Canvas course backup
- Upload to Explorer: Import course backup into the tool
- Processing: Allow the system to analyze and index content (~284 files processed in a typical course)
- Search & Analyze: Use search interface to explore course content
- Review Results: Analyze findings with color-coded relevance scoring
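Before uploading, it can help to see what a backup contains. The sketch below assumes the export is a ZIP-compatible archive (Canvas course exports are typically `.imscc` Common Cartridge files, which are ZIP archives) and counts files by extension using only the standard library. The `inventory_backup` name and the file name `course_export.imscc` are hypothetical.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def inventory_backup(backup_path: str) -> Counter:
    """Count files in a course backup archive, grouped by lowercase extension."""
    counts: Counter = Counter()
    with zipfile.ZipFile(backup_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries are not course files
            suffix = PurePosixPath(info.filename).suffix.lower() or "(none)"
            counts[suffix] += 1
    return counts


if __name__ == "__main__":
    # Build a tiny stand-in archive so the example runs without a real backup.
    with zipfile.ZipFile("course_export.imscc", "w") as demo:
        demo.writestr("imsmanifest.xml", "<manifest/>")
        demo.writestr("web_resources/syllabus.pdf", b"%PDF")
    for extension, count in inventory_backup("course_export.imscc").most_common():
        print(f"{extension}: {count}")
```

A quick inventory like this shows how many files fall into formats the tool skips, which sets expectations for how much of the backup will be indexed.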
Integration Opportunities
Canvas LMS
- Direct Integration: Potential for direct Canvas API integration
- Workflow Integration: Seamless integration with Canvas course development workflow
- Real-time Analysis: Live analysis of course content as it's developed
Future Accessibility Tools
- ALLY Integration: Potential future integration with ALLY accessibility reports for enhanced analysis
- Accessibility Auditing: Could support accessibility analysis workflows in future iterations
- Compliance Reporting: Potential for automated accessibility compliance documentation
Support and Documentation
Resources
- Technical documentation available in the VIDA development wiki
- Integration guides for ALLY and Canvas systems
- Best practices for course analysis and optimization
Collaboration
- Active development with input from instructional designers
- Collaboration opportunities with Dena and the instructional design (ID) team
- Feedback integration for continuous improvement
Contact
- Primary Developer: Rick Anderson
- Project Integration: VIDA Project Team
- Instructional Design Consultation: Available through VIDA project channels
Technical Notes
File Processing Statistics
- Supported Formats: PDF, DOC, DOCX, images (JPG, PNG, GIF), HTML, TXT
- Automatically Skipped: MP3, CSS, JS, XML, ZIP, system files
- Processing Success Rate: ~69% of files successfully analyzed
- Search Index: Full-text and concept-based indexing
Performance Characteristics
- Processing Time: Varies based on course size and content complexity
- Search Speed: Near-instantaneous search results
- Storage Requirements: Efficient storage with markdown conversion
- Scalability: Designed to handle large course collections
Last updated: June 2025
For questions or collaboration opportunities, contact the VIDA project team or Rick Anderson directly.