Data Migration - FaqiangMei/MHA-Survey-Portal GitHub Wiki
Data Migration
This guide documents how to move student, advisor, and survey data from legacy systems into the Health application. It focuses on planning, conversion utilities, validation, and transfer workflows.
Source systems
Identify each legacy source and the format it exposes:
| Source | Export Format | Notes |
|---|---|---|
| Legacy advising app | CSV export (students, advisors, survey results) | Accessible via nightly report or manual export |
| Offline survey spreadsheets | XLSX | Structured with question columns matching Survey#survey_id |
| Historical notifications | None | Optional; can be regenerated after import |
Document credentials, data owners, and any rate limits in the team runbook.
Target schema overview
- users: keyed by email, includes role, first_name, last_name
- student_profiles: linked to users with uin, classification, major
- surveys, competencies, questions: defined by seeds (db/seeds.rb) and YAML fixtures (db/data/program_surveys.yml)
- survey_responses and question_responses: store student submissions
- feedbacks: capture advisor/student qualitative feedback
Refer to Code Documentation for detailed model relationships.
Conversion workflow
- Extract legacy data into CSV/JSON files stored in tmp/imports/ (ignored by Git).
- Transform the files to match Rails models:
  - Normalize email casing
  - Map legacy roles to admin, advisor, student
  - Align survey question identifiers with questions.question_order
  - Convert timestamps to UTC ISO 8601 format
- Load via Rails tasks or scripts.
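The transform step can be kept in small, pure Ruby helpers that are easy to test outside Rails. The following is a minimal sketch, not the project's actual code: the helper names and the legacy role labels in LEGACY_ROLE_MAP are assumptions and should be replaced with the values found in the real export.

```ruby
require "time"

# Hypothetical mapping from legacy role labels to the three target roles.
# The keys are assumptions; replace them with the labels in the real export.
LEGACY_ROLE_MAP = {
  "administrator" => "admin",
  "admin"         => "admin",
  "faculty"       => "advisor",
  "advisor"       => "advisor",
  "student"       => "student"
}.freeze

# Trim whitespace and lowercase so that emails compare consistently.
def normalize_email(raw)
  raw.to_s.strip.downcase
end

# Map a legacy role label to admin/advisor/student. Unknown labels raise,
# so unmapped roles are noticed instead of being silently defaulted.
def map_role(raw)
  key = raw.to_s.strip.downcase
  LEGACY_ROLE_MAP.fetch(key) { raise ArgumentError, "unmapped legacy role: #{raw.inspect}" }
end

# Parse a timestamp and render it as UTC ISO 8601.
# Note: input without an explicit offset is interpreted in the local time zone.
def to_utc_iso8601(raw)
  Time.parse(raw.to_s).utc.iso8601
end

puts normalize_email("  Jane.Doe@Example.EDU ")      # jane.doe@example.edu
puts map_role(" Faculty ")                           # advisor
puts to_utc_iso8601("2023-05-01 14:30:00 -0500")     # 2023-05-01T19:30:00Z
```

Raising on an unknown role is a deliberate choice in this sketch: a failed rehearsal is easier to fix than a batch of users who were quietly imported with the wrong role.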
Suggested utilities
Create import scripts under lib/tasks/ or scripts/:
- lib/tasks/import_users.rake: Reads tmp/imports/users.csv, upserts User and StudentProfile records.
- lib/tasks/import_survey_responses.rake: Maps legacy survey answers to SurveyResponse + QuestionResponse entries.
- scripts/migrate_from_legacy.rb: Ruby script that orchestrates the tasks and logs progress.
Pseudo-code structure for an import task:
namespace :import do
  desc "Load users from tmp/imports/users.csv"
  task users: :environment do
    require "csv"
    path = Rails.root.join("tmp", "imports", "users.csv")
    CSV.foreach(path, headers: true) do |row|
      # Normalize the email so repeated imports match the same record.
      email = row["Email"].to_s.strip.downcase
      next if email.empty? # skip rows without an email (the lookup key)
      user = User.find_or_initialize_by(email: email)
      user.assign_attributes(
        first_name: row["First Name"],
        last_name: row["Last Name"],
        # Default to "student" when the legacy export has no role.
        role: row["Role"].presence || "student"
      )
      user.save!
      # Only students carry a profile.
      if user.student?
        profile = user.student_profile || user.build_student_profile
        profile.update!(uin: row["UIN"], classification: row["Classification"], major: row["Major"])
      end
    end
  end
end
Store final utilities in version control and add specs to cover edge cases.
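For the survey import, the offline spreadsheets hold one column per question, while the target stores one QuestionResponse per answer. A sketch of that wide-to-long reshaping is shown here as plain Ruby so it can be tested without a database. The column names ("Email", "Survey ID", "Q1", "Q2") and the question_rows helper are assumptions for illustration; the real headers must be taken from the exported sheets.

```ruby
require "csv"

# Columns that identify the submission rather than a question.
# These header names are assumptions about the exported sheet.
IDENTITY_COLUMNS = ["Email", "Survey ID"].freeze

# Turn each wide row (one column per question) into one hash per answered question.
def question_rows(csv_text)
  CSV.parse(csv_text, headers: true).flat_map do |row|
    email = row["Email"].to_s.strip.downcase
    survey_id = row["Survey ID"].to_s.strip
    question_headers = row.headers.reject { |header| IDENTITY_COLUMNS.include?(header) }
    question_headers.filter_map do |header|
      answer = row[header].to_s.strip
      next if answer.empty? # skip unanswered questions
      { email: email, survey_id: survey_id, question: header, answer: answer }
    end
  end
end

sample = <<~CSV
  Email,Survey ID,Q1,Q2
  Jane@Example.edu,12,Agree,
  joe@example.edu,12,Disagree,Neutral
CSV

question_rows(sample).each { |entry| puts entry.inspect }
```

Each resulting hash corresponds to one prospective QuestionResponse; a Rake task would still need to look up the user, the SurveyResponse, and the matching question before saving.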
Validation checklist
- Run imports in the staging environment first.
- After loading data, verify:
  - Counts of users/surveys match source totals.
  - Sample students can sign in and view historical responses.
  - Reports generated via ReportsController reflect imported data.
- Use rails test test/models/*_test.rb to ensure core models remain valid.
- Take a fresh backup before and after large imports (see Backup Plan).
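The count check in the list above can be scripted rather than done by eye. The sketch below is one possible approach under stated assumptions: the helper names are hypothetical, the "Email" header matches the users export used earlier, and the target counts are passed in as a plain hash (in a Rails console they might come from calls such as User.count).

```ruby
require "csv"
require "set"

# Count distinct, normalized emails in a users export.
def distinct_email_count(csv_text)
  emails = Set.new
  CSV.parse(csv_text, headers: true).each do |row|
    email = row["Email"].to_s.strip.downcase
    emails << email unless email.empty?
  end
  emails.size
end

# Compare source totals with target totals and return only the mismatches.
def count_mismatches(source_counts, target_counts)
  source_counts.each_with_object({}) do |(name, expected), diff|
    actual = target_counts.fetch(name, 0)
    diff[name] = { expected: expected, actual: actual } unless expected == actual
  end
end

users_csv = "Email,Role\nA@x.edu,student\na@x.edu ,student\nb@x.edu,advisor\n"
source = { users: distinct_email_count(users_csv), survey_responses: 5 }
target = { users: 2, survey_responses: 4 } # stand-in values for database counts

puts count_mismatches(source, target).inspect
```

Counting distinct normalized emails rather than raw rows matters because the import upserts by email, so duplicate rows in the source collapse into a single user.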
Transfer schedule
- Plan migrations during low-traffic windows (e.g., evenings, weekends).
- Communicate downtime expectations to advisors and students.
- For phased migrations, set the legacy system to read-only during the cutover so the two systems do not diverge.
Rollback strategy
- Capture a database snapshot immediately before running import tasks.
- If the import fails, restore from the snapshot (heroku pg:backups:restore).
- Track imported record IDs to allow targeted deletion if partial cleanup is needed.
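Tracking imported record IDs can be as simple as writing a JSON manifest during the import run. The class below is a hypothetical sketch, not an existing utility in the project; the ImportManifest name, its methods, and the manifest file location are all assumptions.

```ruby
require "json"

# Collects the IDs of records created by an import run so that a partial
# cleanup can target exactly those rows.
class ImportManifest
  def initialize
    @ids = Hash.new { |hash, model_name| hash[model_name] = [] }
  end

  # model_name is a string such as "User"; id is the primary key of a new record.
  def record(model_name, id)
    @ids[model_name] << id
  end

  # Return a copy so callers cannot mutate the internal lists.
  def to_h
    @ids.transform_values(&:dup)
  end

  # Write the manifest as JSON; the path is chosen by the caller.
  def write(path)
    File.write(path, JSON.pretty_generate(to_h))
  end

  # Read a previously written manifest back as a hash of model name => IDs.
  def self.load(path)
    JSON.parse(File.read(path))
  end
end

manifest = ImportManifest.new
manifest.record("User", 101)
manifest.record("User", 102)
manifest.record("StudentProfile", 55)
puts JSON.generate(manifest.to_h)
```

Only newly created records should be added to the manifest. Rows that already existed before the import and were merely updated must not be deleted during cleanup, which is why a snapshot restore remains the fallback for a failed import.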
Next steps
- Implement the suggested utilities and check them into lib/tasks/.
- Document command usage in README.md or wiki/Development-Guide.md.
- Add automated tests for transformation helpers and import tasks.
- Schedule end-to-end rehearsals before the production cutover.