Skill Generating Tests - accelerate-data/migration-utility GitHub Wiki
Generates test scenarios for a stored procedure migration. Enumerates conditional branches from the procedure's logic, synthesizes minimal fixture data for each branch, and writes structured JSON to test-specs/. This skill produces scenarios only -- it does not execute stored procedures, capture ground truth, or generate dbt YAML. The /generate-tests command handles bulk execution and YAML conversion after the review loop completes.
/generating-tests <schema.table>
Argument is the fully-qualified table name (the item_id). The writer procedure is read automatically from the catalog scoping section.
-
manifest.jsonmust exist in the project root. If missing, run Command Setup DDL first. -
catalog/tables/<item_id>.jsonmust exist. If missing, run/listing-objects list tablesto see available tables. - The table must have been scoped (Skill Analyzing Table) and profiled (Skill Profiling Table) before test generation.
uv run --project <shared-path> migrate context --table <item_id>Output JSON contains:
| Field | Description |
|---|---|
profile |
Classification, keys, watermark, PII answers |
materialization |
Derived from profile (snapshot/table/incremental) |
statements |
Resolved statement list with action (migrate/skip) and SQL |
proc_body |
Full original procedure SQL |
columns |
Target table column list |
source_tables |
Tables read by the writer |
schema_tests |
Deterministic test specs (entity integrity, referential integrity, recency, PII) |
For each table in source_tables, the skill reads catalog/tables/<schema>.<table>.json to get full column metadata (is_nullable, is_identity, types, auto_increment_columns).
For each statement where action == migrate, the skill identifies all conditional branches:
| Pattern | Branches to enumerate |
|---|---|
| MERGE WHEN clauses | One per WHEN MATCHED, WHEN NOT MATCHED, WHEN NOT MATCHED BY SOURCE |
| CASE/WHEN | One per arm + ELSE |
| JOIN | Match, no-match (NULL right side for LEFT JOIN), partial multi-condition match |
| WHERE | Row that passes, row that fails |
| Subquery | EXISTS true/false, IN match/miss, correlated hit/miss |
| NULL handling | Nullable columns in filters/joins/COALESCE -- NULL vs non-NULL |
| Aggregation | Single group, multiple groups, empty group |
| Type boundaries | Watermark date edges, MAX int, empty string |
| Empty source | Zero-row edge case |
Output: a branch manifest with IDs, descriptions, statement indexes, and patterns.
For each branch, generates minimum synthetic input rows (1-3 per source table):
- Each scenario is self-contained -- no shared test data across scenarios
- FK-consistent within each scenario using catalog
foreign_keysfor topological ordering - Type-appropriate values from catalog column metadata
- Parameters are ignored or flagged (rare in warehouse procs)
Columns excluded from fixtures:
- Computed columns (defined with
AS <expression>in DDL) - Identity columns not needed by the scenario (let SQL Server auto-generate)
NOT NULL column coverage: Every source table fixture row includes all columns where is_nullable == false, except computed and unneeded identity columns. Type-appropriate defaults are used for NOT NULL columns not referenced by the procedure:
| SQL Type Pattern | Default Value |
|---|---|
| INT, BIGINT, SMALLINT, TINYINT | 0 |
| NVARCHAR, VARCHAR, CHAR, NCHAR |
"" (empty string) |
| DATETIME, DATETIME2, DATE, SMALLDATETIME | "1900-01-01" |
| BIT | 0 |
| DECIMAL, NUMERIC, MONEY, SMALLMONEY | 0.00 |
| FLOAT, REAL | 0.0 |
| UNIQUEIDENTIFIER | "00000000-0000-0000-0000-000000000000" |
| VARBINARY, BINARY |
"" (empty string) |
CHECK constraint compliance: Fixture values satisfy any CHECK constraints found in the DDL (range, enum, cross-column).
Shows branch manifest, generated fixtures, and any uncovered branches or warnings. The user approves, requests edits, or declines.
Writes test-specs/<item_id>.json with the TestSpec schema. The expect field is omitted -- ground truth is captured later by the command.
Re-invoking the skill on the same table runs in merge mode: existing scenarios are preserved (including any expect blocks that already contain ground truth), and only new scenarios for uncovered branches are appended.
- Every
unit_tests[]entry has at least onegivenentry with rows - Every
given[].rowsentry includes all NOT NULL non-identity columns -
coverageset:completewhen all branches have scenarios,partialotherwise -
statusset:ok,partial, orerror
| File | Description |
|---|---|
manifest.json |
Project root validation |
catalog/tables/<item_id>.json |
Target table columns, scoping, profile |
catalog/tables/<source>.json |
Source table columns for NOT NULL coverage |
catalog/procedures/<writer>.json |
Writer procedure statements and refs |
| Field | Type | Required | Description |
|---|---|---|---|
item_id |
string | yes | Fully qualified table name |
status |
string | yes | Enum: ok, partial, error
|
coverage |
string | yes | Enum: complete, partial
|
branch_manifest |
array | yes | All identified branches |
unit_tests |
array | yes | Test scenarios with fixtures |
uncovered_branches |
string[] | yes | Branch IDs that lack scenarios |
warnings |
array | yes | Diagnostics entries |
validation |
object | yes |
passed (boolean) and issues[]
|
errors |
array | yes | Diagnostics entries |
| Field | Type | Required | Description |
|---|---|---|---|
id |
string | yes | Stable branch identifier (e.g., merge_matched_update) |
statement_index |
integer | yes | Index into the resolved statements array (0-based) |
description |
string | yes | Human-readable branch description |
scenarios |
string[] | yes | Test names that exercise this branch |
| Field | Type | Required | Description |
|---|---|---|---|
name |
string | yes | Test name: test_<load_pattern>_<scenario_description>
|
target_table |
string | yes | Bracket-quoted target table identifier, e.g., [silver].[DimProduct]
|
procedure |
string | yes | Bracket-quoted stored procedure identifier, e.g., [silver].[usp_load_DimProduct]
|
model |
string | no | dbt model name -- optional; used during dbt YAML conversion |
given |
array | yes | One entry per fixture source table (minItems: 1) |
expect |
object | no | Ground truth output rows -- added by command after execution |
| Field | Type | Required | Description |
|---|---|---|---|
table |
string | yes | Bracket-quoted SQL identifier, e.g., [bronze].[SalesOrderHeader]
|
rows |
array | yes | Synthetic fixture input rows as column-value maps (minItems: 1) |
| Field | Type | Required | Description |
|---|---|---|---|
rows |
array | yes | Ground truth output rows captured from proc execution |
- Test name:
test_<load_pattern>_<scenario_description> -
target_table: bracket-quoted FQN, e.g.,[silver].[DimProduct] -
procedure: bracket-quoted FQN, e.g.,[silver].[usp_load_DimProduct] -
given[].table: bracket-quoted SQL identifier, e.g.,[bronze].[SalesOrderHeader]
| Condition | coverage |
status |
|---|---|---|
| All branches have scenarios | complete |
ok |
| Branches remain after review loop | partial |
partial |
| Generation failed | -- | error |
validation.issues[], warnings[], and errors[] use the shared diagnostics schema:
| Field | Type | Required | Description |
|---|---|---|---|
code |
string | yes | Stable machine-readable identifier (e.g., SCOPING_NOT_COMPLETED) |
message |
string | yes | Human-readable description |
item_id |
string | no | Fully qualified table name |
field |
string | no | Optional field path associated with the issue |
severity |
string | yes | Enum: error, warning
|
details |
object | no | Optional structured context |
{
"item_id": "silver.dimproduct",
"status": "ok",
"coverage": "complete",
"branch_manifest": [
{
"id": "merge_matched_update",
"statement_index": 0,
"description": "MERGE WHEN MATCHED -> UPDATE existing product",
"scenarios": ["test_merge_matched_existing_product_updated"]
},
{
"id": "merge_not_matched_insert",
"statement_index": 0,
"description": "MERGE WHEN NOT MATCHED -> INSERT new product",
"scenarios": ["test_merge_not_matched_new_product_inserted"]
}
],
"unit_tests": [
{
"name": "test_merge_matched_existing_product_updated",
"target_table": "[silver].[DimProduct]",
"procedure": "[silver].[usp_load_DimProduct]",
"given": [
{
"table": "[bronze].[Product]",
"rows": [
{ "ProductID": 1, "Name": "Widget", "ListPrice": 99.99, "ModifiedDate": "2024-01-15", "rowguid": "A0000000-0000-0000-0000-000000000001" }
]
},
{
"table": "[silver].[DimProduct]",
"rows": [
{ "ProductKey": 1, "ProductName": "Old Widget", "ListPrice": 50.00 }
]
}
]
},
{
"name": "test_merge_not_matched_new_product_inserted",
"target_table": "[silver].[DimProduct]",
"procedure": "[silver].[usp_load_DimProduct]",
"given": [
{
"table": "[bronze].[Product]",
"rows": [
{ "ProductID": 999, "Name": "New Widget", "ListPrice": 29.99, "ModifiedDate": "2024-06-01", "rowguid": "B0000000-0000-0000-0000-000000000001" }
]
}
]
}
],
"uncovered_branches": [],
"warnings": [],
"validation": {
"passed": true,
"issues": []
},
"errors": []
}When invoked with a feedback_for_generator JSON block (from the test reviewer sub-agent):
-
uncovered_branches: list of branch IDs missing coverage. New scenarios are generated for each listed branch and added tounit_tests[]. -
quality_fixes: per-scenario remediation instructions. Named scenarios are revised inunit_tests[]as directed.
Previously approved scenarios are preserved -- only additions and revisions are applied.
The test generator must not:
- Execute stored procedures or access the sandbox
- Generate dbt SQL model files
- Render YAML --
unit_tests[]is structured JSON; dbt YAML conversion happens post-execution - Make materialization or business key decisions
- Score its own coverage authoritatively -- the test reviewer sub-agent does that
| Error | Cause | Fix |
|---|---|---|
migrate context exit code 1 |
No profile or no statements in catalog | Run Skill Analyzing Table and Skill Profiling Table first |
migrate context exit code 2 |
IO/parse error reading catalog | Check file permissions and JSON validity in catalog/
|
| Empty branch manifest | Procedure has no conditional logic (single straight-through INSERT) | This is valid -- generate a single scenario for the base case |
| NOT NULL violation in fixtures | Fixture row missing a required non-nullable column | Ensure all NOT NULL non-identity columns are included with type-appropriate defaults |