10_pgmpy_Migration_Plan - ravkorsurv/kor-ai-core GitHub Wiki
Migration Plan: From Agena API to pgmpy - Kor.ai
This page outlines the strategy for moving from the Agena Cloud API (a commercial Bayesian scoring engine) to an in-house solution built on the open-source pgmpy library.
Why Migrate?
Reason | Benefit |
---|---|
Licensing | Remove commercial dependency |
IP Control | Full transparency and internal ownership |
Customisation | Fine-grained control over inference logic |
Offline Capability | Local scoring, no cloud calls |
Integration Flexibility | Easier to embed into microservices |
Target Stack
Component | Tool / Format |
---|---|
Model Definition | JSON → pgmpy format |
CPT Handling | pgmpy.factors.discrete.TabularCPD |
Graph Structure | pgmpy.models.BayesianModel (renamed BayesianNetwork in later pgmpy releases) |
Inference Engine | pgmpy.inference.VariableElimination |
Deployment | Python microservice (Flask/FastAPI) |
CI/CD | GitHub Actions + pytest coverage |
Migration Phases
Phase 1: Extract Existing Logic
- Export all `.json` models from Agena
- Convert structure into `pgmpy` graph format
- Preserve obfuscated node naming (`Q1`, `Q2`, etc.)
- Document CPTs per node
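The extraction step above could be sketched roughly as follows. The export schema shown here (`nodes`, `states`, `parents`, `cpt`) is purely hypothetical, since the real Agena JSON layout is not described on this page; the function names and the probabilities are likewise illustrative. The idea is only to pull out an edge list and a per-node CPT description, and to check that each CPT column is a valid distribution before anything is handed to pgmpy.

```python
import json

# Hypothetical export format (assumption): the real Agena JSON schema will differ.
RAW = """
{
  "nodes": [
    {"id": "Q1", "states": ["Low", "High"], "parents": [],
     "cpt": [[0.7], [0.3]]},
    {"id": "Q10", "states": ["No", "Yes"], "parents": ["Q1"],
     "cpt": [[0.9, 0.4], [0.1, 0.6]]}
  ]
}
"""


def extract_structure(doc):
    """Return (edges, cpt_specs) from the hypothetical export dict."""
    edges = []
    cpt_specs = {}
    for node in doc["nodes"]:
        # One directed edge per parent, keeping the obfuscated names as-is.
        for parent in node["parents"]:
            edges.append((parent, node["id"]))
        cpt_specs[node["id"]] = {
            "states": node["states"],
            "parents": node["parents"],
            "values": node["cpt"],  # rows = child states, columns = parent configs
        }
    return edges, cpt_specs


def check_columns_sum_to_one(cpt_specs, tol=1e-9):
    """Raise ValueError if any CPT column is not a probability distribution."""
    for name, spec in cpt_specs.items():
        for i, column in enumerate(zip(*spec["values"])):
            if abs(sum(column) - 1.0) > tol:
                raise ValueError(f"{name}: column {i} sums to {sum(column)}")


edges, cpt_specs = extract_structure(json.loads(RAW))
check_columns_sum_to_one(cpt_specs)
print(edges)
```

Keeping the intermediate `cpt_specs` dictionary separate from pgmpy objects also gives a convenient place to generate the per-node CPT documentation.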
Phase 2: Rebuild Scoring Logic
- Define all CPDs using `TabularCPD`
- Create graph using `BayesianModel.add_edges_from(...)`
- Build internal test suite for scoring scenarios
Phase 3: Build Model Runner API
- Wrap model inside a `POST`-based microservice
- Inputs: Node dictionary (e.g. `{"Q1": "High", "Q3": "No"}`)
- Output: Final risk probabilities (`{"Q10": {"Yes": 0.76}}`)
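The request and response shapes above could be handled by a framework-independent function along these lines, which a Flask or FastAPI route would then call. Everything named here (`NODE_STATES`, `validate_evidence`, `handle_score_request`, `fake_query`) is hypothetical, and the inference call is replaced by a stand-in so the sketch runs without a model.

```python
from typing import Callable, Dict, List

# Hypothetical node catalogue; in practice this would come from the loaded model.
NODE_STATES: Dict[str, List[str]] = {
    "Q1": ["Low", "High"],
    "Q3": ["No", "Yes"],
    "Q10": ["No", "Yes"],
}


def validate_evidence(payload, node_states: Dict[str, List[str]]) -> Dict[str, str]:
    """Return the evidence dict, or raise ValueError on unknown nodes or states."""
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")
    for node, state in payload.items():
        if node not in node_states:
            raise ValueError(f"unknown node: {node}")
        if state not in node_states[node]:
            raise ValueError(f"invalid state {state!r} for node {node}")
    return dict(payload)


def handle_score_request(
    payload,
    target: str,
    query_fn: Callable[[str, Dict[str, str]], Dict[str, float]],
    node_states: Dict[str, List[str]] = NODE_STATES,
) -> Dict[str, Dict[str, float]]:
    """Body of a POST handler: validate input, run inference, shape the output."""
    evidence = validate_evidence(payload, node_states)
    if target in evidence:
        raise ValueError(f"target node {target} cannot also be evidence")
    distribution = query_fn(target, evidence)
    return {target: {state: round(p, 4) for state, p in distribution.items()}}


def fake_query(target: str, evidence: Dict[str, str]) -> Dict[str, float]:
    """Stand-in for the real inference call; returns fixed made-up numbers."""
    return {"No": 0.24, "Yes": 0.76}


response = handle_score_request({"Q1": "High", "Q3": "No"}, "Q10", fake_query)
print(response)
```

Passing the inference step in as `query_fn` keeps validation and response formatting testable without loading a model, and a route handler can translate `ValueError` into an HTTP 400 response.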
Phase 4: Benchmark & Validate
- Compare Agena vs. pgmpy scoring on test cases
- Adjust CPTs or structure where needed
- Run performance benchmarks on live data
Phase 5: Cutover + Deprecate Agena
- Replace Agena calls in core inference service
- Update alert pipeline logic to use new microservice
- Finalise internal documentation + alert audit links
File Structure
Suggested layout under `kor-ai-core/bayesian-engine/`:
Validation Plan
- Use all test cases from `06_Test_Case_Scenarios.md`
- Confirm that the output score delta between Agena and pgmpy stays within ±0.03 for every test case
- Cover edge cases and missing values with mock inputs
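The tolerance check could look something like the sketch below. The case IDs and scores are made up for illustration, and `compare_scores` is a hypothetical helper, not an existing function in the repository. A case that is missing from the pgmpy results is reported as a failure rather than silently skipped.

```python
TOLERANCE = 0.03  # maximum allowed |pgmpy - Agena| per the validation plan


def compare_scores(agena, pgmpy_scores, tolerance=TOLERANCE):
    """Return (case_id, agena, pgmpy, delta) tuples for cases outside tolerance."""
    failures = []
    for case_id, expected in agena.items():
        actual = pgmpy_scores.get(case_id)
        if actual is None:
            # No pgmpy result for this case counts as a failure.
            failures.append((case_id, expected, None, None))
            continue
        delta = actual - expected
        if abs(delta) > tolerance:
            failures.append((case_id, expected, actual, delta))
    return failures


# Made-up scores for illustration only.
agena = {"case_01": 0.76, "case_02": 0.12, "case_03": 0.50}
pgmpy_scores = {"case_01": 0.77, "case_02": 0.17, "case_03": 0.50}

failures = compare_scores(agena, pgmpy_scores)
for case_id, expected, actual, delta in failures:
    print(case_id, expected, actual, delta)
```

In a pytest suite the same helper could back a single assertion that `failures` is empty, with the tuple contents serving as the failure message.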
Long-Term Vision
- Enable retraining of CPTs using historical case data
- Support hybrid graph + rule scoring
- Extend engine to support multi-risk models in parallel
Maintainer: @ravkorsurv
Target Cutover: Q4 2025