10_pgmpy_Migration_Plan - ravkorsurv/kor-ai-core GitHub Wiki

Migration Plan: From Agena API to pgmpy – Kor.ai

This page outlines the strategy for moving from the Agena Cloud API (commercial Bayesian scoring engine) to an in-house solution built using the open-source pgmpy library.


🎯 Why Migrate?

| Reason | Benefit |
| --- | --- |
| Licensing | Remove commercial dependency |
| IP Control | Full transparency and internal ownership |
| Customisation | Fine-grained control over inference logic |
| Offline Capability | Local scoring, no cloud calls |
| Integration Flexibility | Easier to embed into microservices |

🧠 Target Stack

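| Component | Tool / Format |
| --- | --- |
| Model Definition | JSON → pgmpy format |
| CPT Handling | pgmpy.factors.discrete.TabularCPD |
| Graph Structure | pgmpy.models.BayesianModel (renamed BayesianNetwork in later pgmpy releases) |
| Inference Engine | pgmpy.inference.VariableElimination |
| Deployment | Python microservice (Flask/FastAPI) |
| CI/CD | GitHub Actions + pytest coverage |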

🧱 Migration Phases

🔹 Phase 1 – Extract Existing Logic

  • Export all .json models from Agena
  • Convert structure into pgmpy graph format
  • Preserve obfuscated node naming (Q1, Q2, etc.)
  • Document CPTs per node
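
A minimal sketch of the extraction step, assuming a simplified export schema. The `nodes`, `states`, `parents` and `cpt` keys are hypothetical stand-ins; the real Agena .json layout will differ and needs its own mapping. The sketch keeps the obfuscated node names, derives the edge list, and checks that every CPT column is a valid probability distribution before anything is handed to pgmpy.

```python
import json
import math

# Hypothetical, simplified export. The real Agena .json schema will differ.
EXPORT = """
{
  "nodes": [
    {"id": "Q1", "states": ["Low", "High"], "parents": [],
     "cpt": [[0.7], [0.3]]},
    {"id": "Q10", "states": ["No", "Yes"], "parents": ["Q1"],
     "cpt": [[0.9, 0.4], [0.1, 0.6]]}
  ]
}
"""


def extract_structure(export_text):
    """Return (edges, cpts) from the simplified export format above."""
    nodes = json.loads(export_text)["nodes"]
    # One (parent, child) edge per declared parent; node names are kept as-is.
    edges = [(parent, node["id"]) for node in nodes for parent in node["parents"]]
    cpts = {}
    for node in nodes:
        table = node["cpt"]
        if len(table) != len(node["states"]):
            raise ValueError(f"{node['id']}: one CPT row per state is required")
        # Each column is a distribution over this node's states.
        for column in zip(*table):
            if not math.isclose(sum(column), 1.0, abs_tol=1e-9):
                raise ValueError(f"{node['id']}: CPT column does not sum to 1")
        cpts[node["id"]] = {
            "states": node["states"],
            "parents": node["parents"],
            "values": table,
        }
    return edges, cpts


edges, cpts = extract_structure(EXPORT)
print(edges)
```

The `cpts` dictionary doubles as the per-node CPT documentation and can later be fed into TabularCPD constructors.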

🔹 Phase 2 – Rebuild Scoring Logic

  • Define all CPDs using TabularCPD
  • Create graph using BayesianModel.add_edges_from(...)
  • Build internal test suite for scoring scenarios

🔹 Phase 3 – Build Model Runner API

  • Wrap model inside a POST-based microservice
  • Inputs: Node dictionary (e.g. {"Q1": "High", "Q3": "No"})
  • Output: Final risk probabilities (e.g. {"Q10": {"Yes": 0.76}})
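
One way to keep the request handling independent of the web framework is sketched below. `handle_score_request`, `ALLOWED_STATES` and `fake_infer` are hypothetical names, and the status codes are an assumption about how the service might report bad input. A Flask or FastAPI POST route would pass the raw body in and return the resulting status and payload.

```python
import json

# Hypothetical allowed states per node; in practice derived from the model.
ALLOWED_STATES = {"Q1": ["Low", "High"], "Q3": ["No", "Yes"]}


def handle_score_request(body, infer):
    """Validate a JSON body and return (status_code, response_dict).

    `infer` is any callable mapping an evidence dict to
    {node: {state: probability}}, e.g. a wrapper around the pgmpy model.
    """
    try:
        evidence = json.loads(body)
    except json.JSONDecodeError:
        return 400, {"error": "body is not valid JSON"}
    if not isinstance(evidence, dict):
        return 400, {"error": "expected a JSON object of node -> state"}
    for node, state in evidence.items():
        if node not in ALLOWED_STATES:
            return 422, {"error": f"unknown node: {node}"}
        if state not in ALLOWED_STATES[node]:
            return 422, {"error": f"invalid state {state!r} for node {node}"}
    return 200, infer(evidence)


def fake_infer(evidence):
    """Stand-in for the real model call; returns a fixed posterior."""
    return {"Q10": {"Yes": 0.76, "No": 0.24}}


print(handle_score_request('{"Q1": "High", "Q3": "No"}', fake_infer))
```

Injecting the inference callable lets the validation logic be unit-tested without loading a model.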

🔹 Phase 4 – Benchmark & Validate

  • Compare Agena vs. pgmpy scoring on test cases
  • Adjust CPTs or structure where needed
  • Run performance benchmarks on live data

🔹 Phase 5 – Cutover + Deprecate Agena

  • Replace Agena calls in core inference service
  • Update alert pipeline logic to use new microservice
  • Finalise internal documentation + alert audit links

📂 File Structure

Suggested layout under kor-ai-core/bayesian-engine/:


🧪 Validation Plan

  • Use all test cases from 06_Test_Case_Scenarios.md
  • Confirm the absolute difference between Agena and pgmpy output scores is ≤ 0.03 for every test case
  • Use mock input coverage for edge cases and missing values
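
The tolerance check could be automated along these lines. The case IDs and scores are made-up placeholders, not results from either engine, and `compare_scores` is a hypothetical helper name. In practice the two dictionaries would be filled from recorded Agena outputs and from the pgmpy runner.

```python
TOLERANCE = 0.03  # maximum allowed absolute score difference


def compare_scores(agena_scores, pgmpy_scores, tolerance=TOLERANCE):
    """Return (case_id, agena, pgmpy, delta) for every case over tolerance."""
    failures = []
    for case_id, expected in agena_scores.items():
        actual = pgmpy_scores[case_id]
        delta = abs(actual - expected)
        if delta > tolerance:
            failures.append((case_id, expected, actual, delta))
    return failures


# Placeholder scores for illustration only.
agena = {"TC01": 0.76, "TC02": 0.40}
pgmpy_out = {"TC01": 0.75, "TC02": 0.45}

failures = compare_scores(agena, pgmpy_out)
for case_id, expected, actual, delta in failures:
    print(f"{case_id}: Agena={expected:.2f} pgmpy={actual:.2f} delta={delta:.3f}")
```

Inside pytest, asserting that `failures` is empty turns the comparison into a CI gate.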

Long-Term Vision

  • Enable retraining of CPTs using historical case data
  • Support hybrid graph + rule scoring
  • Extend engine to support multi-risk models in parallel

Maintainer: @ravkorsurv
Target Cutover: Q4 2025