11.0_Bayesian_Model_Principles - ravkorsurv/kor-ai-core GitHub Wiki

Bayesian Model Principles – Kor.ai Surveillance

This page documents the core Bayesian Network concepts that guide the Kor.ai risk scoring models. It draws directly from academic theory (esp. Darwiche, Chs. 7–10) and compares each concept with its application in Kor.ai’s insider dealing and spoofing use cases.


📚 Key Concepts from Darwiche (Chapters 7–10)

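| Principle | Description |
|---|---|
| Explaining Away | When two causes share a common effect and that effect is observed, confirming one cause reduces belief in the other |
| Conditional Independence | Two nodes are independent given the evidence when every path between them is blocked (d-separation) |
| Bayesian Inference | Posterior belief is obtained by updating the prior with new evidence |
| Noisy-OR / Noisy-MAX | Compact CPT parameterisation for multiple causal parents, assuming each parent acts independently; parameters grow linearly with the number of parents |
| CPT Compression | Large CPTs can be compacted using logic patterns |
| Abduction (Backwards Reasoning) | Reasoning from effect to most likely cause |
| Inference by Enumeration | Exhaustive approach to marginalisation: sum the joint distribution over every unobserved variable (cost grows exponentially; pgmpy's exact inference uses variable elimination or belief propagation instead) |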
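
Explaining away and inference by enumeration can be illustrated together on a tiny network. The sketch below is not taken from the Kor.ai models: the nodes A and B (two independent causes), the effect E, and every probability are hypothetical values chosen only to make the behaviour visible.

```python
from itertools import product

# Hypothetical priors for two independent binary causes (illustrative only).
P_A = {True: 0.1, False: 0.9}
P_B = {True: 0.1, False: 0.9}

# Hypothetical CPT: P(E = True | A, B).
P_E_GIVEN = {
    (True, True): 0.95,
    (True, False): 0.80,
    (False, True): 0.80,
    (False, False): 0.01,
}


def joint(a, b, e):
    """Joint probability P(A=a, B=b, E=e) from the chain rule."""
    p_e = P_E_GIVEN[(a, b)]
    return P_A[a] * P_B[b] * (p_e if e else 1.0 - p_e)


def posterior_a(evidence):
    """P(A = True | evidence), summing the joint over every consistent world."""
    num = den = 0.0
    for a, b, e in product([True, False], repeat=3):
        world = {"A": a, "B": b, "E": e}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # world contradicts the evidence
        p = joint(a, b, e)
        den += p
        if a:
            num += p
    return num / den


prior = posterior_a({})
given_b_only = posterior_a({"B": True})
given_effect = posterior_a({"E": True})
given_effect_and_b = posterior_a({"E": True, "B": True})

print(f"P(A)              = {prior:.3f}")
print(f"P(A | B)          = {given_b_only:.3f}")
print(f"P(A | E)          = {given_effect:.3f}")
print(f"P(A | E, B)       = {given_effect_and_b:.3f}")
```

With these numbers, observing B alone leaves belief in A unchanged, observing the effect E raises it, and additionally confirming B pulls it back down. That drop is the explaining-away pattern; the loop over all worlds is enumeration, which is why it does not scale to large networks.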

🧠 Kor.ai Model Design – Concept Mapping

| Concept | Kor.ai Implementation | Notes |
|---|---|---|
| Explaining Away | ✅ Q1 (size), Q2 (price), Q3 (comms) reduce belief in alternate causes | Active in insider model |
| Conditional Independence | ✅ Clean separation unless probabilistic tie defined | CPTs manually control scope |
| Prior Probabilities | ✅ All nodes have seeded priors | Tuned manually |
| Noisy-OR | ❌ Not yet applied | Optional v2 enhancement |
| Abduction | ✅ Analysts interpret from high posterior cause nodes | Supports explainability |
| Dynamic Bayesian Networks | ❌ Not supported (static only) | MVP constraint |
| CPT Compression | ❌ Full CPTs used, compression not implemented | May reduce inference time in future |

🧪 Insider Dealing – Application of “Explaining Away”

Example:

  • If we observe:
    • Q1 = High (large trade)
    • Q2 = Yes (price spike before news)
    • Q3 = Yes (chat with insider)
  • Then:

    Taken together, these observations explain Q10 = Insider Dealing and reduce the need for additional evidence from Q6 or Q7.

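One way to read the example above is as evidence accumulating on Q10. The sketch below assumes a simplified structure in which Q1, Q2, Q3 and Q6 are conditionally independent given Q10; that structure, the prior, and all likelihoods are hypothetical and are not read from InsiderModel.json. Q6 is used only as a placeholder for a further evidence node.

```python
# Hypothetical prior for Q10 = Insider Dealing (illustrative only).
PRIOR_INSIDER = 0.01

# Hypothetical likelihoods:
# (P(observation | insider dealing), P(observation | no insider dealing))
LIKELIHOODS = {
    "Q1=High": (0.70, 0.10),
    "Q2=Yes": (0.60, 0.05),
    "Q3=Yes": (0.50, 0.02),
    "Q6=Yes": (0.60, 0.20),
}


def posterior_insider(observations):
    """P(Q10 = Insider Dealing | observations) under the independence assumption."""
    p_yes = PRIOR_INSIDER
    p_no = 1.0 - PRIOR_INSIDER
    for obs in observations:
        like_yes, like_no = LIKELIHOODS[obs]
        p_yes *= like_yes
        p_no *= like_no
    return p_yes / (p_yes + p_no)


one_signal = posterior_insider(["Q1=High"])
three_signals = posterior_insider(["Q1=High", "Q2=Yes", "Q3=Yes"])
four_signals = posterior_insider(["Q1=High", "Q2=Yes", "Q3=Yes", "Q6=Yes"])

print(f"Q1 only        : {one_signal:.3f}")
print(f"Q1 + Q2 + Q3   : {three_signals:.3f}")
print(f"plus Q6        : {four_signals:.3f}")
```

Under these assumed numbers the posterior is already high once Q1 to Q3 are observed, so adding Q6 shifts it only slightly. How closely this matches the deployed model depends on its actual graph and CPTs.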

🔄 Implications for Surveillance Library

  • ✅ Node behaviors match theoretical causality
  • 🔧 Future nodes (e.g., Q11 = profit share, Q12 = access logs) should remain independent of existing nodes unless a dependency is justified
  • ❗ Watch for false causality links during new model builds

🧭 Next Enhancements

  • Add Noisy-OR logic to reduce CPT burden
  • Move to pgmpy with full control over graph topology + inference engine
  • Consider dynamic or sequential models for time-window-based behavior (e.g., spoofing patterns)
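
To show how Noisy-OR reduces the CPT burden, the sketch below expands a full CPT from one activation probability per parent plus an optional leak term. The function name, the three-parent example, and all probabilities are hypothetical; nothing here reflects the current Kor.ai CPTs.

```python
from itertools import product


def noisy_or_cpt(activation, leak=0.0):
    """Build {parent_states: P(effect = True | parent_states)} under Noisy-OR.

    activation[i] is the probability that parent i alone triggers the effect;
    leak is the probability the effect occurs with no active parent.
    """
    cpt = {}
    for states in product([False, True], repeat=len(activation)):
        # The effect is absent only if the leak and every active parent all fail.
        p_all_fail = 1.0 - leak
        for active, p in zip(states, activation):
            if active:
                p_all_fail *= 1.0 - p
        cpt[states] = 1.0 - p_all_fail
    return cpt


# Hypothetical activation strengths for three binary parents.
activation = [0.8, 0.6, 0.5]
cpt = noisy_or_cpt(activation, leak=0.01)

print(f"parameters specified by hand: {len(activation) + 1}")
print(f"rows in the expanded CPT    : {len(cpt)}")
for states, p in sorted(cpt.items()):
    print(states, f"{p:.4f}")
```

For n binary parents a full CPT needs 2^n rows to be tuned, while Noisy-OR needs n activation values plus the leak. The trade-off is the assumption that parents influence the effect independently, which should be checked per node before adopting it.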

Maintainer: @ravkorsurv
Source: Modeling and Reasoning with Bayesian Networks, Darwiche, Ch. 7–10
Kor.ai models: InsiderModel.json, SpoofingModel.json