Inference - statnett/Talk2PowerSystem GitHub Wiki
- #16 (DONE) consider derived (shortcut) props
- #20 cimex ontology: decide prefix and prop names
- #156 CIM needs subproperties
- #93 (DONE) Removing Tautologies
- #270 (DONE) define terms used in reasoning
- #218 (DONE) Pare down inference
Reasoning simplifies working with the CIM knowledge graph by materializing relationships that are not explicit in the data. They are derived from semantic constructs such as subclass, inverse, subproperty, transitive property, and property chain axioms. This enables much simpler SPARQL queries and allows LLMs and humans to work with higher-level "shortcut" properties (such as cimr:connectedThroughPart) instead of navigating complex property paths.
For a detailed explanation of the motivation and practical examples, read the blog post Using Semantic Reasoning to Help LLM with SPARQL Generation in Electrical CIM.
Inst4CIM-KG section Reasoning discusses what reasoning is appropriate with CIM.
- CIM defines `rdfs:subClassOf`; reasoning and SHACL should rely on it.
- CIM defines `owl:inverseOf` (all CIM relations have inverses) but doesn't rely on it.
Here we enable the above and add:
- `rdfs:subPropertyOf`: needed for `cimr:hasPart, cimr:isPart`
- `owl:TransitiveProperty`: needed for `cimr:hasPartTransitive, cimr:isPartTransitive`
- `owl:propertyChainAxiom`: needed for `cimr:connectedTo, cimr:connectedThroughPart`
- `owl:SymmetricProperty`: `cimr:connectedTo, cimr:connectedThroughPart` are declared symmetric, but we don't need this reasoning since the respective property chains are already symmetric.
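To make the effect of these constructs concrete, here is a minimal Python sketch of forward-chaining materialization for `rdfs:subPropertyOf` and `owl:TransitiveProperty`. It is an illustration only, not how GraphDB evaluates the ruleset. The `ex:` instances and the specific subproperty axioms that link CIM containment properties to `cimr:hasPart` are assumptions made for the example; the actual axioms are whatever cimr.ttl declares.

```python
# Toy forward-chaining materializer (illustration only, not GraphDB's engine).
# ASSUMPTION: the subproperty axioms and ex: instances below are hypothetical.

SUBPROPERTY_OF = {
    "cim:Substation.VoltageLevels": "cimr:hasPart",
    "cim:VoltageLevel.Bays": "cimr:hasPart",
    "cimr:hasPart": "cimr:hasPartTransitive",
}
TRANSITIVE = {"cimr:hasPartTransitive"}

explicit = {
    ("ex:sub1", "cim:Substation.VoltageLevels", "ex:vl1"),
    ("ex:vl1", "cim:VoltageLevel.Bays", "ex:bay1"),
}


def materialize(triples):
    """Apply the two rules repeatedly until no new triple is produced."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            # rdfs:subPropertyOf: x p y, p subPropertyOf q => x q y
            if p in SUBPROPERTY_OF:
                new.add((s, SUBPROPERTY_OF[p], o))
            # owl:TransitiveProperty: x p y, y p z => x p z
            if p in TRANSITIVE:
                for s2, p2, o2 in closure:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= closure:
            return closure
        closure |= new


inferred = materialize(explicit) - explicit
```

With the triples materialized, everything inside `ex:sub1` can be found with a single triple pattern on `cimr:hasPartTransitive` rather than a property path over several containment properties.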
Load the cimr.ttl ontology.
We use a custom ruleset cim.pie. It is a minimal ruleset that has only the rules we need (see GDB doc), and it:
- is optimised to remove tautologies (see #93)
- uses a more efficient `transitiveOver` rule (see here)
- is optimised with fixed-arity property chains instead of chains represented with `rdf:List` (see #218)
This query 01-add-inference.ru can be used to load and use it directly.
```sparql
PREFIX sys: <http://www.ontotext.com/owlim/system#>
INSERT DATA {
  <_:cim> sys:addRuleset <https://raw.githubusercontent.com/statnett/Talk2PowerSystem/refs/heads/main/data/cim.pie> .
  [] sys:defaultRuleset "cim".
  [] sys:reinfer [].
}
```
Check that the correct ruleset is activated:
```sparql
prefix sys: <http://www.ontotext.com/owlim/system#>
SELECT ?state ?ruleset {
  ?state sys:listRulesets ?ruleset
}
```
Using a custom ruleset allows us to pare down inference compared to a standard ruleset. Statistics before and after the change:
| ruleset | explicit | inferred | expansion | rdf:type | rdfs:domain | rdfs:range |
|---|---|---|---|---|---|---|
| OWL2-RL-optimized | 122,948 | 199,932 | 2.63 | 38,568 | 18,875 | 11,916 |
| Custom (pared-down) | 122,906 | 155,887 | 2.27 | 32,163 | 5,246 | 5,233 |
| Reduction | | 22% | | 16% | 72% | 54% |
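The expansion column is not defined on this page. Assuming it means total statements (explicit plus inferred) divided by explicit statements, the following Python sketch reproduces the two expansion values and the reduction in inferred statements from the figures in the table. The formula is an assumption, chosen because it matches the reported numbers.

```python
# Figures copied from the statistics table above.
stats = {
    "OWL2-RL-optimized": {"explicit": 122_948, "inferred": 199_932},
    "Custom (pared-down)": {"explicit": 122_906, "inferred": 155_887},
}


def expansion(explicit, inferred):
    # ASSUMPTION: expansion = total statements / explicit statements
    return (explicit + inferred) / explicit


ratios = {name: round(expansion(**row), 2) for name, row in stats.items()}

before = stats["OWL2-RL-optimized"]["inferred"]
after = stats["Custom (pared-down)"]["inferred"]
inferred_reduction_pct = round(100 * (before - after) / before)

print(ratios)
print(inferred_reduction_pct)
```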
Most importantly, we have eliminated inferred domain/range statements.
These are counter-intuitive and useless.
Take for example this query that looks for properties related to cim:Switch, i.e. having this class as domain or range:
```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX cim: <https://cim.ucaiug.io/ns#>
select ?x ?p ?y {
  {?p rdfs:domain cim:Switch; rdfs:range ?y}
  union {?p rdfs:domain ?x; rdfs:range cim:Switch}
}
```
- Without reasoning, it returns only props that are directly related to Switch.
  You can see that from the prop name:
  - Attributes of switch (`Switch.<lowercase>`)
  - Outgoing relations of switch (`Switch.<Uppercase>`)
  - Incoming relations of switch (`<Uppercase>.Switch`)
| x | p | y |
|---|---|---|
| | cim:Switch.SwitchSchedules | cim:SwitchSchedule |
| | cim:Switch.normalOpen | xsd:boolean |
| | cim:Switch.ratedCurrent | xsd:float |
| | cim:Switch.retained | xsd:boolean |
| | cim:Switch.SvSwitch | cim:SvSwitch |
| | cim:Switch.locked | xsd:boolean |
| | cim:Switch.open | xsd:boolean |
| | nc:Switch.TopologyAction | nc:TopologyAction |
| | nc:Switch.SwitchRegularSchedule | nc:SwitchRegularSchedule |
| | nc:Switch.SwitchSchedule | nc:SwitchSchedule |
| cim:SwitchSchedule | cim:SwitchSchedule.Switch | |
| cim:SvSwitch | cim:SvSwitch.Switch | |
| nc:TopologyAction | nc:TopologyAction.Switch | |
| nc:SwitchRegularSchedule | nc:SwitchRegularSchedule.Switch | |
| nc:SwitchSchedule | nc:SwitchSchedule.Switch | |
- With reasoning, the query also returns all superclasses. For example:
  - `SwitchSchedule.Switch` has domain not only `SwitchSchedule` but also all its superclasses: `IdentifiedObject, BasicIntervalSchedule, SeasonDayTypeSchedule, RegularIntervalSchedule`
  - `Switch.open` has domain not only `Switch` but also all its superclasses: `IdentifiedObject, PowerSystemResource, Equipment, ConductingEquipment`

  This is counter-intuitive and useless.
  - It would be useful if the domain of `Switch.open` included all its subclasses: `Breaker Cut DisconnectingCircuitBreaker Disconnector Fuse GroundDisconnector Jumper LoadBreakSwitch ProtectedSwitch`
  - But this is not how `rdfs:domain` "inheritance" is defined in RDFS.
We guess that RDFS got Type variance wrong (Contravariance vs Covariance).
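The upward propagation of domains described above corresponds to rule `scm-dom1` of OWL 2 RL (`p rdfs:domain C`, `C rdfs:subClassOf D` entails `p rdfs:domain D`). The following Python sketch applies that rule to `Switch.open`, using the class chain listed above; it assumes single inheritance, which is a simplification made for the example.

```python
# OWL 2 RL rule scm-dom1:
#   p rdfs:domain C, C rdfs:subClassOf D  =>  p rdfs:domain D
# Single-parent map (a simplification), following the superclasses listed above.
SUBCLASS_OF = {
    "cim:Breaker": "cim:ProtectedSwitch",
    "cim:ProtectedSwitch": "cim:Switch",
    "cim:Switch": "cim:ConductingEquipment",
    "cim:ConductingEquipment": "cim:Equipment",
    "cim:Equipment": "cim:PowerSystemResource",
    "cim:PowerSystemResource": "cim:IdentifiedObject",
}


def inferred_domains(asserted_domain):
    """Return the asserted domain plus every superclass reached via subClassOf."""
    domains = [asserted_domain]
    while domains[-1] in SUBCLASS_OF:
        domains.append(SUBCLASS_OF[domains[-1]])
    return domains


# cim:Switch.open is asserted with domain cim:Switch
domains = inferred_domains("cim:Switch")
print(domains)
```

The result climbs to `cim:IdentifiedObject` but never includes subclasses such as `cim:Breaker`, which is the behaviour the custom ruleset avoids materializing.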
Statnett:
- We ended up making shortcuts in the model to simplify the query, but we can't alter the model every time we have a new use case. Or can we? And still keep it manageable?
- Both containment relations and connectivity are very commonly used. What we should keep in mind is that there are several different hierarchies that could be used, depending on who's using the model. A simple non-electrical example: for some users substations are part of geographical regions, but for others they are primarily considered part of a bidding zone, while a third set of users primarily talk about who owns (or operates) the substation.
- I'm not sure whether any of our internal models are doing this, but the standard encourages Line objects that attach to more than two substations in the case of switchless junctions. See figure 9 and 10 in IEC 61970-301:2020 for an example of this.
We need some structure, principles or "theory" for deciding what inferred props to create. 10 years ago I worked with a complex ontology in heritage/archeology/history called CIDOC CRM. It captures large and sprawling graphs of situations and attribution. The theory of what shortcuts to make was called "Fundamental Relations" (FR). E.g. one FR is "thing is From place", which in CRM could mean:
- thing was made in place
- part of thing was made in subplace of place
- thing was made by person born in place
- thing was made by person who flourished (worked) in place
- thing was made for important event that happened at place (eg "Vatican tiara")
Now add subprops and recursive loops at various spots, and you'll quickly see how "From" collapses a whole bunch of possible "situation subgraphs" into one easy to use relation.
References:
- Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM). Alexiev, V.; Manov, D.; Parvanova, J.; and Petrov, S. In Workshop Practical Experiences with CIDOC CRM and its Extensions (CRMEX 2013) at TPDL 2013, volume 1117, Valetta, Malta, September 2013. CEUR WS Paper slides preprint
- Implementing CIDOC CRM Search Based on Fundamental Relations and OWLIM Rules. Alexiev, V. In Workshop on Semantic Digital Archives (SDA 2012), part of International Conference on Theory and Practice of Digital Libraries (TPDL 2012), volume 912, Paphos, Cyprus, September 2012. CEUR WS Paper slides published
- FR Implementation (in an old Confluence, so don't mind the security warning). In particular I extracted the rules from the text, and implemented an expander from a shorthand form
  ```
  p <ptop:transitiveOver> q; x p y; y q z => x p z
  ```
  to GraphDB Rules (.pie) notation
  ```
  p <ptop:transitiveOver> q
  x p y
  y q z
  --------
  x p z
  ```
- FR Dependency Graph of relation dependencies (of course derived from the text!). It shows me that I don't have dependency loops, and have not mistyped a relation (no disconnected parts)
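The shorthand expander mentioned in the FR Implementation item can be sketched in a few lines of Python. This is a reconstruction from the two notations shown above, not the original implementation; the function name `expand_rule` is hypothetical, and a complete .pie file needs more than the rule body (for example prefix declarations and rule identifiers), which this sketch leaves out.

```python
def expand_rule(shorthand: str) -> str:
    """Turn 'premise; premise; ... => conclusion' into one premise per line,
    a dashed separator, and the conclusion."""
    premises_part, conclusion = shorthand.split("=>")
    premises = [p.strip() for p in premises_part.split(";") if p.strip()]
    return "\n".join(premises + ["--------", conclusion.strip()])


print(expand_rule("p <ptop:transitiveOver> q; x p y; y q z => x p z"))
```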
What I learned since is that it's better to use more generic rule structures; and push the domain-specifics into axioms. This gives various ideas how to use specialized rule constructs while keeping domain-specific stuff in axioms (not to overload the rules file with domain-specific terminology):
- Extending OWL2 Property Constructs with OWLIM Rules. Alexiev, V. Technical Report Ontotext Corp, September 2014.
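The idea of generic rules with domain specifics pushed into axioms can be illustrated with a small Python sketch. The rule engine below knows only two generic constructs (`rdfs:subPropertyOf` and `ptop:transitiveOver`), while all domain vocabulary sits in the axiom data. The `ex:` properties and instances are hypothetical, loosely modelled on the "thing is From place" relation, and the naive fixpoint loop is for illustration rather than efficiency.

```python
# Generic rules live in code; all domain vocabulary lives in AXIOMS (data).
# ASSUMPTION: the ex: property and instance names are hypothetical.
AXIOMS = {
    ("ex:madeIn", "rdfs:subPropertyOf", "ex:from"),
    ("ex:from", "ptop:transitiveOver", "ex:withinPlace"),
}

facts = {
    ("ex:vase", "ex:madeIn", "ex:Athens"),
    ("ex:Athens", "ex:withinPlace", "ex:Attica"),
    ("ex:Attica", "ex:withinPlace", "ex:Greece"),
}


def apply_rules(facts, axioms):
    """Naive fixpoint over two generic rules; no domain terms appear here."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for a, kind, b in axioms:
            derived = set()
            if kind == "rdfs:subPropertyOf":
                # x a y => x b y
                derived = {(x, b, y) for x, p, y in closure if p == a}
            elif kind == "ptop:transitiveOver":
                # x a y, y b z => x a z
                derived = {
                    (x, a, z)
                    for x, p, y in closure if p == a
                    for y2, q, z in closure if q == b and y2 == y
                }
            if not derived <= closure:
                closure |= derived
                changed = True
    return closure


result = apply_rules(facts, AXIOMS)
```

Adding another way for a thing to be "from" a place then means adding one axiom tuple, with no change to `apply_rules`.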
The recent ASHRAE 223P standard has a lot of connectivity stuff (about buildings) that we can use as inspiration.
If you think CIM has a complex connection model, consider the ASHRAE 223P standard, which has an even more elaborate one. It is also referenced in the Brick Schema, which has a simpler model. ASHRAE 223P and Brick are used in Building Management Systems to describe producers (e.g. a heater), consumers (e.g. a radiator), flows, sensors, actuators, and the connections between them.
cnx is the basic asserted (symmetric) relation, and all relations on the following figure can be inferred from it:
ASHRAE inference is implemented using SHACL Rules (Triple and SPARQL rules) as discussed in data-shapes#343. How this could be implemented efficiently is discussed in data-shapes#347.
