Demo1 - statnett/Talk2PowerSystem GitHub Wiki
This is an initial demo (before project kickoff) on the existing Demo0 data:
- GraphDB: https://cim.ontotext.com/graphdb/ . Anonymous read-only access is enabled.
This is Nordic44 data as described in Datasets. CIM Primer is useful to understand the questions.
Tasks
- #5 pre-demo (before kickoff meeting). DONE. Shows the 5 business questions to address.
- #16 consider derived (shortcut) props. DONE. Described in Inference
- #20 cimex ontology: decide prefix and prop names. Future.
- #19 upgrade cim.ontotext.com Elastic. Future: for now, dropped GDB-Elastic connectors.
- #15 upgrade CIM demo's GDB and SO?. Future or cancel
Observations and Fixes
Data observations:
- There's an Autocomplete index, and it includes IdentifiedObject.name (and rdfs:label)
  - Some classes use local names, and some class pairs use the same names: #14 check uniqueness of name. However, the names of substations are unique among substations
- The repo defines cim: as a versioned namespace: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
- UUID is made with uuid1() so the variation is in the last 2 chars of the first part, not at the end
- There aren't any mRIDs: Nordic44#6 add mRID (GUID) from IdentifiedObject URLs
- Uses ontology terms that are not defined: Nordic44#8 provide all term definitions or migrate to CIM/CGMES/NC terms
  - Including entsoe2:EnergyCongestionZone: Nordic44#10 Remove entsoe and entsoe2 namespaces
  - Including pti:Substation.EnergySchedulingArea: Nordic44#5: migrate pti: props to CGMES or NC props
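The uuid1() observation in the list above can be illustrated with a short Python sketch. It is not taken from the dataset's generator: the fixed node and clock sequence values are assumptions chosen only to make the constant tail of the identifier visible.

```python
import uuid

# Illustrative assumptions: a fixed node and clock sequence, so that only the
# timestamp fields can change between calls.
NODE = 0x0123456789AB
CLOCK_SEQ = 0x1234

first = uuid.uuid1(node=NODE, clock_seq=CLOCK_SEQ)
second = uuid.uuid1(node=NODE, clock_seq=CLOCK_SEQ)


def groups(u: uuid.UUID) -> list[str]:
    """Split the canonical text form into its five dash-separated groups."""
    return str(u).split("-")


g1, g2 = groups(first), groups(second)
print(first)
print(second)
# The first group (time_low) carries the fastest-changing timestamp bits.
print("first group differs:", g1[0] != g2[0])
# The last group (node) stays constant for the same generator.
print("last group equal:", g1[4] == g2[4])
```

For identifiers generated in quick succession, the difference typically shows up only in the trailing characters of the first group, which matches the observation about the dataset's UUIDs.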
Question observations:
- XYZ can be IdentifiedObject.mRID (the GUID) or IdentifiedObject.name.
So the chatbot needs to find XYZ somehow among all entities (eg substations).
This is a candidate for Autocomplete or IRI discovery.
  - But mRID is missing, so we just resolve name
Decisions:
- How to feed ontologies to LLM? We do it with a query. Other options:
  - From a local file
  - From a graph
- How to subset the ontologies? We select only terms used in the data with nordic44-ontology-query.rq:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX onto: <http://www.ontotext.com/>
PREFIX cimr: <https://cim.ucaiug.io/rules#>
construct {?x ?p ?o} from onto:explicit {
{SELECT DISTINCT ?x {{[] ?x []} UNION {[] a ?x}}}
UNION {?x rdfs:isDefinedBy cimr:}
?x ?p ?o
} order by ?x
Competency Questions
Q1.1 Transformers
List all transformers within Substation "XYZ"
- Transformers have PowerTransformerEnds that relate directly to BaseVoltage (a value object, eg "300 kV"), not to a VoltageLevel (which is per substation)
- Contrary to what the CIM Primer says, I see mostly a single VoltageLevel per Substation. This returns 43 substations with 1 level, and 1 substation with 2 levels:
select (count(*) as ?subs) ?volts {
{select ?sub (count(*) as ?volts) {
?sub cim:Substation.VoltageLevels ?y
} group by ?sub}
} group by ?volts
- "ARENDAL" is the only substation with 2 VoltageLevels:
select ?sub ?subName (count(*) as ?volts) {
?sub cim:Substation.VoltageLevels ?y; cim:IdentifiedObject.name ?subName
} group by ?sub ?subName having (?volts>1)
But why is this not shown by Demo0's early attempt to generate a map (file chatGPT/locations-show.html in the internal GitLab project)? Blue is 300 kV, red is 420 kV, and there's only blue around ARENDAL: perhaps a Line is missing in the model.
Q1.2 Substations
List all substations within bidding zone "XYZ":
- Note: In this dataset bidding zones are cim:SubGeographicalRegion: Nordic44#2 convert subgeographic regions to nc:BiddingZone
  - They are named "NO1 SGR" .. "NO5 SGR", "SE1 SGR" .. "SE4 SGR", "FI1 SGR"
    - Note: there are also "NO1 SLA", which means cim:SubLoadArea
  - They have parent (relation SubGeographicalRegion.Region) to the cim:GeographicalRegion "NO" (the whole country)
List all substations within subgeographical region "NO1 SGR" succeeds:
select ?sub ?subName {
?sub a cim:Substation; cim:IdentifiedObject.name ?subName;
cim:Substation.Region ?subgeographicalRegion .
?subgeographicalRegion a cim:SubGeographicalRegion ; cim:IdentifiedObject.name "NO1 SGR" .
}
Of course, the user may ask for "NO1", so we need more sophisticated entity resolution. Autocomplete works:
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
prefix auto: <http://www.ontotext.com/plugins/autocomplete#>
select * {
?x auto:query "NO1"; a cim:SubGeographicalRegion
} limit 100
Note: swapping the order of the auto:query pattern and the "a" (rdf:type) pattern fails: GDB-10567: bad plan: ordering of real prop vs Autocomplete magic prop
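As a minimal sketch of how a chatbot tool might produce the Autocomplete query above for an arbitrary user term, the following Python builds the query text. The function names and the literal escaping are illustrative assumptions, not part of the project code. The builder always emits auto:query before the type pattern, since the swapped order is reported to fail.

```python
CIM_NS = "http://iec.ch/TC57/2013/CIM-schema-cim16#"
AUTO_NS = "http://www.ontotext.com/plugins/autocomplete#"


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def autocomplete_query(term: str, cim_class: str, limit: int = 100) -> str:
    """Build the Autocomplete lookup (hypothetical helper).

    cim_class is a local name in the cim: namespace and is not validated here.
    """
    return (
        f"PREFIX cim: <{CIM_NS}>\n"
        f"PREFIX auto: <{AUTO_NS}>\n"
        "select * {\n"
        # auto:query first, then the type pattern (see the note on GDB-10567)
        f'  ?x auto:query "{escape_literal(term)}"; a cim:{cim_class}\n'
        f"}} limit {limit}"
    )


query = autocomplete_query("NO1", "SubGeographicalRegion")
print(query)
```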
Q1.3 Substation Connectivity
List all substations that are connected via an AC-line or a DC line to substation named "XYZ"
This diagram from Telemark-120 (also in our offer sec 5.11.2 Electricity-Specific Diagrams) shows that a substation is not connected to AC-lines directly, but through a number of its pieces of equipment (transformers, busbars, breakers…)
As explained in Inference: with CIM properties only, the correct query is very complex because it needs to drill down into parts, and drill across Terminals and ConnectivityNodes (saved query Q3).
After adding cimr: derived props, the query becomes much simpler (saved query Q3-simple) and easier to generate:
PREFIX cimr: <https://cim.ucaiug.io/rules#>
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
select ?sub1Name ?lineName ?sub2Name {
values ?sub1Name {"ARENDAL"}
?sub1 a cim:Substation; cim:IdentifiedObject.name ?sub1Name;
cimr:connectedThroughPart ?line.
?line a cim:Line; cim:IdentifiedObject.name ?lineName.
?sub2 a cim:Substation; cim:IdentifiedObject.name ?sub2Name;
cimr:connectedThroughPart ?line.
filter(?sub1 != ?sub2)
}
This raised some Connection-Questions that we put on a separate page.
Q1.4 AC Lines Crossing SubGeographicalRegions
List all AC-lines that traverse bidding zones X and Y (i.e. they have connectivity nodes that belong to bidding zone X and bidding zone Y).
According to this UML diagram from Svein, we should look for Line-SchedulingArea-BiddingZone:
But this query shows that things are much simpler: Line-Region:
select * {
[] a cim:ACLineSegment;
cim:IdentifiedObject.name ?segment;
cim:Equipment.EquipmentContainer [
cim:IdentifiedObject.name ?line;
cim:Line.Region [cim:IdentifiedObject.name ?region]]
} order by ?line
- Lines are made of segments (ACLineSegment-Equipment.EquipmentContainer-Line)
  - Each Line has exactly one ACLineSegment.
- The names are set as lineName="LC "+segmentName (the only exception is segment "420SYSLEHAGAFOSS", which corresponds to line "LC 420SYSLE-HAGAFOSS": dash added in the middle)
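The naming convention from the list above can be written down as a small Python check. This is only a sketch: the function names are made up for illustration, and the "regular" pair uses invented names, while the exception pair is the one quoted above.

```python
def expected_line_name(segment_name: str) -> str:
    """Apply the convention lineName = "LC " + segmentName."""
    return "LC " + segment_name


def follows_convention(line_name: str, segment_name: str) -> bool:
    """Strict check: the line name is exactly "LC " plus the segment name."""
    return line_name == expected_line_name(segment_name)


def follows_convention_ignoring_dashes(line_name: str, segment_name: str) -> bool:
    """Lenient check that ignores dashes on both sides."""
    return line_name.replace("-", "") == expected_line_name(segment_name).replace("-", "")


# Hypothetical regular pair (names invented for illustration).
regular = ("LC 300AAA-BBB", "300AAA-BBB")
# The one exception reported for the dataset.
exception = ("LC 420SYSLE-HAGAFOSS", "420SYSLEHAGAFOSS")

for line_name, segment_name in (regular, exception):
    print(
        line_name,
        follows_convention(line_name, segment_name),
        follows_convention_ignoring_dashes(line_name, segment_name),
    )
```

The lenient variant is one possible way to tolerate the extra dash when matching lines to segments by name.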
Each line has only one region. The following query looks for lines with more than one region:
select ?line (count(*) as ?regs) {
?line a cim:Line; cim:Line.Region ?reg
} group by ?line having (?regs>1)
A query that lists lines and their regions:
select * {
[] a cim:Line; cim:IdentifiedObject.name ?line;
cim:Line.Region [cim:IdentifiedObject.name ?region]
} order by ?line
The original question clarifies by talking of ConnectivityNodes, but they have no Region.
The only things that have SubGeographicalRegion are Line, Substation and GeographicalRegion ("child" link to SubGeographicalRegion):
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
select ?type ?p (count(*) as ?c) where {
?x ?p ?y; sesame:directType ?type.
?y sesame:directType cim:SubGeographicalRegion
} group by ?type ?p
Connection-Questions#Lines Connected Directly finds that lines can be connected directly.
But in a situation like this can we say that Line2 is cross-region? (the parens indicate region):
Line1(NO1)-Line2(NO2)-Line3(NO3)
Perhaps Line1, Line3 should also be considered cross-region, regardless of what is on their other ends?
But I think the two things on the side should be point features: Substation, not Line:
Substation1(NO1)-Line2(NO2)-Substation3(NO3)
Another confusing aspect is AC vs DC (Inst4CIM-KG#158 AC-DC confusion).
So I think we are looking for Lines that connect Substations that are in different SubGeographicalRegions, and the question should be reformulated to:
List Lines that traverse bidding zones X and Y (i.e. are connected to a substation in bidding zone X and another substation in bidding zone Y).
This query finds all such lines. Notice the "<" filter at the end: the two regions have a symmetric role, and we don't want permutations (saved query Q4-general):
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
PREFIX cimr: <https://cim.ucaiug.io/rules#>
select ?sub1Name ?reg1Name ?lineName ?sub2Name ?reg2Name {
?line a cim:Line; cimr:connectedThroughPart ?sub1, ?sub2; cim:IdentifiedObject.name ?lineName.
?sub1 a cim:Substation; cim:Substation.Region ?reg1; cim:IdentifiedObject.name ?sub1Name.
?sub2 a cim:Substation; cim:Substation.Region ?reg2; cim:IdentifiedObject.name ?sub2Name.
?reg1 a cim:SubGeographicalRegion; cim:IdentifiedObject.name ?reg1Name.
?reg2 a cim:SubGeographicalRegion; cim:IdentifiedObject.name ?reg2Name.
filter(?reg1Name<?reg2Name)
}
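The effect of the "<" filter in the query above can be seen in a small Python sketch. It is only an illustration on three region names from the dataset, not a result from the repository: without the filter every unordered pair would appear twice, and pairs within the same region would appear as well.

```python
from itertools import product

# Three of the region names from the dataset, used only for illustration.
regions = ["NO1 SGR", "NO2 SGR", "NO5 SGR"]

# Every ordered combination, as the join on ?reg1Name and ?reg2Name would give.
all_pairs = list(product(regions, repeat=2))

# Equivalent of filter(?reg1Name < ?reg2Name): one row per unordered pair,
# and no row where both ends are in the same region.
kept = [(a, b) for a, b in all_pairs if a < b]

print(len(all_pairs), "ordered pairs before the filter")
print(len(kept), "pairs after the filter:", kept)
```

A "!=" filter would only remove the same-region rows and would still return each line twice, once per direction.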
We want it for 2 specific regions, which simplifies the query. Eg here it is for NO1-NO5 (saved query Q4):
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
PREFIX cimr: <https://cim.ucaiug.io/rules#>
select ?sub1Name ?lineName ?sub2Name {
?reg1 a cim:SubGeographicalRegion; cim:IdentifiedObject.name "NO1 SGR".
?reg2 a cim:SubGeographicalRegion; cim:IdentifiedObject.name "NO5 SGR".
?sub1 a cim:Substation; cim:Substation.Region ?reg1; cim:IdentifiedObject.name ?sub1Name.
?sub2 a cim:Substation; cim:Substation.Region ?reg2; cim:IdentifiedObject.name ?sub2Name.
?line a cim:Line; cimr:connectedThroughPart ?sub1, ?sub2; cim:IdentifiedObject.name ?lineName.
}
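To reuse the query above for any two zones, it can be turned into a template. The following Python is a sketch under assumptions: the helper names are hypothetical, and the mapping from a short zone code such as "NO1" to the dataset name "NO1 SGR" simply appends the suffix described in Q1.2.

```python
import re
from string import Template

# Same graph pattern as saved query Q4, with the two region names as placeholders.
Q4_TEMPLATE = Template("""\
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX cimr: <https://cim.ucaiug.io/rules#>
select ?sub1Name ?lineName ?sub2Name {
  ?reg1 a cim:SubGeographicalRegion; cim:IdentifiedObject.name "$reg1".
  ?reg2 a cim:SubGeographicalRegion; cim:IdentifiedObject.name "$reg2".
  ?sub1 a cim:Substation; cim:Substation.Region ?reg1; cim:IdentifiedObject.name ?sub1Name.
  ?sub2 a cim:Substation; cim:Substation.Region ?reg2; cim:IdentifiedObject.name ?sub2Name.
  ?line a cim:Line; cimr:connectedThroughPart ?sub1, ?sub2; cim:IdentifiedObject.name ?lineName.
}
""")


def zone_to_region_name(zone: str) -> str:
    """Map "NO1" (or "NO1 SGR") to the dataset name "NO1 SGR" (hypothetical helper)."""
    zone = zone.strip().upper()
    if not zone.endswith(" SGR"):
        zone += " SGR"
    # Accept only simple names, so nothing unexpected ends up inside the literal.
    if not re.fullmatch(r"[A-Z0-9]+ SGR", zone):
        raise ValueError(f"unexpected zone name: {zone!r}")
    return zone


def q4_query(zone_x: str, zone_y: str) -> str:
    """Fill the two region names into the Q4 template."""
    return Q4_TEMPLATE.substitute(
        reg1=zone_to_region_name(zone_x),
        reg2=zone_to_region_name(zone_y),
    )


q4 = q4_query("NO1", "NO5")
print(q4)
```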
Note: The chatbot currently has trouble with identifying the lines because it tries to use the cim:Line.Region property to identify the two regions. This has been addressed with custom instructions but it indicates an issue with the ontology and lack of clarity for how to use the corresponding Line property.
Q1.5 Measurements
Give me the analogue measurements that exist on AC-line "XYZ".
There are no measurements about AC lines. There are 30 Analog measurements, and they are all about entsoe2:EnergyCongestionZone:
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
select * {
[] a cim:Analog;
cim:IdentifiedObject.name ?measName;
cim:IdentifiedObject.description ?measDescr;
cim:Measurement.measurementType ?type; # all are "ThreePhaseActivePower"
cim:Measurement.PowerSystemResource [sesame:directType ?psrType; cim:IdentifiedObject.name ?psrName]
}
The question can be reformulated like this:
Give me the analogue measurements of congestion zone NO-ELSP-3. Return measurement name, description and type
Patch to Add entsoe2 Terms
Note: for each zone, the above query returns two direct types: entsoe2:EnergyCongestionZone, cim:PowerSystemResource.
The latter comes from rdfs:range reasoning for prop cim:Measurement.PowerSystemResource: although that class is inferred, it is not inferred through subclass reasoning.
When we add the proper parent class entsoe2:EnergyCongestionZone rdfs:subClassOf cim:PowerSystemResource this problem disappears.
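Why the extra direct type goes away can be sketched with a simplified model of direct typing in Python. This is an assumption-laden illustration, not GraphDB's implementation: a type counts as "direct" here when none of the instance's other types is a (transitive) subclass of it.

```python
def ancestors(cls: str, subclass_of: set[tuple[str, str]]) -> set[str]:
    """All transitive superclasses of cls, given (subclass, superclass) pairs."""
    seen: set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen


def direct_types(types: set[str], subclass_of: set[tuple[str, str]]) -> set[str]:
    """Keep each type that has no more specific type among the instance's types."""
    return {
        t
        for t in types
        if not any(t in ancestors(other, subclass_of) for other in types if other != t)
    }


# Types of a zone: one asserted, one inferred from the rdfs:range of
# cim:Measurement.PowerSystemResource.
zone_types = {"entsoe2:EnergyCongestionZone", "cim:PowerSystemResource"}

without_axiom = direct_types(zone_types, set())
with_axiom = direct_types(
    zone_types, {("entsoe2:EnergyCongestionZone", "cim:PowerSystemResource")}
)
print(sorted(without_axiom))
print(sorted(with_axiom))
```

Without the subclass axiom the two classes are unrelated, so both count as direct; with it, cim:PowerSystemResource is recognised as the more general type and drops out.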
entsoe2:EnergyCongestionZone is not described in the ontology, so we add it to crmex: for the LLM to be able to use it:
entsoe2:EnergyCongestionZone a rdfs:Class;
rdfs:label "EnergyCongestionZone"@en;
rdfs:comment "Energy congestion zone"@en;
rdfs:subClassOf cim:PowerSystemResource;
rdfs:isDefinedBy crmex: .
entsoe2:EnergyCongestionZone.netDCInterchange a rdf:Property;
rdfs:label "netDCInterchange"@en;
rdfs:comment "Net DC interchange"@en;
rdfs:domain entsoe2:EnergyCongestionZone;
cims:dataType cim:Float;
rdfs:isDefinedBy crmex: .
entsoe2:EnergyCongestionZone.netACInterchangeTolerance a rdf:Property;
rdfs:label "netACInterchangeTolerance"@en;
rdfs:comment "Net AC interchange tolerance"@en;
rdfs:domain entsoe2:EnergyCongestionZone;
cims:dataType cim:Float;
rdfs:isDefinedBy crmex: .
entsoe2:EnergyCongestionZone.netACInterchange a rdf:Property;
rdfs:label "netACInterchange"@en;
rdfs:comment "Net AC interchange"@en;
rdfs:domain entsoe2:EnergyCongestionZone;
cims:dataType cim:Float;
rdfs:isDefinedBy crmex: .
There are other props and classes that relate to it, but we don't bother to define them, eg:
entsoe2:EnergySchedulingArea.EnergyCongestionZone
pti:EnergySchedulingArea.ControlArea
pti:EnergyCongestionZone.marketCode
pti:ControlAreaGeneratingUnit.EnergyCongestionZone