Demo1 - statnett/Talk2PowerSystem GitHub Wiki

This is an initial demo (before project kickoff) on the existing Demo0 data.

This is the Nordic44 data described in Datasets. The CIM Primer is useful for understanding the questions.

Tasks

Observations and Fixes

Data observations:

Question observations:

  • XYZ can be IdentifiedObject.mRID (the GUID) or IdentifiedObject.name. So the chatbot needs to find XYZ somehow among all entities (eg substations). This is a candidate for Autocomplete or IRI discovery.
    • But mRID is missing, so we resolve by name only
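
A minimal sketch of such name-based resolution, assuming an exact name match. "ARENDAL" is only a sample value (a substation name that appears later on this page), and sesame:directType is the GraphDB predicate already used in other queries here:

```sparql
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
# Find every entity carrying the given name, with its most specific type(s),
# so the chatbot can tell whether "XYZ" is a Substation, a Line, etc.
select ?x ?type {
  ?x cim:IdentifiedObject.name "ARENDAL";  # sample name, exact match assumed
     sesame:directType ?type
}
```

If the same name is used by entities of several types, the question context (eg "Substation XYZ") can be used to pick the right row.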

Decisions:

  • How to feed ontologies to LLM? We do it with a query. Other options:
    • From a local file
    • From a graph
  • How to subset the ontologies? We select only terms used in the data with nordic44-ontology-query.rq:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX onto: <http://www.ontotext.com/>
PREFIX cimr: <https://cim.ucaiug.io/rules#>
construct {?x ?p ?o} from onto:explicit {
  {SELECT DISTINCT ?x {{[] ?x []} UNION {[] a ?x}}}
  UNION {?x rdfs:isDefinedBy cimr:}
  ?x ?p ?o
} order by ?x

Competency Questions

Q1.1 Transformers

List all transformers within Substation "XYZ"

  • Transformers have PowerTransformerEnds that relate directly to a BaseVoltage (a value object, eg "300 kV"), not to a VoltageLevel (which is per substation)
  • Contrary to what the CIM-Primer says, I see mostly a single VoltageLevel per Substation. This query returns 43 substations with 1 level, and 1 substation with 2 levels:
select (count(*) as ?subs) ?volts {
  {select ?sub (count(*) as ?volts) {
    ?sub cim:Substation.VoltageLevels ?y
  } group by ?sub}
} group by ?volts
  • "ARENDAL" is the only substation with 2 VoltageLevels:
select ?sub ?subName (count(*) as ?volts) {
    ?sub cim:Substation.VoltageLevels ?y; cim:IdentifiedObject.name ?subName
} group by ?sub ?subName having (count(*)>1)

But why is this not shown by Demo0's early attempt to generate a map (file chatGPT/locations-show.html in the internal GitLab project)? Blue is 300 kV, red is 420 kV, and there is only blue around ARENDAL; perhaps a Line is missing in the model.
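
For the question itself, a possible query could look like the sketch below. It is not one of the saved queries. It assumes that a transformer's container is either the Substation itself or one of its VoltageLevels, and that the data uses the standard CIM16 property names cim:PowerTransformerEnd.PowerTransformer, cim:TransformerEnd.BaseVoltage and cim:BaseVoltage.nominalVoltage; "ARENDAL" is a sample name:

```sparql
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
select ?trafoName ?endName ?nominalVoltage {
  ?sub a cim:Substation; cim:IdentifiedObject.name "ARENDAL".  # sample substation
  ?trafo a cim:PowerTransformer; cim:IdentifiedObject.name ?trafoName;
    # assumption: the container is the Substation itself or one of its VoltageLevels
    cim:Equipment.EquipmentContainer/(^cim:Substation.VoltageLevels)? ?sub.
  # assumption: ends and base voltages use the standard CIM16 property names
  optional {
    ?end cim:PowerTransformerEnd.PowerTransformer ?trafo;
      cim:IdentifiedObject.name ?endName;
      cim:TransformerEnd.BaseVoltage/cim:BaseVoltage.nominalVoltage ?nominalVoltage
  }
} order by ?trafoName ?endName
```

The optional block reflects the first observation above: the voltage is reached through the transformer ends and their BaseVoltage, not through a VoltageLevel.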

Q1.2 Substations

List all substations within bidding zone "XYZ":

  • Note: In this dataset bidding zones are represented as cim:SubGeographicalRegion (see Nordic44#2 "convert subgeographic regions to nc:BiddingZone")
  • They are named "NO1 SGR" .. "NO5 SGR", "SE1 SGR" .. "SE4 SGR", "FI1 SGR"
    • Note: there are also names like "NO1 SLA", which mean cim:SubLoadArea
  • They have a parent (relation SubGeographicalRegion.Region): the cim:GeographicalRegion "NO" (the whole country)

Asking for all substations within subgeographical region "NO1 SGR" succeeds:

select ?sub ?subName {
  ?sub a cim:Substation; cim:IdentifiedObject.name ?subName;
    cim:Substation.Region ?subgeographicalRegion .
  ?subgeographicalRegion a cim:SubGeographicalRegion ; cim:IdentifiedObject.name "NO1 SGR" .
}

Of course, the user may ask for "NO1", so we need more sophisticated entity resolution. Autocomplete works:

PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
prefix auto: <http://www.ontotext.com/plugins/autocomplete#>
select *  {
  ?x auto:query "NO1"; a cim:SubGeographicalRegion
} limit 100

Note: swapping the order of the auto:query pattern and the "a" (rdf:type) pattern fails (GDB-10567: bad plan: ordering of real prop vs Autocomplete magic prop).
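
Where the Autocomplete plugin is not available, a plain-SPARQL prefix match can serve as a fallback. This is only a sketch, assuming the user-supplied text ("NO1") is a prefix of the region name as in the naming scheme listed above:

```sparql
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
select ?regName ?subName {
  ?reg a cim:SubGeographicalRegion; cim:IdentifiedObject.name ?regName.
  filter(strstarts(?regName, "NO1"))  # matches "NO1 SGR"
  ?sub a cim:Substation; cim:Substation.Region ?reg; cim:IdentifiedObject.name ?subName.
} order by ?regName ?subName
```

Restricting ?reg to cim:SubGeographicalRegion keeps the similarly named "NO1 SLA" (a cim:SubLoadArea) out of the result.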

Q1.3 Substation Connectivity

List all substations that are connected via an AC-line or a DC line to substation named "XYZ"

This diagram from Telemark-120 (also in our offer sec 5.11.2 Electricity-Specific Diagrams) shows that a substation is not directly connected to AC-lines but through a number of pieces of its equipment (transformers, busbars, breakers…)

As explained in Inference: with CIM properties only, the correct query is very complex because it needs to drill down into parts, and drill across Terminals and ConnectivityNodes (saved query Q3). After adding the cimr: derived props, the query becomes much simpler (saved query Q3-simple) and easier to generate:

PREFIX cimr: <https://cim.ucaiug.io/rules#>
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
select ?sub1Name ?lineName ?sub2Name {
    values ?sub1Name {"ARENDAL"}
    ?sub1 a cim:Substation; cim:IdentifiedObject.name ?sub1Name;
      cimr:connectedThroughPart ?line.
    ?line a cim:Line; cim:IdentifiedObject.name ?lineName.
    ?sub2 a cim:Substation; cim:IdentifiedObject.name ?sub2Name;
      cimr:connectedThroughPart ?line.
    filter(?sub1 != ?sub2)
}

This raised some Connection-Questions that we put on a separate page.

Q1.4 AC Lines Crossing SubGeographicalRegions

List all AC-lines that traverse bidding zones X and Y (i.e. they have connectivity nodes that belong to bidding zone X and bidding zone Y).

According to this UML diagram from Svein, we should look for Line-SchedulingArea-BiddingZone:

But this query shows that things are much simpler: Line-Region:

select * {
    [] a cim:ACLineSegment;
       cim:IdentifiedObject.name ?segment;
       cim:Equipment.EquipmentContainer [
        cim:IdentifiedObject.name ?line;
        cim:Line.Region [cim:IdentifiedObject.name ?region]]
} order by ?line
  • Lines are made of segments (ACLineSegment-Equipment.EquipmentContainer-Line)
  • Each Line has exactly one ACLineSegment.
  • The names are set as lineName="LC "+segmentName. The only exception is segment "420SYSLEHAGAFOSS", which corresponds to line "LC 420SYSLE-HAGAFOSS" (a dash is added in the middle)
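
The naming rule in the last bullet can be checked with a query along these lines. It is a sketch, assuming the names are plain string literals so that the comparison with concat() is well defined:

```sparql
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
# List segment/line pairs that break the rule lineName = "LC " + segmentName
select ?segment ?line {
  [] a cim:ACLineSegment;
     cim:IdentifiedObject.name ?segment;
     cim:Equipment.EquipmentContainer [
       a cim:Line;
       cim:IdentifiedObject.name ?line]
  filter(?line != concat("LC ", ?segment))
} order by ?line
```

Any row returned is an exception to the rule.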

Each line has only one region, since this query for lines with more than one region returns no rows:

select ?line (count(*) as ?regs) {
  ?line a cim:Line; cim:Line.Region ?reg
} group by ?line having (count(*)>1)

A query that lists lines and their regions:

select * {
    [] a cim:Line; cim:IdentifiedObject.name ?line;
        cim:Line.Region [cim:IdentifiedObject.name ?region]
} order by ?line

The original question clarifies by talking of ConnectivityNodes but they have no Region.

The only things that have SubGeographicalRegion are Line, Substation and GeographicalRegion ("child" link to SubGeographicalRegion):

PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
select ?type ?p (count(*) as ?c) where {
  ?x ?p ?y; sesame:directType ?type.
  ?y sesame:directType cim:SubGeographicalRegion
} group by ?type ?p

Connection-Questions#Lines Connected Directly finds that lines can be connected directly.

But in a situation like this can we say that Line2 is cross-region? (the parens indicate region):

Line1(NO1)-Line2(NO2)-Line3(NO3)

Perhaps Line1 and Line3 should also be considered cross-region, regardless of what is on their other ends?

But I think the two things on the sides should be point features, ie Substations rather than Lines:

Substation1(NO1)-Line2(NO2)-Substation3(NO3)

Another confusing aspect is AC vs DC (see Inst4CIM-KG#158 AC-DC confusion).

So I think we are looking for Lines that connect Substations that are in different SubGeographicalRegions, and the question should be reformulated to:

List Lines that traverse bidding zones X and Y (i.e. are connected to a substation in bidding zone X and another substation in bidding zone Y).

This query finds all such lines. Notice the "<" filter at the end: the two regions have a symmetric role, and we don't want permutations (saved query Q4-general):

PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
PREFIX cimr: <https://cim.ucaiug.io/rules#>
select ?sub1Name ?reg1Name ?lineName ?sub2Name ?reg2Name {
  ?line a cim:Line; cimr:connectedThroughPart ?sub1, ?sub2; cim:IdentifiedObject.name ?lineName.
  ?sub1 a cim:Substation; cim:Substation.Region ?reg1; cim:IdentifiedObject.name ?sub1Name.
  ?sub2 a cim:Substation; cim:Substation.Region ?reg2; cim:IdentifiedObject.name ?sub2Name.
  ?reg1 a cim:SubGeographicalRegion; cim:IdentifiedObject.name ?reg1Name.
  ?reg2 a cim:SubGeographicalRegion; cim:IdentifiedObject.name ?reg2Name.
  filter(?reg1Name<?reg2Name)
}

We want it for 2 specific regions, which simplifies the query. Eg here it is for NO1-NO5 (saved query Q4):

PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
PREFIX cimr: <https://cim.ucaiug.io/rules#>
select ?sub1Name ?lineName ?sub2Name {
  ?reg1 a cim:SubGeographicalRegion; cim:IdentifiedObject.name "NO1 SGR".
  ?reg2 a cim:SubGeographicalRegion; cim:IdentifiedObject.name "NO5 SGR".
  ?sub1 a cim:Substation; cim:Substation.Region ?reg1; cim:IdentifiedObject.name ?sub1Name.
  ?sub2 a cim:Substation; cim:Substation.Region ?reg2; cim:IdentifiedObject.name ?sub2Name.
  ?line a cim:Line; cimr:connectedThroughPart ?sub1, ?sub2; cim:IdentifiedObject.name ?lineName.
}

Note: The chatbot currently has trouble identifying the lines because it tries to use the cim:Line.Region property to identify the two regions. This has been addressed with custom instructions, but it indicates an issue with the ontology and a lack of clarity about how to use the corresponding Line property.
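
To see why cim:Line.Region alone cannot answer the question, the single region recorded on each cross-region line can be listed next to the regions of the substations it connects. This is a sketch derived from saved query Q4-general, not a saved query itself:

```sparql
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX cimr: <https://cim.ucaiug.io/rules#>
select ?lineName ?lineRegName ?reg1Name ?reg2Name {
  ?line a cim:Line; cim:IdentifiedObject.name ?lineName;
    cim:Line.Region/cim:IdentifiedObject.name ?lineRegName;  # the line's own region
    cimr:connectedThroughPart ?sub1, ?sub2.
  ?sub1 a cim:Substation; cim:Substation.Region/cim:IdentifiedObject.name ?reg1Name.
  ?sub2 a cim:Substation; cim:Substation.Region/cim:IdentifiedObject.name ?reg2Name.
  filter(?reg1Name < ?reg2Name)  # different regions, no permutations
} order by ?lineName
```

Since each line has only one region, ?lineRegName can match at most one of ?reg1Name and ?reg2Name, so the regions have to be taken from the connected substations.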

Q1.5 Measurements

Give me the analogue measurements that exist on AC-line "XYZ".

There are no measurements about AC lines. There are 30 Analog measurements, and they are all about entsoe2:EnergyCongestionZone:

PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
select * {
  [] a cim:Analog;
    cim:IdentifiedObject.name ?measName;
    cim:IdentifiedObject.description ?measDescr;
    cim:Measurement.measurementType ?type; # all are "ThreePhaseActivePower"
    cim:Measurement.PowerSystemResource [sesame:directType ?psrType; cim:IdentifiedObject.name ?psrName]
}

The question can be reformulated like this:

Give me the analogue measurements of congestion zone NO-ELSP-3. Return measurement name, description and type
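
A possible query for the reformulated question, as a sketch: it assumes that the zone's cim:IdentifiedObject.name is exactly "NO-ELSP-3", and it identifies the zone by name only rather than by its entsoe2: class:

```sparql
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
select ?measName ?measDescr ?type {
  [] a cim:Analog;
    cim:IdentifiedObject.name ?measName;
    cim:IdentifiedObject.description ?measDescr;
    cim:Measurement.measurementType ?type;
    # assumption: the congestion zone is named exactly "NO-ELSP-3"
    cim:Measurement.PowerSystemResource [cim:IdentifiedObject.name "NO-ELSP-3"]
} order by ?measName
```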

Patch to Add entsoe2 Terms

Note: for each zone, the above query returns two direct types: entsoe2:EnergyCongestionZone and cim:PowerSystemResource. The latter comes from rdfs:range reasoning for prop cim:Measurement.PowerSystemResource: that class is inferred, but not through subclass reasoning, so sesame:directType cannot tell that it is the less specific of the two. When we add the proper parent class (entsoe2:EnergyCongestionZone rdfs:subClassOf cim:PowerSystemResource), this problem disappears.

entsoe2:EnergyCongestionZone is not described in the ontology, so we add it to crmex: for the LLM to be able to use it:

entsoe2:EnergyCongestionZone a rdfs:Class;
  rdfs:label "EnergyCongestionZone"@en;
  rdfs:comment "Energy congestion zone"@en;
  rdfs:subClassOf cim:PowerSystemResource;
  rdfs:isDefinedBy crmex: .

entsoe2:EnergyCongestionZone.netDCInterchange a rdf:Property;
  rdfs:label "netDCInterchange"@en;
  rdfs:comment "Net DC interchange"@en;
  rdfs:domain entsoe2:EnergyCongestionZone;
  cims:dataType cim:Float;
  rdfs:isDefinedBy crmex: .

entsoe2:EnergyCongestionZone.netACInterchangeTolerance a rdf:Property;
  rdfs:label "netACInterchangeTolerance"@en;
  rdfs:comment "Net AC interchange tolerance"@en;
  rdfs:domain entsoe2:EnergyCongestionZone;
  cims:dataType cim:Float;
  rdfs:isDefinedBy crmex: .

entsoe2:EnergyCongestionZone.netACInterchange a rdf:Property;
  rdfs:label "netACInterchange"@en;
  rdfs:comment "Net AC interchange"@en;
  rdfs:domain entsoe2:EnergyCongestionZone;
  cims:dataType cim:Float;
  rdfs:isDefinedBy crmex: .

There are other props and classes that relate to it, but we don't bother to define them, eg:

  • entsoe2:EnergySchedulingArea.EnergyCongestionZone
  • pti:EnergySchedulingArea.ControlArea
  • pti:EnergyCongestionZone.marketCode
  • pti:ControlAreaGeneratingUnit.EnergyCongestionZone
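
To check the effect of the rdfs:subClassOf statement in the patch, the direct types of all measured resources can be counted. This is a sketch using only properties from the Q1.5 query; once the patch is loaded, cim:PowerSystemResource should no longer appear as a direct type:

```sparql
PREFIX cim: <http://iec.ch/TC57/2013/CIM-schema-cim16#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
# Count measured resources per direct (most specific) type
select ?psrType (count(distinct ?psr) as ?resources) {
  [] a cim:Analog; cim:Measurement.PowerSystemResource ?psr.
  ?psr sesame:directType ?psrType
} group by ?psrType
```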