Summary Learnings, Gaps, Enhancements

Problem is due to missing data in our original artifacts as received

(Table columns: Pertains to | Specific Problem | Long Term / Short Term / Will Not Address | Comments [& Owner])

Build Info

[GrammaTech] No information on build tools or build process

  • Compiler and linker vendor and version

  • Makefiles or equivalent build scripting

  • When were build steps done, and by whom

Software.SADL includes
Hazard

[JHU-APL] No hazard analysis provided

[GrammaTech] No HAZARD information provided

[STR] RACK should contain information about:

  • Hazard enumeration

  • Hazard mitigation – both system and extra-system

  • Requirements used to support mitigation

[Adelard] E.g. No hazard analysis

Open Example [LM-ATL] A broadly accepted surrogate (w/ justification for acceptability) usable without ITAR restrictions would be extremely valuable
Processes

[JHU-APL] No operational data provided

[JHU-APL] Would be optimal to have input/output to actual certification process for this system (what was delivered, comments from certifiers, etc.)

[RTX] Evidence about processes

Processes

[JHU-APL] No process data provided (e.g., version control history; internal process for documentation; issue tracker data)

[RTX] Reviews, checklists, compliance analysis, cost, quality metrics, and versions (how many versions of requirements)

[RTX] There is no data about DAL, algorithms, hardware platform, test procedures, or reviews (code reviews, requirements compliance, code conformance with the standard)

Requirements [STR] (Derived) Requirements should discuss pre/post conditions
Requirements

[STR] Requirements model details

  • Missing (derived) functional requirements and relationships (derives, satisfies, etc.)

  • Missing mappings of requirements to software components

  • Additional work needed to parse requirements into conditions (e.g., CLEAR specs)

  • Missing mapping of requirements to functional pre and post conditions (contracts)

[RTX] The Nav System was not tied to any requirement or processor, and all the requirements are tied to the processor

  • It needs to be defined whether the requirements are tied to the system or to the subsystems

Requirements [Adelard] Gaps in LLR level
Simulation [LM-ATL] No access to a configurable simulator, in source or as binary executable
System

[LM-ATL] Insufficiently described system; all performers should have access to an authoritative ontological description of the system and its target domain

[RTX] There is no data about architecture or partitions

[Adelard] Gaps at the architectural definition level

System

[STR] Component hierarchy: system components only provided at a high level (need detailed subsystem components, interfaces, and functions)

  1. Top-level requirements (Natural Language) are ok but not machine reasonable

  2. Need architectural blocks and connections (from architecture diagrams) & system activity diagrams (SysML)

  3. Need component contracts (from derived requirements) with reference to pre and post condition logic

Tests [LM-ATL] Lack of reproducibility; resolvable through access to the test suite & testing environment
Tests

[LM-ATL] Test data lacks coverage/completeness information

[RTX] There are no test coverage metrics in the data.

[Adelard] Gaps in coverage analysis

Tests [LM-ATL] Test suite outputs largely define success, i.e., variable A should have value a. Definitions of failure are needed as well, e.g., a+0.5 would be acceptable, but a deviation greater than 0.5 would not.
Tests

[STR] Testing evidence

  • Only unit test driver and log files provided

  • Need integration test data for precondition CPTs

Tests [STR] Tests should refer to conditions, e.g., which pre-conditions are true for a test and which post-conditions are implied by a successful test
Traceability

[RTX] Current data is not grouped by subsystems; there is an implicit assumption that the RACK holds only one subsystem. For example, if a requirement is entered in the RACK, there is no way to know which system/subsystem it belongs to, so it is hard for us to judge which subsystem has the untraceable requirements. We thought the governs connection was used for that, but this connection is not used consistently across all TAs. The same situation happens with other entities, for example tests: if a test is entered and is not connected to anything (which can happen), we cannot know which system/subsystem it belongs to.

[RTX] There are untraceable entities, such as requirements, tests, and software components.

  • Traceability is key in the certification process. There should be rules on ingestion that ensure that ingested data is always related to corresponding entities (a query sketch for detecting violations of such rules follows below)
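As an illustration of the kind of ingestion rule check being asked for here, below is a minimal SPARQL sketch that lists requirements not connected to anything via governs or verifies. The prefixes and property names are assumptions modeled on the core ontology discussed elsewhere on this page and would need to be adjusted to the actual RACK namespaces.

```sparql
# Hypothetical prefixes -- adjust to the actual RACK core ontology namespaces.
PREFIX provs: <http://arcos.rack/PROV-S#>
PREFIX req:   <http://arcos.rack/REQUIREMENTS#>
PREFIX tst:   <http://arcos.rack/TESTING#>

# Requirements that govern no entity and are verified by no test.
SELECT ?reqId WHERE {
  ?r a req:REQUIREMENT ;
     provs:identifier ?reqId .
  FILTER NOT EXISTS { ?r req:governs ?anything }              # no link up to a system/entity
  FILTER NOT EXISTS { ?t a tst:TEST . ?t tst:verifies ?r }    # no test verifies it
}
ORDER BY ?reqId
```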

[STR] Overall, once we have the required pieces of evidence we must also ensure traceability from test results back to requirements to components!

  • Not all of the required links seem present

  • (From last slide on RACK queries)

    1. Top-level Requirements of Target-System

    2. Derived Requirements of Requirement

    3. Components of System

    4. Requirements implemented-by component

    5. Requirement tested-by Test

    6. Test part-of Test-plan

    7. Assumptions of Requirement

    8. Guarantees of Requirement

    9. Pre-conditions of Test

    10. Results of Test

[Adelard] Gaps in traceability

Suggested things that TA2 can do to help identify (or fill in???) the data gaps

(Table columns: Pertains to | Specific Problem | Long Term / Short Term / Will Not Address | Comments [& Owner])

Check for forbidden characters [RTX] There are software components with characters that are not ingested by the RACK correctly. It would be good to define forbidden characters like *, …, /. Identifiers are allowed to have spaces.

Primarily training.

Can we make it easier to share importSpec transforms & rules across nodegroups?

I could use more clarity on which/why characters are not allowed (semtk should accept ASCII) and why the current ingestion template validation and transformation tools are not sufficient. Is this a training issue or is there a feature we can add? -Paul

Get more info. No RACK issue yet.

Data Audit

[STR] Responsibility for RACK content quality should rest on all TAs

  • TA1 for populating needed content

  • TA3 for helping to define needed content and for consuming it

  • TA2 for supporting content capture in the RACK (e.g., common ontology) and for performing basic RACK access queries as a means of assuring that the RACK repository contents are fit for program purpose

[RTX] TA1s/TA2 know how the data is ingested, but it is up to the TA3s to query the data. For that, TA3 needs to learn the data structure in detail. It would be more efficient if the TA1s could provide nodegroups to query their data.

[RTX] Some queries (e.g., untraceable requirements) show that more evidence is required. TA3s can point out missing evidence. However, there is no process that establishes what to do with that. For example, RTX found there are untraceable requirements; who should resolve this problem?

TA2 could build a unified ingestion script that loads provided data packages and runs a set of unit tests to help check that everyone is working from the same view of the combined data. - Eric
Data Audit

[STR] We should define a set of basic queries and TA2 should run and audit the results of these queries against the RACK as part of a regular QA process

  • By audit I mean confirm population of various types in the RACK relevant to the current Phase 2 system under analysis, e.g., for subsystem XXX there should be 42 requirements

  • Check traceability, e.g., requirement XXX has 3 associated test cases

  • Check harmonization between TA1 sources, e.g., requirement XXX from TA1 provider A traces to test case YYY from TA1 provider B (a minimal sketch of such an audit query follows below)
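One minimal form of such an audit is a per-class instance count that can be compared against the expected numbers (e.g., the 42 requirements expected for subsystem XXX). A plain-SPARQL sketch, assuming the PROV-S identifier property used elsewhere on this page:

```sparql
# Hypothetical prefix -- adjust to the actual RACK namespaces.
PREFIX provs: <http://arcos.rack/PROV-S#>

# How many instances of each class are populated in the data graph?
SELECT ?class (COUNT(?thing) AS ?instances) WHERE {
  ?thing a ?class ;
         provs:identifier ?id .   # restrict to RACK THINGs carrying an identifier
}
GROUP BY ?class
ORDER BY DESC(?instances)
```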

[RTX] There should be rules for entering the data. For example, if a source code file is entered, it needs to be connected to a software component and a development activity.

This (and more) would be addressed by our idea of putting more (cardinality) constraints in the model and then building a tool to check all the constraints and report. -Paul

https://github.com/ge-high-assurance/RACK/issues/581 (build report generator)

https://github.com/ge-high-assurance/RACK/issues/582 (report violations of cardinality constraints)

RACK queries

[STR] Suggested RACK queries

Overall, once we have the required pieces of evidence we must also ensure traceability from test results back to requirements to components!

  • Not all of the required links seem present

  • (From last slide on RACK queries; a SPARQL sketch of queries 5 and 10 follows the list)

  1. Top-level Requirements of Target-System

  2. Derived Requirements of Requirement

  3. Components of System

  4. Requirements implemented-by component

  5. Requirement tested-by Test

  6. Test part-of Test-plan

  7. Assumptions of Requirement

  8. Guarantees of Requirement

  9. Pre-conditions of Test

  10. Results of Test
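As a concrete illustration of queries 5 and 10 above, here is a minimal SPARQL sketch joining requirements to the tests that verify them and to any recorded results. The prefixes, the TEST_RESULT class, and the verifies/confirms/result property names are assumptions and would need to be aligned with the actual RACK ontology.

```sparql
# Hypothetical prefixes and names -- adjust to the actual RACK ontology.
PREFIX provs: <http://arcos.rack/PROV-S#>
PREFIX req:   <http://arcos.rack/REQUIREMENTS#>
PREFIX tst:   <http://arcos.rack/TESTING#>

# Which tests verify each requirement, and with what result?
SELECT ?reqId ?testId ?status WHERE {
  ?r a req:REQUIREMENT ; provs:identifier ?reqId .
  ?t a tst:TEST ; provs:identifier ?testId ;
     tst:verifies ?r .
  OPTIONAL {                         # results may not have been ingested yet
    ?res a tst:TEST_RESULT ;
         tst:confirms ?t ;
         tst:result ?status .
  }
}
ORDER BY ?reqId ?testId
```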

[RTX] There is a need to define queries that ensure that the ingested data is correct and complete. I know that there is a way to do this in the RACK; however, the query definitions should be worked out among all TAs.

[SRI] With heavily connected relationships among object instances of multiple classes, it becomes very difficult to keep track of whether we are using proper foreign key identifiers in a CSV file. We came up with an identifier naming structure/scheme to manage these and to help in manual reviews of the evidence being ingested.

  1. It would be helpful to have some automated checks to verify we have connected the entities and activities properly – e.g., requirements for component1 should point to the identifier for component1 and component2. Manual reviews can be exhausting. An example of a check would be to detect that no requirement is pointing to component2; this would indicate errors in the connections.

  2. If there is no object declared for an identifier used in a foreign key, there should be an automatic ingestion error reported. Not sure if RACK check and ASSIST do this. (A query sketch for detecting such dangling references follows this list.)
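A minimal SPARQL sketch of the second kind of check, flagging nodes that are referenced as the object of some relationship but were never declared themselves (i.e., likely dangling foreign keys in the CSVs). The identifier property and prefixes are assumptions, and real data would probably need extra filters for enumeration values and other ontology-level IRIs.

```sparql
# Hypothetical prefixes -- adjust to the actual RACK namespaces.
PREFIX provs: <http://arcos.rack/PROV-S#>
PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Objects of data relationships that carry no identifier of their own.
SELECT DISTINCT ?subjectId ?property ?danglingNode WHERE {
  ?s ?property ?danglingNode ;
     provs:identifier ?subjectId .
  FILTER(isIRI(?danglingNode))
  FILTER(?property != rdf:type)
  FILTER NOT EXISTS { ?danglingNode provs:identifier ?anyId }
}
```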

[STR] This seems like a solid short-term priority. It might be a use case for nodegroup transformation wizards, i.e., define a nodegroup that matches a legal/complete version of the 10 items listed. Have a wizard/transformation that runs query(ies) to find all graph patterns that do NOT match, hopefully clearly showing what is wrong. -Paul

[RTX] [SRIa] These can be accomplished with our proposed constraint checker and by building “stock” queries.

[SRIb] This functionality exists and we use it widely

[Paul] The comment about foreign keys is disturbing. I think RACK issues 581, 582 go a long way. Do we need a new issue to build a specific report?

Left off from above because they are either questionable or not as generally applicable

(Table columns: Pertains to | Specific Problem | Long Term / Short Term / Will Not Address | Comments [& Owner])

[RTX] Evaluate relevance of the data:

  • GrammaTech provides a very low abstraction level, resulting in approximately 66K software components. It needs to be re-evaluated whether that number of components provides relevant evidence, i.e., whether they are connected to tests, coverage metrics, reviews, partitions, etc. If there is no added value, the number of components only makes RACK ingestion and queries slower.

  • Lockheed Martin has test names that are not tied to any test in the RACK.

[RTX] Before ingesting data into the RACK, it should be evaluated in a council whether the data provides usable evidence and whether the purpose of this evidence is clear.

What is the Vision/Purpose of RACK

[RTX] What is the vision of the RACK? Is the RACK the source of truth, an evidence broker, or an index of the evidence? Our vision of the RACK is:

  • RACK contains evidence about a system and its development process to reason about the level of assurance of a system

  • RACK has entities about the system and its process so that it is possible to connect the entities to the evidence (e.g., RACK has 10 tests and an analysis output for each test that ensures that the test covers the requirement to which it is traced)

  • RACK has the link to the real artifacts

  • RACK has abstracted enough information in the evidence that high level assurance reasoning can be done without analyzing the artifacts

[Adelard] What is the purpose of the RACK?

  • The source of all knowledge

  • The consolidation of various information sources (an index) but not the source of truth

  • A broker for accessing assurance evidence / assurance services

  • The semantics of the RACK ontology and the RACK API depend on the answer to this question.

[STR] Binary level software component information not relevant to ARBITER case
[STR] Gaps in alignment of different TA1 data sets (e.g., reference to common requirements)
Processes [JHU-APL] Gaps in knowledge where details are redacted from documents
Security Related

[STR]

  • System security requirements (NL)

  • Environmental security plan (audit, patching, surveillance, rapid response)

  • Threat model (NL) → Threat activity diagram (causal model of how the attack fails)

    • Unanticipated paths through the system

  • Process activities for security that should be documented

    • Enumerating risks

    • Enumerating mitigation for each risk

    • Apply mitigation (e.g., running analysis tools, adding to the design)

  • Process leads to requirements that mitigate threats to ensure meeting some security requirement (traceability)?

Problem is due to errors or gaps in TA1 evidence generation, e.g. due to tool limitations

(Table columns: Pertains to | Specific Problem | Long Term / Short Term / Will Not Address | Comments [& Owner])

[GrammaTech] Lack of traceability from harness executables in TEST_EXECUTION.csv to COMPONENT.csv

  • TEST_EXECUTION is probably not the right entity for what we are implementing

  • We expect the most value to come from reporting the set of COMPONENTs which have been executed in conjunction with the coverage (statement, branch, etc.) associated with that set of components.

  • If appropriate, we could introduce a MODEL subclass or some other derived entity type which clusters the set of COMPONENTs covered by our tests.

  • Concrete solutions will require further discussion and feedback on the needs of TA3.

Short Term: Brainstorming session between GT and TA3 teams – Lucja & Honeywell & others
[GrammaTech] Missing dataInsertedBy. It is possible to do this in the nodegroup, but should there be a preference for having it as part of the source input such as the CSV file? It cannot be done as part of the ingestion, as that would result in every version of RACK having a different “dataInsertedBy”. The Scraping Tool Kit will automatically create this relationship; otherwise it should be captured when generating the CSV. Note: it may be possible to hardcode it as part of a custom nodegroup, but then the nodegroup loses utility.
[GrammaTech] Missing generatedAtTime. Per Eric Mertens, this can’t be done in the nodegroup because it generates an explosion of slightly-different time values. It is also not clear what generatedAtTime should represent; see a separate discussion in Section C.1 of this document. SemTK has a rarely-used feature that might be helpful: the ingestion templates support Text of %ingestTime or %ingestEpoch. We ought to use this or be able to find a short-term solution -Paul

[GrammaTech] SPECIFICATION.dateOfIssue – is being populated with file date (Unix modtime of the PDF) because we can’t reliably extract the values from the title page

  • We propose to remove this misleading information and leave the field blank pending improvements in our extraction tools.

[GrammaTech] SPECIFICATION.wasAttributedTo → AGENT.identifier – is being populated with the author information in the PDF document metadata because we can’t OCR the signatures

  • We propose to remove this misleading information and leave the field blank pending improvements in our extraction tools.

[GrammaTech] TOOL entities aren’t being used at present
[RTX] The Lockheed Martin ontology uses strings to connect components. The connections should be part of the ontology.
Confidence

[JHU-APL] Unable to determine confidence of a particular piece of evidence

  • Confidence for new types of evidence,

  • How to interpret new coverage metrics

Requirements

[JHU-APL] Have to parse lengthy text blobs of requirements (need machine-readable)

[Adelard] E.g. poor text extraction of requirements, failure to link up requirement ids, noise generated by ingestion, etc.

Problem is due to Data Model limitations

(Table columns: Pertains to | Specific Problem | Long Term / Short Term / Will Not Address | Comments [& Owner])

EACF [JHU-APL] Have to reconstruct the EACF representation from the low-level RACK ontology mapping; would prefer to have the original EACF. Action: Need the team raising the issue to provide an explicit data model and concrete sample data for EACF.
PROV data model [LM-ATL] Data models should incorporate widely accepted and well-documented data (meta) models avoiding “reinvention of the wheel.” For example, consider UTP2 for testing (https://www.omg.org/spec/UTP2/2.1/About-UTP2/), or SAE’s EIA649 for configuration management; also, SACM for representing structured arguments. SACM includes an Artifact Metamodel suited for assurance processes and for representing evidence. (From the spec: “The SACM Artifact Metamodel defines a catalog of elements for constructing and interchanging packages of evidence that communicate how the evidence was collected”).

Will not address. RACK data model is already based on the two most widely accepted and well-documented data models: E-R model and PROV-W3C model. Assurance case domain model built in RACK leverages these to provide a complete model for assurance case evidence.

Discussed on 11/29/2021 and closed.

PROV data model

[Adelard] In general, I don’t care about who inserted something into the RACK or when it was inserted. Hence, I’m not convinced that the time / origin “meta attributes” are useful - they just clutter up the ontology. What matters are the dependency relationships, not the precise timings of who did what when.

Also, dataInsertedBy is recursive. How is the recursion resolved? What was the activity that inserted the activity that inserted … that inserted the data?

There is no consistent naming convention for attributes - impactedBy vs. wasImpactedBy.

THING

  • Every entity is assigned a GUID and a type by SemTK

  • Adding an identifier, description, and title seems excessive - title is not needed

  • Identifiers don’t generally include spaces, but the Boeing identifiers apparently did

  • Identifiers need to be unique per data ingestion but not necessarily unique per THING

ENTITY

  • Although there are semantic differences between the derivedFrom, revisionOf, impactedBy relationships, from a structural perspective, these are different names for the same relationship and clutter up the core ontology unnecessarily - a single structural relationship (derivedFrom) should be sufficient.

ACTIVITY

  • In principle, every ACTIVITY should have a purpose, follow a standard, be part of a plan, etc. Should these properties be captured at this level or should they only be defined for specific kinds of ACTIVITY? See later comments about REVIEW vs ANALYSIS.

Will not address. No missing functionality or other functional concern.

Will not address. The "data" in dataInsertedBy refers to data, not metadata. The provenance of data should be captured, but the provenance of metadata should not.

Will not address. No missing functionality or other functional concern.

Will not address. No missing functionality seen.

Will not address. Semantics often show up in databases in the form of informative attribute or relationship names. Removing those often leads to more confusion than improvement.

Detailed discussion on this at the all TA call on 10/12/2021; we will add some more documentation about it. We may add a relationship from ACTIVITY to a plan.

Short term: ACTIVITY and Plan, working on a new idea to ingest information from Apache PSAQ and SDP – Dan & Kit & Greg K.

PROV data model

[SRI] Overall: The RACK ontology was based on a generic provenance ontology that supports all sorts of applications of the base classes (ACTIVITY, ENTITY, etc.). Therefore, there are many attributes (properties) and relationships in the classes that are not useful for assurance evidence and that also result in multiple possible transitive relationships across ACTIVITIES and ENTITIES, which can cause confusion and/or inconsistencies.

Our recommendation is to normalize the relationships (and remove/disable redundant ones) so that there are minimum specific transitive paths to traverse in the graph with specific semantics attached to them. Additionally, we should have some rules/guidelines regarding how to populate (which relationships) and traverse the evidence.

(Note: In our TA1 evidence, we populated many such “redundant” relationships because we didn’t know what TA3s were going to use and how they were going to traverse the graph.)

  • Example: The ACTIVITY class has a wasInformedBy relationship to another ACTIVITY. However, in an assurance flow the entities control the relationships. E.g., an ACTIVITY ‘a1’ generates some ENTITY ‘e1’ that is then used by another ACTIVITY ‘a2’, so any relationship between ‘a1’ and ‘a2’ can be inferred transitively via ‘e1’ and there is no need for a wasInformedBy reference from ‘a2’ to ‘a1’.

  • Example: In ENTITY, the wasAttributedTo relationship to an AGENT is not needed (and confusing) in an assurance flow. An ENTITY is generated by an ACTIVITY that is associated with an agent, so it is redundant to have the same relationship from an ENTITY to an agent.

  • Example: ANALYSIS_OUTPUT has a relationship analyzes to an ENTITY that is redundant with another path through producedBy to an ACTIVITY and from there used to an ENTITY.

Some generic relationships in base classes (e.g., wasInformedBy, wasDerivedFrom) are specialized in subclasses that we use in our evidence, because specific semantics are not provided for the generic relationships. The specialized relationships have specific semantics. But RACK also allows the generic relationship to be used from a subclass. It should be possible to somehow disable the generic relationships from a subclass so that TA3s don’t expect to find anything there.

  • We need to properly define the specific purpose and semantics of ANALYSIS and related classes. What we used in the SRI overlay was that the purpose of ANALYSIS is to prove a GenericProperty or a SpecificProperty – i.e., that the property holds over its propertyScope.

  • We are not sure what the contents should be for title and description inherited from THING. The title does not make sense except for people’s names, etc. We should agree upon common criteria for description that are relevant for specific classes.

Will not address: TA2’s approach from the beginning was to have a very general core ontology that can be used on other programs as well as in ARCOS. Having a property in the ontology does not mean that it needs to be populated, and if there is no data for a property then it will not be part of the query results.
[LM-ATL] RACK does not offer fine-grained curation. That is, tokens in strings do not map to a common ontology or terminology, which could be used to create linkages. Will not address. Linkages should be captured explicitly in relationships, NEVER implicitly by string matching.
Versioning [RTX] It is not clear to me how to model life cycle data, for example, the versions of one requirement (not about timing). I’m not sure if that should be in the RACK, or if there is a tool that can provide evidence about the variation of a requirement over time.

Will not address. All Entities have a wasRevisionOf property. The intent is that each version would be captured as a unique entity (e.g. R-1v1, R1v2). The Turnstile example is being updated to show an example of this; however, no Boeing data exists.

Each ENTITY is a unique piece of evidence.
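A minimal SPARQL sketch of how such a version chain could be traversed once each version is ingested as its own ENTITY, using the wasRevisionOf property mentioned above. The prefix and the example identifier are assumptions.

```sparql
# Hypothetical prefix -- adjust to the actual RACK namespaces.
PREFIX provs: <http://arcos.rack/PROV-S#>

# All earlier versions of a given requirement, following wasRevisionOf transitively.
SELECT ?olderId WHERE {
  ?latest provs:identifier "R-1v2" ;      # hypothetical identifier of the newest version
          provs:wasRevisionOf+ ?older .
  ?older  provs:identifier ?olderId .
}
```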

[STR] What is scheme to trace from an ARBITER assurance case back to evidence in the RACK?

  • How do we reference evidence in RACK? UID, pretty-print object, query result?

  • Do we reference the document (section, page) that the RACK data was extracted from?

Will not address. Every THING in RACK has a unique, queryable identifier that can be inserted into an assurance case for traceability.

But for clarity, will the final assurance cases point to RACK, or use RACK to build an independent assurance case? Do the identifiers need to primarily be meaningful in RACK or to the source data? – Eric

ACTIVITY [SRI] Tool invocation by activities: Often, an activity invokes a tool in a specific context, passing specific parameters and options to the tool. This needs to be captured independently of the activity. We added a ToolInvocationInstance class and a relationship to it from each subclass of activity. Short term: We are updating the ontology in V9.0 to provide such structure. See branch analysis-enhancement and RACK PR #540. Has been pushed into the master branch.
Ambiguity [GrammaTech] Ambiguity for whether TEST applies only to original artifacts, or equally to tests generated by tools like GrammaTech’s A-CERT during the software assurance process.

Will not address: We can capture the source of TESTS with TEST_DEVELOPMENT activities. We can capture released groups of tests with BASELINEs. Between these two groupings we ought to be able to distinguish the two categories. – Eric

Test should apply equally to all tests. Overlays should be used to make distinction between different types of tests.

Ambiguity

[JHU-APL] Unable to determine whether evidence is not present for a specific reason, e.g., how to tell the difference between:

  • There is no tool that is capable of collecting the data / ingesting it into RACK (data will never be present until an upgrade or similar)

  • Tool did not run to completion (if the tool is too resource-heavy, etc.)

  • Evidence is not present with no underlying reason (e.g., neglect or omission)

Will not address: Tool runs that failed are already representable in RACK. Key attributes or relationships are already marked in RACK if they are required to be present. For other data, anything missing was simply not inserted by the data provider. The database cannot conjecture about why.
Ambiguity in Identifier, etc.

[GrammaTech] General ambiguity of properties

  • When something has a UUID, an identifier, a description, and a title, what goes where?

  • In general, if a property is simply of type ENTITY it’s not very clear what it is supposed to point to (e.g., SWCOMPONENT.instantiates; SWCOMPONENT.wasImpactedBy; TEST.verifies)

[RTX] The difference between identifier, title, and description is not clear. Every performer is using them differently.

[GE] Is this question coming because of the core ontology or one of the overlays? Possible suggestion (?): in the assurance case, use the actual artifact name and not the RACK identifier. Establish some universal best practice for what to put in the title, description, etc., so that there’s consistency. Include version somewhere.

Will not address: Identifiers are used to reference/link data. Titles are short descriptions suitable for use in a list. Descriptions are the free-form, unstructured prose describing something in detail.

Identifiers should be treated as opaque linking tokens. Titles and descriptions should be treated as human-readable opaque content. Any extracted semantics should be broken out into semantic properties on objects. – Eric

ANALYSIS [RTX] Analysis is a key component of our approach. It is important to define what claim the analysis provides. For example, “REQUIREMENT_CONSISTENCY_ANALYSIS” analyzes whether the requirements are consistent. Short-term: We are updating the ontology in V9.0 to provide such structure. See branch analysis-enhancement (especially sadl-examples) and RACK PR #540. Has been pushed into the master branch.
ANALYSIS

[Adelard]

Unlike REVIEW, an ANALYSIS is not governedBy a REQUIREMENT or SPECIFICATION.

What is the purpose of the ANALYSIS? What property is being analysed and what can be concluded from the results of the analysis?

It might be helpful to link the ANALYSIS to an OBJECTIVE from a standard such as DO-178C.

ANALYSIS_RESULT (Passed, Failed, Indeterminate) is too simplistic - the purpose of an analysis might be to calculate a value or verify a property.

Given the variety of different analyses that are possible, properties like result and metric should be associated with the MODEL for a particular ANALYSIS.

The ANALYSIS activity should be linked to the tool that was used to perform the analysis, any tool-specific or analysis-specific configuration, including the purpose of the analysis, and the ENTITY that was analysed.

The ANALYSIS_OUTPUT should record the result of the analysis without any pre-judgment about whether the analysis passed or failed - the outcome of the analysis should be captured by an ANALYSIS_MODEL.

The purpose of ANALYSIS_ANNOTATION and ANALYSIS_ANNOTATION_TYPE is not clear - they seem to be associated with a particular kind of analysis and therefore don’t belong in the core ontology.

Short term: ACTIVITY and Plan, working on a new idea to ingest information from Apache PSAQ and SDP – Dan & Kit & Greg K.

Short term: We are updating the ontology in V9.0 to provide such structure. See branch analysis-enhancement and RACK PR #540 Has been pushed into master branch.

DOCUMENT vs. COLLECTION

[GrammaTech] Ambiguity for when to use COLLECTION in preference to many-to-one edge sets.

  • Is there a benefit from ontological representation of sets versus structural representation, and if so where should it be enforced?

  • Compare DOCUMENT, SECTION, and SRI SystemRequirementSet etc.

Will not address: Things should only be modeled as a collection when that collection has some externally recognizable identity. Prefer many-to-one edge sets when there are many things that happen to be associated in the same way. Raw (unsubtyped) COLLECTIONs ought to be quite rare – Eric
COLLECTION [SRI] The COLLECTION and MODEL classes were quite useful – we had several subtypes of these in the SRI overlay. We may want to add some attributes to indicate COLLECTION semantics; i.e., what is the purpose of the collection? What does it mean for the entities to be in a collection – do they form parts of a whole? Can an entity participate in more than one collection? Will not address: Please add needed attributes in your overlay and as necessary they can be promoted to core ontology. Entities are “members of” a collection. An entity can be in multiple collections.
DOCUMENT

[Adelard] The various properties of DOCUMENT are too prescriptive, particularly DOC_STATUS. Each organization will have its own set of document attributes and document types. Hence, DOCUMENT should just be a placeholder in the core ontology that can be extended as necessary.

DOCUMENT should have a URL attribute that can be used to identify / access the document

Will not address: Please add needed attributes in your overlay and as necessary they can be promoted to core ontology.

Short-term: v9.0 will have an URL for a document. RACK Issue #578.

ENTITY

[GrammaTech] ENTITY.generatedAtTime could benefit from clarification

  • When an artifact is created by ARCOS tooling, it is not clear whether generatedAtTime should refer to the time that the content was actually generated (for example, when the source code for a TEST was created), the time a tool processed the content (the TEST.csv file was created) or the time the entity was entered into the RACK (ingestion time). The latter seems most compatible with the description: “the time this entity was created and available for use,” but other interpretations are possible.

  • What time to assign to artifacts that are supplied as inputs to ARCOS tools, e.g., documentation, source code, object code files?

Will not address: generatedAtTime and the other descriptive properties on an ENTITY are about the thing being modeled and not the model. In the case of a TEST object, the generatedAtTime would be the time the test itself was created. All the information about the source of the modeling information goes into the dataInsertedBy activity which tells when the data was modeled. – Eric
FILE

[GrammaTech] FILE entity type ambiguous in at least two ways:

  • Are all files (PDF, source code, binary, test input) intended to be covered by the common entity type, or is there an expectation that it should be subclassed?

  • Unclear whether FILE.filename should include just a basename, a full path, etc. and if path is included, what is it relative to? (Should there be some defined filesystem structure being referenced?)

Will not address: All files are covered by the common entity type. A FORMAT attribute is provided.

Filenames are purely informative. There’s no meaningful absolute paths that we could capture, and the same file might be stored under many filenames. If we want to model a directory structure or an archive structure or something like that I think we’d need to do that on top of the current FILE representation.

FILE [LM-ATL] How to submit a literal computer file (not the FILE concept in the ontology) for storage in the RACK? Describing the file, via the ontology, is not (and will never be) entirely sufficient to describe its contents, so we must also be able to include literal files. What is the authoritatively correct procedure for doing this? Will not address: RACK doesn’t store files.
FILE

[Adelard] I am not sure if it is useful to distinguish between FILE and DOCUMENT or whether FILE should be considered to be a kind of DOCUMENT.

The satisfies property of FILE does not belong in the core ontology.

Short-term: In v9.0 we will address FILE vs DOCUMENT in documentation. RACK Issue #579.

Will not address. No functional concern identified.

FILE_FORMAT

[GrammaTech] FILE_FORMAT not specified with enough precision for entity resolution. Some options:

  • Provide set of pre-defined identifiers that all performers must use. (Guarantees resolution, limits flexibility to add new items.)

  • Provide set of pre-defined identifiers that may be extended.

  • Provide guidance on a preferred format (for example, “file extension” or “output from Unix `file` command”, type case preference, etc.)

  • ‘C’ versus ‘C source, ASCII text’

  • ‘pdf’ versus ‘PDF’ versus ‘PDF document, version 1.5’

  • ‘binary’ versus ‘ELF’ versus ‘ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, with debug_info, not stripped’

[Adelard] The fileFormat property of a file is described as the “byte-level encoding” of the file. This confuses character encoding (e.g. ASCII, UTF-7) with data format (CSV, XML, etc.)

Short-term: In v9.0 we will add a set of pre-defined identifiers. This set can be extended as needed in overlays. Preferred format is “file extension”, all lowercase. Beginning list:

[c, txt, csv, doc, pdf, exe, log, xls]

RACK Issue #580.

Short-term: In v9.0 we will document about fileFormat. (Better yet, add an example in Turnstile.) RACK issue #580.

HAZARD

[STR] RACK should contain information about:

  • Hazard enumeration

  • Hazard mitigation – both system and extra-system

  • Requirements used to support mitigation

Will not address: see Hazard.SADL and Requirements.SADL.
Missing (DO-178C)

[RTX] A lot of concepts should be based on DO-178C since they are general and supported by the certification community. For example, the definition of test, test cases, test oracles.

[Adelard] The following definitions are taken from DO-178C. Not all of the necessary concepts are modelled in the RACK ontology - it would be worth considering each concept in turn and considering how it might be modelled and whether there are any gaps / omissions from the ontology. Some obvious gaps are highlighted in red below.

  • Software Requirements Data: Software Requirements Data is a definition of the high-level requirements including the derived requirements. This data should include:

  1. Description of the allocation of system requirements to software, with attention to safety-related requirements and potential failure conditions.

  2. Functional and operational requirements under each mode of operation.

  3. Performance criteria, for example, precision and accuracy.

  4. Timing requirements and constraints.

  5. Memory size constraints.

  6. Hardware and software interfaces, for example, protocols, formats, frequency of inputs, and frequency of outputs.

  7. Failure detection and safety monitoring requirements.

  8. Partitioning requirements allocated to software, how the partitioned software components interact with each other, and the software level(s) of each partition.

  • Design Description: The Design Description is a definition of the software architecture and the low-level requirements that will satisfy the high-level requirements. This data should include:

  1. A detailed description of how the software satisfies the specified high-level requirements, including algorithms, data structures, and how software requirements are allocated to processors and tasks.

  2. The description of the software architecture defining the software structure to implement the requirements.

  3. The input/output description, for example, a data dictionary, both internally and externally throughout the software architecture.

  4. The data flow and control flow of the design.

  5. Resource limitations, the strategy for managing each resource and its limitations, the margins, and the method for measuring those margins, for example, timing and memory.

  6. Scheduling procedures and inter-processor/inter-task communication mechanisms, including time-rigid sequencing, preemptive scheduling, Ada rendezvous, and interrupts.

  7. Design methods and details for their implementation, for example, software loading, user-modifiable software, or multiple-version dissimilar software.

  8. Partitioning methods and means of preventing partition breaches.

  9. Descriptions of the software components, whether they are new or previously developed, and, if previously developed, reference to the baseline from which they were taken.

  10. Derived requirements resulting from the software design process.

  11. If the system contains deactivated code, a description of the means to ensure that the code cannot be enabled in the target computer.

  12. Rationale for those design decisions that are traceable to safety-related system requirements.

  • Source code: This data consists of code written in source language(s). The Source Code is used with the compiling, linking, and loading data in the integration process to develop the integrated system or equipment. For each Source Code component, this data should include the software identification, including the name and date of revision and/or version, as applicable.

  • Executable object code: The Executable Object Code consists of a form of code that is directly usable by the processing unit of the target computer and is, therefore, the software that is loaded into the hardware or system.

Action: If there is consensus among the TA performers, some of these properties from overlays could move to the Boeing.sadl overlay. Some properties will move from SRI.sadl to the core ontology (in RACK v9.0). Additional properties can also be added; what we would like for that is:

  • What property

  • Is there a consensus about the property

  • Who will be providing instance data

Missing [RTX] There are missing concepts in the ontology, such as: Architecture, Partition, DAL, Algorithms, Hardware Platform, Test Procedure. Action: Same answer as above.
Objective/Property [SRI] GenericProperty and SpecificProperty: To use formal methods for verification and for property-based assurance in general, we added these two classes to give property a first-class status with relationships such as propertyScope and propertyBasis. Then a property can be analyzed by many activities independently.

Short-term: Plan to have it in v9.0. RACK Issue #575.

Has been pushed into master branch.

Overlay

[RTX] The overlays should be merged. Before that, it should be analyzed whether the information of the overlay can be modeled with the core ontology.

  • We used the visualization tool to understand the data better and then designed the nodegroups to extract it. However, it seems to me that there can be multiple paths that one can take to extract the same data, given the connections in the graph in the SRI overlay. But I believe that, with just the edges connected to different ontology elements, it is not obvious what should be connected together, and in which specific fashion, to extract a specific piece of evidence.

Action: Need team raising the issue to provide an explicit data model proposal and concrete sample data
Over specified

[Adelard] As a general observation, I agree with the intention behind the MODEL abstraction, namely that the core RACK ontology should be structural rather than semantic, but this implies that some entities, attributes and relationships should be deleted from the core model.

https://github.com/ge-high-assurance/RACK/wiki/Using-MODEL

For example, CONFIDENCE does not belong in the core ontology - it’s clearly a semantic model - and even entities like PERSON and TOOL in AGENT are over-specified. Why should a PERSON be identified by an email address or a TOOL by a version? It is sufficient to identify PERSON and TOOL as different kinds of AGENT without being more specific about their attributes.

Similarly, I would question whether HAZARD should include severity and likelihood properties, or whether these should be part of a HAZARD_MODEL

Will not address: The MODEL abstraction causes longer query times by first tracing to the base structure and then selecting a model attached to that structure. MODELs should only be used where there are several known distinct models for a structure, rather than used as a general solution. Are there specific structures where specific MODELs have been identified beyond REQUIREMENTS?
PROCESS

[Adelard] I like the idea of an OBJECTIVE, but I was surprised to find it defined under PROCESS.

"An OBJECTIVE identifies tasks from a process for which evidence must be provided to show that the task has been completed. "

The note refers to “tasks from a process” rather than “activities” - why introduce new terminology unnecessarily?

The note also refers to “evidence that must be provided”, but this isn’t reflected in the definition of OBJECTIVE, which simply links objectives to activities.

Short-term: In V9.0 there will be additional updates to OBJECTIVE. A “note” is just a textual string to try and give some color to the concepts. Somewhat related to this is RACK PR #540 which has been pushed into master branch.
Process of updating [LM-ATL] Data model updates are not agile. Improvements to overlays and data models contributed by performers, e.g. SACM, do not get disseminated to other performers quickly.

Will not address: complex data models need significant review and acceptance by users, much like complex software. A versioned release system lowers risk to the program and gives users time to plan for changes.

Regarding nodegroups, we now have auto-generated nodegroups in V8.0 that performers can utilize, so that should allow you to generate datasets more quickly.

REQUIREMENT [RTX] The concept of system/high/low-level requirements is general, and it is important to differentiate them. Currently there are types specific to Boeing which are not necessary; this makes the nodegroups not reusable. Will not address: This should be defined in the overlay. Generic queries can be made that use the REQUIREMENT class and grab the hierarchy of requirements. The difficulty with this is how many levels of requirements a development has; not all developments have the same number of levels, which is the reason for using the overlay to capture the project-specific details.
REQUIREMENT

[GrammaTech] REQUIREMENT granularity is too coarse

  • By only allowing the requirement granularity and identifiers provided by Boeing (i.e., “SP.NAV:XXX…” or “SubDD 3 XXX…”) it is impossible to codify individual “little-r” requirements within these broad sections of text

  • Consider, for example, SubDD 3.3.2.1.2.2 which is assigned a single identifier (redacted for ITAR reasons). This document section contains 15 separate sentences each of which could be tested separately; even 3.3.2.1.2.2.a contains two separate sentences.

  • By pushing any further breakdown into performer-specific MODELs, no shared representation or identifiers for finer-granularity requirements are possible.

  • GrammaTech specific consequence: In the absence of specific fine-grained identifiers, we cannot describe our test coverage effectively.

[Adelard] System requirements, High-level requirements and Low-level requirements should be distinguished in the core ontology

[SRI] Subclasses and collections of REQUIREMENT: As others have pointed out, it would help if we organized subclasses of REQUIREMENT based upon DO-178C guidance which is quite generic. Currently there is one REQUIREMENT class in RACK. It would be better to have subclasses for system-level, software HLR, and software LLR with specific attributes in each and restricted ranges for the relationships. We ended up making such subclasses in the SRI overlay

Will not address: Granularity of REQUIREMENTs is based on the data provided by TA1s.
REQUIREMENT

[Adelard]

  • It should also be possible to distinguish Safety Requirements at the system level.

  • A concept of a RequirementSet might be useful.

  • Requirements need to have some kind of semantics, otherwise, it is not possible to determine whether the test cases for a requirement are adequate. In effect, requirements need to be testable – a simple text string representation is not good enough.

Will not address: In RACK v8.0, Boeing.sadl already allows for “set of requirements” etc.
REQUIREMENT

[GrammaTech] REQUIREMENT subclassing limits applicability to MIL-STD-498/490 derived standards.

  • If one were to try to apply this to making an assurance case for COTS/FOSS software or to an Agile development model, it might be difficult to force the requirements to align.

  • Suggestion: Assume that the CSID/PIDS/SRS/SubDD breakdown is content rather than structure; represent this either with wasDerivedFrom or add a property REQUIREMENT.level: string that can be applied to any set of assurance artifacts.

Short-term: referred to data model decision team for consideration.
REVIEW

[Adelard] The concept of a REVIEW_LOG is too simplistic:

A review typically results in a set of observations – these might be categorized in some way, for example, as technical vs editorial. Each observation needs to be resolved – accepted, rejected, or perhaps held over to the next version. The agreed changes then need to be applied to produce a new version and there should be some kind of confirmation / approval that the changes have been applied correctly.

For this reason, the definition of REVIEW_STATE is too prescriptive. REVIEW_STATE should reflect the status of each observation – Open / Unresolved, Accept, Reject, Hold Over, etc. – and should be modelled as an extensible set of values.

The values ReviseWithoutReview and ReviseWithReview are particularly problematic because they mix up review status, version, and approval.

In principle, once the review findings have been agreed, a new version of the reviewed artefact should be produced that addresses the review comments, and an independent check should be made that the comments have been acted upon correctly. So a typical review cycle might look like this:

Version X -> Review comments -> Agree resolution -> Apply comments -> Approve changes -> Version Y

In particular, it should be possible to check that the review comments have been correctly applied and to link each change to a review comment.

It is also important to capture the purpose of the review, i.e. the review criteria. What can I conclude after a successful review / revision?

It might be helpful to link the REVIEW to an OBJECTIVE from a standard such as DO-178C.

The author attribute of REVIEW is unnecessary / misleading. Author is a property of the entity being reviewed, not a property of the review. Also, the person who requests the review is not necessarily the author of the entity being reviewed.

Action: Modify the relevant overlay. If teams are interested in promoting the revision to the core data model, we will address.
SOFTWARE

[Adelard] Suggest separating the ontology for the Software Architecture (SWCOMPONENT) from the ontology of source code / binary artefacts and the activities that generate them.

Binary components should be separated from software components – COMPONENT_TYPE should only apply to source code entities (Class, Package, Interface, etc.)

There should be a clear mapping from the system architecture to the software architecture, and it should be possible to identify which software components implement particular system functions.

Similarly, there should be a mapping from the software architecture to the binary artefacts that are deployed on the hardware – something like a UML Deployment Diagram.

Is it really necessary to model basic blocks in the RACK or are they only useful for specific analyses performed by TA1? Unless there is a need to share the basic block decomposition between TA1 and TA3, this level of detail is too fine-grained for the RACK.

Surprisingly, there does not seem to be an explicit concept of Source Code or Executable Code in the core ontology.

Action: Need team raising the issue to provide an explicit data model and concrete sample data
SWCOMPONENT [GrammaTech] Removal of control flow information (SWCOMPONENT.ConditionalSuccessor etc.) that has been indicated to be of interest to some TA3s. Already being tracked on the Data Model Proposals board: https://github.com/ge-high-assurance/RACK/projects/8#card-60351600
SYSTEM

[Adelard]

  • SYSTEM, INTERFACE and FUNCTION are reasonable architectural abstractions, but there is no notion of PARTITION

  • How are data / control flows modelled at the architectural level?

  • Need a mapping between system architecture and software architecture, for example, from FUNCTION to SWCOMPONENT

  • OP_ENV and OP_PROCEDURE do not belong here – they might be relevant to a safety case, but they are not part of the system architecture

  • “Commodity” is a strange name for the “thing conveyed by an interface”

  • Does there need to be a separate ontology for Hardware (the physical architecture)?

Action: Need team raising the issue to provide an explicit data model and concrete sample data
SYSTEM [SRI] Architecture: System or software architecture specifications also deserve first-class status in the ontology. In the SRI overlay, we added a class and activities for SystemArchitecture, with specific relationships to entities such as SYSTEM, and a SoftwareArchitecture. Short-term: Adding to Boeing.sadl in v9.0. RACK Issue #576 and Data Model Proposal Card.
TEST [JHU-APL] Summary of test results in RACK instead of raw test results. Action: Need the team raising the issue to provide an explicit data model and concrete sample data
TEST [RTX] The only way to group entities is via subtyping (like SubDD, SRS). Examples of what we need: tests that are meant to address robustness vs. tests that address normal cases, agents that have a lot of experience vs. junior ones, etc. Will not address: This is already possible, but it requires adding data to an overlay ontology and then using that as part of a query; for example, if you added a “jobTitle” to an Engineer, then you would be able to do searches that could exclude “Senior Test Engineer” or “Test Engineer”. But this is predicated on this information being populated.
TEST

[Adelard] DO-178C distinguishes between a Test case and a Test procedure. Roughly speaking, a Test case is the specification of a test and a Test procedure is an implementation of a test – logically, these are not the same thing.

The RACK concept of a TEST blurs the distinction between the two concepts – they should be modelled separately.

Similarly, there should be two separate TEST_DEVELOPMENT activities, one for specifying a test and one for implementing a test.

TEST_EXECUTION has an executed_on property of type AGENT, which is described as “AGENT(s) (e.g. some testing software/machine) running those tests”

I think the testing software that executes the test (test harness) should be considered to be part of the test implementation – in contrast, it makes sense to model the computer / execution environment on which the test runs, but there is no MACHINE agent.

Short-term: To be fixed in RACK Issue #573
TEST [SRI] Test Obligations, Test Cases, and Test Procedures: There is a single class TEST in RACK that is not sufficient to capture the attributes and semantics across all the testing-related entities, as others have also noted. It would help to use the generic guidance from DO-178C here. In DO-178C, a test case is an abstract specification that consists of test criteria for requirement coverage (i.e., what aspect of the requirement clause/subclause the test covers), a test description, a trace to the requirement or to a test obligation (oracle), and test input values and expected output values applied to the component under test. A test procedure is typically a script that applies the input values to the component, measures output values, and compares them to the expected outputs of the test case. One may merge test case and procedure as long as all the relevant semantics are captured, but the TEST class in RACK doesn’t have those attributes. Short-term: To be fixed in RACK Issue #573
Theories [SRI] Theories: There should be a discussion of theories and how they are used in assurance argumentation. We used requirement-based testing theory to support a claim that the test oracles test all aspects of requirements’ behaviors per the DO-178C guidelines. Questions: To what extent should a theory be explicitly described in RACK evidence? Should it be opaque (i.e., read a separate document) or should a theory’s claims/proofs be explicitly modeled in RACK? Action: Waiting on specific theories proposal.
Tool Qual [SRI] Tool installation and qualification data: Besides the tool version, the actual installation configuration of the tool on a particular PC/OS is essential to establish confidence that the tool invocation produces correct results. The tool installation instructions must come with an installation configuration/verification checklist that should be filled in for this purpose when the tool is installed. Also, tool qualification data (pertaining to the particular tool version) must be included in the evidence. Short-term: Will be there in v9.0. RACK Issue #574. Has been pushed into master branch.
[JHU-APL] JSON file pointers in data (have to pull reference on filesystem) Makes sense.

Problem is due to Query engine limitations

(Table columns: Pertains to | Specific Problem | Long Term / Short Term / Will Not Address | Comments [& Owner])

[JHU-APL] Unable to programmatically discover new evidence in RACK

Suggestions:

  • Ability to automatically discover/query the names of entity categories and the paths between them (as opposed to querying through the interface, or manually building nodegroups for data we know is there)

  • Ability to query arbitrary ontology categories and relationships (as opposed to querying the data only through predefined nodegroups; need to be able to query for new relationships as they occur in templates)

  • Ability to query information about evidence (e.g., instead of querying directly for evidence, be able to query about types of evidence present in the database)

Much of this is available in the new Explore instance counts.

Others are available through the ontology. We could add to python API.

Need to convene session to flesh this out and assess priority.

–Paul

Partially addressed with explore tab: Release 8.

Created https://github.com/ge-high-assurance/RACK/issues/583 - but it would be nice to confirm that python is the missing piece for JHU. The system does provide this information in other ways.

[JHU-APL] Have to post-process results of RACK queries to perform joins / intersects / sort / select operations

Suggestion:

  • API for Lego-style construction of new queries / nodegroups by chaining the existing ones with join and intersect operations

There seems to be consensus this would be useful. It is longer-term to design & build. -Paul

Needs team effort to define this task.

This could be a big task. Important to talk to folks and get the requirements correct.

https://github.com/ge-high-assurance/RACK/issues/584

[LM-ATL] The query engine lacks a fast method for iteratively identifying all direct links to/from a given object within the RACK, i.e., when looking at a particular TEST, I should be able to quickly find all other objects with a direct relationship to that TEST and then, without writing any query language or performing any drag-and-drop operations, I should be able to select one of the objects that had a relationship to the TEST and find all of its direct relations.

This would be a useful explore tool. It could be built in the medium term. SPARQL fully supports this type of thing. -Paul

(Release 9) https://github.com/ge-high-assurance/RACK/issues/585

Constraints [RTX] Queries with “minus” have limitations. For example: give me all requirements that do not have a “Passed” analysis. “Minus” is very important since we are always looking for missing data. Probably a training & doc issue. A meeting to see a bunch of examples might help. -Paul
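For reference, a plain-SPARQL sketch of the “requirements with no Passed analysis” example written with MINUS. The prefixes, the analyzes and result property names, and the Passed value are assumptions based on the ANALYSIS discussion above and would need to match the actual ontology/overlay.

```sparql
# Hypothetical prefixes and names -- adjust to the actual RACK ontology.
PREFIX provs: <http://arcos.rack/PROV-S#>
PREFIX req:   <http://arcos.rack/REQUIREMENTS#>
PREFIX an:    <http://arcos.rack/ANALYSIS#>

# Requirements for which no analysis output with result "Passed" exists.
SELECT ?reqId WHERE {
  ?r a req:REQUIREMENT ;
     provs:identifier ?reqId .
  MINUS {
    ?out a an:ANALYSIS_OUTPUT ;
         an:analyzes ?r ;
         an:result an:Passed .    # assumed ANALYSIS_RESULT value
  }
}
```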
Constraints [RTX] When variables are used in the filter, they need to be returned. If not, the data is not filtered.

Sounds like a bug worthy of short term solution - Paul

Release 8. Is this fixed in v 8.0??? Yes. Confirmed fixed. I can’t find the issue

Delete

[RTX] Deleting names of variables

  • When remoted runtime variable is deleted, the variable’s name does not get deleted unless you delete the component.

  • When I try to update the names of the attributes for any ontology element it doesn’t work. I have to delete the element and add it again if I need to modify any names inside it.

Sounds like a bug worthy of short term solution – Paul

Need a concrete example. What is a “remoted runtime variable” and what does it mean to “delete” it?

Nodegroup Editing [RTX] Having an undo, copy (a specific portion of the query from another node group), and paste mechanism would help a lot (if it already exists then it would be good to learn where to find it).

Undo would be useful and is doable in the short-medium term.

https://github.com/ge-high-assurance/RACK/issues/586

Copy/paste/merge should be part of the “join / intersect” task above. RACK issue 584 -Paul

Performance

[RTX] How to group analyses to reduce the scalability problems? RACK could not ingest the data when we tried to add a couple of analyses for each software component (66K). Another example: for “REQUIREMENT_CONSISTENCY_ANALYSIS”, should one analysis output be connected to each requirement, or a single analysis output to all of them?

  • Possible problem: analysis outputs that are related to multiple entities. ”SOFTWARE_ARCHITECTURE_COMPATIBILITY_HLR_ANALYSIS” analyzes whether HLR1 is compatible with SW-ARC1; therefore, the output should connect to both.

All these ingest performance issues require some careful analysis, and improvements may be limited while we’re using the desktop Docker Fuseki RACK.
Performance [RTX] I have noticed that during data ingestion process sometimes it needs a few tries for the data to successfully get ingested. The reason for failure (whenever it happens) is the connection timeout issue.
Performance [RTX] It is not clear how to optimize the ingestion queries.
Performance [RTX] Loading data into the RACK takes too long. GrammaTech data took 1 hour. We added connections between software components and files, and it took a couple of hours. Finally, we tried to add an analysis to every software component and the ingestion never ended.
Performance [RTX] Queries related to software components took longer than others. However, Lockheed Martin’s queries are the ones that take the longest, a couple of minutes.
Performance [SRI] There should be some support for ingestion of files. If files cannot be physically put in the main database due to performance issues, some support could be provided in SPARQLgraph for navigating to the file based upon the filename in RACK.
Query Types

[RTX] Types of queries.

  • RACK provides different types of queries: distinct, count, etc. There are different endpoints to which each type of query is sent. Since the user is free to create new nodegroups, our tool does not know in advance if it is a distinct query or a count query. It would be better to have a mechanism to save the query as a count query when needed and use the same endpoint.

  • A mechanism for a true/false query will also be very helpful.

Short term –Paul

SPARQLgraph remembers the default query type, and a REST endpoint is available in Release 8.

https://github.com/ge-high-assurance/RACK/issues/587

Add to python interface

https://github.com/ge-high-assurance/RACK/issues/588

Fully support ASK

https://github.com/ge-high-assurance/RACK/issues/395
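For the true/false case, standard SPARQL already offers the ASK query form; a minimal sketch with assumed prefixes and an assumed requirement identifier:

```sparql
# Hypothetical prefixes -- adjust to the actual RACK ontology.
PREFIX provs: <http://arcos.rack/PROV-S#>
PREFIX tst:   <http://arcos.rack/TESTING#>

# True if at least one test exists that verifies the requirement with this identifier.
ASK {
  ?r provs:identifier "REQ-001" .      # hypothetical requirement identifier
  ?t a tst:TEST ; tst:verifies ?r .
}
```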

Query Types [Adelard] SparqlGraph does not support aggregate queries (e.g. GROUP BY / HAVING)

GROUP BY and aggregate functions are in next release. Release 8.

done

HAVING clause

Added HAVING to tech debt

https://github.com/ge-high-assurance/RACK/issues/590
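For reference, the raw-SPARQL form of the kind of aggregate-plus-HAVING query being requested here, e.g., requirements verified by fewer than two tests (prefixes and property names assumed as in the earlier sketches):

```sparql
# Hypothetical prefixes -- adjust to the actual RACK ontology.
PREFIX provs: <http://arcos.rack/PROV-S#>
PREFIX req:   <http://arcos.rack/REQUIREMENTS#>
PREFIX tst:   <http://arcos.rack/TESTING#>

# Requirements verified by fewer than two tests.
SELECT ?reqId (COUNT(?t) AS ?tests) WHERE {
  ?r a req:REQUIREMENT ; provs:identifier ?reqId .
  OPTIONAL { ?t a tst:TEST ; tst:verifies ?r . }
}
GROUP BY ?reqId
HAVING (COUNT(?t) < 2)
```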

Query Types [Adelard] SparqlGraph does not support MINUS queries (set difference). Yes it does.
Query Types

[Adelard] SPARQL does not support functions over attributes

  • E.g. count how many times the word “shall” appears in each requirement

aggregate functions are in next release. done

Problem has not been root-caused, but still needs to be addressed

(Table columns: Pertains to | Specific Problem | Long Term / Short Term / Will Not Address | Comments [& Owner])

Shared RACK [LM-ATL] As soon as is reasonable, preferably by the next assessment, a live, shared RACK instance should be accessible to all TA1s and TA3s for the purpose of pushing and pulling data. The production, proliferation, and utilization of ingest packages as a workaround for all performers having separate, local instances of the RACK was a large drag on productivity.

What worked well

Pertains to: Specific Problem
SADL [SRI] Ontology: Use of SADL to specify the ontology worked well. A useful feature was to be able to restrict the range of a subtype of a relationship from the parent class. This allowed “type checking” the foreign key references in CSV files to the classes in the restricted range.
COLLECTION and MODEL [SRI] The COLLECTION and MODEL classes were quite useful – we had several subtypes of these in the SRI overlay. We may want to add some attributes to indicate COLLECTION semantics; i.e., what is the purpose of the collection? what does it mean for the entities to be in a collection – do they form parts of a whole? Can an entity participate in more than one collection?
CDR ingestion [SRI] Capability to ingest CSV files flexibly using nodegroups, with blank columns and simple ordering. We had many cross-relationships between objects, but the ingestion part wasn’t difficult. Some additional checks/support would be useful as noted in section C (see DesCert-RACK-Gaps-Analysis.pdf, section C. Issues Faced in Creating Evidence for Ingestion).