Summary Learnings, Gaps, Enhancements - ge-high-assurance/RACK GitHub Wiki
- Problem is due to missing data in our original artifacts as received
- Problem is due to errors or gaps in TA1 evidence generation, e.g. due to tool limitations
- Problem is due to Data Model limitations
- Problem is due to Query engine limitations
- Problem has not been root caused, but still needs to be addressed
- What worked well
Pertains to: | Specific Problem | Long Term / Short Term / Will Not Address / Comments [& Owner]
---|---|---|
Build Info | [GrammaTech] No information on build tools or build process | Software.SADL includes |
Hazard | [JHU-APL] No hazard analysis provided [GrammaTech] No HAZARD information provided [STR] RACK should contain information about: [Adelard] E.g. no hazard analysis | |
Open Example | [LM-ATL] A broadly accepted surrogate (w/ justification for acceptability) usable without ITAR restrictions would be extremely valuable | |
Processes | [JHU-APL] No operational data provided [JHU-APL] Would be optimal to have input/output to the actual certification process for this system (what was delivered, comments from certifiers, etc.) [RTX] Evidence about processes | |
Processes | [JHU-APL] No process data provided (e.g., version control history; internal process for documentation; issue tracker data) [RTX] Reviews, checklists, compliance analysis, cost, quality metrics, and versions (how many versions of requirements) [RTX] There is no data about DAL, algorithms, hardware platform, test procedures, or reviews (code reviews, requirements compliance, code conformance with the standard) | |
Requirements | [STR] (Derived) Requirements should discuss pre/post conditions | |
Requirements | [STR] Requirements model details [RTX] The Nav System was not tied to any requirement or processor, while all the requirements are tied to the processor | |
Requirements | [Adelard] Gaps in LLR level | |
Simulation | [LM-ATL] No access to a configurable simulator, in source or as binary executable | |
System | [LM-ATL] Insufficiently described system; all performers should have access to an authoritative ontological description of the system and its target domain [RTX] There is no data about architecture or partitions [Adelard] Gaps at the architectural definition level | |
System | [STR] Component hierarchy: system components only provided at a high level (need detailed subsystem components, interfaces, and functions) | |
Tests | [LM-ATL] Lack of reproducibility; resolvable through access to the test suite & testing environment | |
Tests | [LM-ATL] Test data lacks coverage/completeness information [RTX] There are no test coverage metrics in the data [Adelard] Gaps in coverage analysis | |
Tests | [LM-ATL] Test suite outputs largely define success, e.g., variable A should have value a. Definitions of failure are also needed, e.g., a value of a+0.5 would be acceptable, but more than a+0.5 would not. | |
Tests | [STR] Testing evidence | |
Tests | [STR] Tests should refer to conditions, e.g., which pre-conditions are true for a test and which post-conditions are implied by a successful test | |
Traceability | [RTX] Current data is not grouped by sub-systems; there is an assumption that RACK has only one sub-system. For example, if a requirement is entered into RACK, there is no way to know which system/subsystem it belongs to, so it is hard for us to judge which subsystem has the untraceable requirements. We thought the "governs" connection was used for that, but this connection is not used consistently across all TAs. The same situation occurs with other entities. For example, if a test is entered and is not connected to anything (which can happen), we cannot know which system/subsystem it belongs to. [RTX] There are untraceable entities, such as requirements, tests, and software components. [STR] Overall, once we have the required pieces of evidence we must also ensure traceability from test results back to requirements to components! [Adelard] Gaps in traceability |
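Several rows above flag untraceable entities. As an illustration of the kind of audit query that could surface them, here is a minimal Python sketch over exported (subject, predicate, object) triples; the data shapes, predicate names, and identifiers are hypothetical examples, not the actual RACK schema.

```python
# Minimal traceability-audit sketch over exported (subject, predicate, object)
# triples. The predicate names and identifiers below are hypothetical examples,
# not the actual RACK schema.

def untraceable(triples, instances, trace_predicates=("satisfies", "verifies")):
    """Return the instances that are never the object of a tracing predicate."""
    traced = {obj for (_subj, pred, obj) in triples if pred in trace_predicates}
    return sorted(set(instances) - traced)

triples = [
    ("Comp-A", "satisfies", "Req-1"),
    ("Test-7", "verifies", "Req-1"),
    ("Test-8", "verifies", "Req-2"),
]
requirements = ["Req-1", "Req-2", "Req-3"]

print(untraceable(triples, requirements))  # ['Req-3'] has no trace links
```

In practice the triples would come from a nodegroup query rather than a literal list, and the report would be grouped per subsystem once that grouping exists in the data.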
Pertains to: | Specific Problem | Long Term / Short Term / Will Not Address / Comments [& Owner]
---|---|---|
Check for forbidden characters | [RTX] There are software components with characters that are not ingested by RACK correctly. It would be good to define forbidden characters such as *, …, and /. Identifiers are allowed to have spaces. | Primarily training. Can we make it easier to share importSpec transforms & rules across nodegroups? I could use more clarity on which/why characters are not allowed (SemTK should accept ASCII) and why the current ingestion template validation and transformation tools are not sufficient. Is this a training issue or is there a feature we can add? -Paul Get more info. No RACK issue yet. |
Data Audit | [STR] Responsibility for RACK content quality should rest on all TAs [RTX] TA1s/TA2 know how the data is ingested, but it is up to the TA3s to query the data; for that, TA3s need to learn the data structure in detail. It would be more efficient if the TA1s could provide nodegroups to query their data. [RTX] Some queries (e.g., untraceable requirements) show that more evidence is required. TA3s can point out missing evidence; however, there is no process that establishes what to do with that. For example, RTX found there are untraceable requirements; who should resolve this problem? | TA2 could build a unified ingestion script that loads provided data packages and runs a set of unit tests to help check that everyone is working from the same view of the combined data. - Eric |
Data Audit | [STR] We should define a set of basic queries, and TA2 should run and audit the results of these queries against RACK as part of a regular QA process [RTX] There should be rules for entering the data. For example, if a source code file is entered, it needs to be connected to a software component and a development activity. | This (and more) would be addressed by our idea of putting more (cardinality) constraints in the model, then building a tool to check all the constraints and report. -Paul https://github.com/ge-high-assurance/RACK/issues/581 (build report generator) https://github.com/ge-high-assurance/RACK/issues/582 (report violations of cardinality constraints) |
RACK queries | [STR] Suggested RACK queries: overall, once we have the required pieces of evidence we must also ensure traceability from test results back to requirements to components! [RTX] There is a need to define queries that ensure the ingested data is correct and complete. I know there is a way in RACK; however, the query definitions should be worked out among all TAs. [SRI] With heavily connected relationships among object instances of multiple classes, it becomes very difficult to keep track of whether we are using proper foreign key identifiers in a CSV file. We came up with an identifier naming structure/scheme to manage these and help in manual reviews of the evidence being ingested. | [STR] This seems like a solid short-term priority. It might be a use case for nodegroup transformation wizards, i.e., define a nodegroup that matches a legal/complete version of the 10 items listed, then have a wizard/transformation that runs queries to find all graph patterns that do NOT match, hopefully clearly showing what is wrong. -Paul [RTX] [SRIa] These can be accomplished with our proposed constraint checker and by building "stock" queries. [SRIb] This functionality exists and we use it widely [Paul] The comment about foreign keys is disturbing. I think RACK issues 581, 582 go a long way. Do we need a new issue to build a specific report? |
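The forbidden-characters row above suggests validating identifiers before ingestion. Below is a minimal pre-check sketch; the FORBIDDEN set is purely illustrative (the actual allowed character set would need to be confirmed against the SemTK/RACK ingestion tooling), and per the row above, spaces are treated as allowed.

```python
# Pre-ingestion identifier check sketch. The FORBIDDEN set below is
# illustrative only, not an official RACK/SemTK rule; note that spaces
# are currently allowed in identifiers.
import re

FORBIDDEN = re.compile(r'[*/\\,"<>|?]')  # hypothetical disallowed characters

def check_identifier(identifier):
    """Return a list of problems found in a candidate identifier."""
    problems = []
    if not identifier.strip():
        problems.append("empty identifier")
    if FORBIDDEN.search(identifier):
        problems.append("contains forbidden character")
    if not identifier.isascii():
        problems.append("non-ASCII character")
    return problems

print(check_identifier("sw_component 1"))  # [] -- spaces are allowed
print(check_identifier("bad/name"))        # ['contains forbidden character']
```

A check like this could run over every identifier column of a CSV before it reaches the ingestion templates, turning silent ingestion corruption into an explicit pre-load report.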
Pertains to: | Specific Problem | Long Term / Short Term / Will Not Address / Comments [& Owner]
---|---|---|
[RTX] Evaluate relevance of the data | [RTX] Before ingesting data into RACK, it should be evaluated in a council whether the data provides usable evidence and whether the purpose of this evidence is clear. | |
What is the Vision/Purpose of RACK | [RTX] What is the vision of RACK? Is RACK the truth, an evidence broker, or an index of the evidence? Our vision of RACK is: [Adelard] What is the purpose of the RACK? | |
| [STR] Binary level software component information not relevant to ARBITER case | |
| [STR] Gaps in alignment of different TA1 data sets (e.g., reference to common requirements) | |
Processes | [JHU-APL] Gaps in knowledge where details are redacted from documents | |
Security Related | [STR] | |
Pertains to: | Specific Problem | Long Term / Short Term / Will Not Address / Comments [& Owner]
---|---|---|
| [GrammaTech] Lack of traceability from harness executables in TEST_EXECUTION.csv to COMPONENT.csv | Short Term: Brainstorming session between GT and TA3 teams – Lucja & Honeywell & others |
[GrammaTech] Missing dataInsertedBy. It is possible to do this in the nodegroup, but should there be a preference for having it as part of the source input, such as the CSV file? | Cannot do as part of the ingestion, as that would result in every version of RACK having a different "dataInsertedBy". The Scraping Tool Kit will create this relationship automatically; otherwise it should be captured when generating the CSV. Note: it may be possible to hardcode it as part of a custom nodegroup, but then the nodegroup loses utility. | |
[GrammaTech] Missing generatedAtTime. Per Eric Mertens, this can’t be done in the nodegroup because it generates an explosion of slightly-different time values. It is also not clear what generatedAtTime should represent, see a separate discussion in Section C.1 of this document. | SemTK has a rarely-used feature that might be helpful. The ingestion templates support Text of %ingestTime or %ingestEpoch. We ought to use this or be able to find a short-term solution -Paul | |
| [GrammaTech] SPECIFICATION.dateOfIssue is being populated with the file date (Unix modtime of the PDF) because we can't reliably extract the values from the title page | |
| [GrammaTech] SPECIFICATION.wasAttributedTo → AGENT.identifier is being populated with the author information in the PDF document metadata because we can't OCR the signatures | |
[GrammaTech] TOOL entities aren’t being used at present | ||
[RTX] The Lockheed Martin ontology uses strings to connect components. The connections should be part of the ontology. | ||
Confidence | [JHU-APL] Unable to determine confidence of a particular piece of evidence | |
Requirements | [JHU-APL] Have to parse lengthy text blobs of requirements (need machine-readable) [Adelard] E.g., poor text extraction of requirements, failure to link up requirement IDs, noise generated by ingestion, etc. | |
Pertains to: | Specific Problem | Long Term / Short Term / Will Not Address / Comments [& Owner]
---|---|---|
EACF | [JHU-APL] Have to reconstruct EACF representation from low-level RACK ontology mapping; would prefer to have the original EACF | Action: Need team raising the issue to provide an explicit data model and concrete sample data for EACF. |
PROV data model | [LM-ATL] Data models should incorporate widely accepted and well-documented data (meta) models, avoiding "reinvention of the wheel." For example, consider UTP2 for testing (https://www.omg.org/spec/UTP2/2.1/About-UTP2/), or SAE's EIA649 for configuration management; also, SACM for representing structured arguments. SACM includes an Artifact Metamodel suited for assurance processes and for representing evidence. (From the spec: "The SACM Artifact Metamodel defines a catalog of elements for constructing and interchanging packages of evidence that communicate how the evidence was collected.") | Will not address. The RACK data model is already based on the two most widely accepted and well-documented data models: the E-R model and the W3C PROV model. The assurance case domain model built in RACK leverages these to provide a complete model for assurance case evidence. Discussed on 11/29/2021 and closed. |
PROV data model | [Adelard] In general, I don't care about who inserted something into RACK or when it was inserted. Hence, I'm not convinced that the time/origin "meta attributes" are useful; they just clutter up the ontology. What matters are the dependency relationships, not the precise timings of who did what when. Also, dataInsertedBy is recursive. How is the recursion resolved? What was the activity that inserted the activity that inserted … that inserted the data? There is no consistent naming convention for attributes (impactedBy vs. wasImpactedBy) across THING, ENTITY, and ACTIVITY. | Will not address. No missing functionality or other functional concern. Will not address. The data in dataInsertedBy refers to data, not metadata. The provenance of data should be captured, but the provenance of metadata should not. Will not address. No missing functionality or other functional concern. Will not address. No missing functionality seen. Will not address. Semantics often show up in databases in the form of informative attribute or relationship names; removing those often leads to more confusion than improvement. Detailed discussion on this at the all-TA call on 10/12/2021; we will add some more documentation about it. We may add a relationship from ACTIVITY to a plan. Short term: ACTIVITY and Plan, working on a new idea to ingest information from Apache PSAQ and SDP – Dan & Kit & Greg K. |
PROV data model | [SRI] Overall: The RACK ontology was based on a generic provenance ontology that supports all sorts of applications of the base classes (ACTIVITY, ENTITY, etc.). Therefore, there are many attributes (properties) and relationships in the classes that are not useful for assurance evidence, and they also result in multiple possible transitive relationships across ACTIVITIES and ENTITIES that can cause confusion and/or inconsistencies. Our recommendation is to normalize the relationships (and remove/disable redundant ones) so that there are a minimum of specific transitive paths to traverse in the graph, with specific semantics attached to them. Additionally, we should have some rules/guidelines regarding how to populate (which relationships) and traverse the evidence. (Note: In our TA1 evidence, we populated many such "redundant" relationships because we didn't know what TA3s were going to use and how they were going to traverse the graph.) | Will not address: TA2's approach from the beginning was to have a very general core ontology that can be used on other programs as well as in ARCOS. Having a property in the ontology does not mean that it needs to be populated, and if there is no data for a property then it will not be part of the query results. |
[LM-ATL] RACK does not offer fine-grained curation. That is, tokens in strings do not map to a common ontology or terminology, which could be used to create linkages. | Will not address. Linkages should be captured explicitly in relationships, NEVER implicitly by string matching. | |
Versioning | [RTX] It is not clear to me how to model life cycle data, for example, versions of one requirement (not about timing). I'm not sure if that should be in RACK, or if there is a tool that can provide evidence about the evolution of a requirement over time. | Will not address. All entities have a wasRevisionOf property; the intent is that each version is captured as a unique entity (e.g., R-1v1, R-1v2). The Turnstile example is being updated to show an example of this; however, no Boeing data exists. Each ENTITY is a unique piece of evidence. |
| [STR] What is the scheme to trace from an ARBITER assurance case back to evidence in the RACK? | Will not address. Every THING in RACK has a unique, queryable identifier that can be inserted into an assurance case for traceability. But for clarity, will the final assurance cases point to RACK, or use RACK to build an independent assurance case? Do the identifiers need to be meaningful primarily in RACK or to the source data? – Eric |
ACTIVITY | [SRI] Tool invocation by activities: Often, an activity invokes a tool in a specific context, passing specific parameters and options to the tool. This needs to be captured independently of the activity. We added a ToolInvocationInstance class and relationship to this from each subclass of activity | Short term: We are updating the ontology in V9.0 to provide such structure. See branch analysis-enhancement and RACK PR #540. Has been pushed into master branch. |
Ambiguity | [GrammaTech] Ambiguity about whether TEST applies only to original artifacts, or equally to tests generated by tools like GrammaTech's A-CERT during the software assurance process. | Will not address: We can capture the source of TESTs with TEST_DEVELOPMENT activities. We can capture released groups of tests with BASELINEs. Between these two groupings we ought to be able to distinguish the two categories. – Eric TEST should apply equally to all tests; overlays should be used to make distinctions between different types of tests. |
Ambiguity | [JHU-APL] Unable to determine whether evidence is not present for a specific reason, e.g., how to tell the difference between: | Will not address: Tool runs that failed are already representable in RACK. Key attributes or relationships are already marked in RACK if they are required to be present. For other data, anything missing was simply not inserted by the data provider. The database cannot conjecture about why. |
Ambiguity in Identifier, etc. | [GrammaTech] General ambiguity of properties [RTX] The difference between identifier, title, and description is not clear; every performer is using them differently. [GE] Is this question coming because of the core ontology or one of the overlays? Possible suggestion (?): in the assurance case, use the actual artifact name and not the RACK identifier. Establish some universal best practice for what to put in the title, description, etc., so that there's consistency. Include version somewhere. | Will not address: Identifiers are used to reference/link data. Titles are short descriptions suitable for use in a list. Descriptions are free-form, unstructured prose describing something in detail. Identifiers should be treated as opaque linking tokens. Titles and descriptions should be treated as human-readable opaque content. Any extracted semantics should be broken out into semantic properties on objects. – Eric |
ANALYSIS | [RTX] Analysis is a key component of our approach. It is important to define what claim the analysis provides. For example, "REQUIREMENT_CONSISTENCY_ANALYSIS" analyzes whether the requirements are consistent. | Short-term: We are updating the ontology in V9.0 to provide such structure. See branch analysis-enhancement (especially sadl-examples) and RACK PR #540. Has been pushed into the master branch. |
ANALYSIS | [Adelard] Unlike REVIEW, an ANALYSIS is not governedBy a REQUIREMENT or SPECIFICATION. What is the purpose of the ANALYSIS? What property is being analysed and what can be concluded from the results of the analysis? It might be helpful to link the ANALYSIS to an OBJECTIVE from a standard such as DO-178C. ANALYSIS_RESULT (Passed, Failed, Indeterminate) is too simplistic; the purpose of an analysis might be to calculate a value or verify a property. Given the variety of different analyses that are possible, properties like result and metric should be associated with the MODEL for a particular ANALYSIS. The ANALYSIS activity should be linked to the tool that was used to perform the analysis, any tool-specific or analysis-specific configuration (including the purpose of the analysis), and the ENTITY that was analysed. The ANALYSIS_OUTPUT should record the result of the analysis without any pre-judgment about whether the analysis passed or failed; the outcome of the analysis should be captured by an ANALYSIS_MODEL. The purpose of ANALYSIS_ANNOTATION and ANALYSIS_ANNOTATION_TYPE is not clear; they seem to be associated with a particular kind of analysis and therefore don't belong in the core ontology. | Short term: ACTIVITY and Plan, working on a new idea to ingest information from Apache PSAQ and SDP – Dan & Kit & Greg K. Short term: We are updating the ontology in V9.0 to provide such structure. See branch analysis-enhancement and RACK PR #540. Has been pushed into the master branch. |
DOCUMENT vs. COLLECTION | [GrammaTech] Ambiguity about when to use COLLECTION in preference to many-to-one edge sets. | Will not address: Things should only be modeled as a COLLECTION when that collection has some externally recognizable identity. Prefer many-to-one edge sets when there are many things that happen to be associated in the same way. Raw (unsubtyped) COLLECTIONs ought to be quite rare. – Eric |
COLLECTION | [SRI] The COLLECTION and MODEL classes were quite useful – we had several subtypes of these in the SRI overlay. We may want to add some attributes to indicate COLLECTION semantics; i.e., what is the purpose of the collection? What does it mean for the entities to be in a collection – do they form parts of a whole? Can an entity participate in more than one collection? | Will not address: Please add needed attributes in your overlay and as necessary they can be promoted to core ontology. Entities are “members of” a collection. An entity can be in multiple collections. |
DOCUMENT | [Adelard] The various properties of DOCUMENT are too prescriptive, particularly DOC_STATUS. Each organization will have its own set of document attributes and document types. Hence, DOCUMENT should just be a placeholder in the core ontology that can be extended as necessary. DOCUMENT should have a URL attribute that can be used to identify/access the document. | Will not address: Please add needed attributes in your overlay; as necessary they can be promoted to the core ontology. Short-term: v9.0 will have a URL for a document. RACK Issue #578. |
ENTITY | [GrammaTech] ENTITY.generatedAtTime could benefit from clarification | Will not address: generatedAtTime and the other descriptive properties on an ENTITY are about the thing being modeled, not the model. In the case of a TEST object, generatedAtTime would be the time the test itself was created. All the information about the source of the modeling information goes into the dataInsertedBy activity, which tells when the data was modeled. – Eric |
FILE | [GrammaTech] FILE entity type ambiguous in at least two ways: | Will not address: All files are covered by the common entity type. A FORMAT attribute is provided. Filenames are purely informative. There are no meaningful absolute paths that we could capture, and the same file might be stored under many filenames. If we want to model a directory structure or an archive structure, we would need to do that on top of the current FILE representation. |
FILE | [LM-ATL] How to submit a literal computer file (not the FILE concept in the ontology) for storage in the RACK? Describing the file, via the ontology, is not (and will never be) entirely sufficient to describe its contents, so we must also be able to include literal files. What is the authoritatively correct procedure for doing this? | Will not address: RACK doesn’t store files. |
FILE | [Adelard] I am not sure if it is useful to distinguish between FILE and DOCUMENT or whether FILE should be considered to be a kind of DOCUMENT. The satisfies property of FILE does not belong in the core ontology. | Short-term: In v9.0 we will address FILE vs. DOCUMENT in documentation. RACK Issue #579. Will not address: no functional concern identified. |
FILE_FORMAT | [GrammaTech] FILE_FORMAT not specified with enough precision for entity resolution. Some options: [Adelard] The fileFormat property of a file is described as the "byte-level encoding" of the file. This confuses character encoding (e.g., ASCII, UTF-7) with data format (CSV, XML, etc.) | Short-term: In v9.0 we will add a set of pre-defined identifiers. This set can be extended as needed in overlays. The preferred format is the file extension, all lowercase. Beginning list: [c, txt, csv, doc, pdf, exe, log, xls]. RACK Issue #580. Short-term: In v9.0 we will add documentation about fileFormat (better yet, add an example in Turnstile). RACK Issue #580. |
HAZARD | [STR] RACK should contain information about: | Will not address: see Hazard.SADL and Requirements.SADL. |
Missing (DO-178C) | [RTX] A lot of concepts should be based on DO-178C, since they are general and supported by the certification community; for example, the definitions of test, test cases, and test oracles. [Adelard] The following definitions are taken from DO-178C. Not all of the necessary concepts are modelled in the RACK ontology; it would be worth considering each concept in turn, how it might be modelled, and whether there are any gaps/omissions from the ontology. Some obvious gaps are highlighted in red below. | Action: If there is consensus among the TA performers, some of these properties from overlays could move to the Boeing.sadl overlay. Some properties will move from SRI.sadl to the core ontology (in RACK v9.0). Additional properties can also be added; what we would like for that is |
Missing | [RTX] There are missing concepts in the ontology, such as: Architecture, Partition, DAL, Algorithms, Hardware Platform, Test Procedure. | Action: Same answer as above. |
Objective/Property | [SRI] GenericProperty and SpecificProperty: To use formal methods for verification and for property-based assurance in general, we added these two classes to give property a first-class status, with relationships such as propertyScope and propertyBasis. Then a property can be analyzed by many activities independently. | Short-term: Plan to have it in v9.0. RACK Issue #575. Has been pushed into the master branch. |
Overlay | [RTX] The overlays should be merged. Before that, it should be analyzed whether the information in the overlays can be modeled with the core ontology. | Action: Need the team raising the issue to provide an explicit data model proposal and concrete sample data |
Over specified | [Adelard] As a general observation, I agree with the intention behind the MODEL abstraction, namely that the core RACK ontology should be structural rather than semantic, but this implies that some entities, attributes, and relationships should be deleted from the core model. https://github.com/ge-high-assurance/RACK/wiki/Using-MODEL For example, CONFIDENCE does not belong in the core ontology (it's clearly a semantic model), and even entities like PERSON and TOOL in AGENT are over-specified. Why should a PERSON be identified by an email address or a TOOL by a version? It is sufficient to identify PERSON and TOOL as different kinds of AGENT without being more specific about their attributes. Similarly, I would question whether HAZARD should include severity and likelihood properties, or whether these should be part of a HAZARD_MODEL. | Will not address: The MODEL abstraction causes longer query times by first tracing to the base structure and then selecting a model attached to that structure. MODELs should only be used where there are several known distinct models for a structure, rather than as a general solution. Are there specific structures where specific MODELs have been identified beyond REQUIREMENTS? |
PROCESS | [Adelard] I like the idea of an OBJECTIVE, but I was surprised to find it defined under PROCESS. The note refers to "tasks from a process" rather than "activities" - why introduce new terminology unnecessarily? The note also refers to "evidence that must be provided", but this isn't reflected in the definition of OBJECTIVE, which simply links objectives to activities. | Short-term: In V9.0 there will be additional updates to OBJECTIVE. A "note" is just a textual string to try to give some color to the concepts. Somewhat related to this is RACK PR #540, which has been pushed into the master branch. |
Process of updating | [LM-ATL] Data model updates are not agile. Improvements to overlays and data models contributed by performers (e.g., SACM) do not get disseminated to other performers quickly. | Will not address: Complex data models need significant review and acceptance by users, much like complex software. A versioned release system lowers risk to the program and gives users time to plan for changes. Regarding nodegroups, we now have auto-generated nodegroups in V8.0 that performers can utilize, which should allow you to generate datasets more quickly. |
REQUIREMENT | [RTX] The concept of system/high/low level requirements is general, and it is important to differentiate them. Currently there are types specific to Boeing which are not necessary; this makes the nodegroups not reusable. | Will not address: This should be defined in the overlay. Generic queries can be made that use the REQUIREMENT class to grab the hierarchy of requirements. The difficulty is that not all developments have the same number of requirement levels; this is the reason for using the overlay to capture the project-specific details. |
REQUIREMENT | [GrammaTech] REQUIREMENT granularity is too coarse [Adelard] System requirements, high-level requirements, and low-level requirements should be distinguished in the core ontology [SRI] Subclasses and collections of REQUIREMENT: As others have pointed out, it would help if we organized subclasses of REQUIREMENT based upon DO-178C guidance, which is quite generic. Currently there is one REQUIREMENT class in RACK. It would be better to have subclasses for system-level, software HLR, and software LLR, with specific attributes in each and restricted ranges for the relationships. We ended up making such subclasses in the SRI overlay. | Will not address: Granularity of REQUIREMENTs is based on the data provided by TA1s. |
REQUIREMENT | [Adelard] | Will not address: In RACK v8.0, Boeing.sadl already allows for "sets of requirements" etc. |
REQUIREMENT | [GrammaTech] REQUIREMENT subclassing limits applicability to MIL-STD-498/490 derived standards. | Short-term: referred to the data model decision team for consideration. |
REVIEW | [Adelard] The concept of a REVIEW_LOG is too simplistic. A review typically results in a set of observations; these might be categorized in some way, for example as technical vs. editorial. Each observation needs to be resolved (accepted, rejected, or perhaps held over to the next version). The agreed changes then need to be applied to produce a new version, and there should be some kind of confirmation/approval that the changes have been applied correctly. For this reason, the definition of REVIEW_STATE is too prescriptive. REVIEW_STATE should reflect the status of each observation (Open/Unresolved, Accept, Reject, Hold Over, etc.) and should be modelled as an extensible set of values. The values ReviseWithoutReview and ReviseWithReview are particularly problematic because they mix up review status, version, and approval. In principle, once the review findings have been agreed, a new version of the reviewed artefact should be produced that addresses the review comments, and an independent check should be made that the comments have been acted upon correctly. So a typical review cycle might look like this: Version X -> Review comments -> Agree resolution -> Apply comments -> Approve changes -> Version Y. In particular, it should be possible to check that the review comments have been correctly applied and to link each change to a review comment. It is also important to capture the purpose of the review, i.e., the review criteria. What can I conclude after a successful review/revision? It might be helpful to link the REVIEW to an OBJECTIVE from a standard such as DO-178C. The author attribute of REVIEW is unnecessary/misleading: author is a property of the entity being reviewed, not a property of the review. Also, the person who requests the review is not necessarily the author of the entity being reviewed. | Action: Modify the relevant overlay. If teams are interested in promoting the revision to the core data model, we will address it. |
SOFTWARE | [Adelard] Suggest separating the ontology for the software architecture (SWCOMPONENT) from the ontology of source code/binary artefacts and the activities that generate them. Binary components should be separated from software components; COMPONENT_TYPE should only apply to source code entities (Class, Package, Interface, etc.). There should be a clear mapping from the system architecture to the software architecture, and it should be possible to identify which software components implement particular system functions. Similarly, there should be a mapping from the software architecture to the binary artefacts that are deployed on the hardware, something like a UML Deployment Diagram. Is it really necessary to model basic blocks in RACK, or are they only useful for specific analyses performed by TA1? Unless there is a need to share the basic block decomposition between TA1 and TA3, this level of detail is too fine-grained for RACK. Surprisingly, there does not seem to be an explicit concept of Source Code or Executable Code in the core ontology. | Action: Need the team raising the issue to provide an explicit data model and concrete sample data |
SWCOMPONENT | [GrammaTech] Removal of control flow information (SWCOMPONENT.ConditionalSuccessor etc.) that has been indicated to be of interest to some TA3s. | Already being tracked on the Data Model Proposals board: https://github.com/ge-high-assurance/RACK/projects/8#card-60351600 |
SYSTEM | [Adelard] | Action: Need the team raising the issue to provide an explicit data model and concrete sample data |
SYSTEM | [SRI] Architecture: System or software architecture specifications also deserve first-class status in the ontology. In the SRI overlay, we added a class and activities for SystemArchitecture, with specific relationships to entities such as SYSTEM. A SoftwareArchitecture | Short-term: Adding to Boeing.sadl in v9.0. RACK Issue #576 and Data Model Proposal Card. |
TEST | [JHU-APL] Summary of test results in RACK instead of raw test results | Action: Need the team raising the issue to provide an explicit data model and concrete sample data |
TEST | [RTX] The only way to group entities is via subtyping (like SubDD, SRS). Examples of what we need: tests that are meant to address robustness vs. tests that address normal cases, agents that have a lot of experience vs. junior ones, etc. | Will not address: This is already possible, but it requires adding data to an overlay ontology and then using that as part of a query. For example, if you added a “jobTitle” to an Engineer, you would be able to do searches that could exclude “Senior Test Engineer” or “Test Engineer”. But this is predicated on this information being populated. |
TEST | [Adelard] DO-178C distinguishes between a Test case and a Test procedure. Roughly speaking, a Test case is the specification of a test and a Test procedure is an implementation of a test – logically, these are not the same thing. The RACK concept of a TEST blurs the distinction between the two concepts – they should be modelled separately. Similarly, there should be two separate TEST_DEVELOPMENT activities, one for specifying a test and one for implementing a test. TEST_EXECUTION has an executed_on property of type AGENT, which is described as “AGENT(s) (e.g. some testing software/machine) running those tests”. The testing software that executes the test (the test harness) should be considered part of the test implementation; in contrast, it makes sense to model the computer / execution environment on which the test runs, but there is no MACHINE agent. | Short-term: To be fixed in RACK Issue #573 |
TEST | [SRI] Test Obligations, Test Cases, and Test Procedures: The single class TEST in RACK is not sufficient to capture the attribute semantics across all the testing-related entities, as others have also noted. It would help to use generic guidance from DO-178C here. In DO-178C, a test case is an abstract specification that consists of test criteria for requirement coverage (i.e., what aspect of the requirement clause/subclause the test covers), a test description, a trace to the requirement or to a test obligation (oracle), and test input values and expected output values applied to the component under test. A test procedure is typically a script that applies the input values to the component, measures output values, and compares them to the expected outputs of the test case. One may merge test case and procedure as long as all the relevant semantics are captured, but the TEST class in RACK doesn’t have those attributes. | Short-term: To be fixed in RACK Issue #573 |
Theories | [SRI] Theories: There should be a discussion of theories and how they are used in assurance argumentation. We used requirements-based testing theory to support a claim that the test oracles test all aspects of requirements’ behaviors per the DO-178C guidelines. Questions: To what extent should a theory be explicitly described in RACK evidence? Should it be opaque (i.e., read a separate document), or should a theory’s claims/proofs be explicitly modeled in RACK? | Action: Waiting on specific theories proposal. |
Tool Qual | [SRI] Tool installation and qualification data: Besides the tool version, the actual installation configuration of the tool on a particular PC/OS is essential to establish confidence that the tool invocation produces correct results. The tool installation instructions must come with an installation configuration/verification checklist that should be filled in for this purpose when the tool is installed. Also, tool qualification data (pertaining to the particular tool version) must be included in the evidence. | Short-term: Will be there in v9.0. RACK Issue #574. Has been pushed into the master branch. |
[JHU-APL] JSON file pointers in data (have to pull reference on filesystem) | Makes sense. |
Pertains to: | Specific Problem |
Long Term / Short Term / Will Not Address/ Comments [& Owner] |
---|---|---|
[JHU-APL] Unable to programmatically discover new evidence in RACK. Suggestions: | Much of this is available in the new Explore instance counts; others are available through the ontology. We could add to the Python API. Need to convene a session to flesh this out and assess priority. –Paul Partially addressed with Explore tab: Release 8. Created https://github.com/ge-high-assurance/RACK/issues/583 – but it would be nice to confirm that Python is the missing piece for JHU. The system does provide this information in other ways. |
[JHU-APL] Have to post-process results of RACK queries to perform join / intersect / sort / select operations. Suggestion: | There seems to be consensus this would be useful. It is longer-term to design & build. -Paul Needs a team effort to define this task. This could be a big task; it is important to talk to folks and get the requirements correct. https://github.com/ge-high-assurance/RACK/issues/584 |
[LM-ATL] The query engine lacks a fast method for iteratively identifying all direct links to/from a given object within RACK; i.e., when looking at a particular TEST, I should be able to quickly find all other objects with a direct relationship to that TEST and then, without writing any query language or performing any drag-and-drop operations, select one of those objects and find all of its direct relations. | This would be a useful explore tool. It could be built in the medium term; SPARQL fully supports this type of thing. -Paul (Release 9) https://github.com/ge-high-assurance/RACK/issues/585 |
Constraints | [RTX] Queries with “minus” have limitations. For example: give me all requirements that do not have a “Passed” analysis. “Minus” is very important since we are always looking for missing data. | Probably a training & documentation issue. A meeting to see a bunch of examples might help. -Paul |
Constraints | [RTX] When variables are used in the filter, they need to be returned; if not, the data is not filtered. | Sounds like a bug worthy of a short-term solution. -Paul Release 8: confirmed fixed in v8.0 (can’t find the original issue). |
Delete | [RTX] Deleting names of variables | Sounds like a bug worthy of a short-term solution. -Paul Need a concrete example: what is a “remoted runtime variable” and what does it mean to “delete” it? |
Nodegroup Editing | [RTX] Having an undo, copy (a specific portion of the query from another nodegroup), and paste mechanism would help a lot (if it already exists, then it would be good to learn where to find it). | Undo would be useful and is doable in the short-medium term. https://github.com/ge-high-assurance/RACK/issues/586 Copy/paste/merge should be part of the “join / intersect” task above (RACK Issue #584). -Paul |
Performance | [RTX] How to group analyses to reduce scalability problems? RACK could not ingest the data when we tried to add a couple of analyses for each software component (66K). Another example: for “REQUIREMENT_CONSISTENCY_ANALYSIS”, should one analysis output be connected to each requirement, or one analysis output to all of them? | All of these ingest performance issues require careful analysis, and improvements may be limited while we’re running RACK on desktop Docker with Fuseki. |
Performance | [RTX] I have noticed that during the data ingestion process it sometimes takes a few tries for the data to be ingested successfully. The reason for failure (whenever it happens) is a connection timeout. | |
Performance | [RTX] It is not clear how to optimize the ingestion queries. | |
Performance | [RTX] Loading data into RACK takes too long. The GrammaTech data took 1 hour. We added connections between software components and files, and that took a couple of hours. Finally, we tried to add an analysis to every software component, and the ingestion never finished. | |
Performance | [RTX] Queries related to software components took longer than others. However, Lockheed Martin’s queries are the ones that take the longest – a couple of minutes. | |
Performance | [SRI] There should be some support for ingestion of files. If files cannot be physically put in the main database due to performance issues, some support could be provided in SPARQLgraph for navigating to a file based upon its filename in RACK. | |
Query Types | [RTX] Types of queries. | Short term. –Paul SPARQLgraph remembers the default query type, and the REST endpoint is available in Release 8. https://github.com/ge-high-assurance/RACK/issues/587 Add to Python interface: https://github.com/ge-high-assurance/RACK/issues/588 Fully support ASK: https://github.com/ge-high-assurance/RACK/issues/395 |
Query Types | [Adelard] SPARQLgraph does not support aggregate queries (e.g. GROUP BY / HAVING) | GROUP BY and aggregate functions are in the next release (Release 8 – done). HAVING clause: added to tech debt, https://github.com/ge-high-assurance/RACK/issues/590 |
Query Types | [Adelard] SPARQLgraph does not support MINUS queries (set difference) | Yes, it does. |
Query Types | [Adelard] SPARQLgraph does not support functions over attributes | Aggregate functions are in the next release. Done. |
Pertains to: | Specific Problem |
Long Term / Short Term / Will Not Address/ Comments [& Owner] |
---|---|---|
Shared RACK | [LM-ATL] As soon as is reasonable, preferably by the next assessment, a live, shared RACK instance should be accessible to all TA1s and TA3s for the purpose of pushing and pulling data. The production, proliferation, and utilization of ingest packages as a workaround for all performers having separate, local instances of the RACK was a large drag on productivity. |
Pertains to: | Specific Problem |
---|---|
SADL | [SRI] Ontology: Use of SADL to specify the ontology worked well. A useful feature was being able to restrict, in a subtype, the range of a relationship inherited from the parent class. This allowed “type checking” the foreign-key references in CSV files against the classes in the restricted range. |
COLLECTION and MODEL | [SRI] The COLLECTION and MODEL classes were quite useful – we had several subtypes of these in the SRI overlay. We may want to add some attributes to indicate COLLECTION semantics; i.e., what is the purpose of the collection? what does it mean for the entities to be in a collection – do they form parts of a whole? Can an entity participate in more than one collection? |
CDR ingestion | [SRI] Capability to ingest CSV files flexibly using nodegroups with blank columns and simple ordering. We had many cross-relationships between objects, but the ingestion part wasn’t difficult. Some additional checks/support would be useful as noted in section C (see DesCert-RACK-Gaps-Analysis.pdf, section C, “Issues Faced in Creating Evidence for Ingestion”). |