ICP8 - smqhw/kdm1 GitHub Wiki
1. Description of the ICP
In this ICP, we will extract subject-verb-object triplets from some text abstracts, construct their ontologies in Protégé, and view them using WebVOWL.
First, the code. The task is simple: open the desired file, use spaCy to perform the NLP extraction, and print the results. The texts I am using are the same as in ICP5, concerning the nature of quantum physics and consciousness. These are the extracted triples from the first abstract.
Here are the extracted triples from the second abstract.
I received fewer triplets than I had anticipated. As an experiment, I wrote a brief story and ran it through the code. It produced more triplets than the previous trials.
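The script itself is not reproduced on this page, so the following is only a minimal sketch of how the extraction step could look, not the exact code used here. It assumes spaCy's English dependency labels (`nsubj`, `nsubjpass`, `dobj`, `attr`, `dative`) and the `en_core_web_sm` model; the function names `extract_triples` and `triples_from_file` are hypothetical.

```python
from pathlib import Path

# Assumption: these dependency labels mark subjects and objects of a verb
# in spaCy's English models. Other label sets would need different values.
SUBJECT_DEPS = {"nsubj", "nsubjpass"}
OBJECT_DEPS = {"dobj", "attr", "dative"}


def extract_triples(doc):
    """Return (subject, verb lemma, object) tuples from a parsed spaCy Doc."""
    triples = []
    for token in doc:
        if token.pos_ not in ("VERB", "AUX"):
            continue
        # Only direct children of the verb are considered, so clauses whose
        # object hangs off a preposition produce no triple.
        subjects = [c for c in token.children if c.dep_ in SUBJECT_DEPS]
        objects = [c for c in token.children if c.dep_ in OBJECT_DEPS]
        for subj in subjects:
            for obj in objects:
                triples.append((subj.text, token.lemma_, obj.text))
    return triples


def triples_from_file(path, model="en_core_web_sm"):
    """Open a text file, parse it with spaCy, and return its triples."""
    import spacy  # imported here so extract_triples works without spaCy

    nlp = spacy.load(model)
    text = Path(path).read_text(encoding="utf-8")
    return extract_triples(nlp(text))


if __name__ == "__main__":
    for triple in triples_from_file("abs3.txt"):
        print(triple)
```

Because this sketch only pairs a verb with its direct subject and object children, passive constructions, prepositional objects, and long noun phrases yield few or no triples. That may be one reason dense academic abstracts return fewer triplets than a plainly written story.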
Now it is time to construct the ontology for these abstracts. I've chosen `abs3.txt`, since I think it is the most apt for an ontology web. Using Protégé, I will render both the subjects and objects as classes, while the verbs will function as object and data properties.
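The ontology here was built by hand in Protégé. As an illustration of the same mapping, the sketch below turns a list of triples into Turtle text that declares subjects and objects as `owl:Class` and verbs as `owl:ObjectProperty`. The function names, the `http://example.org/abs3#` namespace, and the sample triple are all hypothetical, and data properties are left out for brevity.

```python
import re


def local_name(label):
    """Turn a free-text label into a safe local name (hypothetical helper)."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", label.strip()).strip("_")
    return cleaned or "unnamed"


def triples_to_turtle(triples, base="http://example.org/abs3#"):
    """Render (subject, verb, object) triples as a small OWL ontology in Turtle."""
    lines = [
        f"@prefix : <{base}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
        f"<{base.rstrip('#')}> a owl:Ontology .",
        "",
    ]
    # Subjects and objects both become classes.
    classes = sorted({local_name(t) for s, _, o in triples for t in (s, o)})
    for name in classes:
        lines.append(f":{name} a owl:Class .")
    lines.append("")
    # Each verb becomes an object property linking subject class to object class.
    # Simplification: a verb reused across triples collects several
    # domain/range statements, which OWL reads as an intersection.
    for subj, verb, obj in triples:
        lines.append(
            f":{local_name(verb)} a owl:ObjectProperty ; "
            f"rdfs:domain :{local_name(subj)} ; "
            f"rdfs:range :{local_name(obj)} ."
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up triple for illustration only, not taken from abs3.txt.
    print(triples_to_turtle([("observer", "collapse", "wave function")]))
```

Saving the output with a `.ttl` extension should give a file that Protégé can open, which offers a starting point to refine by hand rather than a replacement for the manual modelling.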
After defining the objects and properties, and linking them via their triplet relationships, the resulting ontological web looks like this:
Video is provided in the Code file
Conclusion
This is an awesome visualization of the abstract. An ontology created from an entire paper would appear much denser. I think that for the architecture to be as accurate as possible, a weighting system should count the number of times a particular triplet appears and give more frequent triplets more importance.
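One possible shape for such a weighting system is sketched below, assuming triples are plain `(subject, verb, object)` tuples of strings. The function name `weight_triples` and the choice of relative frequency as the weight are assumptions for illustration, not something implemented in this ICP.

```python
from collections import Counter


def weight_triples(triples):
    """Map each distinct triple to its relative frequency (weights sum to 1)."""
    # Lowercase so that "Observer" and "observer" count as the same term.
    counts = Counter((s.lower(), v.lower(), o.lower()) for s, v, o in triples)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders the result from most to least frequent.
    return {triple: n / total for triple, n in counts.most_common()}


if __name__ == "__main__":
    # Made-up triples for illustration only.
    sample = [
        ("observer", "collapse", "state"),
        ("Observer", "collapse", "state"),
        ("mind", "affect", "matter"),
    ]
    for triple, weight in weight_triples(sample).items():
        print(triple, round(weight, 2))
```

The weights could then set edge thickness or filter out rare triples before building the ontology. Counting lemmas rather than surface forms would merge more variants of the same statement.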