adding_contribution_to_orkg - Sidies/MasterThesis-HubLink GitHub Wiki


title: Adding a new Contribution to the ORKG

If your intention is to use the existing ORKG implementation and add your own data into the ORKG, this page explains how to do that.

Important: After changing data in the ORKG graph, make sure that the ORKG graph config is set to update the cache the first time any experiment is run:

# In your ORKG config file, add the following:
"additional_params": {
    "force_cache_update": True, # Set this to True the first time you run the experiment to update the cache.
    "force_publication_update": False # Only set this to True if you want the publication to be deleted and reinserted
}
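The two flags above can also be generated programmatically. The following is a minimal sketch, assuming the config is assembled as a Python dictionary; the helper name `orkg_additional_params` is hypothetical and not part of the SQA system.

```python
def orkg_additional_params(graph_data_changed: bool) -> dict[str, bool]:
    """Return the cache-related additional_params for one experiment run.

    Hypothetical helper: only the two keys come from the wiki page.
    """
    return {
        # Refresh the cache only on the first run after the graph data changed.
        "force_cache_update": graph_data_changed,
        # Leave publications in place; True would delete and reinsert them.
        "force_publication_update": False,
    }


# First run after changing data in the ORKG graph:
first_run = orkg_additional_params(graph_data_changed=True)
# Later runs can reuse the cache:
later_run = orkg_additional_params(graph_data_changed=False)
```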

Adding a new Contribution to the ORKG

First off, the class responsible for adding data to the ORKG is the ORKGKnowledgeGraphFactory class, located in the knowledge_base/knowledge_graph/storage/factory/implementations folder: ../blob/experiments/sqa-system/sqa_system/knowledge_base/knowledge_graph/storage/factory/implementations. It is a subclass of the KnowledgeGraphBuilder class and is responsible for ensuring that the data used in the experiments is added to the ORKG.

At its core, it uses an additional parameter called contribution_building_blocks, which maps each contribution name to the list of building blocks used to add that contribution. It is defined as:

AdditionalConfigParameter(
    name="contribution_building_blocks",
    description=("The building blocks to use for the graph creation"),
    param_type=dict[str, List[str]],
    default_value={"Publication Overview": [
        block.value for block in AnnotationBuildingBlock]},
)

For example, it can be added to the configuration file as follows:

"additional_params": {
    "contribution_building_blocks": {
        "Paper Class 2": [
            "paper_class"
        ],
        "Research Level 2": [
            "research_level"
        ],
        "Research Objects": [
            "first_research_object",
            "second_research_object"
        ],
        "Validity 2": [
            "validity"
        ],
        "Evidence 2": [
            "evidence"
        ]
    }
}

This is a dictionary in which each key is the name used for the contribution and each value is the list of building blocks that add data to that contribution. Consequently, if you want to add data to the ORKG in a way that is not yet implemented in the SQA system, you need to implement a new building block. The building blocks are located in the knowledge_base/knowledge_graph/storage/factory/implementations/orkg_knowledge_graph/orkg_template_builder/implementations folder: ../blob/experiments/sqa-system/sqa_system/knowledge_base/knowledge_graph/storage/factory/implementations/orkg_knowledge_graph/orkg_template_builder/implementations.
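Because every name in the lists must correspond to an implemented building block, it can help to check a configuration before running an experiment. The following is a minimal sketch; the function `find_unknown_blocks`, the set of known names, and the block name `my_unregistered_block` are assumptions for illustration. In the real project the known names would presumably come from the building block enums.

```python
def find_unknown_blocks(
    contribution_building_blocks: dict[str, list[str]],
    known_blocks: set[str],
) -> dict[str, list[str]]:
    """Return, per contribution, the block names that are not known."""
    unknown: dict[str, list[str]] = {}
    for contribution, blocks in contribution_building_blocks.items():
        missing = [name for name in blocks if name not in known_blocks]
        if missing:
            unknown[contribution] = missing
    return unknown


# Hypothetical configuration and set of registered block names.
config = {
    "Paper Class 2": ["paper_class"],
    "My Contribution": ["research_level", "my_unregistered_block"],
}
known = {"paper_class", "research_level"}

problems = find_unknown_blocks(config, known)
print(problems)  # {'My Contribution': ['my_unregistered_block']}
```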

1. Implementation of the Building Block

You need to create a new Python file for the building block in the implementations folder. In this file you implement the code that adds the data to the ORKG. Your implementation can inherit from one of three types of building blocks:

  • PaperContentBlock: This class is used to add data to the ORKG based on the extraction of the full text of the paper using the PaperContentExtractor class.

  • AnnotationsBlock: This class is used to add data to the ORKG based on the additional_fields field of the Publication class.

  • MetadataBlock: This class is intended to add metadata about a publication to the ORKG. It receives the complete Publication object.

All classes have a build() method that is used to add the data to the ORKG. The method should return a well-defined dictionary according to the ORKG documentation. The structure is as follows:

{
    "P123": [ # The predicate ID of the data to be added to the contribution. Here we add a literal.
        {
            "text": "The text of the data to be added to the contribution",
        }
    ],
    "P456": [ # This example is more complex as it contains a resource and not a literal as object.
        {
            "classes": ["C123"], 
            "values": {
                "P123": [
                    {
                        "text": "The text of the data to be added to the contribution",
                    }
                ]
            },
            "label": "A label for the resource object"
        }
    ]
}
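The structure above can be assembled with two small helpers, one for literals and one for nested resources. The following is a minimal sketch and not the project's implementation: the class `ExampleResearchLevelBlock` is a stand-in that does not inherit from the real AnnotationsBlock, the argument passed to `build()` is an assumption (the actual signature is not shown on this page), and the IDs `P123`, `P456`, and `C123` are the placeholders from the example above.

```python
from typing import Any

# Placeholder IDs taken from the example structure; real IDs come from the ORKG.
LITERAL_PREDICATE = "P123"
RESOURCE_PREDICATE = "P456"
RESOURCE_CLASS = "C123"


def literal(text: str) -> dict[str, str]:
    """Wrap a string as a literal object."""
    return {"text": text}


def resource(
    label: str,
    classes: list[str],
    values: dict[str, list[dict[str, Any]]],
) -> dict[str, Any]:
    """Describe a resource object that carries its own predicates."""
    return {"classes": classes, "values": values, "label": label}


class ExampleResearchLevelBlock:
    """Stand-in for a block that would inherit from AnnotationsBlock."""

    def build(self, additional_fields: dict[str, str]) -> dict[str, list[dict[str, Any]]]:
        # Assumption: the annotation is available under this key.
        level = additional_fields.get("research_level")
        if level is None:
            return {}  # Nothing to add for this publication.
        return {
            LITERAL_PREDICATE: [literal(level)],
            RESOURCE_PREDICATE: [
                resource(
                    label="A label for the resource object",
                    classes=[RESOURCE_CLASS],
                    values={LITERAL_PREDICATE: [literal(level)]},
                )
            ],
        }


result = ExampleResearchLevelBlock().build({"research_level": "Primary Research"})
```

Keeping the literal and resource shapes in helpers means every block produces the same dictionary layout, which makes a malformed contribution easier to spot.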

2. Registration of the Building Block

After implementing the building block, you need to register it in the ORKGKnowledgeGraphFactory class. The registration is done in the respective enum class: AnnotationBuildingBlock, FulltextBuildingBlock, or MetadataBuildingBlock. You need to import your building block class and add it to the corresponding enum.
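One way such a registration could look is sketched below. This is an assumption about the pattern, not the project's code: this page does not show how the real enums attach a class to a member, so `ExampleAnnotationBuildingBlock`, its `block_class` attribute, and `YourNewBuildingBlock` are all hypothetical. The only detail taken from the page is that each member's `.value` is the string used in the configuration.

```python
from enum import Enum


class YourNewBuildingBlock:
    """Placeholder for the class implemented in step 1."""

    def build(self) -> dict[str, list[dict[str, str]]]:
        return {"P123": [{"text": "example"}]}


class ExampleAnnotationBuildingBlock(Enum):
    """Hypothetical enum pairing a config name with a block class."""

    def __new__(cls, config_name: str, block_class: type):
        member = object.__new__(cls)
        member._value_ = config_name  # .value is the name used in the config
        member.block_class = block_class  # class that builds the data
        return member

    YOUR_NEW_BUILDING_BLOCK_1 = ("your_new_building_block_1", YourNewBuildingBlock)


# Resolve a name from the configuration to its class and build the data.
member = ExampleAnnotationBuildingBlock("your_new_building_block_1")
data = member.block_class().build()
```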

🥳 That's it! You can now add your building block in the configuration to add a new contribution to your papers. For example:

{
    "additional_params": {
        "contribution_building_blocks": {
            "Your new contribution": [
                "your_new_building_block_1",
                "your_new_building_block_2"
            ]
        }
    }
}