Quick start - diegosasso/workshop_ISH2023 GitHub Wiki

Let's describe three traits (1-3) of the hymenopteran insect from the figure above. It is a single individual female organism.

We will describe the traits in the Phenoscript language in VS Code and then, optionally, convert the description into OWL format and into an annotated natural-language description in Markdown format using the Phenospy package.

Requirements

You need to have the Phenoscript VS Code extension and the Phenospy package installed.

Note that using Phenoscript normally requires first configuring snippets in the phs-config.yaml file. The Phenoscript extension comes with preconfigured snippets from the AISM and HAO 2.0 ontologies by default, which we will use in this tutorial, so no separate configuration is required.

Phenoscript Description: First example

  1. Open your 'my_description.phs' file in VS Code.
  2. Copy-paste the text below into 'my_description.phs'. Save the file again. Done!
OTU = {
  DATA = {
    uberon-female_organism:id-1 .rdfs-label 'Genus species';
  }

  TRAITS = {
    uberon-female_organism:id-1 > aism-protibia >> pato-black;
    uberon-female_organism:id-1 > hao-mesoscutum >> pato-convex;
    uberon-female_organism:id-1 > hao-metasoma >> pato-red;
  }
}

This simple Phenoscript description states that the female labeled 'Genus species' has a black protibia, a convex mesoscutum, and a red metasoma.

Semantic descriptions can be thought of as graphs with nodes representing phenotype entities and edges representing relationships between them. Below is an example of how this graph can be visualized.

graph LR

    A(uberon-female_organism:id-1) -- .rdfs-label --> B(<font color=black>Genus species)
    A(uberon-female_organism:id-1) -- .has_part --> C1(aism-protibia)
    A(uberon-female_organism:id-1) -- .has_part --> C2(hao-mesoscutum)
    A(uberon-female_organism:id-1) -- .has_part --> C3(hao-metasoma)
    C1 -- .bearer_of -->  D1(pato-black)
    C2 -- .bearer_of -->  D2(pato-convex)
    C3 -- .bearer_of -->  D3(pato-red)

    linkStyle default stroke:#0077be
    linkStyle 0 stroke:orange
    style B fill:orange   ;

Note the following Phenoscript syntax features. The symbol : introduces a user-defined tag, which can be any alphanumeric ID (e.g., id-1); nodes that share the same tag are treated as the same node across different lines of code. Without this tag, the three lines would produce three different individuals (nodes) of uberon-female_organism. The symbols > and >> are aliases that serve as shortcuts for the has part and bearer of relationships, respectively.
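
The tag and alias rules described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Phenospy parses files: the names `ALIASES` and `statement_to_triples` are hypothetical, and the sketch handles only the simple `node > node >> node;` form used in the first example.

```python
# Alias table taken from the text: '>' stands for has_part, '>>' for bearer_of.
ALIASES = {">": "has_part", ">>": "bearer_of"}


def statement_to_triples(statement):
    """Expand one simple statement into (subject, relation, object) triples.

    Hypothetical helper for illustration; it is not part of Phenospy.
    """
    tokens = statement.strip().rstrip(";").split()
    triples = []
    # Tokens alternate: node, operator, node, operator, node, ...
    for i in range(1, len(tokens) - 1, 2):
        subject, operator, obj = tokens[i - 1], tokens[i], tokens[i + 1]
        triples.append((subject, ALIASES[operator], obj))
    return triples


traits = [
    "uberon-female_organism:id-1 > aism-protibia >> pato-black;",
    "uberon-female_organism:id-1 > hao-mesoscutum >> pato-convex;",
    "uberon-female_organism:id-1 > hao-metasoma >> pato-red;",
]

triples = [t for s in traits for t in statement_to_triples(s)]

# Nodes are compared by their full text, including the tag, so the three
# mentions of 'uberon-female_organism:id-1' collapse into a single node.
nodes = {n for subject, _, obj in triples for n in (subject, obj)}

for triple in triples:
    print(triple)
print(len(nodes), "distinct nodes")
```

Because the organism carries the same tag on every line, the set of nodes contains it only once, which mirrors the single root node in the graph above.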

The Introduction to Phenoscript page provides detailed descriptions of the language's rules.

Now, let's try writing some descriptions ourselves!

  • You can use the template below as a quick-start:

OTU = {

  DATA = {

    # Remove hashes and change if desired.
    # The metadata below uses information from additional sources such as Darwin Core and GBIF, and ontologies such as the 'Comparative Data Analysis Ontology' (CDAO) and 'Information Artifacts Ontology' (IAO).

    uberon-female_organism:id-2c5eb8[linksTraits = True, this = True, cls = 'dwc-Organism'] .rdfs-label 'org_Gryonoides_brasiliensis';
    uberon-female_organism:id-2c5eb8 .dwc-Catalog_Number 'XXXXXX';
    uberon-female_organism:id-2c5eb8 .iao-denotes cdao-TU .iao-denotes taxrank-species:id-6aba72;
    taxrank-species:id-6aba72 .dwc-Taxon_ID_TaxonID 'https://www.gbif.org/species/11754664';
    taxrank-species:id-6aba72 .rdfs-label 'Gryonoides brasiliensis';


  }

  TRAITS = {

    this > aism-protibia >> pato-black;
    this > hao-mesoscutum >> pato-convex;
    this > hao-metasoma >> pato-red;

  }

}

Converting Phenoscript to OWL and Markdown

This part is optional, but we will show it as a demonstration.

  1. In your workshop_ish2023 folder, create two subfolders output and md.
  2. Make sure the phs-config.yaml file is configured as explained here. The file phs-config.yaml contains all the necessary information for converting the Phenoscript file format PHS into OWL.
  3. In VS Code, go to "File", then "New File" and create a new file called my_script.py.
  4. Copy-paste the code from below into the my_script.py file. This code reads in the Phenoscript phs file, transforms it into XML and OWL formats (the files are stored in the output folder), and then converts the OWL file into an annotated natural-language description (the file is saved in the md folder).
  5. If the Python extension is installed in VS Code, run the code by clicking the "play" button ("Run Python file") in the top-right corner of the editor.

from phenospy import *
import os

# Set the working directory (replace with the path to your workshop folder)
current_dir = '/YOUR_DIR/workshop_ish2023'

# -----------------------------------------
# ARGUMENTS
# -----------------------------------------
phs_file    = os.path.join(current_dir, 'my_description.phs')
yaml_file   = os.path.join(current_dir, 'phs-config.yaml')
save_dir    = os.path.join(current_dir, 'output/')
save_pref   = 'Gryo_species'

# -----------------------------------------
# Convert PHS to OWL and XML
# -----------------------------------------
phsToOWL(phs_file, yaml_file, save_dir, save_pref)

# -----------------------------------------
# OWL to Markdown
# -----------------------------------------
# get owl file
owl_file = os.path.join(save_dir, save_pref + '.owl')

# Make NL graph
onto = owlToNLgraph(owl_file)

# Convert NL graph to Markdown
taxon = 'org_Gryonoides_brasiliensis'
file_md = os.path.join(current_dir, 'md', 'Gryo_species.md')
ind0 = onto.search(label = taxon)[0]
md = NLgraphToMarkdown(onto, ind0, file_save=file_md, verbose=True)


# -----------------------------------------
# Save Markdown to HTML
# -----------------------------------------

import markdown

# Helper function: write the HTML string to a file
def save_html_to_file(html_content, output_file):
    with open(output_file, 'w', encoding='utf-8') as fl:
        fl.write(html_content)


html = markdown.markdown(md)
output_html = os.path.join(current_dir, 'md', 'Gryo_species.htm')
save_html_to_file(html, output_html)
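
The script above expects the output and md subfolders from step 1, as well as my_description.phs and phs-config.yaml, to be present in the workshop folder. A minimal sketch of a pre-flight check is shown below. It uses only the Python standard library; `prepare_workspace` is a hypothetical helper and not part of Phenospy, and the demonstration runs in a temporary directory rather than in your real folder.

```python
import tempfile
from pathlib import Path


def prepare_workspace(base_dir):
    """Create the output and md subfolders and list any missing input files.

    Hypothetical helper for illustration; it is not part of Phenospy.
    """
    base = Path(base_dir)
    for sub in ("output", "md"):
        # exist_ok=True makes the call safe to repeat
        (base / sub).mkdir(parents=True, exist_ok=True)
    required = ["my_description.phs", "phs-config.yaml"]
    return [name for name in required if not (base / name).is_file()]


# Demonstration in a throwaway directory; in the workshop you would pass
# the path to your own workshop_ish2023 folder instead.
with tempfile.TemporaryDirectory() as demo_dir:
    missing = prepare_workspace(demo_dir)
    created = sorted(p.name for p in Path(demo_dir).iterdir() if p.is_dir())

print("Created subfolders:", created)
print("Missing input files:", missing)
```

If the returned list is not empty, the listed files should be added to the folder before running my_script.py.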

  6. Open the Markdown file Gryo_species.md from the md folder in VS Code and press Ctrl+Shift+V (Cmd+Shift+V on macOS). The Markdown will be rendered into a formatted document that looks like this:

Relevant Info

  • Part 1. The main types of semantic constructs (quick overview).

  • Part 2. Hands-on activity (feel free to ask for help).

  • Part 3. Save your .PHS files and drop them in this folder

  • Example of a semantic description of a real organism (Gryonoides).

  • Templates with examples of different types of semantic statements can be found here.