CEVOpen Blog Ideas - petermr/CEVOpen GitHub Wiki

Draft here

Sections

  • Context: Global challenges
  • Problem: Closed publishers, unstructured data. Publishers decide what you see
  • A Potential solution - Power to the reader -> ami
  • Case study - Proof of concept - Invasive species, medicinal activities, or climate change
  • Call-to-action for the readers - Take a look at our open tools and test them out!

Blog's intention:

We intend to introduce scientists (not necessarily data scientists) to open text mining tools and their role in tackling global challenges.

Thoughts on the blog and key messages:

  • Simon Worthington: Input 6 July 2021 SW.
    • See the blog as a stepping stone to the goal of getting 'domain experts' to use the tool set and to be able to run Citizen Science projects.
    • For the expert in a field, I think we need to answer the question 'What's in it for them? How will they benefit?'
    • Frame the blog as the 'quick start signpost' (which is what is already being done). Suggestions to include: position as a Citizen Science project; emphasise the translation part; Citizen Library is a nice concept; explain why it is better than, or different from, a search engine. The case needs to be made, or answers given, that will pique the reader's interest (I liked the verifiable search, in that there is a data trail), but there must be other, better examples.
    • Here is the journey I see for my contribution to the project, in which the blogpost plays a 'launch' part (the media campaign).
    1. Blogpost
    2. Pitch to November 'Global Knowledge Justice' workshop Santiago de Chile. Position CEVOpen and translation project https://docs.google.com/document/d/1hrCnxxD4c2Ocjlr4RBVKCZv_klTPFEt6wGKNOjj0ABA/edit
    3. Prior to workshop build foundations of a quick start guide by listing example projects and writing up case study
    4. Prior to workshop make a quick start guide
    5. Post workshop make a full guide
    6. Plan, simulate, and trial a Citizen Science project on rapid decarbonisation for a locality; get the Open Energy community/modellers on board. Start small and see if we can find a way to have CEVOpen show benefits; rinse and repeat. Make guide.
  • PMR:

Present Scenario:

We face climate change, pandemics, and other global challenges. How do we tackle them as humanity progresses? With the advent of technology and automation, we ought to make the best use of both. But are we really doing so? How can automation and text mining help solve global challenges?

Defining the problem

You have read a newspaper article about climate change, and you’d like to dig deeper. You open up a paper filled with jargon and terms you don’t understand. How would you make sense of it?

Or perhaps you are a researcher and would like to know more about something outside your domain of expertise. With research articles mainly aimed at communicating knowledge to experts in the area, how can you get past these barriers? What makes the problem worse is the sheer number of scientific papers out there.

One other grave problem is that publishers of scientific articles control what you see. This is a problem whose solutions lie beyond just creating tools. The involvement of communities plays an essential role here in changing the current model of publishing.

Explain the manual process of going through a paper. You can then relate this to how we can automate some of these steps.

Potential solution

There are several ways of tackling such problems, and various tools aim to do exactly that, such as Scholia and Open Knowledge Maps. Here we would like to concentrate on the tools we are developing.

Text-Mining and Wikidata integration - one of the solutions

At CEVOpen, we are building tools that automate some parts of the workflow. The main focus of pyami is integration with Wikidata. Wikidata, for those who aren’t aware, is a sister project of Wikipedia. You could think of it like a … . Our tools, at the very least, can go through scientific papers and annotate them with the terms we provide. We call these collections of terms ami dictionaries. The dictionaries are customizable, and you can create your own. Existing ones include country, organization, plant genus, and so on. You can find the full list of available dictionaries here.
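To make the idea of dictionary-based annotation concrete, here is a minimal sketch in plain Python. It is not the pyami API: the `annotate` function, the dictionary names, and the sample terms are all assumptions made up for illustration, and real ami dictionaries carry more information than a flat list of terms.

```python
import re


def annotate(text, dictionaries):
    """Return sorted (start, end, matched_text, dictionary_name) tuples.

    `dictionaries` maps a dictionary name to a list of terms.
    This is a hypothetical helper, not part of pyami.
    """
    hits = []
    for name, terms in dictionaries.items():
        if not terms:
            continue  # an empty dictionary cannot match anything
        # Try longer terms first so multi-word terms win over their prefixes.
        ordered = sorted(terms, key=len, reverse=True)
        alternatives = "|".join(re.escape(term) for term in ordered)
        pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), match.group(0), name))
    return sorted(hits)


# Toy dictionaries, invented for this example.
dictionaries = {
    "country": ["India", "Brazil"],
    "plant_genus": ["Lantana", "Mentha"],
}

text = "Lantana is an invasive genus reported in India and Brazil."
annotations = annotate(text, dictionaries)

for start, end, term, name in annotations:
    print(f"{start:>3}-{end:<3} {term:<8} [{name}]")
```

Each hit records where the term occurs and which dictionary it came from, which is the kind of data trail that lets a reader verify an annotation against the original paper.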

What’s interesting about integrating with Wikidata is that we can get simple definitions instantaneously. Wikidata, like Wikipedia, also stores information in multiple languages. This allows non-native English speakers to look up definitions for terms in their native language.
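As a rough sketch of how a multilingual lookup could work, the snippet below builds a request for Wikidata's public `wbgetentities` API and pulls labels and descriptions out of the JSON response. The helper names (`build_url`, `fetch_entity`, `definitions`) and the User-Agent string are assumptions for this example, not pyami functions, and the `sample` payload is hand-written placeholder data shaped like a response rather than real Wikidata content.

```python
import json
import urllib.parse
import urllib.request

API = "https://www.wikidata.org/w/api.php"


def build_url(qid, languages):
    """Build a wbgetentities URL for one item and a list of language codes."""
    params = {
        "action": "wbgetentities",
        "ids": qid,
        "props": "labels|descriptions",
        "languages": "|".join(languages),
        "format": "json",
    }
    return API + "?" + urllib.parse.urlencode(params)


def fetch_entity(qid, languages):
    """Fetch the JSON payload over the network (not called in this sketch)."""
    request = urllib.request.Request(
        build_url(qid, languages),
        headers={"User-Agent": "cevopen-blog-example/0.1"},  # assumed value
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def definitions(payload, qid):
    """Map each language code to a (label, description) pair."""
    entity = payload["entities"][qid]
    labels = entity.get("labels", {})
    descriptions = entity.get("descriptions", {})
    return {
        lang: (labels.get(lang, {}).get("value"), desc["value"])
        for lang, desc in descriptions.items()
    }


# Hand-written placeholder payload, NOT real Wikidata content.
sample = {
    "entities": {
        "Q42": {
            "labels": {
                "en": {"language": "en", "value": "(English label)"},
                "es": {"language": "es", "value": "(etiqueta en español)"},
            },
            "descriptions": {
                "en": {"language": "en", "value": "(English description)"},
                "es": {"language": "es", "value": "(descripción en español)"},
            },
        }
    }
}

for lang, (label, description) in definitions(sample, "Q42").items():
    print(f"[{lang}] {label}: {description}")
```

With network access, `definitions(fetch_entity("Q42", ["en", "es"]), "Q42")` should return the live label and description for that item in each requested language that Wikidata has.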

If all of this seems a little abstract, we’ve got a case study for you to better appreciate the technology.

Case Study: Ethics Statement, Invasive Species or Activity.

An example of putting the technology to use in a targeted fashion.

ami is still under development, and we would very much appreciate any input from early adopters and the user community. As we have seen with the pandemic, openly sharing knowledge is key to solving global challenges such as climate change.