10 13 22 Project 4 Data Analysis - rolandgriggs/URE2022_XGXaviGriggs GitHub Wiki

Project 4. Impact of the number of database updates on long-term database availability

Research aim:

Investigate whether the number of database updates affects the availability of databases. I followed protocol 5 (version 2022).

Research questions:

  1. How many databases have a very high number of published updates (more than 5 updates), a medium number of updates (2 to 5 updates), and no updates?

We calculated in Excel the number of databases that had 1, 2-5, or more than 5 updates. We found 1632 databases with only 1 update, 588 with 2-5 updates, and 122 with more than 5 updates.
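The counts above were produced in Excel. As a rough sketch of the same binning in Python, the snippet below assumes the three bins are applied directly to the `Nb_of_articles` value (so a value of 1 corresponds to a database with no published updates). The function names and the toy values are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def update_category(nb_of_articles: int) -> str:
    """Bin a database by its Nb_of_articles value (assumed binning)."""
    if nb_of_articles <= 1:
        return "1"            # a single publication, i.e. no published update
    if nb_of_articles <= 5:
        return "2-5"
    return "more than 5"


def count_by_category(article_counts):
    """Count how many databases fall into each bin."""
    return Counter(update_category(n) for n in article_counts)


# Toy values for illustration only, not the real data
toy_counts = [1, 1, 2, 5, 6, 12]
print(count_by_category(toy_counts))
```

Applied to the full `Nb_of_articles` column, this should give the same three totals as the spreadsheet, provided the binning assumption matches the one used there.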

  2. How many databases (published more than 10 years ago) have recent updates (updates in the past 10 years)?

We calculated the number of databases older than 10 years that had no updates, old updates, or recent updates. We found 736 databases with no updates, 263 with recent updates, and 236 with old updates.
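The classification described above could be sketched as follows. Several details are assumptions, not stated in the notes: publication dates are reduced to a year, 2022 is used as the reference year, a database counts as "old" when first published before 2012, and an update counts as "recent" when the last publication is from 2012 or later. The function name is hypothetical.

```python
REFERENCE_YEAR = 2022   # assumption: "10 years ago" is measured from 2022
WINDOW_YEARS = 10


def recency_category(first_year: int, last_year: int, nb_of_articles: int):
    """Classify an old database; return None if it is not older than 10 years."""
    cutoff = REFERENCE_YEAR - WINDOW_YEARS      # 2012 under the assumptions above
    if first_year >= cutoff:
        return None                             # first published within the window
    if nb_of_articles <= 1:
        return "no updates"                     # only the original publication
    if last_year >= cutoff:
        return "recent updates"
    return "old updates"


# Toy rows for illustration only: (first_year, last_year, nb_of_articles)
for row in [(2005, 2005, 1), (2005, 2016, 3), (2005, 2009, 2), (2015, 2018, 2)]:
    print(row, recency_category(*row))
```

Moving the cutoff by one year changes which databases count as old or recent, so the boundary should be checked against protocol 5 before comparing results with the totals above.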

  3. What is the proportion of available/unavailable databases with a high number, a medium number, or no updates?

We calculated the number of databases with no updates, 2-5 updates, and more than 5 updates among available and unavailable databases. We concluded that if a database is available, it is more likely to have a higher number of updates.
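One way to express this comparison is the share of available databases within each update category. The sketch below is illustrative only: the function name, the category labels, and the toy pairs are hypothetical, and it does not reproduce the proportions from the analysis.

```python
from collections import defaultdict


def availability_share(records):
    """Map each update category to the fraction of its databases that are available.

    `records` is an iterable of (category, is_available) pairs.
    """
    totals = defaultdict(int)
    available = defaultdict(int)
    for category, is_available in records:
        totals[category] += 1
        if is_available:
            available[category] += 1
    return {category: available[category] / totals[category] for category in totals}


# Toy pairs for illustration only, not the real data
toy_records = [
    ("no updates", True),
    ("no updates", False),
    ("more than 5", True),
    ("more than 5", True),
]
print(availability_share(toy_records))
```

Comparing shares within each category, not raw counts, keeps the much larger group of databases with no updates from dominating the comparison.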

Dataset description:

2343 entries (1 entry per database). Databases never published online were excluded.

Variables included:

  • db_id : Unique identifier for the database in JL_DB dataset
  • resource_name : Name of the database
  • first_publication : Date of the first article publication of the database
  • Nb_of_articles : Number of publications for that database. If equal to 1, the database had no published updates; if greater than 1, the database was updated.
  • last_publication : Date of the last publication. Equal to first publication if only one article was published for that database.
  • available_2022 : TRUE if the database is available online in 2022, FALSE if not
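As a minimal sketch of how these variables could be read into typed records, the snippet below assumes the dataset is exported as a CSV file whose header uses the variable names listed above. The export format, the `DatabaseRecord` class, the `read_records` helper, and the sample rows are all hypothetical. Dates are kept as text because the notes do not state their format.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class DatabaseRecord:
    db_id: str
    resource_name: str
    first_publication: str   # kept as text; the date format is not stated
    nb_of_articles: int
    last_publication: str
    available_2022: bool

    @property
    def was_updated(self) -> bool:
        # More than one publication means at least one published update
        return self.nb_of_articles > 1


def read_records(handle):
    """Parse CSV rows (header = variable names) into DatabaseRecord objects."""
    return [
        DatabaseRecord(
            db_id=row["db_id"],
            resource_name=row["resource_name"],
            first_publication=row["first_publication"],
            nb_of_articles=int(row["Nb_of_articles"]),
            last_publication=row["last_publication"],
            available_2022=row["available_2022"].strip().upper() == "TRUE",
        )
        for row in csv.DictReader(handle)
    ]


# Made-up rows for illustration only
sample = io.StringIO(
    "db_id,resource_name,first_publication,Nb_of_articles,last_publication,available_2022\n"
    "1,ExampleDB,2005,1,2005,TRUE\n"
    "2,OtherDB,2008,3,2016,FALSE\n"
)
records = read_records(sample)
print([(r.resource_name, r.was_updated, r.available_2022) for r in records])
```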