Data Dictionary Database Overview - NAVADMC/ADSM GitHub Wiki

Output Definitions

There are outputs in 2 formats, a database and in extra "supplemental" files. Both are defined in this document.

Table Definitions within the SQLite database

Name Short Name Description Variable Count
Results_DailyByProductionType DailybyPT By iteration, by day, by production type output variables 223
Results_DailyByZone DailybyZone By iteration, by day, by zone output variables 8
Results_DailyByZoneandProductionType dailybyZoneandPT By iteration, by day, by zone by production type output variables 9
Results_DailyControls DailyControls By iteration, by day, by production type output variables 33
Results_UnitStats UnitStats A herd level summary of selected events 6
Results_resultsversion Version Provides the version of ADSM that results were generated from 3

Concepts

  • One iteration, or run, is one example of what could happen in a simulation.
  • Only one run may be needed, depending on the question you are attempting to answer. An example of this use case is to provide data for an emergency response exercise.
  • For an analytical question, many runs may be needed. Each iteration will simulate the inherent variability in biological systems, providing a range of possible outcomes.
  • The data dictionary has been organized to group similar fields together, and to present definitions in a more intuitive order. The order of fields in the actual database is not intuitive.
  • Comma-separated value (.csv) is a commonly used output method.

Where are the variables I see in the application in Results Home and in the graphs?

  • ADSM uses a SQLite database to store both input parameters and output variables. Since SQLite is a relational database management system, it is necessary to use the correct tool to access the database.
  • A SQLite tool (SQL Explorer) is built into the ADSM interface starting in version 3.5 and can be accessed in the SQL panel on the right side of the screen. Access the full results dataset using SQL Explorer panel
  • SQLite Studio is one example of a free tool that can access a SQLite database. Microsoft Access is another tool, but requires SQLite drivers to make a connection. We cannot install either of these tools for you, as they belong to third-party providers.
  • R can access the SQLite database, but requires a meaningful SQL query to make sense of the data.
  • The SQLite file is in the ADSM workspace, within the scenario folder, named with the scenario name and the extension ".db".
  • Example database queries are packaged in the ADSM workspace, in the directory called Example Database Queries.

Why does this require so many steps to get to the data?

  • ADSM needed a method to process the simulation and provide all output datasets as rapidly as possible.
  • ADSM also needs to work for both small examples and for large, complex examples.
  • Using the combination of SQLite files and csv files, users can get many output variables at a daily level, plus more output variables at the herd level as quickly as possible.
  • Using SQL queries allows manipulation of large amounts of data. For example, the 223 variables in Results_DailybyPT that are returned by iteration, by day and by production type require some arranging and sorting just to get a basic understanding of the output.
  • In addition, SQL can be used to join results to parameters, to provide a more user friendly output, such as a zone name or a production type name
  • If you would prefer to manipulate raw data, it is provided immediately after the iteration is run, in the settings/logs folder by iteration. These folders are overwritten on each run. This is NOT a recommended method of operation for a novice user.

How can I get a csv file?

  • A csv file can be exported using any of the query tools. Access the full results dataset using SQL Explorer panel
  • It is much more efficient to understand the data before dumping a big load of it out in a csv file that requires manipulation with Excel.
  • Example Database Queries are provided in the ADSM Workspace to help users get started with SQL.

What are all these different software packages in this process?

Every part in ADSM has a set of tasks that are organized to manage simulation and analysis efficiently.

Tool Description
ADSM A simulation engine that models the outbreak of animal disease. This software holds the logic.
ADSM User Interface Written in Python, Django with some javascript support, allows users to enter parameters in a user friendly format. Interacting with a simulation engine is a complex task. Most of us don't want to type out commands in a scripting language. The ADSM user interface assists with these commands.
ADSM Database SQLite database that is designed specifically for ADSM data. Every ADSM database has the same table structure. Therefore, queries are easy to move between scenarios to repeat an analysis.
SQLite SQLite is a relational database management system that stores data. The database enforces rules about the way each data table interacts with the other data tables. Databases maintain integrity of data.
SQLite Studio A tool that allows you to connect to a SQLite database.
MS Access Another tool that can connect to a SQLite database, if the correct drivers are installed. It allows a "query by example" interface.
R A widely used scientific analysis tool. Many pre-built routines are available to assist with data manipulation and analysis.
SQL Structured Query Language (SQL) is the programming language that allows a user to ask questions to the data in the database. These questions are called queries, or views. SQL is a commonly used language to work with relational databases.

What is the difference between Results_DailyBy_XXXX files and the Supplemental files?

Results Type Description
Results_dailybyXXXX These files are tables in the database. They hold descriptions of what events happen by iteration, by day across all herds.
Supplemental csv These files are output with a separate file for each iteration. They hold the detail by iteration, by day, by exact unit ID, making these files very large.
Daily_events_X Events include Detection, Tracing, Exam, Test, Vaccination, Destruction
Daily Exposures Exposures includes by herd exposure of direct and indirect contact. Airborne exposure is only included when an adequate exposure happens, as all exposures would be excessive data. The Results_DailyByPT shows the total airborne count.
Infection is also included in this file, meaning that the exposure was adequate to cause an infection, but MAY NOT HAVE if the unit was in a protected state (naturally immune or vaccine immune)
States_X States indicates the non-susceptible disease status a unit is in. Susceptible are not included to reduce the file size.

List of disease states

Transition State Single Character Code Numeric Code
Susceptible S 0
Latent L 1
Subclinical B 2
Clinical C 3
Naturally Immune N 4
Vaccine Immune V 5
Destroyed D 6

There are other tables in the SQLite database that start with ScenarioCreator_. What are those?

  • ScenarioCreator_ indicates that the table is holding input parameters for the model.
  • A data dictionary for the parameters is provide in a separate document.
  • Sometimes, the parameter tables are useful to join into queries to provide a user-friendly name.
  • For example, the database stores the Zone as a numeric identifier. It might be more useful for you to have the descriptive name that you gave the Zone in the user interface. ("Control Zone" vs 1)
  • The example queries packaged with the database show how to join in ScenarioCreator_ tables to get those user-friendly names.

SELECT
iteration,
day,
last_day,
zone_id, -- > not useful, use name instead
z.name, -- > gives us zone name
zonearea,
zoneperimeter,
numSeparateAreas
FROM Results_dailybyzone r
Join ScenarioCreator_zone z
on r.zone_id = z.id
ORDER BY -- Allows organization of results
1, 4, 2

Find more information about the database

⚠️ **GitHub.com Fallback** ⚠️