2.3. Using the LORIS Data Query Tool (DQT)
This documentation provides a brief overview of how to select data (image data and behavioural data) from the COVID-19 datasets in the UK Biobank and make it visible in NeuroHub.
Context
ukb-covid.loris.ca holds a reduced set of data for the UK Biobank participants who have COVID-19 records. From the Data Query Tool (DQT) module, you can query that dataset for any variable (field), along with the T1w images provided in our ukbb application. When you click the Export to NeuroHub button, all selected fields are saved in a CSV file in your NeuroHub project. If the selected fields contain image files, a special file (CBRAIN File List) is also created. This file lets users work with the copy of the files that resides on Compute Canada without creating another copy of all those image files.
Prerequisite
You need an approved UK Biobank account before requesting a LORIS account (details on the application process can be found here). After filling in the account request form for UK Biobank data access, you should have received an email from [email protected] containing a password to access a LORIS instance.
Step by step
- go to https://ukb-covid.loris.ca/
- log in using your email address and the password from the email.
- If this is the first time you log in, you will be prompted to enter a new password
- you are directed to your dashboard
- use the menu Report > Data Query Tool at the top to access the DQT module
- now, you can see the overview of the DQT
Query the data
Define Fields
- Select Define fields
- Instrument: To select a field, first choose an instrument, then select the desired fields. The UK Biobank organizes the data in a tree where the nodes are categories and the leaves are the fields (variables). The instruments in the DQT are the first parent nodes of each field.
- ex: Vocabulary level is under its parent category here.
- *Special cases for demographics, images and covid19.
- Demographics contain LORIS specific fields about the participants (e.g.: Date of birth, Sex).
- Note about date of birth: the dataset only contains the year and month of birth. We set the day of birth to the 15th of the month for every participant.
- use mri_data and select Selected_t1 w to get access to the structural images
- you will see a download icon, indicating that these fields are files by themselves (instead of scalar values), thus they will be handled differently during the Download and Export step
- Search with instrument: Here, you can enter a free text (e.g. T1w)
- Visits: with this option you can further narrow your search
- click on the files you are interested in, these are then summarized under Fields on the right side of the page
- another useful instrument is ukbb_covid19, which includes the fields laboratory, origin, result, specdate and spectype.
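The date-of-birth convention noted above (only year and month are known, and the day is fixed to the 15th) can be sketched in Python as follows. The function names are hypothetical and are not part of LORIS; the point is only that any age derived from such a date can be off by up to about two weeks.

```python
from datetime import date


def placeholder_birth_date(year: int, month: int) -> date:
    """Build a birth date from year and month, fixing the day to the 15th."""
    return date(year, month, 15)


def approximate_age(birth: date, on: date) -> int:
    """Age in whole years on a given date, based on the placeholder birth date."""
    had_birthday = (on.month, on.day) >= (birth.month, birth.day)
    return on.year - birth.year - (0 if had_birthday else 1)


# Hypothetical participant born in March 1950.
dob = placeholder_birth_date(1950, 3)
print(dob.isoformat())
print(approximate_age(dob, date(2020, 3, 14)))
```

Because the real day of birth is unknown, ages computed near the birthday month should be treated as approximate.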
Define Filters (optional)
- to add a filter, select an instrument and a field then enter the condition (operator + value)
- filters can be grouped using AND or OR operators. Use the Add Group and Add Rule buttons to create the desired filters
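To illustrate how rules and groups combine, here is a minimal Python sketch that evaluates a nested AND/OR filter against rows of data. It is not how the DQT implements filtering: the dictionary layout, the `matches` function, and the coded values are all assumptions made for the example. Only the field names (result, origin, spectype) come from the ukbb_covid19 instrument mentioned above.

```python
import operator

# Supported comparison operators for a single rule (assumed set).
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def matches(row, node):
    """Return True if the row satisfies a rule or a (possibly nested) group."""
    if "items" in node:  # a group: combine its children with AND or OR
        results = (matches(row, child) for child in node["items"])
        return all(results) if node["combine"] == "AND" else any(results)
    # a rule: field + operator + value
    return OPERATORS[node["op"]](row[node["field"]], node["value"])


# Hypothetical rows with made-up coded values.
rows = [
    {"result": 1, "origin": 0, "spectype": 2},
    {"result": 1, "origin": 0, "spectype": 1},
    {"result": 0, "origin": 1, "spectype": 2},
]

# result = 1 AND (origin = 1 OR spectype = 2)
query = {
    "combine": "AND",
    "items": [
        {"field": "result", "op": "=", "value": 1},
        {
            "combine": "OR",
            "items": [
                {"field": "origin", "op": "=", "value": 1},
                {"field": "spectype", "op": "=", "value": 2},
            ],
        },
    ],
}

selected = [row for row in rows if matches(row, query)]
print(selected)
```

Each Add Group click corresponds to a nested dictionary with its own `combine` operator, and each Add Rule click corresponds to one field/operator/value entry.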
View data
- click Run Query to see the results. It can take a few minutes to load.
- the data can be downloaded (Download Data as ZIP) in the browser or exported to NeuroHub (Export Data to Neurohub)
- Download Data as a csv
- This option is available at the bottom of the data table using the Download Table as CSV button. The data in the table will be sent to the browser as a single CSV file. For images, the URL of the file will be shown in the CSV.
- Download Data as ZIP
- This is identical to “Download Data as a csv”, except that image files are downloaded automatically and zipped in the browser.
- Export Data to Neurohub
- a modal window will appear asking to provide a NeuroHub token
- to obtain that token, go to My Account page in NeuroHub
- copy the token in the API token section (Neurohub API token)
- the exported data is listed under Latest Updated Files on your Neurohub Dashboard
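Once a CSV has been downloaded or exported, scalar values and image fields usually need to be handled separately, since image fields appear as file URLs. The sketch below shows one way to split them in Python. The column names, the sample row, and the `example.org` URL are placeholders invented for the example; the actual layout of the exported CSV may differ.

```python
import csv
import io

# Hypothetical CSV content: two scalar columns and one image (URL) column.
SAMPLE_CSV = (
    "participant,sex,t1w\n"
    "P001,Female,https://example.org/files/P001_t1w.nii.gz\n"
    "P002,Male,https://example.org/files/P002_t1w.nii.gz\n"
)


def is_url(value):
    """Treat any cell starting with http:// or https:// as a file reference."""
    return value.startswith(("http://", "https://"))


def split_scalars_and_files(csv_text):
    """Return (rows with scalar cells only, list of file URLs found)."""
    scalars, file_urls = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        scalars.append({k: v for k, v in row.items() if not is_url(v)})
        file_urls.extend(v for v in row.values() if is_url(v))
    return scalars, file_urls


scalars, file_urls = split_scalars_and_files(SAMPLE_CSV)
print(scalars)
print(file_urls)
```

To read a real file instead of the in-memory sample, pass the text of the downloaded CSV to `split_scalars_and_files`.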
Statistical Analysis
- this option computes summary statistics (e.g. min, max, avg)
- additionally, an R-square plot (Scatterplot) can be created for the variable of interest
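For readers who want to reproduce these numbers on downloaded data, the following Python sketch computes the same kind of summary statistics and an R-square value. It assumes that the R-square reported with the scatterplot is the coefficient of determination of a simple linear fit between two variables (equal to the squared Pearson correlation); the DQT's exact computation is not documented here, and the sample values are made up.

```python
from statistics import mean


def summary(values):
    """Minimum, maximum and average of a list of numbers."""
    return {"min": min(values), "max": max(values), "avg": mean(values)}


def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Made-up values for two variables of interest.
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]

print(summary(x))
print(r_squared(x, y))
```

`r_squared` divides by the variance terms, so it is undefined when either variable is constant.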