2.4.LORIS Data Query Tool (DQT) - neurohub/neurohub_documentation GitHub Wiki
Using the LORIS Data Query Tool (DQT)
“LORIS (Longitudinal Online Research and Imaging System) is a self-hosted web application that provides data and project management for neuroimaging research. LORIS makes it easy to manage large datasets including behavioral, clinical, neuroimaging and genetic data acquired over time or at different sites.” https://github.com/aces/Loris
LORIS is a core component of the NeuroHub platform. This special instance of LORIS has been adapted to house all the brain images of the UK Biobank dataset along with these participants' “tabular” data.
- 39 677 participants
- 182 instruments
- ~20 TB of NIfTI image files (T1, DWI, FLAIR, rfMRI)
The DQT module allows users to select variables of interest and apply filters to the population. It harnesses a “NoSQL” database by querying prebuilt indexes, which are generated by map-reduce functions organized in views. The results can be downloaded as CSV and file attachments, or exported to NeuroHub using CBRAIN’s HTTP API.
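To make the idea of a prebuilt view more concrete, the sketch below mimics it in plain Python: a map function emits (key, value) pairs for each document, the pairs are sorted once into an index, and a query becomes a lookup in that index instead of a scan over every document. The document layout, field names and function names are all hypothetical; they illustrate the general technique and do not reflect the actual views used by the DQT.

```python
from bisect import bisect_left, bisect_right

# Hypothetical documents: one per participant and instrument.
DOCS = [
    {"participant": "P001", "instrument": "demographics", "data": {"sex": "F"}},
    {"participant": "P002", "instrument": "demographics", "data": {"sex": "M"}},
    {"participant": "P003", "instrument": "demographics", "data": {"sex": "F"}},
]


def map_sex(doc):
    """Map step: emit one (key, value) pair per matching document."""
    if doc["instrument"] == "demographics":
        yield (doc["instrument"], "sex", doc["data"]["sex"]), doc["participant"]


def build_view(docs, map_fn):
    """Run the map step once and keep the emitted pairs sorted by key."""
    return sorted(pair for doc in docs for pair in map_fn(doc))


def query(view, key):
    """Return the values stored under `key`, located by binary search."""
    keys = [k for k, _ in view]
    lo, hi = bisect_left(keys, key), bisect_right(keys, key)
    return [value for _, value in view[lo:hi]]


def reduce_count(values):
    """Reduce step: collapse the values found for a key into a count."""
    return len(values)


view = build_view(DOCS, map_sex)
females = query(view, ("demographics", "sex", "F"))
print(females, reduce_count(females))
```

Because the index is built ahead of time, the cost of the map step is paid once rather than on every query.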
A step-by-step guide is available via the video from the Workshop: Accessing the UK Biobank and the Data Query Tool!
Step by step
- Log in to LORIS at https://ukbb.loris.ca/
Example #1
Load Existing Query and run it.
- In the Shared Saved Queries section, choose any saved query
- Go to the Run Query step and click the Run Query button
Example #2
Create New query
- Go to Define Fields step
- Select the category of interest using the search bar. For example, look for t1_structural_brain_mri__2_110
NOTE: In the search bar, enter the category number (here 110) to get the matching categories.
- The complete list of data fields linked to that category is then displayed.
NOTE: Data fields can NOT be searched by field ID; search by name, or select the fields you are interested in from the list.
- Select the fields of interest
- Go to Define Filters
- Add mandatory filter(s). Note that at least one filter is required to run the query
- Click on the Add rule button
- Select the filters of interest and specify the necessary criteria
- Run query
For information on data fields, categories and encoding, please visit the UK Biobank Data Showcase. The schema files used to organize the data in this database can be found at https://biobank.ndph.ox.ac.uk/showcase/schema.cgi
Example #3
Export Data to NeuroHub
- Log in to NeuroHub
- Go to the My Account page and generate an API token by clicking on + New API Token
- In the DQT, run the query
- Click on the Export Results To NeuroHub button
- Paste your token in the prompt
- Click Yes, export
- A pop-up window will appear confirming that a cbcsv file and a csv file have been created
- In CBRAIN, the files will be available in your personal user project under files
Please note that to run a tool in CBRAIN, the required format is cbcsv.
UK Biobank Data Dictionary index
To help you select the required fields for your query, a Data Dictionary index is available here. This page provides guidance on the name, description, ID and category of the UK Biobank fields. The Data Dictionary and the UK Biobank Data Showcase can be used to identify the variables you are interested in before creating your query.
Using the DQT for the COVID-19 dataset in the UK Biobank
This documentation provides a brief overview of how to perform data selection (image data and behavioural data) on the COVID-19 datasets in the UK Biobank, and of how the selected data becomes visible in NeuroHub.
Context
ukb-covid.loris.ca holds a reduced set of data for the UK Biobank participants who have COVID-19 records. From the Data Query Tool (DQT) module, you can query that dataset for any variable (field), along with the T1w images provided in our ukbb application. When you click the Export to NeuroHub button, all selected fields are saved in a csv file in your NeuroHub project. If the selected fields contain image files, a special file (CBRAIN File List) is also created. This special file lets users work with the copy of the files that resides on Compute Canada, avoiding the creation of another copy of all those image files.
Prerequisite
You need an approved UK Biobank account before requesting a LORIS account (details on the application process can be found here). After filling in the account request form for UK Biobank data access, you should have received an email from [email protected] containing a password to access a LORIS instance.
Step by step
- Go to https://ukb-covid.loris.ca/
- Log in using your email address and the password from the email.
- The first time you log in, you will be prompted to enter a new password
- You are directed to your dashboard
- Use the menu Report > Data Query Tool at the top to access the DQT module here
- Now, you can see the overview of the DQT
Query the data
Define Fields
- Select Define fields
- Instrument: To select a field, first choose an instrument, then select the desired fields. The UK Biobank organizes its data as a tree in which the nodes are categories and the leaves are the fields (variables). The instruments in the DQT are the first parent nodes of each field.
- e.g.: Vocabulary level is under its parent category here.
- Special cases: demographics, images and COVID-19.
- Demographics contains LORIS-specific fields about the participants (e.g.: Date of birth, Sex).
- Note about date of birth: the dataset only contains the year and month of birth, so the day of birth is set to the 15th of the month for every participant.
- use mri_data and select Selected_t1w to get access to the structural images
- you will see a download icon, indicating that these fields are files in themselves (instead of scalar values), so they will be handled differently during the Download and Export step
- Search with instrument: Here, you can enter free text (e.g. T1w)
- Visits: with this option you can further narrow your search
- click on the fields you are interested in; these are then summarized under Fields on the right side of the page
- another useful instrument is ukbb_covid19, which includes the fields laboratory, origin, result, specdate and spectype.
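The relationship between categories, instruments and fields described above can be sketched as a small tree walk. In this illustration, nested dictionaries stand for categories and lists stand for the fields (leaves); each field is mapped to its immediate parent category, which plays the role of the instrument. The category names, field names and the `fields_by_instrument` helper are hypothetical and are not taken from the actual UK Biobank schema.

```python
# Hypothetical category tree: nested dicts are categories, lists hold fields.
TREE = {
    "imaging": {
        "brain_mri": {
            "t1_structural_brain_mri": [
                "volume_of_grey_matter",
                "volume_of_white_matter",
            ],
        },
    },
    "cognitive_function": {
        "fluid_intelligence": ["fluid_intelligence_score"],
    },
}


def fields_by_instrument(node, name=None, out=None):
    """Walk the tree and map each field to its immediate parent category."""
    out = {} if out is None else out
    if isinstance(node, list):
        # Leaves: the fields themselves, attached to the enclosing category.
        for field in node:
            out[field] = name
    else:
        # Inner node: recurse into each sub-category.
        for child_name, child in node.items():
            fields_by_instrument(child, child_name, out)
    return out


for field, instrument in fields_by_instrument(TREE).items():
    print(f"{field} -> {instrument}")
```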
Define Filters
- to add a filter, select an instrument and a field then enter the condition (operator + value)
- filters can be grouped using AND or OR operators. Use the Add Group and Add Rule buttons to create the desired filters
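The way rules and groups combine can be illustrated with a short recursive evaluator. This is only a sketch of the logic, assuming a hypothetical nested-dictionary representation of a filter; the DQT's internal format, the field names and the sample rows are not taken from the tool.

```python
import operator

# Comparison operators a rule may use (an illustrative subset).
OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


def matches(node, row):
    """Evaluate a single rule, or a group of rules, against one row."""
    if "rules" in node:
        # A group: combine the results of its children with AND or OR.
        results = (matches(child, row) for child in node["rules"])
        return all(results) if node["combinator"] == "AND" else any(results)
    # A single rule: field, operator and value.
    return OPS[node["operator"]](row[node["field"]], node["value"])


# (sex = F) AND (age >= 60 OR result = 1)
query_filter = {
    "combinator": "AND",
    "rules": [
        {"field": "sex", "operator": "=", "value": "F"},
        {
            "combinator": "OR",
            "rules": [
                {"field": "age", "operator": ">=", "value": 60},
                {"field": "result", "operator": "=", "value": 1},
            ],
        },
    ],
}

rows = [
    {"id": "P001", "sex": "F", "age": 64, "result": 0},
    {"id": "P002", "sex": "M", "age": 70, "result": 1},
    {"id": "P003", "sex": "F", "age": 51, "result": 0},
]
print([row["id"] for row in rows if matches(query_filter, row)])
```

Nesting a group inside another group is what makes it possible to mix AND and OR in a single query.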
View data
- click Run Query to see the results. It can take a few minutes to load.
- the data can be downloaded (Download Data as ZIP) in the browser or exported to NeuroHub (Export Data to Neurohub)
- Download Data as a csv
- This option is available at the bottom of the data table using the Download Table as CSV button. The data in the table will be sent to the browser as a single csv file. For images, the url of the file will be “displayed” in the csv.
- Download Data as ZIP
- This is identical to the “Download Data as a csv” with the exception that Image files will be downloaded automatically and zipped in the browser.
- Export Data to Neurohub
- a modal window will appear asking to provide a NeuroHub token
- to obtain that token, go to My Account page in NeuroHub
- copy the token in the API token section (Neurohub API token)
- the exported data is listed under Latest Updated Files on your Neurohub Dashboard
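Once a table has been downloaded as csv, image fields appear as URLs next to the scalar values. The sketch below shows one way to separate the two with Python's standard csv module. The column names, the sample URLs and the `split_scalars_and_files` helper are made up for illustration, and the inline sample stands in for a real downloaded file.

```python
import csv
import io

# Stand-in for the contents of a downloaded table; columns and URLs are made up.
SAMPLE_CSV = """participant,sex,t1w
P001,F,https://example.org/files/P001_t1w.nii.gz
P002,M,https://example.org/files/P002_t1w.nii.gz
"""


def split_scalars_and_files(csv_text):
    """Return (scalar_rows, file_urls) from a results table given as CSV text."""
    scalar_rows, file_urls = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        scalars = {}
        for column, value in row.items():
            if value and value.startswith(("http://", "https://")):
                file_urls.append(value)   # image field: keep the link
            else:
                scalars[column] = value   # ordinary scalar field
        scalar_rows.append(scalars)
    return scalar_rows, file_urls


scalar_rows, file_urls = split_scalars_and_files(SAMPLE_CSV)
print(scalar_rows)
print(file_urls)
```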
Statistical Analysis
- this option computes summary statistics (e.g. min, max, avg)
- additionally, an R-squared plot (scatterplot) can be created for the variables of interest
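For readers who want to reproduce these figures on exported data, here is a minimal sketch of the same kind of summary. It assumes that the R-squared value is that of a simple linear fit between two variables, in which case it equals the squared Pearson correlation. The function names and sample values are hypothetical, and this is not the DQT's own implementation.

```python
from statistics import mean


def summarize(values):
    """Return the minimum, maximum and average of a list of numbers."""
    return {"min": min(values), "max": max(values), "avg": mean(values)}


def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long series."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
print(summarize(xs))
print(r_squared(xs, ys))
```

Note that `r_squared` divides by zero if either series is constant, so such variables should be excluded beforehand.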