Data Copy Specification (Related Datasets)

Use cases

Copy from Solr

  1. create a new Study
  2. in Solr, select the datasets you want to copy
  3. click on "copy to study"
  4. select the study
  5. datasets are copied along with their entity hierarchy (exams, acq, ...)

Copy of a whole acquisition / examination / subject / study

  1. on an acquisition details page, click "copy to study"
  2. select the study
  3. the acquisition and all of its datasets are copied to the study, along with the exam; a new subject-study is also added if needed

This could be developed as a second part.

Current rights data model

Currently, a dataset's affiliation to a study is defined in the data model by a chain of references: Study < Examination < Acquisition < Dataset. Meanwhile, our current access rights are defined at the subject-study level, meaning that for a given [subject, study] all the underlying data (exams, acquisitions, datasets) is either entirely accessible or entirely forbidden.

+--------------+
|     User     |
+--------------+
      (*)
       |      +--------------+               +--------------+
       |------|  StudyUser   | (*)-------(*) |    Center    | 
       |      +--------------+               +--------------+
      (*)                                           | (1)
+--------------+                       +------------+
|    Study     | (1)---+           (*) |
+--------------+       |       +--------------+             +--------------+             +--------------+
                       |---(*) | Examination  | (1)-----(*) | Acquisition  | (1)-----(*) |   Dataset    |
+--------------+       |       +--------------+             +--------------+             +--------------+
|   Subject    | (1)---+
+--------------+

Access conditions by model entity

  • visible studies :

    • studies I am a member of
    • public studies
  • visible examinations

    • examinations in a study I am a member of
      • but not if the examination is in a forbidden center
  • visible acquisitions

    • acquisitions in an examination I can see
  • visible datasets

    • datasets in an acquisition I can see
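
The current rules above can be summarised in a small sketch. This is only an illustration under stated assumptions, not the actual Shanoir implementation: the classes `Membership`, `Study` and `Examination` and both functions are hypothetical names, and an empty `allowed_center_ids` is assumed to mean "no center restriction".

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Membership:
    """Hypothetical stand-in for one StudyUser row of the current user."""
    study_id: int
    # Assumption: an empty set means "no center restriction" for this study.
    allowed_center_ids: FrozenSet[int] = frozenset()


@dataclass(frozen=True)
class Study:
    id: int
    public: bool = False


@dataclass(frozen=True)
class Examination:
    id: int
    study_id: int
    center_id: int


def can_see_study(study: Study, memberships: Iterable[Membership]) -> bool:
    # Visible studies: public ones, plus those the user is a member of.
    return study.public or any(m.study_id == study.id for m in memberships)


def can_see_examination(exam: Examination, memberships: Iterable[Membership]) -> bool:
    # Visible examinations: membership of the study is required,
    # and the examination must not be in a forbidden center.
    for m in memberships:
        if m.study_id == exam.study_id:
            return not m.allowed_center_ids or exam.center_id in m.allowed_center_ids
    return False
```

Acquisitions and datasets need no rule of their own in this sketch: an acquisition is visible when its examination is, and a dataset is visible when its acquisition is.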

Possible solutions

Direct relation between a dataset and a study (the old Shanoir way)

The old version of Shanoir had a "related dataset" functionality that was used to share datasets with studies.

----------------                              ----------------
|   Dataset    | (*)----------------------(*) |    Study     |
----------------              |               ---------------- 
                              |
                      ------------------
                      | RelatedDataset |
                      ------------------
                      | - datasetId    |
                      | - studyId      |                      
                      ------------------                      

Consequences on access conditions

  • visible studies :

    • studies I am a member of
    • public studies
  • visible examinations

    • examinations in a study I am a member of, optionally restricted to some centers
    • AND examinations linked to a dataset that is related to a study I can see (by the related_dataset table)
      • but only the exam metadata, not its acquisitions and datasets children
      • but not if the examination is in a forbidden center
        • but how can we forbid a center ? The center might not be in the study ! Add the center implicitly in the study ?
  • visible acquisitions

    • acquisitions in an examination I can see
    • AND acquisitions linked to a dataset that is related to a study I can see (by the related_dataset table)
      • but only the acquisition metadata, not its datasets children
      • but not if the acquisition's examination is in a forbidden center
        • but how can we forbid a center ? The center might not be in the study ! Add the center implicitly in the study ?
  • visible datasets

    • datasets in an acquisition I can see
    • AND datasets related to a study I can see
      • but not if the dataset's examination is in a forbidden center
        • but how can we forbid a center ? The center might not be in the study ! Add the center implicitly in the study ?
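
To give an idea of how the dataset rule would change with a related_dataset table, here is a minimal sketch. All names are hypothetical; `related` stands for the rows of the related_dataset table as (dataset id, study id) pairs, and the center restriction is simplified to a single set of forbidden centers for the user. The open question above (what to do when the center is not in the study) is not resolved here.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Dataset:
    id: int
    study_id: int    # study reached through acquisition -> examination
    center_id: int   # center of the dataset's examination


def can_see_dataset(
    dataset: Dataset,
    member_study_ids: Set[int],
    visible_study_ids: Set[int],
    related: Set[Tuple[int, int]],
    forbidden_center_ids: Set[int],
) -> bool:
    # A forbidden center hides the dataset, whichever path leads to it.
    if dataset.center_id in forbidden_center_ids:
        return False
    # Existing path: the dataset belongs to a study the user is a member of.
    if dataset.study_id in member_study_ids:
        return True
    # New path: the dataset is related to a study the user can see.
    return any((dataset.id, study_id) in related for study_id in visible_study_ids)
```

Even in this reduced form, every read service would need a second lookup path, which is where the complexity listed in the cons comes from.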

Pro / Cons

Pros +

  • no metadata duplication

Cons -

  • seems very complicated to rewrite
    • all the "get" and "getAll" services for those entities (examination, acquisition, dataset)
    • the rights check system

Fine rights tuning with multiple right tables

We could also fine tune the entity sharing with relation tables at every level (between Study and Examination/Acquisition/Dataset) :

           +--------------+
           |     User     |
           +--------------+
                 (*)
                  |      +--------------+               +--------------+
                  |------|  StudyUser   | (*)-------(*) |    Center    | 
                  |      +--------------+               +--------------+
                 (*)                                           | (1)
+------(*) +--------------+                       +------------+
| +----(*) |    Study     | (1)---+           (*) |
| | +--(*) +--------------+       |       +--------------+             +--------------+             +--------------+
| | |                             |---(*) | Examination  | (1)-----(*) | Acquisition  | (1)-----(*) |   Dataset    |
| | |      +--------------+       |       +--------------+             +--------------+             +--------------+
| | |      |   Subject    | (1)---+             (*)                          (*)                          (*)       
| | |      +--------------+                      |                            |                            |        
| | |                                            |                            |                            |        
| | |                                            |                            |                            |          
| | |                       +--------------+     |       +--------------+     |       +--------------+     |
| | |                       | RelatedExam  |-----|       |  RelatedAcq  |-----|       |  RelatedDS   |-----|    
| | |                       +--------------+     |       +--------------+     |       +--------------+     |                                        
| | |                                            |                            |                            |
| | |                                            |                            |                            |
| | +--------------------------------------------+                            |                            |
| +---------------------------------------------------------------------------+                            |
+----------------------------------------------------------------------------------------------------------+

Consequences on access conditions

  • visible studies :

    • studies I am a member of
    • public studies
  • visible examinations

    • examinations in a study I am a member of, optionally restricted to some centers
    • ? AND examinations linked to a dataset that is related to a study I can see (by the related_dataset table)
      • but only the exam metadata, not its acquisitions and datasets children
      • but not if the examination is in a forbidden center
        • but how can we forbid a center ? The center might not be in the study ! Add the center implicitly in the study ?
    • AND examinations explicitly shared to a visible study
      • with their children
      • only if I am allowed to see the exam center
        • but how can we forbid a center ? The center might not be in the study ! Add the center implicitly in the study ?
  • visible acquisitions

    • acquisitions in an examination I can see
    • ? AND acquisitions linked to a dataset that is related to a study I can see (by the related_dataset table)
      • but only the acquisition metadata, not its datasets children
      • but not if the acquisition's examination is in a forbidden center
        • but how can we forbid a center ? The center might not be in the study ! Add the center implicitly in the study ?
    • AND acquisitions that are children of an examination explicitly shared to a visible study
      • only if I am allowed to see the exam center
    • AND acquisitions explicitly shared to a visible study
      • with their children
      • only if I am allowed to see the exam center
        • but how can we forbid a center ? The center might not be in the study ! Add the center implicitly in the study ?
  • visible datasets

    • datasets in an acquisition I can see
    • AND datasets related to a study I can see
      • but not if the dataset's examination is in a forbidden center
        • but how can we forbid a center ? The center might not be in the study ! Add the center implicitly in the study ?

Pro / Cons

Pros +

  • no metadata duplication

Cons -

  • seems very, very complicated to rewrite

Data copy

By just copying the entities into another study, no change is needed to the access restriction rules. Once the data is copied, it lives its own life independently, like any other data in Shanoir.

Don't duplicate images

We would not copy the images in the PACS. The URI in dataset_expression will be copied, so the dataset copy will refer to the same PACS URI. This means that data must not be deleted from the PACS while it is still referenced in the Shanoir database (dataset_expression table).
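
The deletion constraint could look like the following sketch. It is an assumption-laden illustration: `delete_expression` is a hypothetical function, the `expressions` dictionary stands for the dataset_expression table (expression id mapped to its PACS URI), and `pacs_delete` is a placeholder for whatever actually removes the object from the PACS.

```python
from typing import Callable, Dict


def delete_expression(
    expr_id: int,
    expressions: Dict[int, str],
    pacs_delete: Callable[[str], None],
) -> bool:
    """Remove one dataset_expression row.

    The PACS object is deleted only if no other row still refers to the
    same URI. Returns True when the PACS object was actually deleted.
    """
    uri = expressions.pop(expr_id)
    if uri in expressions.values():
        # A copy (or the original) still points at this URI: keep the images.
        return False
    pacs_delete(uri)
    return True
```

With an original and one copy sharing a URI, deleting either one leaves the images in place; only deleting the last remaining reference triggers the PACS deletion.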

Copy requires the same center

When copying data into another study, the destination study must take place in the center the data comes from. So the center of the copied data should be added to the destination study.

Pro / Cons

Pros +

  • much simpler
    • no major refactoring of the data model
    • no problems with the access rights control

Cons -

  • duplication of metadata in the database
  • harder to cascade changes from the originals to the copies
  • is it a problem regarding ownership if the copy lives its own life independently ?

Pro or Cons ?

  • total control over duplicated data from the second study

Technical thoughts

Copy of selected datasets

  • automatic creation of new subject-studies (the subject is not copied but its metadata is now accessible via the new study)
  • automatic copy of needed examinations
    • along with their additional files ?
  • automatic copy of needed acquisitions
  • copy of the datasets
  • pacs files are not copied (dataset expressions are copied)
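
The steps above can be sketched as follows. This is a minimal in-memory illustration, not the Shanoir code: `Db`, `copy_datasets` and the entity classes are hypothetical, dictionaries stand in for database tables, and additional examination files are left out since that point is still open. The sketch also adds the center of each copied examination to the destination study, as required in the "Copy requires the same center" section.

```python
from dataclasses import dataclass, field, replace
from itertools import count


@dataclass
class Examination:
    id: int
    study_id: int
    subject_id: int
    center_id: int


@dataclass
class Acquisition:
    id: int
    examination_id: int


@dataclass
class Dataset:
    id: int
    acquisition_id: int
    pacs_uri: str  # stands for the dataset expression pointing at the PACS


@dataclass
class Db:
    exams: dict = field(default_factory=dict)
    acqs: dict = field(default_factory=dict)
    datasets: dict = field(default_factory=dict)
    subject_studies: set = field(default_factory=set)  # (subject_id, study_id)
    study_centers: set = field(default_factory=set)    # (study_id, center_id)
    ids: count = field(default_factory=lambda: count(1000))


def copy_datasets(db: Db, dataset_ids, target_study_id: int):
    exam_map, acq_map, new_ids = {}, {}, []
    for ds_id in dataset_ids:
        ds = db.datasets[ds_id]
        acq = db.acqs[ds.acquisition_id]
        exam = db.exams[acq.examination_id]
        if exam.id not in exam_map:
            # Copy each needed examination once, attached to the target study.
            exam_copy = replace(exam, id=next(db.ids), study_id=target_study_id)
            db.exams[exam_copy.id] = exam_copy
            exam_map[exam.id] = exam_copy.id
            # The subject itself is not copied, only linked to the new study.
            db.subject_studies.add((exam.subject_id, target_study_id))
            db.study_centers.add((target_study_id, exam.center_id))
        if acq.id not in acq_map:
            acq_copy = replace(acq, id=next(db.ids), examination_id=exam_map[exam.id])
            db.acqs[acq_copy.id] = acq_copy
            acq_map[acq.id] = acq_copy.id
        # The dataset copy keeps the same PACS URI: no image is duplicated.
        ds_copy = replace(ds, id=next(db.ids), acquisition_id=acq_map[acq.id])
        db.datasets[ds_copy.id] = ds_copy
        new_ids.append(ds_copy.id)
    return new_ids
```

Two datasets selected from the same acquisition produce one examination copy, one acquisition copy and two dataset copies, which matches the "needed examinations / needed acquisitions" wording above.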

References to copied entities

  • copied datasets (and, less importantly, examinations and acquisitions) should have a reference to the entity they were copied from (their "parent")
    • ex : dataset will now have a property/column "copiedFrom" of type Long
  • when deleting a "parent" entity
    • its children become orphans (copiedFrom = null)
    • OR if a grand-parent exists, it becomes the direct parent instead of the deleted one ?
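
The two deletion options can be compared with a short sketch. The function name and the `copied_from` dictionary (entity id mapped to the id it was copied from, or None for an original) are hypothetical and only stand for the "copiedFrom" column.

```python
from typing import Dict, Optional


def delete_with_reparenting(
    copied_from: Dict[int, Optional[int]],
    deleted_id: int,
    reparent: bool,
) -> None:
    """Delete an entity and update the copiedFrom link of its copies.

    reparent=False: the copies become orphans (copiedFrom = None).
    reparent=True:  the copies are attached to the grand-parent, if any.
    """
    grand_parent = copied_from.pop(deleted_id)
    for child_id, parent_id in list(copied_from.items()):
        if parent_id == deleted_id:
            copied_from[child_id] = grand_parent if reparent else None
```

For a chain 1 <- 2 <- 3, deleting entity 2 leaves entity 3 either with no parent or pointing at entity 1, depending on the option chosen.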

Rights needed to copy

  • the user has to have rights in both studies, origin and destination
    • origin study, right to have :
      • CAN_ADMINISTRATE
      • OR a new specific right : CAN_COPY ?
        • false by default for any new study-user and for the existing ones
    • destination study, right to have :
      • CAN_IMPORT ?
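
As a sketch, the check could be written as below. `can_copy` and the `rights_by_study` mapping (study id mapped to the set of right names the user holds in it) are hypothetical, and CAN_COPY and CAN_IMPORT are only the candidate rights proposed above, not a decision.

```python
from typing import Dict, Set


def can_copy(
    rights_by_study: Dict[int, Set[str]],
    origin_study_id: int,
    destination_study_id: int,
) -> bool:
    origin = rights_by_study.get(origin_study_id, set())
    destination = rights_by_study.get(destination_study_id, set())
    # Origin: CAN_ADMINISTRATE, or the proposed dedicated CAN_COPY right.
    allowed_on_origin = bool(origin & {"CAN_ADMINISTRATE", "CAN_COPY"})
    # Destination: the proposed requirement is CAN_IMPORT.
    return allowed_on_origin and "CAN_IMPORT" in destination
```

A user with no entry for one of the two studies is refused, since the missing study yields an empty set of rights.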

Migration of the old related_dataset reference

When the database was migrated from the old to the current version, the related_dataset table remained. Its content still has to be migrated:

  • by hand !
  • then delete the related_dataset table