Data sharing and Policy matching - myantandco/RA-BitnobiPilotJuly2020 GitHub Wiki
A data sharing operation is created by attaching one or more Policies to a Workflow.
**Policy** - sets "who" can access specific resources. **Workflow** - sets "what" specific data is to be shared.
The workflow specifies which datasource, columns and rows can be shared. A workflow can also be used to hide sensitive data and create a unique identifier if needed.
The sections below provide more details on how the sharing works.
Note that workflows can also be used by the Data Consumer (e.g. researcher) to do additional filtering or do basic analysis with the shared data.
- when you create a new workflow and run it, initially no one will have access to its result set except for the owner.
- to share a workflow result set with other users, you must attach a policy to it. Any users matching the policy will be able to use it as a data source in their workflows or reports.
- for a policy to match a user, the user must have attribute/value pairs that exactly match every attribute/value in the policy. For example, if the policy only has one attribute (e.g. `organization = Bitnobi`) then this will match all users that have an attribute `organization = Bitnobi`. If a policy has two attributes (e.g. `organization = Bitnobi`, `department = IT`) then all users that belong to the Bitnobi organization and are in the IT department will match.
- each policy must have at least 1 attribute.
- policies created by one user are visible only to the owner. For example, every user that wishes to share workflow results with all users of the Bitnobi organization must create their own policy with `organization = Bitnobi`.
- the admin user can create a policy that can be "globally shared" with all Bitnobi users. This is controlled by a checkbox on the Policy editor. This allows the admin to "pre-populate" a set of policies that normal Bitnobi users can then use for Workflow access control.
- By default, other users cannot download or transfer data that you have shared with them out of Bitnobi. Data Transfer access control allows you to selectively enable specific Data Transfer methods if necessary.
- A Bitnobi user must have Data Transfer permission to use Data Transfer to download as .csv or upload to JupyterHub. The admin user controls this through policies and policy resources. If the Policy Resource for Data Transfer is disabled then the users that match the policy will not see the Data Transfer page in the Bitnobi UI.
- If you create a workflow that uses only datasources created by you, then all Data Transfer types are allowed for you. The Data Transfer Type settings for a workflow do not apply to the owner.
- If you share a workflow with another user, the Data Transfer Type settings on that workflow will control which Data Transfer operations are allowed for other users. For example, if I restrict Data Transfer to `Jupyter` only, then other users will not be able to download my workflow results as a `.csv` file, nor download the results of any derived workflow.
- If you create a workflow using multiple shared datasources, then the most restrictive Data Transfer settings of any datasource will apply to your workflow.
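
The "most restrictive settings apply" rule in the last bullet can be sketched as a set intersection. This is only an illustration, not Bitnobi's implementation: the function name, the `csv` and `jupyter` labels, and the reading of "most restrictive" as an intersection of allowed types are all assumptions.

```python
# Hypothetical labels for the two Data Transfer types mentioned above.
ALL_TRANSFER_TYPES = frozenset({"csv", "jupyter"})


def effective_transfer_types(shared_datasource_settings):
    """Return the transfer types allowed for a workflow built on shared datasources.

    shared_datasource_settings: one set per shared datasource, listing the
    transfer types its owner allows. Datasources you own impose no limit,
    so they are simply left out of this list.
    """
    allowed = set(ALL_TRANSFER_TYPES)
    for setting in shared_datasource_settings:
        # Assumption: "most restrictive" means a type must be allowed
        # by every shared datasource the workflow uses.
        allowed &= set(setting)
    return allowed


# One datasource restricted to Jupyter only, another allowing both types.
print(effective_transfer_types([{"jupyter"}, {"csv", "jupyter"}]))  # {'jupyter'}
```

With an empty list (a workflow that uses only your own datasources) the sketch returns every transfer type, which mirrors the rule that the settings do not restrict the owner.
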
First, let us define some users with the following attributes:
User | Attributes | | | |
---|---|---|---|---|
bitnobi_user_1 | organization=Bitnobi | projectA=true | projectC=true | |
bitnobi_user_2 | organization=Bitnobi | projectA=true | projectB=true | projectC=true |
external_user_3 | organization=External | projectB=true | | |
empty_user_4 | | | | |
data_owner | organization=Bitnobi | | | |
Next we define some policies with attributes as below:
Policy | Attribute 1 | Attribute 2 | Attribute 3 | Attribute 4 | Users match count |
---|---|---|---|---|---|
Bitnobi | organization=Bitnobi | | | | 3 |
projectA | projectA=true | | | | 2 |
projectB | projectB=true | | | | 2 |
External | organization=External | | | | 1 |
BitnobiABC | organization=Bitnobi | projectA=true | projectB=true | projectC=true | 1 |
empty | | | | | 0 |
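
The match counts in the policy table can be reproduced with a small sketch of the matching rule. The dictionaries and the `policy_matches` helper are hypothetical names chosen for illustration, and treating the `empty` policy as matching no one is an assumption taken from the table rather than from any documented behaviour.

```python
# Example users and policies copied from the tables above.
users = {
    "bitnobi_user_1": {"organization": "Bitnobi", "projectA": "true", "projectC": "true"},
    "bitnobi_user_2": {"organization": "Bitnobi", "projectA": "true",
                       "projectB": "true", "projectC": "true"},
    "external_user_3": {"organization": "External", "projectB": "true"},
    "empty_user_4": {},
    "data_owner": {"organization": "Bitnobi"},
}

policies = {
    "Bitnobi": {"organization": "Bitnobi"},
    "projectA": {"projectA": "true"},
    "projectB": {"projectB": "true"},
    "External": {"organization": "External"},
    "BitnobiABC": {"organization": "Bitnobi", "projectA": "true",
                   "projectB": "true", "projectC": "true"},
    "empty": {},
}


def policy_matches(policy_attrs, user_attrs):
    """True if the user has every attribute/value pair in the policy (AND)."""
    if not policy_attrs:
        # Assumption: a policy with no attributes matches nobody, as in the table.
        return False
    return all(user_attrs.get(key) == value for key, value in policy_attrs.items())


match_counts = {
    name: sum(policy_matches(attrs, user) for user in users.values())
    for name, attrs in policies.items()
}
print(match_counts)
```

Extra attributes on a user do not prevent a match: `bitnobi_user_2` matches `projectA` even though that user also has `projectB` and `projectC`.
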
Now let us create some workflows, attach policies and see which users can access their result sets.
Note that attaching 2 or more policies to a workflow creates a logical OR condition for granting access. For example, for `workflow5` below, any user that belongs to `projectA` or `projectB` can access the result set.
In contrast, when multiple attributes are set in a policy, this creates an AND condition for granting access. For example, the policy `BitnobiABC` requires that a user be part of the Bitnobi organization and be a member of `projectA` and `projectB` and `projectC`. Thus for `workflow7`, aside from the `data_owner`, only `bitnobi_user_2` can access its result set.
Workflow | Access control | Users able to access |
---|---|---|
workflow1 | | data_owner |
workflow2 | Bitnobi | data_owner, bitnobi_user_1, bitnobi_user_2 |
workflow3 | projectA | data_owner, bitnobi_user_1, bitnobi_user_2 |
workflow4 | projectB | data_owner, bitnobi_user_2, external_user_3 |
workflow5 | projectA, projectB | data_owner, bitnobi_user_1, bitnobi_user_2, external_user_3 |
workflow6 | External | data_owner, external_user_3 |
workflow7 | BitnobiABC | data_owner, bitnobi_user_2 |
workflow8 | empty | data_owner |
Workflows that each user should have access to as datasources (for workflows, reports and data transfers):
User | w1 | w2 | w3 | w4 | w5 | w6 | w7 | w8 |
---|---|---|---|---|---|---|---|---|
bitnobi_user_1 | | ✔️ | ✔️ | | ✔️ | | | |
bitnobi_user_2 | | ✔️ | ✔️ | ✔️ | ✔️ | | ✔️ | |
external_user_3 | | | | ✔️ | ✔️ | ✔️ | | |
empty_user_4 | | | | | | | | |
data_owner | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
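
The access matrix above follows from two rules: attributes within a policy are ANDed, and policies attached to a workflow are ORed. The sketch below recomputes the matrix from the example data. All names (`USERS`, `POLICIES`, `WORKFLOWS`, `can_access`) are hypothetical, and the sketch assumes the owner always has access and that an empty policy matches no one, as the tables indicate.

```python
USERS = {
    "bitnobi_user_1": {"organization": "Bitnobi", "projectA": "true", "projectC": "true"},
    "bitnobi_user_2": {"organization": "Bitnobi", "projectA": "true",
                       "projectB": "true", "projectC": "true"},
    "external_user_3": {"organization": "External", "projectB": "true"},
    "empty_user_4": {},
    "data_owner": {"organization": "Bitnobi"},
}

POLICIES = {
    "Bitnobi": {"organization": "Bitnobi"},
    "projectA": {"projectA": "true"},
    "projectB": {"projectB": "true"},
    "External": {"organization": "External"},
    "BitnobiABC": {"organization": "Bitnobi", "projectA": "true",
                   "projectB": "true", "projectC": "true"},
    "empty": {},
}

# Policies attached to each workflow, from the "Access control" column.
WORKFLOWS = {
    "workflow1": [],
    "workflow2": ["Bitnobi"],
    "workflow3": ["projectA"],
    "workflow4": ["projectB"],
    "workflow5": ["projectA", "projectB"],
    "workflow6": ["External"],
    "workflow7": ["BitnobiABC"],
    "workflow8": ["empty"],
}

OWNER = "data_owner"  # owner of every workflow in this example


def policy_matches(policy_attrs, user_attrs):
    # AND across attributes; an empty policy matches no one (assumption).
    return bool(policy_attrs) and all(
        user_attrs.get(key) == value for key, value in policy_attrs.items()
    )


def can_access(user_name, workflow_name):
    if user_name == OWNER:
        return True  # the owner can always use their own result set
    # OR across the policies attached to the workflow.
    return any(
        policy_matches(POLICIES[policy], USERS[user_name])
        for policy in WORKFLOWS[workflow_name]
    )


accessible = {
    user: [wf for wf in WORKFLOWS if can_access(user, wf)] for user in USERS
}
for user, workflow_names in accessible.items():
    print(user, workflow_names)
```
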