Keep Patients Module

Synthea supports the ability to filter the patient records that are exported, using a module built with the Generic Module Framework. The basic process is as follows:

  1. Synthea is run with its usual arguments plus a "keep patients" module specified with the -k flag, e.g.: run_synthea -p 100 -k sample_module.json Utah Sandy

  2. For the given number of patients, Synthea will:
    i. Select random demographics for a new patient
    ii. Select a new random seed
    iii. Generate a candidate patient using the given demographics and seed
    iv. Run the candidate against the "keep patients" module. If the final state is called "Keep", the candidate patient is kept and exported, and this "slot" is filled. If the final state is anything else, the candidate is discarded and the process loops back to step 2.ii.
    v. If Synthea makes more than 1000 attempts without generating a matching patient for the given demographics, it aborts and leaves that slot empty. The final population is then one patient smaller. (For instance, if 100 patients were requested, only 99 will be exported.)
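The selection loop described above can be sketched in Python. This is only an illustration of the control flow, not Synthea's actual Java implementation: `generate_population`, `demo_demographics`, `demo_patient`, and `demo_final_state` are hypothetical names, and the toy patient generator simply assumes that only adults can have diabetes.

```python
import random

MAX_ATTEMPTS = 1000  # retry limit described in step v


def generate_population(requested, make_demographics, generate_patient,
                        final_state, rng=None):
    """Fill up to `requested` slots, keeping only candidates that end in "Keep"."""
    if rng is None:
        rng = random.Random()
    kept = []
    for _ in range(requested):
        demographics = make_demographics(rng)                  # step i
        for _attempt in range(MAX_ATTEMPTS):
            seed = rng.getrandbits(64)                         # step ii
            candidate = generate_patient(demographics, seed)   # step iii
            if final_state(candidate) == "Keep":               # step iv
                kept.append(candidate)
                break
        # step v: if no attempt succeeded, this slot stays empty
    return kept


# Hypothetical stand-ins for Synthea's generator and the keep module.
def demo_demographics(rng):
    return {"age": rng.randint(1, 100)}


def demo_patient(demographics, seed):
    # Toy rule: only adults can have diabetes, decided by the seed.
    adult = demographics["age"] >= 18
    has_diabetes = adult and random.Random(seed).random() < 0.1
    return {"age": demographics["age"], "diabetes": has_diabetes}


def demo_final_state(patient):
    return "Keep" if patient["diabetes"] else "Terminal"


if __name__ == "__main__":
    population = generate_population(
        100, demo_demographics, demo_patient, demo_final_state, random.Random(42)
    )
    print(f"requested 100, exported {len(population)}")
```

Note that the demographics are chosen once per slot and only the seed changes between attempts, so a slot whose demographics can never satisfy the keep module uses up all of its attempts and is left empty.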

[!IMPORTANT] It is very easy to build a "keep module" that only keeps impossible patients. For instance, no patient under the age of 18 can get diabetes in Synthea. Trying to generate a population with ages 1-100 where every patient has diabetes will never finish, because whenever Synthea tries to generate a patient under 18 it can never succeed, and it keeps looping as it retries with a different seed. To successfully generate a dataset of "only diabetes", make sure the minimum age is at least 18. Use at your own risk. Make sure the demographics settings that you are using align with the patient characteristics you are expecting.

Note also that Synthea uses a reduced version of the module framework to run the "keep module". The "keep module" should primarily include Initial, Terminal, and Simple states, with transitions using whatever logic is needed to select patients. Use of other state types may produce unexpected results.
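As a rough pre-flight check, a keep module can be inspected before a long run. The following is a minimal sketch, not part of Synthea: `check_keep_module` is a hypothetical helper that only looks at state names and types, based on the guidance above.

```python
import json

# State types recommended above for keep modules.
ALLOWED_TYPES = {"Initial", "Terminal", "Simple"}


def check_keep_module(module):
    """Return a list of warnings for a parsed keep-module dict."""
    warnings = []
    states = module.get("states", {})
    if "Keep" not in states:
        warnings.append('no state named "Keep": no patient can ever be kept')
    for name, state in states.items():
        state_type = state.get("type")
        if state_type not in ALLOWED_TYPES:
            warnings.append(
                f'state "{name}" has type "{state_type}", '
                "which may produce unexpected results"
            )
    return warnings


if __name__ == "__main__":
    # Tiny inline module for illustration; a real one would be read from a file.
    text = '{"states": {"Initial": {"type": "Initial"}, "Wait": {"type": "Delay"}}}'
    for warning in check_keep_module(json.loads(text)):
        print("WARNING:", warning)
```

An empty list means the module passes these two simple checks. It says nothing about whether the selection criteria are achievable for the chosen demographics.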

Sample Keep Patient Modules

1. Keep patients with attribute 'diabetes' == true

{
  "name": "keep",
  "remarks": [
    "Keep patients whose 'diabetes' attribute is true"
  ],
  "states": {
    "Initial": {
      "type": "Initial",
      "conditional_transition": [
        {
          "transition": "Keep",
          "condition": {
            "condition_type": "Attribute",
            "attribute": "diabetes",
            "operator": "==",
            "value": true
          }
        },
        {
          "transition": "Terminal"
        }
      ]
    },
    "Terminal": {
      "type": "Terminal"
    },
    "Keep": {
      "type": "Terminal"
    }
  },
  "gmf_version": 2
}

2. Keep patients with an active condition of either diabetes or hypertension

{
  "name": "keep",
  "remarks": [
    "Keep patients with an active condition of diabetes or hypertension"
  ],
  "states": {
    "Initial": {
      "type": "Initial",
      "conditional_transition": [
        {
          "transition": "Keep",
          "condition": {
            "condition_type": "Or",
            "conditions": [
              {
                "condition_type": "Active Condition",
                "codes": [
                  {
                    "system": "SNOMED-CT",
                    "code": "44054006",
                    "display": "Diabetes"
                  }
                ]
              },
              {
                "condition_type": "Active Condition",
                "codes": [
                  {
                    "system": "SNOMED-CT",
                    "code": "59621000",
                    "display": "Hypertension"
                  }
                ]
              }
            ]
          }
        },
        {
          "transition": "Terminal"
        }
      ]
    },
    "Terminal": {
      "type": "Terminal"
    },
    "Keep": {
      "type": "Terminal"
    }
  },
  "gmf_version": 2
}
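To see how modules like the two samples above arrive at a final state, here is a simplified Python sketch. It is not Synthea's engine. It assumes a hypothetical patient representation (a dict with an "attributes" dict and an "active_conditions" set of code strings), assumes processing starts at the state named "Initial", and handles only conditional_transition with the Attribute (== only), Or, and Active Condition condition types used in the samples.

```python
def condition_holds(condition, patient):
    """Evaluate one condition against the hypothetical patient dict."""
    kind = condition["condition_type"]
    if kind == "Attribute":
        if condition["operator"] != "==":
            raise NotImplementedError(condition["operator"])
        return patient["attributes"].get(condition["attribute"]) == condition["value"]
    if kind == "Or":
        return any(condition_holds(c, patient) for c in condition["conditions"])
    if kind == "Active Condition":
        # Compare codes as strings so numeric and quoted codes both match.
        wanted = {str(c["code"]) for c in condition["codes"]}
        return bool(wanted & patient["active_conditions"])
    raise NotImplementedError(kind)


def final_state(module, patient):
    """Follow conditional transitions from "Initial" to a Terminal state."""
    states = module["states"]
    name = "Initial"
    while states[name]["type"] != "Terminal":
        for option in states[name]["conditional_transition"]:
            condition = option.get("condition")
            # An option without a condition acts as the fallback.
            if condition is None or condition_holds(condition, patient):
                name = option["transition"]
                break
        else:
            raise ValueError(f"no transition matched in state {name}")
    return name


# Compact copy of sample 1 (attribute 'diabetes' == true).
ATTRIBUTE_MODULE = {
    "states": {
        "Initial": {
            "type": "Initial",
            "conditional_transition": [
                {
                    "transition": "Keep",
                    "condition": {
                        "condition_type": "Attribute",
                        "attribute": "diabetes",
                        "operator": "==",
                        "value": True,
                    },
                },
                {"transition": "Terminal"},
            ],
        },
        "Terminal": {"type": "Terminal"},
        "Keep": {"type": "Terminal"},
    }
}

if __name__ == "__main__":
    diabetic = {"attributes": {"diabetes": True}, "active_conditions": set()}
    healthy = {"attributes": {}, "active_conditions": set()}
    print(final_state(ATTRIBUTE_MODULE, diabetic))
    print(final_state(ATTRIBUTE_MODULE, healthy))
```

The conditional transitions are tried in order, so the final entry without a condition catches every patient that did not match and routes them to "Terminal", where they are discarded.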