Name Data - synthetichealth/synthea GitHub Wiki
By default, the project contains English-style and Spanish-style names suitable for the United States in the src/main/resources/names.yml file.
You can change those names, or add names for other languages. A person's name is determined by their primary language, which in turn is determined by their ethnicity. To add other languages, or to change how languages are assigned, you'll need to modify the src/main/java/org/mitre/synthea/world/geographic/Demographics.java file, specifically these two functions:
public String ethnicityFromRace(String race, Person person) { ... }
public String languageFromEthnicity(String ethnicity, Person person) { ... }
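As a rough illustration of the kind of change involved, the sketch below maps an ethnicity to a language with a simple lookup table. It is not the Synthea implementation: the class name, the ethnicity keys, the english fallback, and the simplified signature (no Person argument) are all assumptions made for the example.

```java
import java.util.Map;

// Hypothetical stand-in for the logic in Demographics.java; not Synthea code.
class LanguageSketch {
  // Assumed ethnicity-to-language table, for illustration only.
  private static final Map<String, String> LANGUAGE_BY_ETHNICITY = Map.of(
      "german", "german",
      "french", "french",
      "puerto_rican", "spanish",
      "mexican", "spanish");

  // Simplified signature: the real method also takes a Person.
  // Ethnicities without an entry fall back to english (an assumption).
  static String languageFromEthnicity(String ethnicity) {
    return LANGUAGE_BY_ETHNICITY.getOrDefault(ethnicity, "english");
  }

  public static void main(String[] args) {
    System.out.println(languageFromEthnicity("mexican")); // spanish
    System.out.println(languageFromEthnicity("unknown")); // english
  }
}
```

Adding a language in this sketch means adding one table entry, plus a matching block of names in the names file.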
Names File Format
language:
  M: [MaleName1,MaleName2,...,MaleNameN]
  F: [FemaleName1,FemaleName2,...,FemaleNameN]
  family: [Surname1,...,SurnameN]
street:
  type: [Street,Straat,Rue]
  secondary: [Apt]
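To show how the format above might be consumed, here is a minimal sketch that models a parsed names file as nested maps (language, then M, F, or family) and picks a full name from it. The class, the sample names, and the fallback to english for a language that has no entry are assumptions for illustration, not Synthea's actual loader.

```java
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical in-memory model of a names file; not Synthea code.
class NamesLookupSketch {
  // language -> category (M, F, family) -> candidate names (sample data).
  static final Map<String, Map<String, List<String>>> NAMES = Map.of(
      "english", Map.of(
          "M", List.of("John", "Robert"),
          "F", List.of("Mary", "Linda"),
          "family", List.of("Smith", "Jones")),
      "spanish", Map.of(
          "M", List.of("Jose", "Luis"),
          "F", List.of("Maria", "Carmen"),
          "family", List.of("Garcia", "Lopez")));

  // Picks one name for the language and category.
  // Languages with no entry fall back to english (an assumption).
  static String pick(String language, String category, Random random) {
    Map<String, List<String>> byCategory =
        NAMES.getOrDefault(language, NAMES.get("english"));
    List<String> choices = byCategory.get(category);
    return choices.get(random.nextInt(choices.size()));
  }

  // Combines a given name for the gender key (M or F) with a family name.
  static String fullName(String language, String gender, Random random) {
    return pick(language, gender, random) + " " + pick(language, "family", random);
  }

  public static void main(String[] args) {
    Random random = new Random(42L);
    System.out.println(fullName("spanish", "F", random));
    System.out.println(fullName("polish", "M", random)); // no entry, uses english
  }
}
```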
Languages currently supported (but not necessarily defined):
english, spanish, french, french_creole, german, russian, portuguese, polish, greek, chinese, hindi, arabic
Numbers in the names?
The synthetic patients have numbers appended to their names so it is painfully obvious that they are fake and no one could mistake them for real patients. If you want to turn this capability off, set the following property to false in src/main/resources/synthea.properties:
# If true, person names have numbers appended to them to make them more obviously fake
generate.append_numbers_to_person_names = true
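The effect of the flag can be sketched as follows. This is a standalone illustration, not Synthea's code: the class and method names are hypothetical, and both the default of true when the property is missing and the 0 to 999 number range are assumptions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.Random;

// Hypothetical sketch of reading the flag and decorating a name.
class NameNumberSketch {
  static final String FLAG = "generate.append_numbers_to_person_names";

  // Parses properties-format text and returns the flag value.
  // A missing property is treated as true (an assumption).
  static boolean appendNumbers(String propertiesText) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(propertiesText));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return Boolean.parseBoolean(props.getProperty(FLAG, "true").trim());
  }

  // Appends a number in the assumed range 0..999 when the flag is on.
  static String decorate(String name, boolean append, Random random) {
    return append ? name + random.nextInt(1000) : name;
  }

  public static void main(String[] args) {
    boolean append = appendNumbers(FLAG + " = false");
    System.out.println(decorate("Mary", append, new Random())); // Mary
  }
}
```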