Contributing - osmlab/name-suggestion-index GitHub Wiki
The data/* folder contains a lot of files - one file per category.
Category files are organized in a tree/key/value path. Each category file contains all the items that share an OpenStreetMap key/value tag.
- tree - The highest level of organization - each tree contains categories that follow a similar approach to naming and linking to Wikidata.
- key - An OpenStreetMap tag key (e.g. "amenity")
- value - An OpenStreetMap tag value (e.g. "fast_food")
The name-suggestion-index currently supports these trees:
- brands - Branded businesses like restaurants, banks, fuel stations, shops, identified by brand/brand:wikidata tags
- operators - Organizations like post offices, police departments, hospitals, identified by operator/operator:wikidata tags
- flags - Flagpoles hoisting common kinds of flags (national, regional, religious, advertising), identified by the flag:wikidata tag
- transit - Transit networks (bus, rail, ferry, etc.) and related infrastructure, identified by network/network:wikidata tags
For example:
- data/
  - brands/amenity/fast_food.json
  - brands/shop/supermarket.json
  - operators/amenity/post_office.json
  - flags/man_made/flagpole.json
  - transit/route/bus.json
  - and so on…
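The tree/key/value layout can be read straight off a file path. The following is a minimal sketch in Python, not part of the project's tooling: the helper name `parse_category_path` is hypothetical, and the set of trees is copied from the list above rather than read from the repository.

```python
from pathlib import PurePosixPath

# Trees named in the text above (an assumption: copied by hand, not loaded from the project).
KNOWN_TREES = {"brands", "operators", "flags", "transit"}


def parse_category_path(path: str) -> dict:
    """Split a path like 'data/brands/amenity/fast_food.json' into tree, key and value."""
    parts = PurePosixPath(path).with_suffix("").parts
    # Accept paths with or without the leading "data/" folder.
    if parts and parts[0] == "data":
        parts = parts[1:]
    if len(parts) != 3:
        raise ValueError(f"expected tree/key/value, got {path!r}")
    tree, key, value = parts
    if tree not in KNOWN_TREES:
        raise ValueError(f"unknown tree {tree!r}")
    return {"tree": tree, "key": key, "value": value}


print(parse_category_path("data/brands/amenity/fast_food.json"))
```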
Each category file contains:
- properties - Object containing category-wide properties
- items - Array containing the items in the category
For example, brands/amenity/fast_food.json (comments added for clarity):
"properties": { // CATEGORY PROPERTIES:
"path": "brands/amenity/fast_food" // "path" - the tree/key/value path for this category
…
},
"items": [ // An array of items belonging to this category
…
{ // ITEM PROPERTIES:
"displayName": "McDonald's", // "displayName" - Name to display in summary screens and lists
"id": "mcdonalds-658eea", // "id" - a unique identifier added and generated automatically
"locationSet": {"include": ["001"]}, // "locationSet" - defines where this brand is valid ("001" = worldwide)
"tags": { // "tags" - OpenStreetMap tags that every McDonald's should have
"amenity": "fast_food", // The OpenStreetMap tag for a "fast food" restaurant
"brand": "McDonald's", // `brand` - Brand name in the local language (English)
"brand:wikidata": "Q38076", // `brand:wikidata` - Universal Wikidata identifier
"cuisine": "burger", // `cuisine` - What kind of fast food is served here
"name": "McDonald's" // `name` - Display name, also in the local language (English)
}
},
…
There may also be items for McDonald's in other languages! For example, this is how McDonald's should be mapped in Japan:
…
{ // ITEM PROPERTIES:
"displayName": "マクドナルド", // "displayName" - Name to display in summary screens and lists
"id": "マクドナルド-3e7699", // "id" - a unique identifier added and generated automatically
"locationSet": { "include": ["jp"] }, // "locationSet" - defines where this brand is valid ("jp" = Japan)
"tags": {
"amenity": "fast_food",
"brand": "マクドナルド", // `brand` - Brand name in the local language (Japanese)
"brand:en": "McDonald's", // `brand:en` - For non-English brands, tag the English version too
"brand:ja": "マクドナルド", // `brand:ja` - Add at least one `brand:xx` tag that matches `brand`
"brand:wikidata": "Q38076", // `brand:wikidata` - Same universal Wikidata identifier
"cuisine": "burger",
"name": "マクドナルド", // `name` - Display name, also in the local language (Japanese)
"name:en": "McDonald's", // `name:en` - For non-English names, tag the English version too
"name:ja": "マクドナルド" // `name:ja` - Add at least one `name:xx` tag that matches `name`
}
},
…
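The comments in the Japanese example ask for at least one `brand:xx` tag that matches `brand`, and likewise for `name`. A rough way to check that convention is sketched below in Python. The function name `has_matching_localized_tag` is hypothetical, and this is not the project's own validation code.

```python
def has_matching_localized_tag(tags: dict, key: str) -> bool:
    """Return True if some `key:xx` tag repeats the value of `key` (e.g. brand:ja == brand)."""
    value = tags.get(key)
    if value is None:
        return True  # the base tag is absent, so there is nothing to check
    prefix = key + ":"
    return any(k.startswith(prefix) and v == value for k, v in tags.items())


# Tags copied from the Japanese McDonald's example above.
ja_tags = {
    "amenity": "fast_food",
    "brand": "マクドナルド",
    "brand:en": "McDonald's",
    "brand:ja": "マクドナルド",
    "brand:wikidata": "Q38076",
    "cuisine": "burger",
    "name": "マクドナルド",
    "name:en": "McDonald's",
    "name:ja": "マクドナルド",
}

for key in ("brand", "name"):
    print(key, has_matching_localized_tag(ja_tags, key))
```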
The displayName can contain anything, but it should be a short text appropriate for display in lists or as preset names in editor software. This is different from the OpenStreetMap name tag.
By convention, if you need to disambiguate between multiple brands with the same name, add text in parentheses. Here there are two items named "Target", but they have been assigned different display names to tell them apart.
In brands/shop/department_store.json:
"items": [
…
{
"displayName": "Target (Australia)",
"id": "target-c93bbd",
"locationSet": {"include": ["au"]},
"tags": {
"brand": "Target",
"brand:wikidata": "Q7685854",
"name": "Target",
"shop": "department_store"
}
},
{
"displayName": "Target (USA)",
"id": "target-592fe0",
"locationSet": {"include": ["us"]},
"tags": {
"brand": "Target",
"brand:wikidata": "Q1046951",
"name": "Target",
"shop": "department_store"
}
},
…
Each item has a unique id generated for it.
When adding new data, you don't need to add an id. Running npm run build will generate the id automatically.
If you don't know how to run npm run build, don't worry: a maintainer will be able to run it for you.
The identifiers are stable unless the name, key, value, or locationSet changes.
Each item requires a locationSet to define where the item is available. You can define the locationSet as an Object with include and exclude properties:
"locationSet": {
"include": [ Array of locations ],
"exclude": [ Array of locations ]
}
The "locations" can be any of the following:
- Strings recognized by the country-coder library. These include ISO 3166-1 2 or 3 letter country codes, UN M.49 numeric codes, and supported Wikidata QIDs.
  Examples: "de", "001", "conus", "gb-sct", "Q620634"
  👉 A current list of supported codes can be found at https://ideditor.codes
  👉 See below for US States and international regions.
- Filenames for custom .geojson features. If you want to use your own features, you need to add them under the features/* folder of this project (see Feature Files for details). Each Feature must have an id that ends in .geojson.
  Examples: "de-hh.geojson", "us-nj.geojson"
- Circular areas defined as a [longitude, latitude, radius?] Array. Radius is specified in kilometers and is optional. If not specified, it will default to a 25km radius.
  Examples: [8.67039, 49.41882], [-88.3726, 39.4818, 32]
locationSet Tips:
- The M49 code for the whole world is "001"
- A current list of supported codes can be found at https://ideditor.codes
- You can view examples and learn more about working with locationSets in the @ideditor/location-conflation project.
- You can test locationSets on this interactive map: https://location-conflation.com/
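To make the three kinds of location concrete, here is a small Python sketch that sorts a location into one of them and fills in the default radius. The function name `classify_location` is hypothetical. The sketch only looks at the shape of the value, so it does not confirm that a string is actually recognized by country-coder or that a .geojson file exists.

```python
DEFAULT_RADIUS_KM = 25  # default radius stated in the text above


def classify_location(location):
    """Return ('point', lon, lat, radius_km), ('geojson', filename) or ('code', string)."""
    if isinstance(location, (list, tuple)):
        if len(location) not in (2, 3):
            raise ValueError("expected [longitude, latitude, radius?]")
        lon, lat = location[0], location[1]
        radius = location[2] if len(location) == 3 else DEFAULT_RADIUS_KM
        return ("point", lon, lat, radius)
    if isinstance(location, str):
        if location.lower().endswith(".geojson"):
            return ("geojson", location)
        # Anything else is treated as a country-coder string (not verified here).
        return ("code", location)
    raise TypeError(f"unsupported location: {location!r}")


location_set = {"include": ["de", "us-nj.geojson", [8.67039, 49.41882]]}
for loc in location_set["include"]:
    print(classify_location(loc))
```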
States within the United States
The NSI project already has .geojson files for each state within the USA. These state names follow the ISO 3166-2:US standard, so if you wish to assign an area for Florida, you may use the .geojson file us-fl.geojson, for example:
"locationSet": {"include": ["us-fl.geojson"]}
You can also use multiple states if need be, such as Florida & Georgia, for example:
"locationSet": {"include": ["us-ga.geojson","us-fl.geojson"]}
Or exclude states, for example everywhere in the USA except Florida & Georgia:
"locationSet": {
"include": ["us"],
"exclude": ["us-ga.geojson","us-fl.geojson"]
}
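The include/exclude patterns above can also be generated from two-letter state codes. The Python sketch below uses hypothetical helper names and applies the one exception this section describes: Alaska and Hawaii come from country-coder and take no .geojson extension. It does not check that a code is a real state.

```python
# States that this section says are built into country-coder (no .geojson needed).
BUILT_IN_STATES = {"ak", "hi"}


def us_state_location(state_code: str) -> str:
    """Turn a two-letter state code such as 'FL' into a locationSet entry."""
    code = state_code.strip().lower()
    if len(code) != 2 or not code.isalpha():
        raise ValueError(f"expected a two-letter state code, got {state_code!r}")
    if code in BUILT_IN_STATES:
        return f"us-{code}"
    return f"us-{code}.geojson"


def usa_except(state_codes) -> dict:
    """Build a locationSet covering the USA minus the given states."""
    return {"include": ["us"], "exclude": [us_state_location(c) for c in state_codes]}


print(usa_except(["GA", "FL"]))
print(us_state_location("ak"))
```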
The two exceptions are Alaska and Hawaii, which are built into the country-coder project and can be referenced directly without the .geojson file extension, for example:
"locationSet": {"include": ["us-ak"]}
"locationSet": {"include": ["us-hi"]}
Features outside of the United States
Similar to US States, other areas of the world have their own internal (or administrative) borders, which may be called States, Provinces, Regions, Counties, Districts, etc. Some of these are already available within the NSI, and follow the local area's ISO_3166-2 naming conventions.
List of available ISO_3166-2 compatible Features
Below is a simplified list of available ISO_3166-2 compatible features within the NSI, which all require a .geojson extension, with a couple of exceptions where the .geojson extension is not needed.
Location | Features | ISO Standard | Examples | Exceptions |
---|---|---|---|---|
Australia | States within Australia | ISO_3166-2:AU | Queensland (au-qld.geojson), Victoria (au-vic.geojson) | Tasmania (au-tas) |
Austria | States of Austria | ISO 3166-2:AT | Vienna (at-9.geojson), Carinthia (at-2.geojson) | None |
Canada | Canadian Provinces | ISO_3166-2:CA | Alberta (ca-ab.geojson), Quebec (ca-qc.geojson) | None |
France | Regions within France | ISO_3166-2:FR | Île-de-France (fr-idf.geojson), Provence-Alpes-Côte-d’Azur (fr-pac.geojson) | None |
Germany | States within Germany | ISO_3166-2:DE | Berlin (de-be.geojson), Hessen (de-he.geojson) | None |
Japan | Prefectures of Japan | ISO 3166-2:JP | Chiba (jp-12.geojson), Aichi (jp-23.geojson) | None |
New Zealand | Regions within New Zealand | ISO_3166-2:NZ | Marlborough (nz-mbh.geojson), Wellington (nz-wgn.geojson) | None |
United States | States within the United States | ISO_3166-2:US | New Jersey (us-nj.geojson), California (us-ca.geojson) | Alaska (us-ak), Hawaii (us-hi) |
There is partial support for Counties in Great Britain (UK) which follow the ISO_3166-2:GB standard. One exception is gb-lon.geojson, which represents London as a whole, but gb-lon isn't part of the ISO_3166-2:GB standard, as each London Borough has its own ISO value.
Each item requires a tags value. This is just an Object containing all the OpenStreetMap tags that should be set on the feature.
Note for brands: common tags carrying additional details, such as "website" or "facebook", are not accepted directly. Instead, include a tag referencing the brand's Wikidata entry ("brand:wikidata"), which holds those details. If no such entry exists, consider adding it.
Often-used related Wikidata properties:
- P856 - official website
- P2002 - Twitter username (don't include the @ symbol)
- P2013 - Facebook ID (everything that follows the URL part 'https://www.facebook.com/')
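The formatting rules for the Twitter and Facebook properties can be expressed as two small helpers. This is only a sketch in Python: the function names are hypothetical, the example values are made up, and nothing here talks to Wikidata.

```python
FACEBOOK_PREFIX = "https://www.facebook.com/"


def twitter_username(handle: str) -> str:
    """Value for P2002: the username without the leading @ symbol."""
    return handle.strip().lstrip("@")


def facebook_id(url: str) -> str:
    """Value for P2013: everything that follows 'https://www.facebook.com/'."""
    url = url.strip()
    if not url.startswith(FACEBOOK_PREFIX):
        raise ValueError(f"not a facebook.com URL: {url!r}")
    return url[len(FACEBOOK_PREFIX):]


# Made-up example values, for illustration only.
print(twitter_username("@ExampleBrand"))
print(facebook_id("https://www.facebook.com/ExampleBrand"))
```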
Brands are often tagged inconsistently in OpenStreetMap. For example, some mappers write "International House of Pancakes" and others write "IHOP".
This project includes a "fuzzy" matcher that can match alternate names and tags to a single entry in the name-suggestion-index. The matcher keeps duplicate items out of the index and is used in the iD editor to help suggest tag improvements.
The matchNames and matchTags properties can be used to list the less-preferred alternatives.
"properties": {
"path": "brands/amenity/fast_food" // all items in this file will match the tag `amenity=fast_food`
…
},
"items": [
…
{
"displayName": "Honey Baked Ham",
"id": "honeybakedham-4d2ff4",
"locationSet": { "include": ["us"] },
"matchNames": ["honey baked ham company"], // also match these less-preferred names
"matchTags": ["shop/butcher", "shop/deli"], // also match these less-preferred tags
"tags": {
"alt_name": "HoneyBaked Ham", // match `alt_name`
"amenity": "fast_food",
"brand": "Honey Baked Ham", // match `brand`
"brand:wikidata": "Q5893363",
"cuisine": "american",
"name": "Honey Baked Ham", // match `name`
"official_name": "The Honey Baked Ham Company" // match `official_name`
}
},
…
👉 The matcher code also has some useful automatic behaviors…
You don't need to add matchNames for:
- variations in capitalization, punctuation, spacing (the middots common in Japanese names count as punctuation, so "V・ドラッグ" already matches "vドラッグ")
- variations that already appear in the name, brand, operator, network.
- variations that already appear in an alternate name tag (e.g. alt_name, short_name, official_name, etc.)
- variations that already appear in any international version of those tags (e.g. name:en, official_name:ja, etc.)
- variations in diacritic marks (e.g. "Häagen-Dazs" already matches "Haagen-Dazs")
- variations in & vs. and
You don't need to add matchTags for:
- Tags assigned to match groups (defined in config/matchGroups.json). For example, you don't need to add matchTags: ["shop/doityourself"] to every "shop/hardware" item and vice versa. Tags in a match group will automatically match any other tags in the same match group.
👉 Bonus: The build script will automatically remove unnecessary matchNames and matchTags.
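To see why those name variations need no matchNames entry, it can help to picture a normalization step that runs before names are compared. The Python sketch below is an illustration only and is not the project's matcher: the function name `simplify` is hypothetical, and the exact rules the real matcher applies may differ.

```python
import unicodedata


def simplify(name: str) -> str:
    """Reduce a name to a comparison key: lowercase, no diacritics, punctuation or spaces."""
    text = name.lower().replace("&", "and")
    # Decompose, drop Latin-style combining marks (U+0300-U+036F), then recompose.
    # Japanese voicing marks sit outside that range and are left untouched.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not ("\u0300" <= ch <= "\u036f"))
    text = unicodedata.normalize("NFC", stripped)
    # Drop whitespace and anything in a Unicode punctuation category (P*),
    # which includes the katakana middle dot.
    return "".join(
        ch for ch in text
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


print(simplify("Häagen-Dazs") == simplify("Haagen-Dazs"))
print(simplify("V・ドラッグ") == simplify("vドラッグ"))
print(simplify("A&W") == simplify("A and W"))
```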
You can optionally add a note property to any item. The note can contain any text useful for maintaining the index - for example, information about the brand's status, or a link to a GitHub issue.
The notes just stay with the name-suggestion-index; they aren't OpenStreetMap tags or used by other software.
{
"displayName": "United Bank (Connecticut)",
"id": "unitedbank-28419b",
"locationSet": { "include": ["peoples_united_bank_ct.geojson"] },
"note": "Merged into People's United Bank (Q7165802) in 2019, see https://en.wikipedia.org/wiki/United_Financial_Bancorp",
"tags": {
…
}
},
Sometimes multiple brands use the same name - this is okay!
Make sure each entry has a distinct locationSet, and the index will generate unique identifiers for each one.
You should also give each entry a unique displayName, so everyone can tell them apart.
{
"displayName": "Price Chopper (Kansas City)",
"id": "pricechopper-9554e9",
"locationSet": { "include": ["price_chopper_ks_mo.geojson"] },
"tags": {
"brand": "Price Chopper",
"brand:wikidata": "Q7242572",
"name": "Price Chopper",
"shop": "supermarket"
}
},
{
"displayName": "Price Chopper (New York)",
"id": "pricechopper-f86a3e",
"locationSet": { "include": ["price_chopper_ny.geojson"] },
"tags": {
"brand": "Price Chopper",
"brand:wikidata": "Q7242574",
"name": "Price Chopper",
"shop": "supermarket"
}
},
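A quick way to spot same-named entries that break these two rules is sketched below in Python. The function name `find_name_conflicts` is hypothetical and this is not the project's build check. It compares locationSets by exact content only, so it will not notice two different locationSets that overlap geographically.

```python
import json
from collections import defaultdict


def find_name_conflicts(items):
    """Return messages for same-named items that share a displayName or a locationSet."""
    by_name = defaultdict(list)
    for item in items:
        name = item.get("tags", {}).get("name")
        if name is not None:
            by_name[name].append(item)

    problems = []
    for name, group in by_name.items():
        seen_display, seen_location = set(), set()
        for item in group:
            display = item["displayName"]
            # Serialize the locationSet so it can be stored in a set.
            location = json.dumps(item["locationSet"], sort_keys=True)
            if display in seen_display:
                problems.append(f"{name}: duplicate displayName {display!r}")
            if location in seen_location:
                problems.append(f"{name}: duplicate locationSet {location}")
            seen_display.add(display)
            seen_location.add(location)
    return problems


# The two Price Chopper entries from the example above (ids omitted).
items = [
    {
        "displayName": "Price Chopper (Kansas City)",
        "locationSet": {"include": ["price_chopper_ks_mo.geojson"]},
        "tags": {"brand": "Price Chopper", "brand:wikidata": "Q7242572",
                 "name": "Price Chopper", "shop": "supermarket"},
    },
    {
        "displayName": "Price Chopper (New York)",
        "locationSet": {"include": ["price_chopper_ny.geojson"]},
        "tags": {"brand": "Price Chopper", "brand:wikidata": "Q7242574",
                 "name": "Price Chopper", "shop": "supermarket"},
    },
]

print(find_name_conflicts(items))
```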