Updating Suica database - metrodroid/metrodroid GitHub Wiki

Notes for replacing Suica database

There's a database for the IC cards in Metrodroid, but it's quite old. We should update this, and make things better.

Upstream data

Source: http://www.denno.net/SFCardFan/index.php

This page offers an Excel spreadsheet dump: "Excel 形式で保存(汎用)" ("Save in Excel format (generic)")

The file is mojibake (unreadable) in LibreOffice. The encoding is actually cp932 (MS-Kanji), which can be read in xlrd.
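A minimal sketch of the encoding problem, using only the Python standard library and a made-up sample string rather than the real spreadsheet contents: the same bytes are unreadable under a Western code page and readable under cp932. For the actual .xls file, xlrd's `open_workbook` accepts an `encoding_override` argument, which is presumably where `"cp932"` would go; that has not been verified against this particular dump.

```python
# Hypothetical sample: a station name as it might be stored in the cp932 dump.
SAMPLE = "新橋"


def decode_cell(raw: bytes, encoding: str = "cp932") -> str:
    """Decode raw bytes from the dump, assuming cp932 (MS-Kanji)."""
    return raw.decode(encoding)


# Bytes as they would appear on disk.
raw = SAMPLE.encode("cp932")

# Decoding with the wrong code page gives mojibake instead of kanji.
mojibake = raw.decode("latin-1")

# Decoding with cp932 recovers the original text.
readable = decode_cell(raw)

print(repr(mojibake))
print(readable)
```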

None of the files have romaji transliterations of the place names. :(

Data sources

We need some data source for translating the station names.

Ideally we need a source that provides:

  • station name in Japanese (normally kanji)
  • station's English name
  • line name
  • geographical position of the station

Nice to have:

  • station codes (JY-13, H-04)
  • line colour (Yamanote = green)
  • line geometry
  • station name in romaji (different from the English name, e.g. Tokyo Teleport station vs. Tōkyō Terepōto eki)
  • station name in kana
  • additional transliterations (Korean, Chinese, Russian...)

[Images: modern station signage for Shimbashi station on two different lines]

Supporting some of these features would require modifications to MdST. We also need to figure out a better way to handle the transliterations.

Issues with present data

Incorrect readings of station names and lines:

In Metrodroid, the reading of some Japanese names is listed incorrectly, for example:
Mita line: 三田 is listed as "Sanda"
Yamanote line: 山手 is listed as "Yamate"

etc.

We suspect that the present data is the product of machine translation.

Farebot / Google Translate

State: rejected; requires closed API (Google Translate) which has quality issues

Farebot got an implementation of this fairly recently.

However, this is a set of Ruby scripts that pass the station names through Google Translate. This is highly error-prone, and probably against Google Translate's terms of service.

GTFS data

State: rejected; very limited open data available

Few railway companies in Japan publish open data:

  • The Japanese government developed a GTFS-JP standard (GTFS with Japan-specific extensions)

  • Some smaller operators are opening their GTFS data

  • Tokyo Challenge (includes JR East data) is an annual open data contest. Its data was only available for a limited time (2017-12 to 2018-03), and registration is closed. Only some rail companies provided data. There was an award ceremony in May 2018, but that doesn't mean that the data will always be open.

  • There is one company (Jorudan) that aggregates GTFS data for Japan, but this costs money and would have license conditions that aren't compatible with Metrodroid's distribution.

OSM/Nominatim

State: will investigate further

Selecting feature types requires adding string hints: https://wiki.openstreetmap.org/wiki/Nominatim/Special_Phrases/EN

Example query, gives a station:

https://nominatim.openstreetmap.org/search/?format=json&addressdetails=1&limit=1&countrycodes=JP&q=Railway%20Station%20%E5%BB%9A%E5%B7%9D

[{"place_id":"772333","licence":"Data © OpenStreetMap contributors, ODbL 1.0. https:\/\/osm.org\/copyright","osm_type":"node","osm_id":"264201692","boundingbox":["35.093716","35.103716","139.0599174","139.0699174"],"lat":"35.098716","lon":"139.0649174","display_name":"Kinomiya, 小田原山北線, Atami, Shizuoka Prefecture, 413-0013, Japan","class":"railway","type":"station","importance":0.078258595754832,"icon":"https:\/\/nominatim.openstreetmap.org\/images\/mapicons\/transport_train_station2.p.20.png","address":{"station":"Kinomiya","road":"小田原山北線","city":"Atami","state":"Shizuoka Prefecture","postcode":"413-0013","country":"Japan","country_code":"jp"}}]

vs without hints, gives a river:

https://nominatim.openstreetmap.org/search/?format=json&addressdetails=1&limit=1&countrycodes=JP&q=%E5%BB%9A%E5%B7%9D

[{"place_id":"86785682","licence":"Data © OpenStreetMap contributors, ODbL 1.0. https:\/\/osm.org\/copyright","osm_type":"way","osm_id":"60536299","boundingbox":["39.373892","39.378038","140.533919","140.580487"],"lat":"39.376563","lon":"140.557086","display_name":"厨川, Misato, Semboku, Akita Prefecture, 0130008, Japan","class":"waterway","type":"river","importance":0.08125,"address":{"river":"厨川","town":"Misato","county":"Semboku","state":"Akita Prefecture","postcode":"0130008","country":"Japan","country_code":"jp"}}]
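The two lookups above differ only in the hint prepended to the query. As a rough sketch (standard library only; `build_search_url` and `is_station` are hypothetical helper names, and the endpoint and parameters are simply copied from the example URLs), the URL construction and a check on the returned `class`/`type` fields might look like this:

```python
from urllib.parse import quote, urlencode

# Endpoint exactly as used in the example URLs above.
NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search/"


def build_search_url(name: str, hint: str = "Railway Station") -> str:
    """Build a search URL; `hint` is the special phrase prepended to the name."""
    query = f"{hint} {name}" if hint else name
    params = {
        "format": "json",
        "addressdetails": 1,
        "limit": 1,
        "countrycodes": "JP",
        "q": query,
    }
    # quote_via=quote encodes spaces as %20, matching the example URLs.
    return NOMINATIM_SEARCH + "?" + urlencode(params, quote_via=quote)


def is_station(result: dict) -> bool:
    """True if a single Nominatim result describes a railway station."""
    return result.get("class") == "railway" and result.get("type") == "station"


print(build_search_url("Kinomiya"))
```

Checking `is_station` on each result would catch cases like the river above, where the name matches but the feature type does not.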

OSM (dumps)

Maybe use the Overpass API?

Overpass query for railway stations that have both a Japanese and an English name set:

area["name"="日本"];
(
node
  [railway=station]["name:ja"]["name:en"]
  (area);
);
out;

Railway stations with neither an English nor a Japanese name set (124 total). Many of these are remote or freight stations.

area["name"="日本"];
(
node
  [railway=station][!"name:ja"][!"name:en"]
  (area);
);
out;

We need a way to get the relations for these stations with that query, like:

https://www.openstreetmap.org/relation/332866

https://www.openstreetmap.org/relation/1830049

That should give a route, which can be used for more data matching.

OSM example:

Node 2574536692
Tags:
name=九品仏
name:en=Kuhombutsu
name:ja=九品仏
name:ja_kana=くほんぶつ
name:ja_rm=Kuhombutsu
note=National-Land Numerical Information (Railway) 2007, MLIT Japan
operator=東京急行電鉄
public_transport=station
railway=station
ref=OM11
source=KSJ2
source_ref=http://nlftp.mlit.go.jp/ksj/jpgis/datalist/KsjTmplt-N02-v1_1.html
wikidata=Q6442177
wikipedia=ja:九品仏駅
Node 1926355400
Tags:
colour=gray
name=六本木
name:en=Roppongi
name:fr=Roppongi
name:ja=六本木
name:ja_kana=ろっぽんぎ
name:ja_rm=Roppongi
name:ru=Роппонги
operator=東京地下鉄
public_transport=station
railway=station
ref=H04
station=subway
subway=yes
transport=subway
wheelchair=limited
wheelchair:description=Exit 4B has a stair assist.
wikidata=Q1326723
wikipedia=ja:六本木駅
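The tags on the two example nodes cover most of the fields wanted under "Data sources". A minimal sketch of pulling them out of a `key=value` tag dump follows; the `FIELD_TAGS` mapping and the function names are assumptions made for illustration, not part of Metrodroid or MdST, and the sample is a subset of the Roppongi node above.

```python
from typing import Dict, Optional

# Hypothetical mapping from the fields we want to OSM tag keys,
# based only on the two example nodes above.
FIELD_TAGS = {
    "name_ja": "name:ja",
    "name_en": "name:en",
    "name_romaji": "name:ja_rm",
    "name_kana": "name:ja_kana",
    "station_code": "ref",
    "operator": "operator",
}


def parse_tags(dump: str) -> Dict[str, str]:
    """Parse lines of the form key=value; lines without '=' are skipped."""
    tags = {}
    for line in dump.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            tags[key] = value
    return tags


def station_record(tags: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Pick the wanted fields; missing tags become None."""
    record = {field: tags.get(key) for field, key in FIELD_TAGS.items()}
    # Fall back to the plain name tag if name:ja is absent.
    if record["name_ja"] is None:
        record["name_ja"] = tags.get("name")
    return record


EXAMPLE_DUMP = """\
Tags:
name=六本木
name:en=Roppongi
name:ja=六本木
name:ja_kana=ろっぽんぎ
name:ja_rm=Roppongi
operator=東京地下鉄
ref=H04
"""

record = station_record(parse_tags(EXAMPLE_DUMP))
print(record)
```

Note that the `ref` values here (OM11, H04) are written without the hyphen used in the station code examples earlier (JY-13, H-04), so some normalisation would likely be needed.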

Other data