IP lookups enrichment - OXYGEN-MARKET/oxygen-market.github.io GitHub Wiki
JSON Schema: iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/1-0-0
Compatibility: 0.9.6+
Data provider: MaxMind
This enrichment uses MaxMind databases to look up useful data based on a user's IP address.
There are five possible fields you can add to the "parameters" section of the enrichment configuration JSON: "geo", "isp", "organization", "domain", and "netspeed". Each of these corresponds to a lookup in one of five MaxMind databases, and so needs to have two inner fields:
- The "database" field contains the name of the database file.
- The "uri" field contains the URI of the bucket in which the database file is found. The scheme can be either http: (for publicly available MaxMind files) or s3: (for commercial MaxMind files). The URI must not end with a trailing slash.
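The two rules for "uri" can be checked before the configuration is deployed. The following is a minimal Python sketch, not Snowplow's own validation: the function name `database_location` is hypothetical, and the idea that the file is found at the "uri" value followed by a slash and the "database" value is an assumption based on the description above.

```python
from urllib.parse import urlparse

# Schemes named in the text above.
ALLOWED_SCHEMES = {"http", "s3"}


def database_location(uri: str, database: str) -> str:
    """Check a "uri" value and return the assumed location of the file."""
    if uri.endswith("/"):
        raise ValueError(f"uri must not end with a trailing slash: {uri!r}")
    scheme = urlparse(uri).scheme
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"uri scheme must be http or s3, got {scheme!r}")
    # Assumption: the database file is found at "<uri>/<database>".
    return f"{uri}/{database}"


print(database_location(
    "http://snowplow-hosted-assets.s3.amazonaws.com/third-party/maxmind",
    "GeoLiteCity.dat",
))
```

A "uri" with a trailing slash or with any other scheme raises a `ValueError` instead of returning a location.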
The table below describes the five types of lookup. Note that Snowplow only works with the legacy binary format (.DAT), which you should have access to with any MaxMind subscription.
Field name | MaxMind Database name | Lookup description | Accepted database filenames | Fields populated |
---|---|---|---|---|
"geo" | GeoIPCity or GeoLiteCity | Information related to geographic location | "GeoLiteCity.dat" or "GeoIPCity.dat" | geo_country, geo_region, geo_city, geo_zipcode, geo_latitude, geo_longitude, and geo_region_name |
"isp" | GeoIP ISP | Internet Service Provider | "GeoIPISP.dat" | ip_isp |
"organization" | GeoIP Organization | Organization name for larger networks | "GeoIPOrg.dat" | ip_organization |
"domain" | GeoIP Domain | Second level domain name associated with IP address | "GeoIPDomain.dat" | ip_domain |
"netspeed" | GeoIP Netspeed | Estimated connection speed | "GeoIPNetSpeed.dat" or "GeoIPNetSpeedCell.dat" | ip_netspeed |
Field name is the name of the field in the ip_lookups enrichment configuration JSON which you should include if you wish to use that type of lookup. That field should have two subfields: "uri" and "database".
MaxMind Database name is the name of the database which that lookup uses.
Lookup description describes the lookup.
Accepted database filenames are the strings which are allowed in the "database" subfield. If the file name you provide is not one of these, the enrichment JSON will fail validation.
Fields populated are the names of the database fields which the lookup fills.
For each of these lookups you wish to use, add a corresponding field to the enrichment JSON. The field names you should use are "geo", "isp", "organization", "domain", and "netspeed".
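The table's rules can also be applied to a "parameters" object before it is submitted. The sketch below is only an illustration in Python and does not reproduce the real JSON Schema validation: `ACCEPTED_FILENAMES` is copied from the table above, while `check_parameters` and its message wording are hypothetical.

```python
from typing import Dict, List

# Accepted "database" values per lookup field, copied from the table above.
ACCEPTED_FILENAMES = {
    "geo": {"GeoLiteCity.dat", "GeoIPCity.dat"},
    "isp": {"GeoIPISP.dat"},
    "organization": {"GeoIPOrg.dat"},
    "domain": {"GeoIPDomain.dat"},
    "netspeed": {"GeoIPNetSpeed.dat", "GeoIPNetSpeedCell.dat"},
}


def check_parameters(parameters: Dict[str, dict]) -> List[str]:
    """Return a list of problems found in a "parameters" object."""
    problems = []
    for name, lookup in parameters.items():
        if name not in ACCEPTED_FILENAMES:
            problems.append(f"unknown lookup field: {name}")
            continue
        # Every lookup needs both subfields.
        for subfield in ("database", "uri"):
            if subfield not in lookup:
                problems.append(f"{name}: missing subfield {subfield}")
        database = lookup.get("database")
        if database is not None and database not in ACCEPTED_FILENAMES[name]:
            problems.append(f"{name}: database {database!r} is not accepted")
    return problems


# A mismatched file name is reported rather than silently accepted.
print(check_parameters({"isp": {"database": "GeoLiteCity.dat", "uri": "http://example.com"}}))
```

An empty list means that no problems were found under these simplified rules.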
Here is a maximalist example configuration JSON, which performs all five types of lookup using the MaxMind commercial files:
{
    "schema": "iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/1-0-0",
    "data": {
        "name": "ip_lookups",
        "vendor": "com.snowplowanalytics.snowplow",
        "enabled": true,
        "parameters": {
            "geo": {
                "database": "GeoIPCity.dat",
                "uri": "s3://my-private-bucket.s3.amazonaws.com/third-party/maxmind"
            },
            "isp": {
                "database": "GeoIPISP.dat",
                "uri": "s3://my-private-bucket.s3.amazonaws.com/third-party/maxmind"
            },
            "organization": {
                "database": "GeoIPOrg.dat",
                "uri": "s3://my-private-bucket.s3.amazonaws.com/third-party/maxmind"
            },
            "domain": {
                "database": "GeoIPDomain.dat",
                "uri": "s3://my-private-bucket.s3.amazonaws.com/third-party/maxmind"
            },
            "netspeed": {
                "database": "GeoIPNetSpeedCell.dat",
                "uri": "s3://my-private-bucket.s3.amazonaws.com/third-party/maxmind"
            }
        }
    }
}
Here is a simpler example configuration (which exactly duplicates the behaviour of Snowplow 0.9.5):
{
    "schema": "iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/1-0-0",
    "data": {
        "name": "ip_lookups",
        "vendor": "com.snowplowanalytics.snowplow",
        "enabled": true,
        "parameters": {
            "geo": {
                "database": "GeoLiteCity.dat",
                "uri": "http://snowplow-hosted-assets.s3.amazonaws.com/third-party/maxmind"
            }
        }
    }
}
This example uses the free GeoLiteCity database hosted by Snowplow.
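If the configuration is generated rather than written by hand, the same structure can be assembled programmatically. This Python sketch rebuilds the simpler example above; `build_config` is a hypothetical helper, and only the schema URI, field names, and values are taken from this page.

```python
import json

SCHEMA = "iglu:com.snowplowanalytics.snowplow/ip_lookups/jsonschema/1-0-0"


def build_config(parameters: dict, enabled: bool = True) -> dict:
    """Wrap a "parameters" object in the ip_lookups enrichment envelope."""
    return {
        "schema": SCHEMA,
        "data": {
            "name": "ip_lookups",
            "vendor": "com.snowplowanalytics.snowplow",
            "enabled": enabled,
            "parameters": parameters,
        },
    }


config = build_config({
    "geo": {
        "database": "GeoLiteCity.dat",
        "uri": "http://snowplow-hosted-assets.s3.amazonaws.com/third-party/maxmind",
    }
})

# Serialise to the JSON text that would be saved as the enrichment file.
print(json.dumps(config, indent=4))
```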
The only input value for this enrichment comes from the ip parameter, which maps to the user_ipaddress field in the atomic.events table.
This enrichment uses a third-party service, MaxMind, to look up data associated with the IP address. MaxMind offer industry-leading IP intelligence data, updated weekly.
Below is a summary of the fields in the atomic.events table that are populated by this enrichment (there is no dedicated table).
Field | Purpose |
---|---|
geo_country | Country of IP origin |
geo_region | Region of IP origin |
geo_city | City of IP origin |
geo_zipcode | Zip (postal) code |
geo_latitude | An approximate latitude (coordinates) |
geo_longitude | An approximate longitude (coordinates) |
geo_region_name | Region |
ip_isp | ISP name |
ip_organization | Organization name for larger networks |
ip_domain | Second level domain name |
ip_netspeed | Indication of connection type (dial-up, cellular, cable/DSL) |
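Which of these fields are filled depends on which lookups are configured. The mapping can be sketched in Python as follows; `FIELDS_POPULATED` restates the "Fields populated" column of the lookup table on this page, and `populated_fields` is a hypothetical helper, not part of Snowplow.

```python
from typing import Dict, List

# Lookup field name -> atomic.events fields it fills, as listed on this page.
FIELDS_POPULATED = {
    "geo": [
        "geo_country", "geo_region", "geo_city", "geo_zipcode",
        "geo_latitude", "geo_longitude", "geo_region_name",
    ],
    "isp": ["ip_isp"],
    "organization": ["ip_organization"],
    "domain": ["ip_domain"],
    "netspeed": ["ip_netspeed"],
}


def populated_fields(parameters: Dict[str, dict]) -> List[str]:
    """List the atomic.events fields filled by the configured lookups."""
    fields = []
    # Iterate in table order so the result does not depend on config order.
    for name, columns in FIELDS_POPULATED.items():
        if name in parameters:
            fields.extend(columns)
    return fields


# With only "isp" and "netspeed" configured, the geo_* fields are not listed.
print(populated_fields({"netspeed": {}, "isp": {}}))
```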