industry_mapping rollout plan - GovWizely/webservices GitHub Wiki

Plan

6. Adjust Google Sheet

...so that it can be auto-imported into IM (the industry_mapping app). All I'll need to do is rename the column headers.

5. Establish a “feedback” process

...that gathers industry strings from new sources and ensures they are added to IM.

Add functionality to IM that records unfulfilled lookups and provides a tool for admin users to go through these and add them to the DB as necessary.
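
A minimal sketch of how unfulfilled lookups might be recorded and queued for admin review. The class and method names (UnfulfilledLookupLog, record, pending, resolve) are hypothetical, and an in-memory hash stands in for whatever DB table IM would actually use.

```ruby
# Hypothetical sketch: not IM's actual code. An in-memory hash stands in
# for a DB table of lookups that found no mapping.
class UnfulfilledLookupLog
  Entry = Struct.new(:term, :source, :count)

  def initialize
    @entries = {}
  end

  # Record a lookup that found no mapping. Terms are normalized so that
  # repeated misses for the same string bump one counter.
  def record(term, source)
    key = key_for(term, source)
    entry = (@entries[key] ||= Entry.new(key[0], key[1], 0))
    entry.count += 1
    entry
  end

  # Entries for admin review, most frequent misses first.
  def pending
    @entries.values.sort_by { |entry| [-entry.count, entry.term] }
  end

  # Drop an entry once an admin has added it to the DB (or rejected it).
  # Returns true if an entry was removed.
  def resolve(term, source)
    !@entries.delete(key_for(term, source)).nil?
  end

  private

  def key_for(term, source)
    [term.to_s.strip.downcase, source.to_s]
  end
end
```

An admin tool could then list `pending` and call `resolve` as each string is mapped.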

4. Use IM in all existing webservices importers

TODO:
  1. Adjust all importers so that they query IM for industry field values. We'd use the industry_mapping_client gem.
  • MRL integration is done.
  • Trade Leads and Events will come next.
  • For other sources, we still need to decide whether or not to use IM.
  2. Adjust the query params of all endpoints so that each accepts an "industries" (plural) param and handles it in a consistent way.
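
One way to make the "industries" param behave consistently is to run it through a shared normalizer in every endpoint. The sketch below is an assumption about how that could look: the helper name normalize_industries and the choice to accept either a comma-separated string or an array are illustrative, not decided behavior.

```ruby
# Hypothetical helper: turns an "industries" param into a clean array,
# whether it arrives as a comma-separated string or as an array.
def normalize_industries(params)
  raw = params[:industries] || params['industries']
  Array(raw)
    .flat_map { |value| value.to_s.split(',') }
    .map(&:strip)
    .reject(&:empty?)
    .uniq
end
```

Each endpoint would call the same helper before building its query, so that a missing, blank, or repeated value is treated identically everywhere.
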
Notes:
  • When writing specs, we should use the VCR gem to record results from IM, so that our specs don’t access the network every time they run.
  • We're going to have to set up caching of results in webservices, or add bulk-lookup functionality to IM, or both. After experimenting with MRL, we found 1 lookup per source doc to be unacceptably slow.
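
The caching option could be sketched as a thin wrapper that remembers each (term, source) result for the duration of an import. This is an illustration only: CachedIndustryLookup is a hypothetical name, and the block passed to it stands in for the real call to IM, whose client API is not shown here.

```ruby
# Hypothetical sketch: wraps any lookup callable with an in-memory cache
# so that repeated industry strings cost only one request to IM.
class CachedIndustryLookup
  def initialize(&lookup)
    @lookup = lookup
    @cache = {}
  end

  # Returns the cached result for (term, source), calling the wrapped
  # lookup only on the first request. Empty or nil results are cached too.
  def fetch(term, source)
    key = [term, source]
    return @cache[key] if @cache.key?(key)

    @cache[key] = @lookup.call(term, source)
  end
end

# Usage: the block below is a stand-in for the real IM request.
requests = 0
lookup = CachedIndustryLookup.new do |term, source|
  requests += 1
  ["mapped #{term} from #{source}"]
end
lookup.fetch('Aerospace', 'MRL')
lookup.fetch('Aerospace', 'MRL') # served from the cache, no second request
```

A per-import cache like this helps most when many source docs share the same industry strings. Bulk lookup in IM would still be needed if most strings are distinct.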

Mostly done already:

1. Open source the industry_mapping (IM) app

TODO:
  1. Compile a list of files containing sensitive data that should not be open-sourced.
  2. Create a new IM repo in the GW account and initialize it with a single commit containing all code minus the sensitive files (and Capistrano config?).
  3. Fork the GW repo in the ITA account and set up IM production hosting.

2. Set up AWS Stack for IM staging

Notes:
  • We’d use staging to test code changes to IM. However, when webservices maps industries during import, on both production and staging, it should communicate with IM production (so that we are getting production data).

3. Make IM production accessible via DNS lookup

Right now, in order to access it, you must add an entry to the system's /etc/hosts file.

[Will fall out of step 1, TODO item 3.]