milestone - Gukie/building-recommend GitHub Wiki
version 1 (completed on 2017.11.08)
Architecture: JSoup+MySQL+Spring microservices
- crawl data from websites (JSoup)
- store the data in the DB (MySQL)
- generate an Excel report (Apache POI)
- send email to the recipients (Gmail OAuth2)
version 2
The main goal of this version is to learn Big Data technologies. The architecture changes to: MongoDB+Redis+ELK
- MongoDB (stores the original data)
- ELK (indexes the data)
- Redis (stores the ES index)
The whole process might look like the following:
- crawl the original data into MongoDB
- build aggregated data from MongoDB, and index it into ES
- store the ES index in Redis to speed up searching
- keep the already-existing data in Redis, rather than in memory
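The last step above (tracking already-existing data outside process memory) could be sketched roughly as below. This is only an illustration under assumptions: `SeenStore`, `InMemorySeenStore`, and `filterNew` are hypothetical names that do not come from the project, and the in-memory set merely stands in for Redis so the sketch runs without a server. A Redis-backed implementation would replace `InMemorySeenStore` with one that records keys in a Redis set.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CrawlDedup {
    /** Hypothetical abstraction over the store of already-existing record keys. */
    interface SeenStore {
        /** Records the key; returns true only if it was not present before. */
        boolean markSeen(String key);
    }

    /** In-memory stand-in (assumption); a Redis-backed class could implement the same interface. */
    static final class InMemorySeenStore implements SeenStore {
        private final Set<String> keys = new HashSet<>();

        @Override
        public boolean markSeen(String key) {
            return keys.add(key); // Set.add returns false for duplicates
        }
    }

    /** Keeps only the crawled ids that the store has not seen yet, in crawl order. */
    static List<String> filterNew(List<String> crawledIds, SeenStore store) {
        List<String> fresh = new ArrayList<>();
        for (String id : crawledIds) {
            if (store.markSeen(id)) {
                fresh.add(id);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        SeenStore store = new InMemorySeenStore();
        // First crawl: the duplicate id within the batch is dropped.
        System.out.println(filterNew(List.of("b1", "b2", "b1"), store));
        // Second crawl: only ids not recorded by the first crawl remain.
        System.out.println(filterNew(List.of("b2", "b3"), store));
    }
}
```

Hiding the store behind an interface keeps the crawler code unchanged when the backing store moves from memory to Redis.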
1. Design:
ELK will have 3 collections:
- collection1: the aggregated building data for the average price in each plate.
- collection2: trimmed data from MongoDB (possibly the whole data set)
- collection3: flagged data from MongoDB.
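As a rough illustration of what collection1 would hold, the per-plate average price can be computed by grouping records by plate. This is a minimal sketch under assumptions: the `Building` class and its `plate` and `price` fields are hypothetical, and the real crawled documents may be shaped differently. In practice the aggregation might instead run inside MongoDB before indexing.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class PlateAverages {
    /** Hypothetical shape of one crawled building record (assumption). */
    static final class Building {
        final String plate;
        final double price;

        Building(String plate, double price) {
            this.plate = plate;
            this.price = price;
        }

        String getPlate() { return plate; }
        double getPrice() { return price; }
    }

    /** Groups buildings by plate and averages the price within each group. */
    static Map<String, Double> averagePriceByPlate(List<Building> buildings) {
        return buildings.stream()
                .collect(Collectors.groupingBy(
                        Building::getPlate,
                        TreeMap::new, // sorted keys give a stable output order
                        Collectors.averagingDouble(Building::getPrice)));
    }

    public static void main(String[] args) {
        List<Building> sample = List.of(
                new Building("plate-a", 100.0),
                new Building("plate-a", 200.0),
                new Building("plate-b", 300.0));
        // Each entry of the resulting map would become one document in collection1.
        System.out.println(averagePriceByPlate(sample));
    }
}
```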