Google Parsing - malparty/keywords GitHub Wiki
Solution exploration
After some basic HttpClient trials, I quickly ran into a 429 Too Many Requests
error.
To enable parsing more keywords, I explored different approaches:
- Make the threads sleep for 3 s between requests.
This made parsing unacceptably slow.
- Adjust sleep duration dynamically (success = faster / failure = slower).
I realized that once a 429 Too Many Requests error was triggered, waiting several minutes was not enough to resume parsing. Restarting the app was faster.
- Then I tried parsing Google Search through a proxy service.
Instead of Google, the proxy service was now the one returning the 429 error, so this did not solve the problem.
- The current application works around the problem by disposing of its HttpClient whenever a 429 error is received.
This makes it possible to parse 600+ keywords at a fast pace (single thread), which is acceptable for our first product version.
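The last approach can be sketched roughly as follows. This is a language-neutral illustration written in Python, not the application's actual code. The names `fetch_all`, `make_client`, `max_retries` and `FakeClient` are hypothetical, and the sketch assumes that a fresh client is created right after the old one is disposed and that the same keyword is then retried.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

TOO_MANY_REQUESTS = 429


class FakeClient:
    """Hypothetical stand-in for a real HTTP client.

    Each instance answers `limit` requests normally, then returns 429,
    which mimics a client that has been flagged by the server.
    """

    created = 0  # counts how many clients were instantiated

    def __init__(self, limit: int = 2) -> None:
        FakeClient.created += 1
        self.limit = limit
        self.sent = 0
        self.closed = False

    def get(self, keyword: str) -> Tuple[int, str]:
        self.sent += 1
        if self.sent > self.limit:
            return TOO_MANY_REQUESTS, ""
        return 200, f"results for {keyword}"

    def close(self) -> None:
        self.closed = True


def fetch_all(
    keywords: Iterable[str],
    make_client: Callable[[], FakeClient],
    max_retries: int = 3,
) -> Dict[str, Optional[str]]:
    """Fetch every keyword on a single thread, replacing the client on each 429."""
    results: Dict[str, Optional[str]] = {}
    client = make_client()
    try:
        for keyword in keywords:
            results[keyword] = None  # stays None if every attempt fails
            for _ in range(max_retries + 1):
                status, body = client.get(keyword)
                if status == TOO_MANY_REQUESTS:
                    # Dispose of the flagged client and retry with a new one.
                    client.close()
                    client = make_client()
                    continue
                if status == 200:
                    results[keyword] = body
                break  # success, or an error other than 429: stop retrying
    finally:
        client.close()
    return results
```

Calling `fetch_all(["a", "b", "c"], FakeClient)` goes through the keywords in order and swaps in a new client each time the current one starts returning 429. The `max_retries` cap is an addition of this sketch so that a keyword cannot loop forever if new clients are rejected immediately.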
Not a perfect solution
Later in the project, I realized that AdWords ads are not shown for HttpClient requests.
Google knows my app is not a human.
You can find more explorations in the Improvements page of this wiki.