Improvements - malparty/keywords GitHub Wiki

Every project has room for improvement, and this two-week, part-time project certainly does too! Here is a selection of possible next steps.

Google Parser

AdWords detection

Today, Google does not treat our app as a human visitor: it returns no AdWords links and no merchant campaigns. As a result, our AdWords count is wrong (always 0).

Potential solutions, still to be tested:

  • Using a real web browser instead of an HttpClient, for instance Chrome sending its full set of headers and, if needed, running with a real UI.
  • Using, with the user's permission, the user's own web browser through a provided plugin (many exist). Be aware that this is not 100% bot-detection proof: you may still run into Google Captchas to some extent and may need to pay for a Captcha solver service.
  • Using human services like Amazon Mechanical Turk. Since the searches are performed by people rather than by a bot, this could offer the best protection against bot detection at a still-acceptable price.

Scalability

My tests showed that below 600 keywords per 5 minutes, the system works well (as long as the HTTP client is disposed of). To scale beyond this, you could use an adapter pattern to perform the Google search through different proxy providers. Whenever the server still returns a 429 Too Many Requests status code right after the HTTP client has been disposed of, you can switch to another proxy. Capacity would then scale roughly with the number of proxies you can provide to the system.
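The proxy-switching idea can be sketched as follows. This is a minimal illustration in Python, chosen for brevity, and not the project's actual code. All names (`RotatingSearcher`, `make_fake_provider`, `AllProxiesExhausted`) are hypothetical, and the providers are plain callables standing in for real proxy-backed HTTP clients so that the sketch runs without network access.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A provider takes a keyword and returns (HTTP status code, response body).
# In a real system each provider would route the request through one proxy;
# here they are plain callables (an assumption made for this sketch).
Provider = Callable[[str], Tuple[int, str]]

TOO_MANY_REQUESTS = 429


class AllProxiesExhausted(Exception):
    """Raised when every provider answered 429 for the same keyword."""


@dataclass
class RotatingSearcher:
    providers: List[Provider]
    current: int = 0  # index of the provider currently in use

    def search(self, keyword: str) -> str:
        # Try each provider at most once per keyword, starting with the current one.
        for _ in range(len(self.providers)):
            status, body = self.providers[self.current](keyword)
            if status != TOO_MANY_REQUESTS:
                # Any non-429 answer is handed back; other error codes are
                # not treated specially in this sketch.
                return body
            # Rate limited: switch to the next proxy provider.
            self.current = (self.current + 1) % len(self.providers)
        raise AllProxiesExhausted(keyword)


def make_fake_provider(name: str, limit: int) -> Provider:
    """Hypothetical stand-in for a proxy: answers 429 after `limit` calls."""
    calls = 0

    def provider(keyword: str) -> Tuple[int, str]:
        nonlocal calls
        calls += 1
        if calls > limit:
            return TOO_MANY_REQUESTS, ""
        return 200, f"{name}:{keyword}"

    return provider
```

With two fake providers, `RotatingSearcher([make_fake_provider("a", 2), make_fake_provider("b", 1)])` serves the first two keywords through `a`, moves to `b` for the third, and raises `AllProxiesExhausted` once both are rate limited. A real implementation would probably also want a cool-down period before retrying a proxy that has returned 429.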

Application & UX

Search form

At the moment, we have two different search forms: File Search and Keyword Search. It could be interesting to merge them into a single, centralized search with auto-completion and result suggestions. Results would then be shown as aggregates (one group for file-related results, another for keyword-related results).

Besides, adding search operators such as wildcards (for example, "key*" to find all words starting with "key") could be useful too.
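One possible way to support such a wildcard is to translate the user's query into a SQL `LIKE` pattern. The sketch below is in Python with an in-memory SQLite database, chosen only so that it is self-contained. The table name `keywords`, the column `name`, and both function names are assumptions made for this illustration, not the project's schema.

```python
import sqlite3
from typing import List


def to_like_pattern(query: str) -> str:
    """Turn a user query such as 'key*' into a SQL LIKE pattern ('key%')."""
    # Escape characters that are special in LIKE so they match literally.
    escaped = query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    # The user-facing wildcard '*' becomes the LIKE wildcard '%'.
    return escaped.replace("*", "%")


def search_keywords(conn: sqlite3.Connection, query: str) -> List[str]:
    """Return the stored keyword names that match the wildcard query."""
    rows = conn.execute(
        "SELECT name FROM keywords WHERE name LIKE ? ESCAPE '\\' ORDER BY name",
        (to_like_pattern(query),),
    )
    return [row[0] for row in rows]
```

Passing the pattern as a bound parameter keeps user input out of the SQL text itself. Note that SQLite's `LIKE` is case-insensitive for ASCII characters by default; other databases may behave differently.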

SignalR - Notification on other pages

Today, push notifications are supported only on the home page. It would be worthwhile to show these notifications on the other pages as well.