How does Analytics collection work and why it might not - mitikov/KeepSitecoreSimple GitHub Wiki

Magical server-based analytics that works only for living visitors

Analytics transforms browsing a site into a magical show that is dynamically composed to fit each individual's needs.

Full trust between the visitor and the system is needed to show the real personalization magic.

Since robots do not appreciate Analytics tricks, no miracles are shown to them.

Powerful technology always stands behind the show. We'll take a brief look at each stage of the wizardry.

Initial browser request

One must prove to be alive to participate in the phenomenal show.

The VisitorIdentification control is responsible for establishing trust. The simplified logic is:

Robots do not perform any browser activity (mouse move / touch screen), so a visitor that shows none is treated as a robot.
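The simplified logic can be sketched roughly as follows. This is a minimal illustration in Python, not Sitecore's actual implementation; `Visit`, `saw_browser_activity` and `is_considered_robot` are hypothetical names introduced only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """Hypothetical per-visit state; not a Sitecore type."""
    user_agent: str
    # Flipped to True once the browser reports a mouse move or a touch.
    saw_browser_activity: bool = False


def is_considered_robot(visit: Visit, robot_detection_enabled: bool = True) -> bool:
    """Treat a visit as a robot until some browser activity has been seen."""
    if not robot_detection_enabled:
        # With robot detection disabled, every visitor is trusted.
        return False
    return not visit.saw_browser_activity
```

In this sketch a layout without the identification step would never flip `saw_browser_activity`, so every visit to it would stay classified as a robot.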

Please ensure that all your layouts either define the VisitorIdentification control, or have robot detection disabled.

Data available to start the spectacle

The Referer header might reveal the search engine and keywords, the User-Agent shows the device used, and the IP address might reveal the visitor's region right away if the value is already cached.

Both the Analytics and ASP.NET session cookies are set on the first request.

The Analytics cookie is used to track every interaction originating from this browser.
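To make the first-request data more tangible, here is a small Python sketch of what can be read from the headers. It is only an illustration: the function name, the `q` query parameter and the `geo_cache` dictionary are assumptions made for this example and do not describe how Sitecore resolves this data.

```python
from urllib.parse import urlparse, parse_qs


def describe_first_request(headers: dict, remote_ip: str, geo_cache: dict) -> dict:
    """Collect what is known about a visitor from the very first request."""
    referer = urlparse(headers.get("Referer", ""))
    # Assumption: the referring search engine passes keywords in a "q" parameter.
    keywords = parse_qs(referer.query).get("q", [""])[0]
    return {
        "search_engine": referer.hostname,       # None when there is no Referer
        "keywords": keywords,
        "user_agent": headers.get("User-Agent", ""),
        # The region is known at once only if the lookup result is already cached.
        "region": geo_cache.get(remote_ip),
    }
```

The region stays `None` in this sketch until something else fills the cache, which mirrors the "in case value is cached" remark above.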

Individual, not a set of browsers

Identifying a unique person by cookie alone might be wrong, as people nowadays consume content on a personal phone, a tablet, and a few PCs with different browsers installed.

The data shown must be consistent for the same user across all devices, and this is where the Sitecore technology comes into play.

The show of relevant content continues once the visitor has identified themselves on the gadget/browser in use.

ASP.NET and Shared session states

The ASP.NET session is browser-specific: every browser has its own data storage that is not shared with others.
It is still a good place to store the pages visited during the interaction and the goals achieved from one device.

A shared session stores general information about the visitor (e.g. address, facets, engagement states) and is shared across different gadgets/ASP.NET sessions.

All the collected information must be flushed to the database when the visit is over, which is signalled by the Session_OnEnd event.

Sadly, among the session state providers shipped with the ASP.NET framework, only InProc supports Session_End.
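The dependency on the session-end event can be illustrated with a toy session store. This is a sketch under stated assumptions, not ASP.NET or Sitecore code: `SketchSessionStore`, `track`, `expire` and the `supports_session_end` flag are hypothetical, and time is passed in as plain minutes to keep the example deterministic.

```python
class SketchSessionStore:
    """Toy session store that flushes data only if it can signal session end."""

    def __init__(self, timeout_minutes=20, supports_session_end=True):
        self.timeout_minutes = timeout_minutes
        self.supports_session_end = supports_session_end
        self._sessions = {}   # session_id -> [last_access_minute, collected_pages]
        self.database = []    # stands in for the analytics database

    def track(self, session_id, page, now):
        """Record a page view and refresh the session's last-access time."""
        entry = self._sessions.setdefault(session_id, [now, []])
        entry[0] = now
        entry[1].append(page)

    def expire(self, now):
        """Drop timed-out sessions; flush them only when session end is supported."""
        for session_id, (last_access, pages) in list(self._sessions.items()):
            if now - last_access >= self.timeout_minutes:
                del self._sessions[session_id]
                if self.supports_session_end:
                    self.database.append((session_id, pages))
```

With `supports_session_end=False` the expired session simply disappears and nothing reaches `database`, which is the situation described above for providers without Session_End.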

Data collection

One can browse the site using different devices (different ASP.NET sessions), but all the information is tied together by the user identifier (e.g. email, name).

No concurrent data modification takes place, as requests are given access to a session one after another (no parallel request processing).

If a user is enrolled into an engagement plan from a phone, the enrollment becomes active immediately on every other device as well.

The analytics information is accumulated in sessions during the site visit and flushed to the database when the session expires.
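The relationship between the per-device sessions and the shared one can be sketched like this. All class and attribute names (`SharedSession`, `DeviceSession`, `SessionRegistry`, `engagement_plans`) are hypothetical and chosen only for this illustration; the real Sitecore types are different.

```python
class SharedSession:
    """Visitor-wide data, shared by every device of the same identified user."""

    def __init__(self, identifier):
        self.identifier = identifier
        self.engagement_plans = set()


class DeviceSession:
    """Browser-specific data, comparable to one ASP.NET session."""

    def __init__(self, device):
        self.device = device
        self.pages_visited = []
        self.shared = None  # attached once the visitor identifies


class SessionRegistry:
    """Hands out one shared session per user identifier."""

    def __init__(self):
        self._shared_by_identifier = {}

    def identify(self, device_session, identifier):
        shared = self._shared_by_identifier.get(identifier)
        if shared is None:
            shared = SharedSession(identifier)
            self._shared_by_identifier[identifier] = shared
        device_session.shared = shared
        return shared


registry = SessionRegistry()
phone, pc = DeviceSession("phone"), DeviceSession("pc")
registry.identify(phone, "jane@example.com")
registry.identify(pc, "jane@example.com")
phone.shared.engagement_plans.add("newsletter")  # enrolled from the phone
```

Because both device sessions point at the same shared object, the enrollment made from the phone is visible from the PC right away, while `pages_visited` stays separate per device.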

Pitfalls

  1. InProc stores the data in process memory and is sensitive to application restarts. The collected data is lost if a restart takes place before the ASP.NET session timeout (20 minutes by default) elapses.

  2. Sitecore applications keep the ASP.NET session alive. Collected data is flushed only 20 minutes after all Sitecore interfaces are closed.

  3. Collected data is not flushed if the session state provider does not support the Session_End event.

  4. No data is collected if the visitor is considered a robot. Layouts must have the VisitorIdentification control.

  5. Analytics Tracking configuration must be active to power data collection.
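The timing behind the first two pitfalls is simple arithmetic and can be written down as a tiny sketch. The function names are hypothetical, minutes are counted from an arbitrary starting point, and the boundary case (a restart at the very minute of the flush) is treated as safe purely by assumption.

```python
SESSION_TIMEOUT_MINUTES = 20  # ASP.NET default mentioned in pitfall 1


def flush_minute(last_activity_minute, timeout=SESSION_TIMEOUT_MINUTES):
    """Minute at which an idle session expires and its data gets flushed."""
    return last_activity_minute + timeout


def is_lost_on_restart(last_activity_minute, restart_minute,
                       timeout=SESSION_TIMEOUT_MINUTES):
    """InProc data is lost when the application restarts before the flush."""
    return restart_minute < flush_minute(last_activity_minute, timeout)
```

For pitfall 2, `last_activity_minute` would be the moment the last Sitecore interface was closed, since an open interface keeps refreshing the session.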

Conclusions

I have tried to present the Troubleshooting xDB data issues material in an informal way.

A high-level overview of Analytics hopefully makes things a bit more understandable and sheds more light on:

  • Why 2 sessions are needed
  • When to expect data to appear in the database
  • Why data may not be collected

The process described here does not cover clustering, identification, aggregation and other scary terms.

Further reading

Brave knights are welcome to read the community docs to deepen their knowledge in this area.