Why you see (other) in Google Analytics reports and a ninja tip to get around it.

<ninjachop/>

I wanted to explain this last week, but the solution requires advanced segments, and most GA users don't know how they work. So last week I explained how advanced segments work; this week I explain how to use them to get rid of (other).

(other) exists to make reporting faster.

Google Analytics reports are generated by processing visitor, session, and hit data. To make the product really fast, GA pre-calculates each standard report. Each pre-calculated report only stores 50,000 rows per day: the top 49,999 rows get actual values, and the last (50,000th) row gets the label (other) and the sum of all the remaining row values.

So you see (other) in Google Analytics because you are sending more than 50,000 values per day for a standard report. Generally this works fine. The totals are always correct. Also, most people only view the top 100 results and don't jump to the 49,900th row.
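To make the roll-up concrete, here is a minimal sketch in Python. It is a toy model, not GA's actual code: the function name rollup, the sample pages, and the tiny row limit of 3 (standing in for 50,000) are all made up for illustration.

```python
from collections import Counter

OTHER_LABEL = "(other)"

def rollup(rows, max_rows=50_000):
    """Keep the top (max_rows - 1) rows and fold the rest into one (other) row."""
    ranked = Counter(rows).most_common()  # (label, value) pairs, largest value first
    if len(ranked) <= max_rows:
        return ranked  # under the limit, so no (other) row is needed
    kept = ranked[: max_rows - 1]
    remainder = sum(value for _, value in ranked[max_rows - 1 :])
    return kept + [(OTHER_LABEL, remainder)]

# Hypothetical pageview counts, with a row limit of 3 instead of 50,000.
pageviews = {"/home": 120, "/pricing": 80, "/blog/a": 5, "/blog/b": 3, "/blog/c": 2}
report = rollup(pageviews, max_rows=3)
print(report)  # [('/home', 120), ('/pricing', 80), ('(other)', 10)]

# The total survives the roll-up, which is why report totals stay correct.
print(sum(value for _, value in report) == sum(pageviews.values()))  # True
```

The long tail (/blog/a, /blog/b, /blog/c) is still counted, you just can't see the individual rows any more.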

But what if '(other)' is a top entry in your reports and you want to do some long tail analysis?

Well, you can create an advanced segment that matches all sessions and apply that segment to a standard report. For example, you can create an advanced segment for the dimension Visitor Type that matches the regular expression .*

(Note: This is NOT the same as applying the "all visits" segment)
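If you want to convince yourself that .* really matches every session, here is a quick sketch. It uses Python's re module as a stand-in for GA's regex matching, and the sessions list is invented sample data.

```python
import re

# The match-everything pattern from the tip above.
MATCH_ALL = re.compile(r".*")

# Hypothetical sessions; Visitor Type is always one of these two values.
sessions = [
    {"id": 1, "visitor_type": "New Visitor"},
    {"id": 2, "visitor_type": "Returning Visitor"},
    {"id": 3, "visitor_type": "New Visitor"},
]

# Keep every session whose Visitor Type matches the pattern.
segment = [s for s in sessions if MATCH_ALL.search(s["visitor_type"])]

print(len(segment) == len(sessions))  # True: no session is filtered out
```

The segment filters nothing out. Its only job is to be a segment.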

I was recently looking at the GA reports of a large site where the Top Pages report had 49,999 pages. When I applied the Visitor Type segment above, I could see 250k+ pages. That's a big difference.

Cool, ya? OK, here is why this works.

As I discussed before:
- Advanced segments include or exclude entire sessions when reports are being processed.

So advanced segments are applied at the time of report processing, and therefore, reports that use advanced segments can never be pre-calculated. Instead reports with advanced segments use the raw session and hit data to re-calculate the report on-the-fly.

Typically, advanced segments are used to include or exclude sessions from being processed. But when you create one that matches all sessions, the only thing it does is bypass the pre-calculated reports and force the entire report to be re-calculated.
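Here is a rough sketch of those two paths in Python. It is only a mental model under my own assumptions, not how GA is actually built: build_report, get_report, the sample sessions, and the toy row limit of 2 are all hypothetical.

```python
from collections import Counter

def build_report(sessions, max_rows=None):
    """Count pageviews per page; optionally fold the tail into (other)."""
    counts = Counter(page for s in sessions for page in s["pages"])
    ranked = counts.most_common()
    if max_rows is None or len(ranked) <= max_rows:
        return ranked
    tail = sum(n for _, n in ranked[max_rows - 1 :])
    return ranked[: max_rows - 1] + [("(other)", tail)]

# Hypothetical raw session data.
raw_sessions = [
    {"visitor_type": "New Visitor", "pages": ["/home", "/a"]},
    {"visitor_type": "Returning Visitor", "pages": ["/home", "/b"]},
    {"visitor_type": "New Visitor", "pages": ["/home", "/c"]},
]

# Built once, ahead of time, with a toy row limit of 2.
precalculated = build_report(raw_sessions, max_rows=2)

def get_report(segment=None):
    if segment is None:
        return precalculated  # fast path: return the stored result
    matching = [s for s in raw_sessions if segment(s)]
    return build_report(matching)  # slow path: recompute from raw data

def match_all(session):
    return True  # plays the role of the .* segment

print(get_report())           # [('/home', 3), ('(other)', 3)]
print(get_report(match_all))  # [('/home', 3), ('/a', 1), ('/b', 1), ('/c', 1)]
```

Same sessions, same totals, but the segmented path never touches the stored report, so the long tail comes back.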

Really cool...

The numbers differ between pre-calculated and on-the-fly reports because each type of report has different limits.

Pre-calculated reports only store 50k rows of data per day but process all sessions (visits).

Reports calculated on-the-fly can return up to 1 million rows of data, but they only process 500,000 sessions (visits). After 500k visits, sampling kicks in.

So this solution works best when you have fewer than 500k visits in your date range. (You can find the number of sessions in the date range by looking at the Visits metric in the traffic overview report.)
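As a quick sanity check before you try the trick, the two on-the-fly limits can be written down as a small helper. The limits are the ones quoted in this post; the function name and the example numbers are made up.

```python
# On-the-fly limits as quoted in this post.
ON_THE_FLY_MAX_ROWS = 1_000_000
ON_THE_FLY_MAX_SESSIONS = 500_000

def check_match_all_segment(sessions_in_range, distinct_rows):
    """Say whether a match-all segment report would hit either limit."""
    return {
        "sampled": sessions_in_range > ON_THE_FLY_MAX_SESSIONS,
        "rows_over_limit": distinct_rows > ON_THE_FLY_MAX_ROWS,
    }

# Hypothetical site: 400k visits in the date range, 250k distinct pages.
print(check_match_all_segment(400_000, 250_000))
# {'sampled': False, 'rows_over_limit': False}

# Same pages but 2 million visits: the report would be sampled.
print(check_match_all_segment(2_000_000, 250_000))
# {'sampled': True, 'rows_over_limit': False}
```

If the first check comes back sampled, try a shorter date range so the visit count drops back under 500k.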

Hopefully this tip helps you better understand how advanced segments work and gives you a new tool to access more of your long tail data.

If you find these types of tips helpful, let me know.