Introducing Bot and Spider Filtering

Many of you have shared with us that it’s hard to identify the real traffic that comes to your pages. That’s why I’m pleased to announce that we’re adding bot and spider filtering. 

To use it, select the new checkbox in the view-level settings of the management user interface, labeled "Exclude traffic from known bots and spiders". Selecting this option excludes all hits that come from bots and spiders on the IAB known bots and spiders list. The backend excludes hits matching the User Agents named in the list, as though they were subject to a profile filter. This lets you identify the real number of visitors coming to your site.
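
As a rough illustration of User Agent filtering of this kind, here is a minimal Python sketch. It is not how the Google Analytics backend is implemented. The token list is a hypothetical stand-in (the real IAB list is not reproduced here), and the `Hit` type and function names are assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for list entries; the real IAB list is not public.
KNOWN_BOT_TOKENS = ("examplebot", "uptime-checker", "perf-monitor")


@dataclass
class Hit:
    page: str
    user_agent: str


def is_known_bot(user_agent: str, tokens=KNOWN_BOT_TOKENS) -> bool:
    """Return True if the User Agent contains any listed token (case-insensitive)."""
    ua = user_agent.lower()
    return any(token in ua for token in tokens)


def exclude_known_bots(hits, tokens=KNOWN_BOT_TOKENS):
    """Keep only hits whose User Agent does not match the list."""
    return [hit for hit in hits if not is_known_bot(hit.user_agent, tokens)]


hits = [
    Hit("/", "Mozilla/5.0 (Windows NT 6.1) Firefox/30.0"),
    Hit("/", "ExampleBot/1.0 (+http://example.com/bot)"),
    Hit("/pricing", "Uptime-Checker/2.3"),
]
filtered = exclude_known_bots(hits)
print(len(hits), "hits before,", len(filtered), "after")
```

Because the match relies only on the User Agent string, a hit that sends a browser-like User Agent passes through unchanged.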

Nestlé has been testing it and has found great benefit:  

“The Bot filter solution is essential for getting deeper insights. View level availability let us stay fully aligned with Best Practices provided to all site owners. Very easy to use, understand and communicate across thousands of Google Analytics users.”
-  Katarzyna Malik, Nestlé Google Analytics Specialist

Happy Analyzing!

Posted by Matthew Anderson, Google Analytics Team
181 comments
 
This sounds amazing if it includes companies like semalt... Love the fact it's view level so you can choose an unfiltered view! Hopefully on by default though? 
 
YA! Just noticed this on one of my client accounts this morning. Great addition!
 
Is this feature only available for Universal Analytics?
 
It doesn't seem to be working on all accounts. Is there a rollout schedule?
 
Would be nice to be able to send raw traffic logs to GA for processing to have a more accurate picture of traffic including those hits that don't run javascript that get missed.
 
I don't see it on ours either.
 
Does anyone know if this is retroactive or moving forward?
 
+Thanos Lappas +Srikar Reddy +Eric Brandt   I looked through a bunch of properties we manage. I did not see this option in any of the sites that still have traditional GA installed. I did see it in some of the sites that have UA installed
 
I'm sure it's not retro Joe. Nothing in GA is retroactive. 
 
I managed to do this on two of my websites, but for the third I couldn't see the option available. Could you advise on this please?
Ani Lopez
 
Wait a sec! Generally speaking, we took for granted that bots and spiders were not affecting accuracy because they don’t execute JS (well, most of them). But if Google now offers the option to filter them out, the impact was probably higher than expected, which leads me to think Google’s own bots have something to say here.

More than just excluding them, I would prefer to recognize them so I can decide what to do. For SEO reasons in complex international – multilingual scenarios (or just simple SEO), I was including them in a separate view in GA, and the insights are pretty interesting: when your pages are first discovered, when they are revisited by bots, by which ones, and so forth. So most likely I won’t be using this feature.
 
+Ani Lopez in this case this feature aims to exclude traffic from bots that execute javascript (like uptime checkers, performance monitors, any bot that needs to run Javascript).

I will give it a try :)
 
Maybe give it a try on a duplicate profile to see what effect it has. I saw Google said they do read some JS in their bots now, so my guess is others do too. But most will miss the GA JS call.
 
Why isn't it available in all views? It can't be an advanced analytics thing, as I'm missing the option even on properties where that is installed...
 
Good point, +Sean Doggendorf - I was thinking the same thing. I'm curious to see how much of a difference it'll make so I'm getting ready to "flip the switch" myself.
 
I also can't see the option on any of the accounts I am managing.
 
This seems like a weak approach to fixing a problem that may not exist. Google Analytics already includes several effective hurdles to prevent robot traffic from being recorded. The IAB list of blacklisted user agents is closed source, so you never know who you are excluding. The User-Agent client HTTP header field is optional and forgeable, so no evil botmaker is going to be thwarted at this level.
 
Is there a way to easily apply this setting to all existing properties we have? I'd hate to have to iterate over all properties to set this for each and every one of them...
 
Can someone share the link for IAB spiders and bots list? Thanks!
 
Totally agree with +Ani Lopez when he says "we took for granted that bots and spiders were not affecting accuracy because they don’t execute JS". So this new option makes me think that there are also bots in my GA traffic... Let's see how many (in a separate view, of course)!
 
+sean dreilinger Semalt is a nuisance in my opinion. I had to use an .htaccess addition to block them. Glad Google did something more intuitive for the average analytics user.
 
I thought bots were already excluded (from tracking) by virtue that they don't execute most JavaScript.
 
+Mark Mehl Semalt based domains/subdomains were getting around it and showing as referral traffic in a lot of users' analytics.
 
Would be great if there was actually a bot traffic report.
 
Would users be able to reverse engineer this to only see bot/spider activity, similar to how one would investigate server log files?
 
+anthony dazhan not via GA, though I'd advise creating a view for this feature so you can compare it against the unfiltered view(s).

One solution would be a server-side script that detects bots against a known-bot database and asynchronously sends relevant details about the bot to GA, such as this one from +Adrian Vender: http://www.adrianvender.com/universal-analytics-for-search-bots/

Another solution would be a next-gen DNS service like +CloudFlare (http://cloudflare.com/), which reports bot activity regardless of JS execution, even on the free tier, and gives you a nice accuracy comparison against your GA numbers too.
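
For the server-side approach mentioned above, here is a minimal Python sketch, not the linked script. It assumes the Universal Analytics Measurement Protocol endpoint and its `v`, `tid`, `cid`, `t`, `dh` and `dp` parameters. The bot token list, the function names and the `UA-XXXXX-Y` tracking ID are placeholders for illustration. The sketch only builds the payload; `send_hit` is defined but not called, and it sends synchronously rather than asynchronously.

```python
import uuid
from urllib.parse import urlencode
from urllib.request import Request, urlopen

COLLECT_URL = "https://www.google-analytics.com/collect"
BOT_TOKENS = ("googlebot", "bingbot")  # illustrative, not a complete database


def detect_bot(user_agent):
    """Return the matching bot token, or None if the User Agent looks human."""
    ua = user_agent.lower()
    for token in BOT_TOKENS:
        if token in ua:
            return token
    return None


def build_bot_hit(tracking_id, bot_name, host, path):
    """Build a URL-encoded pageview payload with a stable client ID per bot."""
    client_id = str(uuid.uuid5(uuid.NAMESPACE_DNS, bot_name))
    return urlencode({
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID (placeholder in this sketch)
        "cid": client_id,
        "t": "pageview",
        "dh": host,
        "dp": path,
    })


def send_hit(payload, timeout=2.0):
    """POST the payload to the collection endpoint (not called in this sketch)."""
    request = Request(COLLECT_URL, data=payload.encode("utf-8"))
    with urlopen(request, timeout=timeout) as response:
        return response.status


bot = detect_bot("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")
payload = build_bot_hit("UA-XXXXX-Y", bot, "example.com", "/products")
print(payload)
```

Pointing such hits at a separate property or view keeps bot activity out of the main reports while still making it visible.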
 
+Google Analytics could you please share with us a list of bots and spiders which you are excluding in default?
cc: +Daniel Waisberg +Matthew Anderson  help us
 
This is exactly what I needed! Thanks +googleanalytics
 
+Gerry White I'm sure you already know this but you can also create a filter to remove SemAlt.com data too. Create a "With Filters" profile, then New Filter > Exclude > Referral > semalt.com
 
+Roy McClean of course, I just know that they could easily change the referrer, and then the filter, across twenty accounts, would be useless... It's not just them, I've seen similar activities from other companies too...
 
After a bit of searching have found it! Great new feature though.
Iona M
 
+Google Analytics - I've turned this feature on but my figures look exactly the same.  We don't run advertising on the site but get about 1m PVs a month.  Any ideas?
Iona M
 
+Google Analytics +Matthew Anderson just read '3. The setting only applies moving forward and not retro-actively'. Will there be a way to see the view count both with and without bots moving forward? As a publisher site, turning this on now will warp our monthly reporting.
 
+Iona M I would think GA would have already been filtering out Googlebot, or that Googlebot would skip over executing any GA JavaScript encountered during crawls. Seems plausible that Bingbot would too.  I figured this feature had to do with all the other known bots out there that execute JS during crawls.

Has anyone seen official confirmation from any major engines that their crawlers skip execution of common tracking JS, or GA specifically?
 
Thanks +Gerry White This furthers the need/importance for multiple views per property then. (Example: Main, Bot test, Raw, etc..) As well as one of my favorite segments "non bounce" for Ecom insight.
 
It's not working for me. I enabled the "Exclude all hits from known bots and spiders" option yesterday and I am still getting a lot of bot traffic. :(
 
it would be nice to see the list of what's getting filtered out or a page with a count by # of views by blocked user agents. I love the blocking, just curious to see what's actually getting blocked.
 
Anyone know if either semalt or kambasoft will be filtered out with this enhancement?
 
Finally! I've been getting crazy amounts of bot traffic and it's destroying my analytics.
 
Does this view backdate once selected? I activated it on my account and am still seeing Semalt for past visits. I know I can block with a manually created filter, but I want to see if GA's new built-in solution works! 
 
No. It only factors in going forward. 
 
Appreciated but what about unknown bots or spiders? 
 
Does it only remove going forward or will it filter out past traffic too?
 
I would think this is to Analytics as the Junk Folder is to email and should just be automatically filtered in all views or added as a segment like +Google Analytics has for new visitors etc.
 
This doesn't seem to get rid of the annoying semalt visits
 
Turned on the option three days ago in one view of a property, leaving it off in a second view. Both still show exactly the same figures for sessions, users, pageviews, etc.

Any ideas why there is no difference?
 
Thanks for the tip +Shelly Cihan ! Have you experienced any negative effects of excluding semalt through their own function, such as those sites/companies being put on some kind of leads list and contacted by Semalt?
 
+Peder Einarsson I have not seen any negative repercussions of this opt-out tool. They do not ask for anything but the domain and you can add an entire list in one entry. Every site that I have added has dropped to 0 referrals from this source. I personally feel like if enough users opt-out that will be a stronger signal to Semalt than filtering it server or GA side (though if you read the comments online about it, they must have a clue how much their traffic is disliked in GA).
 
I checked the bot filtering option and it has made no difference with Semalt. Anyone else?
 
Still getting semalt and kambasoft referrals. 
 
Updating my Semalt suggestion to use their opt-out form to filter their visits: this appears to remove only visits from semalt.semalt.com. Though this has been effective in removing the majority of visits, I am starting to see various (upwards of 30) other subdomains appear that are not filtered (398.semalt.com, 40.semalt.com, etc.). I am now going to resort to GA referral exclusion filtering as well.
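
One way to cover every subdomain with a single pattern is a regular expression anchored on the registered domain. The Python sketch below is illustrative only: the function name and sample URLs are made up, and regex syntax accepted by a GA filter may differ in detail from Python's `re` module.

```python
import re
from urllib.parse import urlparse

# Matches semalt.com itself and any subdomain, but not look-alike domains.
SPAM_REFERRER = re.compile(r"(^|\.)semalt\.com$", re.IGNORECASE)


def is_spam_referrer(referrer_url):
    """Return True if the referrer's hostname is semalt.com or one of its subdomains."""
    host = urlparse(referrer_url).hostname or ""
    return bool(SPAM_REFERRER.search(host))


for url in (
    "http://semalt.semalt.com/crawler.php",
    "http://398.semalt.com/",
    "http://notsemalt.com/",
    "https://www.google.com/search",
):
    print(url, is_spam_referrer(url))
```

The leading `(^|\.)` and trailing `$` keep the pattern from matching unrelated hosts that merely contain the string.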
 
Hi, I've noticed significant spikes in traffic which, on investigation, came from old versions of Firefox, leading me to believe it's bot traffic. However, when I apply this filter, my session/user data doesn't change (it is identical). I don't believe a website that receives considerable traffic is bot free, so why would the numbers not update? Thanks
 
Anybody know if DFP has a similar setting? Having an overclicking problem on a work site, thinking it might be bots.
 
Does this thing work? Still seeing semalt.
 
+Guney Ozsan Bot filtering in GA won't exclude Semalt. You should remove it using an htaccess rule. If you don't have the ability to do that, you can remove a large portion of the referrals using http://semalt.com/project_crawler.php and then exclude the remaining subdomains using a filter in GA. This bot results in high bandwidth issues as well which is why htaccess is the best route if possible.
 
It would be nice if Google can provide a list of what they are excluding.
 
Great, it's very helpful but we'll see what will happen from now.
 
Would be nice if there was a way to enable this feature for all your GA sites with one check box.
Bruno B
 
Does this work for Samara referral spam? An extensive list would be helpful.
 
What about iloveitaly.com (which redirects to aliexpress or alibaba) and forum.darodar.com which is the same? Those are illegitimate ways to get backlinks and commissions on sales, I would say.
 
Darodar is hitting Google Analytics directly and not visiting pages, so .htaccess changes won't have an effect on it.
 
+Ty Cahill
What do you mean that is hitting GA directly and not visiting pages? Could you elaborate a little bit more? Thank you.
 
+Angela Marcos 
I think he means that the traffic from darodar, ilovevitaly, etc. is not real. The bot generates fake traffic/visits and uses the analytics tracking codes to trick Google.
 
Where do I find this setting?  I'm fairly new to Analytics and still learning.  Thanks!
 
Too bad this doesn't work for a bunch of the annoying sites doing it currently - econom, ilovevitaly, darodar AND GA filters seem to be basically broken to filter them out. Also to all the geniuses suggesting .htaccess - not everyone runs their site on Apache ya dinguses, plus that doesn't work if these spammers are calling the GA client-side calls directly instead of visiting your site. 
 
Hi there, Could you confirm if this also blocks semalt style bots, therefore making any custom filters set up to remove these from your data redundant? Would be great just to tick a box!
 
Nope! Because semalt are visits from a bot at your site. The better way to exclude them is by htaccess. There are a lot of good tutorials. Just Google semalt and spam ;)
 
+Klaus Aßmann 1) Not everyone runs their site on Apache, 2) .Htaccess is NOT the real solution to this issue, yes it works but - do you really want to play whack a mole forever with these sites? What if you run hundreds of sites? +Google Analytics  needs to handle this issue on their end, rather than having millions of webmasters creating millions of GA filters and htaccess rules.
 
Why on earth doesn't analytics have a checkbox next to the referer that you can select and simply click SPAM. That way it'll be removed from the records forever. Setting up a filter is a pain. I could quickly select all the bots and tap one button. I can do it for email, so why not for bots?
 
Click referer to set as spam. DOOOO ITTTT. Get this feature in the backlog and make millions happy. Better yet, conglomerate the user input and auto-flag referrers if known to be common spammers. 
 
Will this filter out Semalt, and other crap like Darodar? I keep getting visits from these people, and I do not know how to block them from having crawlers on my site.
 
No, this will not affect semalt or darodar. Other filter techniques are needed (as I point out in my post listed above).
 
it is ok for Creepy Bots but what about Ghost Referrals ? 
 
I did that but I still see them in my dashboard, how long does it take to take them out of there?
 
It will block future visits, but will not get rid of existing data in your reports. You will need to create a segment to do that.
 
We really need to have a way of dealing with Ghost Referrals, this is causing a great deal of work for thousands of webmasters - I imagine Google are desperately trying to get something to work.
 
Did it but I still see NEW referrals from darodar and semalt, referrals from these sites keep growing every day, is that right?
 
+Arthur Radulescu Thanks for posting the screencast!  Only way I could find it!  I have almost a third of my traffic (and bounce rate!) coming from darodar, located it says in Samara, Russia.  Yuck.  :(
 
Ahh, but as I see this won't even help with those... SIGH.  Come on Google!
 
I'm not finding this setting under Reporting and basic
 
I did find a filter under Admin and All Filter
Focus your analysis on region-specific data.
Analyze larger data sets based on your sales regions. Create a filter that includes all of the countries in that region so you can easily focus your analysis on just that traffic.
For example, if you have a North America region that includes the United States and Canada:
Filter Name: North America
Filter Type: Custom > Include
Filter Field: Country
Filter Pattern: United States|Canada

I'm not just getting high bounce rates from Russia but from South America and China as well. I set mine for just United States, so hopefully I will get bounce rate analysis for the US only.
 
+Google Analytics  I have updated the settings based on recommended steps , although I still see unwanted bots in my analytic. Is there an alternative solution?
 
Good Lord....a simple instruction would be nice:
Click on "Admin" at top
then (for me at least) in the "view" column click "view settings"
 
Google could simply de-index Semalt and any company that does this sort of thing as punishment. It would be a good start anyways.
 
same problem here, getting 75% traffic stats from samara russia, 
 
Do you have a technology where i am just going to put the urls in excel file where GA can filter it automatically?
 
so not working. Don't let the spam bots beat you at your own game Google!
 
Bit of a pity that the biggest ones like semalt & social-buttons.com are not excluded, although well known... It would make sense for Google to update the list on a regular basis so the feature stays relevant.
 
Please - add semalt, social-butons, buttons-for-web and best-seo to this list!
 
May as well add sitejabber too
 
When will this be updated? I have a client's site that has 348 visits last month, and only 13 of them were real. This makes GA completely worthless 
 
Is there a viable alternative to Google Analytics that doesn't suffer from this problem? How about Clicky?
 
Turned this feature on but still can see some bots traffic in the report. I think Matthew Anderson should update the list of bots & spiders to stop showing in reports on a regular basis.
 
Checked the GA option for bots about a month ago and it really helped but, now they seem to be getting through again. Does this option have to be re set from time to time?
 
I have a site in the building process, no domain or anything yet. I have 378 hits from all over the world. GA is useless if this is not fixed.
 
All referral spam now just shows up as direct traffic and a location of Not Set so can't filter anything anymore. GA is pretty much useless now.
 
Why isn't this enabled on all accounts by default? It seems dumb to stick it in a random sub menu. If Google blocked all these stupid bots aggressively, maybe the spammers would give it up and get a real job. And by aggressively, I mean bust knee caps!
 
This clearly isn't working. Having set numerous filters up in Analytics and enabled Bot Filtering, I still get too many "not set" visits, which are impossible to deal with.
 
+Enno E. Peter Still works for me?  Try this: https://www.google.nl/?gws_rd=ssl#hl=nl&q=removing-referral-spam-from-google-analytics%EF%BB%BF+viget
 
I haven't used Google Analytics for a while now, because it doesn't show the right data with all this bot activity. I hope this filter works, otherwise we need some other tool than GA.
 
Good news. Death to the bots!
Done every thing from .htaccess (minimal impact) to GA filtering. Fingers crossed I have clean data soon.
 
This tick box is nowhere to be found, in my dashboard I can't find anywhere this reporting view settings, how do we get there?!
 
Doesn't matter, as it does not work. The bots have won and Google ain't doing anything about it. Google Analytics is now useless, as the bots show up as a direct connection with a location of "not set", so you can't even filter them manually anymore. Just delete Google Analytics, as it has become useless.
 
doesn't work.  what am I doing wrong.  still see bots referrals
 
See my post above this does not work anymore and there is no proper way to filter the bots anymore. 
 
This no longer appears to work. Valid hostname filter isn't working all that well either. Google probably should fix this, enable it by default, and keep on top of it. When you create an analytics account and put it on your fresh new site, within a week you "supposedly" have hundreds of hits (almost all spam). This of course seems to ramp up with time. Then the user has to spend hours researching and trying to fix it, just to have it not work all that well. The end user gets frustrated and wants to find something else. Of course, then the user thinks something along the line of "If they can't do this right, they probably aren't doing other things right either". Bad business practice.
 
This doesn't seem to be working at all. :(
 
 
Doesn't seem to be working at all, for me as well :(
 
+Dermot Sweeney  Are your fingers still crossed? Come on Google...  I am guessing that right now there may be 1-2 thousand serious referral assholes... BLOCK THEM ON YOUR END
 
Cool, now please keep this updated as much as possible. It's getting crazy spam out there!
 
This is useless, Google. I've also got 7 filters packed full of domains and the bot traffic still gets in and screws up my results. I'd say GA is pretty much useless now. The only reason these bots do this is because you, Google, spider web server logs. If you stopped doing this there would be no reason for them to pollute our data.

Take away the root cause, don't try and apply sticking plasters. you'll always be one step behind the likes of Semalt.  
 
+Jon Boyes They spam because every so often some noob/idiot goes to their website and buys something. Since it pays, they keep doing it. I do agree though that Google needs to fix their filtering as Analytics is basically useless now days unless you get so much traffic that 200-500 extra hits a day won't do much to the stats. However based on previous experience with Google, it's their way and only their way. Their users never get listened to. The only reason I use any Google products is SOME of them work decently enough. A lot is of poor quality or buggy.
 
+Mark R I can't believe Google can't find a way to stop referral spam. Until they do GA data is very much useless for my domains. Ah well...! 
 
+Jon Boyes They can find a way. There are ways that a user can implement that block most of the spam. However, they usually have to be updated over time as the spammers evolve. It's a royal pain in the backside though to set up such measures. Search for "valid hostname filter". Then search for "htaccess referrer spam". Between the two, most of the spam is blocked. The valid hostname filter blocks the analytics side of the spam (the bots that never visit your site but use your public code to spam you) and the htaccess helps block the actual spam visits.
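
The idea behind a valid hostname filter can be sketched in a few lines of Python. This is an illustration of the logic only, not GA's filter engine. The hostnames, the dictionary shape of a hit and the function name are assumptions made for the example.

```python
# Hostnames that legitimately serve the tracking code (hypothetical values).
VALID_HOSTNAMES = {"example.com", "www.example.com"}


def has_valid_hostname(hit, valid=VALID_HOSTNAMES):
    """Return True only if the hit reports one of the site's own hostnames."""
    return hit.get("hostname", "").lower() in valid


hits = [
    {"hostname": "www.example.com", "referrer": "https://www.google.com/"},
    {"hostname": "(not set)", "referrer": "http://forum.darodar.com/"},
    {"hostname": "some-other-site.test", "referrer": "http://semalt.semalt.com/"},
]
clean = [hit for hit in hits if has_valid_hostname(hit)]
print(len(hits), "hits before,", len(clean), "after")
```

Hits sent straight to the collection endpoint usually carry a missing or foreign hostname, which is why an include-only rule on hostname catches them without listing each spam domain.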