 
Let's say I wanted to build an algorithm that would rank websites/brands based on how important, influential and successful they are. What public, numerical metrics could I use that would be reasonably defensible?

I think Compete, Quantcast (unless Quantified), Google Trends for Websites, Hitwise, Alexa, etc. are all terrible due to accuracy issues (see http://www.seomoz.org/blog/testing-accuracy-visitor-data-alexa-compete-google-trends-quantcast).

Currently, I'm considering:
-> # of searches for the brand name according to Google AdWords
-> # of mentions of the brand name in Google News
-> Domain Authority of root domain via Mozscape
-> # of linking root domains
-> # of subscribers to their RSS feed (via Google Reader)
-> # Facebook fans
-> # of Twitter followers

Any other suggestions? I'd hope for metrics that are relatively indicative of overall influence, not just effort in a given channel (which makes Twitter/Facebook/RSS more questionable).
29 comments
 
If you're scraping the Web anyway, you could track brand name mentions as well as sentiment--not just in Google News.
 
Social media signals could be one measurement. Will be difficult to quantify traffic due to the accuracy issues you pointed out.

Perhaps there could be a crawler for the influential news outlets that searches for news articles or mentions of the website/brand.
 
Social media should give good insight into how influential sites and brands are: not only the number of followers and fans but also the interaction/engagement level (how many retweets, likes, etc.). The same goes for the number of comments on articles and the sharing of social signals on the site.
 
I think it depends on what kinds of businesses the tool is supposed to measure. Some industries don't use social media.
 
Reiterating +Sandro Lonardi's comment on measuring engagement. It's too easy to buy fans/followers. And even brands that don't outright buy them still artificially inflate the counts in different ways.
 
I am working on a system to judge ROI on interactive programs for our global industrial clients. My new system doesn't add a lot of categories to what you have above, but it assigns a weight to each kind of interaction. In other words, for us, social media followers are weighted as 1, whereas, I am weighting visits to a product landing page as 10 and visitors to the page on the main site where they get the technical information for a purchase as 20. So my system is trying to calculate the value based on how far into the sales funnel you are. That said, I can't wait for your system!
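A minimal sketch of that kind of funnel weighting, using the 1 / 10 / 20 weights mentioned above. The interaction names and the `funnel_score` function are made up for illustration; the actual system is not described in enough detail to reproduce.

```python
# Hypothetical interaction names; the weights (1, 10, 20) come from the comment above.
FUNNEL_WEIGHTS = {
    "social_follower": 1,
    "landing_page_visit": 10,
    "technical_page_visit": 20,
}


def funnel_score(counts, weights=FUNNEL_WEIGHTS):
    """Sum each interaction count multiplied by its funnel weight."""
    unknown = set(counts) - set(weights)
    if unknown:
        raise ValueError(f"no weight defined for: {sorted(unknown)}")
    return sum(weights[kind] * n for kind, n in counts.items())


example = {"social_follower": 500, "landing_page_visit": 40, "technical_page_visit": 12}
print(funnel_score(example))  # 500*1 + 40*10 + 12*20 = 1140
```

The point of the design is that interactions deeper in the sales funnel dominate the score even when they are far less frequent than follows.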
 
True, +Graham Stanton, but engagement is much harder to fake. You can have lots of followers, but if none of them is interacting, commenting or sharing, it means the level of influence/success is low. The algorithm should be something quite similar to Google PageRank: the more people/sites refer to a resource, the more valid that resource probably is. The more people retweet or comment, the more interesting/influential it is. Something between Google PR and Klout?
 
-> # of Pinterest followers
-> # of Google+ circles / +1's
-> # of YouTube channel subscribers
-> # of Yelp reviews (high # means they have a lot of reach)
-> Brand mentions on comscore.com (think about it for a second)
 
I don't think any of them are reliable at all. They are all based around # of X that users do.

Should quality be measured by quantity? I mean, if you were to use the same algorithm to figure out the quality of different works of art, I'm sure lolcats pictures would be "higher quality" and more "influential" than say works by Michelangelo, or at least in the same quality spectrum.

From this point on it just goes into much deeper philosophical ideas and I'm going to stop before I get lost. I don't have a solution, I don't know that there is a solution, just something that always troubles me when I see different ranking algorithms.
 
You should add offline factors. Local media.
 
Brand recognition plays an important role in the purchase and decision-making process, and a company's voice gets louder with a bigger marketing budget. How should a search engine relate to companies with large marketing budgets?


The algorithm should be able to prevent companies with large budgets from being able to rank higher just because they're able to implement costly marketing strategies and tactics that other companies can not afford.

People's (buyers') behavior and judgment are based on their PERCEPTION; they don't see reality itself. They interpret what they see and call it reality.
A "result-based algorithm" ranking companies based on their (bottom line) results, documented ACHIEVED & VERIFIED by a credible third party, would be amazing!

It could be called 'The Hard Facts Algorithm' ;-)
 
The more sources you can aggregate, the better your outcome, despite the flaws of each individual source.

You've already got great inlink data from opensiteexplorer; that will help tons on the website influence ranking. I would also free-ride off Google and Bing, either by pulling in data for each site from Keyword Spy or by scraping the search engines yourself.
 
+Rand Fishkin An "engagement" metric. Like number and percentage of actions due to a post/tweet/share etc.

For example, for every tweet;
- number/percentage of replies
- number/percentage of favorites
- number/percentage of re-tweets
--- Averaged across channels over time :-)
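A rough sketch of that engagement metric, assuming you already have per-post counts of replies, favorites and retweets plus the account's follower count. The function names and the choice of follower count as the denominator are assumptions, not something specified above.

```python
from statistics import mean


def engagement_rates(post, audience_size):
    """Return per-action rates (action count / audience size) for one post."""
    if audience_size <= 0:
        raise ValueError("audience_size must be positive")
    return {action: count / audience_size for action, count in post.items()}


def average_engagement(posts, audience_size):
    """Average each action's rate across a list of posts."""
    if not posts:
        return {}
    rates = [engagement_rates(p, audience_size) for p in posts]
    actions = rates[0].keys()
    # Actions missing from a post count as zero engagement for that post.
    return {a: mean(r.get(a, 0.0) for r in rates) for a in actions}


posts = [
    {"replies": 10, "favorites": 20, "retweets": 30},
    {"replies": 30, "favorites": 0, "retweets": 10},
]
print(average_engagement(posts, audience_size=1000))
```

Running the same function per channel and per time window, then averaging the results, would give the "across channels over time" figure.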
 
Is number of linking root domains really useful when you're already using Domain Authority? I don't think it would make it more accurate but would leave it more open to the influence of spam at the lower end of the scale.
 
The SocialBro app could be an inspiring example and useful for the algorithm.
 
Hi Rand

You are facing some of the challenges Google is facing. I believe you need to use Twitter/Facebook/RSS data, but find a way to ignore or at least turn down their effect in markets where they are not widely used. +Tom Anthony knows how to map the social graph with the web graph.

Otherwise you've got a fine set of metrics, but you need to combine them with relative scores, just like the pie chart of the "SEOmoz Broad Algorithm" from 2011.
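One way to read "relative scores" is to rescale each metric across the brands being compared and then take a weighted average. The sketch below assumes log scaling followed by min-max normalization; the metric names, weights and function names are placeholders, not anything from the SEOmoz pie chart.

```python
import math


def relative_scores(values):
    """Log-scale raw counts, then rescale to 0..1 across the brands compared."""
    logged = {brand: math.log1p(v) for brand, v in values.items()}
    lo, hi = min(logged.values()), max(logged.values())
    if hi == lo:
        # Every brand has the same value, so the metric carries no signal.
        return {brand: 0.0 for brand in logged}
    return {brand: (x - lo) / (hi - lo) for brand, x in logged.items()}


def composite(metrics, weights):
    """Weighted average of per-metric relative scores for each brand."""
    total_weight = sum(weights.values())
    scores = {}
    for metric, values in metrics.items():
        for brand, rel in relative_scores(values).items():
            scores[brand] = scores.get(brand, 0.0) + weights[metric] * rel / total_weight
    return scores


# Invented numbers, purely to show the call shape.
metrics = {
    "brand_searches": {"brand_a": 1200, "brand_b": 90000},
    "linking_root_domains": {"brand_a": 300, "brand_b": 4500},
}
weights = {"brand_searches": 2.0, "linking_root_domains": 1.0}
print(composite(metrics, weights))
```

The log step is there because counts like followers or linking domains tend to span several orders of magnitude, so a raw min-max rescale would let one huge brand flatten everyone else to near zero.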


 
The # of Twitter followers and Facebook fans means nothing on its own; both can easily be acquired without having any influence. To truly measure the influence of a brand you need to measure engagement AND sentiment. I think +Dan Shure has it spot on with measurement of engagement, but if you only measure this you don't take into account the possibility that engagement is being triggered by negative feelings about the brand, i.e. lots of people are complaining about them. I would therefore also use NLP techniques to measure positive and negative sentiment. See here for more details on how this can be done: http://mashable.com/2010/04/19/sentiment-analysis/
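To make the sentiment idea concrete, here is a deliberately naive word-list sketch. The word lists and function names are invented for illustration; real sentiment analysis would need a trained model or a published lexicon, and would have to handle negation, sarcasm and so on.

```python
import re

# Toy word lists, invented for illustration only.
POSITIVE = {"love", "great", "recommend", "excellent"}
NEGATIVE = {"hate", "awful", "broken", "refund"}


def sentiment(text):
    """Return +1, -1, or 0 depending on which word list matches more often."""
    words = re.findall(r"[a-z']+", text.lower())
    balance = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (balance > 0) - (balance < 0)


def net_sentiment(mentions):
    """Share of positive mentions minus share of negative ones, in -1..1."""
    if not mentions:
        return 0.0
    return sum(sentiment(m) for m in mentions) / len(mentions)


mentions = [
    "I love this brand",
    "Awful support, I want a refund",
    "Great product",
    "It arrived today",
]
print(net_sentiment(mentions))  # (1 - 1 + 1 + 0) / 4 = 0.25
```

A net score like this could then scale an engagement figure, so that a wave of complaints does not count as positive influence.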
 
Why limit yourself to public data? Google is almost certainly paying for access to private databases at this point. If you want to replicate Google, you have to go where they go.
 
Hi Rand... you have some great suggestions when it comes to importance, influence and success, which were the three things you wanted to measure. Wouldn't it be great, though, to have a measure of quality?

Although many people would argue that quality is purely subjective (because on the web quality always shades off into 'aesthetically pleasing') I think there are certain things which we can agree on and even measure. For a start, how about numbers of paid-for links as a measure of poor quality? As a small business owner who is trying to do the right things on the web, it infuriates me that competitors buy their influence from crappy low quality directories and link schemes, Panda etc notwithstanding.

Anyone else agree that 'quality' should be a measure of success, influence and importance?
 
The problem, as I see it, is one of public honesty. Links can be bought, as can FB "Likes", G+1 votes, and twitter followers/retweets. Adding link modifiers like nofollow can work for and against quality recommendations as it is a generic default and not content specific.

Basically, if the public can add a citation, it can be "gamed". (I just saw an ad on LinkedIn: "1000 facebook likes for Rs 500".)
The larger the budgets, the more marketing and social media exposure can be utilized, not always the best for authority.
The only thing that cannot be faked is the relevance between linking and linked pages.

Set up a system that counts the links on RELEVANT pages ONLY, with the degree of relevance determining the PR assigned, instead of computing it from all links, as PageRank does (or did).

Ignore nofollow, scrap the old PageRank numbers and recalculate links (PR) based on relevance only. Once you have the indexing in hand, modify it by the CTR and bounce rates. Use these numbers to indicate authority, but do not apply this metric to the SERPs, as authority is not always a measure of relevance.
One could factor in minor factors like the length of registration past and future, and add modifiers like actual physical addresses, phone numbers and listings in local directories.

If this sounds like the new PageRank, so be it.
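A small sketch of the relevance-only link count described above. It assumes "relevance" is approximated by cosine similarity between bag-of-words term counts of the linking page and the linked page; that measure, the 0.1 threshold and all function names are my assumptions, and the CTR / bounce-rate adjustment is left out.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase bag-of-words term counts for a page's text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def relevance_weighted_links(target_text, linking_texts, threshold=0.1):
    """Sum the relevance of each linking page, ignoring pages below the threshold."""
    target = term_counts(target_text)
    sims = (cosine(target, term_counts(t)) for t in linking_texts)
    return sum(s for s in sims if s >= threshold)


target = "seo tools for link analysis"
linking_pages = [
    "a review of seo tools and link analysis software",
    "funny cat pictures",
]
print(relevance_weighted_links(target, linking_pages))
```

Here the unrelated page contributes nothing, while the on-topic page contributes in proportion to how similar its text is to the target.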
 
I don't think anyone should give too much weight to FB/Twitter followers, because then you have to weed out all the spambots and fake accounts to get an accurate number. Also, for some topics you won't have this data because the topic is very unpopular - who would follow "Herpes" on FB or Twitter? :)
 
Thanks for all the great feedback everyone. Going to work on this and see what we come up with.
 
It is difficult to rank on the number of Facebook, Twitter and Google+ followers, as these can be manipulated. It would be better to track retweets and reshares ("viral lift", as AddThis puts it). Actual interaction with their pages on the relevant social networks would be good: posts on their walls, etc. Citations across the web would be a good ranking signal, especially from sources such as Trustpilot, as would on-site citations marked up with schema.
 
Thanks, it is an interesting read. I am not surprised that Google's products carried the biggest impact, as they don't want to rely on Twitter and Facebook forever, but it still leaves it open to manipulation.
 
Late to the party, but here's my $0.02:

-> # of searches for the brand name according to Google AdWords
Yes, definitely
-> # of mentions of the brand name in Google News
Hmmm...this could skew. For instance, what if you're BP and suddenly your rig is pouring millions of gallons of oil into the Gulf?
-> Domain Authority of root domain via Mozscape
Of course :-)
-> # of linking root domains
Yes
-> # of subscribers to their RSS feed (via Google Reader)
Yes
-> # Facebook fans
-> # of Twitter followers
I shy away from social media metrics. They can be gamed. Sentiment analysis on Tweets is hard. The sample may not be representative. Etc, etc.

My other thoughts:
-Hitwise data (it's a little less crappy than Quantcast, et al)
-If the company is public, use their financial data: www.google.com/finance
-Research how other companies measure brands; for instance, Interbrand tracks the "Top 100 Brands" every year and details their methodology here: http://www.interbrand.com/en/best-global-brands/best-global-brands-methodology/Overview.aspx
For local:
-BBB ratings: http://www.bbb.org/us/Find-Business-Reviews/