From counting citations to measuring usage: People got into the habit of counting how many papers a researcher produces. This is silly. Nobody counts how many novels a writer has produced as a way to measure their importance. So people have started counting citations instead. This is better, because it is a bit less obvious how you might game it, but it is still silly because 90% of citations are shallow: most authors haven't even read the paper they are citing. We tend to cite famous authors and famous venues in the hope that some of the prestige will reflect on us. So, that's the current state of the art. But why stop there? We have the technology to measure how a cited paper is actually used. If you merely cite a paper "in passing", that is rather easy for a computer to detect. Some citations are more significant: for example, the citing paper may be an extension of the cited paper. Fairly elementary machine learning techniques should suffice to measure the impact of your papers based on how much the papers that follow build on your results. Why isn't it done?
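
To make "fairly elementary machine learning" a bit more concrete, here is a minimal sketch of the kind of classifier I have in mind: label the sentence surrounding each citation as "in passing" versus "builds on". The handful of training sentences and labels below are invented for illustration; a real system would need a sizable annotated corpus of citation contexts.

```python
# Minimal sketch: classify citation contexts as "in passing" vs. "builds on".
# The training sentences and labels are invented; this is an illustration,
# not a validated model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

contexts = [
    "Related approaches include [12], [13] and [14].",        # in passing
    "See [7] for a survey of earlier work.",                   # in passing
    "We extend the index of [3] to support updates.",          # builds on
    "Our proof adapts the decomposition technique of [9].",    # builds on
]
labels = ["passing", "passing", "builds", "builds"]

# Bag-of-words (plus bigrams) and a linear classifier: nothing fancy needed.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(contexts, labels)
print(model.predict(["We build directly on the algorithm of [5]."]))
```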

Update: I wrote a blog post based on this G+ post: http://lemire.me/blog/archives/2012/03/20/from-counting-citations-to-measuring-usage-help-needed/
 
Academia should steal a page from Google's book. It will always be an arms race between metric designers and metric gamers. We need a much more agile and open marketplace of metric designers.
 
As long as you're looking for ways to improve the process, throw in a ranking boost for authors who publish the results of efforts to reproduce prior findings. We don't want to get (any more) over-focused on original content.
 
+Philippe Beaudoin I agree. But we could actually use Google to solve this problem. After all, it is a search problem. You want the most influential research paper on topic "X". Merely looking at citations is like merely looking at inbound links. It is not bad, but you can do better.
 
One argument against is that putting a comparatively opaque layer of machine learning in there that mysteriously spits out a reputation score is a bit frightening, especially if it directly impacts people's careers. At least citation counts (and most related metrics) are reasonably concrete. But anyhow, I'm sure someone out there is itching to create a "Klout for Academics" startup.
 
+Christopher Batty Who says you can have only one metric? We could/should have competing metrics. And anyone who is ranking researchers based on some "Klout score" is a moron. Period.
 
I see what you're getting at, Daniel. There would have to be some content analysis. In combination with other metrics like those being developed at Alt Metrics (http://altmetrics.org/tools/), e.g. Total Impact (http://total-impact.org/), you'd get a much more "rounded out" picture. There's also the metric "how many startups did this paper generate" (referring to Bollen et al., "Modeling Public Mood and Emotion: Twitter Sentiment and Socio-Economic Phenomena").
 
The point is valid. But there has been work on understanding (for example) the "flow" of research as indicated by how ideas are cited and percolate. It's a little tricky because you need sufficient understanding of "research English" to parse the way in which a reference is being made. A related example of this, using blogs and news sources (instead of research papers) was done in KDD 2009: http://dl.acm.org/citation.cfm?id=1557077
 
Anyone who's published knows there are two types of citations, and it's easy to tell them apart. When I was in grad school and had to read a new paper, I'd first collect all the works cited in more than one section, excluding the intro, previous work, and future work. Basically, it becomes clear that the new work is based on whatever it cites in the actual meat of the paper, and more than once. If a cited work only ever appears in a list with more than 3 other works, that's also a clue it's not really a building block of the new work.
With these guidelines, an author is forced to pick which papers are cited prominently, or otherwise degrade their work with irrelevant tangential remarks.
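
That reading strategy is also easy enough to automate. Here is a toy sketch; the section names, the input format, and the thresholds are just made up to illustrate the idea:

```python
# Toy sketch of the rule of thumb above; the section names and input format
# are invented. A reference cited in two or more "meat" sections, and not
# only inside long bracketed citation lists, is flagged as a likely building block.
EXCLUDED_SECTIONS = {"introduction", "related work", "previous work", "future work"}

def building_blocks(citations):
    """citations: list of (ref_key, section_name, refs_in_same_bracket) tuples."""
    sections, only_in_long_lists = {}, {}
    for ref, section, bracket_size in citations:
        if section.lower() in EXCLUDED_SECTIONS:
            continue
        sections.setdefault(ref, set()).add(section)
        only_in_long_lists[ref] = only_in_long_lists.get(ref, True) and bracket_size > 3
    return {ref for ref, secs in sections.items()
            if len(secs) > 1 and not only_in_long_lists[ref]}

print(building_blocks([
    ("Smith05", "Method", 1),
    ("Smith05", "Experiments", 1),
    ("Lee10", "Related Work", 6),
]))  # -> {'Smith05'}
```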
 
I bet that someone will do this soon enough. I also bet that very simple technical criteria would already give a pretty good picture of how much a citation is used in a paper. E.g., every time a citation appears in a paragraph that has k citations in it, it gets a score of 1/k.
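
For what it's worth, the 1/k rule is only a few lines of code. A toy sketch, assuming each paragraph has already been reduced to the list of citation keys it contains:

```python
from collections import defaultdict

def citation_usage_scores(paragraphs):
    """paragraphs: one list of citation keys per paragraph of the citing paper."""
    scores = defaultdict(float)
    for cites in paragraphs:
        k = len(cites)
        for ref in cites:
            scores[ref] += 1.0 / k  # a paragraph with k citations gives each of them 1/k
    return dict(scores)

# A citation that gets a paragraph of discussion to itself scores 1;
# one lumped into a three-citation list scores only 1/3.
print(citation_usage_scores([["Smith05"], ["Smith05", "Jones08", "Lee10"]]))
# -> {'Smith05': 1.33..., 'Jones08': 0.33..., 'Lee10': 0.33...}
```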
 
"Fairly elementary machine learning techniques should suffice to measure the impact of your papers based on how much following papers build on your results."

+Daniel Lemire Just out of curiosity, what kind of machine learning techniques do you have in mind?
 
Isn't machine learning completely opaque to people who are not versed in it?

Counting citations, flawed as it is, involves little subjectivity and is transparent; someone in psychology or medicine will easily understand that it is just a tally of how many times a work is mentioned. This makes it easily verifiable and repeatable.

On the other hand, if you have something similar to "we plug articles mentioning you in this random forest and the classifier outputs a number," it's completely opaque to someone who is not versed in machine learning; it's not much better than saying "we plugged numbers into some proprietary software, so they must be right."

However, since the network of citations is a digraph, we could differentiate between "shallow" citations and "deep" citations. Suppose that there are four papers {A, B, C, D} with links (B -> A), (C -> B), (C -> A) and (D -> B); in this case, while both A and B have an indegree of 2, we consider A more important because C found it worthwhile to also cite A (through B), essentially curating the citation. However, I have no idea if that would really work. :)
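
To make that concrete, here is a toy sketch of one possible way to formalize it; the "deep citation" rule below (count the citers of a paper that also cite another citer of that paper) is my own rough formalization, not a tested metric:

```python
# Toy digraph from the example above; edges point from citing paper to cited paper.
edges = {"A": set(), "B": {"A"}, "C": {"B", "A"}, "D": {"B"}}

def deep_citations(paper):
    """Count citers of `paper` that also cite at least one other citer of `paper`."""
    citers = {p for p, cited in edges.items() if paper in cited}
    return sum(1 for c in citers if edges[c] & (citers - {c}))

for paper in ("A", "B"):
    indegree = sum(paper in cited for cited in edges.values())
    print(paper, "indegree:", indegree, "deep citations:", deep_citations(paper))
# A and B both have indegree 2, but only A gets a "deep" citation:
# C cites A directly and also indirectly through B.
```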

I think, though, that the more we move away from proxy metrics like citation counts and impact factors (I know a researcher who was asked by a journal editor to cite more papers from that same journal as a condition of acceptance), the better off we'll be, focusing on science instead of gaming the system.
 
+Daniel Lemire +Christopher Batty I agree with both of you here. Multiple competing algorithms are what I was hinting at with an open marketplace. If there is a single algorithm, even if it's designed by consensus and with good intentions, it will be gamed when deployed at a large enough scale. As we slowly move science out of its historical ivory towers, the scalability of our ranking algorithms will become more and more of a concern.

Now, for an open marketplace to exist we need customers. That is, we need people who care enough about the quality of the ranking algorithms to keep ranking providers on the edge of their seats. In the case of Google's web search, the customers are obvious: the entire world has to sift through the net every day. In the case of research papers, it's a little less obvious... Historically, the customer has been the government distributing grants, but it is a notoriously bad customer. Maybe crowdsourced research funding could be a game changer here?
 
Daniel: the reason it's not done is simple: there is a semantic cliff. Simple measures based on syntax (like counting links) can be computed efficiently and understood easily, and even though everyone agrees they are a (more or less rough) approximation, it's pretty clear what they are doing. More sophisticated techniques, including some involving learning (some have already been mentioned in other comments), may add a degree of value. But the original goal was to evaluate the influence of ideas; as soon as you try to do that by analyzing content (substitute "meaning" or other like terms here), you find out that (a) it's extremely hard to formalize, and (b) the payoff is uncertain, if it exists at all. Why has IR had such a hard time moving beyond bag-of-words? Because most things that have been tried add quite a bit of complexity (they cannot scale to Web size) while not really giving a commensurate improvement.
That said, I agree that we should try more sophisticated measures (ideally several, put together in an ensemble; if they are diverse enough, they will improve over any individual measure). I'd love to move away from citation counts, but it's gonna be hard.
 
+Jeff Erickson one might argue that the unpopularity stems not from the computer rankings themselves, but from the idea that there isn't a head-to-head playoff system for sports.
 
+Jean-François Im I think editors of journals who ask for gratuitous and self-serving citations should be named and shamed. I realize it's not your place to out your colleague, and there's a big gray area between suggesting relevant works that really should be cited and asking for greater numbers of citations to arbitrary and unspecified papers in the same journal, but allowing this sort of behavior to stay under covers is implicitly condoning it.
 
+Suresh Venkatasubramanian Yeah, that's part of it, but the fact that nobody understands how the BCS system works, even though lots of rabid sports fans are huge statistics nerds, is at least as significant.
 
+David Eppstein I completely agree that they should be. Unfortunately, I have no idea what the name of the journal is; all I remember is that it is in the social sciences and has, unsurprisingly, a relatively high IF.

The problem with the system as it stands is that it encourages gaming: publishing in high-IF journals is considered prestigious by some, which encourages all participants to keep playing the game and accumulating citations in an academic Ponzi scheme. From what I understand, bibliometrics are used in performance reviews at some universities, providing an added incentive to game the system; perhaps the solution is for everyone in a department to legally change their name to Ike Antkare. :)