More OpenRefine recipes...
Finding (Nearly) Duplicate Items in a Data Column
Suppose you have a dataset containing a list of Twitter updates, and you are looking for tweets that are retweets or modified retweets of the same original tweet. The OpenRefine Duplicate custom facet will identify different row items in that column that are exact duplicates of each other, but what about when they just don’t quite match: for example, an original tweet and its appearance in an RT (where the retweet string contains RT and the name...
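The excerpt cuts off before the recipe itself, so the following is not the post's method. It is a minimal Python sketch of one way to approach the same problem outside OpenRefine, loosely modelled on the idea behind fingerprint key-collision clustering: strip the retweet marker, normalise what is left, and group tweets whose normalised form matches. The `RT_PREFIX` pattern and the function names `fingerprint` and `cluster_near_duplicates` are hypothetical, and the sample tweets are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: retweets start with one or more "RT @name:" / "MT @name:" markers.
RT_PREFIX = re.compile(r"^(?:(?:RT|MT)\s+@\w+:?\s*)+", re.IGNORECASE)


def fingerprint(tweet: str) -> str:
    """Reduce a tweet to a normalised key so near-duplicates collide."""
    text = RT_PREFIX.sub("", tweet.strip())   # drop leading retweet markers
    text = text.lower()                       # ignore case differences
    text = re.sub(r"[^\w\s]", "", text)       # drop punctuation
    tokens = sorted(set(text.split()))        # unique tokens, order-insensitive
    return " ".join(tokens)


def cluster_near_duplicates(tweets):
    """Group tweets sharing a fingerprint; keep only groups with 2+ members."""
    groups = defaultdict(list)
    for tweet in tweets:
        groups[fingerprint(tweet)].append(tweet)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    sample = [
        "OpenRefine is great for cleaning data",
        "RT @alice: OpenRefine is great for cleaning data!",
        "Something else entirely",
    ]
    for group in cluster_near_duplicates(sample):
        print(group)
```

This only catches retweets whose text survives intact apart from the prefix, case and punctuation. A modified retweet that adds a comment or truncates the original would need a fuzzier comparison, such as an edit-distance or n-gram based measure.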
- Like it. Did you mention your other recipe to get the data in the first place? http://blog.ouseful.info/2012/10/02/grabbing-twitter-search-results-into-google-refine-and-exporting-conversations-into-gephi/ (Nov 15, 2012)