Open Data and Content Mining
Talk at Sydney University by
Peter Murray-Rust
University of Cambridge and Open Knowledge Foundation

Seminar Room, Level 2 Fisher Library, 2 pm, Wednesday October 31st, 2012

The publicly funded research in the Scientific, Technical and Medical (STM) literature contains billions of dollars of unused value. Most scientific articles contain names, numbers, places, chemicals, organisms, graphs, tables, etc. which can be extracted and re-used. This leads to better science, new information products, startup companies, better information for policy makers and much more, the value of which I have estimated at "low billions" for chemistry alone. For STM as a whole, especially medicine, the figure is much higher. Yet this value is currently unavailable for two reasons: (a) publishing uses PDF, which is a very poor way of conveying the information; (b) publishers actively prevent mining of the content to preserve their revenues.

We must change this, and soon, through (a) evangelism of the opportunity, (b) lobbying for our rights and (c) building the next generation of tools. I shall cover all of these, including our Manifesto on Open Content Mining and demonstrations of AMI2, a weakly intelligent amanuensis for the scientist (based initially on understanding PDFs). This offers great opportunities for citizens in general to liberate this vast resource of valuable information.

All welcome, no RSVP needed
Host: Matthew Todd, School of Chemistry
