Profile

Christopher Southan
Works at TW2Informatics
Lives in Sweden
75 followers|387,246 views

Stream

Christopher Southan

Shared publicly  - 
 
I thought the abstract (pasted below the bad news) was pretty good actually, but you can't win 'em all, can y'er (so I won't be going to Fulda :). Any offers for a seminar? (Manuscript in gestation, so please don't scoop me.)

"Due to the large amount of oral contributions some reassignments became unavoidable. Therefore we ask for your kind understanding that your contribution "Challenges of curating approved drugs: will the correct structures please stand up?" (Reference code: 5179-12120) cannot be accepted as a lecture."   GDCh congress team

******************
Challenges of curating approved drugs: will the correct structures please stand up?

The molecular structures of approved human medicines represent the crown jewels from approximately half a century of global drug R&D. Paradoxically, however, there is neither a consensus “gold standard” set of structural representations, nor even agreement on the total count. The cheminformatic problem was highlighted in a 2009 comparison of database subsets of approved drugs that recorded only 807 exact structures in common (PMID 20298516). In this work we have explored analogous intersects that can now be generated within PubChem. For example, selecting DrugBank “approved” maps 1533 substance submissions (SIDs) to 1504 compound identifiers (CIDs). Performing CID intersects with other sources expected to capture the same drugs shows a stepwise drop-off. Starting with ChEMBL produces 1358 matches in common; adding in the FDA Substance Registration System drops this to 1028, the Therapeutic Target Database to 808 and “INN or USAN” to 745. Thus adding in each source reduces the consensus by ~10% and the final 5-way intersect is only ~50% of what we might expect.
We have generated other metrics to dissect out some of the contributing factors. For example, each of the 754 consensus CIDs had, on average, 93 submitters. This popularity for drugs is unsurprising but it inevitably introduces representational noise. We explored this problem via the PubChem “same connectivity” operator. This established that each of the 754 drugs has, on average, 59 variants (i.e. different CIDs). We dissected this “structural multiplexing” for the representative case of Taxol/paclitaxel. All five sources had chosen CID 36314 (i.e. it was in the 754), as did 194 other sources. However, it is related to 135 “same connectivity” CIDs from no fewer than 694 SIDs (although some were split mixtures). Further results will be shown that indicate the major causes of multiplexing. These include alternative salt forms, stereo enumerations and E/Z resolution, as well as virtually deuterated drugs from patents.
Our results should not be taken as a criticism of valuable sources that curate drugs. However, the discordance between those examined above (and others inside or outside PubChem) highlights the challenges of database representation. The problems apply not just to approved medicines but to all pharmacologically active chemical structures. We have also observed an increase in drug structure multiplexing in PubChem over the last couple of years that is at least partly due to expanding patent extraction and vendor submissions. This has led to the IUPHAR/BPS Guide to PHARMACOLOGY checking CID consensus sets and, where appropriate, adding cross-pointers to alternative structures as part of our drug curation process (PMID 24234439). The continued need for this strategy indicates that definitive lists will remain elusive until a) there is more collective engagement for standardisation and b) pharmaceutical companies move beyond just paying lip service to transparency by provenancing all their clinically tested drug structures in public databases.
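The stepwise CID intersect described in the abstract can be illustrated with plain set operations. This is only a minimal sketch: the function name `stepwise_intersect`, the source labels and the toy CID values are all hypothetical, and it does not reproduce the actual PubChem queries or the counts reported above.

```python
def stepwise_intersect(sources):
    """Intersect CID sets in the given order and record the consensus size at each step.

    `sources` is an ordered list of (source_name, set_of_cids) pairs.
    Returns (steps, consensus), where steps is a list of
    (source_name, number_of_cids_still_in_common).
    """
    consensus = None
    steps = []
    for name, cids in sources:
        # The first source seeds the consensus; each later source can only shrink it.
        consensus = set(cids) if consensus is None else consensus & set(cids)
        steps.append((name, len(consensus)))
    return steps, consensus


# Toy data only: made-up identifiers standing in for per-source CID lists.
toy_sources = [
    ("source_A", {1, 2, 3, 4, 5, 6}),
    ("source_B", {2, 3, 4, 5, 6, 7}),
    ("source_C", {3, 4, 5, 9}),
]

steps, consensus = stepwise_intersect(toy_sources)
for name, count in steps:
    print(f"after adding {name}: {count} CIDs in common")
```

With real data, each set would hold the CIDs retrieved for one source (for example, a downloaded CID list per depositor), and the order of the list determines the order of the drop-off.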
Egon Willighagen's profile photoFrank Oellien's profile photo
2 comments
 
Unfortunately, I could not participate this time in the ranking and selection process because I am completely busy with the EuCO-CC organization. Otherwise, you would have received a high ranking from me. :-)

Christopher Southan

Shared publicly  - 
 
Some companies really make it tough to find out what their clinical leads actually are.  Trying to unravel the Merck BACE1 Alzheimer's candidates

Christopher Southan

Shared publicly  - 
 
Something for protease aficionados and a chance to increase the exposure of your own (human) protease medicinal chemistry efforts

http://cdsouthan.blogspot.se/2015/07/proteases-hard-days-curation.html

Christopher Southan

Shared publicly  - 
 
Molecular mappings to complement the Cell review paper on the MLP achievements

Christopher Southan

Shared publicly  - 
 
Post-publication chemical and protein identifier links.

Christopher Southan

Shared publicly  - 
 
An update on the peculiar surfacing patterns of  links between clinical candidates and their molecular structures not declared in journal papers.
Have him in circles
75 people
Niklas Blomberg's profile photo
Kerstin Forsberg's profile photo
Antony Williams's profile photo
Vilnis Liepins's profile photo
Stefano Fuschetto's profile photo
GDCh-CIC Vorstand's profile photo
Pratim Chakraborty's profile photo
Abhik Seal's profile photo
George Papadatos's profile photo

Christopher Southan

Shared publicly  - 
 
Enjoyed meeting contacts at ACS Boston. My slide sets:

Resolving cryptic needles to molecular structures: The GtoPdb experience
http://www.slideshare.net/cdsouthan/southan-needles-acs

Multiplexing analysis of 1000 approved drugs in PubChem
http://www.slideshare.net/cdsouthan/multiplexing-analysis-of-approved

Opening up and connecting antimalarial data: Progress with caveats
http://www.slideshare.net/cdsouthan/southan-malaria-acs

Causes and consequences of automated extraction of patent-specified virtual deuterated drugs
http://www.slideshare.net/cdsouthan/causes-and-consequences-of-automated-extraction-of-patentspecified-virtual-deuterated-drugs

Christopher Southan

Shared publicly  - 
 
Chemistry can be easy on the eye sometimes

Christopher Southan

Shared publicly  - 
 
Seminar: Analysing curated protein drug targets in @GuidetoPHARM, tomorrow, June 11, 11:15, Uni. Basel, Dept of Biomedicine. I'd be grateful for forwarding within Roche, Novartis and other Basel institutes.


Christopher Southan

Shared publicly  - 
 
Seminar: Analyzing curated protein drug targets in @GuidetoPHARM, June 11, 11:15, Uni. Basel, Dept of Biomedicine (circulation within Roche, Novartis and other local institutions appreciated)

Christopher Southan

commented on a post on Blogger.
Shared publicly  - 
 
Agree with your point about NatRevDiseasePrimers.  As we might say  "Nice graphics - but shame about the total absence of linking"   (in or out)
A P.falciparum isoprenoid biosynthesis pathway (WP2918). Event 1 The Nature Publishing Group (NPG) has launched a new journal, which you probably did not miss. There is founding editorial titles From mechanisms to management ...

Christopher Southan

Shared publicly  - 
 
It's an informatics jungle out there ...
Places
Currently
Sweden
Links
Contributor to
Work
Employment
  • TW2Informatics
    Principal, present
Basic Information
Gender
Male
Christopher Southan's +1's are the things they like, agree with, or want to recommend.
FlightAware Flight Tracker – Android Apps on Google Play
market.android.com

Free, live flight tracker and flight status app from FlightAware for Android! This app allows you to track the real-time flight status and se

Maps
market.android.com

The Google Maps app for Android phones and tablets makes navigating your world faster and easier. Find the best spots in town and the inform

Journal of Cheminformatics | Full text | Quantitative assessment of the ...
www.jcheminf.com

Since 2004 public cheminformatic databases and their collective functionality for exploring relationships between compounds, protein sequenc

Journal of Cheminformatics | Article Statistics |
www.jcheminf.com

Related literature. Cited by; Google blog search. Other articles by authors. on Google Scholar. Southan C · Várkonyi P · Muresan S. on PubMe

A tale of two drug targets: the evolutionary history of BACE1 and BACE2
www.frontiersin.org

The beta amyloid (APP) cleaving enzyme (BACE1) has been a drug target for Alzheimer's Disease (AD) since 1999 with lead inhibitors now enter

Listen to BBC
market.android.com

Listen to BBC Enjoy unlimited accesses to most of the BBC radio stations, including International Radio 1, Radio 1x, Radio 2, Radio 3, Radio

T-COFFEE Multiple Sequence Alignment Server
tcoffee.crg.cat

T-Coffee is a multiple sequence alignment server. It can align Protein, DNA and RNA sequences. You can use T-Coffee to align sequences or to

Journal of Cheminformatics | Abstract | Extracting and connecting chemic...
www.jcheminf.com

Exploring bioactive chemistry requires navigating between structures and data from a variety of text-based sources. While PubChem currently

Brothers In Arms® 2 Free+
market.android.com

Now you can play the highly acclaimed Brothers in Arms series for FREE! Prepare to step onto the most intense and explosive battlefields of

Approved Drugs of 2012 in PubChem
dx.doi.org


ELIXIR Database Provider 2009 Survey Report Appendix
dx.doi.org


NCATS and AZ/MRC repurposing cpds, PubChem CIDs and patent links
dx.doi.org


Halo 3 (Full Campaign and Cutscenes)
www.youtube.com

The trilogy continues as Master Chief arrives on Earth to finish the interstellar war he started fighting two games ago (and some fictional

The ChEMBL-og - Open Data For Drug Discovery: Paper: Fuelling Open-Sourc...
chembl.blogspot.com

As it was announced last year, some of our collaborators in GSK Tres Cantos just published the results of a large antimycobacterial phenotyp

Improving Online Chemistry One Structure at a Time « ChemConnector Blog
www.chemconnector.com

Last week I was in the United Kingdom for numerous meetings and at the end of the week struggled to drive north to Macclesfield to the Astra

http://www.chemicalize.org/
www.chemicalize.org

Find chemical structures on web pages and provide predicted data for each structure using ChemAxon's Name to Structure parsing and structure

A 50th post: the quirky story of AZD5904
cdsouthan.blogspot.com

I’m more interested in visitor milestones (and thanks to everyone for pushing these towards 15K and ~1.7K pm) rather than posting metrics, b

Kudos to PubChem and a look at the top-10 sources
cdsouthan.blogspot.com

Any way you look at it, the progress of PubChem over the last 8 years has been impressive and many biological chemistry domains have been tr

Patent and PubChem mining the MMV390048 antimalarial
cdsouthan.blogspot.com

Update: two comments below the post. Since I have been helping out with database searches for the OSDD antimalarial work in Sydney I was int