Chemistry Development Kit

Stream

 
Some things are unpredictable. Take, for example, the impact of PaDEL by Chun Wei Yap. While seemingly just providing a simple API around CDK's descriptor and fingerprint functionality, its impact is significant: higher than, for example, that of Bioclipse and AMBIT, which also provide GUIs around that functionality.

Web of Science lists 71 citations of this work, while Google Scholar guesstimates it at around 120. Well done!
PaDEL-Descriptor is a software for calculating molecular descriptors and fingerprints. The software currently calculates 797 descriptors (663 1D, 2D descriptors, and 134 3D descriptors) and 10 types of fingerprints. These descriptors and fingerprints are calculated mainly using The Chemistry ...
 
More information on the ECFP/FCFP implementation...
As of now, the latest version of the popular open source Chemistry Development Kit (CDK) has its own implementation of the highly regarded ECFP and FCFP classes of chemical structure fingerprints (s...
 
 
Changes in CDK 1.6 #3: Constructors that now require a builder
The advantage of the builders in the CDK is that code can be independent of data class implementations (and we have three of them in CDK 1.6, at this moment). Over the past years more and more code started using this approach, but that does mean that more and more class constructors take a ...
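The idea of keeping code independent of the data classes by handing it a builder can be sketched in a few lines. This is only a language-neutral illustration in Python, not the CDK's Java API: `Atom`, `LoggingAtom`, `DefaultBuilder`, `LoggingBuilder`, and `SymbolListParser` are all hypothetical names made up for this sketch.

```python
class Atom:
    """Hypothetical plain data class."""
    def __init__(self, symbol):
        self.symbol = symbol


class LoggingAtom(Atom):
    """Hypothetical alternative data class that records its own creation."""
    created = []

    def __init__(self, symbol):
        super().__init__(symbol)
        LoggingAtom.created.append(symbol)


class DefaultBuilder:
    """Hypothetical builder producing plain Atom objects."""
    def new_atom(self, symbol):
        return Atom(symbol)


class LoggingBuilder:
    """Hypothetical builder producing LoggingAtom objects."""
    def new_atom(self, symbol):
        return LoggingAtom(symbol)


class SymbolListParser:
    """Hypothetical tool whose constructor requires a builder.

    The parser never names a concrete data class; it only asks the
    builder it was given for new objects.
    """
    def __init__(self, builder):
        self.builder = builder

    def parse(self, text):
        return [self.builder.new_atom(symbol) for symbol in text.split()]


plain_atoms = SymbolListParser(DefaultBuilder()).parse("C O N")
logged_atoms = SymbolListParser(LoggingBuilder()).parse("C O N")
```

The same `SymbolListParser` code works with either set of data classes; the price is that its constructor now needs a builder argument, which is the kind of change the post describes.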
 
Paper in which +Ola Spjuth, +Arvid Berg, Sam Adams, and I outline how InChI is integrated into the CDK and used in +Bioclipse.
 
Uses the CDK to predict a number of properties for compounds.
 
"STITCH is a database of protein–chemical interactions that integrates many sources of experimental and manually curated evidence with text-mining information and interaction predictions."

This paper uses Tanimoto calculations to remove similar compounds:

"To avoid biases, we first excluded highly similar chemicals, enforcing a maximum Tanimoto similarity of 0.9 using 2D chemical fingerprints calculated with the chemistry development kit."

BTW, much of the data in this database is available under some flavor of Creative Commons license.
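For readers unfamiliar with the Tanimoto filter mentioned in the quote, here is a minimal sketch in Python. It makes several assumptions: fingerprints are represented as sets of "on" bit positions, and compounds are filtered greedily in input order. The paper's actual procedure is not described in the quote, and this does not use the CDK itself.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen for this sketch when both fingerprints are empty
    return len(a & b) / union


def filter_similar(fingerprints, max_similarity=0.9):
    """Greedily keep compounds that are not too similar to any compound kept so far.

    `fingerprints` is a list of (name, set_of_bits) pairs. The greedy,
    order-dependent strategy is an assumption made for illustration.
    """
    kept = []
    for name, fp in fingerprints:
        if all(tanimoto(fp, kept_fp) <= max_similarity for _, kept_fp in kept):
            kept.append((name, fp))
    return kept


# Toy fingerprints (made-up data, not real chemistry).
toy = [
    ("cmpd-1", set(range(0, 20))),
    ("cmpd-2", set(range(0, 19))),   # shares 19 of 20 bits with cmpd-1 (0.95)
    ("cmpd-3", set(range(10, 30))),  # shares 10 of 30 bits with cmpd-1
]
kept_names = [name for name, _ in filter_similar(toy)]
```

With the 0.9 threshold from the quote, `cmpd-2` is dropped because it is 0.95 similar to `cmpd-1`, while `cmpd-3` is kept.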
 
 
Part II... 
This post follows up on the previous to report some timings. I've checked all the code into GitHub (johnmay/efficient-bits/fp-idx) and it has some stand alone programs that can be run from the command line. Currently there ar...
 
 
You can also read an SD file more efficiently by repeated calls to MDLV2000Reader. The pattern is similar to calling BufferedReader.readLine() in a while loop.
This post in a series about API changes in CDK 1.6 is about the iterating reader for SD files, which are basically a list of MDL molfile (Symyx, ... I lost track) complemented with properties for each structure. Since the CDK IO readers have a representation of the file format in the class name, ...
Comment by Egon Willighagen:

But an MDL molfile doesn't have "> <FIELDS>"... ??
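To make the distinction in this thread concrete, the sketch below walks an SD file one record at a time: each record is a molfile block ending at "M  END", followed by "> <FIELD>" data items and a "$$$$" terminator. It is plain Python for illustration only and is not the CDK reader; `iter_sd_records` and the toy data are made up here, and a final record without a "$$$$" line is ignored.

```python
import io


def iter_sd_records(handle):
    """Yield (molfile_text, properties) for each "$$$$"-terminated record."""
    mol_lines, props, in_mol, field = [], {}, True, None
    for raw in handle:
        line = raw.rstrip("\n")
        if line.strip() == "$$$$":
            # End of record: hand it out and start a fresh one.
            yield "\n".join(mol_lines), props
            mol_lines, props, in_mol, field = [], {}, True, None
        elif in_mol:
            mol_lines.append(line)
            if line.startswith("M  END"):
                in_mol = False  # everything after this is SD data, not molfile
        elif line.startswith(">"):
            # Data header such as "> <NAME>": take the text between < and >.
            start, end = line.find("<"), line.rfind(">")
            field = line[start + 1:end] if 0 <= start < end else None
            if field is not None:
                props[field] = ""
        elif field is not None and line.strip():
            # Value line(s) belonging to the current data header.
            props[field] = props[field] + "\n" + line if props[field] else line
        else:
            field = None  # a blank line closes the current data item


SD_TEXT = """\
methane
  sketch

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
M  END
> <NAME>
methane

$$$$
water
  sketch

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
M  END
> <NAME>
water

> <FORMULA>
H2O

$$$$
"""

# Consume the file record by record, much like a readLine() loop.
records = []
for molfile, properties in iter_sd_records(io.StringIO(SD_TEXT)):
    records.append((molfile, properties))
```

Because the function is a generator, only one record is held in memory at a time, which is the point of the iterating pattern.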
 
 
CDK Release 1.5.6
 
Metabolomics paper by +Steffen Neumann and others in which +Rajarshi Guha's rcdk package is used to calculate Tanimoto similarities.
Mass spectrometry (MS) has become the analytical method of choice in plant metabolomics. Nevertheless, metabolite annotation remains a major challenge and implies the integration of structural searches in compound libraries with biological knowledge inferred from metabolite regulation studies. Here we propose a novel integrative approach to process and exploit the rich structural information contained in in-source fragmentation patterns of high-r...
 
This NanoQSAR paper uses the CDK to calculate molecular descriptors for coating components.
People
Have them in circles
876 people
Links
Story
Tagline
The Open Source Cheminformatics and Bioinformatics Toolkit
Introduction
This G+ Page will be used to share news around the CDK, such as links to pages discussing new releases of software that uses the CDK, blog posts that analyze CDK functionality, etc.