Lately I’ve been working on a new extension to the InChI identifier which is intended to broaden its domain to include the universe of non-organic compounds and all of the insane diversity of exotic bonding types. Preliminary results of the first stage are up on GitHub.
It’s been a while since I’ve posted anything, but not for lack of activity in the world of sciencey-informatics. Next week I’ll be at the BioIT World FAIR Data Hackathon in Boston, along with several other members of the Research Informatics team of Collaborative Drug Discovery. Right now we’re tooling up a customised instance of the BioAssay Express (the most up-to-date standard version can be found here) so that we can deploy several different proposed templates for evaluating whether a published article abides by FAIR principles. The plan is to evaluate as many articles as we can, and produce a scoresheet at the end of the day. I don’t know what the answer will be, and it will be interesting to find out!
One of the pitfalls of using multiple public ontologies is that sometimes there are two teams doing great work that overlaps, but neither is a superset of the other. This has come up for the BioAssay Express project, which uses both the Cell Line Ontology and BRENDA cells & tissues.
A while back I described the idea of bond artifacts, which are layered on top of a core cheminformatics representation to give the rendering engine the hints it needs to make the visual diagram look like what chemists want to see (without breaking the underlying machine readability). Now this enhancement has been added to the open source WebMolKit framework and the derived SketchEl2 drawing app. Furthermore, the artifacts can survive round-trip encoding with the industry standard Molfile CTAB format.
One of the things I’ve been investigating lately is the open access segment of PubMed, which is a rather massive collection of open access medicine-relevant publications, with accompanying full text. I’ve also been looking at the ChEMBL database, which is focused on structure-activity data traceable back to the original literature document from which each datapoint was curated. This is all for the purpose of advancing the BioAssay Express mission of making the world’s bioassay protocols machine readable (aka FAIR).
As of now, there’s a KNIME plugin that can be used to access data from the BioAssay Express. The plugin uses the existing API functionality that can grab all of the available bioassay protocols, or a subset as defined by a query, and bring them into the KNIME ecosystem as a table which can be processed using the multitude of other node types.
ACS time is sneaking up on us, as it tends to do: about two months from now, in August 2018, a large proportion of the world’s chemistry community will be converging on Boston, including myself. This particular city is probably my favourite venue, simply by virtue of being such a massive hub for various forms of chemistry and derived industries, such as small molecule drug discovery.
My presentation will be on the Wednesday:
| Paper ID | 2972556 / CINF 150 |
| Title | Bringing assay protocols into the age of informatics |
| Session | Drug Discovery: Cheminformatic Approaches |
| Section | Division of Chemical Information |
| Time | 1:30pm, Wednesday 22 August 2018 |
As at the last couple of meetings, I will be presenting aspects of the BioAssay Express project, on behalf of Collaborative Drug Discovery. Our team has been quietly expanding in numbers and working hard to make the product more powerful and more polished, so there are lots of new features to talk about!