About a year ago I described the results of a preliminary project to scope out the possibility of making the InChI identifier play nice with inorganic & organometallic complexes. There’s now a follow-up increment that does likewise for some of the higher-valence stereochemistry centres found in these complexes.
Online mixtures demo, with MInChI generator
Drawing chemical mixtures can be done online, with a conversion feature to generate Mixtures InChI (MInChI) notation. Pseudomixtures from Molfiles are now enumerated automatically when pasted in. The tools for working with machine readable mixtures, using the web platform, are open source.
Adding in anti-viral model for COVID-19 predictions
A collection of antiviral structures newly released by Chemical Abstracts has been shoehorned into a model that can be used online to evaluate potential antiviral drug candidates for COVID-19. The tool can be found at https://molmatinf.com/covid19.
More cheminformatics from quarantine: reaction transforms and atomic Bayesians
Further to yesterday’s post about making model resources for COVID-19 available online, the resource page now shows the reaction transforms used to propose potential new candidates, as well as atom highlighting for applied Bayesian models.
Cheminformatics from quarantine: some interactive COVID-19 resources
Structure-activity datasets for small molecules targeting the global COVID-19 pandemic are starting to emerge. Some preliminary modelling and interactive tools based on one of these datasets are available at https://molmatinf.com/covid19.
Mapping drug target ontologies in BioAssay Express: a narrow use case for Excel
The BioAssay Express project captures a lot of public bioassay data using public ontologies, and quite often these ontologies overlap with each other: this happens particularly often with the target concept. We needed to map common terms between the Drug Target Ontology and UniProt (via the Protein Ontology) with a mapping file. The amount of data involved was a little too much to do manually, but not quite enough to justify writing a custom script. Because it had to be done exactly once, the task was an ideal use case for Excel.
New Molecular Materials Informatics website
As part of spring cleaning for the new decade of 2020, the website for Molecular Materials Informatics has been re-done. The original plan for the site in 2010 was to make it information rich, but a lot of it had gotten stale over the years: products appear, disappear and change. Now it’s more of a placeholder with links to projects that have their own sites wherever possible, e.g. documentation and open source projects are hosted on GitHub.
InChI for inorganics
Lately I’ve been working on a new extension to the InChI identifier which is intended to broaden its domain to include the universe of non-organic compounds and all of the insane diversity of exotic bonding types. Preliminary results of the first stage are up on GitHub.
FAIR Data Hackathon / BioAssay Express
It’s been a while since I’ve posted anything, but not for lack of activity in the world of sciencey-informatics. Next week I’ll be at the BioIT World FAIR Data Hackathon in Boston, along with several other members of the Research Informatics team of Collaborative Drug Discovery. Right now we’re tooling up a customised instance of the BioAssay Express (for which the most up-to-date standard version can be found here) so that we can deploy several different proposed templates for evaluating whether a published article abides by FAIR principles. The plan is to evaluate as many articles as we can, and produce a scoresheet at the end of the day. I don’t know what the answer will be, and it will be interesting to find out!
Overlapping biology: Cell Line Ontology and BRENDA

One of the pitfalls of using multiple public ontologies is that sometimes there are two teams doing great work that overlaps, but neither is a superset of the other. This has come up for the BioAssay Express project, which uses both the Cell Line Ontology and BRENDA cells & tissues.