In the spring of 2021, writing from the comfort of my home office, it’s hard to even remember what it’s like to pack my bags and head out to stand in front of a room and tell people about what I’ve been doing lately. The delights of airport security, jetlag, hotel wifi and bad coffee are not exactly missed, but pretty much everything else is. Results do still get communicated, though, and one nice thing about delivering a webinar is that you can be pretty sure it will be recorded for the benefit of the whole internet for all time (or as may be the case, some of the time).
A lot of the work I’ve been doing lately is in support of the chemical mixtures project, which is part commercial with Collaborative Drug Discovery and part open standards with IUPAC/InChI Trust. The short background on this project is that cheminformatics has been delivering software for handling molecular structures for almost half a century now, but in the real world, chemicals are pretty much always encountered as a mixture of multiple things (deliberately or otherwise). The lack of any common machine-readable datastructure means that, as in many corners of scientific practice, communication is done using text jargon. That was fine for a long time, but in the data science era, not so much.
We introduce the subject in quite a bit of detail in our 2019 Journal of Cheminformatics paper, Capturing mixture composition: an open machine-readable format for representing mixed substances. The paper is open access and available here.
The first online presentation we did was part of the Collaborative Drug Discovery regular series, in which we introduced the idea of capturing mixtures in a machine readable standard form, and discussed the various use cases. It was recorded in December 2019 and is watchable here. I was visiting family in New Zealand at the time, and got to participate in the webinar slightly before dawn. Shortly after I got back to Canada, the world went mad and travel ceased… but the online presentations continued.
The second presentation had a theme: we were invited to present at the Royal Society of Chemistry Formulation 4.1 symposium, which happened in November 2020. Having a machine-readable datastructure for formulations (broadly defined) is very important and relevant, and we made the case for it. The presentation was watchable here, but since I last checked, the YouTube video has been made private. So I guess we all [re]learned a valuable lesson about persistence… at least the slides are still on Slideshare, which is likely to continue to be the case for a bit longer.
The third presentation was for the Cambridge Cheminformatics Network Meeting (that’s the original Cambridge in England) in February 2021, entitled Mixtures as first class citizens in the realm of informatics. Back in the face-to-face age I was never in the right place at the right time to participate in one of these, so going fully online was actually an opportunity. The slides are on Slideshare, and the recording is still available (starts at 1:05:00). There’s no reason to think it’ll get yeeted, but in case it does, the subject of the talk is bringing chemical mixture datastructures into the fold, so that our decades’ worth of single-molecule cheminformatics tools can be put to work.
The last presentation was given in March 2021 at the InChI Virtual Workshop, hosted by NIH (National Institutes of Health: normally we do this on their campus in Bethesda, MD). This talk is entitled Mixtures InChI: A Story of How Standards Drive Upstream Products, and it describes how the original Mixtures InChI proposal brought us (Collaborative Drug Discovery) on board and resulted in a happy partnership. The recording is available online (starts at 2:22:00), and the slides are on Slideshare. If you want to watch the recording without sound, you can find out how well the lucky person in charge of adding closed captions dealt with my “mid-Pacific” accent.