The Living Molecules app has been out for a while now, and in a few days I’m going to embark on a road trip to Rhode Island to participate in a Gordon Conference about visualisation in scientific education. I’ll be bringing with me a poster that describes the virtues of using the Living Molecules app to create molecular glyphs to embed in posters. These glyphs function like QR codes, except that instead of embedding generic data (usually a URL), they encode a way for the reader to look up chemical data, which can be downloaded and displayed. There’s a reason for the name of the app, other than having a bit of a ring to it.
Chemical data exists in silico in two distinct categories: machine-readable form and presentation form. A more vivid (and perhaps more useful) description of these two types might be living and dead.
If you receive a manuscript on a piece of paper that has chemical structures on it, that is usually enough for your chemist brain to interpret the information which, prior to the information age, was pretty much the only thing you could do with a chemical structure. If you want to add it to a database and mark it up with associated data and do calculations on it, your best bet is to manually draw it with sketching software. You could scan it and use a fancy algorithm to try to recreate the information automatically, but if you care at all about whether the answer is correct (and I can’t imagine why you wouldn’t) then that’s basically a pretty bad idea.
On the other hand, if somebody sends you a picture file of a molecule, don’t be mistaken: the information contained within is just as dead as it would be if it arrived on a piece of paper. The only difference is that you don’t have to scan it. So when that chemical structure PNG file arrives as an email attachment, or those graphics appear on the web page, or you view a twitpic, chances are that even though you can understand it, the computer can’t, because the cheminformatic representation, aka the life force of the chemical structure, has been sent to chemical heaven.
There are a few rare exceptions, though. The SketchEl application can export structures as SVG files (vector graphics), and it surreptitiously embeds the chemical structure inside the XML payload. It is capable of pulling these back out later on, so the picture content is just playing dead. It just needs the right conditions to wake up. The molsync.com chemical data sharing service pretends to look like a fairly ordinary sharing site for pictures of chemical structures, but that’s just a front: the site keeps only the raw data, and manufactures pictures on demand, and it provides a variety of other services, such as downloading the data in different formats or visual styles.
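To make the "playing dead" trick a bit more concrete, here is a minimal sketch in Python of how machine-readable content can ride along inside an SVG picture and be pulled back out later. This is not SketchEl's actual embedding scheme: the `chem` namespace URI, the `structure` element name and the use of a SMILES string as the payload are all assumptions made up for illustration, and the single line in the drawing is just a stand-in for real rendered graphics.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Hypothetical namespace for the embedded payload (not SketchEl's real one).
CHEM_NS = "urn:example:embedded-chem"

# Keep the serialised output tidy: SVG as the default namespace, chem: for ours.
ET.register_namespace("", SVG_NS)
ET.register_namespace("chem", CHEM_NS)


def embed_structure(svg_text, structure):
    """Return a copy of the SVG with the structure tucked into a <metadata> block."""
    root = ET.fromstring(svg_text)
    meta = ET.SubElement(root, f"{{{SVG_NS}}}metadata")
    payload = ET.SubElement(meta, f"{{{CHEM_NS}}}structure")
    payload.text = structure
    return ET.tostring(root, encoding="unicode")


def extract_structure(svg_text):
    """Return the embedded structure, or None if the picture really is dead."""
    root = ET.fromstring(svg_text)
    node = root.find(f".//{{{CHEM_NS}}}structure")
    return node.text if node is not None else None


# A stand-in drawing; a real export would contain the rendered atoms and bonds.
PLAIN_SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="40" height="20">'
    '<line x1="5" y1="10" x2="35" y2="10" stroke="black"/>'
    "</svg>"
)

# Ethanol as SMILES, used here as an illustrative payload format.
living_svg = embed_structure(PLAIN_SVG, "CCO")

print(extract_structure(living_svg))
print(extract_structure(PLAIN_SVG))
```

An ordinary SVG viewer ignores the metadata block and just draws the picture, while software that knows where to look can recover the structure. A picture without the payload simply yields nothing.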
So when you see chemical structures in files or on the web, they might be dead and preserved in formaldehyde like the poor butterfly shown above, but sometimes they’re not. Unfortunately, when a molecule is printed out on a poster or a paper manuscript, it’s all but certainly lifeless. That is, unless somebody happens to have included an adjunct glyph looking something like this:
The glyph is what allows the Living Molecules app to use your iPhone/iPad camera to go off and fetch the living version of the chemical data, i.e. the raw machine-readable content. That content fully describes the scientist’s conception of the chemical structure, and computer software can manipulate it in any number of ways, from showing a simple picture, to integrating it into databases, to performing elaborate calculations.
The intention is to encourage people to maintain a thread of connection between their human-only presentations and the underlying data, so that it is always possible to fetch the content in a form that can be used by machines, too. Chances are you created the graphics using a machine in the first place, so there’s no need to murder your data!