MolSync overhaul: back to the web, now with reactions too

Things have been a bit quiet in these parts lately, but not due to inactivity: far from it. In between working on some exciting projects with Collaborative Drug Discovery, I have been quietly making rapid progress on several key technologies. These include the OS X Molecular DataSheet (XMDS), a growing collection of reaction data, and most recently a complete overhaul of the MolSync website, which provides cheminformatics support services of various kinds.

As mentioned in earlier posts, the XMDS desktop Mac app has been a high priority pet project for data entry of chemical reactions. Designing an editor for chemical reactions seems like it should be quite easy – just a handful of extra features layered on top of the core functionality of a structure editor – but the reality is nothing of the sort, especially if that editor aspires to capturing the data in its full informatic glory, and making it maximally ergonomic for data entry. One of the handy things about writing software that one is personally qualified to use, and has a real need for, is that a lot of refinement can be done without having to persuade someone else to use it. At this point I have drawn out literally hundreds of chemical reactions, in detail. They are taken from various sources, none of which are machine readable. For example, some of the reactions from ChemSpider Synthetic Pages:

molsync_overhaul2

… and an editing session:

molsync_overhaul3

Curating this data is an interesting exercise in and of itself, since the source content – despite being online – is not really machine readable. The reactions that I have carefully recreated are interpretable as properly typed molecular species in balanced reactions with atom-to-atom mapping and material quantities.
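To make that description a bit more concrete, here is a minimal TypeScript sketch of what such a record could look like. It is not the actual XMDS or MolSync data format: the type names (`Role`, `Quantity`, `Component`, `Reaction`), the `isBalanced` helper and the quantity value are all made up for illustration, and the atom-to-atom mapping is reduced to an optional list of numbers.

```typescript
// Hypothetical data model for illustration; NOT the actual XMDS/MolSync format.
type Role = "reactant" | "product" | "reagent";

interface Quantity {
    value: number;
    unit: "g" | "mg" | "mol" | "mmol" | "mL";
}

interface Component {
    label: string;
    role: Role; // every species is explicitly typed
    formula: Record<string, number>; // element counts, e.g. {C: 2, H: 6, O: 1}
    stoich: number; // stoichiometric coefficient
    quantity?: Quantity; // optional measured amount
    atomMap?: number[]; // optional mapping number per atom, 0 = unmapped
}

interface Reaction {
    components: Component[];
}

// Sum element counts, weighted by stoichiometry, over components with one role.
function elementTotals(rxn: Reaction, role: Role): Record<string, number> {
    const totals: Record<string, number> = {};
    for (const comp of rxn.components) {
        if (comp.role !== role) continue;
        for (const el of Object.keys(comp.formula)) {
            totals[el] = (totals[el] ?? 0) + (comp.formula[el] ?? 0) * comp.stoich;
        }
    }
    return totals;
}

// Balanced means reactants and products contain the same atoms.
// Reagents (catalysts, solvents) are deliberately left out of the count.
function isBalanced(rxn: Reaction): boolean {
    const lhs = elementTotals(rxn, "reactant");
    const rhs = elementTotals(rxn, "product");
    const elements = Object.keys(lhs).concat(Object.keys(rhs));
    return elements.every((el) => (lhs[el] ?? 0) === (rhs[el] ?? 0));
}

// Fischer esterification; the 10 mmol quantity is an illustrative value only.
const esterification: Reaction = {
    components: [
        {label: "acetic acid", role: "reactant", formula: {C: 2, H: 4, O: 2}, stoich: 1,
            quantity: {value: 10, unit: "mmol"}},
        {label: "ethanol", role: "reactant", formula: {C: 2, H: 6, O: 1}, stoich: 1},
        {label: "sulfuric acid", role: "reagent", formula: {H: 2, S: 1, O: 4}, stoich: 1},
        {label: "ethyl acetate", role: "product", formula: {C: 4, H: 8, O: 2}, stoich: 1},
        {label: "water", role: "product", formula: {H: 2, O: 1}, stoich: 1},
    ],
};

console.log(isBalanced(esterification));
```

Leaving out the water byproduct, which is what most hand-drawn schemes do, makes `isBalanced` return false. That is the kind of gap that has to be filled in when recreating a reaction from a picture.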

This exercise has an obvious double purpose: improving the data entry software, and also creating a body of reaction data that is very highly specified – accurate and precise – relative to what people generally do with these things. One of the nice characteristics of chemical reactions is that although there are millions of published experiments, the majority of organic chemistry can be captured with tens or hundreds of reaction types, depending on thoroughness. These reaction types can be illustrated quite well with maybe an order of magnitude more specific examples, which means that the amount of data entry needed to create a database that can provide useful reference information and guidance is well within the range of what one person can create singlehandedly (with some appropriately excellent software tools, of course).

In terms of derived functionality, there’s nothing especially functional just yet, but toward that goal, the MolSync service has been subjected to a major overhaul. On the surface it still performs the same functions as it has for years, e.g. sharing molecules:

molsync_overhaul4

The aesthetics are ever so slightly different, which is a consequence of the JavaScript part of the service being completely rewritten. Back when I started building this, web development was quite different. Although some important new technologies like vector graphics rendering were making their way into mainstream browsers, the platform was an abominable mess. At the time I decided to use Google Closure as the core library. This is a project that was the right technology for its time: it essentially wrapped everything you could possibly think to do in JavaScript, so that Google’s engineers could painstakingly ensure that everything was perfectly cross-platform, behind the scenes. It was a very heavy-handed solution, and the code was ugly as hell, but it was the lesser evil. Despite being lesser, though, the amount of evil was too much for my liking, and so web development went on the shelf for a long time. It was just too painful to build even mediocre products, and not really possible at all to make them good.

Fast forward to now: it seems that the major browsers have pretty much converged on a common feature set that is actually adequate for creating reasonable quality products with a relatively modest amount of unnecessary pain. Two technologies have matured very nicely indeed: jQuery, which seems to be almost ubiquitous, and TypeScript, of all things. As someone who was a practicing software engineer during the Lost Decade of Software – the 1990s – I have an excessive, irrational and paranoid hatred of anything from Microsoft. However, the TypeScript technology had an intriguing mandate: to add compile-time typing to JavaScript, and coerce the syntax into something a bit more familiar and sensible (i.e. Java/C#-esque), but otherwise coexist nicely. Add to that the recently released Visual Studio Code, which is not only free but runs really well on Linux, and it adds up to a mature and well designed developer experience.
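For anyone who has not seen it, this is roughly what compile-time typing buys you. The snippet below is a made-up example rather than anything from the MolSync codebase, and the `Atom` interface and `totalCharge` function are hypothetical names.

```typescript
// Hypothetical types for illustration; not taken from the MolSync codebase.
interface Atom {
    element: string;
    charge: number;
}

// The annotations state that this takes an array of Atom and returns a number.
function totalCharge(atoms: Atom[]): number {
    let sum = 0;
    for (const atom of atoms) sum += atom.charge;
    return sum;
}

const ammonium: Atom[] = [
    {element: "N", charge: 1},
    {element: "H", charge: 0},
    {element: "H", charge: 0},
    {element: "H", charge: 0},
    {element: "H", charge: 0},
];

const ammoniumCharge = totalCharge(ammonium);
console.log(ammoniumCharge);

// Either of these lines would be rejected by the compiler instead of
// misbehaving quietly at runtime, as plain JavaScript would allow:
// totalCharge("NH4+");                          // a string is not an Atom[]
// ammonium.push({element: "H", charge: "0"});   // charge must be a number
```

Strip out the type annotations and what remains is ordinary JavaScript, which is why the two coexist so easily.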

So long story short, the JavaScript/Google Closure codebase has been ported to TypeScript/jQuery, and tidied up and extended along the way. Back to the subject of chemical reactions, there is a very basic proof of concept page that randomly picks a few of the reactions that I have painstakingly entered, and renders them onscreen:

molsync_overhaul1
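The "randomly picks a few" part is simple enough to sketch. This is not the code behind the page, just one way it could be done: `pickRandom` is a hypothetical helper that draws distinct items from a list, with the random number source passed in as an optional parameter so that the behaviour can be checked deterministically.

```typescript
// Hypothetical helper for illustration; not the actual MolSync page code.
// Draws up to `count` distinct items from `items` without modifying the input.
function pickRandom<T>(
    items: readonly T[],
    count: number,
    rng: () => number = Math.random, // must return a value in [0, 1)
): T[] {
    const pool = items.slice(); // work on a copy
    const picked: T[] = [];
    while (picked.length < count && pool.length > 0) {
        const idx = Math.floor(rng() * pool.length);
        // splice removes the chosen item, so it cannot be drawn twice
        picked.push(...pool.splice(idx, 1));
    }
    return picked;
}

// Placeholder identifiers standing in for stored reaction records.
const reactionIds = ["rxn-a", "rxn-b", "rxn-c", "rxn-d", "rxn-e"];
const toRender = pickRandom(reactionIds, 3);
console.log(toRender.length);
```

Asking for more items than the list holds just returns everything in random order, which is a convenient default for a page that should never fail to show something.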

Right now this interface does nothing but display content, but that will change. You may notice that the top of the page includes a couple of blank rectangles which look like placeholders for a chemical reaction, which indeed they are. These are to be linked to a work-in-progress, which is a porting effort to bring the MMDS/XMDS sketcher to the web platform. The web-based sketcher was actually started a long time ago, but fell victim to prioritisation. Now with an invigorated core platform, web sketching is back on the menu:

molsync_overhaul5

The sketcher concept is analogous to the desktop implementation found in XMDS, which combines the “low input bandwidth” sketching technology designed for touchscreen mobile devices (toolbars on bottom and right) with a more conventional array of tools on the left, which are familiar to every chemist. At the moment it is just a few features away from being able to do some basic drawing, although the finishing touches will take a bit longer.

Having a web-based sketcher is a priority and a rate-limiting step, because the goal is to make the collected reactions useful, starting with ways to search them.
