2012 redux

Now that the Mayan calendar has flipped its highest bit, and as far as I can tell everything is still where it was the day before except for the afterparty mess, it’s as good a time as any to review what has happened over the last year as pertains to Molecular Materials Informatics and its relevant domain, namely the intersection of cheminformatics and mobile+cloud products.

One of the most noticeable outward-facing changes to the product lineup is the addition of products for Android, which had been on the roadmap since the beginning, but now it’s a reality. The Android ports benefitted substantially from hindsight, since some of the codebase is in some sense now 4th generation: early experimentation on the desktop platform and reincarnation as a server-based tool stack; the initial MMDS app deployed for BlackBerry; and the hasty rewrite for iPhone/iPad once it became clear that BlackBerry was locked in a death spiral rather than suffering a bit of turbulence. The Android port is designed as library first/apps second, which takes into account the expectation that there will be many apps built on the same core technology, rather than one big app and perhaps a few spinoffs.

The first product is a bare minimum viable product implementation of MolPrime for Android, which is free like MolPrime for iOS, but has an interface that looks almost the same as MolPrime+. The feature set is still quite bare: it allows drawing of molecules, collecting them in a set, and performing a handful of functions with them, which underscores the fact that the primary objective is to get the technology running on the new platform, get it field tested, find all the bugs, and map out all the gotchas for the Android platform. Once the MMDSLib core library + MolPrime app reached an acceptable level of maturity, the ChemSpider Mobile app was ported to Android, on behalf of the Royal Society of Chemistry.

Meanwhile, the iOS version of ChemSpider Mobile got an overhaul, to add substructure searching capabilities, as well as some general improvements. The app itself is close to having 15,000 users. If that figure was claimed for a general interest consumer product, the followup question might be “You mean new downloads per day, right?”, but things work a bit differently here: fifteen thousand is a lot of chemists!

The Android suite was followed up by an app called Lab Solvents, which is a next-generation version of the original Green Solvents app, which adds in the GSK solvent selection data to supplement the ACS Green Chemistry Institute’s solvent guidelines, and also adds a lot more features. In this case the Android development has leapfrogged the iOS equivalent, though that may be a temporary edge.

Working with Sean Ekins and other researchers brought together by Collaborative Drug Discovery, we designed an app called TB Mobile, which makes available a curated set of molecular structures with their known pathways for inhibiting tuberculosis. The data is hard to come by ordinarily, but it’s now available as an app for iOS and Android, providing various features for searching, recalling and utilising the data. The apps are all free, and if anyone uses them to make even a tiny step toward a better cure for the disease, then the whole project was worth it and then some.

The user interface for the TB Mobile app was modelled on an earlier app for iOS called Approved Drugs. This app was an excuse to develop several new core features, one being a revised user interface designed for extreme simplicity in the presentation of chemical structures in a fixed browseable catalog, yet allowing some high value complex features to be exposed for anyone who is interested. For example, the app allows the user to sketch a structure and have the 1300 FDA-approved drugs sorted by structural similarity to that structure – all done within the app itself, making it quite a useful way to explore the landscape of organic molecules that have the stamp of approval for turning into a pill and dispensing to humans. This has implications for drug repurposing, since finding new uses for an already studied molecule is a way to circumvent the Herculean task of establishing the safety of a new drug. The app also explores some new functional areas (for me, anyway), such as displaying of 3D structures using OpenGL. The drug conformations can be viewed and rotated within the app, which isn’t a unique feature amongst chemistry apps in general, but it means that the MMDSLib codebase now contains the necessary functionality to deliver 3D in other contexts, without having to start from zero.
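For anyone curious what sorting a catalog by structural similarity involves, here is a minimal sketch. It assumes each structure has already been reduced to a fingerprint (a set of integer feature codes) and uses the Tanimoto coefficient as the similarity measure; both are common choices in cheminformatics, but they are assumptions made for illustration rather than a description of what the Approved Drugs app does internally. The names `tanimoto`, `rank_by_similarity` and the toy catalog are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared features divided by total distinct features."""
    if not a and not b:
        return 0.0  # two empty fingerprints: treat as dissimilar by convention
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(
    query: Set[int], catalog: Dict[str, Set[int]]
) -> List[Tuple[str, float]]:
    """Return (name, similarity) pairs, most similar to the query first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in catalog.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy fingerprints, invented for illustration; a real catalog would hold
# fingerprints computed from the actual drug structures.
catalog = {
    "drug_a": {1, 2, 3, 4},
    "drug_b": {3, 4, 5, 6, 7},
    "drug_c": {8, 9},
}
ranking = rank_by_similarity({1, 2, 3}, catalog)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

With a catalog of around 1300 entries, a linear scan like this is cheap enough to run on the device itself, which is consistent with doing the whole thing inside the app.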

The SPRESImobile app, which is a ménage à trois between InfoChem, Eidogen-Sertanty and Molecular Materials Informatics, underwent a couple of overhauls. It is now possible to gain access to the entire SPRESIweb collection, if you have an account, or want to initiate a trial; and it is also possible to draw reaction queries, and search for reactions using several different algorithm patterns.

The back-end libraries that provide many extra services to apps, internally called com.mmi (based on the Java naming) and hosted on molsync.com, have added a lot of new capabilities, only some of which are currently made available. There are property calculation features (e.g. log P, tautomers) that are used by the MolPrime+ app, web sharing features that are used by a number of apps, and scaffold-substructure matching and deconvolution, which has been integrated with the SAR Table app. The SAR Table app also added a new matrix view feature, which allows plotting of two properties against each other, e.g. scaffold vs R1, R1 vs R2, R1 vs property value, etc. Each cell of the matrix is annotated by colour-coded property values, allowing visual insight into the structure activity relationship for the current dataset.
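The matrix view idea can be sketched in a few lines: group the rows of a dataset by two chosen columns (say R1 and R2), collect the property values that land in each cell, and map each value onto a colour. This is only an illustration under assumed names (`build_matrix`, `colour_for`, the column keys and the sample rows are all hypothetical), not the SAR Table app's actual implementation, and the red-to-green scale is an arbitrary choice.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_matrix(
    rows: List[dict], row_key: str, col_key: str, value_key: str
) -> Dict[Tuple[str, str], List[float]]:
    """Group property values into cells keyed by (row label, column label)."""
    cells: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for row in rows:
        cells[(row[row_key], row[col_key])].append(row[value_key])
    return dict(cells)


def colour_for(value: float, low: float, high: float) -> str:
    """Map a value onto a red (low) to green (high) hex colour."""
    if high == low:
        fraction = 0.5  # degenerate range: pick the midpoint colour
    else:
        fraction = min(1.0, max(0.0, (value - low) / (high - low)))
    red = round(255 * (1 - fraction))
    green = round(255 * fraction)
    return f"#{red:02x}{green:02x}00"


# Invented sample data: substituent labels and an activity value per compound.
rows = [
    {"R1": "Me", "R2": "Cl", "activity": 6.1},
    {"R1": "Me", "R2": "Cl", "activity": 6.5},
    {"R1": "Et", "R2": "Cl", "activity": 5.0},
    {"R1": "Me", "R2": "OMe", "activity": 7.0},
]
matrix = build_matrix(rows, "R1", "R2", "activity")
for (r1, r2), values in sorted(matrix.items()):
    colours = [colour_for(v, 5.0, 7.0) for v in values]
    print(r1, r2, values, colours)
```

Swapping the keys (scaffold vs R1, R1 vs a binned property value, and so on) gives the other views mentioned above without changing the grouping logic.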

Near the beginning of the year, Sean Ekins pitched me on the idea of building an app that scrapes social networks like Twitter for scientific content relating to his domain interest of rare & neglected diseases, and giving it an interface inspired by the highly acclaimed Flipboard app. We took on the challenge in a rush, and built the Open Drug Discovery Teams (ODDT) app, to allow him to take a prototype with him to a Pistoia Dragon’s Den session: while the slides looked like they were drawn on a whiteboard, having an actual product, which we skunkworked in about a week, was helpful. I still have my 50% cut of the mickey-mouse dollars handed out as prize money.

Over the year we have added a number of features, and although it’s still in a quasi-beta phase, you can follow its progress through the threads on this blog. We started exploring some options to make the project financially sustainable, including an IndieGoGo campaign, though it turns out that making use of crowdfunding successfully isn’t quite as straightforward as some would have you believe. Progress on the ODDT app has gone hand in hand with improvements to the back-end cheminformatics software (com.mmi), which drives the service.

One of the organisations that has sat up and taken notice of the potential for apps to positively disrupt the pharmaceutical industry (in the same way they are doing to every other industry) is the Pistoia Alliance, which has been working on its own app strategy. Since chemistry apps are so new, expertise is still hard to find, so I’ve been helping out with the planning of a new app store for life sciences R&D apps. The first phase should be ready to unveil in January: initially it will be more of a catalog than a store, but in the early stages the project can supply one very important piece that is missing from the general purpose app marketplaces, which is somewhere that scientists can go to find a curated list of relevant apps, and assemble to discuss what they need in order to solve real problems, and where app creators can respond with their own ideas on how to deliver useful products. For more information, see the webinar that I delivered not long ago.

2012 has been quite a good year for publications: we’ve described chemistry workflows, the open drug discovery teams project, apps for green chemistry, and another manuscript that is accepted and due out early next year. This is the first time that I’m convinced that I’m actually forgetting something, which is another way of saying that there’s been more action than usual!

It has also been quite busy presentation-wise: the first year that I’ve attended both of the national American Chemical Society meetings, the first time I’ve given more than one talk at the same meeting, and also the first time I’ve been to a regional meeting (in Raleigh, NC).

I also volunteered to do the first CINF (Chemical Information) webinar, which is chemistry-workflow themed, and is available in all its glory on YouTube. Whether we like it or not, we’ll all probably have to get used to attending more webinars and taking fewer plane flights, but there are some benefits: the coffee is better, and some of us get to stay in our dressing gowns.

No doubt there are plenty of small and medium developments that I’ve left off this list, but now it’s time to go over some predictions and ideas for 2013. I will be attending the ACS meeting in New Orleans, with two talks: one of them in the COMP division, comparing the state of mobile+cloud cheminformatics workflow capabilities to the existing desktop equivalents. I haven’t written the talk yet, but I’m looking forward to getting started. I will also be describing the Pistoia Alliance app strategy: where the project is, and where it’s going.

To my great delight, I have been invited to speak at the Gordon Conference in the summer, in the Visualization in Science & Education meeting, specifically visualisation in mobile environments. While I haven’t decided what to talk about, it’s because there’s too much choice, and it’s 6 months away, so that gives me plenty of time to generate new material.

When the Pistoia Alliance AppStore launches, in January (fingers crossed), it may be taken as given that all of the iOS and Android apps will be submitted, and I will be actively encouraging anyone with an interest in using these tools for their own work to sign in and join the conversation. The anonymous disconnection of app creators & app users works adequately for high volume consumer apps, but chemistry is a small and tightly integrated group of individuals, and it’s essential that we have the opportunity to get together and discuss ways that we can help each other solve problems, and of course make sure there’s enough funding to make it happen. Ensuring that Pistoia has everything they need to advance this goal is a huge priority, since Molecular Materials Informatics has so much invested in this space.

As was hinted in the CINF webinar, the SAR Table app has an experimental feature which allows you to use a webservice to automatically build a prediction model that uses structures + existing activities to fill in the blanks for missing activities. This feature is not public yet, because the model building tools were good enough for a demo, but not to stand up to any level of detailed scrutiny. This is about to change: the back end services now make use of a subgraph-based group contribution datastructure combined with a genetic algorithm for automated model building. It is almost ready to make available as a minimum viable product which – at first, anyway – should be taken with a grain of salt before relying on, but it’s looking good, and will be available soon, for the small (20-200 compound) collections typically managed by the SAR Table app.
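To give a flavour of what "group contribution plus genetic algorithm" means in general, here is a toy sketch. It assumes each compound has already been broken down into counts of named fragments (standing in for subgraphs), predicts activity as a weighted sum of those counts, and evolves the weights with a very plain genetic algorithm. Everything here (the fragment names, the training values, the functions `predict`, `error` and `evolve`, and the choice of selection, crossover and mutation) is hypothetical and is not the com.mmi implementation.

```python
import random
from typing import Dict, List, Tuple

# (fragment counts, measured activity) for one compound
Compound = Tuple[Dict[str, int], float]


def predict(weights: Dict[str, float], fragments: Dict[str, int]) -> float:
    """Group contribution: sum each fragment's weight times its count."""
    return sum(weights.get(name, 0.0) * count for name, count in fragments.items())


def error(weights: Dict[str, float], training: List[Compound]) -> float:
    """Sum of squared differences between predicted and measured activity."""
    return sum((predict(weights, frags) - act) ** 2 for frags, act in training)


def evolve(training: List[Compound], generations: int = 200,
           population_size: int = 30, seed: int = 0):
    """Evolve fragment weights; returns (best weights, best error per generation)."""
    rng = random.Random(seed)
    names = sorted({name for frags, _ in training for name in frags})
    population = [{n: rng.uniform(-1.0, 1.0) for n in names}
                  for _ in range(population_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda w: error(w, training))
        history.append(error(population[0], training))
        # Keep the better half unchanged (elitism), refill with mutated offspring.
        survivors = population[: population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            mother, father = rng.sample(survivors, 2)
            children.append({n: rng.choice((mother[n], father[n])) + rng.gauss(0.0, 0.1)
                             for n in names})
        population = survivors + children
    population.sort(key=lambda w: error(w, training))
    history.append(error(population[0], training))
    return population[0], history


# Invented training set: a handful of compounds with known activities.
training = [
    ({"phenyl": 1, "hydroxyl": 1}, 1.5),
    ({"phenyl": 1, "chloro": 1}, 0.7),
    ({"phenyl": 2}, 2.0),
    ({"hydroxyl": 2, "chloro": 1}, 0.7),
]
best, history = evolve(training)
# Fill in a "blank": a compound with structure but no measured activity.
estimate = predict(best, {"phenyl": 1, "hydroxyl": 2})
print(f"error went from {history[0]:.3f} to {history[-1]:.3f}; estimate {estimate:.2f}")
```

Because the surviving half is carried over unchanged, the best error can never get worse from one generation to the next. With only a few dozen compounds to fit, a model like this is easy to overfit, which is one reason to take early predictions with a grain of salt.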

The same methods will be turned onto larger datasets, in order to expose more property calculation methods for use within apps. Experience will tell what kinds of properties work best, but some of the ideas we have include “green” properties based on data provided by the Environmental Protection Agency, particularly for unwanted side effects such as toxicity in its various guises.

On the subject of green chemistry, there are some blueprints already for a lightweight app that combines the properties of a synthetic lab notebook (i.e. reactions and quantities) with a diverse variety of useful features, data sources and algorithms for green chemistry metrics. It’s too soon to give any details or a timeline, but this blog is the place to look for news.

Expect the Android platform to continue to receive attention. Unfortunately the additional burden of maintaining separate products for iOS (Objective-C) and Android (Java) is a major drain on resources, but both platforms now have the MMDSLib core cheminformatics library, which makes the creation of new chemistry apps very rapid, at least relative to starting from scratch. New apps will no doubt appear for both platforms, seemingly spontaneously, with almost no warning!

The Open Drug Discovery Teams (ODDT) app/server combo will continue to evolve, which includes adding support for more rare or neglected diseases, and further fine tuning of the way the user interface allows people to access and contribute to the data. We’re making this up as we go along, so the app mutates fairly quickly. One of the approaches we will be pushing fairly aggressively is increased cheminformatics power on the back end, which means we’ll be looking for more ways to feed in structures, reactions and data, giving us more reason to improve our techniques for dealing with it. We plan on adding some of the expected features of a server that deals with structures, i.e. searching. Once we combine powerful search & retrieval methods, with intelligent classification of activity and property data, and crowd-based curation for filtering out the junk and elevating the gems, we intend to be able to provide a genuinely useful source of structure-activity data for various purposes, including modelling.

Also the way in which ODDT allows content to be released and annotated will be extended from the very bare bones capabilities it has now. We always promised that the project would evolve into a kind of “micropublishing platform”, whereby people can release data in any state of readiness, and have it make its way out into the community. The idea is to supplement the opposite extreme of peer reviewed publications, which take seemingly forever (on the Web 3.0 Twitter timescale!), and tend to cherry pick only the results that sound most impressive. There’s no reason why your second-rate data (“just as good, but not as interesting”) shouldn’t be available too: considering the number of bits the internet makes available for timewasting frivolity, it can spare a few more to cover all the bases.

The current app zoo has some ability to search chemical databases, which includes the MMDS webservice protocol, and integration with other apps like ChemSpider Mobile, SPRESImobile and Mobile Reagents which can take over the process of searching for structures or reactions. There are plans in progress for making it more convenient to search public databases (such as PubChem and ChEBI) from inside a variety of apps – and also to bring in the ability to search the Open Drug Discovery Teams collection, too.

Expect existing collaborations to continue, and if all goes well, new ones to be established. Several are on the drawing board: for example, Cycle Computing is a company that specialises in scalable high performance cloud computing, with some expertise in highly demanding chemical calculations. It turns out that chemistry + cloud + mobile interests give us enough overlap to work together. There are all kinds of other far-fetched ideas, some insane, some unrealistic, and some of which will actually happen in spite of (or because of!) this. One of these includes borrowing from some of the ideas used by the “augmented reality” crowd to visualise chemical data – but that’s not going down as a product prediction, since I haven’t even started with the techno due diligence needed to figure out if this makes any kind of sense, though I am sure it would be pretty cool.

So that’s 2012 and 2013: no doubt I’ve missed plenty, but it should be enough to keep busy for a healthy chunk of it!

 
