Panel of Bayesian screening for BIA 10-2474

The talk around town in drug discovery right now seems to be focused on BIA 10-2474, which my frequent collaborator Sean Ekins has weighed in on at the Collaborative Chemistry blog. In a spur-of-the-moment effort to see if we could use some of our work-in-progress technologies to learn something about what’s going on, we ran it through a series of roughly 1800 Bayesian models that we extracted from ChEMBL. For a detailed view, check out the report on molmatinf.com. The file is close to 20MB, so be patient if you’re on a slow connection.

The background of this work is a data mining exercise, which starts by reorganising the hierarchical fields within ChEMBL to make the data suitable for feeding into a model (see corresponding literature reference). As a followup to the automated extraction and model building, we put together a script that takes a list of molecules (using a selection of “discontinued drugs” to start with) and generates a lengthy report for each one by running it through all 1829 Bayesian models. These correspond to a diverse variety of targets, some desirable, others not, with some redundancy with regard to different organisms (human/mouse/rat/etc.) and different kinds of measurements. The report starts with a list of the models ranked by how strongly each one predicts the molecule to be active, and then for the more promising cases gives a more detailed view, which shows the model’s ROC plot, the atom-coloured Bayesian prediction, and a Honeycomb cluster of similar compounds from the dataset. The latter is intended to provide a reality check: nobody should ever place high trust in a number that came out of a model without first digging into the details.
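To make the shape of that report a little more concrete, here is a minimal sketch in Python of the "score against every model, rank, then pick the promising ones for a closer look" step. It is not our actual script: the `ToyModel` class, the made-up feature names, the additive scoring and the `detail_cutoff` value are all assumptions invented for illustration, and the real models are built from ChEMBL data rather than hand-written weights.

```python
from dataclasses import dataclass


@dataclass
class ToyModel:
    """Hypothetical stand-in for one Bayesian model in the panel."""
    target: str
    weights: dict  # feature name -> contribution (assumed additive)

    def score(self, features):
        # Sum the contributions of the features present in the molecule;
        # features the model has never seen contribute nothing.
        return sum(self.weights.get(f, 0.0) for f in features)


def rank_panel(features, models, detail_cutoff=0.5):
    """Score one molecule against every model and rank the results.

    Returns the full ranked list of (target, score) pairs, best first,
    plus the targets that score high enough to deserve a detailed view.
    The cutoff is an arbitrary illustrative value.
    """
    ranked = sorted(
        ((m.target, m.score(features)) for m in models),
        key=lambda pair: pair[1],
        reverse=True,
    )
    detailed = [target for target, score in ranked if score >= detail_cutoff]
    return ranked, detailed


# Invented panel: two organism variants of one target, plus another target.
models = [
    ToyModel("target A (human)", {"f1": 0.9, "f2": -0.2}),
    ToyModel("target A (rat)", {"f1": 0.4}),
    ToyModel("target B", {"f3": 1.2}),
]

# Invented molecule, described only by the features it contains.
ranked, detailed = rank_panel({"f1", "f2"}, models)
for target, score in ranked:
    print(f"{target}: {score:.2f}")
```

In this toy panel only the first model clears the cutoff, so it is the only one that would get the detailed treatment; everything else stays in the ranked list as context.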

These reports (including the one we just made available) are somewhat unwieldy, but there’s quite a bit of information in there, some of which may be useful. In fact, the static web page with embedded graphics is the direct precursor to a friendlier implementation: the PolyPharma app, which you can check out on iTunes. It’s free, and nicely interactive, though we didn’t quite manage to squeeze all of the ChEMBL models in there.
