Tox21 toxicity measurements and PolyPharma

The PolyPharma app is currently getting some major new content: a set of new models for toxicity. The models are derived from measurements made on the EPA’s Tox21 collection, recently published in Nature and released via PubChem.

This nice little development lent itself immediately to the treatment that was recently applied to the ChEMBL dataset, which involved chopping up all of the different target groups and selecting categories for merging into model-ready source collections. The ChEMBL data covered more than 1800 targets and off-targets, of which a selection was picked for inclusion in the PolyPharma app. The EPA Tox21 measurements, on the other hand, make up a much smaller number of targets (29, to be exact), and these can be pulled down out of PubChem with a relatively simple script. The measurements have already been classed as either active (i.e. bad) or inactive, which makes life easier: there is no need to figure out a threshold before feeding them into a Bayesian model.
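The script used for the download is not shown here, so the following is only a minimal sketch of what the labelling step could look like. It assumes the PubChem PUG REST pattern for fetching an assay's data table as CSV, and assumes that the table carries `PUBCHEM_CID` and `PUBCHEM_ACTIVITY_OUTCOME` columns with textual outcomes; the function names are hypothetical, and the rule that a compound counts as active if any of its rows is active is an illustrative choice rather than the author's.

```python
import csv
import io

# Assumed PUG REST pattern for one assay's data table in CSV form.
PUG_ASSAY_CSV = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/assay/aid/{aid}/CSV"


def assay_csv_url(aid):
    """Build the (assumed) download URL for a PubChem assay ID."""
    return PUG_ASSAY_CSV.format(aid=aid)


def parse_outcomes(csv_text):
    """Map CID -> True (active) or False (inactive).

    Rows with any other outcome (e.g. inconclusive) or without a numeric
    CID are skipped. If a CID occurs more than once, a single active row
    marks it as active (an assumption made for this sketch).
    """
    labels = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        cid = (row.get("PUBCHEM_CID") or "").strip()
        outcome = (row.get("PUBCHEM_ACTIVITY_OUTCOME") or "").strip().lower()
        if not cid.isdigit() or outcome not in ("active", "inactive"):
            continue
        key = int(cid)
        labels[key] = labels.get(key, False) or outcome == "active"
    return labels
```

The text returned from `assay_csv_url(...)` by any HTTP client can be handed straight to `parse_outcomes`; the result is a ready-made two-class training set with no threshold to pick.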

Generally speaking, the structures that went into the dataset were of high quality, though a number of corrections were still necessary: washing out salts and adducts, fixing a small number of egregiously impossible molecules, and sorting out quite a lot of inorganics/organometallics that take an inconsistent approach to non-boring bond types, some of which cannot be represented using normal Molfiles anyway. These had to be fixed manually, which was quite labour intensive.
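Most of that cleanup needs a human eye, but the simplest part, washing out counterions, can be roughed out automatically. The sketch below is not the procedure used for this dataset: it assumes the structures are available as SMILES strings (the post itself only mentions Molfiles), keeps the fragment with the most heavy atoms, and uses a deliberately crude atom counter that only understands bracket atoms and the organic subset. It does not neutralise charges, and it would mishandle the rare SMILES whose ring closures span a dot.

```python
import re

# Bracket atoms such as [Na+] or [nH], then two-letter halogens,
# then single-letter organic-subset atoms (aliphatic and aromatic).
ATOM_PATTERN = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")


def heavy_atom_count(fragment):
    """Crude heavy-atom count for one SMILES fragment (explicit H ignored)."""
    tokens = ATOM_PATTERN.findall(fragment)
    return sum(1 for tok in tokens if tok not in ("[H]", "[H+]"))


def strip_salts(smiles):
    """Keep the largest dot-separated fragment; ties go to the first one."""
    fragments = [frag for frag in smiles.split(".") if frag]
    if not fragments:
        return smiles
    return max(fragments, key=heavy_atom_count)
```

For example, `strip_salts("CC(=O)[O-].[Na+]")` keeps the acetate fragment and drops the sodium. Organometallics and the impossible structures mentioned above are exactly the cases where a rule this simple gives the wrong answer, which is why manual review was still needed.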

Long story short, using ECFP6 fingerprints to generate Bayesian models resulted in quite agreeable statistics:

Target                     ROC                 Actives/Size
ATAD5                      0.8955282143671504  372/9819
DT40 Rev3                  0.8358695790539805  2470/8564
DT40 WT                    0.8481449024919616  2489/8964
P53-bla                    0.8671812374075394  659/9403
ARE-bla                    0.8199006118936651  1131/7322
HSE-bla                    0.7807963830271731  497/8376
Aromatase                  0.8564348950930594  378/7846
mitochondria toxicity      0.8975806303770444  1229/7958
AhR-luc                    0.897262258206716   1058/8878
AR-bla agonist             0.8614046773463251  332/9299
AR-bla antagonist          0.8637624381527989  647/8316
AR-MDA-luc agonist         0.7773459658086779  414/10133
AR-MDA-luc antagonist      0.8322296191717147  468/8269
ER-BG1-luc agonist         0.7402058726456726  1017/8340
ER-BG1-luc antagonist      0.868486076379126   472/8666
ER-bla agonist             0.8224802258757133  560/9545
ER-bla antagonist          0.8269233837880261  432/8297
FXR-bla agonist            0.7711240470264435  118/8674
FXR-bla antagonist         0.8486710970456315  255/7833
GR-bla agonist             0.8291952546091927  211/9256
GR-bla antagonist          0.8502523415129595  452/8094
PPAR-delta-bla agonist     0.8070517362777032  112/8274
PPAR-delta-bla antagonist  0.7698014607724666  92/7937
PPAR-gamma-bla agonist     0.8513509469037785  277/8898
PPAR-gamma-bla antagonist  0.8354619508727978  428/7610
TR-beta-luc agonist        0.7290186160244048  64/9422
TR-beta-luc antagonist     0.826791127196751   419/7391
VDR-bla agonist            0.6806460193156809  22/8437
VDR-bla antagonist         0.811523182164013   83/7699
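For readers who want to see how numbers like these come about, here is a minimal sketch of a Laplacian-corrected naive Bayesian model over fingerprint features, plus a ROC area calculation. Several things are assumptions: each molecule is represented as a set of integer feature hashes (generating real ECFP6 fingerprints needs a cheminformatics toolkit and is not shown), the function names are made up for illustration, and the exact Bayesian variant and validation scheme behind the table are not stated in this post.

```python
import math
from collections import Counter


def train_laplacian_bayes(fingerprints, labels):
    """Return a dict of per-feature log contributions.

    For a feature seen in n molecules, a of them active, with overall
    active fraction p, the contribution is log((a + 1) / (n * p + 1)).
    """
    base_rate = sum(1 for y in labels if y) / len(labels)
    seen = Counter()
    seen_active = Counter()
    for fp, active in zip(fingerprints, labels):
        for bit in set(fp):
            seen[bit] += 1
            if active:
                seen_active[bit] += 1
    return {
        bit: math.log((seen_active[bit] + 1) / (n * base_rate + 1))
        for bit, n in seen.items()
    }


def predict(model, fp):
    """Score a molecule by summing contributions of its known features."""
    return sum(model.get(bit, 0.0) for bit in set(fp))


def roc_auc(scores, labels):
    """Chance that a random active outscores a random inactive (ties = 0.5).

    Quadratic in dataset size, which is fine for a sketch; a rank-based
    version would be preferable for thousands of compounds.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A ROC value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of actives from inactives. Scoring the same molecules that were used for training would overstate performance, so in practice the scores passed to `roc_auc` should come from held-out or cross-validated predictions.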

So far so good. The next step is to actually do something with these models, just as we have been doing with the ChEMBL extracts. Currently this follows two pathways: one is creating detailed reports that can be perused and scrolled through when applied to individual molecules, such as discontinued drugs, and the other is adding the functionality to PolyPharma, so that anyone with an iThing can try it out.

All of the targets have been added to the mix, and there is now a default Profile called “Toxicity” which focuses entirely on these newly added models:

[Screenshots: toxpolypharma01, toxpolypharma02, toxpolypharma03, toxpolypharma04]

The updated version of the app hasn’t been submitted to the App Store just yet, but it will be shortly, after a bit more testing. Stay tuned for the next version!
