Scaffold searching using the SAR Table app

sarsearch1

The third app to implement a front end for the new general search capabilities is SAR Table. This one works a bit differently, because instead of offering the standard name/structure/substructure/similarity quartet, it offers a fifth type: searching by scaffold template.

To initiate a search, first create a new row in the table, then draw or select a scaffold, as shown to the right. In this tyrosine kinase collection, there are five R-groups, each of which is left undefined in the row and is appropriately decorated on the scaffold.

Selecting the search action brings up a confirmation request, then sends the query out to the server, which currently uses PubChem and ChEBI as its source material. Once the search is complete, the results are shown in a list:

sarsearch5

The result list looks very much like the implementation in MMDS, but the underlying content carries some key extra markup. The giveaway in the screenshot above is that all of the structures are displayed with scaffold highlights and orientation… the same as the query. This is not a coincidence, as becomes more evident after selecting some of them and importing them into the table:

sarsearch3

In this screenshot, the four selected structures have been added as new rows, each decomposed into scaffold/substituents/whole construct, which is how things work in the SAR Table app. This is possible because the search aggregator, which runs on molsync.com, has already analysed the structures and broken them up into scaffold + substituents. The app presents these results for perusal after gluing them back together with its special-case depiction algorithm.

Digging deeper into the method: the middleware chops the “R-group” decorations off the scaffold query before sending it to the public search engines (PubChem, ChEBI). As the substructure matches come in, each result is submitted to the middleware’s higher-level scaffold-substructure matching system, which checks the legitimacy of the defined R-groups, removes invalid implied symmetry, and expands out the multiple non-degenerate ways to define R-groups (using the same algorithms as the scaffold matching feature). By the time the post-filtered results get back to the app, they carry a lot more markup.
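To make the first step concrete, here is a minimal sketch of stripping R-group placeholders from a scaffold query before it goes out as a plain substructure search. It is not the middleware's actual code: the `Mol` class, the `strip_rgroups` function and the toy atom/bond representation are all hypothetical, made up for illustration, and it assumes each R-group is attached to exactly one scaffold atom.

```python
import re
from dataclasses import dataclass

# Hypothetical convention: R-group placeholders are atoms labelled "R1", "R2", ...
R_LABEL = re.compile(r"^R(\d+)$")


@dataclass
class Mol:
    """Toy molecule: atom labels plus bonds as pairs of 0-based atom indices."""
    atoms: list
    bonds: list


def strip_rgroups(query):
    """Return (bare scaffold, {R-group number: attachment atom index}).

    Attachment indices refer to the renumbered atoms of the bare scaffold.
    Assumes each R-group placeholder is bonded to a single scaffold atom.
    """
    # Keep every atom that is not an R-group placeholder, and renumber them.
    keep = [i for i, label in enumerate(query.atoms) if not R_LABEL.match(label)]
    renumber = {old: new for new, old in enumerate(keep)}

    attachments = {}
    bonds = []
    for a, b in query.bonds:
        if a in renumber and b in renumber:
            # Scaffold-to-scaffold bond: keep it, with renumbered indices.
            bonds.append((renumber[a], renumber[b]))
            continue
        # Otherwise one end may be an R-group: remember where it was attached.
        for r_idx, other in ((a, b), (b, a)):
            match = R_LABEL.match(query.atoms[r_idx])
            if match and other in renumber:
                attachments[int(match.group(1))] = renumber[other]

    return Mol([query.atoms[i] for i in keep], bonds), attachments


# Usage: a three-atom chain with R1 on the first atom and R2 on the last.
query = Mol(atoms=["C", "C", "N", "R1", "R2"],
            bonds=[(0, 1), (1, 2), (0, 3), (2, 4)])
scaffold, attachments = strip_rgroups(query)
```

The bare `scaffold` is what a public substructure search would see, while the `attachments` map is the kind of information that has to be kept aside so the R-group positions can be checked against each hit afterwards.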

If the row used to formulate the query already has any of its substituents defined, these become part of the query: setting R1=phenyl, for example, acts as a further constraint on the results. The search also makes sure not to include any of the compounds already in the table, which means that a query can be described as asking the question “What am I missing?”. It is an effective way to use public databases to supplement your structure-activity series: if you are making and testing compounds, and there are known compounds that belong in the series that could be prepared from a recipe, or purchased, or better yet already measured against the same target, then that’s probably something you need to know.
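The two filtering rules described here can be sketched in a few lines. Again this is only an illustration under stated assumptions: the `filter_hits` function, the dictionary layout of a hit, and the idea of comparing compounds by a canonical string `key` are all hypothetical, not how the app or the middleware actually store things.

```python
def filter_hits(hits, constraints, already_in_table):
    """Keep hits that satisfy the predefined substituents and are new to the table.

    hits             -- list of dicts with a "key" (hypothetical canonical
                        identifier) and "substituents" ({R-group number: fragment})
    constraints      -- {R-group number: required fragment} from the query row
    already_in_table -- keys of the compounds that the table already contains
    """
    seen = set(already_in_table)
    kept = []
    for hit in hits:
        if hit["key"] in seen:
            continue  # already in the table, or a duplicate of an earlier hit
        substituents = hit["substituents"]
        # Every substituent defined in the query row must match exactly.
        if all(substituents.get(r) == wanted for r, wanted in constraints.items()):
            kept.append(hit)
            seen.add(hit["key"])
    return kept


# Usage: R1 is constrained to phenyl, and compound "C" is already in the table.
hits = [
    {"key": "A", "substituents": {1: "phenyl", 2: "H"}},
    {"key": "B", "substituents": {1: "methyl", 2: "H"}},
    {"key": "C", "substituents": {1: "phenyl", 2: "Cl"}},
    {"key": "A", "substituents": {1: "phenyl", 2: "H"}},
]
missing = filter_hits(hits, constraints={1: "phenyl"}, already_in_table={"C"})
```

In this toy run only compound "A" survives, once: "B" fails the R1 constraint, "C" is already in the table, and the second "A" is a duplicate.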

The new version should be submitted to the app store soon, after a bit more polish & testing.
