BioAssay Express is now open source

11 September 2023 

BioAssay Express is being released as an open source project under the Apache 2.0 license. In short, this license is permissive: essentially the only restriction is acknowledgment of the original work.

BioAssay Express is a grant-funded project to bring semantic web annotations to bioassay protocols, using vocabularies such as the BioAssay Ontology (BAO) to enrich descriptions that are primarily stored as text. Because ontology terms are universal, annotated assays take on standardized meaning and can be processed by machines as effectively as they can be understood by scientists. This is a canonical example of the application of FAIR data principles (F = Findable, A = Accessible, I = Interoperable, R = Reusable).
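To make the idea of machine-processable annotations concrete, here is a minimal sketch in Python. It is not the BioAssay Express data format: the `Annotation` class, the `find_assays` helper, the assay records and the `example.org` URIs are all hypothetical placeholders standing in for real ontology terms such as those in BAO. The point is only that once an assay carries property/value terms alongside its text, a program can select assays by exact term match instead of parsing prose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One semantic annotation: an ontology property paired with an ontology value."""
    property_uri: str
    value_uri: str


# Placeholder URIs (assumptions for illustration, not real BAO identifiers).
EX = "http://example.org/onto#"
ASSAY_FORMAT = EX + "hasAssayFormat"
CELL_BASED = EX + "CellBasedFormat"
BIOCHEMICAL = EX + "BiochemicalFormat"

# Each assay keeps its free-text description plus a set of annotations.
assays = {
    "assay-1": {
        "text": "Inhibition measured in cultured cells ...",
        "annotations": {Annotation(ASSAY_FORMAT, CELL_BASED)},
    },
    "assay-2": {
        "text": "Purified enzyme incubated with compound ...",
        "annotations": {Annotation(ASSAY_FORMAT, BIOCHEMICAL)},
    },
    "assay-3": {
        "text": "Reporter gene readout in a cell line ...",
        "annotations": {Annotation(ASSAY_FORMAT, CELL_BASED)},
    },
}


def find_assays(records, property_uri, value_uri):
    """Return the sorted IDs of assays carrying the given property/value annotation."""
    wanted = Annotation(property_uri, value_uri)
    return sorted(
        assay_id
        for assay_id, record in records.items()
        if wanted in record["annotations"]
    )


if __name__ == "__main__":
    # The query uses terms only; the wording of the text descriptions is irrelevant.
    print(find_assays(assays, ASSAY_FORMAT, CELL_BASED))
```

The two cell-based assays are described in quite different words, yet both are found by the same term lookup, which is the practical benefit of annotating text protocols with shared vocabulary.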

The project formally started at Collaborative Drug Discovery in 2014 with the development of a new natural language/machine learning technique to assign BAO terms to text assay descriptions [Fast and accurate semantic annotation of bioassays exploiting a hybrid of machine learning and user confirmation: A.M. Clark, B.A. Bunin, N.K. Litterman, S.C. Schürer, U. Visser, PeerJ, 524, 2014]. The project continued with the development of the Common Assay Template, which shepherds a multitude of ontologies into a couple dozen categories that capture a large amount of valuable information about bioassays [BioAssay templates for the semantic web: A.M. Clark, N.K. Litterman, J.E. Kranz, P. Gund, K. Gregory, B.A. Bunin, PeerJ Computer Science, 2, e61], and subsequently with a web-based interface at www.bioassayexpress.com. Domain experts have used the web interface to curate ~4000 bioassay protocols from PubChem, which can serve as a gold standard for assay annotation. More recently it has been adopted by the Pistoia Alliance, for which a custom template was designed to annotate assays in greater detail.

As a prototype under active development for multiple years, BioAssay Express was used to pioneer a large number of technologies and ideas, and received a considerable amount of exposure to the drug discovery community. The project has an extensive scope of functionality that falls under the umbrella of creating and using annotated assay information to improve drug design. As with so many collections of experimental technologies, some turned out to be very successful, while others did not solve a problem as well or as urgently as was hoped. What we got from extensive dogfooding and community outreach is a clear understanding of the 20% of the functionality that provided 80% of the value, as well as which important features brought the most challenges.

With this knowledge, we began planning the next commercial phase of the project: adding assay annotation to our flagship product, CDD Vault. CDD Vault is already a sophisticated product used daily by thousands of scientists around the world. The annotation capabilities are now directly accessible within it (see the July Technical Release Notes: https://www.collaborativedrug.com/cdd-blog/cdd-vault-update-july-2023), so the new ontology-driven annotation functionality meshes seamlessly with existing functionality for describing and analyzing protocols.

For the foreseeable future, we will continue to host the bioassayexpress.com domain, allowing anyone to explore annotated assay content or join in on the curation process. For anybody who wants to look under the hood, the whole project is now available to the public on GitHub [https://github.com/cdd/bioassay-express]. The source code and constituent data resources can be browsed or downloaded. Anyone who wishes to can set up the web application on a server and use it for their own assays, or create a quick Docker container to explore the functionality locally.

By design, all of the important data structures, such as ontologies, templates and protocol annotations, share the same format between the open source software and the commercial CDD Vault product. This is in line with our desire to encourage an ecosystem to develop around these FAIR principles.
