As part of spring cleaning for the new decade of 2020, the website for Molecular Materials Informatics has been re-done. The original plan for the site in 2010 was to make it information rich, but a lot of it had gotten stale over the years: products appear, disappear and change. Now it’s more of a placeholder with links to projects that have their own sites wherever possible, e.g. documentation and open source projects are hosted on GitHub.
It’s been a while since I’ve posted anything, but not for lack of activity in the world of sciencey informatics. Next week I’ll be at the BioIT World FAIR Data Hackathon in Boston, along with several other members of the Research Informatics team of Collaborative Drug Discovery. Right now we’re tooling up a customised instance of the BioAssay Express (the most up-to-date standard version can be found here) so that we can deploy several different proposed templates for evaluating whether a published article abides by FAIR principles. The plan is to evaluate as many articles as we can, and produce a scoresheet at the end of the day. I don’t know what the answer will be, and it will be interesting to find out!
One of the pitfalls of using multiple public ontologies is that sometimes there are two teams doing great work that overlaps, but neither is a superset of the other. This has come up for the BioAssay Express project, which uses both the Cell Line Ontology and BRENDA cells & tissues.
A while back I described the idea of bond artifacts, which are layered on top of a core cheminformatics representation to give the rendering engine the hints it needs to make the visual diagram look like what chemists want to see (without breaking the underlying machine readability). Now this enhancement has been added to the open source WebMolKit framework and the derived SketchEl2 drawing app. Furthermore, the artifacts can survive a round-trip encoding with the industry-standard Molfile CTAB format.
One of the things I’ve been investigating lately is the open access segment of PubMed, which is a rather massive collection of open access medicine-relevant publications, with accompanying full text. I’ve also been looking at the ChEMBL database, which is focused on structure-activity data traceable back to the original literature document from which each datapoint was curated. This is all for the purpose of advancing the BioAssay Express mission of making the world’s bioassay protocols machine readable (aka FAIR).
As of now, there’s a KNIME plugin that can be used to access data from the BioAssay Express. The plugin uses the existing API functionality that can grab all of the available bioassay protocols, or a subset as defined by a query, and bring them into the KNIME ecosystem as a table which can be processed using the multitude of other node types.
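To give a rough feel for what "bringing assays in as a table" involves, here is a minimal Python sketch of the general idea: take a list of annotated assay records, optionally filter them by a query, and flatten them into columns and rows. The record layout, the field names (`assayID`, `annotations`, `property`, `value`) and the helper functions are all made up for illustration; this is not the BioAssay Express API or the plugin's actual code.

```python
from collections import defaultdict

# Hypothetical assay records: these field names are illustrative
# assumptions, not the BioAssay Express API schema.
ASSAYS = [
    {"assayID": 1, "annotations": [
        {"property": "assay format", "value": "cell-based"},
        {"property": "organism", "value": "Homo sapiens"},
    ]},
    {"assayID": 2, "annotations": [
        {"property": "assay format", "value": "biochemical"},
    ]},
]


def matches(assay, query):
    """Return True if the assay carries every property/value pair in the query."""
    pairs = {(a["property"], a["value"]) for a in assay["annotations"]}
    return all((prop, val) in pairs for prop, val in query.items())


def to_table(assays, query=None):
    """Flatten assays into (columns, rows), with one row per assay."""
    selected = [a for a in assays if not query or matches(a, query)]

    # One column per distinct property, in order of first appearance.
    columns = ["assayID"]
    for assay in selected:
        for ann in assay["annotations"]:
            if ann["property"] not in columns:
                columns.append(ann["property"])

    rows = []
    for assay in selected:
        values = defaultdict(list)
        for ann in assay["annotations"]:
            values[ann["property"]].append(ann["value"])
        # Multiple values for one property are joined; missing ones are blank.
        rows.append([assay["assayID"]] + ["; ".join(values[c]) for c in columns[1:]])
    return columns, rows


columns, rows = to_table(ASSAYS)
```

Calling `to_table(ASSAYS, {"assay format": "biochemical"})` would keep only the second record, which is the "subset as defined by a query" case. Once the data is in a plain rows-and-columns shape like this, any table-oriented tool can take it from there.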