Big Data, Data Prep

Exploring reaction data for machine learning: synthesis planning and reaction predictions

Prof Alexei Lapkin, Department of Chemical Engineering and Biotechnology, University of Cambridge

Data is critical for machine learning, but in chemistry data is scarce and must be prepared before it can enter machine learning workflows. In the SRE group we have approached this problem from several directions: evaluating how to use large datasets of available reaction data (Reaxys); looking into data recording standards (an extension of InChI to record process data, and the UDM standard); and preparing data for ML pipelines, starting from the Open Reaction Database (ORD) schema as well as from Reaxys. We then productised some of these tools via our start-up company, Chemical Data Intelligence (CDI) Pte Ltd. In the talk I’ll discuss the academic work on this topic and the route to commercialisation.
