Machine Learning
Scaling up Gaussian processes with stochastic gradient descent
Prof Jose Miguel Hernandez Lobato, Department of Engineering, University of Cambridge
Gaussian processes are a powerful framework for quantifying uncertainty and for sequential decision-making but are limited by the requirement of solving linear systems. In general, this has a cubic cost in dataset size and is sensitive to conditioning.
We explore stochastic gradient algorithms as a computationally efficient method of approximately solving these linear systems. Experimentally, stochastic gradient descent achieves state-of-the-art performance on large-scale regression tasks. Its uncertainty estimates match the performance of significantly more expensive baselines on a large-scale Bayesian optimization task. On a molecular binding affinity prediction task, our method places Gaussian processes with a Tanimoto kernel on par with state-of-the-art graph neural networks.
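As a rough illustration of the idea, the sketch below solves the linear system behind a Gaussian process posterior mean, (K + σ²I)v = y, with a simple stochastic gradient method and compares the result with a direct solve. It is a minimal sketch under stated assumptions, not the speaker's algorithm: the RBF kernel, the toy 1-D data, the random row-subsampling gradient estimator, the step size, and the names `rbf_kernel` and `sgd_solve` are all illustrative choices.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.5):
    """Squared-exponential kernel between two 1-D input arrays (assumed kernel)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def sgd_solve(A, y, batch_size=10, num_steps=20000, seed=0):
    """Approximately solve A v = y by minimising 0.5 v^T A v - v^T y.

    Each step samples a random subset of rows and updates only those
    coordinates, using the corresponding entries of the gradient A v - y.
    This is one simple stochastic estimator, chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    v = np.zeros(n)
    # Conservative step size: the largest eigenvalue of any sampled
    # batch_size x batch_size block is at most batch_size * max(diag(A)).
    step = 1.0 / (batch_size * A.diagonal().max())
    for _ in range(num_steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        v[idx] -= step * (A[idx] @ v - y[idx])
    return v

# Toy regression data (hypothetical, for demonstration only).
rng = np.random.default_rng(1)
noise_var = 0.25
x = np.sort(rng.uniform(-3.0, 3.0, size=100))
y = np.sin(2.0 * x) + np.sqrt(noise_var) * rng.standard_normal(100)

A = rbf_kernel(x, x) + noise_var * np.eye(len(x))
v_sgd = sgd_solve(A, y)
v_exact = np.linalg.solve(A, y)  # direct solve, cubic cost in dataset size

# Posterior mean at a few test inputs, using each set of weights.
x_test = np.linspace(-3.0, 3.0, 5)
mean_sgd = rbf_kernel(x_test, x) @ v_sgd
mean_exact = rbf_kernel(x_test, x) @ v_exact
print(np.max(np.abs(mean_sgd - mean_exact)))
```

Each iteration touches only `batch_size` rows of the kernel matrix, which is the property that makes this family of methods attractive when the full matrix is too large to factorise.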
Deploying machine learning in chemical industry
Dr Ben Pellegrini, Intellegens Ltd
How do we separate AI hype from reality?
This talk focuses on the use of machine learning (ML) for data analysis and optimising experiments, products, and processes in research-intensive industries. The focus is on practical applications, proven through real case studies, and on the need for continuous improvement to remove barriers to adoption for the technology.
We will discuss how this involves not just the right technology, for example, ML methods that can be trained on real, sparse and noisy datasets, but also a tight focus on human factors. ML must be seen as a tool that complements rather than replaces human expertise. This means thinking about issues such as how to formulate the right questions to ask the ML, time constraints on its use, how to pilot it successfully in deployments, and investment in optimising user experience. It is also important to address issues of ‘trustworthiness’ through ‘explainable AI’ tools that reduce the ‘black box’ nature of the technology, and through uncertainty quantification, so that users understand the reliability of the information on which they base decisions. The discussion will also touch on the in-house versus external software debate, suggesting a balanced approach for optimal results.
Case studies will show how, once some or all of these factors are considered, ML has been deployed to reduce experimental workloads (typically by 50-80%), enhance data insights, and enable the design of improved products and processes.