Artificial intelligence and machine learning are changing chemistry. Today, the chemical literature contains such a huge volume of information that it is impossible for chemists to know everything that’s been done before. However, AI and ML tools can harness all of that knowledge and enable chemists to apply it to the problem at hand. When combined with data generated in-house, AI and ML provide a powerful tool for doing chemistry more effectively.

It’s not about replacing the lab chemist, but rather supplementing them: helping them make better decisions about how to make or formulate molecules, and make those decisions more quickly. Fundamentally, time that the chemist might otherwise be spending researching the literature can be spent instead on discovery and innovation, which can accelerate R&D, reduce costs and make chemistry more sustainable.

Source: © deepmatter®

AI and machine learning can aid chemists’ decision-making and help cut costs, accelerate innovation and make chemistry more sustainable

This is what underpins the SmartChemistry® portfolio from deepmatter®. SmartChemistry® is an integrated suite of products covering capabilities in reaction monitoring, synthesis design and optimisation that allows chemists to harness the full potential of digital chemistry technologies.

deepmatter® has its roots in a 2013 spin-out from the University of Glasgow, UK. The company initially focused on providing real-time capture of experimental data, according to chief product officer Glyn Williams. Since then, route planning and prediction tools have been added, and the company’s portfolio was expanded again in July 2022 with the acquisition of Chemintelligence, which brought optimisation tools, in particular for formulation.

‘There is a lot of information in the public domain, and with tools to scrape that information, sort it and improve the quality, the resulting data can offer an important knowledge base for chemistry,’ Williams says. ‘If the algorithm enables you to do an experiment once and get it right first time, there is the potential to save considerable time and money.’

However, these tools are only as good as the datasets they use, and not everything in the literature is reliable. The strength of SmartChemistry® lies in providing access to robust data, using curated datasets that are constantly being expanded and updated. ‘There needs to be cleansing and categorising,’ Williams says. ‘With anything predictive, you face the “garbage in, garbage out” paradigm. This is why our datasets are so carefully curated.’

The literature sources are supplemented with proprietary in-house data sourced from companies using the products. ‘A pharmaceutical company, for example, has a lot of valuable data in its own databases, and this may not be made public until it is included in a patent application,’ Williams says. ‘By integrating these data, the value of the predictive algorithms can be greatly increased.’

Capturing experimental data

DigitalGlassware® is where deepmatter® started. As well as digitising the recipe for a reaction, the software helps a chemist capture more and better data from their experiments. These data can be used to make predictions for future experiments and, ultimately, improve reproducibility and reliability in the lab.

‘One aspect it addresses is the lack of data structure around what happens in the flask,’ says product manager Robbie Warringham. ‘The traditional approach is to perform an experiment, write up the results, and then disseminate them. But the journey in terms of the chemist’s actions, and how they impact the chemistry’s outcome, are not well captured.’ DigitalGlassware® within SmartChemistry® provides a way to capture that reality with minimal effort – it runs in the background, automatically capturing and analysing data from sensors that monitor the reaction to gather the information that might otherwise be missed.

‘The result is a dataset that can be leveraged by data scientists and AI/ML to help improve outcomes,’ Warringham says. ‘That could be reproducibility, sustainability and ultimately predictive elements as well. By building up the dataset, we can start to predict what future outcomes will be, based on the captured data.’

Natural language processing is used to convert a written procedure into a structured language that allows key actions and events during the reaction to be identified. These events are then correlated with data captured in real time from sensors that measure parameters such as temperature, pH or even colour. This ensures an accurate record of how a reaction is run, and it can give important information about the reaction itself: Warringham cites an example where the literature stated a 12-hour reaction time, but the sensors indicated the reaction had gone to completion after just 20 minutes. Similarly, it can detect the moment something goes wrong, and prompt the chemist to abort the experiment rather than waste their time waiting for a failed reaction.
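As a rough illustration of the idea, the sketch below lines up timestamped procedure steps with sensor readings and flags the point at which a signal stops changing. It is not DigitalGlassware® code: the `Reading` class, the `plateau_minute` and `events_before` helpers, the window and tolerance values and the data are all hypothetical, and a flat signal is only a crude stand-in for "reaction complete".

```python
from dataclasses import dataclass


@dataclass
class Reading:
    minute: float  # time since the start of the reaction
    value: float   # e.g. temperature, pH or a colour channel


def plateau_minute(readings, window=3, tolerance=0.05):
    """Return the minute of the first reading in a run of `window`
    consecutive readings that stay within `tolerance` of each other,
    or None if the signal never settles."""
    for i in range(len(readings) - window + 1):
        chunk = [r.value for r in readings[i:i + window]]
        if max(chunk) - min(chunk) <= tolerance:
            return readings[i].minute
    return None


def events_before(events, minute):
    """Return the procedure steps recorded up to and including `minute`."""
    return [action for t, action in events if t <= minute]


# Illustrative data only: (minute, action) pairs and pH readings.
events = [(0, "add reagent"), (2, "start stirring"), (30, "sample for analysis")]
readings = [
    Reading(0, 7.00), Reading(5, 6.20), Reading(10, 5.10), Reading(15, 4.40),
    Reading(20, 4.02), Reading(25, 4.00), Reading(30, 4.01), Reading(35, 4.00),
]

settled = plateau_minute(readings)
steps_done = events_before(events, settled)
```

With this toy data the signal settles at the 20-minute reading, before the scheduled sampling step. A real system would need to guard against signals that are flat before the reaction has started, and would combine several sensors rather than one.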

Ultimately, the aim is to integrate DigitalGlassware® seamlessly into the chemist’s entire ecosystem, connecting to all relevant devices, instruments and sensors, collecting their data and structuring it properly. ‘We could then start to implement more automated applications, including robotics,’ Warringham says.

Designing a synthesis

Another of the company’s SmartChemistry® tools, ICSYNTH, carries out synthesis design. A discovery chemist might be looking for ways to make a completely new molecule, for example, or a process development chemist could be on the hunt for a more practical route to make a molecule on a larger scale. These can be arduous tasks for a human chemist, with success often dependent on an individual’s familiarity with the literature and the scope of their experience.

Source: © deepmatter®

Designing a synthesis can be laborious, but with SmartChemistry® a chemist can find routes to new molecules in a fraction of the time 

‘A chemist might spend hours searching the literature for ideas, but with a few clicks the software can suggest new routes in a couple of minutes,’ says product manager Julie Gai. ‘The tool doesn’t just show what’s already in the literature: it uses machine learning to learn the chemical rules, and then propose new ones that haven’t been published and you might not have thought of.’ It can also focus in more closely on, say, a specific part of the molecule or an individual disconnection. 

Importantly, the software can blend its database of synthetic routes from the published literature – all carefully curated and cleansed by SmartChemistry® – with data from experiments run in-house. It can also suggest starting materials and reagents that are commercially available, and its sustainability score can be used to select greener synthetic routes. It can even factor the cost of materials into the decision-making process and suggest improvements such as greener solvents. The chemist can customise the search further by defining filters or constraints, such as solvents to avoid or a maximum number of steps.
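To make the filtering and ranking concrete, here is a minimal sketch of how candidate routes might be screened against constraints and then ordered by greenness and cost. It does not reflect how ICSYNTH works internally: the `Route` fields, the `select_routes` function, the scoring scale and the example routes are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    steps: int
    solvents: frozenset
    cost: float            # hypothetical cost of materials
    sustainability: float  # hypothetical score, higher is greener


def select_routes(routes, max_steps, avoid_solvents):
    """Drop routes that break a constraint, then rank the rest:
    greenest first, cheapest first among equally green routes."""
    allowed = [
        r for r in routes
        if r.steps <= max_steps and not (r.solvents & avoid_solvents)
    ]
    return sorted(allowed, key=lambda r: (-r.sustainability, r.cost))


# Illustrative candidates only.
routes = [
    Route("A", 3, frozenset({"ethanol", "water"}), 120.0, 0.8),
    Route("B", 2, frozenset({"dichloromethane"}), 60.0, 0.4),
    Route("C", 6, frozenset({"water"}), 90.0, 0.9),
    Route("D", 4, frozenset({"ethyl acetate"}), 80.0, 0.8),
]

ranked = select_routes(
    routes, max_steps=5, avoid_solvents=frozenset({"dichloromethane"})
)
ranked_names = [r.name for r in ranked]
```

Here route B is rejected for its solvent and route C for its length, leaving D ahead of A because the two are equally green and D is cheaper. Treating constraints as hard filters and scores as a sort key is only one possible design; a weighted combination of cost and sustainability would be another.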

As well as suggesting synthetic routes, the software can predict any side-reactions that could occur, and which impurities might be present. ‘A side-reaction might make something that is highly toxic, and so that route might be best avoided,’ Gai says. ‘This can help prevent serious problems occurring later on.’

Exploring chemical space

The latest addition to the SmartChemistry® portfolio, ChemAssistant®, supports two types of chemist, says Thomas Galeandro-Diamant, the company’s director of products and services. ‘The first is formulation chemists, working on very diverse products such as pharmaceuticals, cosmetics, coatings, inks, plastics, composites and even construction chemicals,’ he says. ‘The second is synthetic chemists, particularly in process development.’

A formulation chemist might want to create a composite material with specific properties, such as resistance to heat, corrosion and mechanical damage, using ingredients taken from a small selection. The software will propose an experimental design to explore the relevant chemical space. The chemist will then prepare the materials, test their properties and feed the results back into the software, which will suggest the next experiments. ‘This is AI, but using a small, highly curated data set, which is gradually built up,’ Galeandro-Diamant says. It works in a similar way for a process chemist, except they will be optimising the conditions for a chemical reaction.
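The propose, measure and feed-back cycle can be sketched in a few lines. The example below is a deliberately naive explore-and-exploit loop, not ChemAssistant®'s algorithm, which the article does not describe: `propose`, `run_campaign`, the batch size, the exploration rate and the `toy_yield` function standing in for the laboratory measurement are all assumptions.

```python
import random


def propose(history, bounds, rng, n=4, explore=0.3):
    """Suggest n new conditions. With no data yet, sample at random;
    otherwise mostly perturb the best result so far, with an occasional
    random point so that the search keeps exploring."""
    if not history:
        return [tuple(rng.uniform(lo, hi) for lo, hi in bounds) for _ in range(n)]
    best, _ = max(history, key=lambda item: item[1])
    batch = []
    for _ in range(n):
        if rng.random() < explore:
            point = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        else:
            # Small Gaussian step around the best point, clipped to the bounds.
            point = tuple(
                min(hi, max(lo, x + rng.gauss(0, 0.1 * (hi - lo))))
                for x, (lo, hi) in zip(best, bounds)
            )
        batch.append(point)
    return batch


def run_campaign(measure, bounds, rounds=10, seed=0):
    """Alternate between proposing a batch and recording the measured
    results, then return the best (conditions, result) pair and the log."""
    rng = random.Random(seed)
    history = []
    for _ in range(rounds):
        for point in propose(history, bounds, rng):
            history.append((point, measure(point)))
    return max(history, key=lambda item: item[1]), history


def toy_yield(point):
    """Stand-in for a lab measurement: peaks at 60 degrees C and pH 7."""
    temperature, ph = point
    return -((temperature - 60) ** 2) - (ph - 7) ** 2


bounds = [(20, 100), (1, 14)]  # hypothetical temperature and pH ranges
best, history = run_campaign(toy_yield, bounds)
```

In practice `measure` would be the chemist running the experiment and entering the result, and the proposal step would more plausibly rest on statistical design of experiments or a fitted model than on random perturbation. The point of the sketch is only the shape of the loop: every result, good or bad, goes back into the history that drives the next batch.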

The potential time savings are obvious. ‘If there are 10 or 15 parameters to consider, finding the right combination manually can take some time when varying them one at a time,’ he says. ‘It’s also very difficult for humans to think in multidimensional parameter space in the way a computer can.’

Source: © deepmatter®

New products can be developed much more efficiently using AI-powered design of experiments with real-time data capture

AI can also help avoid getting too fixed on an idea. ‘Say you have found a step with an 80% yield,’ he says. ‘That’s not bad, but you’d like 95%. A human chemist may be concerned about leaving that 80% sweet spot, but AI will not – it’s not afraid to do “bad” experiments, as they’re just as important to inform the AI as good ones. In the end it should find a solution, if it exists.’

Equally importantly, these are not black-box systems that spit out solutions without showing their work. ‘If the AI has proposed experiments and generated new ideas, the accumulated data and AI model can be used as an understanding tool at the end,’ he says. ‘If the AI is good, it will know what makes the synthesis high or low yield, or why an experiment succeeded or failed. Chemists can then learn from the data.’

Increased innovation

Companies might be concerned that using such systems could be expensive but, according to Galeandro-Diamant, the cost of implementation is minor compared to the costs that companies are incurring while they lack the knowledge of what AI can – and can’t – do. User-friendly tools such as SmartChemistry® can help to address that gap and enable chemists to quickly grasp how to harness AI in their work. ‘It also depends on what data are available,’ he says. ‘Access to good data is critical for success.’

The productivity and efficiency gains offered by these tools are massively important across the board, Williams says. ‘But innovation is prized even more highly, particularly when a chemist is trying to find new ways of doing things. This is where AI tools can have the biggest impact: telling a chemist something they didn’t know before.’

AI will also have a positive impact on sustainability, he concludes. ‘It can help chemists select routes that avoid toxic or dangerous chemicals, and do more reactions in the aqueous phase. AI and ML can help mine databases for greener and more sustainable chemistry, providing chemists with new ideas.’

Yet Williams also notes that the enormous potential of these technologies is multiplied if their different capabilities are applied holistically to solve a particular problem. ‘SmartChemistry® uses AI and ML techniques in different ways, giving a loop that circles from retrosynthesis or suggested formulations to running an experiment, and then piping information back, allowing better routes or formulations to be suggested,’ Williams says. ‘It streamlines the process from design through experiment to analysis at the same time closing the circle, enabling the better discovery and application of data.’ With SmartChemistry®, every part of the system works in concert to ensure it delivers the answers a chemist needs.

Find out how SmartChemistry® can change your lab at deepmatter.io