Nobel Prize-winning chemistry reproduced in 4 minutes: CMU builds a GPT-4 chemist that writes its own code and controls robots to upend chemical research, published in Nature

36kr
12-21

AI is upending chemical research and has appeared in Nature once again. A GPT-4-powered AI tool developed by teams at CMU and Emerald Cloud Lab reproduced research recognized by the 2010 Nobel Prize in less than 4 minutes.

This year, large models such as ChatGPT took off, and unexpectedly they have shaken up the entire field of chemistry.

First, Google DeepMind's AI tool GNoME predicted more than 2 million crystal structures; then Microsoft launched MatterGen, which greatly speeds up the design of materials with desired properties.

Now, a research team from CMU and Emerald Cloud Lab has developed a new automated AI system, Coscientist, and published it in Nature.

It can design, code and execute a variety of reactions, fully automating chemical laboratories.

In the experimental evaluation, Coscientist used GPT-4 to search chemical literature under human prompts and successfully designed a reaction pathway to synthesize a molecule.

GPT-4 combs through instruction manuals across the Internet and selects the best kits and reagents from its database to make the molecule in the real world.

Paper address: https://www.nature.com/articles/s41586-023-06792-0

The most shocking thing is that Coscientist reproduced the Nobel Prize-winning research in just 4 minutes.

Specifically, the new AI system demonstrated the potential to accelerate chemical research in six different tasks, including the successful optimization of "palladium-catalyzed coupling reactions."

Research on palladium-catalyzed coupling reactions earned American chemist Richard F. Heck and two Japanese chemists, Ei-ichi Negishi and Akira Suzuki, the 2010 Nobel Prize in Chemistry.

CMU chemist Gabe Gomes, who led the research, said, "It was amazing the moment I saw a non-organic intelligence autonomously plan, design and execute a chemical reaction invented by humans."

GPT-4 Automated Chemistry Research

Currently, AI tools are proliferating in scientific fields, but for researchers working in labs or those who are not proficient in coding, AI is not readily available.

We all know that chemical research is based on iterative cycles. In this cycle, experiments are designed, executed, and then refined to achieve specific goals.

For a chemist, the research performed is multi-pronged - requiring not only the technical skills to perform chemical reactions, but also the knowledge to plan and design them.

For example, when synthesizing a new substance, chemists use "retrosynthetic analysis" to work backward step by step from the final target substance to the starting molecules, then search databases for suitable reaction conditions and select the synthetic route with the highest probability of success.

However, in actual experiments, it will be found that chemical reactions often fail to produce products with expected high yields and selectivities.

At that point, chemists must search the literature again, design a new experimental route, and rerun the experiment, so the whole iterative process can drag on.

Even for human chemists with the relevant knowledge, designing and executing a chemical reaction is not easy, because the designed reaction often fails to give the product in the desired yield.

When OpenAI released GPT-4 in March 2023, Gomes and his team members began to think about how to make large models serve chemists.

"Coscientist can do most of the things that really well-trained chemists can do," Gomes said.

When a human scientist asks Coscientist to synthesize a specific molecule, it searches the Internet to devise a synthesis route and then designs an experimental protocol for the desired reaction.

After getting a specific experimental plan, it can write code to instruct the pipetting station, and then run the code to let the robot perform the tasks it has been programmed to do.

What’s really cool is that Coscientist can also learn from the results of the reaction and suggest changes to the protocol to improve it.

This iterative cycle optimizes the reaction to achieve the desired experimental goals.

AI writes code to control chemical robots

Today, high-tech chemical robots are typically controlled by computer code written by human chemists.

With Coscientist, a robot is for the first time controlled by computer code written by AI.

The researchers first asked Coscientist to complete some simple tasks, controlling a robotic liquid handler to dispense colored liquid into a plate containing 96 small wells arranged in a grid.

It was given prompts such as "color every other row", "draw a blue diagonal line", "draw a 3x3 rectangle in yellow", "draw a red cross" and so on.
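
Each of these drawing prompts comes down to choosing a set of wells on an 8 x 12 grid. The following is a minimal sketch in plain Python, assuming the standard 96-well naming (rows A to H, columns 1 to 12). The function names are hypothetical, and this is not the code Coscientist generated.

```python
import string

ROWS = string.ascii_uppercase[:8]   # "ABCDEFGH"
COLS = range(1, 13)                 # columns 1..12


def every_other_row():
    """Wells in rows A, C, E and G."""
    return [f"{r}{c}" for r in ROWS[::2] for c in COLS]


def diagonal():
    """Main diagonal from A1 to H8 (the plate has only 8 rows)."""
    return [f"{ROWS[i]}{i + 1}" for i in range(len(ROWS))]


def rectangle(top_row, left_col, height=3, width=3):
    """Filled rectangle; top_row is a 0-based row index, left_col a 1-based column."""
    return [
        f"{ROWS[r]}{c}"
        for r in range(top_row, top_row + height)
        for c in range(left_col, left_col + width)
    ]


def cross(center_row, center_col, arm=1):
    """Plus-shaped cross around a well; no bounds checking is done here."""
    wells = {(center_row, center_col)}
    for d in range(1, arm + 1):
        wells |= {
            (center_row - d, center_col), (center_row + d, center_col),
            (center_row, center_col - d), (center_row, center_col + d),
        }
    return sorted(f"{ROWS[r]}{c}" for r, c in wells)
```

A liquid-handler script would then loop over the returned well names and dispense one color into each.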

Coscientist producing different designs with a liquid-handling robot

The liquid handler was just a first trial; the team also introduced Coscientist to more types of robotic equipment through the Emerald Cloud Lab.

The laboratory is equipped with a variety of automated instruments, including spectrometers that measure the wavelength of light absorbed by chemical samples.

In one task, a plate contained liquids of three different colors (red, yellow, and blue), and Coscientist was asked to determine which colors were present and where they were located on the plate.

Coscientist has no "eyes", so it could only write code to pass the mystery plate to the spectrophotometer automatically and analyze the wavelengths of light absorbed by each well, identifying which colors were present and their positions on the plate.

For this task, the researchers had to give Coscientist a little hint, instructing it to consider the way different colors absorb light.
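
The hint about how colors absorb light reflects a simple rule: a solution appears as the complement of the light it absorbs most strongly. The sketch below shows one way to turn per-well absorbance readings into color labels. The wavelength bands, the threshold, and the data layout are illustrative assumptions rather than values or code from the paper.

```python
# Approximate wavelength bands (nm) of strongest absorption for each
# perceived color. Illustrative assumptions, not values from the paper.
ABSORPTION_BANDS = {
    "yellow": (400, 470),   # absorbs violet/blue light
    "red": (470, 560),      # absorbs blue-green/green light
    "blue": (560, 700),     # absorbs orange/red light
}


def classify_well(spectrum, min_absorbance=0.1):
    """Return the likely color of one well, or None if it looks empty.

    spectrum: dict mapping wavelength in nm to measured absorbance.
    """
    peak_nm, peak_abs = max(spectrum.items(), key=lambda kv: kv[1])
    if peak_abs < min_absorbance:
        return None
    for color, (low, high) in ABSORPTION_BANDS.items():
        if low <= peak_nm < high:
            return color
    return None


def classify_plate(plate):
    """Map each well name to its color, skipping wells that look empty."""
    colors = {well: classify_well(spectrum) for well, spectrum in plate.items()}
    return {well: color for well, color in colors.items() if color is not None}
```

Real dyes have broader and overlapping spectra, so a practical version would compare whole spectra rather than a single peak.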

The remaining tasks can be completely left to the AI system to complete.

Code generated by Coscientist. It is broken down into the following steps: defining the metadata for the method, loading the labware module, setting up the liquid handler, performing the required reagent transfers, setting up the heater-shaker module, running the reaction, and shutting down the module.

Reproducing Nobel Prize-winning work in 4 minutes and correcting code errors on its own

Coscientist's final test was to bring its assembled modules and training together to carry out the research team's instruction to perform Suzuki and Sonogashira reactions.

These reactions, discovered in the 1970s, use the metal palladium as a catalyst to form bonds between carbon atoms in organic molecules.

These reactions have proven useful in producing new drugs to treat inflammation, asthma and other diseases. They are also used in organic semiconductors, as well as in the organic light-emitting diodes found in many smartphones and displays.

The impact of these groundbreaking reactions was formally recognized with the 2010 Nobel Prize, awarded jointly to Akira Suzuki, Richard Heck, and Ei-ichi Negishi.

Of course, Coscientist had never attempted these reactions before.

MacKnight, who designed Coscientist's software module to search technical documents, said, "The most amazing moment for me was seeing it ask all the right questions."

Coscientist looked primarily for answers on Wikipedia, but also on a number of other sites, including the American Chemical Society, the Royal Society of Chemistry, and other sites containing academic papers describing the Suzuki and Sonogashira reactions.

The entire process of palladium-catalyzed coupling reaction

In less than 4 minutes, Coscientist designed an accurate procedure that produced the desired reactions using chemicals provided by the team.

When it tried to use the robot to execute the program in the real world, it "made a mistake" in the code it wrote to control a device that heats and shakes liquid samples.

But without anyone's prompting, Coscientist immediately discovered the problem, re-referenced the device's technical manual, corrected the code and tried again.

The results were contained in several tiny samples of clear liquid. Boiko analyzed the sample and discovered the spectral signatures of the Suzuki reaction and the Sonogashira reaction.

When Boiko and MacKnight told Gomes about Coscientist's results, Gomes was skeptical.

"I thought they were kidding me," he recalled.

But the results were right there, and they were hard to believe.

"With that comes the need to use this potential power wisely and prevent misuse," Gomes said. He added that understanding the capabilities and limitations of artificial intelligence is the first step toward developing informed rules and policies that can effectively prevent harmful uses of artificial intelligence, whether intentional or accidental.

Basic architecture of Coscientist

The researchers proposed an intelligent agent built on multiple LLMs (hereinafter referred to as Coscientist), which is capable of autonomously designing, planning, and executing complex scientific experiments. Coscientist can use tools to browse the Internet and relevant documentation, use robotic experimentation application programming interfaces (APIs), and leverage other LLMs to complete various tasks.

The researchers demonstrated Coscientist's versatility and performance across six tasks:

(1) Use public data to plan chemical synthesis of known compounds;

(2) Efficiently search and browse a large number of hardware documents;

(3) Use documentation to execute high-level commands in the cloud laboratory;

(4) Use low-level instructions to accurately control liquid handling instruments;

(5) Process complex scientific tasks that require the simultaneous use of multiple hardware modules and the integration of different data sources;

(6) Solve optimization problems that require analysis of previously collected experimental data.

Coscientist acquires the knowledge needed to solve complex problems by interacting with multiple modules (web and documentation search, code execution) and by performing experiments.

The goal of the main module (Planner) is to perform planning based on user input by calling the commands defined below.

The planner is a GPT-4 chat instance that plays the role of an assistant. The user's initial input and the outputs of commands are treated as user messages to the planner. The planner's system prompt (a static input that defines the LLM's goals) is designed in a modular manner and is described as four commands that define the action space: "GOOGLE", "PYTHON", "DOCUMENTATION", and "EXPERIMENT".

The planner calls these commands as needed to gather knowledge. The GOOGLE command is responsible for searching the Internet using the web search module, which is itself an LLM.

The PYTHON command allows the planner to use the "code execution" module to perform calculations in preparation for experiments.

The EXPERIMENT command carries out "automation" through the APIs described by the DOCUMENTATION module.

Like the GOOGLE command, the DOCUMENTATION command provides information to the main module from a source, in this case the documentation for the required API.
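
The four-command action space can be pictured as a dispatch table. The sketch below assumes that a planner reply begins with a command name followed by its argument; that message format, the function names, and the stub modules are all hypothetical stand-ins rather than the paper's implementation.

```python
from typing import Callable, Dict, Tuple

COMMANDS = ("GOOGLE", "PYTHON", "DOCUMENTATION", "EXPERIMENT")


def parse_command(reply: str) -> Tuple[str, str]:
    """Split a planner reply such as 'GOOGLE Suzuki reaction conditions'."""
    parts = reply.strip().split(None, 1)
    if not parts or parts[0] not in COMMANDS:
        raise ValueError(f"unknown command in reply: {reply!r}")
    argument = parts[1].strip() if len(parts) > 1 else ""
    return parts[0], argument


def dispatch(reply: str, modules: Dict[str, Callable[[str], str]]) -> str:
    """Route the reply to its module and return the module's output, which
    would be passed back to the planner as the next user message."""
    name, argument = parse_command(reply)
    return modules[name](argument)


# Stub modules standing in for web search, code execution, documentation
# search, and hardware automation.
STUB_MODULES = {
    "GOOGLE": lambda query: f"search results for: {query}",
    "PYTHON": lambda code: f"executed: {code}",
    "DOCUMENTATION": lambda topic: f"docs summary for: {topic}",
    "EXPERIMENT": lambda protocol: f"submitted protocol: {protocol}",
}
```

Keeping the command set small and explicit makes it easy to reject anything the planner emits outside the defined action space.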

The researchers demonstrated compatibility with the Opentrons Python API and the Emerald Cloud Lab (ECL) Symbolic Lab Language (SLL). Together, these modules form Coscientist, which can receive simple plain text input prompts from the user (e.g., "Perform multiple Suzuki reactions"). The image above shows this architecture in its entirety.

Additionally, some commands can use subactions.

The GOOGLE command converts prompts into appropriate web search queries, runs those queries through the Google Search API, browses the web and feeds answers back to the planner.

Likewise, the DOCUMENTATION command retrieves and summarizes the necessary documentation (for example, for a robotic liquid handler or a cloud laboratory) so that the planner can call the EXPERIMENT command.

The PYTHON command executes code using an isolated Docker container (not relying on any language model) to protect the user's machine from any unexpected actions requested by the planner.

Importantly, the language model behind the planner can fix the code when the software raises errors. The same applies to the EXPERIMENT command of the automation module, which executes the generated code on the corresponding hardware or provides a synthetic procedure for manual experiments.
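
The self-correction described here amounts to a loop: run the code in isolation, hand any error text back to the planner, and retry with the revised code. The sketch below uses a separate Python process as a lightweight stand-in for the Docker container (an assumption; a subprocess does not provide Docker's isolation) and a hypothetical `revise` callback in place of the GPT-4 planner.

```python
import subprocess
import sys
from typing import Callable, Tuple


def run_isolated(code: str, timeout: float = 10.0) -> Tuple[bool, str]:
    """Run code in a separate Python process; return (succeeded, output)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    if result.returncode == 0:
        return True, result.stdout
    return False, result.stderr


def run_with_fixes(code: str, revise: Callable[[str, str], str],
                   max_attempts: int = 3) -> str:
    """Run code, handing any error text to `revise` for a corrected version."""
    for _ in range(max_attempts):
        ok, output = run_isolated(code)
        if ok:
            return output
        code = revise(code, output)   # the planner would rewrite the code here
    raise RuntimeError("code still failing after all attempts")
```

Capping the number of attempts keeps a persistently failing script from looping forever.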

AI allows everyone to become a scientist

The size and complexity of nature is almost unlimited, and countless new scientific discoveries await human breakthroughs.

Imagine new superconducting materials that dramatically improve energy efficiency, or compounds that cure otherwise incurable diseases and extend human lifespan.

However, getting the education and training needed to make these breakthroughs is a long and arduous journey, and becoming a scientist is simply too difficult.

But Gomes and his team envision AI-assisted systems like Coscientist as a solution that can provide the world with a large number of "AI scientists" to meet the manpower needs of scientific research.

Human scientists also need rest and sleep, whereas human-guided artificial intelligence can conduct "scientific research" around the clock.

"Autonomous AI systems can discover new phenomena, new reactions, and new ideas."

There is a process of trial and error, learning and improvement in science, and artificial intelligence can greatly speed up this process.

"This can significantly lower the barriers to entry in almost any field," Gomes said. For example, if a biologist not trained in palladium-catalyzed coupling reactions wants to explore the uses of the reaction in a new way, they can ask Coscientist to help them plan their experiments.

References

https://www.nature.com/articles/d41586-023-04073-4

This article comes from the WeChat public account "Xin Zhiyuan" (ID: AI_era), author: Tao Zirun, and is published by 36Kr with authorization.
