About ChatGPT: Jensen Huang and OpenAI's Co-founder Hold a "Fireside Chat"

At NVIDIA's GTC online conference, a special event featured company founder and CEO Jensen Huang in a "fireside chat" with OpenAI co-founder and chief scientist Ilya Sutskever.

Edited and compiled by: Li Haidan, Zhou Xiaoyan

Source: Tencent Technology

At 0:00 Beijing time on March 23, NVIDIA's GTC online conference launched a special event: company founder and CEO Jensen Huang held a "fireside chat" with OpenAI co-founder and chief scientist Ilya Sutskever.

Jensen Huang considers ChatGPT "the iPhone moment of the AI world," but that moment did not arrive overnight. OpenAI's co-founder began studying neural networks more than a decade ago, and along the way explored how a network's depth and scale shape its abilities, making breakthroughs in machines' capacity for unsupervised learning. Today ChatGPT has become a sensation of global interest. Looking back from the present at its iterations, each creative step seems to have sprung from the inspiration of the founder and his team. What were the "exciting moments" behind this seemingly effortless innovation?


The following is a summary of the conversation:

Jensen Huang: The recent surge of interest in ChatGPT has pushed artificial intelligence to the forefront of the world's attention, and OpenAI along with it; you have become one of the most eye-catching young engineers and top scientists in the industry. My first question: what was your starting point for focusing on artificial intelligence in the first place? Did you ever imagine you would achieve such enormous success?

 

Ilya Sutskever: Thank you very much for the kind invitation. Artificial intelligence has brought great changes to our world through continual advances in deep learning. For me personally, there were two main motivations:

First, my original interest in deep learning came from the observation that we humans have an intuitive grasp of all kinds of problems. I was particularly interested in how human consciousness is defined and how human intelligence manages to make such predictions.

Second, around 2002 to 2003, the prevailing view was that "learning" was something only humans could do and computers could not. So I had an idea at the time: if computers could be made to learn continuously, it might transform the artificial intelligence industry.

Fortunately, I was in university at the time, and my field happened to be neural network learning. Neural networks were a very important advance in AI. We focused on how to study deep learning through neural networks, how a neural network could work like the human brain, and how that logic could be reflected in the way computers work. Back then I had no idea what kind of career this research would lead to; I simply thought it would be a promising field in the long run.

Jensen Huang: When you first came into contact with neural network research, how large were the neural networks of that time?

 

Ilya Sutskever: At that time nobody discussed scale in neural networks; there were only a few hundred neural units, and correspondingly few CPUs. We had started a mathematics laboratory, and on a limited budget we began by running a variety of experiments, collecting all kinds of problems to test accuracy. We built up the training of neural networks little by little. That was the prototype of the first generative AI model.

Jensen Huang: Before 2012 you had already made achievements in neural networks. When did you start to believe that computer vision, neural networks, and artificial intelligence were the direction of the future?

 

Ilya Sutskever: Around two years before 2012, I gradually realized that deep learning would attract serious attention. This was not just intuition; there was a very solid theoretical basis behind it. If a neural network is deep enough and large enough, it can solve genuinely hard problems. The key is that the network needs both depth and scale, which means we must have large enough datasets and sufficient computing power.

We put a lot of effort into optimizing the data and the model. One of our colleagues built a neural network whose feedback loop ran on the order of seconds: users could train it continuously, allowing the network to grow larger and absorb more data. Some people considered such datasets unimaginably large. If the computing power of the time could handle that much data, it would certainly trigger a revolution.

Jensen Huang: When we first met, that was also when our visions of the future truly intersected. You told me then that GPUs would affect the lives of generations to come, and your gut feeling was that GPUs might be helpful for deep learning training. Can you tell me when you realized this?

 

Ilya Sutskever: When we first tried to use GPUs for deep learning training in our Toronto lab, it was not yet clear how to use them or how to get real performance out of them. As we acquired more and more data, we also became increasingly aware of the advantages scale would bring over traditional models. We hoped to speed up data processing and train things that scientists had never been able to train before.

Jensen Huang: We have seen that ChatGPT and OpenAI have broken the old pattern of how computers handled images.

 

Ilya Sutskever: I wouldn't say it breaks computer image editing; I would describe it another way: it "transcends" it. Most people approached datasets with a traditional mindset, but our approach was more advanced. We also thought at the time that it would be difficult, and that if we could do it well, it would be a big step forward.

Jensen Huang: Looking at it now, when you went to Silicon Valley to work at OpenAI and became its chief scientist, what did you consider the most important work? OpenAI has had different priorities at different points in time, and ChatGPT became "the iPhone moment of the AI world." How did you reach such a transformative moment?

 

Ilya Sutskever: At the beginning we did not quite know how to carry out the whole project, and what's more, the conclusions we have reached now are completely different from the logic we used then. Users today have an easy-to-use tool in ChatGPT that helps everyone produce very good artistic and textual results, but in 2015 and 2016 we did not dare imagine reaching the current level. At the time, most of our colleagues came from Google's DeepMind; they had practical experience, but their thinking was relatively narrow and constrained. We ran more than 100 different experiments and comparisons internally.

At that time I had an idea that particularly excited me: giving machines the ability to learn without supervision. Today we take it for granted that you can train everything with natural language models, but in 2016 unsupervised learning was still an unsolved problem, and no scientist had the relevant experience or insight. I thought "data compression" was the technical bottleneck; the term is not common, but in effect ChatGPT really does compress its training dataset. In the end we found a mathematical model that lets us compress data through continuous training, which is really a challenge posed to the dataset. This was an idea I was particularly excited about, and it came to fruition at OpenAI.
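The "prediction is compression" connection Sutskever alludes to can be sketched with a toy example (purely illustrative, not OpenAI's actual method): under Shannon coding, a character predicted with probability p costs about -log2(p) bits, so a model that predicts text better also compresses it into fewer bits.

```python
import math
from collections import Counter

def bits_per_char(text, probs):
    # Shannon code length: a character with probability p costs -log2(p) bits.
    return sum(-math.log2(probs[ch]) for ch in text) / len(text)

text = "abracadabra " * 10

# An ignorant model: every character in the alphabet is equally likely.
alphabet = sorted(set(text))
uniform = {ch: 1.0 / len(alphabet) for ch in alphabet}

# A "trained" model: empirical character frequencies from the text itself.
counts = Counter(text)
learned = {ch: n / len(text) for ch, n in counts.items()}

cost_uniform = bits_per_char(text, uniform)
cost_learned = bits_per_char(text, learned)
print(f"uniform model: {cost_uniform:.3f} bits/char")
print(f"learned model: {cost_learned:.3f} bits/char")  # strictly fewer bits
```

The better predictor always achieves a shorter average code length (Gibbs' inequality), which is the formal sense in which training a predictor amounts to compressing the dataset.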

In fact, such results may not attract much attention outside machine learning, but what I want to say is that the outcome of my work is the training of neural networks.

We want to train a neural network to predict the next word. I believe the units that predict the next word are closely related to our whole visual neural network. This is very interesting and consistent with our verification method. It proved once again that predicting the next character or the next piece of data helps us discover the logic within existing data, and that is the logic behind ChatGPT's training.
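Next-word prediction at its very simplest can be illustrated with a toy bigram model (a hypothetical sketch, nothing like GPT's actual transformer): count which word follows which, then predict the most frequent successor.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each word follows each other word."""
    model = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += 1
    return model

def predict_next(model, word):
    """Predict the most frequently observed successor of `word`."""
    followers = model.get(word)
    return followers.most_common(1)[0][0] if followers else None

corpus = "the cat sat on the mat the cat sat on the rug the cat ran"
model = train_bigram(corpus)
print(predict_next(model, "cat"))  # "sat" (seen twice, vs "ran" once)
print(predict_next(model, "the"))  # "cat" (its most common successor)
```

GPT replaces these raw counts with a deep network conditioned on the whole preceding context, but the training objective is the same in spirit: make the predicted next token match the data.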

Jensen Huang: Scaling up data helps us improve AI capabilities; more data and larger datasets help generative AI achieve better results. Do you think the evolution from GPT-1 to GPT-2 to GPT-3 follows something like Moore's Law?

 

Ilya Sutskever: One of OpenAI's goals is to solve the problem of scaling datasets, but the problem we faced first was how to improve data quality so the model could make accurate predictions. When we were working on early OpenAI projects, we hoped a system could play real-time strategy games, like competitive e-sports: it had to be fast enough, smart enough, and able to compete against other teams. As an AI model, it essentially repeated a reinforcement learning process based on human feedback.

Jensen Huang: How do you fine-tune with reinforcement learning from human feedback? Are there other auxiliary systems that give ChatGPT a knowledge background to support its performance?

 

Ilya Sutskever: Let me explain the working principle: we continuously train the neural network system to predict the next word, based on the texts we have collected. ChatGPT is not merely learning superficially; we want the words it predicts to be logically consistent with the words that came before. The past text is, in effect, projected onto the prediction of the next word.

From the neural network's perspective, it is more like drawing conclusions about different aspects of the world, about people's hopes, dreams, and motivations. But the base model alone did not achieve the desired effect. For example, we could pick a few random sentences from the Internet as a prompt, and on that basis ChatGPT could write a coherent essay without additional training. Rather than simply having the AI learn from human experience, we do reinforcement learning based on human feedback. Human feedback is important, and more feedback makes the AI more reliable.
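The reward-modeling step at the heart of RLHF can be sketched in miniature (a toy under the standard Bradley-Terry assumption; not OpenAI's implementation, and the response names are hypothetical): from pairwise human preferences, learn a scalar reward per response by gradient ascent on the preference log-likelihood.

```python
import math

def train_reward(prefs, responses, lr=0.5, epochs=200):
    """Bradley-Terry reward model: P(a preferred over b) = sigmoid(r[a] - r[b]).
    Gradient ascent on the log-likelihood of the observed preferences."""
    r = {resp: 0.0 for resp in responses}
    for _ in range(epochs):
        for winner, loser in prefs:
            p_win = 1.0 / (1.0 + math.exp(r[loser] - r[winner]))
            grad = 1.0 - p_win  # d/dr[winner] of log sigmoid(r[winner] - r[loser])
            r[winner] += lr * grad
            r[loser] -= lr * grad
    return r

responses = ["helpful answer", "vague answer", "rude answer"]
# Hypothetical human feedback: (preferred, rejected) pairs.
prefs = [
    ("helpful answer", "vague answer"),
    ("helpful answer", "rude answer"),
    ("vague answer", "rude answer"),
]
rewards = train_reward(prefs, responses)
print(max(rewards, key=rewards.get))  # the consistently preferred response
```

In the full RLHF pipeline, a reward model learned this way then drives a policy-optimization step on the language model itself, steering it toward responses humans prefer.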

Jensen Huang: You can instruct an AI to do certain things, but can you also tell it not to do certain things? For example, tell the AI where the boundaries are?

 

Ilya Sutskever: Yes. I think of the second stage of training as a dialogue with the AI and its neural network. The more we train the AI, the more accurate it becomes and the better it conforms to our intentions. As we keep improving its faithfulness and accuracy, it becomes more reliable, more precise, and more consistent with the logic of human society.

Jensen Huang: ChatGPT came out only a few months ago and is the fastest-growing application in human history. People offer various explanations; some say it is by far the easiest application to use. Its interaction model is very simple and exceeds everyone's expectations. People don't need to learn how to use ChatGPT; they just give it commands and prompts. If your prompt isn't clear enough, ChatGPT will help make it clearer, then check back: is this what you wanted? That kind of interaction surprised me.

We saw GPT-4's performance a few days ago, and it is astounding in many areas: it can pass the SAT and the bar exam and achieve very high human-level scores. What I want to ask is: what improvements does GPT-4 bring, and in which areas do you think it will help people most?

 

Ilya Sutskever: GPT-4 improves on ChatGPT in many respects. We started training GPT-4 about six to eight months ago. The most important difference between GPT-4 and earlier versions is that GPT-4 predicts the next word with greater accuracy, because a better neural network underlies the prediction.

For example, if you are reading a mystery novel, with its many characters and plots, its secret rooms and mysteries, you have no idea what will happen next as you read. From the different characters and plots, you weigh several possibilities for who the murderer is. Predicting the next word well is like deducing the culprit from those clues, and that is what GPT-4 does better.

Jensen Huang: Many people say that deep learning brings learning but not reasoning. How does a language model learn reasoning and logic? There are tasks where ChatGPT and GPT-3 are not good enough but GPT-4 does better. What deficiencies does GPT-4 still have, and can they be addressed in the next version?

 

Ilya Sutskever: ChatGPT can now handle logic and reasoning more precisely and, through better logic and reasoning, produce better answers during decoding. Neural networks may still face challenges, such as breaking out of ingrained patterns of thought, which forces us to ask how far neural networks can go; in short, how much potential they have.

We believe GPT's reasoning has not yet reached the level we previously expected. If we further expand the dataset and maintain the past mode of operation, its reasoning ability will improve further. I am quite confident about that.

Jensen Huang: Another particularly interesting point: if you ask ChatGPT a question, it answers based on past knowledge and experience, summarizing its knowledge and data while exhibiting some logic along with the answer. I think ChatGPT has a natural property: it can continue to understand.

 

Ilya Sutskever: Yes, neural networks do have these capabilities, but sometimes they are unreliable, and this is the biggest obstacle ahead for neural networks. In many cases a network will exaggerate, make many mistakes, even mistakes a human would never make. We now need more research to address this "unreliability."

The publicly released GPT-4 model does not actually have the ability to trace or retrieve its data; its ability is to predict the next word based on text, so it has limitations. I think some people will want GPT-4 to identify the sources of certain data and then investigate those sources more deeply.

Overall, although GPT-4 does not support built-in data retrieval, it will certainly become more accurate with continued work on the data. GPT-4 can already learn from pictures and respond based on image and text input.

Jensen Huang: How does multimodal learning deepen GPT-4's understanding of the world? Why has multimodal learning come to define GPT and OpenAI?

 

Ilya Sutskever: Multimodality is very interesting for two reasons:

First, multimodality is particularly useful for vision and image recognition. So much of the world reaches us as images, and humans, like other animals, are visual creatures. About a third of the gray matter in the human brain is devoted to processing images, and GPT-4 can now understand images as well.

Second, understanding the world through pictures or through words amounts to the same thing; that is one of our arguments. On a human scale, we probably speak only about a billion words in a lifetime.

Jensen Huang: The figure of a billion words just flashed through my mind. How many words is that, really?

 

Ilya Sutskever: Yes, we can calculate how long a person lives and how many words they can process per second. If we subtract the time spent sleeping, we can calculate how many words are processed in a lifetime. The difference between humans and neural networks is that where a human gets only billions of words of text, a network can be trained on trillions of words. Our knowledge and information about the world can slowly seep into the AI's neural network through text, and if you add visual elements such as images, the network can learn even more accurately.
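The back-of-envelope calculation can be checked with assumed rates (all numbers below are illustrative assumptions, not figures from the talk): roughly 16 waking hours a day at a couple of words per second, over about 75 years, lands on the order of a few billion words.

```python
# All rates below are illustrative assumptions.
words_per_second = 2        # assumed average processing rate while awake
waking_hours_per_day = 16   # 24 hours minus ~8 hours of sleep, as Sutskever notes
years = 75                  # assumed lifespan

seconds_awake = waking_hours_per_day * 3600 * 365 * years
words_lifetime = words_per_second * seconds_awake
print(f"~{words_lifetime:.1e} words in a lifetime")  # on the order of a few billion
```

Against the trillions of words in a modern training corpus, this is roughly a thousand-fold gap, which is the contrast Sutskever is drawing.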

Jensen Huang: For deep learning over text and images, suppose we want artificial intelligence to understand the logic behind them, or even, to put it more ambitiously, the basic principles of the world. Take the way we phrase a sentence in daily life: a word may carry two meanings, and a change in vocal pitch can convey two different tones. Would spoken language and intonation help AI understand text?

 

Ilya Sutskever: Yes, scenarios like the ones you mention matter a great deal. Speech is an important source of information, including the volume and tone of the voice.

Jensen Huang: In what areas has GPT-4 made more progress than GPT-3? Can you give an example?

 

Ilya Sutskever: For example, in some mathematics competitions (such as high school math contests), many problems must be answered with the help of diagrams. GPT-3.5 was particularly poor at interpreting diagrams, while GPT-4 handles them with greatly improved accuracy.

 

Jensen Huang: You mentioned earlier that AI can generate text to train another AI. Suppose there are some 20 trillion tokens across all languages for training language models; what then? Can AI generate AI-only data to train itself? This looks like a closed loop, just as we humans train our own brains by continually learning from the outside world, reflecting, and solving problems. What do you think of such synthetic data generation, of AI's self-learning and self-training?

 

Ilya Sutskever: I would not underestimate the data that already exists; I even think there is more in it than we realize.

Jensen Huang: Yes, this is the future we keep looking forward to. I believe that one day AI will generate content, learn on its own, and improve itself. Can you summarize what stage of development we are at now? What might generative AI achieve in the not-too-distant future? What is the future of large language models?

 

Ilya Sutskever: It is hard for me to predict the future. What we can do is keep at this work, and we will show everyone ever more amazing versions of the system. We hope to improve the reliability of the output so the system can truly earn people's trust. If you ask a generative AI to summarize some text and draw a conclusion, at present the AI does not fully verify the authenticity of the text or of the sources it cites, and that matters. Our vision for the future is to make the neural network aware of the authenticity of all its data sources, and aware of the user's needs at every step.

Jensen Huang: We hope this technology will show people greater reliability. One last question: when you first used GPT-4, what performances amazed and astonished you?

 

Ilya Sutskever: Compared with GPT-4, earlier versions of ChatGPT could only answer questions, sometimes misunderstood them, and gave unsatisfying answers. But GPT-4 essentially no longer misunderstands the question; it solves problems faster and can handle complex, difficult tasks, which is very meaningful to me. For example, many people have realized ChatGPT can write poems: alliterative poems as well as end-rhyme poems. It can also explain a joke and understand the meaning behind it. In short, its reliability is better.

I have worked in this industry for more than twenty years, and I think what is "amazing" is the very fact of its existence and the help it can bring to human beings. It has slowly grown from humble beginnings in this field to become stronger and stronger. The same neural network, trained in two different ways, can keep growing more capable. I often wonder: how did these neural networks grow so quickly? Do we need more training? Will they keep growing like the human brain? This is what makes me feel its greatness, its particularly surprising aspects.

Jensen Huang: Looking back, we have known each other for a long time. You have devoted your entire career to this cause, and you have made your mark on GPT and AI. Talking with you today has given me a much clearer understanding of how ChatGPT works; it is the most in-depth and artful explanation of ChatGPT and OpenAI I have heard. It is a pleasure to speak with you again today. Thank you!
