Chapter 43 Another double breakthrough

Style: Romance  Author: CloseAI  Words: 2139  Update Time: 24/01/11 09:49:09
Meng Fanqi's words were unpleasant to hear.

The implication was that China's AI technology lagged behind that of foreign countries.

This was not something Li Yanhong liked to hear. After all, he had started paying attention to AI so early precisely in order to develop the most cutting-edge technology.

Meng Fanqi could probably guess what he was thinking. In his previous life, he too had been misled by the large number of AI papers China published in the early days.

He had felt that in this newly emerging technology, China could already compete with the United States without falling behind.

Although AlphaGo shocked the world, it was a bit of a showpiece after all.

It was not until large language models at the hundred-billion-parameter scale emerged that Meng Fanqi had to concede defeat in this contest of pure hard strength.

In fact, it was not that the techniques and algorithms differed so much.

More often, the problem was an insufficient amount of high-quality data.

When Baidu's Wenxin Yiyan generated images, it would even translate the user's Chinese input into English before drawing.

Many meticulous netizens deliberately tested words whose meanings diverge between Chinese and English, such as the Chinese terms that translate to "bus" and "mouse".

The images Wenxin Yiyan produced showed a bus and a mouse, a result that cannot be explained from the Chinese input alone.

It can be seen that Wenxin Yiyan, the so-called super model focused on Chinese, also relied to a considerable extent, if not entirely, on English-based model weights and technology.

Why do it this way? In the final analysis, the foundation was not solid enough.

Organizing data, cleaning data, labeling data to a high standard.

These are dirty, tiring tasks that are slow to pay off.

How much more convenient and quick it is to take other people's public data and train on it.

Given the cutthroat 996 culture at the major domestic tech companies, there was little room for infrastructure work with a long payback period.

Earlier on, he could not tell any difference. He only felt that the major domestic companies kept appearing on one leaderboard after another, topping this list and surpassing that one.

It was not until the large language model stage that the disadvantages in the quantity and quality of the underlying corpus were exposed.

"Actually, this cannot be blamed entirely on the culture of China's big tech companies. The Internet in the United States started earlier, and the archiving of documents and materials in many fields is particularly good." Meng Fanqi had thought carefully about this issue as well.

"Large open communities like GitHub and arXiv contain very high-quality code and papers in foreign languages. These are not just what Americans themselves have accumulated. They harvest the world's data by being free and open to the public."

"Chinese people have also contributed a great many lines of code to GitHub. But if you look at China's paper repositories, such as CNKI, they are a pure cancer. Papers written by master's and doctoral students are charged for by the page. Even reading them after download requires a special..."

How much precious data has been lost as a result...

But at this moment, Li Yanhong had probably not yet thought of training on data at such a scale. So Meng Fanqi was in no hurry to discuss later language technologies and large generative models with him.

Over these couple of years, Meng Fanqi's focus was on visual image algorithms.

"Mr. Li, I personally believe that the openness of AI technology is relative, and it cannot stay this transparent forever. But in the end, what becomes a barrier may not be the purely technical side of the model itself. More likely it will be computing power, high-quality large-scale data, and some essential training and feedback methods."

"Even in the current open source era, it is normal for half a year to a year to pass between the time an algorithm is produced and the time the model and code are made public."

"For academia, that is not a particularly long time. But for industry, where things actually get deployed, the outcome of that gap can be worlds apart, even the difference between life and death for a company."

Li Yanhong nodded slightly at this. He naturally understood the meaning behind Meng Fanqi's words.

Suppose Li Yanhong wanted to launch a real-time, high-performance image detection application, and no algorithm currently on the market could deliver the speed and accuracy he required.

Even if Meng Fanqi were willing to publish the results in his hands, given how paper review works, it would be at least half a year before anyone learned the technical details.

Adding the time for replication and trial and error, being able to apply the technology within 8-9 months would already be very fast.

But if he cooperated with Meng Fanqi, he would naturally obtain the technology 8-9 months earlier.

That was long enough for Li Yanhong to work out every aspect of adaptation and integration, and even user-facing pieces such as apps and interfaces.

Marketing and negotiations could also start early.

When the technology was first announced, competitors would still be reading the paper and marveling at its performance.

Baidu would already have been talking to potential customers for three or four months.

While competitors were still scrambling to replicate the results, Baidu might already have signed the orders.

Once a large leading company like Baidu could open up a lead of half a year or more in technological innovation, it would be difficult for newcomers to claim a large slice of the pie.

Li Yanhong weighed the gains and losses in his mind and felt that, if the other party truly refused to consider being hired, this kind of cooperation would do him no harm.

"So what you mean is that you would share your latest results, or part of them, with Baidu right away or ahead of publication?" Meng Fanqi had been very productive in the second half of this year, with DreamNet, generative adversarial networks, and the new detection technique he had just shown Li Yanhong.

Although he found it a bit unbelievable, Li Yanhong no longer doubted Meng Fanqi's productivity. His only question was the specific form the cooperation would take.

"If you don't accept employment, you will naturally have no salary. Contributing technology in exchange for a stake does not fit either: your output is not fixed, or, to put it bluntly, you would not bring all of your results to me. So what form of cooperation do you have in mind?" Li Yanhong asked. "Do you plan to set up a shell company and have Baidu price each acquisition according to the technology and specific metrics you provide?"

"Don't you want to get rich as soon as possible?" Li Yanhong was actually a little puzzled. He picked up the paper in his hand and weighed it. "With the algorithms and models here, it could be cash in one hand and goods in the other. How convenient and quick that would be."

"You must be joking, Mr. Li. This will not be a one-off deal." A reborn person should of course put some effort into making technological breakthroughs, but there was no need to go all out. Not to mention that some technologies were constrained by circumstances and could not be built directly. Even with those he could build, he had to leave himself more room.

If a person has the ability to break the world record, it is naturally more worthwhile to break it ten times in small steps. There is no reason to do it all in one leap.

What's more, he broke the world record again.