Thinking it over, Li Yanhong realized that he had essentially lost the initiative throughout the conversation and interview.
His original plan, after all, had been to recruit talent and probe the technical details of DreamNet.
But as soon as Meng Fanqi got in the car, he cheerfully handed Li Yanhong a copy of the DreamNet paper.
That threw off his rhythm from the start, and every step afterward made it worse.
Meng Fanqi brought up the details of his exchanges with Alex and Hinton to decline the recruitment offer, then talked about the roadmap for AI models and led Li Yanhong into proposing technical cooperation himself.
Only then did he suddenly produce such a shocking algorithm, as if he had never planned to mention it on this trip.
"Thinking about it carefully, it feels a bit like a magic trick. First he diverts your attention and hides his true intention. Then, when you are off guard, he strikes and fools you."
Faced with strong doubts from several technical staff led by Yu Kai, Li Yanhong could not help entertaining the thought.
After all, Meng Fanqi had provided only some experimental results at the time, and no other information.
If things were really as Yu Kai said, with the accuracy gain coming from applying DreamNet downstream while the detection speed had not actually improved, it would still be a major breakthrough.
It just would not be worth the direct involvement of the company's CEO.
That said, the impression that he had never meant to bring the algorithm up on this trip did Meng Fanqi no injustice. He had originally intended to take it straight to Google as a bargaining chip.
But after Li Yanhong proposed technical cooperation, Meng Fanqi thought it over briefly and concluded that cooperating with Baidu first would be very much to his advantage.
First, Baidu lacked AI technology and felt a greater sense of crisis than Google, and Li Yanhong was negotiating with him in person. The same technology would fetch a higher price at Baidu.
Second, it had been only a few months since Google gave him a letter of intent. Securing this kind of initiative and a technical partnership with Baidu in that time would greatly strengthen his bargaining power and room to negotiate.
Large companies, after all, are full of factions, and resources have to be fought for.
He had no track record and no outside connections, and Silicon Valley would be unfamiliar ground. If he ran short of computing resources there, it would set him back badly.
Most important of all, though, he had his eye on the resources of the Chinese government.
Detection was the AI technology most widely used by government agencies at this stage. Hundreds of millions of cameras could use detection algorithms to flag key periods in surveillance footage automatically, and high-precision real-time face detection could raise security by several levels. It was an enormous market.
He planned to leave for Silicon Valley early the next year. If he wanted a connection to the Chinese government, he would still need to rely on a large internet company like Baidu.
At this point, Baidu had not yet fallen into the decline it would show ten years later. It still stood with Penguin and Ali as one of the top three, and it still had great value.
Li Yanhong was thinking along the same lines. He understood Chinese officialdom far better than Meng Fanqi did, and he was eager for the opportunities it promised.
Since he wanted to go in this direction, he should not doubt the people he chose to use. He still had that much boldness when it came to employing people.
More importantly, of course, the contract had not been signed yet.
"To put it bluntly, you have nothing to worry about. We will sign the contract only if the results pass acceptance. You will review the code and reproduce the results yourselves. If you can't trust others, can't you at least trust yourselves?"
Li Yanhong quickly adjusted his attitude. "It is a bad idea for us to go in with such an openly skeptical stance. When he arrives in a little while, we need to rein it in and mind our approach."
Meanwhile, Meng Fanqi, who knew nothing of all this, was getting ready to head to Baidu's Yanjing headquarters.
As someone reborn from the future, he had, after all, overestimated the detection technology of this era.
The first real application of deep learning to object detection was arguably R-CNN, a region-based convolutional neural network that had been proposed only this month.
While traditional algorithms had stalled at an mAP of 30 to 40, the neural-network-based R-CNN broke past 60 mAP in one stroke.
The R stands for region. Put simply, the detection task is to point out the position, or region, of each object in an image.
Even in 2014 and 2015, when the R-CNN series led the field in accuracy, its inference was extremely slow.
With Oxford University's 2014 VGG network as its backbone, it took dozens of seconds to process a single image. Real-time use was out of the question; it was suitable only for academic research and hard to deploy in industry.
Even a year or two later, after repeated upgrades, the accelerated Fast R-CNN line managed only 0.5 FPS to single-digit FPS.
The algorithm Meng Fanqi had handed over, YOLO, exceeded 80 FPS even on 448 x 448 images.
With the smallest model variant, inference could even reach an astonishing 200 FPS.
Even ten years later, how many people would still be unable to hit 100 frames per second when playing games?
The original first version of YOLO actually fell short on accuracy. A detector built for speed inevitably sacrifices some.
But by the time Meng Fanqi first encountered YOLO, it had already reached V4, and by 2023 it had gone as far as V7 and V8.
On many of the finer details, Meng Fanqi would not have known how to get them wrong even if he had wanted to.
What came to mind first was always the optimized version of the technique.
At this point, the more commonly used detector was DPM, which scored 26.1 mAP at 30 FPS and only 16.0 mAP at 100 FPS.
R-CNN, released just this month, was a qualitative leap in accuracy at 50 to 60 mAP, but its speed had dropped to a fraction of a frame per second, making it unusable in practice.
The results Meng Fanqi handed over were 69.5 mAP at 82 FPS and 58.3 mAP at 200 FPS.
This could no longer be called an ordinary improvement. It blew the competition away, and then blew it away again.
Apart from this oversight, however, Meng Fanqi had in fact been deliberately pushing these numbers up.
Of all the AI technologies he had mastered, detection was the one that could be turned into a product fastest at this stage.
The function was straightforward, easy to understand, and easy to demonstrate.
Just connect a camera and demonstrate it live. The audience could watch the AI smoothly detect common objects on screen, such as tables, chairs, people, animals, and plants, and feel the most direct kind of shock.
Technologies such as image generation and conversational language models would still need time, along with massive data and computing resources, before he could build them himself.
In terms of practical applications, detection was not only the easiest technology to deploy at this stage, but its future prospects were also very broad.
In two or three years, companies working on autonomous driving would be as countless as crucian carp crossing a river.
Making as dramatic a breakthrough as possible in detection would greatly help his standing in the field later on. To put it bluntly, it would make it easier to rake in money.
It was just that this was his first time wielding the knife, and for lack of experience he had not judged the cut well. That could easily cause misunderstandings among people who knew the field better.