134. I will give you a Qiding King

Style: Romance | Author: CloseAI | Words: 2593 | Update Time: 24/01/11 09:49:09
Compared with the computer's victory in international chess, progress on Chinese chess programs has lagged behind.

This is not because Chinese chess is more difficult than international chess, but because chess intelligence is just a public relations tool for large companies and has no actual revenue value.

After "Deep Blue" won at chess, many people thought that computer chess was more or less finished.

Continuing to play Chinese chess, which is of similar difficulty, is thankless, and IBM also disbanded the "Deep Blue" team.

Only Go is indeed much more difficult and challenging.

It is generally believed that it is much harder for a computer to win at Go than at games such as chess, because the Go board is so large, there are so many points to play on, and the branching factor is far higher than in other games.

Moreover, the situation swings wildly with every move, and heaven can turn to hell in an instant. After the technology matured, people could often watch an AI system's estimated win rate swing by 60-70% on a single move.

It can be said to be the best interpretation of "one careless move and everything is lost".

Traditional artificial intelligence methods such as brute-force search, alpha-beta pruning, and heuristic search struggle to work in Go.

However, Go has no audience in the West. The main popularity is still in the three East Asian countries. Therefore, for a long time not many people were willing to spend time on this matter. The development speed in the past ten years has been quite satisfactory.

DeepMind's investment in this matter was largely a coincidence.

On the one hand, many senior executives love chess. On the other hand, perhaps more importantly, Huang Shijie, a core member of DeepMind and one of the two chief scientists, has a deep accumulation and affection for Go intelligence.

Huang Shijie's master's thesis is "Ko Strategy in Computer Go" and his doctoral thesis is "New Heuristic Algorithms Applied to the Monte Carlo Tree Search Method in Computer Go".

Compared with people like Meng Fanqi, who forgot all their undergraduate professional knowledge after graduation, Dr. Huang's major could hardly be more relevant to his work.

"In fact, current Go programs are already somewhat competitive." Dr. Huang introduced Meng Fanqi to the playing strength of current Go AI: "The strongest are at about amateur five-dan level. Without handicap stones, they have no chance of winning against real professional players."

Meng Fanqi still has a rough idea of how Go skill levels are classified. Amateur sixth-dan is roughly comparable to professional first-dan.

Dr. Huang Shijie himself is an amateur sixth-dan player in Taiwan and can be regarded as a professional-level goalkeeper.

If the program he builds can consistently gain the upper hand against him, leaving him no way to fight back, it basically means that Go AI has reached a truly professional level.

Rather than only being able to beat professional players with a handicap of 3-4 stones.

Moreover, if the program he created could not even beat its own creator, the whole effort would be rather meaningless.

"What are your current ideas and strategies?" After briefly chatting about the situation, Meng Fanqi turned to the specific algorithm part.

Theoretically speaking, the input of a Go problem is actually very similar to the image problems that Meng Fanqi is so good at.

The form of color pictures in computers is a multi-channel matrix, usually 3 channels, representing the three primary colors.

For example, a picture with a resolution of 224x224 is stored in the form of three [224, 224] matrices.

Generally speaking, the value of each position is between 0 and 255.

In the case of Go, its input is like a 19x19 single-channel image.

19x19 represents all the placement locations on the chessboard, and the value of each location has only three states: black, white, and no stones.

It can be represented by three numbers [-1, 0, 1].
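The comparison between a color picture and a Go board can be sketched in a few lines of plain Python. This is only an illustration: the text does not say which number stands for which color, so the mapping of -1 to black and 1 to white, and the helper name `empty_board`, are assumptions made here.

```python
# Assumed encoding for illustration only: -1 = black, 0 = empty, 1 = white.
BLACK, EMPTY, WHITE = -1, 0, 1
BOARD_SIZE = 19


def empty_board(size=BOARD_SIZE):
    """Return a size x size board (list of rows) with every point empty."""
    return [[EMPTY for _ in range(size)] for _ in range(size)]


# A 224x224 color picture is three [224, 224] matrices with values in 0..255.
rgb_shape = (3, 224, 224)

# A Go position is like a single-channel 19x19 image with three possible values.
board = empty_board()
board[3][3] = BLACK    # a black stone on one point
board[15][15] = WHITE  # a white stone on another point
board_shape = (1, len(board), len(board[0]))

# Count the points that are still empty.
empty_points = sum(row.count(EMPTY) for row in board)
```

Seen this way, the board is simply a much smaller "picture" than the ones image networks usually handle.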

The goal of a Go AI is simply to play the game.

Leaving the underlying principles aside, what it does from the outside is this: given such a [19, 19] board, the program may change exactly one 0 (a point with no stone) into a stone of its own color (the number -1 or 1), while making that side's probability of winning as large as possible.
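As a rough sketch of that outside view, the code below lists the empty points, tries each one, and keeps the move with the highest score. Everything here is hypothetical: `toy_win_probability` is a made-up stand-in that merely prefers stones near the center, not a real evaluator, and the sketch ignores captures, ko, and every other rule of Go.

```python
EMPTY = 0


def candidate_moves(board):
    """All points that currently hold the value 0 (no stone)."""
    return [(r, c)
            for r, row in enumerate(board)
            for c, value in enumerate(row)
            if value == EMPTY]


def play(board, move, color):
    """Return a copy of the board with one 0 changed to `color` (-1 or 1)."""
    r, c = move
    if board[r][c] != EMPTY:
        raise ValueError("point is already occupied")
    new_board = [row[:] for row in board]
    new_board[r][c] = color
    return new_board


def toy_win_probability(board, color):
    """Made-up stand-in for a real evaluator: favors stones near the center."""
    center = (len(board) - 1) / 2
    total = 0.0
    for r, row in enumerate(board):
        for c, value in enumerate(row):
            if value == color:
                total += 1.0 / (1.0 + abs(r - center) + abs(c - center))
    return min(1.0, total / len(board))


def choose_move(board, color, win_probability):
    """Pick the empty point whose resulting position scores highest."""
    return max(candidate_moves(board),
               key=lambda move: win_probability(play(board, move, color), color))


board = [[EMPTY] * 19 for _ in range(19)]
best_move = choose_move(board, 1, toy_win_probability)
```

With this toy evaluator the chosen point on an empty board is the center; a real program would replace the evaluator with something learned from game records.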

"The chessboard is a black and white single-channel image with a resolution of 19." Ordinary people would never think of this.

But for Meng Fanqi, who is more familiar with image technology and deep neural networks, it is a natural thing and concept.

"We were inspired by the breakthrough of deep neural networks. Before AlexNet at the end of 2012, the Go AI Crazy Stone had the highest accuracy, at about 35%.

Currently, we are mainly studying how to use deep neural networks to make the judgment of Go intelligence more accurate.

The deep neural networks pioneered by Alex and by you have made amazing breakthroughs in classification problems. This is a big reason why we started this project this year.

We are trying to collect a large number of professional game records and now have more than 100,000 games, from which millions of individual moves can be extracted.

With this data, we are now working out a suitable network structure. In this regard, I think you are the expert among experts."

"I roughly understand." After listening to this, Meng Fanqi basically understood DeepMind's current thoughts and progress.

Although Dr. Huang has done a lot of research on Go AI projects before, the AlphaGo project has only just begun, and it is based on new deep network technology.

So far, they have not formed a complete approach to learning and adversarial play, and the overall structure of policy network, value network, reinforcement learning, and Monte Carlo search has not yet taken shape.

It is still at a relatively early stage, and it has not even been decided yet which network structure is better to use. At this time, the structure of the model itself is being tested and designed.

"This aspect is indeed the direction I am good at. Especially recently, I have some ideas on the design of CPUs and small models. These contents should be of some help to you."

When it comes to which operators suit a network on which devices and for which tasks, and how to trade off speed against performance, Meng Fanqi would be the undisputed number one even five years from now.

Because the trade-offs and conclusions he is familiar with are all experimental results of NAS (Neural Architecture Search) run on large platforms such as Google's.

The so-called NAS is actually a method of exhaustive comparison.

On a specific data set, all imaginable operator combinations are tested in an exhaustive manner.
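The exhaustive comparison described here can be sketched as a loop over every combination in a small search space. The search space, the operator names, and the scoring function `toy_evaluate` below are all invented for illustration; a real search would train and measure each candidate network on the data set, which is where the cost comes from.

```python
from itertools import product

# Hypothetical search space: every combination will be tried.
SEARCH_SPACE = {
    "operator": ["conv3x3", "conv5x5", "depthwise3x3"],
    "depth": [4, 8, 16],
    "width": [32, 64],
}

# Made-up numbers standing in for measured quality and compute cost.
OP_QUALITY = {"conv3x3": 3, "conv5x5": 4, "depthwise3x3": 2}
OP_COST = {"conv3x3": 2, "conv5x5": 5, "depthwise3x3": 1}


def toy_evaluate(config):
    """Stand-in for 'train this network and measure it': quality minus cost."""
    quality = 10 * OP_QUALITY[config["operator"]] + config["depth"] + config["width"] // 8
    cost = OP_COST[config["operator"]] * config["depth"] * config["width"] // 32
    return quality - cost


def exhaustive_search(space, evaluate):
    """Score every combination in the space and return (all results, best)."""
    names = list(space)
    results = []
    for values in product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        results.append((evaluate(config), config))
    best_score, best_config = max(results, key=lambda item: item[0])
    return results, best_score, best_config


results, best_score, best_config = exhaustive_search(SEARCH_SPACE, toy_evaluate)
```

Even this toy space has 3 x 3 x 2 = 18 candidates, and each added choice multiplies the total, which is why the search space grows so quickly.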

The resulting network structure will of course be better and faster than one designed by humans, but it may not transfer well to data that differs greatly from what it was searched on.

The cost of obtaining this answer is quite staggering. As the search space increases, it obviously requires terrible computing resources to support it.

Fortunately, the main conclusions had already been tested over several years on tens of thousands of graphics cards at several large companies, and Meng Fanqi got all of them for free.

The cost of this knowledge is probably more than a billion dollars.

"That's great." Dr. Huang beamed when Meng Fanqi agreed to help with AlphaGo's network design. "At this stage, except for the need to quickly iterate and compare to determine the network structure, we don't have any particular difficulties."

"If I have to say, there is a lack of a professional goalkeeper-level human chess player." Dr. Huang thought for a while and added.

He is an amateur sixth-dan player, so he can actually take on this role.

But after all, he knows the Go AI too well, and the testing effect may not be realistic enough. Moreover, he is also very busy and cannot be responsible for the game testing all the time.

At the same time, Go is not very popular in Europe and the United States; even Fan Hui, a professional second-dan, has won many European and American Go championships. Testers at the professional goalkeeper level are not so easy to find.

"It doesn't matter. I will go to the UK in February to confirm the effect and follow-up ideas with you." Meng Fanqi smiled when he heard this: "I will find you a goalkeeper among goalkeepers then."

When it comes to professional goalkeeper-level players, surely no one is more suitable than the Qiding King, Zhan Ying, who has tried for seven consecutive years and is making his eighth attempt at professional status this year?

This would be his eighth time guarding that door.