Chapter 9 The disciple has forgotten everything
"What kind of confusing plot is this?"
Earlier, I saw the instructor from the School of Electrical Engineering coming to him in a menacing manner, and I thought this guy was a problem student.
How did it turn into such a profound and professional discussion?
Who am I? Where am I? What are they talking about?
Why do I know every single word, yet can't understand them once they are put together?
Time passed slowly, and before you knew it, Dean Fu's drafts and deductions had filled six or seven pages of A4 paper.
“If we simply use y = f(x) + b to describe the behavior of one layer, then for any positive integer k, there exists a neural network of depth k^3 and constant width that cannot be fitted by a k-layer neural network, unless that network's width is 2^k times the original."
The more Dean Fu calculated, the more he felt that this guy had good intuition.
"If, as you said, batch normalization is used to avoid the problem of covariate shift, and nonlinear functions provide additional expressive power for each layer, then even in the extreme case it would be 2^(k^3) times the width. It seems that your idea of solving the depth problem first, and only then tackling the training difficulty, is a very smart choice."
Dean Fu pondered for a while, reviewed his calculation process, and asked, "Do you understand?"
Meng Fanqi shook his head very seriously and said, "I don't quite understand."
Dean Fu smiled and patiently went through the entire process again in order, without a trace of annoyance.
Even though Meng Fanqi had read these three papers several times in his previous life, he had never been able to fundamentally clarify the mathematical relationship between them. The AI world and the mathematics world view this issue from completely different angles, and Dean Fu gave Meng Fanqi a new perspective and understanding.
Now that a senior mathematics professor had sorted out the mathematical principles and relationships behind the papers for him, Meng Fanqi felt suddenly enlightened, yet still hazy, as if a thin layer of gauze remained between him and the answer.
After studying the pages carefully for a long time, things seemed a little clearer. "I seem to understand a little now."
"Read it a few more times to consolidate it." Dean Fu stood up after hearing this, patted Meng Fanqi on the back, and said, "My office is in Building 503 of the Institute of Science and Technology. If you have any questions in this regard, you are welcome to come and discuss them with me."
After saying that, he turned around and left without even asking Meng Fanqi's name. It had been a purely academic exchange.
Before leaving, he shot a look at the two graduate students from the School of Mathematics who were craning their necks and looking around. It probably meant, "Look at him. He's an undergraduate, and he has almost written a paper. Look at him, then look at yourselves."
The two graduate students quickly lowered their heads and looked away.
Meng Fanqi was left alone to savor the complex argument. After a while, he felt that he understood it, yet he also seemed to have forgotten everything, and he slipped into a strange, profound state.
This must be how Zhang Wuji felt when he learned Zhang Sanfeng's Tai Chi sword.
--------------------------
The friendly guidance from Dean Fu of the School of Mathematics and Physics made up for the weakest link in Meng Fanqi's paper writing at this stage: a lack of solid mathematical analysis and formula derivation.
In the middle and later stages of the AI discipline, no truly convincing theory had been found to explain the remarkable power of deep neural networks. As a result, many articles focused on performance and practical applications came to read more and more like experimental reports than papers.
This has been criticized by many people.
The number of submissions increased sharply every year. By around 2017, AI-related fields had become a flood zone of watered-down papers in the eyes of many researchers. The rolling Yangtze River is full of water, and there are not many heroes among the waves.
However, at this point in time, reviewers at many conferences and journals still paid great attention to the theoretical part. If the argument and reasoning in this part were not clear and smooth enough, then no matter how good your results were, even if you were LeCun, one of the three giants of AI, you would be ruthlessly rejected.
Having stumbled upon this good fortune, Meng Fanqi did not stand on ceremony. He had dozens of early papers that needed to be finished as soon as possible, and many of them had left him stumped as to how to write them up.
Now that he had such a good teacher, he naturally meant to consult him often.
He didn't need Dean Fu to spend a lot of time reading specific articles. He only needed help with how a few key formulas followed from one another, and with the tiny details he had never paid attention to when reading in his previous life. That was enough to lift the seemingly invisible veil and let him peek into the real mystery behind it.
A few weeks later, around five o'clock on a Friday afternoon, Meng Fanqi had just walked out of the dean's office in the Mathematical Sciences Building.
Inside, Dean Fu did not get up and go to the cafeteria to eat as usual. Instead, he took out all the papers related to the questions Meng Fanqi had been asking and looked through them carefully.
Deep learning and image algorithms were not Dean Fu's area of expertise.
But for an image to be displayed on a monitor and stored in a computer, it must take the form of a matrix. Put bluntly, it is a rectangular grid of numbers, like one face of a Rubik's Cube, whose nine squares can hold nine numbers, that is, a 3 by 3 square matrix. Each number is a pixel, the smallest component unit of the image.
And when it came to operations and transformations on matrices, Dean Fu was a master.
Although Meng Fanqi never came with a complete paper, Dean Fu was able to glimpse the whole from these scattered questions.
"The first thing we discussed was the problem of optimizing extremely deep networks. The residual connections and batch normalization he proposed should fall under model structure."
"But later he came to me to discuss first-order and second-order gradient calculations, as well as several variations. That should fall under parameter optimizers."
"Further on, he asked about error analysis and propagation, which belongs to numerical analysis." Dean Fu frowned and thought carefully. "This should involve the different ways the same number can be stored in a computer. Different storage formats naturally take up different amounts of computer resources, but some error is bound to be introduced."
"Today, the things he asked were even more imaginative." Dean Fu recalled the premises and background of Meng Fanqi's question today and marveled at his bold imagination. "Two deep networks, one responsible for generating and the other for judging."
"The generator keeps creating images meant to pass as real, and the discriminator is responsible for telling whether its input is real or produced by the generator."
"This adversarial approach can break free of the supervised mode, in which model training always relies on humans to supply the correct answer for each piece of data, one by one."
Dean Fu murmured to himself. Academician E was actively promoting work related to big data, and as one of his top disciples, Dean Fu naturally understood the scale of today's data, and just how difficult and laborious it would be to label it all one by one.