Dr. Hongjiang Zhang: The Essence of AI and its Opportunities | 2017 Code Class

Source Code Capital announced during the 2017 Code Class that Dr. Hongjiang Zhang, former CEO of Kingsoft, has joined Source Code as a Venture Partner.

Dr. Hongjiang Zhang shared his opinion on “The Essence of AI and its Opportunities.”

Dr. Zhang’s full speech is as follows:

Today, let’s start with AlphaGo. Our agenda will include the reasons behind the recent AI trend; the fuel that drives machine learning forward, namely the development of big data; the basic principles of deep learning; and the impact of AI on our daily lives, our creativity, and our workplace. Finally, we will discuss the opportunities and challenges of investing in AI.

 1. Let’s talk about the trend of AI, beginning with AlphaGo: Algorithm + Computing Power + Big Data

  • Deep Neural Networks are the typical deep learning algorithm
  • AlphaGo consumes 300 times more energy than a human during each match
  • The explosion of data has changed our lives over the last decade

The current wave of AI actually started last year with the match between AlphaGo and Go master Lee Sedol. I bet on AlphaGo’s victory even though I have never learned to play Go.

Why did I make this bet? We had read articles about AlphaGo, which is a typical deep learning system. It leverages Deep Neural Networks as well as Reinforcement Learning, Semi-Supervised Learning, and Monte Carlo Tree Search methodologies.

As a whole, AlphaGo combines classic techniques with new machine learning algorithms, enhancing the performance of its algorithms and its ability to learn. It developed a powerful ability to digest and absorb knowledge by increasing the hidden-layer depth of the artificial neurons in its deep learning network. That is the algorithm perspective.

In fact, there are two other significant underlying properties, which are results of the advancement of computer science over the last several decades.

First, the last twenty years of advancement of the Internet brought us big data, especially big data of very high quality. Take AlphaGo: before its match with Lee Sedol, it had already studied over one hundred sixty thousand games played by humans ranked between six and nine dan, gathering over thirty million board positions. These data were critical to AlphaGo’s advancement. Furthermore, by playing against itself, AlphaGo gathered another thirty million positions, which it used to train its value network. This is the point I would like to emphasize today: high-quality big data.

Second, there are high-performance computing resources. Let’s take a look at the final form of AlphaGo, the one that played the match against Lee Sedol. We know it used 1,920 CPUs and 280 GPUs. What does this mean? Go masters are ranked from first dan to ninth dan, with the Elo rating serving as a more fine-grained measurement of skill.

With 1,920 CPUs and 280 GPUs, AlphaGo increased its Elo rating from over 2,000 to over 3,000, reasonably close to Lee Sedol’s rating of about 3,500. AlphaGo defeated the Go master precisely because it possessed high-quality big data, high-performance computing resources, and new learning methods. A point worth mentioning: between Lee Sedol’s hard-won victory in the fourth game and the next morning, AlphaGo played another one million games against itself.

Why is the current AI trend different from the last two? The AI we discuss today is mostly a combination of big data with newer, leaner algorithms. Moore’s Law has guided the advancement of computing resources over the years and allowed them to grow exponentially, while the cost of computing decreased exponentially. As computing power gets ever more powerful and its cost gets ever lower, we can afford to utilize a vast amount of computing power economically.

The other fundamental change over the last ten years is that if you do not wish to purchase computing hardware, you can use cloud computing. When we look into this, we find that the ability to handle big data on cloud platforms is already available as standardized cloud services, convenient and low-cost. This changed the old reality that machine learning was possible only for large corporations; small companies can now do it on the cloud as well.

Let me come back to data, my favorite topic. Even before AlphaGo, I talked about big data constantly. Of course, I am even more motivated now that I evangelize Kingsoft Cloud. Big data has indeed changed our workplace, our lifestyle, and the way we think.

An IDC report pointed out that the total data produced by the entire human population in 2013 was 4.4 ZB (zettabytes). This will grow to 44 ZB by the year 2020, a ten-fold increase and an annual growth rate of 40%. Currently, Walmart users generate more than 2.5 PB (petabytes) every 4 hours. Twitter users publish 500 million tweets per day. Toutiao serves 6 billion requests a day, totaling 6.3 PB. These numbers illustrate the sheer volume of data generated by humanity every day. If we had a way to track and label all these data, they could be used to drive AI algorithms.

2. Let’s talk about Big Data, the Fuel that Drives Machine Learning

  • The advancement of facial databases dramatically increased recognition accuracy
  • Jim Gray’s four paradigms of scientific research: Empirical Observation, Theoretical Models, Computational Models, and Data-Driven Models
  • Big Data is becoming the industry standard for Enterprise AI

I will share an example of Facial Recognition.

Today, facial recognition is a proven technology. A series of companies have developed facial recognition with higher accuracy than humans. This feature is excellent on our cell phones. Previously, when a person had taken many photos, searching through them was a problem: you might need to remember when a photo was taken. It is much easier, however, to remember whom you took a photograph with, and facial recognition makes it very easy to find what you are looking for. Twenty years ago, this would have been a feature of dreams. Today, it is available on a cell phone.

This is a screen capture from my cell phone. My photos, my wife’s photos, and the kid’s photos are all here. Mr. Lei Jun, my previous boss, is also here. With one click, all photos of him show up. Suppose I want a particular photo, this one of Lei Jun with Michael Dell, the founder of Dell Computer, and others on the Dell team. We can see the cell phone has automatically tagged all of them with their names. You tell it once who this person is and who that person is; from then on, all of their photos are recognized as soon as they are imported. This feature is available on all brands of cell phones today.

This is a patent I filed twenty years ago in the U.S. Its subject is precisely the workflow I just discussed: after taking a new photograph, compare it with the photos in the database and recognize the individual in the photo. Twenty years ago, we were well aware of the limited computational power of mobile devices, so we envisioned this processing power coming from distributed computing, what we call the “cloud” today. Twenty years later, we have turned that concept into reality; credit may go to the advancement of algorithms, but just as well to the advancement of computing power.
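The patent workflow above, comparing a new photo against a database of known faces, can be sketched in a few lines. This is a hypothetical illustration, not the patented method: it assumes each face has already been reduced to a numeric feature vector (an "embedding") by some upstream model, and simply finds the closest match by cosine similarity.

```python
import math

# Hypothetical sketch of the recognize-against-a-database step. The names and
# 3-dimensional "embeddings" below are made up; a real system would use
# high-dimensional vectors produced by a trained face-recognition model.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def recognize(new_face, database, threshold=0.8):
    """Return the best-matching name, or None if no match is confident enough."""
    best_name, best_score = None, -1.0
    for name, reference in database.items():
        score = cosine_similarity(new_face, reference)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None

database = {
    "alice": [0.9, 0.1, 0.2],
    "bob": [0.1, 0.95, 0.3],
}

print(recognize([0.88, 0.12, 0.18], database))  # "alice"
print(recognize([1.0, 1.0, 1.0], database))     # None: nothing close enough
```

The threshold is what keeps an unknown face from being mislabeled as the nearest known person, which matters when, as the speech notes, the processing happens in the cloud over millions of photos.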

What I want to talk about is the advancement of the facial recognition database.

Twenty years ago, when we worked on facial recognition, we could take a few hundred sample photos. Today we can get billions. In the early 1990s, there was only one database, with several hundred individuals and several hundred pictures; by the late 1990s and early 2000s, databases grew to thousands of individuals and tens of thousands of photos. You can see the corresponding rise in accuracy. About five or six years ago, when industry entered the field, Google and Facebook began independently applying deep learning to facial recognition, using far more data and thus achieving far higher accuracy.

With the massive growth in data, the accuracy of the same algorithm improves. When I use more computation, more CPUs, there is also rapid growth in performance. This proves the point I shared earlier: data may be more important than the algorithm. We can also say that, without this sheer volume of data, a deep neural network would be unimaginable.

No matter how hard the problem, in the hands of the Chinese people there will be a solution. With so many surveillance cameras everywhere, and so many avatar photos and citizen identity card photos, there is no place in the world like China. This is China’s advantage. Today there are not merely 200 million photos, but billions of photos and hundreds of millions of tagged people. Only with big data of this scale can you apply a deep neural network to extract content and information.

Today, the facial recognition provided by these companies far surpasses human accuracy; they are the most advanced in the world. When a photo is compared against the database to determine whether you are in it, the error rate is measured in ten-thousandths. Basically, with surveillance cameras and these algorithms, you should not be doing anything wrong in China, even in your car. If a photo is taken at a gas station when you put your hand where it should not be, it will get out very quickly. This is how accurate and precise recognition has become.

A year ago, Jian Sun of Microsoft Research Asia led his team to build a 152-layer neural network, achieving image recognition accuracy better than a human’s. Note the trend: as model complexity increased from 8 to 152 layers, the amount of processing increased, and the training data increased consistently. In 2012, the eight-layer network had more than 650,000 neurons and more than 600 million connections. The 152-layer network has 22 million neurons and, with new algorithms and more careful parameter tuning, 11.3 billion connections. Our brain, by comparison, has around 100 billion neurons and likely more than one quadrillion synaptic connections (a million billion).

The advance of AI is in many ways a fundamental shift from classical model- and rule-building to the data-driven machine learning we have today. This transformation happened because we now have data with more coverage and more precision, so we are less dependent on handcrafted models, and ever more complex models have enough data to train them.

Traditional AI algorithms and the neural networks of the past could not reach the accuracy we have today. The reason is that we did not have enough useful data and were dependent on models and specific algorithms. Now, however, we have to a large extent covered the entire pattern space. We have so much data that once-challenging problems now have excellent solutions.

The performance of different algorithms also changes with data volume: as the amount of data grows, precision improves. However, you may ask: do we now have sufficient data to allow our AI to cover all scenarios? Last year, there was the first fatal accident involving a Tesla. This means that even Tesla, with hundreds of thousands of cars on the road, still does not have enough data, and there are still fatalities in some situations.

If a database technologist does not know who Jim Gray is, he really should not tell others that he is a database technologist. Jim Gray proposed the four paradigms of scientific research more than ten years ago. Our earliest paradigm was purely based on observation and experimentation. From there, we advanced to theoretical models around one hundred years ago. Several decades ago, we proceeded to the computational model. Finally, we have the data-driven model today. Over the last ten years, big data has progressed rapidly and is already being utilized on a large scale within enterprise applications.

A U.S. consulting firm surveyed over three hundred companies, each with more than three thousand employees. They found that 60% of the IT companies had started to use big data at different levels of maturity. The earliest stage is to gather statistics on what happened. Next, they analyze why it happened. Now they can predict what will happen. In the future, big data will provide insight into sound business decisions and advance toward an understanding of how to execute those decisions. This insight is a self-learning ability.

A well-known company, Intel, has been acquiring AI and data-generating companies relentlessly. For example, it acquired the Israeli company Mobileye at a very high price. The reason is simple: Intel believes automobiles will be devices that generate a significant amount of data about people’s lives. These data will help analysts profile people and inform business decisions across various applications. Intel wants to control the entire procedure of generating and processing these data, which means controlling a new platform; this is why it is investing such an enormous sum of capital in this domain.

Now that we covered computation and big data, let’s return to the development of the algorithm we mentioned earlier.

3. The Fundamental Principle of Deep Learning

  • Autonomous machine learning is deep learning driven by Big Data
  • The third wave of deep learning is: Big Data + Powerful Computation + New Algorithm

The third wave of AI has finally arrived after sixty years of accumulation. This wave appears stronger than ever, solving ever more problems. The important thing is that our deep learning methodologies are very different from traditional expert systems. The expert-system method is for humans to gather rules and hand them over to the machine, which then applies those rules to a use case. The deep learning method is driven by big data: the machines learn by themselves. The benefit is that, because the machines do the learning themselves, it is relatively easy to extend from one use case to another.

The last ten years happen to be a decade of rapid development for deep learning. In 2006, Hinton published the paper that coined the term “deep learning.” In 2010, with the explosion of big data, the deep learning trend began. In 2012, Hinton’s team took first place in the annual ImageNet competition with a CNN model, beating the second-place solution by roughly ten percentage points. In 2016, AlphaGo wiped out people’s remaining doubts about the capability of deep learning. It will change humanity and marks the dawn of a new era.

What is deep learning? The neural network was used as early as the second wave, in the 1980s and 1990s, when it was overused everywhere. There was not enough data at that time, so networks had only an input layer, an output layer, and a single hidden layer. The other fundamental change since then is the sheer processing power of today’s devices.

Why is the deep learning methodology different? First, it is based on the neural network, a simulation of how the brain produces thought. The human brain has around 100 billion neurons and more than one quadrillion synaptic connections; the number of neurons and connections is an important indicator of the capacity for intelligence. Each neuron is a cell body connected to others through synapses. We simulate this with an artificial neuron: it receives weighted inputs from other neurons, applies a non-linear function, and produces an output. The output is the result you need. When there are many neurons or many layers, you naturally need more data for training.
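The artificial neuron just described can be written out directly. This is a minimal sketch under the usual textbook formulation, a weighted sum passed through a sigmoid non-linearity; the weights and inputs are illustrative values, not from any trained network.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of inputs from other neurons,
    passed through a non-linear function (here a sigmoid)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Three inputs from upstream neurons, with illustrative weights.
output = neuron([0.5, -1.0, 2.0], [0.4, 0.3, 0.1], bias=0.0)
print(round(output, 3))  # 0.525
```

Stacking many such neurons into many layers is what produces the enormous connection counts quoted earlier, and why training them takes so much data.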

Why is big data the driving force of deep learning? The classical approach of simulation cannot reach 100 billion neurons and one quadrillion connections, because the complexity of physical simulation constrains it; with today’s computers we can approach this scale. What does deep learning mean? It is simple: deep learning is just a neural network with more layers. In every training cycle, the network produces an output. When this output differs from the target, the difference is sent back through the network as feedback for training, and the cycle continues.
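The training cycle described above, produce an output, compare it with the target, feed the difference back, can be sketched with the simplest possible “network”: a single linear neuron trained by gradient descent. This is an illustrative toy, not any real system’s training procedure; deep networks propagate the error back through many layers (backpropagation), but the loop has the same shape.

```python
def train(samples, epochs=200, lr=0.1):
    """Fit output = w * x + b by repeatedly feeding the output error back."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            output = w * x + b        # forward pass: produce an output
            error = output - target   # compare with the target
            w -= lr * error * x       # feedback: nudge the weight
            b -= lr * error           # ...and the bias, then repeat
    return w, b

# Learn y = 2x + 1 from three examples.
w, b = train([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
print(round(w, 2), round(b, 2))  # close to 2.0 and 1.0
```

With more neurons, more layers, and a far bigger sample set, this same loop is what consumes the big data the speech keeps returning to.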

4. The Advancement and Impact of AI

  • AI’s application and future: Support humans, Replace humans, Surpass humans.
  • Machines will make 90% of people redundant in the future society, but they cannot easily replace capitalists, artists, and craftspeople.
  • Machines have surpassed humans in perception, but it will take five to ten years for them to do the same with cognition.

That covers deep learning. We can imagine the growth in complexity when a neural network expands to 152 layers. This growth enables modern AI and machine learning not only to support people, but potentially to replace us to a large extent and to surpass us in the future. Perhaps we are not willing to accept this idea.

I do believe AI will surpass us in the future. How? Consider how much of what humans do AI can already do. Can humans play one million matches in one night, as AlphaGo can? Can we gather data and learn from hundreds of thousands of moving cars at the same time, as Tesla can? No, we cannot. Along the same lines, a human cannot process data from train stations and airports all over the country simultaneously. We simply cannot operate at the same scale.

AI will replace people, and it is a matter of time before it surpasses them. Not only that: machine learning is already better than humans in some scenarios. For decisions that people make based on observation and thought, AI can learn to make them faster. AlphaGo demonstrated this by defeating humans at Go.

Investment decisions, policy, urban planning, and war-gaming are all activities that depend heavily on experience. We can see from AlphaGo that AI will surpass humans in these scenarios, because the machine has powerful self-learning capabilities. Driving, skiing, painting, and playing the violin are activities that cannot be learned by just following instructions, and AI is surpassing humans in these domains as well. In reality, AI has a better grasp of many things we feel proud of.

A while ago, AlphaGo played Go against humans anonymously. When Ke Jie, the world champion, lost his match, he reflected that humanity has only skimmed the surface of Go in its 3,000 years of history. When we play Go, the human mind can see locally optimal moves but cannot evaluate grand strategies across the hills and valleys of the decision space. AlphaGo, on the other hand, can look beyond the local situation and think outside the box, because it possesses superior data processing power. It is a plain fact that humans will never defeat AlphaGo again.

Let me share an example from Microsoft Research Asia. In this example, when the machine observes a stop sign, it can describe it: it is in the city, it is red, it is on a signpost, and it is related to traffic. This system tells stories about input pictures rather than just producing a name.

There is another example, a photograph of a woman preparing food in the kitchen. The first description is, “A woman preparing food in the kitchen.” The second says, “A woman preparing breakfast or lunch beside the kitchen sink.” The first one is machine-generated. At this point, the machine has surpassed humans: you could say this particular person is not good at telling a story, but at least the computer can tell a better story than he can. Of course, all of this is still in the experimental phase.

People say that under the AI trend, the safest job is archeologist. Unfortunately, our society does not need many archeologists, and they will not earn unusually high salaries. There is a Chinese proverb: “A man must enter the right profession, and a lady must marry the right man.” With all these changes, what jobs will AI eliminate in the future, and how will society evolve?

Globalization is the process of finding the lowest-cost suppliers on a global scale. Thus, globalization causes inequality and the Matthew Effect. With ever more efficient transnational corporations, we see the decline of blue-collar employment in the developed world, including the U.S. Will AI accelerate this tendency?

In the future, there will be two types of people: the polymath and the slacker. The problem is that nine out of ten people will be slackers. What will we do then? Last year, Switzerland held a referendum on a universal income of three thousand Swiss francs for everyone, regardless of whether they did any work. The Swiss are rational, so it was defeated. In the future, there will be three types of people who can hold their own against AI. Capitalists will be fine, since the future still runs on capital. The other two are artists and craftspeople, because machines will not be able to learn these skills anytime soon. Of course, most people do not fit into any of these types.

Where are the limits of AI? Artificial General Intelligence (AGI) still has a long way to go. Machines have surpassed humans in perception, but cognition may take another five to ten years, or even longer.

Do deep learning methodologies have problems? Yes, and a big one of our own creation. Machine intelligence is built when machines learn from observation and experience and effectively program themselves. Programmers do not write commands to solve the problem; instead, the program generates its own algorithm from example data and the expected output.

In many industries, we are already approaching these goals. The first example is Nvidia’s autonomous vehicle, which learned by observing the behavior of human drivers and determined how to drive without explicit instructions from programmers. The second example is a system named Deep Patient, developed at a New York hospital. The system learned from roughly seven hundred thousand patient records, deduced patterns, and produced compelling disease-prediction capabilities; it is extraordinarily powerful in predicting schizophrenia, where it surpassed doctors. In the third example, the U.S. military has invested heavily in machine learning to supply attack targets for vehicles and aircraft, mining a significant amount of data for terrorist information with effectiveness far exceeding initial expectations. Deep learning already possesses such abilities, but it is still not capable of explaining its actions. It is still a black box.

In human history, no machine had ever been built whose actions and decisions its creators could not comprehend. Today, we have created deep learning machines that cannot explain their own operations, and this makes us uncomfortable. People ask themselves: we make many decisions, but can we explain why we do what we do? Yet even though humans tolerate this in themselves, they cannot tolerate it in machines. The U.S. Department of Defense has identified the lack of explainability in machine learning as a key obstacle.

Of course, we know people will have to collaborate with machines in the future. Consider the evolution from animals to humans: a core property of the evolution of intelligence is that it produces a system that cannot be explained by its creator. I cannot guarantee that God would understand all the various things we do today. To address the concern about explainability, researchers have started working on ways to understand and track the decision-making process.

The final objective: what is the difference between machine and human?

Machines are faster than you, bigger than you, and more powerful than you in certain respects. So what is the difference between machines and humans?

Machines do not have a survival instinct, the fear of death. This is the core difference in the definition of humans and machines. The evolution of humans and animals depends on a series of survival instincts: they escape when attacked, avoid pain, consume food and procreate, and feel a sense of belonging. Humans also do evil because they fear death and are driven by their desires. Machines, up until now, do not fear death, and they are emotionless. Does being emotionless mean they are not intelligent? That is a question of religion, not of science.

Now that we have finished talking about machine learning, and since we are running out of time, let us quickly discuss making investment decisions in the AI industry.

5. The Opportunities and Challenges in AI Investing

  • Decisions about investing in the AI ecosystem include the Foundation, the Technology, and Application levels.
  • Large corporations control the Foundation level, and it is doubtful whether there could be new companies emerging at the Technology level.
  • At the Application level, look for industries that produce a significant amount of data.

Experience tells us that in each technology trend, some platform companies will emerge. When we talk about AI investment and ask where to invest, it is just like the earlier discussions about investing in the PC and the Internet: we are talking about the industry ecosystem.

To decide how to invest in AI, we know the industry ecosystem includes the Foundation, Technology, and Application levels. Industry giants already control the Foundation level, which has two major blocks: foundational computing power and data. Foundational computing power is already provided as cloud services by Google, Microsoft, and, in China, Baidu. At the Technology level, can you become a platform without data? Perhaps some companies could emerge around specific SaaS applications, but a generalized platform? That is a big question mark.

At the Application level, AI is a matter of “AI+,” a tool for increasing productivity. AI will make many existing applications more useful. Of course, you will want to look for areas where breakthroughs come easily: industries with a lot of money and a lot of data are where AI can become effective sooner. Therefore, we need to determine whether an industry produces a significant amount of data, and whether that data keeps expanding and generating value. That is what allows us to disrupt the original ecosystem.

Suppose the last trend was the Internet and the new trend is AI; we need to be aware of the difference between them. In simple terms, as AI progresses over the next few years, we will notice it becoming “Intelligence+.” AI is technology-driven and vertical-driven, and the technology itself advances very quickly, whereas the Internet was about innovation in business models and building completely new applications. Therefore, the “winner takes all” theory may not hold in the AI industry.

With this observation, let’s look at AI investment today. There are many bubbles in AI, the biggest being company valuations. When you negotiate with companies, every one of them will say it is an AI company. To truly determine whether a company is an AI company, the critical question is whether it owns data. The core competency is to have data and to be able to acquire new data consistently.

The company Toutiao established itself and emerged suddenly in the last five years, driven by the demand for information gathering. Toutiao took the high ground quickly as AI adoption took off. It was among the first companies to recommend news to users rather than have them search for it. The system itself is a large learning network, and it continues to evolve today to improve its recommendation effectiveness, using this as the basis for breakthroughs in its core competency. We can predict that Toutiao will become a super-intelligent system, and we can envision how big its scale of data will be: it has expanded beyond its initial text processing to images, message boards, and video streaming. It owns an ever-expanding set of data.

Finally, for investing in AI, remember there are three areas of investment. The first is “Intelligence+,” a capability all companies should have; this ability is the core competitive edge. The second is the AI ecosystem, including new development tools, consulting, and AIaaS (AI as a Service). The third is data and talent.

Talent and data are the key here. Investing in algorithms is in essence investing in people. The training process for deep learning I mentioned earlier requires people who understand the algorithms and the training methodologies; talented people who understand algorithms and their application are precious. Furthermore, data is the final competitive moat of an AI company.

Finally, I want to share a piece of information with everyone. There is a great deal of Chinese talent in AI. An AI report by Goldman Sachs noted that over the last five years, there have been more Chinese authors than U.S. authors of machine learning-related papers, and the number continues to grow. Papers by Chinese authors are also cited more than those by U.S. authors, and that number continues to grow as well. At least in this area, Chinese talent is not scarce, and where there is a large pool of talent, there will naturally be many top talents. China is not inferior in its talent pool.

China also has a significant amount of data. Therefore I say that both talent and data are available in China. AI is surely a new hope for Chinese innovation and investment. Thank you all.