Tencent's Ma Xiaoyi: How far are we from the Metaverse?

Tencent Culture
2022-04-30 09:27:57
What is the current state of metaverse technology? What devices and content will the metaverse need to support it? Why might a qualitative change arrive in 2030? Perhaps the answers can be found in Ma Xiaoyi's talk.

Speech: Ma Xiaoyi, Senior Vice President of Tencent

Organizer: Game Grape

Source: Tencent Culture

On April 24, Ma Xiaoyi, Senior Vice President of Tencent, shared his views on the metaverse, VR/AR, and related topics at a Fudan University School of Management alumni meeting.

In the wave of enthusiasm for the metaverse, Ma's view of what will change over the next 2-3 years is conservative. Although many specialized technologies have seen breakthroughs, the general-purpose device the metaverse era needs (analogous to the home PC that powered the computer revolution, or the smartphone that propelled the mobile internet) has not yet emerged. Both technologically and commercially, its invention and popularization remain some distance away.

However, he has strong confidence in the longer-term development prospects over the next 10 years. On one hand, the concept of the metaverse has gained more technological support in the past two years and is addressing its bottleneck issues from multiple angles; on the other hand, society's understanding of it is also breaking through, evolving from a concept limited to games and software to one that spans multiple scenarios. He believes that the turning point for the metaverse will be in 2030.


The metaverse has recently become very popular, but in hindsight, it is not a new topic.

In 1997, a game called "Ultima Online" introduced the concept of a continuously running world. Under this concept, even if no one is online to interact, the game's world continues to exist and develop. This idea is completely consistent with the current concept of the metaverse.

Around the years 1995-1998, various internet companies began to emerge, considering how to bring real-life scenarios and relationships online.

The term "Metaverse" comes from a novel published in 1992. The most recent mention in the business world was in a 2018 article by an American analyst titled: "What is Tencent's Dream?"

The article analyzed why Tencent had invested in Epic Games, Roblox, Reddit, Snapchat, and a range of other content and platform companies. Piecing this investment map together, the author concluded that "Metaverse" was the term that fit best.

With the IPO of Roblox and Facebook's rebranding to Meta, the metaverse has become an increasingly hot concept. Why has this early concept suddenly gained popularity today? I believe it is due to the many changes that have occurred in recent years compared to the past.

The first is that overall computing power has reached a critical point. In 2016, there was a wave of VR enthusiasm, and at that time, I was relatively conservative and somewhat pessimistic. After discussing with many people in the industry, we felt that to truly support a good experience on VR devices and achieve large-scale applications, the necessary technology was far from mature.

For example, the mobile computing chips at that time could not support high-resolution displays, and combined with the limitations of batteries and display devices themselves, the experience was poor.

Secondly, some key new technologies were not yet in place. I have been closely monitoring many upstream key technologies and even the progress of some component manufacturing. We have already seen some roadmaps becoming clearer, and in the coming years, these key technologies will have breakthrough developments.

Thirdly, I believe it is important that the penetration rate of the internet has also reached a critical point. Especially after experiencing the global pandemic over the past two years, it has become mainstream for people to work from home—if you are a tech company and do not offer employees the option to work from home, you lose your appeal. People have become accustomed to accomplishing a large part of their lives online.

Finally, from the user's perspective, looking at demand. As a species, humans' greatest characteristic is "collaboration." As society becomes increasingly complex, the technologies we need also become more complex. When society sees that some technologies have the potential to meet these needs, it will drive significant breakthroughs.

Based on the above points, Second Life's attempt in the mid-2000s was clearly too early, and the first VR wave of 2015-2016 was also too early.

Today, however, may be a suitable time.

So, taking advantage of this gathering, let me share some visions for the future of the Metaverse. Honestly, my personal views differ from much of the outside media coverage; for example, I think it is still too early to discuss the Metaverse in fine detail.

After all, what the metaverse should look like and what components it should consist of… is still a challenging task to define clearly and precisely.

However, we can now view the metaverse as a potential integrated approach to future new technologies and applications, a collection of imaginations about future human-computer interaction and interpersonal interaction.

I will mainly discuss my imagination for the future of the metaverse from the perspective of immersive experience and content development.

Immersion: The Metaverse May Begin to Become Popular After 2030

First, regarding immersion, I think there are mainly two directions:

  1. Human-computer interfaces & interaction methods;
  2. Experience issues with devices like VR.

When we mention the Metaverse, people inevitably associate it with VR. This is due to two reasons: on one hand, Facebook made a significant bet on VR recently; on the other hand, if we want to achieve the concept of the Metaverse, then new devices like VR are indispensable.

Why? Ordinary people input information to computers using keyboards, mice, touchscreens, etc. From a technical perspective, their bandwidth is actually very narrow, and the information that can be input is quite limited. The computer can only respond based on this input, and more information cannot be directly input.

However, as the input bandwidth increases, we can actually add more dimensions.

For example, we can add more cameras and sensors to VR devices to feed environmental information into the computer. We can track the user's body state; the latest technology achieves six-degrees-of-freedom (6DoF) tracking, covering position along three axes (forward/back, left/right, up/down) plus rotation around them, including head position and gaze direction, which makes interaction with the surrounding environment much more natural.
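Six-degrees-of-freedom head tracking combines three axes of translation with three axes of rotation. As a rough illustration (the type and field names here are hypothetical, not any vendor's API), a single tracked pose can be represented like this:

```python
from dataclasses import dataclass

@dataclass
class Pose6DoF:
    """One tracked head pose: 3 translation + 3 rotation components."""
    # Translation in tracking space (metres)
    x: float
    y: float
    z: float
    # Rotation (degrees): pitch = nod up/down, yaw = turn left/right,
    # roll = tilt side to side
    pitch: float
    yaw: float
    roll: float

# A standing user, eyes about 1.7 m up, turned 90 degrees to the left
head = Pose6DoF(x=0.0, y=1.7, z=0.0, pitch=0.0, yaw=90.0, roll=0.0)
```

Real headsets refine a pose like this hundreds of times per second by fusing camera and inertial-sensor data.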

Among these, the operation of the hands is particularly noteworthy. The latest Oculus Quest 2 still requires controllers for input, but there is a consensus in the industry that more natural interaction involves using hands.

This shift already played out in smartphones: early phones were operated with styluses, until Apple argued that this ran against human intuition, which is to touch the screen directly with your fingers. The same principle applies to VR devices, so much current progress focuses on capturing finger movements.

Additionally, some of the latest technologies focus on facial capture to track expressions. As everyone knows, other people's reactions are important feedback in communication.

For example, my experience of online meetings is not great: I cannot see everyone's expressions, so I do not know whether I am speaking too fast or too slow. Offline, I can see the audience's reactions immediately. In the future, then, we will also need facial capture to track expressions in real time and project them into virtual scenes…

These new input methods, when integrated, greatly increase the information bandwidth and can significantly change how information is input into computers.

There are also some very exciting technological developments.

For example, consider the display units in VR headsets. If you have tried an Oculus Quest 2, you know it uses Fast-LCD technology, which suffers from the "screen-door effect": with the panel so close to your eyes, you can see the individual pixel grid, as if looking through a screen door. This is similar to the low-resolution displays of early mobile phones.

The subsequent story is well known; one major reason for the iPhone's success was its introduction of the retina display concept in the iPhone 4, which greatly improved the user experience with mobile screens.

As VR has developed to today, new technologies are emerging to address these issues. For example, 4K resolution silicon-based OLED displays are already on production lines and can provide sufficiently high resolution for VR headsets.

How can we achieve a retina-display effect on a VR headset? There is now a standard metric: pixels per degree of visual angle. Generally speaking, at roughly 30 pixels per degree you can no longer perceive the boundaries of pixels; at 60 pixels per degree, you cannot see individual pixels at all. By that measure, 4K silicon-based OLED displays are already enough to provide a very good experience.
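The pixels-per-degree figure is simple arithmetic: divide the panel's horizontal pixel count by the headset's horizontal field of view. A minimal sketch, where the 3840-pixel panel width and 100-degree field of view are assumed illustrative values rather than the spec of any particular device:

```python
def pixels_per_degree(horizontal_pixels: int, fov_degrees: float) -> float:
    """Average angular pixel density across the field of view."""
    return horizontal_pixels / fov_degrees

# A 4K-class panel (3840 px wide) spread over a ~100-degree field of view
ppd = pixels_per_degree(3840, 100.0)
print(round(ppd, 1))  # 38.4: above the ~30 PPD point where pixel
                      # boundaries disappear, still short of the
                      # ~60 PPD "retina" level
```

The same arithmetic run backwards shows why higher resolutions matter: reaching 60 PPD across a 100-degree field of view would take a 6000-pixel-wide panel per eye.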

Brightness is another area where VR headsets are changing.

From today's silicon-based OLED displays to future Micro-LED displays, panels can deliver ever higher brightness. LCDs used to manage about 400-500 nits; we now see 2,000 nits, and on future roadmaps there are already research devices reaching 10,000 nits.

This addresses a significant bottleneck: improving the display unit to deepen user immersion. Some mature products have already reached the market.

Another important direction for the future is folded-optics ("pancake" lens) technology.

VR devices used to be thick, heavy, and bulky, and uncomfortable to wear. With folded optical paths, future headsets can become very thin, much closer to the feel of wearing ordinary glasses.

Another point of contention is the debate between VR and AR: in the VR view, all content comes from the display unit; in the AR view, content must overlap and blend with the real environment.

Currently, the industry's main approach is video passthrough: all content is still presented on displays, but extra cameras and sensors capture the user's surroundings and composite them onto the display. There is also an industry rumor that Apple will release such a device early next year, pairing folded optics with passthrough to deliver VR/AR effects.

In summary, the technology roadmap for human-computer interfaces is becoming clear. Apart from some battery limitations, nearly every other aspect has a solution in sight, and progress is steady.

This is also why we have confidence in the development of the metaverse, VR, and AR.

Another point is that to ensure a good technological experience, having hardware support within the devices is not enough; software development is also necessary.

It should be noted that the software here refers to technology, not content. We can summarize it into three dimensions:

  1. Trustworthy environments: The scenes of the Metaverse are becoming increasingly realistic, allowing users to believe that this is a real world.

  2. Trustworthy characters: Humans are deeply social; we need to interact, communicate, and collaborate with others. For that, the most important element is a trustworthy character. I believe everyone has had some experience of this.

  3. Trustworthy interactions: Take online meetings again: the format feels more like a lecture than a discussion. Offline, people can interject, debate, and spark ideas off one another; the current online format lacks these elements.

To create more realistic interactions, many convincing details must be involved. For example, if I ask you to hand me a bottle of water or shake hands, I can feel the temperature and weight, which traditional computers and tablets cannot achieve… However, there are already technologies making significant progress in this area.

In summary, with the development of technology, we need to provide many tools to companies working on the Metaverse, VR, and AR, while helping them create more trustworthy worlds and character relationships.

Content Volume: Who Will Provide Enough Content?

This is quite a large topic, because we still do not know what content the Metaverse must include. From today's vantage point, though, it can be divided into two dimensions: specialized content and general content.

I am somewhat pessimistic about this part: although the technology is progressing along its roadmap, it needs more time to mature, and the transition from specialized to general will take longer still.

When a technology has not yet reached a critical point, it often remains a relatively specialized application, such as in gaming or film; while general applications, like the smartphones we use, now have thousands of ways to utilize them.

Roughly speaking, when I communicate with the internal team, we generally set 2030 as a benchmark. Before that, the Metaverse will still be in a specialized phase, and only after that might there be opportunities to transition to a more general state, beginning to challenge existing computer and smartphone usage scenarios. Currently, it is more of a new, supplementary scenario positioning.

Here, it may be necessary to elaborate on the differences between specialized and general.

About a month and a half ago, I had a meeting with Epic Games CEO Tim Sweeney, and when discussing this issue, my summary was: our current games are just "games," and when you watch a video, you are only watching a "video," but the Metaverse is different; you should be "living" in the Metaverse. There is a fundamental difference in this regard.

For example, when you go to an online cinema, that is a specialized world, separate from reality. However, the Metaverse integrates and incorporates everyone's lives into this virtual world. Therefore, it requires enough content to make you willing to live within it.

This sufficient content can be divided into several routes:

  1. PGC (Professionally Generated Content): For example, companies like Tencent produce movies, games, and TV shows, which is a form of professional production. PGC remains an important part of the Metaverse.

  2. UGC (User Generated Content): This refers to content created by users.

  3. Blending Reality and Virtuality: As we just discussed many technologies, they aim to integrate and incorporate the real world into the virtual world.

These three parts are all crucial components of content.

So how can we achieve this?

First, there have been recent breakthroughs in large-scale PGC content. Those who follow new games may know that Epic Games recently showed a stunning demo, "The Matrix Awakens," released alongside "The Matrix Resurrections": a virtual city covering roughly 10-20 square kilometers, with some 30,000 residents and large volumes of traffic and buildings.

The key is that the vehicles within it have their own actions: they stop at traffic lights and yield to pedestrians, and each of the 30,000 citizens has different clothing, behavior, and appearance—previously, creating such a large-scale virtual world would have been nearly impossible.

However, today, with technological advancements, these previously impossible tasks are gradually being realized because, on one hand, technological capabilities are increasing, and on the other hand, there are now better methods to solve these challenges.

For example, we have two projects in collaboration with Epic Games. One project involves using AI to import the real world. Previously, we went to a valley in New Zealand, which is a 10-square-kilometer area. We took 7,000-8,000 photos there and imported them into an engine, which then reconstructed about 90% of the valley. After some manual adjustments, we were able to generate a very large world in a short time, and it looked very realistic.

The other project we are working on is a virtual human project, known domestically as Siren. Epic Games later also released MetaHuman, which produces highly realistic digital humans. This ties back to the earlier point: a trustworthy character is an essential component of the virtual world.

In the past, creating a virtual character required a significant amount of time.

You may have heard this story: in Hollywood, creating a virtual character for a 10-second online rendering could take several months, but with current technology, we can achieve 95% of the quality that Hollywood would take four months to produce in just 0.01 seconds.

All of this provides better methods for large-scale production. Including motion capture and other content, AI support can make content creation much easier.

Secondly, regarding the UGC part, my views differ somewhat from external perspectives.

For example, external discussions of UGC tend to center on decentralization. However, I have had many discussions with Roblox CEO David Baszucki (we were involved with Roblox early on, around 2015, and made several investments). When we talk about what makes UGC succeed, we talk about community and transparent principles, and especially about maintaining order and status, which requires strong centralized execution.

However, this does not mean I disagree with external opinions. I believe decentralization should refer to the decentralization of capabilities: providing tools, capabilities, and resources to users, allowing them to create their own content without needing a centralized platform. What supports this content is a continuous, principled, and long-term stable platform. I believe this is the most important point for creating UGC.

The third point, blending reality and virtuality, largely returns to the new technologies of the Metaverse, VR, and AR we discussed earlier. The goals we hope to achieve align with the previous efforts of the internet and human society. Humans have always sought to break physical limitations; for example, airplanes and cars were created to eliminate physical constraints, reduce physical losses, and improve the efficiency of large-scale collaboration.

Therefore, in the long run, the Metaverse will focus more on how to introduce more external, real-world services and content, breaking the boundaries of physics.

More Realistic Issues

Finally, I want to talk about the issues surrounding the Metaverse, VR, and AR.

Although the content I just discussed seems optimistic, I personally harbor some pessimism deep down.

When I chat with everyone, I often say: looking at the short term of 1-3 years, I am more pessimistic than most people; looking at the long term of 10 years, I am more optimistic than most people, believing that these new technological scenarios will bring significant changes to human society.

From my perspective, these technologies will be widely available between 2025 and 2027, but to fully roll them out, we still need to wait until 2030. This is the timeline I have in mind.

A simple example is price, which is an important reference. For instance, the aforementioned 4K silicon-based OLED display offers an excellent experience, but it is too expensive; the price of just one display unit is equivalent to the entire price of the Oculus Quest 2. Therefore, it will take a long time for prices to come down.

Secondly, the business model itself still has some issues. As I mentioned earlier, it is still too early.

If we were to correlate our current stage with past history, there are two key points:

1. The Popularization of General Devices.

Many young people may not know that the first household specialized device was actually a game console—the Atari 2600, released in 1977, which can be compared to the current situation of the Oculus Quest 2.

The well-known IBM PC entered the market in 1981, but home PCs did not become widespread until Windows arrived: version 1.0 in 1985, the widely promoted 3.0 in 1990, and the familiar Windows 95 in 1995.

We can see that this process took at least 6-10 years of development.

2. The Establishment of Business Models.

The internet actually appeared in the 90s, but well-known internet companies, including Tencent and Google, were established in 1998. Moreover, these companies truly found their supporting business models around 2004. This process also took many years.

Therefore, in the short term, the development of the entire Metaverse will require several more years of nurturing, but the direction of nurturing is clear, and the potential is also very evident.

In the past 20 years, the contributions of the entire internet to human society, economy, and technology are evident to all, so I believe that in the next 20-30 years, an integrated internet experience centered around the Metaverse will bring even greater impacts to society.

And within this, there will be many points worth discussing, collaborating, and promoting.
