The Significance of B5G Research and Development

The Remote Shared Three-dimensional Space Technology Aiming to Realize a Remote Society

Jun Rekimoto, Ph.D. (Science)
Interfaculty Initiative in Information Studies,
Graduate School of Interdisciplinary Information Studies
The University of Tokyo

Research and development is laying the technological foundation for a remote society by working toward remote shared three-dimensional spaces, which allow remote spaces to be measured, recorded, transmitted, reconstructed, and shared in real time. What does the near future hold for this technology, which has potential applications in a wide range of fields? We put this question to Professor Jun Rekimoto, who is leading the research and development.

Realizing remote shared three-dimensional spaces by introducing the concept of image segmentation

Please give us an overview of the “Research and Development of Human Augmentation and Space Creation-based Remote Collaboration Support Infrastructure” project that you have been leading.

Professor Jun Rekimoto

Rekimoto: This project aims to create an environment that transcends the barriers of time and space, in which people and avatar robots working remotely share the same space and collaborate while keeping a comprehensive view of the remote location. It seamlessly and scalably integrates static spatial information, measured in three dimensions in advance at the remote site, with spatial information and human behavior information acquired dynamically via sensors. This is combined with zero-latency spatial sharing technology, which pairs low-latency networks with deep-learning-based prediction of physical behavior. Together, these enable a spatial work support user interface that lets remote workers switch freely between a first-person view and a flexible bird's-eye view of the three-dimensional space.

In other words, this research and development aims to establish technology that creates three-dimensional spaces users can experience as if they were physically present, even from a remote location, and from any viewpoint: first-person, third-person, or even from behind. Specifically, we have conducted demonstration experiments in which we closely observed people playing the piano at remote locations, and others in which we taught students the movements of the tea ceremony from several different viewpoints.
The first stage involves, for example, recording piano performances in three dimensions and making them available offline. This stage allows users to view scenes in three dimensions as if they were actually there, even when they are in a remote location. The second stage involves connecting remote locations in real time, enabling mutual communication and collaborative work between people in different locations. At this stage, it is also possible to talk to the other person. Currently, the first stage has been realized, and we are working on demonstrating the second stage.
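
As a rough illustration of the latency-hiding idea in the project overview, the sketch below extrapolates a tracked point forward by the expected network delay, so that the receiver can display where a body part is likely to be rather than where it was. The project is described as using deep learning for this prediction; the constant-velocity extrapolation here is a much simpler stand-in, and the function name `predict_position` and its parameters are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]  # (x, y, z) position in metres


def predict_position(prev: Vec3, curr: Vec3, dt_s: float, latency_s: float) -> Vec3:
    """Extrapolate a tracked point `latency_s` seconds ahead at constant velocity.

    `prev` and `curr` are two consecutive samples taken `dt_s` seconds apart.
    This is a simple stand-in for a learned motion predictor.
    """
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    # Velocity estimated from the two most recent samples.
    vx = (curr[0] - prev[0]) / dt_s
    vy = (curr[1] - prev[1]) / dt_s
    vz = (curr[2] - prev[2]) / dt_s
    # Position the receiver should draw once the network delay has elapsed.
    return (curr[0] + vx * latency_s,
            curr[1] + vy * latency_s,
            curr[2] + vz * latency_s)


if __name__ == "__main__":
    # Hypothetical samples of a hand moving along the x axis.
    previous_sample = (0.0, 1.0, 0.0)
    current_sample = (0.5, 1.0, 0.0)
    print(predict_position(previous_sample, current_sample, dt_s=0.5, latency_s=0.25))
```

With samples 0.5 s apart and a 0.25 s delay, a hand moving at 1 m/s along x is drawn 0.25 m ahead of its last measured position.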

What technological advances were made to successfully carry out these demonstration experiments?

Demonstration experiment: Evaluating the effectiveness of three-dimensional representation
Reconstructing piano playing posture in three dimensions.

Rekimoto: Traditionally, transmitting all the information in massive three-dimensional images in real time was not feasible because of limits on communication volume. We therefore transmitted the image information for static elements, such as background buildings, in advance, so that in real time we only had to transmit the image information for moving workers, isolated together with their immediate surroundings. This keeps the necessary information within realistic communication volume limits. Introducing the concept of image segmentation made it possible to transmit video that could not be transmitted before.
Since backgrounds are static, they can be measured in three dimensions at high definition and reconstructed remotely regardless of communication volume. Moving objects such as workers, on the other hand, are captured at lower definition than backgrounds, but their information is transmitted in real time. I believe we were able to carry out these demonstration experiments successfully because we struck a good balance in distinguishing between areas that require real-time transmission and those that do not.
We are currently conducting research and development aimed at higher precision for moving elements. We are building on our research into three-dimensional image reconstruction, for example reconstruction with depth sensors that measure distance while shooting, a technique also used in autonomous driving, and we are in the process of improving it. Recently, AI-based image reconstruction has been advancing even in consumer cameras, and I believe this technology is likely to progress rapidly.
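
The split between a pre-transmitted static background and moving regions sent in real time can be sketched as follows. This is a minimal illustration assuming depth frames stored as nested lists and a simple per-pixel difference threshold; the interview does not describe the actual segmentation method, and the names `dynamic_pixels`, `reconstruct`, and `threshold_m` are hypothetical.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]          # (row, column)
DepthFrame = List[List[float]]   # depth in metres for each pixel


def dynamic_pixels(background: DepthFrame, current: DepthFrame,
                   threshold_m: float = 0.05) -> Dict[Pixel, float]:
    """Sender side: keep only pixels whose depth differs from the stored background."""
    changed: Dict[Pixel, float] = {}
    for r, (bg_row, cur_row) in enumerate(zip(background, current)):
        for c, (bg, cur) in enumerate(zip(bg_row, cur_row)):
            if abs(bg - cur) > threshold_m:
                changed[(r, c)] = cur
    return changed


def reconstruct(background: DepthFrame, update: Dict[Pixel, float]) -> DepthFrame:
    """Receiver side: overlay the transmitted pixels on the pre-transmitted background."""
    frame = [row[:] for row in background]
    for (r, c), depth in update.items():
        frame[r][c] = depth
    return frame


if __name__ == "__main__":
    # Hypothetical 4x6 scene: a wall 3 m away, with a worker 1.2 m away in a 2x2 patch.
    background = [[3.0] * 6 for _ in range(4)]
    current = [row[:] for row in background]
    for r in (1, 2):
        for c in (2, 3):
            current[r][c] = 1.2

    update = dynamic_pixels(background, current)
    total = len(background) * len(background[0])
    print(f"sent {len(update)} of {total} pixels")
    assert reconstruct(background, update) == current
```

In this toy scene only 4 of the 24 pixels need to be sent per frame, while the receiver still rebuilds the full frame from the stored background.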

What role does AI play?

Rekimoto: Depth cameras are a measurement technology, not an AI technology. However, AI can perform tasks such as reconstructing three-dimensional spaces from multiple camera images through machine learning. I believe this will make it possible, for example, to reconstruct high-definition images from a small number of cameras.

The realization of remote shared three-dimensional spaces is an effective application of Beyond 5G (B5G)

How did B5G contribute to these demonstration experiments?

Rekimoto: I consider this to be an effective application of B5G, given the increased transmission bandwidth and reduced latency. The realization of remote shared three-dimensional spaces involves the reconstruction of images at remote locations, the transmission of information, and the reconstruction of the images at the receiving end. I believe that B5G will improve efficiency in the transmission stage, resulting in latency that is acceptable for real-world applications.
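
To make the role of the transmission stage concrete, the sketch below adds up the three stages named in this answer. All numbers are hypothetical placeholders, not measurements from the project or figures from any B5G specification, and the function names are illustrative only.

```python
def transmission_time_s(payload_bytes: int, bandwidth_bps: float,
                        propagation_s: float) -> float:
    """Time to send one frame: serialization delay plus propagation delay."""
    return payload_bytes * 8 / bandwidth_bps + propagation_s


def end_to_end_latency_s(sender_s: float, transmit_s: float,
                         receiver_s: float) -> float:
    """Sum of sender-side processing, transmission, and receiver-side reconstruction."""
    return sender_s + transmit_s + receiver_s


if __name__ == "__main__":
    frame_bytes = 1_000_000   # hypothetical size of one frame of moving-region data
    sender_s = 0.010          # hypothetical sender-side processing time
    receiver_s = 0.010        # hypothetical receiver-side reconstruction time
    propagation_s = 0.002     # hypothetical one-way propagation delay

    # Two hypothetical link rates, chosen only to show the effect of bandwidth.
    for label, bandwidth_bps in (("100 Mbps", 100e6), ("10 Gbps", 10e9)):
        transmit_s = transmission_time_s(frame_bytes, bandwidth_bps, propagation_s)
        total_s = end_to_end_latency_s(sender_s, transmit_s, receiver_s)
        print(f"{label}: transmit {transmit_s * 1000:.1f} ms, total {total_s * 1000:.1f} ms")
```

Under these assumed values, the serialization delay for a 1 MB frame drops from 80 ms to 0.8 ms when the link rate rises from 100 Mbps to 10 Gbps, which is why the transmission stage dominates or vanishes from the total depending on bandwidth.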

What will become possible with remote shared three-dimensional space technology?

Rekimoto: This technology is extremely important for building a remote society. In addition to office work using two-dimensional screen sharing in remote meetings, it can also support work that requires movement in real work site spaces, such as farmland and disaster sites.
For example, it can support, on a wide scale, three-dimensional activities such as learning to play the piano remotely, as mentioned earlier. The integration of human augmentation technology and B5G will enable the transfer of skills by connecting experts in remote locations with learners, which can be utilized in various fields as a backbone of industry.
One specific application is remote medical care. The ability to learn expert techniques from a remote location as if you were right there in person is highly valuable from the perspective of education and skill transfer. Additionally, it may be possible to experience the sensation of being on-site, as with remote tourism. In this case, by reconstructing pre-recorded images of the location in three dimensions and transmitting moving images in real time, you can experience the sensation of traveling with a partner. This could also be useful for supporting people who are unable to travel to remote locations due to physical disabilities or other limitations.

The tea ceremony instruction used in the demonstration experiment could also be useful as teaching material for learning other skills. In this case, the instructor's movements would be measured in three dimensions and users could learn by viewing them in AR. Currently, there is already video-based learning content available, but it is limited to observing the instructor's movements from a fixed view. With remote shared three-dimensional space, it is possible to observe the instructor's movements from a third-person view, as well as from a first-person view as if you were the instructor, or from behind the instructor.
Furthermore, by introducing AI technology, it may become possible to ask teachers questions and receive guidance from them while learning content. In this scenario, AI that understands tea ceremony etiquette is added to the space. With AI incorporated into the information transmission path, humans and AI can share what is happening in real time. AI could accurately answer questions such as, “Why are you holding this tool now?” and “How do my movements differ from those of the teacher?” I believe that such a world is not so far off and is likely to be realized within a few years.
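
The switch between first-person, third-person, and behind views mentioned in this answer comes down to placing a virtual camera relative to the measured pose of the instructor. The following is a minimal sketch, assuming a z-up coordinate system, a known head position and facing angle, and a third-person camera placed to the instructor's right. The interview does not specify these details, and `camera_position` and its parameters are hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]  # (x, y, z) in metres, z pointing up


def camera_position(head: Vec3, yaw_rad: float, view: str,
                    distance_m: float = 1.5) -> Vec3:
    """Place a virtual camera relative to the instructor's head.

    `yaw_rad` is the direction the instructor faces, measured from the
    +x axis toward the +y axis. The camera stays at head height.
    """
    fx, fy = math.cos(yaw_rad), math.sin(yaw_rad)  # forward unit vector
    x, y, z = head
    if view == "first_person":
        # See the scene from the instructor's own eyes.
        return head
    if view == "behind":
        # Step back along the facing direction.
        return (x - fx * distance_m, y - fy * distance_m, z)
    if view == "third_person":
        # Assumed choice: stand to the instructor's right-hand side.
        return (x + fy * distance_m, y - fx * distance_m, z)
    raise ValueError(f"unknown view: {view}")


if __name__ == "__main__":
    instructor_head = (0.0, 0.0, 1.6)  # hypothetical measured head position
    for view in ("first_person", "third_person", "behind"):
        print(view, camera_position(instructor_head, yaw_rad=0.0, view=view))
```

Because the recorded scene is three-dimensional, changing the view only changes this camera placement; the measured movements themselves do not need to be recorded again.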

Observing the tea ceremony from first-person, third-person, and behind views (demonstration experiment at Sony Computer Science Laboratories - Kyoto).

B5G is an important foundation for building a remote society

How do you plan to advance this research going forward?

Rekimoto: I hope that we can realize a remote society with infrastructure that allows workers to perform their duties without distance limitations. In reality, there are still many tasks that cannot be performed remotely, but as the number of tasks that can be performed remotely increases, the freedom to choose where to work will dramatically improve. This will also help support the elderly and areas with declining populations. In addition, it will reduce the need to travel to places where transportation costs are high, such as disaster sites, which I believe will greatly increase its value as social infrastructure.

What are your expectations of B5G?

Rekimoto: I believe that B5G is an important foundation for building a remote society. Not only will it be possible to hold remote meetings as we do now, but it will also enable three-dimensional collaboration in which real spaces can be shared, freeing people from constraints on where they have to be. I hope that this will enable people to work collaboratively even with those on the other side of the world.