Saturday, April 1, 2017

VR/360 Streaming Standardization-Related Activities

Universal media access (UMA), as proposed in the late 1990s and early 2000s, is now a reality. It is very easy to generate, distribute, share, and consume any media content, anywhere, anytime, and with/on any device. These kinds of real-time entertainment services — specifically, streaming audio and video — are typically deployed over the open, unmanaged Internet and now account for more than 70% of the evening traffic in North American fixed access networks. It is expected that this number will reach 80% by the end of 2020. A major technical breakthrough and enabler was certainly adaptive streaming over HTTP, which resulted in the standardization of MPEG-DASH.
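To make this concrete, here is a minimal and deliberately simplified MPD sketch with two video representations of different bitrates from which a client can adaptively select; the URLs, bitrates, and codec string are made up for illustration and omit details (e.g., segment addressing) a real manifest would need.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a DASH MPD: one adaptation set, two video representations.
     The client measures its throughput and switches between them. -->
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"
     type="static"
     profiles="urn:mpeg:dash:profile:isoff-on-demand:2011"
     mediaPresentationDuration="PT60S"
     minBufferTime="PT2S">
  <Period>
    <AdaptationSet mimeType="video/mp4" codecs="avc1.42E01E" segmentAlignment="true">
      <Representation id="720p" bandwidth="2000000" width="1280" height="720">
        <BaseURL>video_720p.mp4</BaseURL>
      </Representation>
      <Representation id="1080p" bandwidth="5000000" width="1920" height="1080">
        <BaseURL>video_1080p.mp4</BaseURL>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
```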

One of the next big things in adaptive media streaming is most likely related to virtual reality (VR) applications and, specifically, omnidirectional (360-degree) media streaming, which is currently built on top of the existing adaptive streaming ecosystems. The major interfaces of such an ecosystem were described in a Bitmovin blog post some time ago (note: this work has since evolved into Immersive Media, referred to as MPEG-I).



Omnidirectional video (ODV) content allows users to change their viewing direction while consuming the video, resulting in a more immersive experience than traditional video content with a fixed viewing direction. Such content can be consumed on devices ranging from smartphones and desktop computers to dedicated head-mounted displays (HMDs) like the Oculus Rift, Samsung Gear VR, HTC Vive, etc. When using an HMD, the viewing direction is changed by head movements. On smartphones and tablets, it can be changed by touch interaction or by moving the device around, thanks to built-in sensors. On a desktop computer, the mouse or keyboard is used to interact with the omnidirectional video.
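As a rough sketch of how the sensor-based interaction could look on the web, the following snippet maps the browser's deviceorientation events to a yaw/pitch viewing direction; the renderer object and its setViewingDirection method are assumptions standing in for an actual 360-degree player.

```typescript
// Sketch: update the viewing direction of a 360-degree player from the
// device's built-in orientation sensors. `renderer` is an assumed object
// exposing setViewingDirection(yawDeg, pitchDeg); real players differ.
interface ViewportRenderer {
  setViewingDirection(yawDeg: number, pitchDeg: number): void;
}

declare const renderer: ViewportRenderer;

window.addEventListener("deviceorientation", (e: DeviceOrientationEvent) => {
  const yaw = e.alpha ?? 0;   // rotation around the z-axis (compass-like)
  const pitch = e.beta ?? 0;  // front-to-back tilt of the device
  renderer.setViewingDirection(yaw, pitch);
});
```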

The streaming of ODV content is currently deployed in a naive way by simply streaming the entire 360-degree scene/view in constant quality, without optimizing the quality for the user's viewport.
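For comparison, a viewport-adaptive approach would request high quality only for the region the user actually looks at and lower quality elsewhere. A minimal selection heuristic could look as follows; the tile grid, viewport model, and bitrate values are illustrative assumptions, not part of any standard.

```typescript
// Sketch: pick a bitrate per tile depending on whether it intersects the
// user's current viewport. Horizontal wrap-around at the 360-degree seam
// is ignored for brevity.
interface Tile { x: number; y: number; w: number; h: number; }
interface Viewport { x: number; y: number; w: number; h: number; }

function intersects(t: Tile, v: Viewport): boolean {
  return t.x < v.x + v.w && v.x < t.x + t.w &&
         t.y < v.y + v.h && v.y < t.y + t.h;
}

// Returns the bitrate (bps) to request for each tile.
function selectBitrates(tiles: Tile[], viewport: Viewport): number[] {
  const HIGH = 1_500_000; // tiles inside the viewport
  const LOW = 300_000;    // tiles outside the viewport
  return tiles.map(t => (intersects(t, viewport) ? HIGH : LOW));
}
```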

There are several standardization-related activities ongoing which I'd like to highlight in this blog post.
The VR Industry Forum (VR-IF) has been established with the aim "to further the widespread availability of high quality audiovisual VR experiences, for the benefit of consumers", comprising working groups on requirements, guidelines, interoperability, communications, and liaison. The VR-IF has only just started, but it may find itself in a similar role for VR as the DASH-IF plays for DASH.

QUALINET is a European network concerned with Quality of Experience (QoE) in multimedia systems and services. In terms of VR/360, it runs a task force on "Immersive Media Experiences (IMEx)" to which everyone is invited to contribute. QUALINET also coordinates standardization activities in this area and can help organize and conduct formal QoE assessments in various domains; for example, it conducted various experiments during the development of MPEG-H High Efficiency Video Coding (HEVC).

JPEG started an initiative called Pleno focusing on images. Additionally, the JPEG XS requirements document references VR applications, and JPEG recently created an AhG on JPEG360 with the mandates to collect and define use cases for 360-degree image capture applications, develop requirements for such use cases, solicit industry engagement, collect evidence of existing solutions, and update the description of needed metadata.

In terms of MPEG, I've previously reported about MPEG-I as part of my MPEG report (also see above), which currently includes five parts. The first part will be a technical report describing the scope of this new standard and a set of use cases and applications from which actual requirements can be derived; technical reports are usually publicly available for free. The second part specifies the omnidirectional media application format (OMAF), addressing the industry's urgent need for a standard in this area. Part three will address immersive video and part four immersive audio. Finally, part five will contain a specification for point cloud compression, for which a call for proposals is currently available. OMAF is part of a first phase of standards related to immersive media and should become available by the end of 2017 or the beginning of 2018, while the other parts are scheduled for a later stage, around 2020. The current OMAF committee draft comprises a specification of i) the equirectangular projection format (note that others might be added in the future), ii) metadata for the interoperable rendering of 360-degree monoscopic and stereoscopic audio-visual data, iii) a storage format adopting the ISO base media file format (ISOBMFF/mp4), and iv) the following codecs: MPEG-H High Efficiency Video Coding (HEVC) and MPEG-H 3D Audio.
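To give an intuition of what the equirectangular projection does, the following sketch maps a direction on the sphere (longitude/latitude) to pixel coordinates in the projected frame; this is my own illustration of the general idea, not the normative mapping from the OMAF draft.

```typescript
// Sketch: map a direction on the sphere to equirectangular pixel
// coordinates. Longitude/latitude in radians; (0, 0) is the top-left
// corner of the projected frame. Illustration only.
function sphereToEquirect(
  lonRad: number,   // longitude in [-PI, PI], 0 = center of the frame
  latRad: number,   // latitude in [-PI/2, PI/2], positive = up
  width: number,    // projected frame width in pixels
  height: number    // projected frame height in pixels
): { x: number; y: number } {
  const x = (lonRad / (2 * Math.PI) + 0.5) * width;  // linear in longitude
  const y = (0.5 - latRad / Math.PI) * height;       // linear in latitude
  return { x, y };
}
```

The mapping is linear in both angles, which is what makes equirectangular frames easy to encode with existing codecs but wasteful near the poles, where many pixels cover a small solid angle.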

The Spatial Relationship Descriptor (SRD) of the MPEG-DASH standard provides the means to describe how media content is organized in the spatial domain. In particular, the SRD is fully integrated into the media presentation description (MPD) of MPEG-DASH and is used to describe a grid of rectangular tiles, which allows a client implementation to request only a given region of interest — typically associated with a contiguous set of tiles. Interestingly, the SRD was developed before OMAF, and how SRD is used with OMAF is currently subject to standardization.
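For illustration, an SRD-annotated adaptation set describing a single tile could look like the following sketch; the value string encodes source_id, object_x, object_y, object_width, object_height, total_width, and total_height, and all concrete values here are made up.

```xml
<!-- Sketch: one 960x960 tile at the top-left of a 3840x1920
     equirectangular video, described via SRD, with a high- and a
     low-bitrate representation to choose from. -->
<AdaptationSet mimeType="video/mp4">
  <SupplementalProperty schemeIdUri="urn:mpeg:dash:srd:2014"
                        value="0,0,0,960,960,3840,1920"/>
  <Representation id="tile0-high" bandwidth="1500000" width="960" height="960">
    <BaseURL>tile0_high.mp4</BaseURL>
  </Representation>
  <Representation id="tile0-low" bandwidth="300000" width="960" height="960">
    <BaseURL>tile0_low.mp4</BaseURL>
  </Representation>
</AdaptationSet>
```

A client could then request tile0_high.mp4 only while this tile overlaps the viewport and fall back to tile0_low.mp4 otherwise, along the lines of the selection heuristic sketched earlier.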

MPEG established an AhG on Immersive Media Quality Evaluation with the goal of documenting requirements for VR QoE, collecting test material, studying existing methods for QoE assessment, and developing a test methodology -- very ambitious.

3GPP is working on a technical report on Virtual Reality (VR) media services over 3GPP which provides an introduction to VR, various use cases, media formats, interface aspects, and -- finally -- latency and synchronization aspects.

IEEE has started IEEE P2048, specifically "P2048.2 Standard for Virtual Reality and Augmented Reality: Immersive Video Taxonomy and Quality Metrics" -- to define different categories and levels of immersive video -- and "P2048.3 Standard for Virtual Reality and Augmented Reality: Immersive Video File and Stream Formats" -- to define formats of immersive video files and streams, and the functions and interactions enabled by these formats -- but not much material is available right now. However, P2048.2 seems to be related to QUALINET's work, and P2048.3 could definitely benefit from what MPEG has done and is still doing (including, e.g., MPEG-V). Additionally, there's IEEE P3333.3, defining a standard for HMD-based 3D content motion sickness reduction technology. It aims to resolve the VR sickness caused by HMD-based 3D content through the study of the i) visual response to focal distortion, ii) visual response to lens materials, iii) visual response to the lens refraction ratio, and iv) visual response to the frame rate.

The ITU-T started a new work program referred to as "G.QoE-VR" after successfully finalizing P.NATS, which is now called P.1203. However, no details about "G.QoE-VR" are publicly available yet. In this context, it's worth mentioning the Video Quality Experts Group (VQEG), which has an Immersive Media Group (IMG) with the mission of "quality assessment of immersive media, including virtual reality, augmented reality, stereoscopic 3DTV, multiview".

Finally, the Khronos Group announced a VR standards initiative which resulted in OpenXR (Cross-Platform, Portable, Virtual Reality), defining an API for VR and AR applications. It, too, could benefit from MPEG standards in terms of codecs, file formats, and streaming formats. In this context, WebVR already defines an API which provides support for accessing virtual reality devices, including sensors and head-mounted displays, on the web.
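To give a flavor of WebVR, the following sketch detects a VR display and starts presenting a canvas to it; the inline typings are mine, since the still-draft WebVR API is not part of the standard DOM typings, and error handling as well as the actual render loop are omitted.

```typescript
// Sketch: detect a VR display via the (draft) WebVR API and start
// presenting a canvas to it. Minimal inline typings; real applications
// would also drive a per-frame render loop on the display.
interface VRDisplay {
  displayName: string;
  requestPresent(layers: { source: HTMLCanvasElement }[]): Promise<void>;
}

const nav = navigator as Navigator & {
  getVRDisplays?: () => Promise<VRDisplay[]>;
};

async function enterVR(canvas: HTMLCanvasElement): Promise<void> {
  if (!nav.getVRDisplays) {
    console.log("WebVR is not supported in this browser");
    return;
  }
  const displays = await nav.getVRDisplays();
  if (displays.length === 0) return;
  console.log("Presenting to:", displays[0].displayName);
  await displays[0].requestPresent([{ source: canvas }]);
}
```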

DVB started a CM Study Mission Group on Virtual Reality which released an executive summary comprising mission statements of individuals/companies. The topic has also been discussed at DVB World. The goal of DVB CM-VR is to investigate the commercial case in the context of the DVB project, which may eventually lead to technical specifications.

Most of these standardization activities are currently in their infancy but definitely worth following. If you think I missed something, please let me know and I'm happy to include it and update this blog post.