Friday, February 10, 2017

MPEG news: a report from the 117th meeting, Geneva, Switzerland

The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects. Additionally, this version of the blog post will also be posted at ACM SIGMM Records.
MPEG News Archive
The 117th MPEG meeting was held in Geneva, Switzerland and its press release highlights the following aspects:
  • MPEG issues Committee Draft of the Omnidirectional Media Application Format (OMAF)
  • MPEG-H 3D Audio Verification Test Report
  • MPEG Workshop on 5-Year Roadmap Successfully Held in Geneva
  • Call for Proposals (CfP) for Point Cloud Compression (PCC)
  • Preliminary Call for Evidence on video compression with capability beyond HEVC
  • MPEG issues Committee Draft of the Media Orchestration (MORE) Standard
  • Technical Report on HDR/WCG Video Coding
In this blog post, I'd like to focus on the topics related to multimedia communication. Thus, let's start with OMAF.

Omnidirectional Media Application Format (OMAF)

Real-time entertainment services deployed over the open, unmanaged Internet (streaming audio and video) now account for more than 70% of the evening traffic in North American fixed access networks, and it is assumed that this figure will reach 80% by 2020. More and more such bandwidth-hungry applications and services are pushing onto the market, including immersive media services such as virtual reality and, specifically, 360-degree video. However, the lack of appropriate standards and, consequently, reduced interoperability is becoming an issue. Thus, MPEG has started a project referred to as Omnidirectional Media Application Format (OMAF). The first milestone of this standard has been reached and the committee draft (CD) has been approved at the 117th MPEG meeting. Such application formats "are essentially superformats that combine selected technology components from MPEG (and other) standards to provide greater application interoperability, which helps satisfy users' growing need for better-integrated multimedia solutions" [MPEG-A]. In the context of OMAF, the following aspects are defined:
  • Equirectangular projection format (note: others might be added in the future)
  • Metadata for interoperable rendering of 360-degree monoscopic and stereoscopic audio-visual data
  • Storage format: ISO base media file format (ISOBMFF)
  • Codecs: High Efficiency Video Coding (HEVC) and MPEG-H 3D audio
OMAF is the first specification defined as part of a bigger project currently referred to as ISO/IEC 23090 -- Immersive Media (Coded Representation of Immersive Media). It currently has the acronym MPEG-I; we previously used MPEG-VR, which is now replaced by MPEG-I (that still might change in the future). It is expected that the standard will become Final Draft International Standard (FDIS) by Q4 of 2017. Interestingly, it does not include AVC and AAC, probably the most obvious candidates for video and audio codecs, which have been massively deployed in the last decade and probably still will be a major dominator (and also denominator) in upcoming years. On the other hand, the equirectangular projection format is currently the only one defined, as it is already broadly used in off-the-shelf hardware/software solutions for the creation of omnidirectional/360-degree videos. Finally, the metadata formats enabling the rendering of 360-degree monoscopic and stereoscopic video are highly appreciated. A solution for MPEG-DASH based on AVC/AAC utilizing the equirectangular projection format for both monoscopic and stereoscopic video is shown as part of Bitmovin's solution for VR and 360-degree video.
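To make the equirectangular projection a bit more tangible, here is a minimal Python sketch of how a viewing direction maps to a pixel position in an equirectangular frame. The function name, the angle conventions (yaw increasing to the right, pitch increasing upwards, frame centre at yaw = 0 and pitch = 0) and the example frame size are my own assumptions for illustration and are not taken from the OMAF text.

```python
def sphere_to_equirect(yaw_deg, pitch_deg, width, height):
    """Map a viewing direction to pixel coordinates in an equirectangular frame.

    Assumed convention (hypothetical, not from OMAF): yaw in [-180, 180]
    increases to the right, pitch in [-90, 90] increases upwards, and the
    frame centre corresponds to yaw = 0, pitch = 0.
    """
    u = (yaw_deg + 180.0) / 360.0   # 0.0 at the left edge, 1.0 at the right edge
    v = (90.0 - pitch_deg) / 180.0  # 0.0 at the top edge, 1.0 at the bottom edge
    # Clamp so that yaw = 180 or pitch = -90 still lands inside the frame.
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return x, y


# Looking straight ahead lands in the centre of a 3840x1920 frame.
centre = sphere_to_equirect(0.0, 0.0, 3840, 1920)
```

The mapping is linear in both angles, which is one reason the format is so easy to produce and consume, at the cost of oversampling the regions near the poles.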

Research aspects related to OMAF can be summarized as follows:
  • HEVC supports tiles, which allow for efficient streaming of omnidirectional video, but HEVC is not as widely deployed as AVC. Thus, it would be interesting to see how to mimic such a tile-based streaming approach utilizing AVC.
  • The question of how to efficiently encode and package HEVC tile-based video is an open issue and calls for a tradeoff between tile flexibility and coding efficiency.
  • When combined with MPEG-DASH (or similar), there's a need to update the adaptation logic, as with tiles yet another dimension is added that needs to be considered in order to provide a good Quality of Experience (QoE).
  • QoE is a big issue here and not well covered in the literature. Various aspects are worth investigating, including a comprehensive dataset to enable reproducibility of research results in this domain. Finally, as omnidirectional video allows for interactivity, the user experience is also becoming an issue which needs to be covered within the research community.
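To illustrate the extra dimension that tiles add to the adaptation logic, here is a deliberately simple Python sketch. It is not an algorithm from MPEG or from any particular player; the function name, the per-tile bitrate ladder and the "viewport first" policy are assumptions made purely for illustration. All tiles start at the lowest bitrate, then the tiles inside the viewport are raised level by level as long as the throughput budget allows, and only then the remaining tiles.

```python
def select_tile_bitrates(viewport_tiles, num_tiles, ladder_kbps, budget_kbps):
    """Pick one bitrate per tile under a total throughput budget.

    viewport_tiles: set of tile ids currently visible (hypothetical input).
    ladder_kbps:    available per-tile bitrates, sorted ascending.
    Returns a list of chosen bitrates, indexed by tile id.
    """
    levels = [0] * num_tiles                 # every tile starts at the lowest level
    total = ladder_kbps[0] * num_tiles
    viewport = [t for t in range(num_tiles) if t in viewport_tiles]
    others = [t for t in range(num_tiles) if t not in viewport_tiles]

    # Viewport tiles are upgraded first, the remaining tiles afterwards.
    for group in (viewport, others):
        for next_level in range(1, len(ladder_kbps)):
            step = ladder_kbps[next_level] - ladder_kbps[next_level - 1]
            # Raise the whole group together so quality stays uniform within it.
            if total + step * len(group) > budget_kbps:
                break
            for t in group:
                levels[t] = next_level
            total += step * len(group)

    return [ladder_kbps[level] for level in levels]


# Four tiles, tiles 1 and 2 visible, 1500 kbps available.
choice = select_tile_bitrates({1, 2}, 4, [100, 300, 600], 1500)
```

A real adaptation logic would additionally have to deal with viewport prediction, buffer levels and quality switches across tile borders, which is exactly where the open QoE questions come in.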
A second topic I'd like to highlight in this blog post is related to the preliminary call for evidence on video compression with capability beyond HEVC.

Preliminary Call for Evidence on video compression with capability beyond HEVC

A call for evidence is issued to see whether sufficient technological potential exists to start a more rigid phase of standardization. Currently, MPEG together with VCEG has developed a Joint Exploration Model (JEM) algorithm that is already known to provide bit rate reductions in the range of 20-30% for relevant test cases, as well as subjective quality benefits. The goal of this new standard -- with a preliminary target date for completion around late 2020 -- is to develop technology providing better compression capability than the existing standard, not only for conventional video material but also for other domains such as HDR/WCG or VR/360-degree video. An important aspect in this area is certainly over-the-top video delivery (like with MPEG-DASH), which includes features such as scalability and Quality of Experience (QoE). Scalable video coding has been added to video coding standards since MPEG-2 but never reached widespread adoption. That might change if it becomes a prime-time feature of a new video codec, as scalable video coding clearly shows benefits for dynamic adaptive streaming over HTTP. QoE has already found its way into video coding, at least when it comes to evaluating the results, where subjective tests are now an integral part of every new video codec developed by MPEG (in addition to the usual PSNR measurements). Therefore, the most interesting research topic from a multimedia communication point of view would be to optimize the DASH-like delivery of such new codecs with respect to scalability and QoE. Note that if you don't like scalable video coding, feel free to propose something else as long as it reduces storage and networking costs significantly.

MPEG Workshop “Global Media Technology Standards for an Immersive Age”

On January 18, 2017, MPEG successfully held a public workshop on “Global Media Technology Standards for an Immersive Age” hosting a series of keynotes from Bitmovin, DVB, Orange, Sky Italia, and Technicolor. Stefan Lederer, CEO of Bitmovin, discussed today's and future challenges with new forms of content like 360°, AR and VR. All slides are available here and MPEG took their feedback into consideration in an update of its 5-year standardization roadmap. David Wood (EBU) reported on the DVB VR study mission and Ralf Schaefer (Technicolor) presented a snapshot of VR services. Gilles Teniou (Orange) discussed video formats for VR, pointing out a new opportunity to increase the content value but also raising the question of what is missing today. Finally, Massimo Bertolotti (Sky Italia) introduced his view on the immersive media experience age.

Overall, the workshop was well attended and, as mentioned above, MPEG is currently working on a new standards project related to immersive media. Currently, this project comprises five parts. The first part comprises a technical report describing the scope (incl. a kind of system architecture), use cases, and applications. The second part is OMAF (see above) and the third/fourth parts are related to immersive video and audio respectively. Part five is about point cloud compression.

For those interested, please check out the slides from industry representatives in this field and draw your own conclusions about what could be interesting for your own research. I'm happy to see any reactions, hints, etc. in the comments.

Finally, let's have a look at what happened related to MPEG-DASH, a topic with a long history on this blog.

MPEG-DASH and CMAF: Friend or Foe?

For MPEG-DASH and CMAF it was a meeting "in between" official standardization stages. MPEG-DASH experts are still working on the third edition, which will be a consolidated version of the 2nd edition and various amendments and corrigenda. In the meantime, MPEG issued a white paper on the new features of MPEG-DASH, which I would like to highlight here.
  • Spatial Relationship Description (SRD): allows describing tiles and regions of interest for partial delivery of media presentations. This is highly related to OMAF and VR/360-degree video streaming.
  • External MPD linking: this feature allows describing the relationship between a single program/channel and a preview mosaic channel having all channels at once within the MPD.
  • Period continuity: simple signaling mechanism to indicate whether one period is a continuation of the previous one, which is relevant for ad-insertion or live programs.
  • MPD chaining: allows for chaining two or more MPDs to each other, e.g., pre-roll ad when joining a live program.
  • Flexible segment format for broadcast TV: separates the signaling of the switching points and random access points in each stream and, thus, the content can be encoded with a good compression efficiency, yet allowing a higher number of random access points, but with a lower frequency of switching points.
  • Server and network-assisted DASH (SAND): enables asynchronous network-to-client and network-to-network communication of quality-related assisting information.
  • DASH with server push and WebSockets: basically addresses issues related to HTTP/2 push feature and WebSocket.
CMAF issued a study document which captures the current progress, and all national bodies are encouraged to take this into account when commenting on the Committee Draft (CD). To answer the question in the headline above, it looks more and more as if DASH and CMAF will become friends -- let's hope that the friendship lasts for a long time.

What else happened at the MPEG meeting?

  • Committee Draft MORE (note: type in 'man more' on any unix/linux/mac terminal and you'll get 'less - opposite of more';): MORE stands for “Media Orchestration” and provides a specification that enables the automated combination of multiple media sources (cameras, microphones) into a coherent multimedia experience. Additionally, it targets use cases where a multimedia experience is rendered on multiple devices simultaneously, again giving a consistent and coherent experience.
  • Technical Report on HDR/WCG Video Coding: This technical report comprises conversion and coding practices for High Dynamic Range (HDR) and Wide Colour Gamut (WCG) video coding (ISO/IEC 23008-14). The purpose of this document is to provide a set of publicly referenceable recommended guidelines for the operation of AVC or HEVC systems adapted for compressing HDR/WCG video for consumer distribution applications.
  • CfP Point Cloud Compression (PCC): This call solicits technologies for the coding of 3D point clouds with associated attributes such as color and material properties. It will be part of the immersive media project introduced above.
  • MPEG-H 3D Audio verification test report: This report presents results of four subjective listening tests that assessed the performance of the Low Complexity Profile of MPEG-H 3D Audio. The tests covered a range of bit rates and a range of “immersive audio” use cases (i.e., from 22.2 down to 2.0 channel presentations). Seven test sites participated in the tests with a total of 288 listeners.
The next MPEG meeting will be held in Hobart, April 3-7, 2017. Feel free to contact us for any questions or comments.

Thursday, October 13, 2016

MPEG-CMAF: Threat or Opportunity?

In February 2016 at the 114th MPEG meeting in San Diego, an input contribution was registered providing a proposal for a “common media format for segmented media” signed by a number of major companies. This document proposed a Media Application Format (MAF) based on the ISO Base Media File Format (ISOBMFF) and other MPEG standards for segmented media delivery which later became the MPEG Common Media Application Format (CMAF; officially MPEG-A Part 19 or ISO/IEC 23000-19).

In this blog post we will look closer into CMAF and how it actually relates to existing over-the-top (OTT) deployments.

The full version of the blog post is available here...

Wednesday, October 5, 2016

New Springer Journal “Quality and User Experience”

In 2016, the Springer journal “Quality and User Experience” was launched (Editors-in-Chief: S. Möller; M. Tscheligi). It presents research on the human experience and quality perception of digital media, telecommunication and Information Communications Technology (ICT) products and interactive services. It explores human-centered and technology-centered approaches and examines a range of perspectives on quality of experience. Coverage includes mobile and pervasive applications, augmented and virtual reality, gaming, video conferencing, telepresence, and video-on-demand. Tactics can be human centered (e.g., to characterize user perceptions) or technology centered (to guide product development). As a result of this research, technologies, products and systems can be evaluated and optimized to provide optimum experience; this optimization process is also targeted by the journal.

The journal promotes integration of knowledge by assembling a range of disciplinary perspectives on experience quality: quality of experience (QoE), user experience (UX), quality management, usability engineering, human-centered design, cognitive processes, subjective audio & video quality assessment, and human-computer interaction.

The journal will encourage and enable first class research from any scientific discipline that contributes to and shows relevance to quality of experience and user experience. Examples include: development of a new metric based on subjective or objective analysis; taxonomies and models to define and explain quality of experience and user experience; relationship to other concepts such as user acceptance or value systems; lab or situated studies delivering insights to specific experience aspects; discussion of influence factors on UX and QoE and their relationships; the significance of time for the dynamics of user experience and quality of experience; relevant insights from different disciplines such as design, psychology, social sciences or material science; research in contextual experiences to capture specific situations including specific domain aspects; tools and frameworks towards the development of next generation experiences; methods to capture, analyze, design and evaluate user experience and quality of experience; user experience research related to special user groups, special needs as well as personal differences; insights on the design of experiences from the constructive as well as from the process perspective; experience design approaches and methods; viewpoints on the meaning of experience design; and experience design for specific application domains.

Please note that QUEX offers permanent free access to all articles published in 2016 and 2017.

More information:

Tuesday, September 13, 2016

Open Positions at Bitmovin

About working with Bitmovin

Bitmovin, a YCombinator company, is a fast growing privately owned technology leader, located in Klagenfurt am Wörthersee in Austria and in California. Our company is leading in research and development of cutting-edge multimedia services such as Dynamic Adaptive Streaming over HTTP (DASH), HTTP Live Streaming (HLS), multimedia systems and multimedia cloud services.
We believe our employees are the most valuable assets of our company. They drive Bitmovin’s success with their knowledge, experience and passion. Therefore we provide a high degree of freedom to our employees to initiate projects and take on responsibility, while paying a very competitive salary above the average.
Working at Bitmovin is international, fast-paced, fun and challenging. We’re looking for talented, passionate and inspired people who want to change the way media is consumed online. Join us to develop, sell and market the world’s fastest video encoding service.

Sales and Marketing

Software and Development

Admin and Finance

Wednesday, August 31, 2016

IEEE Computer "Social Computing" Column: Call for Papers

I’m looking for forward-looking and thought-provoking articles for the Social Computing column within the IEEE Computer magazine. As you know, IEEE Computer is the flagship publication of the IEEE Computer Society (CS) which is distributed to all members (CS is the biggest society within IEEE).

The topics are related to the Special Technical Community on Social Networking (STCSN); please submit column articles directly to me! See the guidelines below; no specific template is required (just plain text in an editable Word file is fine).

An overview of previous columns can be found here. If you have any questions or comments, don’t hesitate to contact me.

Guidelines for Computer Column Contributions

We encourage column editors to include contributions solicited from their colleagues to provide the six installments for their bimonthly Computer columns.

The target length for each column is 2.0-2.5 magazine pages, or about 1,500-1,900 words. Each figure or table is counted as 300 words, and obviously we prefer to include appropriate graphic elements when they are available. Max. 2,200 (if no art).

Editors are asked to remind contributors that columns do not include a bibliography or an acknowledgments section. References or URLs can be inserted inline in the text if needed.

Submitted columns should include the article title, author(s) name(s) and affiliation(s) and a brief bio that also provides email contact information:

//First name/last name// is a //academic title, institution, or business title, company//. Contact him at //email address.//

Image guidelines

To ensure the quality needed for print publication, we need an editable vector art file (for example, Illustrator or Visio files) for each line drawing. For each photo, we need a 4-color electronic image at 300 dpi resolution, preferably in a .tif, .png, or .jpeg format. We cannot use derivative images or images embedded in a document.

In our article layouts, the figures are usually at least 4 inches (24 picas) wide. If you prefer to send screenshots, they should be approximately 12 inches wide. Our production artist can reduce these low-resolution images to 4 inches in Photoshop and process them to achieve the required resolution. If your original images are smaller than 12 inches, using a large monitor set at its highest resolution will help achieve a better screenshot. No compression is necessary.
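The arithmetic behind the 12-inch recommendation can be sketched as follows. This is only an illustration: the function name is mine, and the screen density of 100 pixels per inch is an assumed, typical value that does not come from the guidelines.

```python
def effective_dpi(pixel_width, print_width_inches):
    """Resolution of an image once it is scaled to a given print width."""
    return pixel_width / print_width_inches


# Assumption: the monitor shows roughly 100 pixels per inch.
screen_ppi = 100
screenshot_px = 12 * screen_ppi            # a 12-inch-wide screenshot: 1200 pixels

# Reduced to the 4-inch figure width, the same pixels give 300 dpi.
dpi_in_print = effective_dpi(screenshot_px, 4)
```

Under that assumption, shrinking a 12-inch screenshot to 4 inches triples its resolution, which is how a low-resolution screen capture can reach the 300 dpi needed for print.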

Tuesday, August 16, 2016

DASH-IF Academic Track

The MPEG-DASH standard has gained huge momentum within both industry and academia. The DASH-IF provides, among others, interoperability guidelines and test vectors and closes the gap enabling interoperable deployments. In recent years, we have seen a tremendous number of research papers addressing various issues in and around DASH and, thus, the DASH-IF establishes an academic track to:
  • identify research communities working in the area of DASH
  • create awareness of DASH-IF material and promote it within the academic community, and
  • solicit research within and collect results from the academic community
As a first step, the DASH-IF created the “Excellence in DASH Award” at ACM MMSys 2016 and is proud to announce the result as follows. The excellence in DASH award was selected by members of the DASH-IF and, instead of a first, second, and third place, the DASH-IF decided to give the first prize to all three papers, which are as follows: “ABMA+: lightweight and efficient algorithm for HTTP adaptive streaming” by Andrzej Beben, Piotr Wiśniewski, Jordi Mongay Batalla, Piotr Krawiec (Warsaw University of Technology, Poland); “Delivering Stable High-Quality Video: An SDN Architecture with DASH Assisting Network Elements” by Jan Willem Martin Kleinrouweler, Sergio Cabrero, Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands); and “SQUAD: A Spectrum-based Quality Adaptation for Dynamic Adaptive Streaming over HTTP” by Cong Wang, Amr Rizk, Michael Zink (University of Massachusetts Amherst, USA) (see pictures here).

For academics who want to join the DASH-IF Academic Track, please subscribe to the public email reflector via

Everyone is welcome - let's do something! For any comments or questions, please let me know.

Another related activity was the IEEE ICME 2016 Bitmovin Grand Challenge on DASH which is summarized below. We'd like to thank all authors who have submitted their work to the grand challenge and we'd like to congratulate the winning team!

Tuesday, August 2, 2016

Review of ACM MMSys 2016 & NOSSDAV, MoVid, and MMVE

The 7th ACM International Conference on Multimedia Systems (MMSys 2016) was successfully held in Klagenfurt am Wörthersee, Austria, from May 10-13, 2016, with the co-located workshops NOSSDAV, MoVid, and MMVE.
We'd like to thank our Gold Sponsors: Adobe and YouTube.
The ACM Multimedia Systems Conference (MMSys) provides a forum for researchers to present and share their latest research findings in multimedia systems. While research about specific aspects of multimedia systems is regularly published in the various proceedings and transactions of the networking, operating system, real-time system, and database communities, MMSys aims to cut across these domains in the context of multimedia data types. This provides a unique opportunity to view the intersections and the interplay of the various approaches and solutions developed across these domains to deal with multimedia data types.
This year’s MMSys introduced a new format referred to as overview talks, which were held on May 10 starting in the afternoon and concluding in the evening with a get-together event at the conference venue. The following overview talks were given at MMSys: “Using Games to solve Challenging Multimedia Problems” by Oge Marques, ACM Distinguished Speaker, FAU, USA; “More Juice Less Bits: MediaMelon Content Aware Streaming” by Ali C. Begen, MediaMelon Inc., USA, Ozyegin University, Turkey, IEEE ComSoc Distinguished Lecturer; “MPEG-DASH Spatial Relationship Description” by Omar Aziz Niamut, TNO, The Netherlands; “Mulsemedia: Novelty or Reinvention?” by Gheorghita Ghinea, Brunel University, UK; and “Smart Camera Systems” by Bernhard Rinner, Alpen-Adria-Universität Klagenfurt, Austria.
ACM MMSys typically comes with keynotes from experts and leaders in industry and academia. The first keynote was about “Ten Thousand Channels to Ten Million Viewers: Technologies for Scaling Video Delivery over IP” by Neill A. Kipp, Comcast VIPER, USA, addressing issues with video delivery at scale, whereas the second keynote, entitled “Advances and Trends in Augmented Reality Systems” by Dieter Schmalstieg, Graz University of Technology, Austria, was related to one of the special sessions. The third keynote was about “5G enabling the Tactile Internet” by Frank Fitzek, Technische Universität Dresden, Germany, providing insights about next generation mobile networks.
Best Paper Award
In general, ACM MMSys 2016 attracted 71 full paper submissions, of which 20 were finally accepted into the program, which was carefully selected by the experienced members of our technical program committee. In addition to the full paper submissions, MMSys 2016 hosted two special sessions, one on augmented reality and another on media synchronization. A demo session gave researchers, engineers, and scientists the opportunity to showcase their research prototypes, systems, and applications to MMSys attendees. An important aspect of MMSys is the dataset track, which enables reproducible research thanks to the availability of common datasets across different application areas. In particular, the dataset track is an opportunity for researchers and practitioners to make their work available and citable.
ACM MMSys hosts three workshops: the 26th ACM Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV), the 8th ACM Workshop on Mobile Video (MoVid), and the 8th ACM Workshop on Massively Multiuser Virtual Environments (MMVE). They operate with their own committees and review processes but benefit from a single registration fee for all events co-located with MMSys.
The 7th ACM MMSys issued the following awards:
  • a best paper award,
  • a best student paper award, and
  • for the first time the excellence in DASH award sponsored by the DASH-IF.
Best Student Paper
The best paper award goes to “Distributed Rate Allocation in Switch-Based Multiparty Videoconference” by Stefano D’bronco (EPFL), Sergio Mena (Cisco Systems), Pascal Frossard (EPFL) and the best student paper award goes to “Network-assisted Control for HTTP Adaptive Video Streaming” by Giuseppe Cofano (Politecnico di Bari, Italy), Luca De Cicco (Telecom SudParis, France), Thomas Zinner (University of Würzburg, Germany), Anh Nguyen-Ngoc (University of Würzburg, Germany), Phuoc Tran-Gia (University of Würzburg, Germany), Saverio Mascolo (Politecnico di Bari, Italy).
The excellence in DASH award was selected by members of the DASH-IF and, instead of a first, second, and third place, the DASH-IF decided to give the first prize to all three papers, which are as follows: “ABMA+: lightweight and efficient algorithm for HTTP adaptive streaming” by Andrzej Beben, Piotr Wiśniewski, Jordi Mongay Batalla, Piotr Krawiec (Warsaw University of Technology, Poland); “Delivering Stable High-Quality Video: An SDN Architecture with DASH Assisting Network Elements” by Jan Willem Martin Kleinrouweler, Sergio Cabrero, Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands); and “SQUAD: A Spectrum-based Quality Adaptation for Dynamic Adaptive Streaming over HTTP” by Cong Wang, Amr Rizk, Michael Zink (University of Massachusetts Amherst, USA).
We would like to congratulate all award winners of ACM MMSys 2016.
Finally, we would like to thank our gold sponsors Adobe and YouTube for their excellent support. In this context, it is worth mentioning the social events including coffee breaks, lunches, a get-together on the first evening, a welcome BBQ on the second evening, and a gala dinner on the third evening. These side events are as important as the technical papers, demos, and datasets and allow for networking, discussions, and possible future collaborations among conference attendees.
We’re also happy to announce next year’s ACM MMSys (and NOSSDAV, MoVid, and MMVE) in Taiwan with Sheng-Wei (Kuan-Ta) Chen from Academia Sinica.