
Audiovisual multimedia experiences over 5G networks

By David Jiménez, PhD, Assistant Professor at Polytechnic University of Madrid

The increased capabilities offered by the evolution of communications networks provide a set of possibilities beyond the mere direct exploitation of improvements in bandwidth, latency, density and scalability. This new horizon, which can be characterized at a very high level by the integration of intelligence into the network, features the ability to manage network resources dynamically, based on various service criteria, as well as to make the most of the computing possibilities of the Edge and the Cloud.

Among the potential applications, those related to audiovisual multimedia are among the most demanding, with very high requirements in terms of network capacity and computational resources.

For content services, the decentralization allowed by the flexibility of the new networks, together with the ability to deploy services on the Edge, makes it genuinely possible to change classic workflows by bringing the processes linked to content distribution to the point where the content is acquired. Virtualization mechanisms then make it possible to deploy the processes linked to these services dynamically, while the programmability of the networks allows those services to be adapted to the specific demand that arises, to respond to modifications at specific times and even to integrate real-time quality control systems that allow decision-making to be anticipated regarding how the services are provided or the resources assigned to them.

In this line of thought, the Visual Telecommunications Application Group (GATV) of the Polytechnic University of Madrid (UPM) has worked on the deployment of proofs of concept to assess the actual capacity to provide high-level audiovisual content services by making the most of the aforementioned set of capabilities: Network Function Virtualization (NFV), Software-Defined Networks (SDN) and computing at the edge of a multiple-access network, the so-called Multi-access Edge Computing (MEC). On this basis, the architecture and services within the communications network are structured. With network function virtualization, network functions are implemented in software (Virtualized Network Functions, VNFs), typically over a Network Function Virtualization Infrastructure (NFVI), which decouples network functions from hardware, resulting in increased infrastructure flexibility and reduced operating and equipment expenses. By contrast, Physical Network Functions (PNFs) are hardware boxes that provide a specific functionality.
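To make the idea of decoupling more concrete, a VNF is typically described by a small declarative record of the compute resources it needs, which the NFVI then maps onto generic servers. The sketch below is purely illustrative: the field names are invented for this example and do not follow any specific orchestrator's schema, and only the CPU and memory figures echo the allocation later reported for the pilot's compressor.

    # Illustrative only: a simplified, hypothetical VNF descriptor as a Python dict.
    # Field names are invented for this sketch; real deployments use an
    # orchestrator-specific descriptor format.
    vnf_descriptor = {
        "name": "vCompressor",          # the virtualized function being described
        "image": "vcompressor:latest",  # placeholder software image identifier
        "compute": {
            "vcpus": 8,                 # matches the allocation reported for the pilot
            "memory_gb": 4,
            "disk_gb": 20,              # assumed value, not stated in the article
        },
        "placement": "edge",            # request deployment on the MEC, close to acquisition
        "interfaces": ["mgmt", "media_in", "media_out"],
    }

    # The NFVI/orchestrator would read a record like this and instantiate the
    # function on whatever generic hardware satisfies the requested resources.
    print(vnf_descriptor["compute"])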

On the other hand, SDNs handle routing and forwarding functions in software on the network devices. The use of SDNs offers three important keys: it separates the control plane from the data plane, provides centralized management and, finally, turns the entire network into a programmable entity. With SDN and NFV, the complexity of device design is reduced, an efficient network configuration is achieved, and the working context can react to status changes much faster than with conventional approaches, which provides great flexibility and cost-effectiveness in the implementation of services, in this case audiovisual content distribution services.
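As a minimal illustration of that programmability, a network application can request forwarding changes through a controller's northbound API instead of configuring each device by hand. The sketch below is hypothetical: the controller URL, endpoint and rule fields are placeholders invented for this example and do not correspond to the API of any particular SDN controller.

    import requests  # well-known HTTP client; the controller API below is hypothetical

    # Hypothetical northbound endpoint of an SDN controller (placeholder URL).
    CONTROLLER = "http://sdn-controller.example:8080/api/flows"

    # Illustrative flow rule: steer the camera's media traffic towards the edge node
    # that hosts the media VNFs. Field names are invented for this sketch.
    flow_rule = {
        "match": {"src_ip": "10.0.1.10", "udp_dst_port": 50000},
        "action": {"forward_to": "edge-node-1"},
        "priority": 100,
    }

    resp = requests.post(CONTROLLER, json=flow_rule, timeout=5)
    resp.raise_for_status()  # the controller pushes the rule to the switches it manages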

Lastly, implementation over the MEC enables processing and storage resources to be moved closer to demanding users, thus reducing latency and traffic aggregation, as required by audiovisual multimedia services. NFV, SDN and MEC are mutually complementary technologies that lead the evolution of network architecture, offering new services, in this case operated for the provision of content services.

One of these pilots was carried out within the framework of the 5G-Media project (Programmable edge-to-cloud virtualization fabric for the 5G Media industry), which aimed to create a flexible environment based on a service platform with a Software Development Kit (SDK) that would allow network-agnostic users to deploy solutions, in the form of virtualized network functions, for the implementation of audiovisual multimedia services on the Edge. The main goal of this pilot was to verify the feasibility of carrying out virtualized remote production over 5G networks in real time. The experience took place at the Matadero (Madrid) premises, in collaboration with Radio Televisión Española (RTVE) and Telefónica, during the Radio 3 broadcast of “Life is a dream”, and showed how technological advances in 5G and edge computing can provide enough potential to offer high-quality audiovisual content services through a dynamic and efficient allocation of resources.

This approach offers an alternative for audiovisual multimedia services and applications whose capacity, latency and bandwidth requirements exceed what conventional networks can offer.

Nowadays, professional production of events for broadcast is to a great extent associated with a large investment of resources: money, bulky equipment to be hauled, OB vans, and long preparation and verification times for installations. In addition, dedicated connections are established between the event venue and the station site to ensure the required high throughput and transmission quality, with the associated cost; the bandwidth requirements for conventional television production are around several gigabits per second. All in all, this gives an idea of the magnitude of covering an event and the investment associated with its production.

In contrast to this approach, 5G technologies propose a new paradigm for the management of services distributed and deployed on the Edge, thus guaranteeing the quality of service, generally evaluated as quality of user experience, even for the most stringent and demanding network content service requirements.

The remote production pilot targeted the live production of an event from a location other than the event itself. To do this, it takes advantage of the ability of the 5G network to send the camera, audio and control signals to a production room, making it unnecessary to take that equipment on site during production or to send the staff in charge of it. The architecture used is detailed below.

Acquisition was carried out with three Sony cameras, two PMW-500 and one PDW-700, which deliver a 1280x720-pixel signal with a depth of 10 bits per channel, 50 frames per second and progressive scanning, as per the SMPTE 296M standard. These signals are fed, uncompressed, into the Edge using the SMPTE ST 2110 standard (Professional Media Over Managed IP Networks), which carries video, audio and ancillary data in separate elementary streams. This enables separate processing and the generation of the desired workflow for each of them, even allowing their management at different points. SMPTE ST 2110 also allows only the active image area to be sent, with savings close to 40%. The streams were encapsulated with an Embrionix device capable of handling two different raw streams.
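To put the figure of several gigabits per second and the roughly 40% saving in perspective, a back-of-the-envelope calculation helps. The sketch below assumes the 4:2:2 10-bit sampling commonly used on an HD-SDI camera output and the nominal 1.485 Gb/s HD-SDI link rate; both are assumptions introduced for this estimate rather than figures stated in the article.

    # Rough estimate of the uncompressed bitrate of one 720p50 camera signal.
    # Assumptions (not from the article): 4:2:2 chroma sampling, 10 bits per sample,
    # and a nominal 1.485 Gb/s HD-SDI link rate that includes blanking intervals.

    width, height, fps = 1280, 720, 50
    bits_per_pixel = 20                      # 4:2:2 at 10 bits: 10 (Y) + 10 (Cb/Cr shared)
    active_bps = width * height * fps * bits_per_pixel
    hdsdi_bps = 1.485e9                      # nominal HD-SDI rate: active area plus blanking

    print(f"Active picture only: {active_bps / 1e6:.0f} Mb/s")          # ~922 Mb/s
    print(f"Three cameras: {3 * active_bps / 1e9:.1f} Gb/s")            # ~2.8 Gb/s
    print(f"Saving vs. full HD-SDI: {1 - active_bps / hdsdi_bps:.0%}")  # ~38%, close to 40%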

The hardware chosen for this task is an Embrionix SMPTE ST 2110 device [27], which is controlled by software and allows the inclusion of two different raw streams. It is managed from a proprietary program that facilitates network configuration and the setting of routing parameters, as well as the Session Description Protocol (SDP) file containing the configuration of the signals that are sent to the MEC, where the IP video signals are handled. To manage these IP signals, which are very demanding in terms of bandwidth, the required local area network (LAN) is created by means of a switch fitted with Small Form-factor Pluggable (SFP) transceivers and connected to the service provider’s network through a 10 Gbps link.
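For reference, an SDP file for an ST 2110 video essence describes the multicast address, RTP payload and picture format of the stream. The fragment below is an illustrative sketch with placeholder addresses and values, written following the usual conventions for describing uncompressed video in SDP; it is not the actual file used in the pilot.

    # Illustrative SDP description of one ST 2110 uncompressed video stream.
    # Addresses, ports and payload values are placeholders for this sketch only.
    example_sdp = """\
    v=0
    o=- 123456 1 IN IP4 192.0.2.10
    s=camera-1 video (example)
    c=IN IP4 239.0.0.1/32
    t=0 0
    m=video 50000 RTP/AVP 96
    a=rtpmap:96 raw/90000
    a=fmtp:96 sampling=YCbCr-4:2:2; width=1280; height=720; exactframerate=50; depth=10; colorimetry=BT709
    """

    with open("camera1.sdp", "w") as f:   # the receiving VNF reads this file as its input
        f.write(example_sdp)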

A schematic view of the architecture can be seen in Figure 1.

The VNFs deployed on the Edge are based on open source. They are flexible, scalable and able to evolve more easily than traditional networks, and can be used for both live and delayed production. They make it possible to automate tasks and create customized workflows by using intelligent systems, adding capacity where and when it is needed. In addition, updating them is simpler than with their physical equivalents.

The VNFs developed under the pilot are as follows:

• vUnpacker: allows the use of the UDP protocol with the IP-based SMPTE ST 2110 standard. It decodes the RTP video-over-IP streams and creates a regular TCP workflow with Matroska-formatted output. As input, the function uses an adaptation of the SDP file. (A minimal sketch of this step, together with the compression step, appears after this list.)

• Media Process Engine (vMPE): allows modifying and mixing video signals. The final program signal (PGM) is produced near the site to harness the computing power of the network edge. For the pilot, its function was to allow switching between the three input signal sources, as well as creating a composition between two different sources. The MPE is split into two modules:

- Server: the VNF that deploys the editor’s kernel. It provides two types of outputs: previews and the program signal (PGM). Preview signals are sent to the client application in low resolution (with very high compression); for this compression, the server uses M-JPEG. The PGM signal consists of the source selection or source composition chosen by the filmmaker, and is provided as a raw signal with 4:2:0 sampling, packaged in Matroska format.

- Client: the graphical interface for production of the program signal. In the case at hand, the filmmaker is located in the central facilities and not at the event site. For server-client communication, the functions use a simple TCP-based, line-oriented command protocol, with three main types of operations: client-to-server commands, command responses and errors, and server preview signals. (A minimal sketch of such an exchange appears after this list.)

• vCompressor: responsible for encoding the audio/video signals to reduce their bandwidth by using the H.264 standard. It is based on open-source coding techniques and libraries included in ffmpeg. Compression introduces latency into the signal transmission, which can be critical in some cases, so implementing virtual functions at the network edge is an advantage for reducing latency.
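As an illustration of how the unpacking and compression stages can be chained with ffmpeg, the sketch below launches two hedged ffmpeg invocations from Python: one reads the ST 2110/RTP stream described by an SDP file and repackages the raw video as Matroska over TCP, and one encodes that feed with H.264. Host names, ports, bitrate and presets are assumptions introduced for this example; the pilot's actual VNF implementations are not published here.

    import subprocess

    # Sketch of a vUnpacker-like stage (assumed parameters): read the RTP/ST 2110
    # stream described by camera1.sdp and expose raw video as Matroska over TCP.
    unpacker = subprocess.Popen([
        "ffmpeg", "-protocol_whitelist", "file,udp,rtp",
        "-i", "camera1.sdp",                 # SDP file describing the incoming stream
        "-c:v", "rawvideo",                  # keep the video uncompressed at this stage
        "-f", "matroska", "tcp://0.0.0.0:9000?listen",
    ])

    # Sketch of a vCompressor-like stage (assumed parameters): take the raw feed
    # and encode it with H.264 at a bitrate in the order of the pilot's PGM signal.
    compressor = subprocess.Popen([
        "ffmpeg", "-i", "tcp://edge-host:9000",
        "-c:v", "libx264", "-preset", "veryfast", "-tune", "zerolatency",
        "-b:v", "10M",                       # roughly the 10 Mbps reported for the PGM
        "-f", "mpegts", "udp://studio-host:10000",
    ])

    unpacker.wait()
    compressor.wait()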
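The server-client exchange of the vMPE can likewise be pictured in a few lines of socket code. The command names, port and reply format below are invented for this illustration; the article only states that the protocol is a simple TCP-based, line-oriented one with commands, responses and errors, and preview signals.

    import socket

    # Hypothetical example of a client-to-server command exchange for the vMPE.
    # "SELECT 2" (switch the PGM to source 2) and the "OK"/"ERR" replies are
    # invented command names for this sketch, not the pilot's actual protocol.
    HOST, PORT = "vmpe-server.example", 5800   # placeholder address of the vMPE server

    with socket.create_connection((HOST, PORT), timeout=5) as sock:
        sock.sendall(b"SELECT 2\n")            # client-to-server command
        reply = sock.makefile().readline().strip()
        if reply.startswith("ERR"):            # command-response error path
            raise RuntimeError(f"vMPE rejected command: {reply}")
        print("Source switched:", reply)       # e.g. "OK SELECT 2"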

As a final summary of the pilot’s workflow, the camera baseband video signal is carried via HD-SDI and converted to IP using the SMPTE ST 2110 standard. Thereafter, the VNFs described above interact with the signal: the vUnpacker obtains raw video from the IP signal, the vMPE acts as a video switch controlled by the filmmaker from the transmission facilities (where the preview signals from each source can be viewed) and, finally, the vCompressor (vCE) compresses the PGM (program) signal using the H.264 encoding format. The output signal is the one finally used in the broadcast.

The pilot’s overall results were highly satisfactory. Regarding bandwidth, the three monitoring signals used 4.93 Mbps and the PGM signal used 10 Mbps. Latency, which is critical for any audiovisual multimedia service, was controlled with a GPS time application, and the measured average was 500 ms. Finally, regarding the use of virtualized resources, the use of processors (CPUs), memory and disk storage managed with the NFVI was measured. The largest allocation was made to the compressor (8 CPUs and 4 GB of memory), and its workload when in operation ranged between 50 and 75% of capacity.
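A quick calculation shows why edge compression makes these figures possible. Using the uncompressed active-picture estimate from the earlier sketch (an assumption of roughly 920 Mb/s per camera, not a measurement from the pilot), the 10 Mbps PGM output corresponds to a compression ratio on the order of 90:1.

    # Rough compression ratio of the PGM signal, based on the earlier (assumed)
    # uncompressed estimate of about 921.6 Mb/s for one 720p50 4:2:2 10-bit signal.
    uncompressed_bps = 921.6e6
    pgm_bps = 10e6                      # PGM bitrate reported for the pilot
    print(f"Approximate compression ratio: {uncompressed_bps / pgm_bps:.0f}:1")  # ~92:1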

Table 1 summarizes the main impact indicators, comparing the two models (indicator, traditional production, 5G remote production) and the improvement provided by 5G-based remote production.
