Video: Encoding Vs Compute Efficiency in Video Coding

Ioannis Katsavounidis from Facebook joins us to talk through his work on finding the best balance between compute and compression efficiency. He explains how encoding has moved from real-time, hardware-based encoding in the late 80s and 1990s, through file-based encoding, to chunk-based and now shot-based encoding. Each of these stages has brought opportunities to speed up encoding, but there has always been a fundamental reason why encoding can’t simply be sped up by general advances in computing.

Moore’s law posits that the number of transistors in chips doubles roughly every two years. Whilst this has continued to hold until recent years, transistor count has only ever been a proxy for processing power. For many years now, the way to keep increasing the computational ability of CPUs has been not to raise clock speeds, as it was twenty years ago, but to add cores to the chip. As each core acts as its own CPU, this gives the ability to execute code in parallel, with a thread of code running separately on each core. Whilst 12-20 cores are typical for servers, there are CPUs which deliver up to 128 cores.

Ioannis explains why DCT-based codecs are resistant to multi-threaded encoding by showing how some encoding decisions depend on the previously decoded video frame, so the encoder needs to decode the video before it has the information it needs to make the next encoding decisions. An example of this is motion estimation, where you need to know what a macroblock looks like in the decoded reference frame in order to determine if and how it can be moved to predict the macroblock currently being encoded.
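
To make the dependency concrete, here is a minimal sketch of exhaustive block-matching motion estimation (the function name, search range and 16-pixel macroblock size are illustrative, not from the talk). Note that ref must be the previously decoded reference frame, not the original source, which is exactly the serial dependency Ioannis describes.

```python
import numpy as np

def best_motion_vector(ref, cur_block, top, left, search=8):
    """Exhaustive block matching: find the offset into the previously
    decoded reference frame that best predicts the current macroblock."""
    n = cur_block.shape[0]          # macroblock size, e.g. 16
    best_sad, best_mv = float("inf"), (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > ref.shape[0] or x + n > ref.shape[1]:
                continue            # candidate block falls outside the frame
            candidate = ref[y:y + n, x:x + n].astype(int)
            sad = np.abs(candidate - cur_block.astype(int)).sum()
            if sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv, best_sad
```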

It turns out that some of the information you need can be calculated from the original video. Whilst this doesn’t provide full parallelisation, it does free some of the computation to be done in parallel, reducing the time spent on the sequential encoding stage. As the design of the codec itself limits how far it can be parallelised, the best way to speed up encoding has been to split up the original video and encode the now-separate sections independently.

Speeding up video encoding has therefore focused on splitting the video into sections and encoding those in parallel, rather than trying to parallelise the encoding itself. Encoding each frame separately is one way to do this, but it sacrifices encoding efficiency. Splitting each frame into sections (tiles or slices) is another way, though this also sacrifices either quality or bitrate. The most successful encoding parallelisation has been chunked encoding. As streaming applications already use chunks, typically around 2 seconds nowadays, there’s no reason not to cut your video into small sections and encode those separately; note that the whole of this video focuses on non-live video.
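
As a rough illustration of chunk-based parallelism, here is a sketch (not from the talk) that cuts a source file into fixed 2-second chunks and encodes them across all CPU cores with ffmpeg. The file names and chunk count are hypothetical, and a production system would split on existing keyframes or use a proper segmenter rather than seeking by time.

```python
import subprocess
from concurrent.futures import ProcessPoolExecutor

CHUNK_SECONDS = 2  # the typical streaming chunk length mentioned above

def encode_chunk(args):
    src, index = args
    out = f"chunk_{index:04d}.mp4"          # hypothetical naming scheme
    subprocess.run([
        "ffmpeg", "-y",
        "-ss", str(index * CHUNK_SECONDS),  # seek to the chunk start
        "-t", str(CHUNK_SECONDS),           # encode one chunk's worth
        "-i", src,
        "-c:v", "libx264", "-preset", "slow", out,
    ], check=True)
    return out

if __name__ == "__main__":
    jobs = [("source.mp4", i) for i in range(30)]   # a hypothetical 60 s source
    with ProcessPoolExecutor() as pool:             # one encode per CPU core
        print(list(pool.map(encode_chunk, jobs)))
```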

If there’s a shot change in the middle of a chunk, it’s likely to look very bad, since the motion estimation will fail to produce good results and there may not be enough bitrate budget to compensate. It’s therefore best to drop in an IDR frame at the shot change, or to align your chunk boundaries with the shot changes themselves. Simply encoding these chunks in parallel would speed up the encoding; however, it misses an opportunity to optimise quality vs bitrate.
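
A hedged sketch of that shot-aware approach: the first pass uses ffmpeg’s scene-change score to find likely shot changes, and the second forces an IDR frame at each one. The 0.4 threshold and file names are illustrative guesses, not values from the talk.

```python
import re
import subprocess

def shot_changes(src, threshold=0.4):
    """Pass 1: keep only frames whose scene-change score exceeds the
    threshold and read their timestamps back from the showinfo filter."""
    log = subprocess.run(
        ["ffmpeg", "-i", src,
         "-vf", f"select='gt(scene,{threshold})',showinfo",
         "-f", "null", "-"],
        stderr=subprocess.PIPE, text=True).stderr
    return [float(t) for t in re.findall(r"pts_time:([0-9.]+)", log)]

def encode_with_idr_at_shots(src, out, cuts):
    """Pass 2: force an IDR frame at every detected shot change."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-c:v", "libx264",
         "-force_key_frames", ",".join(f"{t:.3f}" for t in cuts), out],
        check=True)

encode_with_idr_at_shots("source.mp4", "out.mp4", shot_changes("source.mp4"))
```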

Ioannis explains an experiment to determine the best operating point for chunks. He begins by reminding us that all encoders have ‘speed’ settings which control how much computation, and therefore time, is required for each encode. The ‘very fast’ setting in x264 will encode at the highest speed possible, but the quality will be worse for a given bitrate compared to the ‘very slow’ setting. Ioannis’s experiment encoded each chunk at every speed setting for a variety of resolutions and bitrates. Each encode was then analysed for quality using PSNR, MS-SSIM and VMAF.
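
The shape of that experiment might look something like the sketch below, which sweeps a few x264 presets, times each encode, and asks ffmpeg’s libvmaf filter for a quality score. It assumes an ffmpeg build with libvmaf, matching resolutions between source and encode, and hypothetical file names; the log parsing is deliberately crude.

```python
import subprocess
import time

PRESETS = ["ultrafast", "veryfast", "medium", "slow", "veryslow"]

def encode_and_score(src, preset, crf=23):
    out = f"enc_{preset}.mp4"                      # hypothetical output name
    t0 = time.time()
    subprocess.run(["ffmpeg", "-y", "-i", src, "-c:v", "libx264",
                    "-preset", preset, "-crf", str(crf), out], check=True)
    elapsed = time.time() - t0
    # Compare the encode (first input) against the source (second input);
    # the VMAF score is printed at the end of ffmpeg's log on stderr.
    log = subprocess.run(["ffmpeg", "-i", out, "-i", src,
                          "-lavfi", "libvmaf", "-f", "null", "-"],
                         stderr=subprocess.PIPE, text=True).stderr
    vmaf_line = [l for l in log.splitlines() if "VMAF score" in l][-1]
    return preset, elapsed, vmaf_line

results = [encode_and_score("chunk.mp4", p) for p in PRESETS]
```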

From Ioannis’s work, we can see how the speed setting affects both the encode time and the quality, and we can observe that the slower settings tend to offer minimal quality advantages for the significant extra encoding time involved. Each curve has a steep part and a shallow part, with the boundary between them forming the ‘convex hull’. Choosing a setting on the convex hull portion of the curve gives the optimal balance between quality and encoding time and is where, says Ioannis, most people should aim to operate.
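
Picking the operating points worth using from such a sweep is a small computation. The sketch below (with made-up numbers, not Ioannis’s data) keeps only the points where spending more encoding time actually buys more quality.

```python
def efficient_frontier(points):
    """Keep only the (encode_time, quality) operating points where
    spending more time actually buys more quality."""
    frontier, best_quality = [], float("-inf")
    for t, q in sorted(points):         # ascending encode time
        if q > best_quality:            # a slower point must improve quality
            frontier.append((t, q))
            best_quality = q
    return frontier

# Hypothetical (seconds, VMAF) points for one chunk.
ops = [(10, 88.0), (25, 91.5), (60, 92.0), (40, 90.0), (180, 92.2)]
print(efficient_frontier(ops))   # the (40, 90.0) point is dominated and dropped
```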

The talk finishes with a summary of the conclusions which can be drawn from this work: the use of the convex hull we’ve just discussed, the best type of parallel processing, whether oversubscription of CPU cores is helpful or not, and the interesting observation that it’s often the quality metrics which place a significant burden on the encoding pipeline, rather than the video encoding itself, particularly at lower resolutions.

Watch now!
Speakers

Ioannis Katsavounidis
Research Scientist,
Facebook

Video: Keeping Time with PTP

The audio world has been using PTP for years, but now there is renewed interest thanks to its inclusion in SMPTE ST 2110. Replacing the black and burst timing signal (and, for those that used it, tri-level sync, TLS), PTP changes the way we distribute time. B&B was a one-way, waterfall distribution; PTP is a bi-directional conversation which, as a system, needs to be monitored and actively maintained.

Michael Waidson from Telestream (who now own Tektronix’s video business) brings us the foundational basics of PTP as well as tips and tricks to troubleshoot your PTP system. He starts by explaining the types of messages exchanged between the clock and the device, and why all these different messages are necessary. We see that we can set the frequency at which the announce, sync and follow-up messages are sent. The sync and follow-up messages actually carry the time. When a device receives one of these, it responds with a ‘delay request’ in order to work out how much delay there is between it and the grandmaster clock, which results in it receiving a ‘delay response’. On top of these basic messages, there is a periodic management message which can carry further information such as daylight saving time or drop-frame information.
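
The arithmetic the device performs with those four timestamps is standard two-way time transfer and, assuming a symmetric network path, looks like this (the timestamp values are hypothetical):

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """t1: Sync sent by the grandmaster (carried in Sync/Follow-Up)
    t2: Sync received by the device
    t3: Delay Request sent by the device
    t4: Delay Request arrival, returned in the Delay Response."""
    offset = ((t2 - t1) - (t4 - t3)) / 2   # device clock error vs grandmaster
    delay = ((t2 - t1) + (t4 - t3)) / 2    # mean one-way path delay
    return offset, delay

# Hypothetical timestamps (seconds): a device running 1.5 ms fast
# across a path with 0.5 ms of one-way delay.
print(ptp_offset_and_delay(100.0000, 100.0020, 100.0100, 100.0090))
```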

Michael moves on to troubleshooting, highlighting the four main values to check: the domain number, the grandmaster ID, the message rates and the communication mode. PTP is a global standard used in many industries; to make it most useful to the broadcast industry, SMPTE ST 2059 defines the message rates to use (4 per second for announce messages, 8 per second for sync, delay request and delay response). ST 2059 also defines how devices can determine the phase of any broadcast signal at any given time, which is the fundamental link needed to keep all devices in sync.
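
A simplified sketch of that phase calculation, assuming (per ST 2059-1) that every signal has been free-running since the SMPTE epoch, which coincides with the PTP epoch of 1 January 1970 TAI; the rate and time values below are illustrative:

```python
from fractions import Fraction

def frame_phase(ptp_seconds, rate=Fraction(30000, 1001)):
    """Fraction of the way through the current frame at a given PTP time,
    for a signal free-running since the epoch: 0 means a frame boundary."""
    frames_since_epoch = ptp_seconds * rate
    return frames_since_epoch - int(frames_since_epoch)

# Hypothetical: the phase of a 29.97 fps signal ten million seconds
# after the epoch. Every device doing this sum lands on the same answer,
# which is what keeps a whole facility in phase.
print(float(frame_phase(Fraction(10_000_000))))
```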

Another good tip from Michael: if you see the grandmaster MAC cycling between the grandmasters on the system, this indicates a device is not receiving any announce messages, so it is re-running the Best Master Clock Algorithm (BMCA) and trying the next grandmaster. Some PTP monitoring equipment, including that from Meinberg and Telestream, can show the phase lag of the PTP timing as well as the delay between the primary and secondary grandmasters – the lower the better.
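
For reference, the BMCA is essentially an ordered comparison of fields carried in the announce messages, with the lower value winning at each step. This is a simplified sketch of that comparison, not production PTP code:

```python
from dataclasses import dataclass

@dataclass
class Announce:
    """The fields from an announce message that the BMCA compares;
    for every one of them, lower is better."""
    priority1: int
    clock_class: int
    clock_accuracy: int
    variance: int
    priority2: int
    clock_identity: bytes   # final tie-breaker, derived from the MAC address

def best_master(heard):
    """Choose a grandmaster from the announce messages currently heard."""
    return min(heard, key=lambda a: (a.priority1, a.clock_class,
                                     a.clock_accuracy, a.variance,
                                     a.priority2, a.clock_identity))
```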

A talk on PTP can’t avoid mentioning boundary clocks and transparent switches. Boundary clocks take on much of the two-way traffic in PTP, protecting the grandmasters from having to speak directly to all of the potentially thousands of devices. Transparent switches simply update the time announcements with the delay the message incurs moving through the switch. Whilst this is useful in keeping the timing accurate, it provides no protection for the grandmasters. The video ends with a look at how to check PTP messages on the switch.
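
The whole job of a transparent switch can be sketched in a few lines: it never answers PTP messages itself, it just accumulates its own residence time into the message’s correction field (the dictionary representation below is purely illustrative):

```python
def forward_sync(message, ingress_ns, egress_ns):
    """A transparent switch never answers PTP itself; it just adds its own
    residence time to the correction field as it forwards the message."""
    message["correction_ns"] += egress_ns - ingress_ns
    return message

sync = {"origin_timestamp_ns": 1_000_000_000, "correction_ns": 0}
print(forward_sync(sync, ingress_ns=500, egress_ns=2500))  # 2 µs residence time
```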

Watch now!
Speakers

Michael Waidson
Application Engineer,
Telestream (formerly Tektronix)

Video: TV Sport Innovation – Staying Ahead of the Game

Sport has always led innovation in many areas of broadcast, but during COVID sports broadcasters not only had to adapt nearly every workflow and redeploy staff, they then had to brace themselves to deliver 100 games in 40 days. Gordon Roxburgh sums it up: “I’ve been at Sky twenty years, and I think [these have] been the most challenging six months…we’ve faced.”

In this session from the DTG’s Future Vision 2020 conference, Carl Hibbert from Futuresource Consulting talks to Sky, Arsenal TV and Facebook to find out how their businesses have adapted. Melissa Lawton from Facebook explains how their live streaming, both for user-generated footage and produced sport, has adapted to the changing needs. When COVID hit, Facebook lost some very valuable content. Their response was to double down on fan engagement, challenging fans to create content and staging events which were produced and commentated like real sports events, but where all the shots were of people exercising at home, brought into the narrative of an Iron Man competition. Facebook have also invested in their user-facing tools and dashboards to help expose and monitor contribution via live streaming.

Gordon Roxburgh from Sky explains the sea change he’s seen in production. “The first thing was to keep channels on air…and keep staff safe.” They moved rapidly from a fully staffed office to just three or four people on-site and a presenter. In order to mix, they created a Virtual Production suite which allowed people to create content in the cloud.

For content, Gordon says that watch-alongs proved very popular, where key sports personalities talk through what they were thinking during key sporting moments. This was just one of the many content ideas that kept programming going until “Project Restart” commenced and the whole sports ecosystem asked itself ‘How can we deliver 100 games in 40 days?’ Once they knew the season would start, Gordon says, this opened up a 3-week build period during which BT Media and Broadcast, NEP, NEP Connect and multiple internal departments collaborated to produce rapid turnarounds.

“As an industry, we came together.” The working practices developed at Sky were shared with other major broadcasters, who in turn shared their own best practice – always putting staff first. Sky even went to the extent of building a technical space on a large studio floor to keep people apart, and co-opted a set of training rooms to become a self-contained graphics unit. These ideas kept the graphics operators together whilst not mixing with the rest of the production.

The view from Arsenal TV is explained by John Dollin. They acted quickly very early on and were able to be back in the office from February. Whilst Arsenal TV doesn’t have the rights to stream live, they produce their programmes live for transmission later. This used to be done in a crowded room but was soon transferred to a virtual mixer in the cloud with remote editors. John highlights the challenge of bringing freelancers into the system and providing them with appropriate supervision. More importantly, he feels that their current ability to maintain pre-COVID production quality is down to the continued dedication of certain personnel who are putting in long hours, which is not a sustainable situation to be in.

Watch now!
Free Registration
Speakers

Gordon Roxburgh
Technical Manager,
Sky Sports
Melissa Lawton
Live Sports Production Strategy,
Facebook
John Dollin
Senior Product & Engineering Manager,
Arsenal Football Club
Carl Hibbert
Head of Consumer Media & Technology,
Futuresource Consulting

Video: Delivering Quality Video Over IP with RIST

RIST (Reliable Internet Stream Transport) continues to gain traction as a way to deliver video reliably over the internet. It finds uses both as part of the on-air signal chain and in enabling broadcast workflows, by ensuring that any packet loss is mitigated before a decoder gets around to decoding the stream.

In this video, AWS Elemental’s David Griggs explains why AWS use RIST and how RIST works, introduced by LearnIPvideo.com’s Wes Simpson, who is also co-chair of the RIST Activity Group at the VSF. Wes starts off by explaining how consumer and business video streaming use-cases differ from broadcast workflows, two of the pertinent differences being one-directional video and the need for a fixed delay. David explains that one motivator for broadcasters looking to the internet is the need to replace C-Band satellite links.

RIST’s original goals were to deliver video reliably over the internet and to ensure interoperability between vendors, which has been missing to date in the purest sense of the word. Along with this, RIST also aimed to have a low, deterministic latency, which is vital to make most broadcast workflows practical. RIST was also designed to be agnostic to the carrier type, be it internet, satellite or cellular.

Wes outlines how important it is to compensate for packet loss, showing that even at what might seem a low packet loss rate, you’ll still observe a glitch on the audio or video every twenty minutes. But RIST is more than just a way of ensuring your video and audio arrive without gaps; it can also carry other control signals such as PTZ camera control, intercom feeds, ad-insertion markers such as SCTE 35, subtitling and timecode. This is one strength which makes RIST better suited to broadcast than using, say, RTMP to deliver a live stream.
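
Wes’s point is easy to verify with a little arithmetic. Assuming a constant-bitrate stream with the usual 1316-byte payload of seven transport-stream packets per IP packet (the bitrate and loss rate below are hypothetical), the mean time between lost packets comes out on the order of twenty minutes even at a seemingly negligible loss rate:

```python
def seconds_between_glitches(bitrate_bps, loss_rate, payload_bytes=1316):
    """Mean time between lost packets for a constant-bitrate stream;
    1316 bytes is seven 188-byte TS packets, the usual payload size."""
    packets_per_second = bitrate_bps / (payload_bytes * 8)
    return 1 / (packets_per_second * loss_rate)

# A 10 Mbit/s stream at a 0.0001% (1e-6) loss rate still drops a
# packet roughly every 17 minutes.
print(seconds_between_glitches(10_000_000, 1e-6) / 60, "minutes")
```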

Wes covers the main and simple profiles, which are also explained in more detail in this video from SMPTE and this article. One way in which RIST differs from other technologies is GRE tunnelling, which allows the carriage of any data type alongside RIST and also allows bundling of several RIST streams down a single connection. This provides a great amount of flexibility to support new workflows as they arise.

David closes the video by explaining why RIST is important to AWS. It allows a single protocol to support media transfers to, from and within the AWS network. Also important, David explains, is RIST’s standards-based approach: RIST is built out of many existing standards and RFCs with very little bespoke technology. Moreover, the RIST specification is being formally created by the VSF, and many VSF specifications have gone on to be standardised by bodies such as SMPTE, ST 2110 being a good example. AWS offer the RIST simple profile within MediaConnect, with plans to implement the main profile in the near future.

Watch now!
Speakers

David Griggs
Senior Product Manager, Media Services,
AWS Elemental
Wes Simpson
RIST AG Co-Chair,
President & Founder, LearnIPvideo.com