Bandwidth/CPU Management with Track Subscriptions, Video Quality Settings, and Spatial Layers

When you're building with Call Object and your calls have more than a few participants, you may need to manage network bandwidth and CPU usage so that your users get the best call experience.

Picking People: Managing Track Subscriptions

By default, every participant in a call will receive video and audio from every other participant. In large calls, that can mean a lot of network bandwidth and a lot of processing power for decoding all that video. In many of these scenarios, you can improve the experience by only subscribing each participant to the audio and video tracks they need at the moment.

You can use track subscriptions to separate a large call into breakout groups, where each person only gets audio and video from the other people in their group.
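As a rough illustration of the breakout-group idea, the sketch below builds the argument for daily-js's updateParticipants() from a group assignment. The helper name and the groupBySession map are hypothetical app-level pieces, not part of daily-js, and the sketch assumes automatic subscriptions have already been turned off for the call.

```javascript
// Hypothetical helper: build an updateParticipants() payload so the local
// participant only receives tracks from people in the same breakout group.
// `participants` is shaped like the return value of call.participants();
// `groupBySession` (session_id -> group name) is made-up app state.
function buildBreakoutSubscriptions(participants, groupBySession, myGroup) {
  const updates = {};
  for (const p of Object.values(participants)) {
    if (p.local) continue; // skip yourself; only remote tracks are subscribed
    const sameGroup = groupBySession[p.session_id] === myGroup;
    updates[p.session_id] = {
      setSubscribedTracks: {
        audio: sameGroup,
        video: sameGroup,
        screenVideo: sameGroup,
      },
    };
  }
  return updates;
}

// Usage, assuming `call` is a joined Daily call object:
//   call.setSubscribeToTracksAutomatically(false);
//   call.updateParticipants(
//     buildBreakoutSubscriptions(call.participants(), groupBySession, 'group-a')
//   );
```

Keeping the payload builder separate from the call object makes it easy to recompute whenever someone joins, leaves, or changes groups.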

For larger calls, you can separate video into 'pages' to reduce the number of cameras on screen at the same time. For example, even if a call has 40 people in it, you might only show a 3x3 grid of videos, with buttons to switch between 'pages'. As a participant switches pages, your app can unsubscribe from the first group of tracks and subscribe to the next. Even in an Active Speaker layout, you can improve overall performance by paginating your thumbnail row.
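A minimal sketch of the paging approach, assuming your app keeps an ordered array of remote session IDs and a 3x3 grid (9 tiles per page). The helper name is hypothetical, and keeping audio on for off-page participants is an app-level choice rather than a requirement.

```javascript
// Hypothetical helper: subscribe to video only for the participants on the
// visible page. Audio stays subscribed for everyone so that people on other
// pages can still be heard.
function buildPageSubscriptions(remoteIds, page, pageSize = 9) {
  const start = page * pageSize;
  const visible = new Set(remoteIds.slice(start, start + pageSize));
  const updates = {};
  for (const id of remoteIds) {
    const onPage = visible.has(id);
    updates[id] = {
      setSubscribedTracks: { audio: true, video: onPage, screenVideo: onPage },
    };
  }
  return updates;
}

// Usage, assuming `call` is a joined Daily call object with automatic
// subscriptions turned off, and `remoteIds` is your own ordered list:
//   call.updateParticipants(buildPageSubscriptions(remoteIds, currentPage));
```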

Here's a blog post where we discuss this a bit more: https://www.daily.co/blog/create-dynamic-meetings-using-track-subscriptions/

In daily-js v0.18.0, we also added a staged state for track subscriptions. This is a way to indicate that you'll need a track soon, but not yet. Setting a track's subscription to 'staged' creates the necessary channels between the SFU and your app without sending any actual video/audio data until you set the subscription to true. Details here: https://docs.daily.co/changelog/changelog-018-2021-08-17-21
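One way the staged state might fit into a paginated layout is to stage the pages next to the visible one, so that flipping pages only has to switch those tracks from 'staged' to true. This is a sketch under that assumption; the helper names and the "adjacent pages" policy are illustrative, not something daily-js prescribes.

```javascript
// Hypothetical policy: video is true on the visible page, 'staged' on the
// pages immediately before and after it, and false everywhere else.
function videoStateForIndex(index, page, pageSize = 9) {
  const itemPage = Math.floor(index / pageSize);
  if (itemPage === page) return true;
  if (Math.abs(itemPage - page) === 1) return 'staged';
  return false;
}

// Build an updateParticipants() payload from an ordered list of remote
// session IDs (app state, not something daily-js provides in this order).
function buildStagedPageSubscriptions(remoteIds, page, pageSize = 9) {
  const updates = {};
  remoteIds.forEach((id, index) => {
    const video = videoStateForIndex(index, page, pageSize);
    updates[id] = {
      setSubscribedTracks: { audio: true, video, screenVideo: video },
    };
  });
  return updates;
}

// Usage, assuming `call` is a joined Daily call object with automatic
// subscriptions turned off:
//   call.updateParticipants(buildStagedPageSubscriptions(remoteIds, currentPage));
```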

Controlling Quality: Understanding Video Simulcast Layers

In a peer-to-peer call between you and a few other people, each of you establishes a direct connection to every other participant to send and receive audio and video. These connections use a protocol that has built-in bandwidth estimation. That means that your browser can continuously analyze the quality of the network connection between you and each person you're talking to and keep adjusting the video bitrate to give you the best quality possible without dropping too many frames.

In a server call, your browser establishes just one of those peer-to-peer connections with the server, and the server forwards everyone else's media tracks to you. This has lots of benefits, but it makes it impossible to continuously tweak each person's video bitrate. Fortunately, modern browsers can efficiently encode video at multiple quality levels simultaneously (simulcast) using something called spatial layers. On a typical server call, Daily uses these default spatial layer quality settings for webcam video, described in more detail in this document:

1280x720, 30 fps, 600 kbps
640x360, 15 fps, 200 kbps
320x180, 10 fps, 80 kbps

Although you're sending three video streams to the server, the server is only sending you one stream for each participant you're seeing in a call.
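To get a feel for the numbers, here is a back-of-the-envelope sketch using the default layer settings listed above. It treats the listed bitrates as per-layer video targets and ignores audio and protocol overhead, so the results are rough estimates rather than measurements. The constant and function names are made up for this example.

```javascript
// Default webcam layers from the list above, indexed by layer number
// (0 = low, 1 = medium, 2 = high).
const DEFAULT_LAYERS = [
  { width: 320, height: 180, fps: 10, kbps: 80 },
  { width: 640, height: 360, fps: 15, kbps: 200 },
  { width: 1280, height: 720, fps: 30, kbps: 600 },
];

// Rough upstream video bitrate: you send all three layers to the server.
function estimateUploadKbps(layers = DEFAULT_LAYERS) {
  return layers.reduce((sum, layer) => sum + layer.kbps, 0);
}

// Rough downstream video bitrate: one stream per remote participant,
// all received at the same layer.
function estimateDownloadKbps(remoteCount, layerIndex, layers = DEFAULT_LAYERS) {
  return remoteCount * layers[layerIndex].kbps;
}

console.log(estimateUploadKbps()); // 880
console.log(estimateDownloadKbps(8, 2)); // 4800
console.log(estimateDownloadKbps(8, 0)); // 640
```

With eight remote videos on screen, the gap between the high layer and the low layer is roughly 4800 kbps versus 640 kbps under these assumptions.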

Receive-Side Video Quality

The bandwidth estimator can select lower spatial layers for participants with slower networks, but it doesn't know what you're doing with the video once you receive it. If you are displaying videos at a resolution smaller than 640x480 or so in your app, you can tell the call server to send you a lower quality video layer in order to reduce network contention.

To do this, you can call the updateReceiveSettings() function (new in daily-js 0.17.0), passing in a participant ID and an integer specifying the layer number. 0 is the very low bandwidth layer, 1 is the medium bandwidth layer, and 2 is the high bandwidth layer. More documentation on this function is available here.

call.updateReceiveSettings({
  'some-participant-id': { video: { layer: 0 } }
});
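If your layout knows how wide each video tile is rendered, you could pick the layer per participant from that width. The sketch below is one possible mapping, with thresholds chosen to match the default layer widths (320 and 640); both the thresholds and the helper names are assumptions for illustration, and you may want different cutoffs on high-density displays.

```javascript
// Hypothetical mapping from rendered tile width (in pixels) to layer number.
// 0 is the low-bandwidth layer, 1 is medium, and 2 is high.
function layerForTileWidth(widthPx) {
  if (widthPx <= 320) return 0;
  if (widthPx <= 640) return 1;
  return 2;
}

// Build an updateReceiveSettings() argument from a map of
// participant ID -> rendered tile width.
function buildReceiveSettings(tileWidthsById) {
  const settings = {};
  for (const [id, widthPx] of Object.entries(tileWidthsById)) {
    settings[id] = { video: { layer: layerForTileWidth(widthPx) } };
  }
  return settings;
}

// Usage, assuming `call` is a joined Daily call object:
//   call.updateReceiveSettings(
//     buildReceiveSettings({ 'some-participant-id': 240, 'another-id': 960 })
//   );
```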

You should wrap this code in a check that makes sure we are in media server mode ("sfu" in our internals, which is a standard WebRTC term that stands for "selective forwarding unit"). Also, ideally, wrap this in a try/catch block, too. :-)

if (rtcpeers && rtcpeers.sfu) {
  const c = rtcpeers.sfu.consumers[sender_daily_session_id + '/cam-video'];
  // 0 is the low-bandwidth layer, 1 is medium, and 2 is high
  rtcpeers.sfu.setConsumerLayer(c, 0);
}
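The same guard can also be sketched with public call object methods. This version assumes that getNetworkTopology() resolves to an object whose topology field is 'sfu' in media server mode, and that updateReceiveSettings() returns a promise; treat both as assumptions to check against the daily-js reference for your version. The wrapper name is hypothetical.

```javascript
// Hypothetical wrapper: request a video layer for one participant, but only
// in media server (SFU) mode, and never let a failure break the app.
// Resolves to true if the request was made, false otherwise.
async function setVideoLayerIfSfu(call, participantId, layer) {
  try {
    const { topology } = await call.getNetworkTopology();
    if (topology !== 'sfu') return false; // only relevant in media server mode
    await call.updateReceiveSettings({
      [participantId]: { video: { layer } },
    });
    return true;
  } catch (err) {
    console.warn('Could not update receive settings:', err);
    return false;
  }
}

// Usage, assuming `call` is a joined Daily call object:
//   setVideoLayerIfSfu(call, 'some-participant-id', 0);
```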