Video Support

Gameface can play video/audio through the standard <video> element like so:

    <video width="320" height="240" src="my_movie.webm">
    </video>

The video player will:

  • play opaque videos
  • play transparent videos
  • send audio data to the engine for playing (it won’t play it by itself)
  • currently work only with videos
    • in the WebM format
    • with 1 video track encoded with VP8 or VP9
    • with 1 audio track encoded in Vorbis

The supported standard attributes are “src”, “autoplay”, “loop”, “muted” and “preload=auto”. The supported events are “durationchange”, “ended”, “loadstart”, “seeking”, “seeked”, “volumechange”.
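As a sketch, wiring handlers for all of the supported events could look like this (the function and handler names are illustrative, not part of the Gameface API):

```javascript
// The event names the player supports, per the list above.
const SUPPORTED_EVENTS = [
  "durationchange", "ended", "loadstart", "seeking", "seeked", "volumechange",
];

// Attach one handler to every supported event of a <video> element.
function wireVideoEvents(video, onEvent) {
  for (const name of SUPPORTED_EVENTS) {
    video.addEventListener(name, () => onEvent(name));
  }
}
```

For example: wireVideoEvents(document.querySelector("video"), name => console.log("video event:", name));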

Distribution

The video player is distributed in a separate dynamic library to reduce binary size for users who don’t need it. The separate library is called MediaDecoders.[PlatformName].[PlatformExtension] (e.g. MediaDecoders.WindowsDesktop.dll on Windows). You need to place that extra binary next to the core Cohtml library in your distribution package and it will be automatically loaded.

Distribution on iOS

iOS doesn’t support dynamic libraries so it’s an exception to the rule above. Instead of placing a dynamic library, on iOS, you’ll need to statically link to either libMediaDecoders.iOS.a or libMediaDecodersEmpty.iOS.a depending on whether you want to use the video feature or not.

How to encode videos

Gameface doesn’t have any special requirements for the video files as long as the format and the codecs used are supported. However video quality and playback performance depend highly on how the video was encoded. Using the latest versions of VPX encoders is highly recommended. Performance may suffer if old encoders are used or certain encoder features are disabled.

We recommend using the latest release version of ffmpeg for encoding your videos even if your preferred video editing software has an option to export in WebM. We have found ffmpeg to provide the best video quality and resulting decoding performance with its default VPX encoder settings, which makes ffmpeg easy to use, without the need to pass additional encoder options.

The basic usage of ffmpeg for transcoding is the following:

  • specify the input file with “-i”
  • choose a video codec with “-c:v” (either “libvpx-vp9” or “libvpx” for VP8)
  • specify a target bitrate for the video stream with “-b:v” (use “k” suffix for kilobits e.g. 512k or capital “M” suffix for megabits e.g. 3M)
  • choose an audio codec with “-c:a” (e.g. “libvorbis”)
  • specify a target bitrate for the audio stream with “-b:a”
  • finish the command with a filename whose extension is “webm”; ffmpeg will automatically set the output format accordingly

Example command for transcoding:

  ffmpeg -i VideoIn.mp4 -c:v libvpx-vp9 -b:v 1M -c:a libvorbis -b:a 128k VideoOut.webm

For better video quality results, we recommend two-pass encoding, in which you run ffmpeg twice with almost the same settings, except for:

  • in pass 1 and 2, use the “-pass 1” and “-pass 2” options, respectively.
  • in pass 1, output to a null file descriptor, not an actual file. (This will generate a data file that ffmpeg needs for the second pass.)
  • in pass 1, you need to specify an output format (with “-f”) that matches the output format you will use in pass 2.
  • in pass 1, you can leave audio out by specifying “-an”.

Example commands for two-pass encoding:

  ffmpeg -i VideoIn.mp4 -c:v libvpx-vp9 -b:v 1M -an -pass 1 -f webm NUL
  ffmpeg -i VideoIn.mp4 -c:v libvpx-vp9 -b:v 1M -c:a libvorbis -b:a 128k -pass 2 VideoOut.webm

You can refer to the ffmpeg documentation on how to use it also to adjust resolution, framerate, audio channels and other media properties.

Transparent video support

The basic authoring process of transparent videos goes as follows:

  • Export a video with an alpha channel, if such a format is available in your video editing tool (e.g. QuickTime PNG with RGBA), OR export a sequence of transparent PNGs

  • Feed the transparent video OR the sequence of PNGs (e.g. -i sequence-%05d.png) as input to ffmpeg. The ffmpeg commands are the same, with the addition of the “-pix_fmt yuva420p” switch, which enables transparency for VPX.

Video playback performance

The video playback performance depends on the amount of data that needs to be processed and the amount of video data that needs to be displayed. The size of the processing data is determined by the bitrate, and the size of the display data is determined by the resolution. Using transparency in videos adds additional data for processing and an additional channel to display.

  • Bitrate - lower bitrate yields better performance
  • Resolution - lower resolution yields better performance
  • Transparency - adds up to 60% more data for processing, depending on how complex the alpha masking is, and 25% more display data

Most video compression algorithms reuse the frame data from the previous frame and process only the changes, as this proves to be a very efficient compression technique for most produced videos. This makes the size of the processing data depend heavily on the amount of motion and the number of scene switches present in the video. Due to the dynamic nature of most videos, the amount of data to process varies from frame to frame.

The bitrate is the number of bits that are processed in a unit of time. Video data rates are given in bits per second. There are two methods of compression:

  • Using constant bitrate (CBR), the raw data will be compressed as much as needed to meet the target bitrate value for each frame. This means that the video quality will vary depending on the dynamics of the scene. This method of compression ensures stable performance for the entire duration of the video, but it is considered wasteful and is best avoided.
  • Using variable bitrate (VBR), the raw data will be compressed by an amount calculated to meet the target bitrate as an average over the entire duration of the video. This means that the video quality will be constant during the entire video, but it may introduce performance issues in parts of the video where the bitrate is very high. We recommend setting an upper limit on the bitrate when using VBR (“-maxrate” in ffmpeg).
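For example, a VBR command that caps bitrate spikes might look like the following (the -maxrate and -bufsize values here are illustrative and should be tuned per video; ffmpeg's rate control generally expects -bufsize alongside -maxrate):

```shell
ffmpeg -i VideoIn.mp4 -c:v libvpx-vp9 -b:v 1M -maxrate 1.5M -bufsize 2M -c:a libvorbis -b:a 128k VideoOut.webm
```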

The resolution isn’t usually changed during playback, so it can be said that it has a fixed performance cost. Higher resolution or frame rate means more data to process, hence it will require a higher bitrate to preserve the video quality.

VP8 vs VP9 - We have found that, using the exact same encoding properties, VP8 gives better performance but produces worse video quality, while VP9 is slower but produces better video quality. If we compensate for the quality difference by adjusting the bitrate (higher for VP8 or lower for VP9), the performance difference evens out. Additionally, for the same perceivable quality, VP9 will produce smaller file sizes due to the lower bitrate. We recommend using VP9 when possible; VP8 is generally used for compatibility reasons.

Seek performance

Seeking a video is an unexpected event both for the video player from the buffering standpoint and for the media itself.

Because most video compression formats only store incremental changes between frames (except for keyframes), it is not possible to directly seek at any arbitrary point in the video stream. In order to display an arbitrary interframe (non-keyframe), the decoder must start with the nearest previous keyframe and apply the changes of all interframes to that point.

Simplifying this to plain data looks like this (K - keyframe, I - interframe): [154(K), +6(I), -10(I), +5(I), 212(K), -15(I), …]. If we seek to the third frame (-10(I)), in order to know its value we have to calculate 154 + 6 - 10 = 150.
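The accumulation above can be sketched in code (a toy model, not the actual decoder):

```javascript
// Toy model: keyframes ("K") store absolute values, interframes ("I") store deltas.
function decodeAt(frames, index) {
  // Walk back to the nearest preceding keyframe...
  let k = index;
  while (frames[k].type !== "K") k--;
  // ...then apply every interframe delta up to the requested frame.
  let value = frames[k].value;
  for (let i = k + 1; i <= index; i++) value += frames[i].value;
  return value;
}

const frames = [
  { type: "K", value: 154 }, { type: "I", value: +6 }, { type: "I", value: -10 },
  { type: "I", value: +5 }, { type: "K", value: 212 }, { type: "I", value: -15 },
];
decodeAt(frames, 2); // 154 + 6 - 10 = 150
```

The farther the target frame is from its preceding keyframe, the more interframes must be decoded first, which is exactly why seeking far from a keyframe is expensive.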

All videos start with a keyframe and encoders create additional keyframes only when that benefits the compression - when a keyframe is smaller in size than an interframe describing the changes (e.g. on scene switch). This means that you can have a video that doesn’t have a keyframe for several seconds. Seeking to such a point can require hundreds of frames decoded before you get the desired frame to show.

If you want to seek a video with good performance, make sure that the seek happens at a keyframe to avoid decoding more than one frame. We have added a custom attribute/property that does that automatically:

  • the cohfastseek attribute and the HTMLMediaElement.cohFastSeek property, which, when present or enabled, force the seek to happen on the nearest keyframe. For example, if we have keyframes at the time points [0s, 1.5s, 5s], a seek with currentTime = 1.4; will seek to 1.5s instead, and reading currentTime will report 1.5.
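The snapping behaviour can be sketched as a pure function (illustrative only, not the player's actual implementation):

```javascript
// Given the keyframe timestamps (e.g. from cohGetKeyframeTimestamps()),
// pick the one closest to the requested seek time.
function nearestKeyframe(keyframes, time) {
  return keyframes.reduce((best, k) =>
    Math.abs(k - time) < Math.abs(best - time) ? k : best);
}

nearestKeyframe([0, 1.5, 5], 1.4); // 1.5, as in the example above
```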

You can manually force the encoder to create keyframes at the desired seek points with ffmpeg by passing the following argument:

-force_key_frames 0:05:00,0:07:50,...

You can read the timestamps of the keyframes in a video by using another custom API:

  • HTMLMediaElement.cohGetKeyframeTimestamps() - returns an array of timestamps in seconds of all keyframes. This info is available only after video metadata is parsed, otherwise, an empty array is returned.

The video player requires at least two buffers to guarantee smooth playback as the data is processed asynchronously. During normal playback, the video player buffers future encoded frame data and future decoded frames and discards past data to minimize memory consumption.

The encoded frame data can get delayed due to slow I/O operations. The player buffers future encoded frame data to mitigate this and to be able to send the data to the decoders on time - just before it needs to display the frame. The decoded frame data needed to draw the frame on screen can also get delayed. Decoding delay can happen when the decoder cannot process the request immediately or when the frame being decoded takes more time than usual (different frames require different amounts of processing). The player buffers future decoded frame data to mitigate such delays and prevent frame misses.

In most cases, a seek will not land in a buffered region. This makes it vulnerable to delays both while obtaining the frame data and while decoding it. We have added the following API to mitigate the delay in obtaining the frame data:

  • HTMLMediaElement.cohPrebufferKeyframe(double timestamp) - pre-buffers the encoded keyframe data, so a seek to that point can immediately schedule decoding. This API will accept only timestamps that are keyframes. It can be used in conjunction with cohGetKeyframeTimestamps:
  // Prebuffer all keyframes
  video.cohGetKeyframeTimestamps().forEach(t => video.cohPrebufferKeyframe(t));

There is also a declarative way to preload keyframes to improve seek performance: add the “preload” attribute to the video element, which, when set to “auto”, will preload all known keyframes.

Audio support

Gameface does not play audio by itself. All audio data is decoded, converted to Pulse-code modulation (PCM) and passed to the engine for further processing. The PCM data is passed through several callbacks on the cohtml::IViewListener interface (look for the OnAudio* methods).

You can use your engine’s audio system to enqueue the PCM data in the sound buffers and get it playing. There are two reference implementations available:

  • one based on Windows’ XAudio2 and one based on OpenAL. Both can be found under Modules/AudioSystem/. The AudioSystem module provides an abstraction over both implementations and can also be used directly in the engine by including the source files and linking to the corresponding third-party dependencies.
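As a rough sketch of the data flow (the class and method names below are made up for illustration, not the actual cohtml::IViewListener or AudioSystem API), the decoded PCM chunks can be queued until the engine's audio thread drains them into sound buffers:

```javascript
// Minimal FIFO of decoded PCM chunks.
class PcmQueue {
  constructor() { this.chunks = []; }
  // Called from the OnAudio*-style callback with a chunk of decoded samples.
  push(samples) { this.chunks.push(samples); }
  // Called by the engine's audio system to fill its next sound buffer.
  pull() { return this.chunks.length ? this.chunks.shift() : null; }
}
```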

Take a look at the Sample_VideoPlayer sample in the distribution package for more info.

How to play a video with controls

In order to use controls, you need to include an additional JavaScript library, which you can find under Samples/uiresources/VideoPlayback. After you include the library, use the custom HTML element: <video-with-controls>.

  <video-with-controls id="myVideo" src="my_movie.webm" width="320px" height="240px">
  </video-with-controls>

To use the library you must:

  • copy the files from Samples/uiresources/VideoPlayback to your UI directory
    • video_controls_images folder
    • video_controls.js
    • video_controls.css
  • include video_controls.js at the beginning of your HTML file
  • include video_controls.css which contains the styles for the controls

The JavaScript API is the same as the one for HTMLMediaElement:

  • Supported attributes:
    • src
    • width
    • height
    • autoplay
    • loop
    • muted
  • Supported properties:
    • paused
    • ended
    • loop
    • autoplay
    • currentTime
    • duration
    • volume
    • muted
    • src
  • Supported methods:
    • play()
    • pause()
  • Events are not supported

If the API exposed by our video container element is not sufficient for your needs, you can get the video element itself:

  let video = document.getElementById("myVideo").querySelector("video");

Customizing controls

If you want to customize the way your video player looks, copy your custom images into the uiresources/VideoPlayback/video_controls_images folder and keep the file names the same.

If you want to customize the font size or another inherited property, add a CSS style to our video container element.

Media playback events

The media element supports the following standard events:

  • durationchange: The metadata has loaded or changed, indicating a change in duration of the media. This is sent, for example, when the media has loaded enough that the duration is known.
  • emptied: The media has become empty; for example, this event is sent if the media has already been loaded (or partially loaded), and the load() method is called to reload it.
  • ended: Sent when playback completes.
  • pause: Sent when the playback state is changed to paused (paused property is true).
  • play: Fired after the play() method has returned, or when the autoplay attribute has caused the playback state to change. Note that this does not necessarily mean there’s actual playback, since the network request can be delayed. See the playing event for more information.
  • playing: Sent when the media has enough data to start playing, after the play event.
  • seeked: Sent when a seek operation completes.
  • seeking: Sent when a seek operation begins.
  • timeupdate: The time indicated by the element’s currentTime attribute has changed. Note that this event is not fired during normal playback.
  • volumechange: Sent when the audio volume changes (both when the volume is set and when the muted attribute is changed).
  • error: Sent when there is a media error.

There are a few custom events not present in the standard:

  • cohplaybackstalled: Sent when a decoder is unable to keep up with the playback rate and cannot provide new frames quickly enough. This event is useful when trying to synchronize the video playback with other animations since the frame decoding is asynchronous and not tied in any way to the View timer. Usually, pausing any animations that need to be in sync with the video on the cohplaybackstalled event and resuming them on the cohplaybackresumed event is enough.
  • cohplaybackresumed: Sent when the video/audio decoder that previously stalled the playback is now caught up and provided new frames.

Showing video preview

Currently, Gameface does not render video frames unless the video is playing or has been seeked. This means that no image of the video will be shown initially. In order to show a preview, either play the video or simply seek to the desired time in seconds, e.g. videoElement.currentTime = 0;.

Resource handler

When playing a video, Gameface sends range requests to the client’s implementation of cohtml::IAsyncResourceHandler::OnResourceRequest to avoid loading the entire video file into memory and potentially running out of memory on platforms with limited hardware. You can check the reference implementation of how to handle range requests in the resource::ResourceHandler class, which is used across the samples.

Responding to range requests

Gameface always sends range requests for videos by including the Range header in the request. Range requests enable streaming resources over the network or from disk instead of loading the entire resource in memory. The client implementation should read the Range header to figure out which part of the resource is requested and then respond by providing the requested data and setting the status to 206 (HTTP Partial Content). If the client implementation does not set the status to 206, the response will be interpreted as a regular response, and Gameface will assume that the whole resource data is provided and won’t send more requests.
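Parsing the Range header in a client implementation might look like this (a sketch; the real handler works with your engine's request/response objects rather than plain strings):

```javascript
// Parse "bytes=start-end", "bytes=start-" or "bytes=-suffix" into byte offsets.
function parseRange(header, totalSize) {
  const m = /^bytes=(\d*)-(\d*)$/.exec(header);
  if (!m || (!m[1] && !m[2])) return null; // not a range we understand
  if (!m[1]) {
    // Suffix form "bytes=-N": the last N bytes of the resource.
    return { start: totalSize - Number(m[2]), end: totalSize - 1 };
  }
  const start = Number(m[1]);
  const end = m[2] ? Number(m[2]) : totalSize - 1; // open-ended "bytes=start-"
  return { start, end };
}

parseRange("bytes=0-65535", 1048576); // { start: 0, end: 65535 }
```

The handler would then read bytes start..end from the file, provide them as the response data, and set the status to 206.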

End of stream handling

When the client implementation responds with partial content Gameface reads the Content-Range header in the response and is specifically interested in the range-end and size directives to determine whether the end of the stream is reached. The client implementation is expected to provide a valid Content-Range header in the response, otherwise Gameface will issue an out-of-bounds request when the end of the stream is reached.
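The end-of-stream check described above can be sketched like so (an illustrative helper, not Gameface's internal code):

```javascript
// A Content-Range such as "bytes 983040-1048575/1048576" carries
// range-end = 1048575 and size = 1048576.
function reachedEndOfStream(contentRange) {
  const m = /^bytes (\d+)-(\d+)\/(\d+)$/.exec(contentRange);
  if (!m) return false; // malformed header: cannot tell
  return Number(m[2]) === Number(m[3]) - 1; // range-end is the last byte
}

reachedEndOfStream("bytes 983040-1048575/1048576"); // true
```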

Lifetime of video resources

Lifetime flow in <video>:

  1. Create <video> element with a video resource specified by the src attribute.
  2. The element will add a reference to the resource.
  3. The reference will be removed when:
    • The src attribute is changed.
    • The Garbage Collector destroys the element.

You can release the video resource reference when needed by changing the src attribute.

For example, let’s say you want the resource reference to be released when the <video> element is detached from the DOM tree. In that case, you may use the <video-with-controls> custom element and unset the src attribute inside the disconnectedCallback function, like so:

// JavaScript
class CohVideo extends HTMLElement {
  // ...
  disconnectedCallback() {
    this.videoElement.setAttribute("src", "");
  }
  // ...
}