QtMultimedia, FFmpeg, GStreamer: comparing multimedia frameworks

Update Feb 7, 2019: despite being written ten years ago and last updated three years ago, this article’s content remains correct and still describes the actual situation with the multimedia frameworks.

Over the last few years I have developed several multimedia applications. The applications were open-source, free and cross-platform, and therefore needed multimedia frameworks that were also open-source, free and cross-platform. During those years I used the following frameworks: the native Qt multimedia framework (first Phonon in Qt4, then QMediaPlayer in Qt5), FFmpeg, and GStreamer.

In this article I will summarize my personal experience with each framework, and provide some guidance on where each one should and should not be used.

The summary for each framework includes the following:

  • Easiness: how easy it is to start using the framework for a developer who has never used it before. This is not about completely mastering the framework, but about grasping its basics and writing a reasonably complex application.
  • Documentation and support: how well is the framework documented for the developer? Is the documentation readable? Does it cover most usage scenarios, or do you have to dig into the source code? What kind of support is available, and how good is it?
  • Codec support: which codecs does the framework support natively?
  • Filter support: which filters (such as resizing, flipping, watermarking) does the framework provide?
  • Audio and video I/O: how well does the framework support different audio and video inputs and outputs on different platforms?
  • Cross-platform support: which platforms the framework supports officially. This generally means the framework provides official builds for a specific platform, and accepts bug reports for those builds. If the framework provides only the source code, but also provides build procedures/scripts for a particular platform and accepts bug reports for it, it is also considered supported.
  • Framework redistributable size: how much does your application size increase once you include this framework?

Qt Multimedia framework

This section covers the Qt Multimedia classes, such as QMediaPlayer.

Summary

  • Easiness: very easy if you have worked with Qt before, moderate otherwise.
  • Documentation and support: excellent documentation. It is a popular library, so good community support is available.
  • Codec support: differs; the framework only supports the codecs supported natively by each platform. MIDI playback is unsupported.
  • Filter support: no filters are available as part of the framework except basic playback speed control.
  • Audio and video I/O: basic input and output is provided, which should be good enough for most use cases.
  • Cross-platform support: excellent. Windows, Linux, OS X, Android, iOS and others.
  • Framework redistributable size: minimal assuming you’re already using Qt; moderate otherwise.

Details

The strength of the Qt Multimedia framework comes from it being part of the Qt library. It follows the same paradigm and API conventions as the rest of Qt, making it intuitive for developers with Qt experience. However, it is not difficult even for developers who have never used Qt before, because the feature set it exposes is limited. It also has the same great documentation which has been a trademark of Qt for so many years, and it works (more or less) on all supported platforms. Since it relies on native platform multimedia frameworks, the extra size it adds to your application is minimal (around 4 MB), so you can deliver a relatively small application. This is of course if you’re already using Qt; otherwise you’d need to bring in Qt itself, and the size goes up to 40 MB.

However, the Qt developers decided to rely on the native multimedia frameworks of each platform instead of bundling the relevant decoders (such as MP3) into the library. This makes it possible to license Qt under a dual-licensing scheme, and keeps it free from known patent encumbrances. It also makes it very difficult to provide a consistent playback experience across platforms: in fact, there is no single compressed audio format which is guaranteed to be supported on all supported platforms. This makes it difficult to create, for example, cross-platform voice/video chat applications.

Due to its reliance on native frameworks, Qt Multimedia is exposed to their bugs. This sometimes leads to a situation where the same MP3 file plays fine on some supported platforms, but refuses to play on another supported platform; for example see here. Native frameworks are also often poor at handling corrupted media files, with the odd result that your application cannot play a music file on Windows while other applications can play the same file.

The feature set is also pretty basic: there is no way to alter or process the decoded sound, and only on certain platforms do you have read-only access to raw sample data.

Recommendation

This framework is recommended if all of the following are true:

  • Your application already uses Qt;
  • You need nothing more complex than simple audio or video playback;
  • Uniform codec support across platforms is not important, and you’re ok with supporting “just some formats”;
  • Overall application size is extremely important (and you’re already using Qt).

FFmpeg framework

Summary

  • Easiness: difficult. The API is huge, its use often non-intuitive, and many important concepts are not explained clearly.
  • Documentation and support: average. There is some documentation, but you will sometimes end up reading the headers, and even the source code. The mailing list is active, but newbie or vague questions might not attract attention, so you have to be very specific.
  • Codec support: excellent. If compiled with everything enabled, you can assume that every known format could be decoded, and a large number of formats is supported for encoding. Does not depend at all on platform-provided codecs. MIDI playback is unsupported.
  • Filter support: excellent. A lot of filters of many kinds are available, from simple rotation and flipping to watermarking.
  • Audio and video I/O: not provided by this framework, and typically other SDKs (such as SDL) are used for that.
  • Cross-platform support: limited. The project does not provide or release any official builds, so this is difficult to assess. Linux could be considered officially supported; there are Windows, OS X, and Android builds maintained by volunteers, but those builds are not directly supported or endorsed by the FFmpeg project.
  • Framework redistributable size: minimal to moderate; it depends heavily on which codecs you want to support, and thus on which libraries are included in the build.

Details

The main strength of the FFmpeg framework lies in its support for a wide variety of codecs (decoders and encoders) and filters. Virtually any playable format is supported, including formats you have likely never heard of, such as the sound formats of certain old computer games. It also supports a large number of encoders, and is generally among the first adopters of new formats. It is also very tolerant of semi-corrupted but still playable files: once my application switched to FFmpeg, there were zero “music file fails to play” bug reports.

Its filter selection is also very broad, including both basic filters such as deinterlacing, and advanced filters such as drawing a watermark on a video. Also you have access to the audio/video raw buffers all the time, so adding any new effect or filter is very easy.
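As a rough illustration of how such filters are combined, here is a minimal Python sketch that builds a linear filter chain description of the kind accepted by the ffmpeg command-line tool’s -vf option. The helper names (filter_chain, ffmpeg_command) and the file names are made up for this example; yadif, hflip and scale are existing FFmpeg filters. The command is only constructed, not executed.

```python
import shlex


def filter_chain(filters):
    """Join (name, options) pairs into a linear filter chain such as 'yadif,hflip'."""
    parts = []
    for name, options in filters:
        if options:
            # Options are written as key=value pairs separated by colons.
            args = ":".join(f"{key}={value}" for key, value in options.items())
            parts.append(f"{name}={args}")
        else:
            parts.append(name)
    return ",".join(parts)


def ffmpeg_command(src, dst, filters):
    """Build an argument list for the ffmpeg tool (hypothetical helper, nothing is run)."""
    return ["ffmpeg", "-i", src, "-vf", filter_chain(filters), dst]


chain = [
    ("yadif", {}),                   # deinterlace
    ("hflip", {}),                   # mirror horizontally
    ("scale", {"w": 640, "h": -2}),  # 640 px wide, height derived from the aspect ratio
]

# Placeholder file names; print the command instead of running it.
print(shlex.join(ffmpeg_command("input.mp4", "output.mp4", chain)))
```

A watermark is not a linear chain, because the overlay filter takes a second input; with the command-line tool that case goes through -filter_complex instead of -vf.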

The weakest part of FFmpeg is its documentation. The official documentation is just the API reference; there is no official tutorial or introduction of any kind, so at the beginning your main reference will be the source of the ffmpeg command-line tool. The framework comes with several well-documented samples, but beyond that you’re on your own, and a significant share of your information will come from searching the FFmpeg mailing list. This makes it the most difficult framework to master, and it also requires the largest amount of code of all three frameworks. This is not a “set up and forget” type of framework: your application deals with buffers all the time, directly calling frame readers, deciding whether a packet contains audio or video, calling the relevant decoders, converters/resamplers and so on. Audio and video input and output are also not FFmpeg’s responsibility. It is purely a codec and filter framework, so something else must be used for that, with most developers choosing SDL.
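To give a feel for the kind of loop your application ends up owning, here is a toy Python model of the read-and-dispatch cycle. Nothing in it is FFmpeg API: Packet, decode_all and the lambda “decoders” are hypothetical stand-ins. In the real C API the corresponding steps are calls such as av_read_frame(), avcodec_send_packet() and avcodec_receive_frame(), plus the buffer management this sketch leaves out.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical stand-in for a demuxed packet (not an FFmpeg type)."""
    stream_index: int
    payload: bytes


def decode_all(packets, stream_kinds, decoders):
    """Route each packet to the decoder for its stream and collect the frames.

    stream_kinds maps a stream index to "audio" or "video";
    decoders maps a kind to a callable that turns a payload into a list of frames.
    """
    frames = {"audio": [], "video": []}
    for packet in packets:
        kind = stream_kinds.get(packet.stream_index)
        if kind is None:
            continue  # a stream we did not select, e.g. subtitles
        frames[kind].extend(decoders[kind](packet.payload))
    return frames


# Toy decoders: each "decodes" a payload into a single descriptive string.
decoders = {
    "audio": lambda payload: [f"audio-frame({len(payload)} bytes)"],
    "video": lambda payload: [f"video-frame({len(payload)} bytes)"],
}

packets = [
    Packet(0, b"\x00" * 4),
    Packet(1, b"\x00" * 2),
    Packet(2, b"sub"),       # belongs to a stream we ignore
    Packet(0, b"\x00" * 8),
]

frames = decode_all(packets, {0: "video", 1: "audio"}, decoders)
print(frames)
```

In a real player the frames would then go through a converter or resampler and on to whatever output library you chose, which is the part FFmpeg leaves to you.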

Until recently there was also no release scheme, which made troubleshooting difficult, as the only way to know which version of FFmpeg was being used was to bundle it with your application.

The lack of official cross-platform support also makes it difficult to commit to this framework. While there are plenty of unofficial builds, the lack of an official commitment to particular platforms leaves you at the mercy of the volunteers who make those builds, or forces you to master the build process yourself. That is a valuable skill, but not when time is tight and you’d rather be doing something else.

However if you start doing your own builds, you will be able to control the framework size directly and very effectively, and you can make it very small by only including the necessary libraries. From the space usage viewpoint this is surely the most effective framework.

Recommendation

This framework is recommended if all of the following are true:

  • “Making it work today” is not the top priority (there’s a rather steep learning curve);
  • You don’t mind writing a lot of extra code (compared to other frameworks) in exchange for full and direct control over the audio-video playback or decoding pipeline;
  • You only need Linux/Windows support or are willing to set up the builds;
  • Your application needs to support certain formats on each platform;
  • You already have, or will take care of audio and video output yourself.

GStreamer framework

Summary

  • Easiness: easy to moderate. Basic stuff is easy, and accessing the advanced stuff does not require jumping through hoops, as the architecture is clear and straightforward.
  • Documentation and support: good. There is an API reference, tutorials and general documentation. There are not many communities around GStreamer, but its development mailing list is very active, and developers are very helpful even with newbie questions.
  • Codec support: depends on how it was compiled; generally every known format is supported for decoding, and a large number of formats is supported for encoding. MIDI playback is supported but depends on 3rd party software, so realistically it is only semi-supported on Linux.
  • Filter support: excellent; better than FFmpeg. A lot of filters of many kinds are available, from simple rotation and flipping to watermarking, and even some exotic ones such as voice removal.
  • Audio and video I/O: excellent. Supports all known audio and video outputs and inputs, and it is very unlikely you’d need anything else.
  • Cross-platform support: Linux, Windows, Mac OS X, Android, iOS.
  • Framework redistributable size: huge indeed.

Details

The GStreamer framework seems to be the framework of choice for many modern applications. It is very robust, handles a large number of media formats (including both encoding and decoding), and supports video/audio output on all supported platforms. As a result, this framework alone can handle all your media needs, and will likely offer more functionality than you would ever need.

Its documentation is good, as it includes a comprehensive overview which teaches you the framework concepts. The framework also includes official tutorials which show different aspects of using it, and the overall quality of those is good. The framework maintainers are also active on the GStreamer-devel mailing list and answer questions promptly, making it the best supported media framework for free software developers.

This framework is easy to use if you’re familiar with GLib; it will be a little more complex if you are not.

The main GStreamer concept is a “pipeline” into which you insert blocks called “elements”. A pipeline might look like the following:

  • an element reading the encoded audio data (called “source”);
  • an optional element decoding the encoded audio data into raw audio stream (if the source data is encoded, such as MP3);
  • an optional element converting the audio from one raw format to another. This is necessary if the original data is in a format your sound card cannot understand. For example it could be sampled at 22 kHz with 4 channels, but your sound card only supports a 48 kHz sample rate with two channels maximum.
  • an element where the processed audio data is finally consumed (called “sink”). This is typically an element communicating directly with the sound card or platform-specific multimedia API (such as PulseAudio or CoreAudio), but may be anything, for example an element writing the audio into a file on disk.

Any number of elements may be added to this pipeline. For example, there are volume elements (to change the output volume), or equalizer elements (changing the audio frequencies). There are even exotic elements such as “audiokaraoke”, which removes audio from the center channel, providing a voice removal effect.
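The pipeline described above can be written down as a textual description of the kind understood by the gst-launch-1.0 tool. The following Python sketch only assembles that string; the helper name pipeline_description and the file name are made up for this example, and the element choice (filesrc, decodebin, audioconvert, audioresample, autoaudiosink) is one common combination rather than the only one.

```python
def pipeline_description(path, effects=()):
    """Return a gst-launch-1.0 style description of an audio playback pipeline."""
    elements = [
        f'filesrc location="{path}"',  # source: reads the encoded file
        "decodebin",                   # decoder: picks a suitable decoder automatically
        "audioconvert",                # converts between raw sample formats
        *effects,                      # optional processing elements
        "audioresample",               # converts the sample rate if the device needs it
        "autoaudiosink",               # sink: plays through the platform's audio output
    ]
    # Elements are linked in order with the "!" separator.
    return " ! ".join(elements)


print(pipeline_description("song.mp3", effects=["volume volume=0.5", "audiokaraoke"]))
# prints: filesrc location="song.mp3" ! decodebin ! audioconvert ! volume volume=0.5 ! audiokaraoke ! audioresample ! autoaudiosink
```

The resulting string can be tried with gst-launch-1.0, or handed to gst_parse_launch() from application code. Depending on the effect elements used, format negotiation may require an additional audioconvert, and a path containing quote characters would need escaping, which this sketch does not attempt.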

As a programmer, your use of this framework consists of creating the pipeline, providing the data to the source, and starting it. There are controls available for changing the pipeline dynamically, such as adding or removing the elements, as well as typical player controls (seek, pause). The framework can handle any kind of multimedia data, from audio and video to subtitles and even teletext. The data flow inside the constructed pipeline is generally out of your control, unless you perform special steps to have access to it.

The main strengths of this framework are its ease of use, considering the functionality it provides, and its maturity. It is used by free desktop environments such as GNOME and KDE, as well as by several commercial apps such as Skype for Linux. It is also very flexible; there is probably nothing you can’t do with this framework. Its documentation is vast, good and up-to-date, and you can certainly master most of the framework simply by reading it. Its developers are very active on the mailing list, and answer both advanced and newbie questions nicely and correctly, so if you hit a roadblock, you can get help easily.

Two disadvantages of this framework are its size and the lack of supported Qt bindings. Unless you’re prepared to compile your own builds, or manually track shared library dependencies (to ensure all necessary libraries are packaged), your application will be larger by 30-120 MB. This might or might not be a concern in your case. The lack of supported Qt bindings (there are some, but their support status is unknown) is only relevant if you already use Qt.

Recommendation

This framework is recommended if all of the following are true:

  • “Making it work tomorrow” is acceptable (there’s some learning curve, although far smaller than, for example, FFmpeg’s);
  • You don’t mind writing some extra code in exchange for significant control over the audio-video playback or decoding pipeline;
  • Your application needs to support certain formats on each platform;
  • Project binary size is either not important, or your project is already large enough that adding this framework would make no significant difference.

Conclusion

This is all I can say about those frameworks. I also considered other frameworks such as CoreAudio, SDL etc, but they were either limited to one main platform, with no usage track records on other platforms, or they do not have features to be considered full multimedia frameworks – SDL, for example, can play (some) audio, but cannot process or encode it.

Please let me know in the comments if you see any errors in this article, or if there is another framework you would like to share with me.

7 Responses to QtMultimedia, FFMpeg, Gstreamer: comparing multimedia frameworks

  1. Arsene says:

    Hi, thanks for your sharing.
    Do you know which of them is able to use hardware codec in any platform? And how?

    • George says:

      GStreamer can through its VAAPI plugins. Qt also can on Linux, because on Linux it uses GStreamer. Not sure about native platforms. FFMpeg I do not know.

  2. Dharma kc says:

    Could you please reply to this question. I think you can reply to my problem. Thank you.
    https://stackoverflow.com/questions/50016019/ffmpeg-vs-libav-vs-libvlc-vs-gstreamer-as-of-2018

  3. ashok says:

    hi ,
    thanks for the information.
    can we use qt without gstreamer for streaming videos from 4 sources??

    • George says:

      Yes, you can, but you will be quite limited in the streaming formats those sources must support.

  4. Davy says:

    Hi,
    As far as I understood things, there is a dependency between those frameworks. QtMultimedia (at least on linux) is built on (and requires) GStreamer (the gstreamer1.0-libav plugin to be precise) which on its turn is built on (and requires) FFmpeg.
    This also explains the increase in difficulty as you stated it.
    Or am I missing something?

    • George says:

      This is partially correct. Indeed QtMultimedia on Linux depends on GStreamer, although on other platforms it does not depend on it. However GStreamer does not depend, and does not require, FFMpeg. It is just one of GStreamer plugins (gstreamer-plugins-libav), and it is completely optional. You don’t need it, for example, to play MP3 files or h264 videos.