Fabien Sanglard's non-blog

  




Progressive playback: An atom story.



November 15th, 2011

Introduction


I have been doing a lot of work with video containers recently, especially figuring out interoperability between iOS/Android and optimizing progressive playback. In particular, it seems Android devices fail to perform progressive playback on certain files while iOS and VLC succeed.

Why?

As usual, understanding things all the way down proved extremely worthwhile.


Analysis


A movie file is called a container. There are several kinds of containers, but the most common on mobile platforms are:


Within the container, data is organized as "atoms". As you can see in the drawing on the left, a typical movie container features four atoms:

  1. ftyp atom: The magic number part of the file. The body of this atom also contains the branding and version of the container format. With QuickTime/MOV it is always "qt  ".
  2. moov atom: The metadata, containing the description of the codecs used in the mdat atom. It also contains the sub-atoms "stco" and "co64", which hold absolute file offsets of the media data chunks stored in the mdat atom.
  3. wide atom: A dirty hack explained later.
  4. mdat atom: The interleaved compressed audio and video streams. It accounts for about 95% of the file size. Most of the time the codecs used are H.263 for video and AAC for audio.

Note: Why is the wide atom a dirty hack? Because its only purpose in life is to be overwritten. Atom sizes are coded on 4 bytes, so an mdat atom can be at most 4 GB. To grow further, the mdat atom header can be moved up by 8 bytes, over the padding that the wide atom provides, and replaced with a special atom header that codes its size on 8 bytes instead of 4, raising the limit from 4 gigabytes to 9 exabytes.
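To make the header layout concrete, here is a minimal Python sketch of how a top-level atom header could be read, including the 8-byte size special case, and how a wide atom could be overwritten by a widened mdat header. The function names (read_atom_header, list_top_level_atoms, widen_mdat, make_atom) and the tiny synthetic file are illustrative assumptions, not taken from any particular player or tool.

```python
import struct

def read_atom_header(buf, offset):
    """Return (type, header_size, atom_size) for the atom starting at offset."""
    size, atom_type = struct.unpack_from(">I4s", buf, offset)
    header_size = 8
    if size == 1:
        # Special case: the real size follows the type as a 64-bit integer.
        (size,) = struct.unpack_from(">Q", buf, offset + 8)
        header_size = 16
    elif size == 0:
        # A size of 0 means the atom runs to the end of the file.
        size = len(buf) - offset
    return atom_type.decode("latin-1"), header_size, size

def list_top_level_atoms(buf):
    """Return [(type, offset, size), ...] for every top-level atom in buf."""
    atoms, offset = [], 0
    while offset + 8 <= len(buf):
        atom_type, header_size, size = read_atom_header(buf, offset)
        if size < header_size:
            break  # corrupt size field: stop rather than loop forever
        atoms.append((atom_type, offset, size))
        offset += size
    return atoms

def widen_mdat(buf, wide_offset):
    """Overwrite an 8-byte 'wide' atom and the 8-byte 'mdat' header that
    follows it with a single 16-byte 'mdat' header using a 64-bit size."""
    wide_type, _, wide_size = read_atom_header(buf, wide_offset)
    mdat_type, mdat_header, mdat_size = read_atom_header(buf, wide_offset + 8)
    if (wide_type, wide_size, mdat_type, mdat_header) != ("wide", 8, "mdat", 8):
        raise ValueError("expected an 8-byte 'wide' atom followed by a 32-bit 'mdat'")
    out = bytearray(buf)
    # The atom now starts 8 bytes earlier, so it is 8 bytes larger.
    out[wide_offset:wide_offset + 16] = struct.pack(">I4sQ", 1, b"mdat", mdat_size + 8)
    return bytes(out)

def make_atom(atom_type, payload=b""):
    """Hypothetical helper: build an atom with a 32-bit size field."""
    return struct.pack(">I4s", 8 + len(payload), atom_type) + payload

# Synthetic file with placeholder payloads: ftyp, moov, wide, mdat.
sample = (make_atom(b"ftyp", b"qt  " + bytes(4)) + make_atom(b"moov", bytes(32))
          + make_atom(b"wide") + make_atom(b"mdat", bytes(100)))
print(list_top_level_atoms(sample))
print(list_top_level_atoms(widen_mdat(sample, 56)))
```

After widening, the wide atom no longer appears in the listing: its 8 bytes have become the first half of the 16-byte mdat header, and nothing else in the file had to move.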

Now when a file like this is accessed over HTTP, the player performs progressive playback as follows:

  1. Receives the "ftyp" atom and checks that the container format, version and branding are supported.
  2. Receives the "moov" atom, checks that the required codecs are available and uses the "stco" sub-atoms to start decoding the video and audio streams.
  3. Receives the "mdat" atom, buffers the content and makes it available so the codecs can decompress it.

Since the "ftyp" and "moov" atoms are only a few KB, progressive playback can start within a few seconds.


Problem


To start playing a movie file right away, the player needs the metadata contained in the "moov" atom. If the atoms are ordered as previously described, everything works as expected... but most video editors (ffmpeg, QuickTime, Flash Video) generate atoms in the wrong order (as seen on the right), with the "moov" atom last.
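The cost of the atom order can be sketched in a few lines of Python. Assuming a player that reads the file sequentially over a single connection, the hypothetical function below returns how many bytes must arrive before the "moov" atom is complete. Real players also need some "mdat" data before showing a frame, so treat this as a lower bound.

```python
import struct

def bytes_needed_before_playback(buf):
    """Return the offset at which the 'moov' atom ends, i.e. how much of the
    file a sequential reader must receive before it has the metadata.
    Returns None if no top-level 'moov' atom is found."""
    offset = 0
    while offset + 8 <= len(buf):
        size, atom_type = struct.unpack_from(">I4s", buf, offset)
        if size == 1:
            # 64-bit size stored right after the atom type.
            (size,) = struct.unpack_from(">Q", buf, offset + 8)
        elif size == 0:
            # Atom extends to the end of the file.
            size = len(buf) - offset
        if size < 8:
            break  # corrupt size field
        if atom_type == b"moov":
            return offset + size
        offset += size
    return None
```

With "moov" right after "ftyp" the result is a few KB. With "moov" last it is the size of the whole file, which is exactly the behavior described above.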

If you try to load a file structured like this on an Android device over the internet, you get an error message like this:



Progressive playback is not possible and you have to download the entire file before you can start watching the video. But an iOS device or VLC is able to start playing the same file within seconds.

How?



The answer is pretty obvious and can be observed via Wireshark:



iOS and VLC open a second HTTP connection to the server using the not-so-well-known "Range" HTTP header:

  1. The first HTTP request features a "Range: bytes=0-" HTTP header field, so the movie is downloaded from the start.
  2. As soon as the player detects an "mdat" atom that was not preceded by a "moov" atom, it opens a second connection with a "Range: bytes=4726467-" HTTP header field. This skips most of the file and retrieves the "moov" atom located at the end.

Thanks to the second connection, the "moov" atom is retrieved faster and progressive playback can start right away without waiting for the entire file to be downloaded.
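How a player could come up with that offset can be sketched as follows. The "mdat" header announces the atom's size, so the offset of whatever follows it is known as soon as the first few bytes have arrived. This is a minimal sketch under that assumption, not the actual iOS or VLC implementation; the function names and the example.com URL are hypothetical.

```python
import struct
import urllib.request

def moov_probe_offset(prefix):
    """Given the first bytes received, return the absolute offset of the atom
    that follows 'mdat', or None if 'moov' shows up first (no second request
    needed) or the offset cannot be determined from the bytes received so far."""
    offset = 0
    while offset + 8 <= len(prefix):
        size, atom_type = struct.unpack_from(">I4s", prefix, offset)
        if size == 1:
            if offset + 16 > len(prefix):
                return None  # 64-bit size has not arrived yet
            (size,) = struct.unpack_from(">Q", prefix, offset + 8)
        if atom_type == b"moov":
            return None
        if size < 8:
            return None  # size 0 (atom runs to end of file) or corrupt
        if atom_type == b"mdat":
            return offset + size
        offset += size
    return None

def range_request(url, offset):
    """Build (but do not send) a request for everything from offset onward."""
    return urllib.request.Request(url, headers={"Range": f"bytes={offset}-"})

# Example: a 16-byte 'ftyp' followed by an 'mdat' header announcing 4726451 bytes.
prefix = struct.pack(">I4s8s", 16, b"ftyp", b"qt  " + bytes(4))
prefix += struct.pack(">I4s", 4726451, b"mdat")
request = range_request("http://example.com/movie.mov", moov_probe_offset(prefix))
print(request.get_header("Range"))
```

Passing the request to urllib.request.urlopen would send it. A server that honors the header answers with "206 Partial Content" and only the tail of the file.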

Solution


The Android video player elects NOT to open a second connection but waits for the entire file to download. The only solution is to fix those files and reorder the atoms inside. This can be done:
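Whatever tool is used, the fix boils down to two steps: move the "moov" atom in front of the "mdat" atom, then add the size of the "moov" atom to every "stco"/"co64" entry, since those are absolute file offsets and the media data has just been pushed back by that amount. Below is a minimal Python sketch of the idea. It assumes the simple layout where "ftyp" is the first atom and "moov" the last, does not convert "stco" to "co64" if a 32-bit offset overflows, and uses function names that are purely illustrative.

```python
import struct

# Atoms that only contain other atoms on the path from 'moov' to 'stco'/'co64'.
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl"}

def iter_atoms(buf, start, end):
    """Yield (type, offset, header_size, size) for each atom in buf[start:end]."""
    offset = start
    while offset + 8 <= end:
        size, atom_type = struct.unpack_from(">I4s", buf, offset)
        header = 8
        if size == 1:  # 64-bit size follows the type
            (size,) = struct.unpack_from(">Q", buf, offset + 8)
            header = 16
        if size < header:  # also rejects size 0, which this sketch does not handle
            raise ValueError("unsupported or corrupt atom size")
        yield atom_type, offset, header, size
        offset += size

def shift_chunk_offsets(moov, start, end, delta):
    """Add delta to every 'stco'/'co64' entry found inside moov[start:end]."""
    for atom_type, offset, header, size in iter_atoms(moov, start, end):
        body = offset + header
        if atom_type in CONTAINERS:
            shift_chunk_offsets(moov, body, offset + size, delta)
        elif atom_type in (b"stco", b"co64"):
            fmt = ">I" if atom_type == b"stco" else ">Q"
            width = struct.calcsize(fmt)
            # Body layout: version/flags (4 bytes), entry count (4 bytes), entries.
            (count,) = struct.unpack_from(">I", moov, body + 4)
            for i in range(count):
                pos = body + 8 + i * width
                (value,) = struct.unpack_from(fmt, moov, pos)
                struct.pack_into(fmt, moov, pos, value + delta)

def move_moov_to_front(data):
    """Return a copy of data with 'moov' placed right after 'ftyp'."""
    atoms = list(iter_atoms(data, 0, len(data)))
    types = [a[0] for a in atoms]
    if b"moov" not in types or b"mdat" not in types:
        raise ValueError("need both a 'moov' and an 'mdat' atom")
    if types.index(b"moov") < types.index(b"mdat"):
        return data  # already in the right order
    if types[0] != b"ftyp" or types[-1] != b"moov":
        raise ValueError("sketch only handles 'ftyp' first and 'moov' last")
    ftyp_size = atoms[0][3]
    _, moov_offset, _, moov_size = atoms[-1]
    moov = bytearray(data[moov_offset:moov_offset + moov_size])
    # Everything after 'ftyp' moves back by exactly the size of 'moov'.
    shift_chunk_offsets(moov, 0, len(moov), moov_size)
    return data[:ftyp_size] + bytes(moov) + data[ftyp_size:moov_offset]
```

Forgetting the second step produces a file that starts loading right away but plays garbage, because every chunk offset would point into the wrong part of the file.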


Comments (10)


#1 - Daniel (NessDan) - 11/22/2011 - 23:32
Incredibly informative! When I got to the part about atoms, I decided to find a small video I took with my cell phone and open it up in a text editor and at first all I saw was gibberish, but before giving up I looked closer and I saw the FTYP! To be exact, I saw this: "ftyp3gp5" I was extremely happy after seeing that. I then looked for MOOV and WIDE but couldn't find them and instead saw MDAT. After that I just went back to reading and once I got to the 2nd part, I realized what could've happened and sure enough, the MOOV was at the bottom of the file!

There was one thing I was trying to understand which was how VLC and iOS open a 2nd connection with "Range: bytes=4726467-" as the HTTP header. Could you shed more light on what exactly this does? I'm not incredibly familiar with what's going on there.

Thanks for the information, I absolutely love learning new stuff like this. Keep it up!
#2 - Fabien Sanglard - 11/23/2011 - 12:07
@Daniel

The Range HTTP header makes it possible to start downloading a resource at a given offset. In the VLC/iOS case, it allows the player to skip the mdat atom, reach the moov atom and start playback immediately.
#3 - Daniel (NessDan) - 11/23/2011 - 15:33
Thanks, I understand it more now. So that means the number of bytes will be different depending on the size of the file.
#4 - Fabien Sanglard - 11/23/2011 - 15:35
Yes it does, for each file the number will be the offset to access the moov atom.
#5 - Luc Trudeau - 11/25/2011 - 09:05
Great post, killer stuff! When you say in your problem statement : "Most video editors" do you mean most video encoders?
Also, I'm surprised this post was not in your RSS feed :(
#6 - Fabien Sanglard - 11/25/2011 - 10:05
@Luc:

1/ Yes, this is what I meant.
2/ I did not put it in the RSS because I did not think it would interest a lot of people.
#7 - G Troupel - 11/26/2011 - 09:21
Shouldn't the WIDE chunk be after the FTYP chunk rather than after the MOOV, if its purpose is to extend the FTYP chunk ?
Or maybe I misunderstood you.
#8 - Fabien Sanglard - 11/26/2011 - 13:00
@G Troupel

The purpose of the WIDE atom is to allow the mdat header to move "back" 8 bytes and encode its length on 8 bytes (the original 4-byte size field is then set to 1 to indicate that the length is a special case). Hence WIDE must be just before the mdat atom.
#9 - PypeBros - 11/27/2011 - 06:13
neat use of HTTP range-request option ;)
#10 - Daniel Lew - 11/27/2011 - 09:17
Fascinating. I dealt with this issue years ago, and at the time the engineer working on it told me "some videos can be played progressively, some can't" with no further explanation. We ended up ditching the project for time, so we never got to the root of the problem. Thanks for explaining why this happens.

 

@2011