Minimizing the initial delay

Hello developers,

This is a new blog post about minimizing the initial delay in VideoKit.
As you may know, VideoKit uses the ffmpeg library under the hood, so we will focus on some ffmpeg options.

The initial latency is barely noticeable when playing a local file, but when streaming from a remote server it is very easy to see.

Now, let’s take a look at which ffmpeg functions are called until the connection is established. Please note that, for now, we are not interested in what happens after the connection is made (such as getting packets for buffering …).

The functions below are called in sequence:

  • 1. avformat_alloc_context
    (Allocates an AVFormatContext data object. AVFormatContext is the main data structure that holds information about the media file/network stream.)
  • 2. avformat_open_input
    (Opens an input stream and reads the header.)
  • 3. avformat_find_stream_info
    (Reads packets of a media file to get stream information.)
  • 4. av_dump_format
    (Prints detailed information about the input format and its streams.)
  • 5. avcodec_find_decoder
    (Finds a registered decoder, an AVCodec object, matching the given codec id.)
  • 6. avcodec_open2
    (Initializes the AVCodecContext to use the given AVCodec.)

Now we will measure the time spent in each of the functions above. Then we will set the probesize and analyzeduration ffmpeg options, which are the ones suggested for decreasing the connection time, and compare the results to see whether they have an effect on the connection duration. (BTW, you can find an article written by ffmpeg about this in here)

To time each function, we use a simple, old-fashioned approach:

NSDate *start = [NSDate date];
// call the method to be timed here
NSDate *methodFinish = [NSDate date];
NSTimeInterval executionTime = [methodFinish timeIntervalSinceDate:start];
NSLog(@"Execution Time: %f", executionTime);
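The same measurement can be sketched in plain C, which may be handy when timing the ffmpeg calls outside of Objective-C. This is a minimal sketch, not VideoKit code: now_seconds, time_step and sample_step are hypothetical names, and sample_step only sleeps for 20 ms as a stand-in for a real connection step. It reads a monotonic clock, so changes to the system time do not distort the result.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Returns the current monotonic time in seconds. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical stand-in for a connection step: sleeps for 20 ms. */
void sample_step(void)
{
    struct timespec pause = { 0, 20L * 1000L * 1000L };
    nanosleep(&pause, NULL);
}

/* Runs one step, prints how long it took, and returns the duration in seconds. */
double time_step(const char *name, void (*step)(void))
{
    assert(name != NULL && step != NULL);
    double start = now_seconds();
    step();
    double elapsed = now_seconds() - start;
    printf("%-28s %.3f s\n", name, elapsed);
    return elapsed;
}
```

Calling time_step("sample_step", sample_step) prints one line with the step name and its duration. To time a real call, wrap it in a function with the same void (*)(void) shape.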

Below are the results for a real-time stream, Alanya sahil (rtsp://, which is found in the VideoKit Sample project channels.

Method name                    duration (secs)

As seen, most of the time is spent in the avformat_find_stream_info method.

Now, let’s talk about the probesize and analyzeduration ffmpeg options.

probesize and analyzeduration accept integer values up to 2147483647. probesize is measured in bytes (minimum 32) and analyzeduration in microseconds. They limit how much data is read from the stream to determine its properties, so the larger these values are, the longer it takes to read that much data.

Also, please note that small values are OK only for known muxers; otherwise the connection may fail because there is not enough data about the stream.
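To get a feel for why larger values cost time, here is a rough back-of-the-envelope sketch in C. It assumes a constant stream bitrate, which real streams rarely have, and the helper names are hypothetical. FFmpeg documents probesize in bytes and analyzeduration in microseconds, so the second helper converts seconds to that unit.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds needed to receive probe_bytes at a constant bitrate (bits per second). */
double probe_read_seconds(int64_t probe_bytes, double bits_per_second)
{
    assert(probe_bytes >= 0 && bits_per_second > 0.0);
    return (double)probe_bytes * 8.0 / bits_per_second;
}

/* Converts seconds to microseconds, the unit used by analyzeduration. */
int64_t seconds_to_analyze_duration(double seconds)
{
    assert(seconds >= 0.0);
    return (int64_t)(seconds * 1000000.0 + 0.5);
}
```

For example, on an assumed 1 Mbit/s stream, reading 5000000 bytes takes about 40 seconds, while 32 bytes arrive almost instantly. In practice probing stops as soon as either limit is reached or enough stream information has been found, so these figures are only upper bounds.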

Let’s try the same stream with both options set to 32, the lowest value, and look at the results:

Method name                    duration (secs)

These options are very effective on this stream: the time spent in the avformat_find_stream_info method drops from 4.x to 2.11 seconds when the probesize and analyzeduration options are set to the lowest values.

In conclusion,

– avformat_open_input
– av_dump_format
– avformat_alloc_context
– avcodec_find_decoder
– avcodec_open2

-> These methods are always very fast; no noticeable time is lost in any of them.

avformat_find_stream_info is where the time goes.

-> avformat_find_stream_info can be made faster by reducing probesize and analyzeduration.
-> avformat_find_stream_info can fail at low values if the muxer is not a well-known one.

Finally, here is how to use these parameters.

In VideoKit,

in the VKPlayerController.m file, find the lines below in the - (void)play method
and add the two extra lines as shown:

//extra parameters
_decodeManager.avPacketCountLogFrequency = 0.01;
[_decodeManager setLogLevel:kVKLogLevelDisable];
_decodeManager.probeSize = 32; // add this line
_decodeManager.maxAnalyzeDuration = 32; // add this line

When using ffplay in a terminal,

$ ffplay rtsp:// -probesize 32 -analyzeduration 32

In source code that uses the ffmpeg library,

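// Allocate the format context that will hold the stream information
AVFormatContext* _avFmtCtx = avformat_alloc_context();
// Allow blocking I/O to be interrupted through a callback
_avFmtCtx->interrupt_callback.callback = decode_interrupt_cb;
_avFmtCtx->interrupt_callback.opaque = self;
// Set both limits before calling avformat_open_input
_avFmtCtx->probesize = 32; // in bytes
_avFmtCtx->max_analyze_duration = 32; // in microseconds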

Now, how quickly can the first video frame be shown?

We’ve minimized the connection duration, and now we want to show the first video frame as soon as possible. FFmpeg is ready to read packets from the stream once the connection is made, but VideoKit has a buffering mechanism to give the user smooth playback. By default it buffers 15 packets before it starts playing. This value is adjustable and can be set to 1, which is the minimum. With the value set to 1, VideoKit shows the video frame as soon as it gets the first packet.
_decodeManager.minFramesToStartPlaying = 1;
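As a rough illustration of what this setting can save, the C sketch below estimates how long it takes to collect the start-up buffer. It assumes packets arrive at a steady rate, which is a simplification, and buffering_delay_seconds is a hypothetical helper, not part of VideoKit.

```c
#include <assert.h>

/* Seconds spent waiting for min_packets to arrive at a steady packet rate. */
double buffering_delay_seconds(int min_packets, double packets_per_second)
{
    assert(min_packets >= 1 && packets_per_second > 0.0);
    return (double)min_packets / packets_per_second;
}
```

With an assumed 25 packets per second, the default of 15 packets costs about 0.6 seconds before playback starts, while a value of 1 costs about 0.04 seconds.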

Finally, our fine-tuning code in the - (void)play method in the VKPlayerController.m file becomes:

//extra parameters
_decodeManager.avPacketCountLogFrequency = 0.01;
[_decodeManager setLogLevel:kVKLogLevelDisable];
_decodeManager.probeSize = 32; // already added
_decodeManager.maxAnalyzeDuration = 32; // already added
_decodeManager.minFramesToStartPlaying = 1; // add this line

What about the server side?
FFmpeg released an article about reducing latency that focuses on some important points about optimizing streaming servers, please see here

Below are the key points taken from that article,
> You may be able to decrease initial “startup” latency by specifying that I-frames come “more frequently”. Basically for typical x264 streams, it inserts an I-frame every 250 frames. This means that new clients that connect to the stream may have to wait up to 250 frames before they can start receiving the stream (or start with old data). So increasing I-frame frequency (makes the stream larger, but might decrease latency).

> Sometimes audio codecs also introduce some latency of their own. You may be able to get less latency by using speex, for example, or opus, in place of libmp3lame.

> You will also want to try and decrease latency at the server side, for instance wowza hints

> You can also (if capturing from a live source) increase frame rate to decrease latency (which affects throughput and also i-frame frequency, of course). This obvious sends packets more frequently, so (with 5 fps, you introduce at least a 0.2s latency, with 10 fps 0.1s latency) but it also helps clients to fill their internal buffers, etc. more quickly.
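The arithmetic behind the I-frame and frame rate points above can be sketched in C. Both helpers are hypothetical and assume a constant frame rate and a fixed I-frame interval.

```c
#include <assert.h>

/* Time between two consecutive frames, in seconds. */
double frame_interval_seconds(double fps)
{
    assert(fps > 0.0);
    return 1.0 / fps;
}

/* Worst-case wait for the next I-frame when one is inserted
   every keyframe_interval_frames frames. */
double max_keyframe_wait_seconds(int keyframe_interval_frames, double fps)
{
    assert(keyframe_interval_frames > 0 && fps > 0.0);
    return (double)keyframe_interval_frames / fps;
}
```

frame_interval_seconds reproduces the figures in the quote: 0.2 seconds at 5 fps and 0.1 seconds at 10 fps. With the 250-frame interval mentioned above and an assumed 25 fps stream, a new client may wait up to 10 seconds for the next I-frame.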

That’s all for now. In the next blog post, we will discuss reducing latency in real-time streaming, such as streaming from a webcam.

Have fun!
