

FFmpeg and x264 Encoding Guide


x264 is an H.264/MPEG-4 AVC encoder. The goal of this guide is to show new users how to create a high-quality H.264 video.

There are two rate control modes usually suggested for general use: Constant Rate Factor (CRF) or two-pass ABR. Rate control is the method that decides how many bits are spent on each frame; it determines the file size and also how quality is distributed.

If you need help compiling and installing libx264 see one of our FFmpeg and x264 compiling guides.

Constant Rate Factor (CRF)

This method allows the encoder to attempt to achieve a certain output quality for the whole file when output file size is of less importance. This provides maximum compression efficiency with a single pass. Each frame gets the bitrate it needs to keep the requested quality level. The downside is that you can’t tell it to get a specific filesize or not go over a specific size or bitrate.

1. Choose a CRF value

The range of the quantizer scale is 0-51: where 0 is lossless, 23 is default, and 51 is worst possible. A lower value is a higher quality and a subjectively sane range is 18-28. Consider 18 to be visually lossless or nearly so: it should look the same or nearly the same as the input but it isn’t technically lossless.

The range is exponential, so increasing the CRF value +6 is roughly half the bitrate while -6 is roughly twice the bitrate. General usage is to choose the highest CRF value that still provides an acceptable quality. If the output looks good, then try a higher value and if it looks bad then choose a lower value.

Note: The CRF quantizer scale mentioned on this page only applies to 8-bit x264 (the 10-bit x264 quantizer scale is 0-63). You can see which you are using by referring to the ffmpeg console output during encoding (yuv420p or similar for 8-bit, yuv420p10le or similar for 10-bit). 8-bit is more common among distributors.

2. Choose a preset

A preset is a collection of options that will provide a certain encoding speed to compression ratio. A slower preset will provide better compression (compression is quality per filesize). This means that, for example, if you target a certain file size or constant bit rate, you will achieve better quality with a slower preset. Similarly, for constant quality encoding, you will simply save bitrate by choosing a slower preset.

The general guideline is to use the slowest preset that you have patience for. Current presets in descending order of speed are: ultrafast, superfast, veryfast, faster, fast, medium, slow, slower, veryslow, placebo. The default preset is medium. Ignore placebo as it is not useful (see FAQ). You can see a list of current presets with -preset help, and what settings they apply with x264 --fullhelp.

You can optionally use -tune to change settings based upon the specifics of your input. Current tunings include: film, animation, grain, stillimage, psnr, ssim, fastdecode, zerolatency. For example, if your input is animation then use the animation tuning, or if you want to preserve grain then use the grain tuning. If you are unsure of what to use or your input does not match any of the tunings then omit the -tune option. You can see a list of current tunings with -tune help, and what settings they apply with x264 --fullhelp.

Another optional setting is -profile:v, which will limit the output to a specific H.264 profile. This can generally be omitted unless the target device only supports a certain profile (see Compatibility). Current profiles include: baseline, main, high, high10, high422, high444. Note that usage of -profile:v is incompatible with lossless encoding.

As a shortcut, you can also have FFmpeg itself list all possible internal presets and tunes:

Note: Windows users should use NUL instead of /dev/null.
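One way to do this, sketched here as an assumption rather than taken from this page, is to ask the libx264 encoder for -preset help (which this guide mentions above) while feeding it a synthetic test source and discarding the output:

```shell
# Ask libx264 to list its presets; the encoder prints the valid values.
# The nullsrc test source and /dev/null output avoid needing real files.
ffmpeg -hide_banner -f lavfi -i nullsrc -c:v libx264 -preset help -f mp4 /dev/null
```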

3. Use your settings

Once you’ve chosen your settings, apply them to the rest of your videos if you are encoding more. This ensures they will all have similar quality.

CRF Example

Note that in this example the audio stream of the input file is simply stream copied over to the output and not re-encoded.
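A typical invocation for this method might look as follows (file names and the exact CRF/preset values are illustrative, not prescribed by this page):

```shell
# CRF encode: constant quality at -crf 23 with -preset slow;
# the audio stream is copied as-is rather than re-encoded.
ffmpeg -i input.mp4 -c:v libx264 -preset slow -crf 23 -c:a copy output.mp4
```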


This method is generally used if you are targeting a specific output file size and output quality from frame to frame is of less importance. This is best explained with an example. Your video is 10 minutes (600 seconds) long and an output of 50 MB is desired. Since bitrate = file size / duration:
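The arithmetic can be checked directly. Using 1 MB = 8192 kbit and reserving an assumed 128 kbit/s for an audio track (the audio rate is an illustrative choice, not from this page):

```shell
# 50 MB over 600 s: total bitrate in kbit/s (integer arithmetic).
total_kbps=$(( 50 * 8192 / 600 ))   # ~683 kbit/s total (682 after truncation)
# Reserve an assumed 128 kbit/s audio track; the remainder is video bitrate.
video_kbps=$(( total_kbps - 128 ))
echo "target: ${video_kbps}k video + 128k audio"
```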

Two-Pass Example

Note: Windows users should use NUL instead of /dev/null.
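A sketch of the two passes, assuming the roughly 555k video / 128k audio split that follows from the size arithmetic above (file names, preset, and the AAC encoder choice are illustrative):

```shell
# Pass 1: analysis only; audio disabled, output discarded.
ffmpeg -y -i input.mp4 -c:v libx264 -preset medium -b:v 555k -pass 1 -an -f mp4 /dev/null
# Pass 2: use the stats from pass 1 to hit the target average bitrate.
ffmpeg -i input.mp4 -c:v libx264 -preset medium -b:v 555k -pass 2 -c:a aac -b:a 128k output.mp4
```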

As with CRF, choose the slowest preset you can tolerate.

Also see Making a high quality MPEG-4 (“DivX”) rip of a DVD movie. It is an MEncoder guide, but it gives insight into why two-pass matters when you want to use every bit efficiently under a storage constraint.

Lossless H.264

You can use -qp 0 or -crf 0 to encode a lossless output. Use of -qp is recommended over -crf for lossless because 8-bit and 10-bit x264 use different -crf values for lossless. Two useful presets for this are ultrafast or veryslow since either a fast encoding speed or best compression are usually the most important factors. Most non-FFmpeg based players will not be able to decode lossless (but YouTube can), so if compatibility is an issue you should not use lossless.

Lossless Example (fastest encoding)
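A sketch of the fastest lossless invocation (file names are illustrative; MKV is used here simply as a flexible container):

```shell
# Lossless output, prioritizing encoding speed.
ffmpeg -i input.mp4 -c:v libx264 -preset ultrafast -qp 0 output.mkv
```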

Lossless Example (best compression)
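A sketch of the best-compression lossless invocation (same caveats as above; expect a much longer encode):

```shell
# Lossless output, prioritizing compression ratio (much slower).
ffmpeg -i input.mp4 -c:v libx264 -preset veryslow -qp 0 output.mkv
```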

Overwriting default preset settings

You can overwrite default preset settings with the x264opts option, the x264-params option, or by using the libx264 private options (see ffmpeg -h encoder=libx264). This is not recommended unless you know what you are doing. The presets were created by the x264 developers and tweaking values to get a better output is usually a waste of time.


Additional Information & Tips

ABR (Average Bit Rate)

This provides something of a “running average” target, with the end goal that the final file matches this number “overall, on average”. If the encoder hits a run of black frames, which cost very little, it will spend less than the requested bitrate on them, then encode the next few seconds of (non-black) frames at very high quality to bring the average back in line. Two-pass encoding helps this method be more effective. You can also combine it with a “max bit rate” setting to prevent some of the swings.

CBR (Constant Bit Rate)

There is no native CBR mode, but you can “simulate” a constant bit rate setting by tuning the parameters of ABR:
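A sketch of such a simulated-CBR command, using the 4000k average and 1835k rate-control buffer discussed below (file names and the exact flag set are assumptions):

```shell
# Simulated CBR: average, minimum and maximum rate all pinned to 4000k,
# enforced across a 1835k rate-control buffer.
ffmpeg -i input.mp4 -c:v libx264 -b:v 4000k -minrate 4000k -maxrate 4000k -bufsize 1835k output.mp4
```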

In the above example, -bufsize is the “rate control buffer”, so it will enforce your requested “average” (4000k in this case) across each 1835k worth of video. Basically, it is assumed that the receiver/player will buffer that much data, so it is fine for the stream to fluctuate within it.

Of course, if it’s all just empty/black frames then it will still serve less than that many bits/s but it will raise the quality level as much as it can, up to the crf level.

CRF with maximum bit rate

You can also use CRF with a maximum bit rate by specifying both crf *and* maxrate settings, like:
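For example (the -bufsize value here is an assumption; a buffer size is generally needed for maxrate to take effect):

```shell
# Target CRF 20, but never exceed 400 kbit/s within the 1835k buffer window.
ffmpeg -i input.mp4 -c:v libx264 -crf 20 -maxrate 400k -bufsize 1835k output.mp4
```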

This will effectively “target” crf 20, but if the output would exceed 400 kbit/s, the encoder degrades to something less than crf 20 to stay under the cap.

Low Latency

libx264 offers a -tune zerolatency option. See the StreamingGuide.


All devices

If you want your videos to have highest compatibility with target devices (older iOS versions or all Android devices):
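A sketch of such a compatibility-first encode, using the Baseline 3.0 values from the iOS table below (file names are illustrative):

```shell
# Baseline profile, level 3.0, 4:2:0 pixel format for maximum device support.
ffmpeg -i input.mp4 -c:v libx264 -profile:v baseline -level 3.0 -pix_fmt yuv420p output.mp4
```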

This disables some advanced features but provides for better compatibility. Typically you may not need this setting (and therefore avoid using -profile:v and -level), but if you do use this setting it may increase the bit rate quite a bit compared to what is needed to achieve the same quality in higher profiles.


iOS Compatibility (source)

Profile  | Level | Devices                                                       | Options
Baseline | 3.0   | All devices                                                   | -profile:v baseline -level 3.0
Baseline | 3.1   | iPhone 3G and later, iPod touch 2nd generation and later      | -profile:v baseline -level 3.1
Main     | 3.1   | iPad (all versions), Apple TV 2 and later, iPhone 4 and later | -profile:v main -level 3.1
Main     | 4.0   | Apple TV 3 and later, iPad 2 and later, iPhone 4S and later   | -profile:v main -level 4.0
High     | 4.0   | Apple TV 3 and later, iPad 2 and later, iPhone 4S and later   | -profile:v high -level 4.0
High     | 4.1   | iPad 2 and later and iPhone 4S and later                      | -profile:v high -level 4.1
  • This table does not include any additional restrictions which may be required by your device.
  • -level currently does not affect the number of reference frames (-refs). See ticket #3307.

Apple Quicktime

Apple QuickTime only supports the YUV planar color space with 4:2:0 chroma subsampling (use -pix_fmt yuv420p) for H.264 video. For information on additional restrictions see QuickTime-compatible Encoding.


See Authoring a professional Blu-ray Disc with x264.

Pre-testing your settings

Encode a random section instead of the whole video with the -ss and -t options to quickly get a general idea of what the output will look like.

  • -ss: Offset time from beginning. Value can be in seconds or HH:MM:SS format.
  • -t: Output duration. Value can be in seconds or HH:MM:SS format.
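For instance, a one-minute test clip starting five minutes in might be encoded like this (the offset, duration, and encoding settings are illustrative):

```shell
# Encode 60 seconds starting at 00:05:00 to preview the chosen settings.
ffmpeg -ss 00:05:00 -i input.mp4 -t 60 -c:v libx264 -preset slow -crf 22 -c:a copy preview.mp4
```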

faststart for web video

You can add -movflags +faststart as an output option if your videos are going to be viewed in a browser. This will move some information to the beginning of your file and allow the video to begin playing before it is completely downloaded by the viewer. It is not required if you are going to use a video service such as YouTube.
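As a sketch, faststart can even be applied to an existing file with a pure remux (no re-encoding; file names are illustrative):

```shell
# Remux with the index (moov atom) moved to the front for progressive playback.
ffmpeg -i input.mp4 -c copy -movflags +faststart output.mp4
```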


Will two-pass provide a better quality than CRF?

No, though it does allow you to target a file size more accurately.

Why is placebo a waste of time?

It helps at most ~1% compared to the veryslow preset at the cost of a much higher encoding time. It’s diminishing returns: veryslow helps about 3% compared to the slower preset, slower helps about 5% compared to the slow preset, and slow helps about 5-10% compared to the medium preset.

Why doesn’t my lossless output look lossless?

Blame the RGB-to-YUV color space conversion. If you convert to yuv444 instead, the output should still be lossless (and that is now the default).

Will a graphics card make x264 encode faster?

Not necessarily. x264 supports OpenCL for some lookahead operations. There are also some proprietary encoders that use the GPU, but that does not mean they are well optimized: encoding may be faster, yet the output can be worse than vanilla x264, and in some cases the encode is even slower. Regardless, FFmpeg today doesn’t support any means of GPU encoding outside of libx264.

Encoding for dumb players

You may need to use -pix_fmt yuv420p for your output to work in QuickTime and most other players. These players only support the YUV planar color space with 4:2:0 chroma subsampling for H.264 video. Otherwise, depending on your source, ffmpeg may output to a pixel format that is incompatible with these players.
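A minimal sketch forcing the widely supported pixel format (file names are illustrative):

```shell
# Force 4:2:0 output so QuickTime and similar players can decode it.
ffmpeg -i input.mp4 -c:v libx264 -pix_fmt yuv420p output.mp4
```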

Additional Resources




FFmpeg’s multithreaded encoders generally split the work by functional module within a slice, e.g. h263, h263p, msmpeg (v1, v2, v3) and wmv1. They run the following threads: (1) pre_estimation_motion_thread, preparation before motion estimation; (2) estimation_motion_thread, motion estimation; (3) mb_var_thread, other macroblock variables; (4) encode_thread, the main encoding thread. There are exceptions: the FFV1 encoder multithreads with the slice as the unit of work.


  1. Slice Threading

In FFmpeg, dvvideo_decoder, ffv1_decoder, h264_decoder, mpeg2_video_decoder and mpeg_video_decoder all support slice threading.


The synchronization between the frame-threading main thread and the decoding threads is shown in Figure 1.

Figure 1: synchronization between the frame-threading main thread and the decoding threads

  2. Frame Threading

The decoders that currently support frame threading are h264_decoder, huffyuv_decoder, ffvhuff_decoder, mdec_decoder, mimic_decoder, mpeg4_decoder, theora_decoder, vp3_decoder and vp8_decoder.

Frame threading has the following restrictions: the user callback draw_horiz_band() must be thread-safe; for better performance, the user should supply a thread-safe get_buffer() callback; and the user must tolerate the extra latency multithreading introduces. In addition, codecs that support frame threading require each packet to contain exactly one complete frame. Buffer contents must not be read before ff_thread_await_progress() has been called and, likewise, must not be written (including edge extension via draw_edges()) after ff_thread_report_progress() has been called.



Figure 2: thread state transitions when the codec implements neither update_thread_context() nor a thread-safe get_buffer()

Figure 3: thread state transitions when the codec implements update_thread_context() and a thread-safe get_buffer()


Figure 4: synchronization between the frame-threading main thread and the decoding threads




In a previous post I briefly shared my notes from reading the literature and building the software. Since I had only been working with HEVC for a few days, my understanding of some issues was still shallow, and readers raised questions about some of the high-level syntax concepts. After consulting colleagues familiar with the background and carefully re-reading the reference (“Overview of HEVC”), I found answers to some of those questions, so I am posting this follow-up in response.
(1) On GOPs. GOP stands for Group of Pictures: the encoded video sequence is split into ordered sets of frames that are encoded as groups. Every GOP starts with an I-frame, but the GOP length is not necessarily the distance between two I-frames: a GOP may contain several I-frames, and only the first one (the first frame) is the key frame. In the encoder configuration, the GOP length and the I-frame distance are set by two different parameters (e.g. IntraPeriod and GOPSize, or similar). The distance between two I-frames therefore cannot exceed the GOP length, and is usually smaller.
(2) On IDR frames. IDR stands for Instantaneous Decoding Refresh, a structure defined in H.264. In H.264, an IDR frame is always an I-frame and always starts a GOP; it is the GOP’s key frame. The converse does not hold: an I-frame is not necessarily an IDR frame. The GOP length is not fixed: if the H.264 encoder detects a scene change, it inserts an IDR frame at that point as the start of a new GOP, even before the scheduled end of the current GOP, shortening that GOP.
I B B P B B P B B P I B B P B B P B B P (display order)
I P B B P B B P B B I P B B P B B P B B (decoding order)
I B B P B B P B B P B B I B B P B B P B (display order)
I P B B P B B P B B I B B P B B P B B … (decoding order)



  • qscale accepts values from 0.01 to 255, but in practice anything above 50 looks terrible
  • ffmpeg’s CBR mode controls the bitrate fairly well, but VBR cannot cap the maximum bitrate (there is a max setting, but it is not implemented)
  • The standard packaging for x264 is x264+AAC in FLV or x264+AAC in MP4





  • -formats: lists the containers, encoders and decoders your build supports
  • -y: tells ffmpeg to overwrite the output file
  • -t: duration of the output stream; accepts seconds or “HH:MM:SS[.ms]”
  • -fs: limits the output file size
  • -ss: start time, same units as -t
  • -target: directly sets the target format you want to convert to; all related settings take built-in defaults, though you can still add parameters you want to override. Available choices include:
    “vcd”, “svcd”, “dvd”, “dv”, “dv50”, “pal-vcd”, “ntsc-svcd”, …


  • -b: video bitrate. Note the unit is bit/s, so you usually append k; e.g. -b 1000k means 1000 kbit/s
  • -g: GOP size
  • -vframes: number of frames to encode; e.g. -vframes 1 encodes a single frame, which is how you take a screenshot
  • -r: frame rate, default 25
  • -s: resolution in WxH form, e.g. 320x240
  • -aspect: aspect ratio; accepts 16:9-style ratios or decimals such as 1.3333
  • -croptop: number of pixels to crop from the top; similarly -cropleft, -cropright, -cropbottom
  • -bt: allowed bitrate tolerance, default 4000k. During the first pass this tells the rate controller how far it may drift from the average bitrate; it is unrelated to the minimum and maximum bitrates. Setting it too low hurts quality
  • -maxrate and -minrate: the maximum and minimum allowed bitrates; typically used together for CBR-style encoding:

  • -vcodec: force a particular video encoder
  • -sameq: use the same quality as the source (variable bitrate)
  • -pass: encoding pass, either 1 or 2. During the first pass you should skip the audio and direct the output to null, for example:
  • -qscale: quantize the video with a fixed quantizer (VBR mode, mentioned above; smaller is better, don’t exceed 50). The related options -qmin and -qmax set the smallest and largest usable quantizers
  • -qdiff: maximum allowed deviation from the fixed quantizer
  • -qblur: quantizer blur coefficient, range 0.0-1.0; larger values spread the bitrate more evenly over time
  • -qcomp: video quantizer compression coefficient, default 0.5
  • -me_method: motion estimation method; the choices, from worst to best, are:
    zero (zero vector)
    epzs (default)
    full (exhaustive search; very slow and not much better)
  • -mbd: macroblock decision mode; possible values:
    0 use mb_cmp (I am not sure what that is, so I cross-checked these options against mencoder’s)
    1 pick the macroblock mode that needs the fewest bits
    2 pick the rate-distortion optimal macroblock mode
  • -4mv: use four motion vectors per macroblock, MPEG-4 only
  • -part: use data partitioning, MPEG-4 only
  • -ilme: force interlaced encoding, MPEG-2 and MPEG-4 only; you can use -deinterlace to deinterlace instead
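The options above can be combined into a two-pass command pair along these lines (a sketch using the legacy option names this list documents; modern ffmpeg builds have renamed or removed several of them, and the rates and file names are illustrative):

```shell
# Pass 1: no audio, output discarded (legacy-style option names).
ffmpeg -y -i input.avi -vcodec libx264 -b 1000k -g 250 -pass 1 -an -f mp4 /dev/null
# Pass 2: real encode with audio.
ffmpeg -i input.avi -vcodec libx264 -b 1000k -g 250 -pass 2 -acodec aac -ab 128k output.mp4
```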


  • -ar: audio sample rate, default 44100 Hz
  • -ab: audio bitrate, default 64k
  • -an: disable audio recording
  • -acodec: select the audio encoder

x264+aac in mp4 for psp



FFmpeg option x264 option
-g <frames> --keyint
-b <bits per second> --bitrate
-bufsize <bits> --vbv-bufsize
-maxrate <bits> --vbv-maxrate
-pass <1,2,3> --pass
-crf <float> --crf
-cqp <int> --qp
-bf <int> --bframes
-coder <0,1> --no-cabac
-bframebias <int> --b-bias
-keyint_min <int> --min-keyint
-sc_threshold <int> --scenecut
-deblockalpha <int> -deblockbeta <int> --deblock
-qmin <int> --qpmin
-qmax <int> --qpmax
-qdiff <int> --qpstep
-qcomp <float> --qcomp
-qblur <float> --qblur
-complexityblur <float> --cplxblur
-refs <int> --ref
-directpred <int> --direct
-me_method <epzs,hex,umh,full> --me
-me_range <int> --merange
-subq <int> --subme
-bidir_refine <0,1> --bime
-trellis <0,1,2> --trellis
-nr <int> --nr
-level <int> --level
-bt <bits> --ratetol = -bt / -b
-rc_init_occupancy <bits> --vbv-init = -rc_init_occupancy / -bufsize
-i_qfactor <float> --ipratio = 1 / -i_qfactor
-b_qfactor <float> --pbratio
-chromaoffset <int> --chroma-qp-offset
-rc_eq <string> --rc_eq
-threads <int> --threads
-cmp <-chroma/+chroma> --no-chroma-me
-partitions --partitions
+parti8x8 i8x8
+parti4x4 i4x4
+partp8x8 p8x8
+partp4x4 p4x4
+partb8x8 b8x8
-loop/+loop --no-deblock/--deblock
-psnr/+psnr --no-psnr/nothing
+bpyramid --b-pyramid
+wpred --weightb
+brdo --b-rdo (my ffmpeg build no longer accepts this)
+mixed_refs --mixed-refs
+dct8x8 --8x8dct
-fastpskip/+fastpskip --no-fast-pskip
+aud --aud


Frame-type options:

  • --keyint <integer> (x264)
    -g <integer> (FFmpeg)
    Keyframe interval, also known as GOP length. This determines the maximum distance between I-frames. Very high GOP lengths will result in slightly more efficient compression, but will make seeking in the video somewhat more difficult. Recommended default: 250
  • --min-keyint <integer> (x264)
    -keyint_min <integer> (FFmpeg)
    Minimum GOP length, the minimum distance between I-frames. Recommended default: 25
  • --scenecut <integer> (x264)
    -sc_threshold <integer> (FFmpeg)
    Adjusts the sensitivity of x264’s scenecut detection. Rarely needs to be adjusted. Recommended default: 40
  • --pre-scenecut (x264)
    UNKNOWN (FFmpeg)
    Slightly faster (but less precise) scenecut detection. Normal scenecut detection decides whether a frame is a scenecut after the frame is encoded and, if so, re-encodes the frame as an I-frame. This is not compatible with threading, however, so --pre-scenecut is automatically activated when multiple encoding threads are used.
  • --bframes <integer> (x264)
    -bf <integer> (FFmpeg)
    B-frames are a core element of H.264 and are more efficient in H.264 than in any previous standard. Some specific targets, such as HD-DVD and Blu-ray, limit the number of consecutive B-frames. Most, however, do not; as a result, there is rarely any harm in setting this to the maximum (16), since x264 will, if B-adapt is used, automatically choose the best number of B-frames anyway. This parameter simply limits the maximum number of B-frames. Note that Baseline Profile, such as that used by iPods, does not support B-frames. Recommended default: 16
    • --b-adapt <integer> (x264)
      -b_strategy <integer> (FFmpeg)
      By default, x264 adaptively decides the best number of B-frames to use via a low-resolution lookahead. It is possible to disable this adaptivity; this is not recommended. Recommended default: 1

0: Very fast, but not recommended. Does not work with pre-scenecut (scenecut must be off to force b-adapt off).

1: Fast, the default mode in x264. A good balance between speed and quality.

2: A much slower but more accurate B-frame decision mode that correctly detects fades and generally gives considerably better quality. Its speed drops considerably at high bframes values, so it’s recommended to keep bframes relatively low (perhaps around 3) when using this option. It may also slow down the first pass of x264 in threaded mode.
  • --b-bias <integer> (x264)
    -bframebias <integer> (FFmpeg)
    Makes x264 more likely to choose higher numbers of B-frames during the adaptive lookahead. Not generally recommended. Recommended default: 0
  • --b-pyramid (x264)
    -flags2 +bpyramid (FFmpeg)
    Allows B-frames to be kept as references. The name is technically misleading, as x264 does not actually use pyramid coding; it simply adds B-references to the normal reference list. B-references get a quantizer halfway between that of a B-frame and a P-frame. This setting is generally beneficial, but it increases the DPB (decoded picture buffer) size required for playback, so when encoding for hardware, disabling it may help compatibility.
  • --no-cabac (x264)
    -coder <0,1> (FFmpeg)
    CABAC is the default entropy encoder used by x264. Though somewhat slower on both the decoding and encoding end, it offers 10-15% improved compression on live-action sources and considerably higher improvements on animated sources, especially at low bitrates. It is also required for the use of trellis quantization. Disabling CABAC may somewhat improve decoding performance, especially at high bitrates. CABAC is not allowed in Baseline Profile. Recommended default: -coder 1 (CABAC enabled)
  • --ref <integer> (x264)
    -refs <integer> (FFmpeg)
    One of H.264’s most useful features is the ability to reference frames other than the one immediately prior to the current frame. This parameter specifies how many references can be used, up to a maximum of 16. Increasing the number of refs increases the DPB (decoded picture buffer) requirement, which means hardware playback devices often have strict limits on the number of refs they can handle. In live-action sources, more than 4-8 references are of limited use, but in cartoon sources up to the maximum of 16 is often useful. More reference frames require more processing power because every frame is searched by the motion search (except when an early skip decision is made). The slowdown is especially apparent with slower motion estimation methods. Recommended default: -refs 6
  • --no-deblock (x264)
    -flags -loop (FFmpeg)
    Disables the loop filter. Recommended default: -flags +loop (enabled)
  • --deblock (x264)
    -deblockalpha <integer> (FFmpeg)
    -deblockbeta <integer> (FFmpeg)
    One of H.264’s main features is the in-loop deblocker, which avoids the problem of blocking artifacts disrupting motion estimation. This requires a small amount of decoding CPU, but considerably increases quality in nearly all cases. Its strength may be raised or lowered to avoid more artifacts or keep more detail, respectively. Deblock has two parameters: alpha (strength) and beta (threshold). Recommended defaults: -deblockalpha 0 -deblockbeta 0 (must have -flags +loop)
  • --interlaced (x264)
    UNKNOWN (FFmpeg)
    Enables interlaced encoding. x264’s interlaced encoding is not as efficient as its progressive encoding; consider deinterlacing for maximum effectiveness.


  • --qp <integer> (x264)
    -cqp <integer> (FFmpeg)
    Constant quantizer mode. Not completely constant: B-frames and I-frames use different quantizers from P-frames. Generally should not be used, since CRF gives better quality at the same bitrate.
  • --bitrate <integer> (x264)
    -b <integer> (FFmpeg)
    Enables target bitrate mode. Attempts to reach a specific bitrate. Should be used in 2-pass mode whenever possible; 1-pass bitrate mode is generally the worst ratecontrol mode x264 has.
  • --crf <float> (x264)
    -crf <float> (FFmpeg)
    Constant quality mode (also known as constant ratefactor). Bitrate corresponds approximately to that of constant quantizer, but gives better quality overall at little speed cost. The best one-pass option in x264.
  • --vbv-maxrate <integer> (x264)
    -maxrate <integer> (FFmpeg)
    Specifies the maximum bitrate at any point in the video. Requires the VBV buffer size to be set. This option is generally used when encoding for a piece of hardware with bitrate limitations.
  • --vbv-bufsize <integer> (x264)
    -bufsize <integer> (FFmpeg)
    Depends on the profile level of the video being encoded. Set only if you’re encoding for a hardware device.
  • --vbv-init <float> (x264)
    -rc_init_occupancy <float> (FFmpeg)
    Initial VBV buffer occupancy. Note: don’t mess with this.
  • --qpmin <integer> (x264)
    -qmin <integer> (FFmpeg)
    Minimum quantizer. Doesn’t need to be changed. Recommended default: -qmin 10
  • --qpmax <integer> (x264)
    -qmax <integer> (FFmpeg)
    Maximum quantizer. Doesn’t need to be changed. Recommended default: -qmax 51
  • --qpstep <integer> (x264)
    -qdiff <integer> (FFmpeg)
    Sets the maximum QP step. Recommended default: -qdiff 4
  • --ratetol <float> (x264)
    -bt <float> (FFmpeg)
    Allowed variance of the average bitrate.
  • --ipratio <float> (x264)
    -i_qfactor <float> (FFmpeg)
    Qscale difference between I-frames and P-frames.
  • --pbratio <float> (x264)
    -b_qfactor <float> (FFmpeg)
    Qscale difference between P-frames and B-frames.
  • --chroma-qp-offset <integer> (x264)
    -chromaoffset <integer> (FFmpeg)
    QP difference between chroma and luma.
  • --aq-strength <float> (x264)
    UNKNOWN (FFmpeg)
    Adjusts the strength of adaptive quantization. Higher values take more bits away from complex areas and edges and move them towards simpler, flatter areas to maintain fine detail. Default: 1.0
  • --pass <1,2,3> (x264)
    -pass <1,2,3> (FFmpeg)
    Used with --bitrate. Pass 1 writes the stats file, pass 2 reads it, and 3 both reads and writes it. If you want to use three passes, use --pass 1 for the first pass, --pass 3 for the second, and --pass 2 or 3 for the third.
  • --stats <string> (x264)
    UNKNOWN (FFmpeg)
    Allows setting a specific filename for the first-pass stats file.
    • --rceq <string> (x264)
      -rc_eq <string> (FFmpeg)
      Ratecontrol equation. Recommended default: -rc_eq 'blurCplx^(1-qComp)'
  • --qcomp <float> (x264)
    -qcomp <float> (FFmpeg)
    QP curve compression: 0.0 => CBR, 1.0 => CQP. Recommended default: -qcomp 0.60
  • --cplxblur <float> (x264)
    -complexityblur <float> (FFmpeg)
    Reduces fluctuations in QP (before curve compression). [20.0]
  • --qblur <float> (x264)
    -qblur <float> (FFmpeg)
    Reduces fluctuations in QP (after curve compression). [0.5]
  • --zones (x264)
    UNKNOWN (FFmpeg)
    Allows setting a specific quantizer for a specific region of video.
  • --qpfile (x264)
    UNKNOWN (FFmpeg)
    Allows one to read in a set of frame types and quantizers from a file. Useful for testing various encoding options while ensuring the exact same quantizer distribution.


  • --partitions <string> (x264)
    -partitions <string> (FFmpeg)
    p8x8 (x264) / +partp8x8 (FFmpeg)
    p4x4 (x264) / +partp4x4 (FFmpeg)
    b8x8 (x264) / +partb8x8 (FFmpeg)
    i8x8 (x264) / +parti8x8 (FFmpeg)
    i4x4 (x264) / +parti4x4 (FFmpeg)

    One of H.264’s most useful features is the ability to choose among many combinations of inter and intra partitions. P-macroblocks can be subdivided into 16x8, 8x16, 8x8, 4x8, 8x4, and 4x4 partitions. B-macroblocks can be divided into 16x8, 8x16, and 8x8 partitions. I-macroblocks can be divided into 4x4 or 8x8 partitions. Analyzing more partition options improves quality at the cost of speed. The default is to analyze all partitions except p4x4 (p8x8, i8x8, i4x4, b8x8), since p4x4 is not particularly useful except at high bitrates and lower resolutions. Note that i8x8 requires 8x8dct, and is therefore a High Profile-only partition. p8x8 is the most costly partition speed-wise, but also gives the most benefit. Generally, whenever possible, all partition types except p4x4 should be used.

  • --direct <integer> (x264)
    -directpred <integer> (FFmpeg)
    B-frames in H.264 can choose between spatial and temporal prediction modes. Auto allows x264 to pick the best of these; the heuristic used is whichever mode allows more skip macroblocks. Auto should generally be used.
  • --direct-8x8 (x264)
    UNKNOWN (FFmpeg)
    This should be left at the default (-1).
  • --weightb (x264)
    -flags2 +wpred (FFmpeg)
    This allows B-frames to use weighted prediction options other than the default. There is no real speed cost, so it should always be enabled.
  • --me (x264)
    -me_method (FFmpeg)
    dia (x264) / epzs (FFmpeg) is the simplest search: starting at the best predictor, it checks the motion vectors one pixel up, left, down, and right, picks the best, and repeats the process until it no longer finds any better motion vector.

    hex (x264) / hex (FFmpeg) uses a similar strategy, except with a range-2 search of 6 surrounding points, hence the name. It is considerably more efficient than DIA and hardly any slower, and therefore makes a good choice for general-use encoding.

    umh (x264) / umh (FFmpeg) is considerably slower than HEX, but searches a complex multi-hexagon pattern in order to avoid missing harder-to-find motion vectors. Unlike HEX and DIA, the merange parameter directly controls UMH’s search radius, allowing one to increase or decrease the size of the wide search.

    esa (x264) / full (FFmpeg) is a highly optimized intelligent search of the entire motion search space within merange of the best predictor. It is mathematically equivalent to the brute-force method of searching every single motion vector in that area, though faster. However, it is still considerably slower than UMH, with not much benefit, so it is not particularly useful for everyday encoding.

    One of the most important settings for x264, both speed- and quality-wise.

  • --merange <integer> (x264)
    -me_range <integer> (FFmpeg)
    merange controls the maximum range of the motion search. For HEX and DIA, this is clamped to between 4 and 16, with a default of 16. For UMH and ESA, it can be increased beyond the default 16 to allow a wider-range motion search, which is useful on HD footage and high-motion footage. Note that for UMH and ESA, increasing merange will significantly slow down encoding.
  • --mvrange (x264)
    UNKNOWN (FFmpeg)
    Limits the maximum motion vector range. Since x264 by default limits this to 511.75 for standards compliance, it should not be changed.
  • --subme <integer> (x264)
    -subq <integer> (FFmpeg)
    1: Fastest, but extremely low quality. Should be avoided except on first-pass encoding.

    2-5: Progressively better and slower; 5 serves as a good medium for higher-speed encoding.

    6-7: 6 is the default. Activates rate-distortion optimization for partition decision. This can considerably improve efficiency, though it has a notable speed cost. 6 activates it in I/P-frames, and subme 7 activates it in B-frames.

    8-9: Activates rate-distortion refinement, which uses RDO to refine both motion vectors and intra prediction modes. Slower than subme 6, but again, more efficient.

    An extremely important encoding parameter which determines which algorithms are used for both subpixel motion searching and partition decision.

  • –psy-rd <float>:<float> (x264)
    UNKNOWN (FFmpeg)
    First value represents the amount that x264 biases in favor of detail retention instead of max PSNR in mode decision. Requires subme >= 6. Second value is psy-trellis, an experimental algorithm that tries to improve sharpness and detail retention at the expense of more artifacting. Recommended starting values are 0.1-0.2. Requires trellis >= 1. Recommended default: 1.0:0.0
  • –mixed-refs(x264)
    -flags2 +mixed_refs(FFmpeg)
    H.264 allows p8x8 blocks to select different references for each p8x8 block. This option allows this analysis to be done, and boosts quality with little speed impact. It should generally be used, though it obviously has no effect with only one reference frame.
  • –no-chroma-me(x264)
    UNKNOWN (FFmpeg)
    Chroma is used in the last steps of the subpixel refinement by default. For a slight speed increase, this can be disabled (at the cost of quality).
  • –8x8dct (x264)
    -flags2 +dct8x8(FFmpeg)
    Gives a notable quality boost by allowing x264 to choose between 8×8 and 4×4 frequency transform size. Required for i8x8 partitions. Speed cost for this option is near-zero both for encoding and decoding; the only reason to disable it is when one needs support on a device not compatible with High Profile.
  • –trellis <0,1,2>(x264)
    -trellis <0,1,2>(FFmpeg)
    0: disabled1: enabled only on the final encode of a MB2: enabled on all mode decisions

    The main decision made in quantization is which coefficients to round up and which to round down. Trellis chooses the optimal rounding choices for the maximum rate-distortion score, to maximize PSNR relative to bitrate. This generally increases quality relative to bitrate by about 5% for a somewhat small speed cost. It should generally be enabled. Note that trellis requires CABAC.

  • –no-fast-pskip(x264)
    -flags2 -fastpskip(FFmpeg)
    By default, x264 will skip macroblocks in P-frames that don’t appear to have changed enough between two frames to justify encoding the difference. This considerably speeds up encoding. However, for a slight quality boost, P-skip can be disabled. In this case, the full analysis will be done on all P-blocks, and the only skips in the output stream will be the blocks whose motion vectors happen to match that of the skip vector and motion vectors happen to match that of the skip vector and which have no residual. The speed cost of enabling no-fast-pskip is relatively high, especially with many reference frames. There is a similar B-skip internal to x264, which is why B-frames generally encode much faster than P-frames, but it cannot be disabled on the commandline.
  • –no-dct-decimate(x264)
    UNKNOWN (FFmpeg)
    By default, x264 will decimate (remove all coefficients from) P-blocks that are extremely close to empty of coefficents. This can improve overall efficiency with little visual cost, but may work against an attempt to retain grain or similar. DCT decimation should be left on unless there’s a good reason to disable it.
  • --nr (x264)
    UNKNOWN (FFmpeg)
    A fast, built-in noise reduction routine. Not as effective as external filters such as hqdn3d, but faster. Since x264 already naturally reduces noise through its quantization process, this parameter is not usually necessary.
  • --deadzone-inter (x264)
    --deadzone-intra (x264)
    UNKNOWN (FFmpeg)
    UNKNOWN (FFmpeg)
    When trellis isn’t activated, the deadzone parameters determine how many DCT coefficients are rounded up or down. Rounding up results in higher quality and more detail retention, but costs more bits, so rounding is a balance between quality and bit cost. Lowering these settings will result in more coefficients being rounded up, and raising them will result in more coefficients being rounded down. Recommended: keep them at the defaults.
  • --cqm (x264)
    --cqmfile (x264)
    UNKNOWN (FFmpeg)
    UNKNOWN (FFmpeg)
    Allows the use of a custom quantization matrix to weight frequencies differently in the quantization process. The preset quant matrices are “jvt” and “flat”. --cqmfile reads a custom quant matrix from a JM-compatible file. Recommended only if you know what you’re doing.
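As a hedged illustration of how the x264-side options above can be set from the ffmpeg command line (filenames and the CRF/preset choice are hypothetical; the option keys follow x264's parameter-parsing names, so check your build's -x264-params support):

```shell
# Sketch: CRF encode with the options discussed above.
# trellis=2 enables trellis on all mode decisions (requires CABAC, the default);
# fast-pskip=0 and dct-decimate=0 turn off fast P-skip and DCT decimation.
ffmpeg -i input.mp4 -c:v libx264 -preset slow -crf 20 \
  -x264-params "trellis=2:fast-pskip=0:dct-decimate=0" \
  -c:a copy output.mp4
```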



Lossless video cutting/joining with ffmpeg

Cutting and joining video files is a common need. Online video sites often split a video into n segments to save bandwidth, so what tools like DownloadHelper/DownThemAll download is usually those split files. Plenty of tools can cut or join video, but most of them re-encode (transcode) the video, which inevitably costs quality, to say nothing of the outrageously long conversion times…

With ffmpeg, however, we can do this kind of task without re-encoding:
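The original command did not survive extraction; a sketch of a lossless cut consistent with the surrounding text (filenames hypothetical):

```shell
# Copy the streams between START_TIME and STOP_TIME without re-encoding
ffmpeg -i input.mp4 -ss START_TIME -to STOP_TIME -c copy output.mp4
```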


START_TIME/STOP_TIME can be written in either of two formats:

  1. A count in seconds: 80
  2. Hours:minutes:seconds: 00:01:20

Joining:

Joining is slightly more involved: the files to be joined must be listed, one per line, in a file list.txt in the following form:
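The list format and the joining command were lost in extraction; per the concat demuxer's documented usage, they would look roughly like this (filenames hypothetical):

```shell
# list.txt: one "file" directive per segment, in playback order
cat > list.txt <<'EOF'
file 'part1.mp4'
file 'part2.mp4'
EOF

# Join without re-encoding
ffmpeg -f concat -i list.txt -c copy output.mp4
```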



For convenience, I wrote a script to simplify the whole thing. It’s on GitHub; help yourself:




It’s better to put -ss START_TIME before -i. Because -c copy is used here you won’t see much difference, but when re-encoding the advantage shows.

A -ss placed before the -i parameter applies to the source file, so ffmpeg seeks straight to that position before cutting. A -ss placed after it applies to the output: ffmpeg decodes the source from the beginning and only starts writing output once it reaches that position.

-ss behaves differently before and after -i. Before -i it is fast seeking: inaccurate, off by a few frames. After -i it is accurate seeking: exact but slow. According to the official documentation (http://trac.ffmpeg.org/wiki/Seeking%20with%20FFmpeg), you should use it in both places (fast and accurate seeking).
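Combining both, as that last comment suggests (times and filenames hypothetical): a fast seek before -i to get near the target, then an accurate seek after -i for the remainder while re-encoding:

```shell
# Fast-seek to 00:01:00, then decode and accurately seek 30 more seconds,
# so the cut effectively starts at 00:01:30 of the original
ffmpeg -ss 00:01:00 -i input.mp4 -ss 00:00:30 -t 00:00:10 -c:v libx264 -c:a aac output.mp4
```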



Seeking with FFmpeg

Cutting small sections

To extract only a small segment from the middle of a movie, -ss can be combined with -t, which specifies the duration: -ss 60 -t 10 captures from second 60 to 70. Alternatively, the -to option specifies an end point: -ss 60 -to 70 likewise captures from second 60 to 70. -t and -to are mutually exclusive; if you use both, -t is used.

Note that if you specify -ss before -i only, the timestamps will be reset to zero, so -t and -to have the same effect:
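The two commands being compared appear to have been lost; based on the wiki page this section draws from, they are presumably along these lines (filenames hypothetical):

```shell
# -ss before -i: input timestamps are reset to zero, so -to 00:02:00 means
# "two minutes after the seek point", i.e. 00:03:00 in the original
ffmpeg -ss 00:01:00 -i video.mp4 -to 00:02:00 -c copy cut1.mp4

# -ss after -i: -to still refers to the original timeline
ffmpeg -i video.mp4 -ss 00:01:00 -to 00:02:00 -c copy cut2.mp4
```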

Here, the first command will cut from 00:01:00 to 00:03:00 (in the original), whereas the second command would cut from 00:01:00 to 00:02:00, as intended.

If you cut with stream copy (-c copy) you need to use the -avoid_negative_ts 1 option if you want to use that segment with the concat demuxer.


Time unit syntax

Note that you can use two different time unit formats: sexagesimal (HOURS:MM:SS.MICROSECONDS, as in 01:23:45.678), or seconds. If a fraction is used, such as 02:30.05, it is interpreted as “5 100ths of a second”, not as frame 5. For instance, 02:30.5 would be 2 minutes, 30 seconds, and half a second, the same as 150.5 in seconds.

Doing a bitstream copy gives me a broken file?

If you use -ss with -c:v copy, the resulting bitstream might end up being choppy, not playable, or out of sync with the audio stream, since ffmpeg is forced to only use/split on i-frames.



How to concatenate (join, merge) media files

Concat demuxer

The concat demuxer was added to FFmpeg 1.1. You can read about it in the documentation.


Create a file mylist.txt with all the files you want to have concatenated in the following form (lines starting with a # are ignored):
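The example list was lost in extraction; following the documented form (paths hypothetical):

```text
# this is a comment
file '/path/to/file1.wav'
file '/path/to/file2.wav'
file '/path/to/file3.wav'
```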

Note that these can be either relative or absolute paths. Then you can stream copy or re-encode your files:
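The command itself was not preserved; the standard invocation is:

```shell
# Stream copy (lossless and fast); -safe 0 is needed when the list
# contains absolute paths
ffmpeg -f concat -safe 0 -i mylist.txt -c copy output.wav
```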

It is possible to generate this list file with a bash for loop, or using printf. Either of the following would generate a list file containing every *.wav in the working directory:
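A sketch of both approaches, run against empty stand-in files in a scratch directory so it is self-contained (directory and filenames are hypothetical):

```shell
# Work in a scratch directory with two empty stand-in inputs
rm -rf /tmp/concat_demo && mkdir -p /tmp/concat_demo && cd /tmp/concat_demo
: > a.wav
: > b.wav

# printf repeats its format string once per glob match
printf "file '%s'\n" ./*.wav > mylist.txt

# Equivalent bash for loop
for f in ./*.wav; do echo "file '$f'"; done > mylist2.txt

cat mylist.txt
# prints:
# file './a.wav'
# file './b.wav'
```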

If your shell supports process substitution (like Bash and Zsh), you can avoid explicitly creating a list file and do the whole thing in a single line. This would be impossible with the concat protocol (see below):
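The one-liner was lost; a sketch using bash process substitution (paths hypothetical; absolute paths are used because the generated list has no directory of its own):

```shell
# The list is fed through a /dev/fd pseudo-file and never touches disk
ffmpeg -f concat -safe 0 -i <(printf "file '%s'\n" /path/to/*.wav) -c copy output.wav
```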

You can also loop a video. This example will loop input.mkv 10 times:
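The looping example was lost; a sketch that repeats the same "file" directive ten times and then concatenates (list filename is mine):

```shell
# Write ten directives pointing at the same input, then join losslessly
for i in {1..10}; do printf "file '%s'\n" input.mkv; done > loop.txt
ffmpeg -f concat -i loop.txt -c copy output.mkv
```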

Concat protocol

While the demuxer works at the stream level, the concat protocol works at the file level. Certain files (mpg and mpeg transport streams, possibly others) can be concatenated. This is analogous to using cat on UNIX-like systems or copy on Windows.


If you have MP4 files, these could be losslessly concatenated by first transcoding them to mpeg transport streams. With h.264 video and AAC audio, the following can be used:
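The commands were lost in extraction; the standard recipe for this (filenames hypothetical) is:

```shell
# Remux each MP4 to an MPEG transport stream (no re-encoding);
# h264_mp4toannexb converts the H.264 bitstream to Annex B framing
ffmpeg -i input1.mp4 -c copy -bsf:v h264_mp4toannexb -f mpegts intermediate1.ts
ffmpeg -i input2.mp4 -c copy -bsf:v h264_mp4toannexb -f mpegts intermediate2.ts

# Concatenate at the file level; aac_adtstoasc restores AAC framing for MP4
ffmpeg -i "concat:intermediate1.ts|intermediate2.ts" -c copy -bsf:a aac_adtstoasc output.mp4
```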

If you’re using a system that supports named pipes, you can use those to avoid creating intermediate files – this sends stderr (which ffmpeg sends all the written data to) to /dev/null, to avoid cluttering up the command-line:
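The named-pipe variant was also lost; a sketch of the same recipe without intermediate files (filenames hypothetical):

```shell
# Create two named pipes and feed them in the background
mkfifo temp1 temp2
ffmpeg -y -i input1.mp4 -c copy -bsf:v h264_mp4toannexb -f mpegts temp1 2>/dev/null &
ffmpeg -y -i input2.mp4 -c copy -bsf:v h264_mp4toannexb -f mpegts temp2 2>/dev/null &

# Read both pipes; no intermediate .ts files are written to disk
ffmpeg -f mpegts -i "concat:temp1|temp2" -c copy -bsf:a aac_adtstoasc output.mp4
```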

All MPEG codecs (H.264, MPEG4/divx/xvid, MPEG2; MP2, MP3, AAC) are supported in the mpegts container format, though the commands above would require some alteration (the -bsf bitstream filters will have to be changed).



FFMPEG fields drop and dup

“I was wondering about the meaning of drop and dup at the output of ffmpeg? What is their meaning?”

“Drop” means it has dropped a frame during encoding.

“Dup” means it has duplicated a frame during encoding.




I found a video online, and only part of it is what I need. Can a portion be extracted from the whole file, or just the audio track of that portion? After consulting the documentation, it turns out FFmpeg makes this easy.

That command reads: starting 46:28 into the file Morning_News.asf, capture 03:25 of material, leaving the video and audio streams untouched (no re-encoding), and write the result to output.asf.
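The command being explained did not survive extraction; reconstructed from the description (only the filenames and times come from the text):

```shell
# Start 46:28 into the file, capture 03:25, copy both streams unchanged
ffmpeg -i Morning_News.asf -ss 46:28 -t 03:25 -vcodec copy -acodec copy output.asf
```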

And if the video file is too large, how do we save just its audio as an mp3 file?
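The command is missing here; a sketch of audio-only extraction to MP3 (the encoder and bitrate choices are assumptions):

```shell
# -vn drops the video; libmp3lame encodes the audio track to MP3
ffmpeg -i Morning_News.asf -vn -acodec libmp3lame -ab 192k output.mp3
```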





Part 1: Cutting video


ffmpeg -ss START -t DURATION -i INPUT -vcodec copy -acodec copy OUTPUT


-ss start time, e.g. 00:00:20 means start at the 20-second mark;

-t duration, e.g. 00:00:10 means capture 10 seconds of video;

-i input, followed by a space and then the input video file;

-vcodec copy and -acodec copy specify the video and audio codecs to use; copy means copy the streams unchanged;




ffmpeg -ss 00:00:20 -t 00:00:10 -i D:/MyVideo.mpg -vcodec copy -acodec copy D:/Split.mpg







ffmpeg -i INPUT -sameq -intra OUTPUT

-i input, followed by a space and then the input video file;

INPUT the input file;

-sameq keep the same video quality;

-intra use intra-frame (keyframe-only) encoding;

OUTPUT the output file name.



ffmpeg -i D:/MyVideo.mpg -sameq -intra D:/temp.mpg



ffmpeg -ss START -vsync 0 -t DURATION -i INPUT -vcodec VIDEOCODEC -acodec AUDIOCODEC OUTPUT

-ss start time, e.g. 00:00:20 means start at the 20-second mark;

-t duration, e.g. 00:00:10 means capture 10 seconds of video;

-i input, followed by a space and then the input video file;

-vcodec the video codec to use;

-acodec the audio codec to use;




ffmpeg -ss 00:00:30 -vsync 0 -t 00:00:30 -i D:/temp.mpg -vcodec libx264 -acodec libfaac D:/result.mpg



Part 2: Merging video



1. First, convert all the videos to mpeg format:

ffmpeg -i INPUT -f mpeg OUTPUT


ffmpeg -i D:/temp1.avi -f mpeg D:/result1.mpg

ffmpeg -i D:/temp2.mp4 -f mpeg D:/result2.mpg

2. Merge the videos with the copy (Windows) or cat (Unix) command:



copy /b "D:/result1.mpg"+"D:/result2.mpg" "D:/result.mpge"

3. Re-encode the merged video to produce the final result:



ffmpeg -i “D:/result.mpge” -f mp4 “D:/result.mp4”





What are HTTP persistent connections?
HTTP persistent connections, also called HTTP keep-alive, or HTTP connection reuse, is the idea of using the same TCP connection to send and receive multiple HTTP requests/responses, as opposed to opening a new one for every single request/response pair. Using persistent connections is very important for improving HTTP performance.


There are several advantages of using persistent connections, including:

Network friendly. Less network traffic due to fewer TCP connections being set up and torn down.
Reduced latency on subsequent requests, due to avoidance of the initial TCP handshake.
Long-lasting connections allow TCP sufficient time to determine the congestion state of the network and react appropriately.



The advantages are even more obvious with HTTPS or HTTP over SSL/TLS. There, persistent connections may reduce the number of costly SSL/TLS handshakes needed to establish security associations, in addition to the initial TCP connection set-up.
In HTTP/1.1, persistent connections are the default behavior of any connection. That is, unless otherwise indicated, the client SHOULD assume that the server will maintain a persistent connection, even after error responses from the server. However, the protocol provides means for a client and a server to signal the closing of a TCP connection.


What makes a connection reusable?
Since TCP by its nature is a stream based protocol, in order to reuse an existing connection, the HTTP protocol has to have a way to indicate the end of the previous response and the beginning of the next one. Thus, it is required that all messages on the connection MUST have a self-defined message length (i.e., one not defined by closure of the connection). Self demarcation is achieved by either setting the Content-Length header, or in the case of chunked transfer encoded entity body, each chunk starts with a size, and the response body ends with a special last chunk.


What happens if there are proxy servers in between?
Since persistent connections apply to only one transport link, it is important that proxy servers correctly signal persistent or non-persistent connections separately with their clients and the origin servers (or other proxy servers). From an HTTP client or server’s perspective, as far as persistent connections are concerned, the presence or absence of proxy servers is transparent.


What does the current JDK do for Keep-Alive?
The JDK supports both HTTP/1.1 and HTTP/1.0 persistent connections.

When the application finishes reading the response body or when the application calls close() on the InputStream returned by URLConnection.getInputStream(), the JDK’s HTTP protocol handler will try to clean up the connection and if successful, put the connection into a connection cache for reuse by future HTTP requests.

The support for HTTP keep-Alive is done transparently. However, it can be controlled by system properties http.keepAlive, and http.maxConnections, as well as by HTTP/1.1 specified request and response headers.


The system properties that control the behavior of Keep-Alive are:

http.keepAlive=<boolean>
default: true

Indicates if keep-alive (persistent) connections should be supported.

http.maxConnections=<int>
default: 5

Indicates the maximum number of connections per destination to be kept alive at any given time.

The HTTP header that influences connection persistence is:
Connection: close

If the “Connection” header is specified with the value “close” in either the request or the response header fields, it indicates that the connection should not be considered ‘persistent’ after the current request/response is complete.



The current implementation doesn’t buffer the response body. This means that the application has to finish reading the response body, or call close() to abandon the rest of it, in order for the connection to be reused. Furthermore, the current implementation will not block on reads when cleaning up the connection, meaning that if the whole response body is not yet available, the connection will not be reused.


What’s new in Tiger?
When the application encounters an HTTP 400 or 500 response, it may ignore the IOException and then issue another HTTP request. In this case, the underlying TCP connection won’t be kept alive, because the response body is still waiting to be consumed: the socket connection is not cleared and is therefore not available for reuse. What the application needs to do is call HttpURLConnection.getErrorStream() after catching the IOException, read the response body, then close the stream. However, some existing applications do not do this and, as a result, do not benefit from persistent connections. To address this problem, we have introduced a workaround.

The workaround involves buffering the response body if the response is >= 400, up to a certain amount and within a time limit, thus freeing up the underlying socket connection for reuse. The rationale is that when the server responds with a >= 400 error (a client or server error; one example is a “404: File Not Found” error), it usually sends a small response body explaining whom to contact and how to recover.


Several new Sun implementation specific properties are introduced to help clean up the connections after error response from the server.

The major one is:

sun.net.http.errorstream.enableBuffering=<boolean>
default: false

With the above system property set to true (the default is false), when the response code is >= 400, the HTTP handler will try to buffer the response body, freeing up the underlying socket connection for reuse. Thus, even if the application doesn’t call getErrorStream(), read the response body, and then call close(), the underlying socket connection may still be kept alive and reused.

The following two system properties provide further control to the error stream buffering behavior:

sun.net.http.errorstream.timeout=<int> in millisecond
default: 300 millisecond

sun.net.http.errorstream.bufferSize=<int> in bytes
default: 4096 bytes


What can you do to help with Keep-Alive?
Do not abandon a connection by ignoring the response body. Doing so results in idle TCP connections that must be garbage collected once they are no longer referenced.

If getInputStream() successfully returns, read the entire response body.

When calling getInputStream() from HttpURLConnection, if an IOException occurs, catch the exception and call getErrorStream() to get the response body (if there is any).

Reading the response body cleans up the connection even if you are not interested in the response content itself. If the response body is long and you are not interested in the rest of it after seeing the beginning, you can close the InputStream; be aware, though, that more data could still be on its way, so the connection may not be cleared for reuse.

Here’s a code example that follows the above recommendations:
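The original example was not preserved; below is a minimal sketch of the recommended pattern (the class and method names are mine, and the URL is whatever the caller passes in). The stream-draining helper is the part that actually frees the connection for reuse:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class KeepAliveFriendlyFetch {

    // Drain a stream completely and close it, so the underlying socket
    // can go back into the JDK's keep-alive connection cache.
    static String readFully(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        in.close();
        return buf.toString("UTF-8");
    }

    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try {
            // Success path: read the whole body; the connection is then reusable.
            return readFully(conn.getInputStream());
        } catch (IOException e) {
            // Error path (e.g. 404/500): consume the error body too,
            // otherwise the socket cannot be returned to the cache.
            InputStream err = conn.getErrorStream();
            if (err != null) {
                readFully(err);
            }
            throw e;
        }
    }
}
```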




If you know ahead of time that you won’t be interested in the response body, you should issue a HEAD request instead of a GET request: for example, when you are only interested in the meta-information of the web resource, or when testing for its validity, accessibility, or recent modification. Here’s a code snippet:

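The snippet itself was lost; a sketch using HttpURLConnection's standard API (class name and URL are hypothetical):

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class HeadRequestExample {
    // Build a HEAD request: the server returns headers only, no body,
    // so there is nothing to drain before the connection can be reused.
    static HttpURLConnection headRequest(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("HEAD");
        return conn;
    }
    // Typical use after connecting: conn.getResponseCode(),
    // conn.getLastModified(), conn.getContentLength()
}
```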