Video

[source]

Utilities for reading, writing, inspecting and processing videos. Also supports image inspection.

CLI

You can use the video tools either via the module:

python -m kiui.video --help

or via the kivi shortcut:

kivi --help

Inspect video (or image) information

kivi info input.mp4

This prints a rich table with:

  • Video: Resolution, FPS, duration, frames, codec (with encoder hints), codec tag, bitrate, file size, compression ratio vs raw RGB.

  • Image: Resolution, channels, dtype, format, file size, compression ratio (respects actual dtype for raw-size calculation — e.g. float32 EXRs).

Example output:

      Video Info: input.mp4
┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Field       ┃ Value                                 ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Type        │ Video                                 │
│ Resolution  │ 640 x 480                             │
│ FPS         │ 30.000                                │
│ Duration    │ 00:00:05 (5.00 s)                     │
│ Frames      │ 150                                   │
│ Codec       │ h264 (encoders: libx264 / h264_nvenc) │
│ Codec tag   │ avc1                                  │
│ Bitrate     │ 1569577 bps                           │
│ File size   │ 957.99 KB                             │
│ Compression │ 140.92x (raw / encoded)               │
└─────────────┴───────────────────────────────────────┘
# Inspect an image
kivi info photo.jpg

Resize a video

kivi resize input.mp4 output_640p.mp4 \
    --width 640 --height 480

By default, kivi:

  • Infers a reasonable encoder (libx264, libx265, mpeg4, …) from the input codec.

  • Tries to match the original bitrate per pixel (scaled by resolution / fps), to preserve quality and file size characteristics.

  • Keeps the codec tag (e.g. hvc1 vs hev1) when using HEVC / H.264 encoders for better compatibility.

  • Warns and rounds the output size to the closest encoder-compatible dimensions when chroma subsampling requires it (for example, libx265 / 4:2:0 output needs even width and height).

  • Runs ffmpeg quietly on successful resizes, while still showing kivi warnings and ffmpeg errors.

You can also override the encoder, quality, and framerate:

kivi resize input.mp4 output.mp4 \
    --width 640 --height 480 \
    --codec libx264 \
    --crf 23 \
    --fps 30

The --fps flag resamples frames: for downsampling it drops frames; for upsampling it uses motion-compensated interpolation (minterpolate) for smoother motion.

Create a share-friendly preview

# Create a preview that stays under 10 MB (default)
kivi preview input.mp4

# Specify output path and target size
kivi preview input.mp4 output.mp4 --target-mb 20

# Cap resolution and framerate
kivi preview input.mp4 --max-resolution 1280 --max-fps 24

# Adjust quality floor
kivi preview input.mp4 --crf 18 --preset slow

# Drop audio
kivi preview input.mp4 --audio-kbps 0

Preview uses capped CRF encoding: quality-driven CRF with a maxrate cap derived from the size target, so short/simple clips keep high quality while long/large ones stay under the cap. It defaults to H.264 / yuv420p / mp4 with faststart and AAC audio for maximum compatibility.

When the input is too large for the budget, the function downsamples — preferring resolution reduction over framerate reduction — until the target bits-per-pixel is met.

Full options:

Option

Default

Description

--target-mb

10.0

Target maximum file size in MB

--codec

libx264

Video encoder (libx264, libx265, h264_nvenc, hevc_nvenc)

--crf

23

CRF quality ceiling (lower = better quality)

--preset

medium

ffmpeg preset (slow, medium, fast, etc.)

--max-resolution

1920

Cap on the longest spatial side (px)

--min-resolution

480

Floor on the longest side (px)

--max-fps

30.0

Cap on framerate

--min-fps

15.0

Floor on framerate

--audio-kbps

128

AAC audio bitrate; 0 to drop audio

Split a video

# Split at custom timestamps (absolute seconds)
kivi split input.mp4 out_dir \
    --timestamps 10 20 30

# Split into uniform 10-second segments (last may be shorter)
kivi split input.mp4 out_dir \
    --uniform 10

# Only keep explicitly selected segments, drop the rest
kivi split input.mp4 out_dir \
    --timestamps 60 --drop_last

With --drop_last, timestamps are interpreted as segment lengths (not absolute boundaries): the first segment is [0, 60], and any remaining tail is dropped.

Output clips are named as:

<basename>_<idx>_<seconds>.mp4

where <idx> is a zero-padded index and <seconds> is the rounded duration of that segment.

You can also specify codec, CRF, and preset:

kivi split input.mp4 out_dir --uniform 10 \
    --codec libx265 --crf 22 --preset fast

API

kiui.video.read_video(path: str, mode: Literal[‘float’, ‘uint8’, ‘torch’, ‘tensor’] = 'float') Tuple[ndarray | Tensor, float][source]

Read a video file into a tensor / numpy array.

Parameters:
  • path – Path to the video file.

  • mode – Returned data type. - "uint8": uint8 numpy array, [T, H, W, 3], range [0, 255] - "float": float32 numpy array, [T, H, W, 3], range [0, 1] - "torch" / "tensor": float32 torch tensor, [T, H, W, 3], range [0, 1]

Returns:

Video frames in the requested format. fps: Frames per second of the video.

Return type:

video

kiui.video.write_video(path: str, video: Tensor | ndarray, fps: float, order: Literal[‘RGB’, ‘BGR’] = 'RGB', codec: str = 'mp4v') None[source]

Write a video from frames.

Parameters:
  • path – Path to write the video file.

  • video – Video frames, [T, H, W, C] where C is 3 or 4. Can be numpy array (uint8 or float in [0, 1]) or torch tensor.

  • fps – Frames per second.

  • order – Channel order of the input frames, "RGB" or "BGR".

  • codec – FourCC codec string for OpenCV, e.g. "mp4v", "XVID".

kiui.video.get_video_info(path: str) Dict[str, Any][source]

Inspect a video file using ffprobe and return metadata.

Requires ffmpeg / ffprobe to be installed in the system.

Parameters:

path – Path to the video file.

Returns:

  • path

  • width, height

  • fps

  • duration (seconds)

  • codec

  • codec_tag (fourcc / sample entry, e.g. "hvc1" or "hev1")

  • bitrate (bits per second)

  • filesize (bytes)

  • num_frames

  • raw_size (uncompressed RGB size in bytes)

  • compression_ratio (raw_size / filesize)

Return type:

dict with keys

kiui.video.print_video_info(path: str) None[source]

Pretty-print video (or image) information.

kiui.video.resize_video(input_path: str, output_path: str, width: int | None = None, height: int | None = None, codec: str | None = None, crf: int | None = None, preset: str = 'medium', fps: float | None = None) None[source]

Resize a video and save to a new file using ffmpeg.

Parameters:
  • input_path – Path to the input video.

  • output_path – Path to the output video.

  • width – Target width. If None, it will be inferred from height while keeping the aspect ratio.

  • height – Target height. If None, it will be inferred from width while keeping the aspect ratio.

  • codec – Video codec / encoder name for ffmpeg, e.g. "h264", "hevc", "libx264", "libx265", "mpeg4", "h264_nvenc", "hevc_nvenc". If None, try to pick a reasonable encoder based on the input codec.

  • crf – Constant Rate Factor (quality, lower is better) for CRF-based codecs (e.g. libx264 / libx265). If None, the function will try to roughly match the source video’s bitrate (scaled by resolution/fps) instead of using CRF. For "mpeg4", this is mapped to a quantizer value q:v internally when CRF is provided.

  • preset – ffmpeg preset, e.g. "slow", "medium", "fast".

  • fps – If not None, resample video to this FPS.

If the requested size is incompatible with the selected encoder’s default chroma subsampling, it is rounded to the closest compatible size and a warning is printed.

kiui.video.preview_video(input_path: str, output_path: str | None = None, target_mb: float = 10.0, codec: str = 'libx264', crf: int = 23, preset: str = 'medium', max_resolution: int = 1920, min_resolution: int = 480, max_fps: float = 30.0, min_fps: float = 15.0, audio_kbps: int = 128, target_bpp: float = 0.06, verbose: bool = True) str[source]

Create a share-friendly preview that stays under a file-size target.

The preview uses a widely compatible codec (defaults to H.264 / yuv420p in an mp4 container with faststart and AAC audio) and is encoded with “capped CRF”: quality-driven CRF with a maxrate cap derived from the size target, so short/simple clips keep high quality while long/large ones stay under the cap. Downsampling is applied only when the input is too large for the budget, preferring resolution reduction over framerate reduction.

Parameters:
  • input_path – Path to the input video.

  • output_path – Path to the output video. If None, <input>_preview.mp4 next to the input is used.

  • target_mb – Target maximum file size in MB (mebibytes). Default 10.

  • codec – Video encoder. Default "libx264" for best compatibility.

  • crf – Constant Rate Factor (quality; lower is better). Default 23.

  • preset – ffmpeg preset, e.g. "slow", "medium", "fast".

  • max_resolution – Cap on the longest spatial side (px). Inputs larger than this (e.g. 2K/4K) are downscaled. Default 1920.

  • min_resolution – Floor on the longest side (px) when downscaling for quality. Default 480.

  • max_fps – Cap on framerate. Inputs faster than this are resampled down.

  • min_fps – Floor on framerate when reducing fps for quality.

  • audio_kbps – AAC audio bitrate; also reserved from the size budget. Set 0 to drop audio. Ignored if the input has no audio.

  • target_bpp – Desired bits-per-pixel-per-frame used to decide downsampling.

  • verbose – Print a before/after summary.

Returns:

The output path.

kiui.video.split_video(input_path: str, output_dir: str, timestamps: Sequence[float], codec: str | None = None, crf: int | None = None, preset: str = 'medium', uniform: float | None = None, drop_last: bool = False) None[source]

Split a long video into shorter clips given split timestamps.

Parameters:
  • input_path – Path to the input video.

  • output_dir – Directory to save the clips.

  • timestamps

    Sequence of timestamps in seconds. - If drop_last=False and uniform=None: treated as absolute

    boundaries; the first segment always starts at 0 and the last one ends at the video duration.

    • If drop_last=True and uniform is None: treated as segment lengths; the first segment is [0, timestamps[0]], the second is [timestamps[0], timestamps[0] + timestamps[1]], etc. Any remaining tail of the video is dropped.

    Ignored if uniform is not None.

  • codec – Video codec / encoder name for ffmpeg, e.g. "h264", "hevc", "libx264", "libx265", "mpeg4", "h264_nvenc", "hevc_nvenc". If None, try to pick a reasonable encoder based on the input codec.

  • crf – Constant Rate Factor (quality, lower is better). If None, the function will try to roughly match the source video’s bitrate.

  • preset – ffmpeg preset, e.g. "slow", "medium", "fast".

  • uniform – If not None, split the video into uniform segments of this many seconds. If drop_last=True, any remaining tail shorter than this interval is dropped. If drop_last=False, a final shorter segment is kept. Cannot be used together with explicit timestamps.

  • drop_last – Drop any remaining part of the video that is not explicitly covered by timestamps or a full uniform interval.