Video tools

Utilities for reading, writing, inspecting and processing videos.

CLI

You can use the video tools either via the module:

python -m kiui.video --help

or via the kivi shortcut:

kivi --help

Inspect video information

kivi info input.mp4

This prints a rich table including:

  • Resolution, FPS, duration

  • Codec and codec tag (e.g., hevc + hvc1 vs hev1)

  • Bitrate, file size and estimated compression ratio vs raw RGB frames

Resize a video

kivi resize input.mp4 output_640p.mp4 \
    --width 640 --height 480

By default, kivi:

  • Infers a reasonable encoder (libx264, libx265, mpeg4, …) from the input codec.

  • Tries to match the original bitrate per pixel (scaled by resolution / fps), to preserve quality and file size characteristics.

  • Keeps the codec tag (e.g. hvc1 vs hev1) when using HEVC / H.264 encoders for better compatibility.
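The bitrate-matching default can be illustrated with a little arithmetic. The following is a minimal sketch, assuming that "bitrate per pixel" means bits per pixel per frame and that the target bitrate scales linearly with pixel count and frame rate; `scaled_bitrate` is a hypothetical helper, not part of kiui, and the actual heuristic may differ.

```python
def scaled_bitrate(src_bitrate, src_size, dst_size, src_fps, dst_fps=None):
    """Scale a bitrate so that bits per pixel per frame stay constant.

    Hypothetical helper for illustration only; not kiui's implementation.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    if dst_fps is None:
        dst_fps = src_fps  # frame rate unchanged unless requested
    # Bits spent on one pixel of one frame in the source video.
    bits_per_pixel = src_bitrate / (src_w * src_h * src_fps)
    return int(round(bits_per_pixel * dst_w * dst_h * dst_fps))


# 1920x1080 at 8 Mbit/s, resized to 640x360 at the same frame rate.
target = scaled_bitrate(8_000_000, (1920, 1080), (640, 360), 30.0)
print(f"target bitrate: {target} bit/s")
```

Under this assumption, shrinking the frame area by a factor of nine shrinks the target bitrate by the same factor.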

You can also override the encoder and quality explicitly:

kivi resize input.mp4 output.mp4 \
    --width 640 --height 480 \
    --codec libx264 \
    --crf 23

Split a video

# Split at custom timestamps (absolute seconds)
kivi split input.mp4 out_dir \
    --timestamps 10 20 30

# Split into uniform 10-second segments (last may be shorter)
kivi split input.mp4 out_dir \
    --uniform 10

# Keep only the first 60 seconds and drop the rest
# (with --drop_last, timestamps are treated as segment lengths)
kivi split input.mp4 out_dir \
    --timestamps 60 --drop_last

Output clips are named as:

<basename>_<idx>_<seconds>.mp4

where <idx> is a zero-padded index and <seconds> is the rounded duration of that segment.
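The naming scheme can be sketched as a small formatting helper. This is an illustration only: `clip_name` is a hypothetical function, and the padding width (3 here) and the starting index are assumptions, since the text above only says that the index is zero-padded.

```python
from pathlib import Path


def clip_name(input_path, idx, duration, pad=3):
    """Format <basename>_<idx>_<seconds>.mp4 for one clip.

    Hypothetical helper; the padding width is an assumption.
    """
    stem = Path(input_path).stem  # file name without directory or extension
    return f"{stem}_{idx:0{pad}d}_{round(duration)}.mp4"


print(clip_name("videos/input.mp4", 0, 10.0))
```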

API

kiui.video.read_video(path: str, mode: Literal['float', 'uint8', 'torch', 'tensor'] = 'float', order: Literal['RGB', 'BGR'] = 'RGB') → Tuple[ndarray | Tensor, float]

Read a video file into a tensor / numpy array.

Parameters:
  • path – Path to the video file.

  • mode – Returned data type:

    • "uint8": uint8 numpy array, [T, H, W, 3], range [0, 255]

    • "float": float32 numpy array, [T, H, W, 3], range [0, 1]

    • "torch" / "tensor": float32 torch tensor, [T, H, W, 3], range [0, 1]

  • order – Channel order, "RGB" or "BGR".

Returns:

  • video – Video frames in the requested format.

  • fps – Frames per second of the video.

Return type:

Tuple[ndarray | Tensor, float]

kiui.video.write_video(path: str, video: Tensor | ndarray, fps: float, order: Literal['RGB', 'BGR'] = 'RGB', codec: str = 'mp4v') → None

Write a video from frames.

Parameters:
  • path – Path to write the video file.

  • video – Video frames, [T, H, W, C] where C is 3 or 4. Can be numpy array (uint8 or float in [0, 1]) or torch tensor.

  • fps – Frames per second.

  • order – Channel order of the input frames, "RGB" or "BGR".

  • codec – FourCC codec string for OpenCV, e.g. "mp4v", "XVID".

kiui.video.get_video_info(path: str) → Dict[str, Any]

Inspect a video file using ffprobe and return metadata.

Requires ffmpeg / ffprobe to be installed in the system.

Parameters:

path – Path to the video file.

Returns:

A dict with the following keys:

  • path

  • width, height

  • fps

  • duration (seconds)

  • codec

  • codec_tag (fourcc / sample entry, e.g. "hvc1" or "hev1")

  • bitrate (bits per second)

  • filesize (bytes)

  • num_frames

  • raw_size (uncompressed RGB size in bytes)

  • compression_ratio (raw_size / filesize)

Return type:

Dict[str, Any]
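To make the returned fields concrete, here is a rough sketch of how such a dict could be assembled from ffprobe's JSON report. It is not kiui's implementation: `run_ffprobe` and `summarize` are hypothetical helpers, the choice of ffprobe fields is an assumption, and raw_size is assumed to mean 8-bit RGB (3 bytes per pixel).

```python
import json
import subprocess
from fractions import Fraction


def run_ffprobe(path):
    """Return ffprobe's JSON report for `path` (requires ffprobe on PATH)."""
    cmd = ["ffprobe", "-v", "error", "-print_format", "json",
           "-show_format", "-show_streams", path]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def summarize(path, probe):
    """Build a dict with the documented keys from an ffprobe report."""
    # Use the first video stream; audio and subtitle streams are skipped.
    video = next(s for s in probe["streams"] if s.get("codec_type") == "video")
    fmt = probe["format"]
    fps = float(Fraction(video["r_frame_rate"]))  # e.g. "30000/1001"
    duration = float(fmt["duration"])
    # nb_frames is not reported for every container; fall back to an estimate.
    num_frames = int(video.get("nb_frames") or round(duration * fps))
    filesize = int(fmt["size"])
    # Assumption: 8-bit RGB, i.e. 3 bytes per pixel per frame.
    raw_size = video["width"] * video["height"] * 3 * num_frames
    return {
        "path": path,
        "width": video["width"],
        "height": video["height"],
        "fps": fps,
        "duration": duration,
        "codec": video.get("codec_name"),
        "codec_tag": video.get("codec_tag_string"),
        "bitrate": int(fmt.get("bit_rate", 0)),
        "filesize": filesize,
        "num_frames": num_frames,
        "raw_size": raw_size,
        "compression_ratio": raw_size / filesize,
    }


# Usage (needs a real file and ffprobe installed):
# info = summarize("input.mp4", run_ffprobe("input.mp4"))
```

Keeping the subprocess call separate from the parsing step makes the parsing logic easy to check against a saved ffprobe report.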

kiui.video.print_video_info(path: str) → None

Pretty-print video (or image) information.

kiui.video.resize_video(input_path: str, output_path: str, width: int | None = None, height: int | None = None, codec: str | None = None, crf: int | None = None, preset: str = 'medium', fps: float | None = None) → None

Resize a video and save to a new file using ffmpeg.

Parameters:
  • input_path – Path to the input video.

  • output_path – Path to the output video.

  • width – Target width. If None, it will be inferred from height while keeping the aspect ratio.

  • height – Target height. If None, it will be inferred from width while keeping the aspect ratio.

  • codec – Video codec / encoder name for ffmpeg, e.g. "h264", "hevc", "libx264", "libx265", "mpeg4", "h264_nvenc", "hevc_nvenc". If None, try to pick a reasonable encoder based on the input codec.

  • crf – Constant Rate Factor (quality, lower is better) for CRF-based codecs (e.g. libx264 / libx265). If None, the function will try to roughly match the source video’s bitrate (scaled by resolution/fps) instead of using CRF. For "mpeg4", this is mapped to a quantizer value q:v internally when CRF is provided.

  • preset – ffmpeg preset, e.g. "slow", "medium", "fast".

  • fps – If not None, resample video to this FPS.
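The width / height inference described above amounts to a proportional scale. A minimal sketch, assuming simple rounding to the nearest integer; `infer_size` is a hypothetical helper, and the library may round differently (for example to even values, which some encoders and pixel formats require).

```python
def infer_size(src_w, src_h, width=None, height=None):
    """Fill in a missing target dimension while keeping the aspect ratio.

    Hypothetical helper for illustration; rounding policy is an assumption.
    """
    if width is None and height is None:
        return src_w, src_h  # nothing requested: keep the source size
    if width is None:
        width = round(src_w * height / src_h)
    elif height is None:
        height = round(src_h * width / src_w)
    return width, height


print(infer_size(1920, 1080, width=640))   # only the width is given
print(infer_size(1920, 1080, height=480))  # only the height is given
```

When both dimensions are given, they are used as-is, so the aspect ratio can change.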

kiui.video.split_video(input_path: str, output_dir: str, timestamps: Sequence[float], codec: str | None = None, crf: int | None = None, preset: str = 'medium', uniform: float | None = None, drop_last: bool = False) → None

Split a long video into shorter clips given split timestamps.

Parameters:
  • input_path – Path to the input video.

  • output_dir – Directory to save the clips.

  • timestamps – Sequence of timestamps in seconds. Ignored if uniform is not None.

    • If drop_last=False and uniform=None: treated as absolute boundaries; the first segment always starts at 0 and the last one ends at the video duration.

    • If drop_last=True and uniform is None: treated as segment lengths; the first segment is [0, timestamps[0]], the second is [timestamps[0], timestamps[0] + timestamps[1]], etc. Any remaining tail of the video is dropped.

  • codec – Video codec / encoder name for ffmpeg, e.g. "h264", "hevc", "libx264", "libx265", "mpeg4", "h264_nvenc", "hevc_nvenc". If None, try to pick a reasonable encoder based on the input codec.

  • crf – Constant Rate Factor (quality, lower is better). If None, the function will try to roughly match the source video’s bitrate.

  • preset – ffmpeg preset, e.g. "slow", "medium", "fast".

  • uniform – If not None, split the video into uniform segments of this many seconds. If drop_last=True, any remaining tail shorter than this interval is dropped. If drop_last=False, a final shorter segment is kept. Cannot be used together with explicit timestamps.

  • drop_last – Drop any remaining part of the video that is not explicitly covered by timestamps or a full uniform interval.
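The interaction between timestamps, uniform and drop_last can be summarized as a small planning function that returns (start, end) pairs in seconds. This is a sketch of the documented rules only, not the library's code: `plan_segments` is a hypothetical helper, and it assumes uniform takes precedence over timestamps when both are given.

```python
def plan_segments(duration, timestamps=(), uniform=None, drop_last=False):
    """Return (start, end) pairs following the documented splitting rules.

    Hypothetical helper for illustration; not kiui's implementation.
    """
    segments = []
    if uniform is not None:
        if uniform <= 0:
            raise ValueError("uniform must be positive")
        k = 0
        while k * uniform < duration:
            start = k * uniform
            end = min(start + uniform, duration)
            if drop_last and end - start < uniform:
                break  # drop the shorter tail
            segments.append((start, end))
            k += 1
        return segments
    if drop_last:
        # Timestamps are segment lengths; anything after them is dropped.
        start = 0.0
        for length in timestamps:
            end = min(start + length, duration)
            if end <= start:
                break
            segments.append((start, end))
            start = end
        return segments
    # Timestamps are absolute boundaries between 0 and the video duration.
    inner = [t for t in sorted(timestamps) if 0 < t < duration]
    bounds = [0.0] + inner + [duration]
    return list(zip(bounds[:-1], bounds[1:]))


print(plan_segments(35, timestamps=[10, 20, 30]))
print(plan_segments(120, timestamps=[60], drop_last=True))
```

Planning the segments before invoking the encoder also makes it easy to preview which clips a given set of arguments would produce.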