Similar Videos - enable option to use hardware acceleration in ffmpeg operations wherever possible #1447

Open
@mardab

Description

Feature Description

Since Similar Videos mode uses ffmpeg to dump video samples in a common (raw) format, and that part of the process takes the majority of the total time, I suggest adding UI options that enable hardware acceleration flags in ffmpeg to significantly reduce video hashing time.

Points to consider:

  • there are several acceleration methods available (note: the info in the Platform API Availability section is outdated, there are now more platforms partially covered by VA-API), though the most popular ones are VA-API for most currently supported video decoding hardware and NVDEC for NVIDIA
  • each acceleration API requires different ffmpeg flags, resulting in more work to implement and to maintain once supported
  • hardware acceleration is hard-limited to supported codecs, profiles, and features, and exits with an error on a mismatch; this could be alleviated by retrying the same video file on the CPU (a fallback for videos that cannot be accelerated) whenever the "accelerated" ffmpeg call exits with an error
  • with hardware acceleration, worker management becomes more complex, since a large portion of hardware setups contain 2 or more hardware video units, e.g. an AMD/Intel CPU (VA-API) with an AMD (VA-API) or NVIDIA (NVDEC) dedicated GPU; to get the most speedup, worker management would have to address those "special" workers individually, with all previous points taken into consideration
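
To make the fallback and per-device points above more concrete, here is a minimal sketch in Python (not the project's actual code). It assumes the `ffmpeg` binary is on `PATH` and uses the real `-hwaccel` and `-hwaccel_device` input options. Everything else is hypothetical and only for illustration: the helper names, the `HWACCEL_FLAGS` table, and the output options (10 grayscale raw frames), which are placeholders rather than what Similar Videos really requests.

```python
import subprocess
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical table: backend name -> ffmpeg input options that select it.
HWACCEL_FLAGS = {
    "vaapi": ["-hwaccel", "vaapi"],
    "cuda": ["-hwaccel", "cuda"],  # NVDEC decoding is reached through the cuda hwaccel
    "cpu": [],                     # plain software decoding
}

Runner = Callable[[Sequence[str]], Tuple[int, bytes]]


def build_cmd(path: str, backend: str, device: Optional[str] = None) -> list:
    """Build an ffmpeg command that dumps a few raw frames to stdout."""
    flags = list(HWACCEL_FLAGS[backend])
    if device is not None and backend != "cpu":
        flags += ["-hwaccel_device", device]
    # hwaccel options are input options, so they must come before -i.
    # The output options below are illustrative placeholders.
    return ["ffmpeg", "-v", "error", *flags, "-i", path,
            "-frames:v", "10", "-f", "rawvideo", "-pix_fmt", "gray", "-"]


def run_ffmpeg(cmd: Sequence[str]) -> Tuple[int, bytes]:
    """Run ffmpeg and return (exit code, raw frame bytes)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    return proc.returncode, proc.stdout


def decode_with_fallback(
    path: str,
    attempts: Sequence[Tuple[str, Optional[str]]] = (("vaapi", None), ("cpu", None)),
    runner: Runner = run_ffmpeg,
) -> Tuple[str, bytes]:
    """Try each (backend, device) pair in order; return the first that succeeds."""
    for backend, device in attempts:
        code, data = runner(build_cmd(path, backend, device))
        if code == 0:
            return backend, data
    raise RuntimeError(f"ffmpeg failed on every backend for {path}")
```

A worker bound to a specific device could be given its own attempt list, for example `[("vaapi", "/dev/dri/renderD128"), ("cpu", None)]` for one GPU and `[("cuda", "0"), ("cpu", None)]` for another. The device values here are typical examples, not detected ones.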

Reasoning for this FR:

While I was slowly writing and editing the text above, a freshly built krokiet set to 7 threads on a PC with an 8-thread 2nd-generation Ryzen managed to hash 8% of the videos on topped-up external storage. I/O was not saturated, but the CPU often hit 100% (together with the browser window I'm writing this in), while my 2 VA-API devices (iGPU and dGPU) sat basically idle.

The idea comes from my past use of VA-API in ffmpeg batch jobs, where an accelerated transcode ran 3-5x faster per device than 7-thread CPU processing. So even if a non-negligible portion of the videos had to fall back to CPU processing (which I'm certain is not the case for my current collection), the time saved when decoding just small portions of videos for hashing could greatly reduce total processing time, and I'm certain the gain would not be minor.
