How to extract relevant frames from a YouTube (or any) video?

I am working on a video summarization task and want to extract relevant static frames from the video.

The idea is: the user describes their interests, and the AI summarizes the video in text and complements the text summary with relevant images extracted from the video.

Ex:

"how to grill a steak" - images of BBQ, unpacking, spicing, temperature measurement, flares and then final result.

"top investment advisers" - face shots of top advisers, snapshots of charts of their performance, etc.
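
To make the goal concrete, the selection step could be sketched roughly as below. This is only an illustration under assumptions, not a tested solution: it assumes frames have already been sampled from the video at known timestamps, and that some text/image embedding model (not specified here) has turned both the user's description and each frame into vectors of the same size. The names `select_frames`, `top_k`, and `min_gap_s` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# A sampled frame: (timestamp in seconds, embedding vector).
# ASSUMPTION: embeddings come from some joint text/image model that is
# not part of this sketch; here they are just lists of floats.
Frame = Tuple[float, Sequence[float]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_frames(
    query_vec: Sequence[float],
    frames: List[Frame],
    top_k: int = 5,
    min_gap_s: float = 10.0,
) -> List[float]:
    """Return timestamps of the frames most similar to the query.

    Frames closer than min_gap_s to an already chosen frame are skipped,
    so the result is not several near-identical shots of one scene.
    """
    ranked = sorted(frames, key=lambda f: cosine(query_vec, f[1]), reverse=True)
    chosen: List[float] = []
    for timestamp, _ in ranked:
        if all(abs(timestamp - c) >= min_gap_s for c in chosen):
            chosen.append(timestamp)
        if len(chosen) == top_k:
            break
    return sorted(chosen)  # chronological order for the summary


# Toy data: 2-D vectors standing in for real embeddings.
toy_frames = [
    (0.0, [1.0, 0.0]),
    (5.0, [0.9, 0.1]),
    (30.0, [0.8, 0.2]),
    (60.0, [0.0, 1.0]),
]
picked = select_frames([1.0, 0.0], toy_frames, top_k=2, min_gap_s=10.0)
```

With the toy data, the frame at 5 s is skipped because it is too close to the better match at 0 s. The open part is everything around this step: how to sample frames and which model to use for the embeddings.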

Looking for ideas on approaches to accomplish this.
