Videos are typically already summarized substitutes for complex topics, topics for which you might otherwise need to read the text or literature to get the full context. Now we want to min-max and summarize the videos themselves. Then what? If that video summary is too long, do we throw it into another LLM to summarize the summary of the summary? Where does this end?
There's more to learning than just information density. There are visuals, presentations, explanations. And if you want more proof: a video played at 2x speed has twice the information density, yet we all know it would be extremely hard to retain anything from many videos at that speed.
Lots of YouTube videos are not the well-organized presentations you describe, but instead have a minute or two of good information with ten minutes of meandering asides, background you already know, and other fluff. Some are well-disguised clickbait. A good summary saves a lot of wasted time.
As for the good videos, I can't watch them all. If I can skim a good summary, I can decide whether, for me, it's worth watching the video, ignoring it, or just reading the summary more carefully.