
I recently downloaded an "avif" thinking that I was downloading a gif. A little annoyed, I started poking around at it with ffmpeg and discovered that the file contained two video streams. I extracted each with -c copy into its own mkv so I could play them separately.
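Roughly like this, if anyone wants to reproduce it (the stream order and filenames are just what my file happened to have):

    ffmpeg -i input.avif -map 0:v:0 -c copy first.mkv
    ffmpeg -i input.avif -map 0:v:1 -c copy second.mkv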

The first video was the content I wanted; the second was solid white, running for the same length of time as the first. I was honestly a little flummoxed by the white stream for about 15 seconds before it hit me: "this must be an alpha channel".

I don't think I have ever seen it in action, so I am extremely curious how well alpha channels can turn out in lossy video formats, where the result of any given frame is a matter of interpretation. In lossless formats like GIF the borders of objects in any given frame are perfectly defined, but in lossy formats, especially ones using a discrete cosine transform, where the object ends and the background begins is not clear cut.



GIF only has binary transparency. Period. Also, GIF is only lossless if you don't care about file size (using the multiple-0-duration-frames-with-different-palettes trick) or the material already has a limited palette.

From my testing, VP9 videos with transparency are fine if you aren't stingy with bitrate, and in general, if the source material isn't CGI, things will be crusty at the edges anyway (e.g. greenscreen footage with motion blur).


> multiple 0 duration frames with different palettes trick

What's that?


You get 256 colors per frame. Want more colors? Use more frames! There's some really impressive software/gifs at https://gif.ski/ if you want to see just how far it's possible to push this terrible format.
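A minimal sketch of the trick with Pillow (my own illustration, not gif.ski's actual code; it assumes a recent Pillow that writes per-frame local color tables):

    from PIL import Image

    src = Image.open("photo.png").convert("RGB")
    w, h = src.size

    # Frame 1: left half quantized to its own 255-color palette
    # (index 255 stays free for transparency in every frame).
    left = src.crop((0, 0, w // 2, h)).quantize(colors=255)
    frame1 = Image.new("P", (w, h))
    frame1.putpalette(left.getpalette())
    frame1.paste(left, (0, 0))

    # Frame 2: right half with a *different* 255-color palette;
    # the rest is index 255 (transparent) so frame 1 shows through.
    right = src.crop((w // 2, 0, w, h)).quantize(colors=255)
    frame2 = Image.new("P", (w, h), color=255)
    frame2.putpalette(right.getpalette())
    frame2.paste(right, (w // 2, 0))

    frame1.save(
        "out.gif",
        save_all=True,
        append_images=[frame2],
        duration=0,        # 0 ms delay: both frames appear "at once"
        disposal=1,        # do not dispose: frame 2 overlays frame 1
        transparency=255,  # index 255 is see-through
        loop=0,
    )

Two frames gets you up to ~510 distinct colors; more frames, more palettes. (Some decoders clamp 0-duration frames to a visible delay, which is one reason the trick is fragile.)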


That's amazing.


In theory, AVIF supports palette-based blocks, so it can express perfectly sharp edges and even transcode a GIF with perfect accuracy. In practice these modes are not used.

A blurry alpha channel just means soft edges. A naive encoder will let some of the previously transparent background bleed into visible pixels, usually causing darkened halos around the edges. This is a common problem in video game assets, and the fixes are the same: either bleed color around the edges (give transparent pixels the right RGB values), or use a premultiplied-alpha color space. AVIF (also in theory) supports marking images as having premultiplied alpha.
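A back-of-the-envelope illustration of that darkening, in plain Python with linear 0-to-1 color (my own numbers, nothing AVIF-specific):

    # an opaque red pixel next to a fully transparent one,
    # straight (non-premultiplied) RGBA
    red  = (1.0, 0.0, 0.0, 1.0)
    void = (0.0, 0.0, 0.0, 0.0)   # transparent; its RGB happens to be black

    # a naive scaler/encoder averages the two samples channel by channel
    r, g, b, a = ((x + y) / 2 for x, y in zip(red, void))  # (0.5, 0, 0, 0.5)

    white = (1.0, 1.0, 1.0)
    naive = tuple(w * (1 - a) + c * a for w, c in zip(white, (r, g, b)))
    print(naive)    # (0.75, 0.5, 0.5): too dark -- that's the halo

    # premultiplied, the same average is already multiplied by alpha,
    # so compositing is out = dst*(1-a) + rgb
    premult = tuple(w * (1 - a) + c for w, c in zip(white, (r, g, b)))
    print(premult)  # (1.0, 0.5, 0.5): exactly "half red over white"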


Video compression should only cause issues at the edges due to chroma subsampling, in which case bleeding chroma into transparent pixels would help. All other cases would be errors in the pipeline, such as incorrect gamma handling, a wrong supersampling implementation, or mixing up whether content is premultiplied or not.
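For the bleeding part, something like this is the usual approach (a rough numpy sketch; the function name and iteration count are mine):

    import numpy as np

    def bleed_edges(rgba, iterations=8):
        # Push RGB from visible pixels into neighboring fully
        # transparent ones, so chroma subsampling and filtering never
        # mix in the undefined (usually black) color under the alpha.
        # rgba: float array of shape (H, W, 4), straight alpha.
        rgb = rgba[..., :3].copy()
        known = rgba[..., 3] > 0
        for _ in range(iterations):
            acc = np.zeros_like(rgb)
            cnt = np.zeros(rgb.shape[:2])
            for shift in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                # unknown pixels with a known neighbor in this direction
                near = ~known & np.roll(known, shift, axis=(0, 1))
                acc[near] += np.roll(rgb, shift, axis=(0, 1))[near]
                cnt[near] += 1
            fill = cnt > 0
            rgb[fill] = acc[fill] / cnt[fill][:, None]
            known |= fill
        out = rgba.copy()
        out[..., :3] = rgb
        return out   # (np.roll wraps at borders; fine for a sketch)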

Also premultiplication is just a way to cache the multiplication operation:

    // using normal image
    // note that everything after the plus depends entirely on the image
    out = current * (1-img.a) + vec4(img.r*img.a, img.g*img.a, img.b*img.a, img.a)
    // using premultiplied image
    out = current * (1-img.a) + img
Which might make sense if you're drawing the same bitmap very many times.


I didn't mention it in the article, but the web component supports premultiplied alpha: https://github.com/jakearchibald/stacked-alpha-video?tab=rea...


GIF is not lossless. It reduces the palette to 256 colors, down from the normal 16.7 million colors of 8-bit videos or images.


As a little-known quirk of history:

    a statement that the GIF image file format is limited to 256 colors is simply false.
https://web.archive.org/web/20140908023322/http://phil.ipal....

How it does it is pretty sneaky.


Wow, that’s fascinating. I have a tiny macOS app that makes GIFs, which I use for screen recordings. It has occurred to me before now that it produces surprisingly good-quality images with no visible dithering; I wonder if it’s doing this.

(It still feels very stupid to have to do it, but until Google Docs lets you embed a silent autoplaying mp4…)


By that definition a 24-bit image is "lossy" relative to 32-bit images, and 32-bit images are lossy relative to 64-bit images, ad infinitum, and lossless formats don't exist.

A gif is lossless in that it does not lose data. When you define a frame within the format's limitations, you will always receive that exact frame in return.

The conversion might have lost data, but that's not the gif losing it. That's just what happens when you try to fit a larger domain into a smaller one.

The individual frames within a gif are bitmaps that do not lose detail across saves, extractions, rotations, etc. Each individual pixel is displayed exactly as originally defined.

Compare this to JPEG or MP4 where your individual pixel becomes part of a larger cosine field. There is no guarantee this pixel will exist and even if it does, its exact value is almost certainly changed. This is what "lossy" is.
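This is easy to verify with Pillow and numpy (a quick sketch; assumes a single-frame GIF, and the filenames are placeholders):

    from PIL import Image
    import numpy as np

    # GIF round trip: every pixel comes back exactly
    Image.open("frame.gif").save("copy.gif")
    a = np.array(Image.open("frame.gif").convert("RGB"))
    b = np.array(Image.open("copy.gif").convert("RGB"))
    print((a == b).all())   # True

    # JPEG round trip: re-encoding re-quantizes the DCT blocks
    Image.open("photo.jpg").save("copy.jpg", quality=95)
    c = np.array(Image.open("photo.jpg"))
    d = np.array(Image.open("copy.jpg"))
    print((c == d).all())   # almost certainly False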


> By that definition 24-bit image is "lossy" to 32-bit images, and 32-bit images lossy to 64-bit images

Only assuming your base images have that many bits of information.

Most cameras max out at 16 bits per channel when shooting RAW, and even those mostly have noise in the lower bits.

I'm sure you can find an example of some supercooled telescope sensor that actually does capture 16 bits per channel, maybe even more.

In the real world your image source is usually either JPG (24 bits per pixel, already debayered and already lossy) or RAW (16 bits per pixel max, bayered).


In realistic scenes, dynamic range is more important than precision. That is, you can frequently find things that are many orders of magnitude brighter than one another, and representing that properly involves lots of bits, used well.


Yep, that second stream is the alpha channel. Lossy alpha channels have been used since VP8 (~15 years I think). They seem pretty well tested. You can also see examples in the article.


If you want a lossless format with animation and alpha channel and gigantic file sizes, there's always APNG



