What every coder should know about gamma (2016)
Posted by sph 3 days ago
Comments
Comment by 20k 1 day ago
>The transformation used to represent the physically linear intensity data either generated synthetically via an algorithm or captured by a linear device (such as a CMOS of a digital camera or a scanner) with the discrete values of the perceptually linear scale is called gamma encoding.
This isn't super correct, and it underscores the biggest issue in this article:
sRGB (and its gamma encoding function) has absolutely nothing to do with perceptual linearity. sRGB is not perceptually linear! The original gamma encoding, as far as I'm aware, was made to compensate for the nonlinear transfer function of CRTs back in ye olde days. It's true that human vision is nonlinear, but sRGB is not a particularly good match to the perceptual linearity of human vision. It's a really common error to make, and it leads to people wondering why we can't blend in sRGB if the reason it was invented is that it's perceptually linear
The article goes on to compound this mistake, which is why this is such a problematic misconception:
> Interestingly, Photoshop antialiases text using γ=1.42 by default, and this indeed seems to yield the best looking results (middle image). The reason for this is that most fonts have been designed for gamma-incorrect font rasterizers, hence if you use linear space (correctly), then the fonts will look thinner than they should.
This is where the mistakes start to add up
Consider what you're trying to achieve during antialiasing: when rasterising a line, let's say we discover that a pixel is only 40% covered and want to darken it. This means that we want our pixel's brightness to decrease by 40% to a human being. We don't want to emit 40% less light, because that's not what antialiasing is trying to achieve!
Both sRGB and linear colour are the wrong colour spaces to use. You want to blend in a perceptually linear colourspace, and Photoshop's 1.42 gamma exponent probably maps better to human vision than 2.2 or 1.0 while being cheaper than a LUV conversion
>The standard gamma (γ) value to use in computer display systems is 2.2. The main reason for this is because a gamma of 2.2 approximately matches the power law sensitivity of human vision
The gamma transfer functions are also wrong. It's worth getting hung up on because it actually causes nontrivial errors, especially in the age of hardware accelerated sRGB conversions where doing it correctly is free
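For reference, a minimal sketch of the piecewise sRGB transfer functions, using the constants from the sRGB standard. The helper `effective_gamma` is a name made up for this example; it shows that no single exponent reproduces the curve, which is one way to see why a plain 2.2 power law is only an approximation.

```python
import math


def srgb_to_linear(v: float) -> float:
    """Decode an sRGB-encoded value in [0, 1] to linear light."""
    if v <= 0.04045:
        return v / 12.92  # linear segment near black
    return ((v + 0.055) / 1.055) ** 2.4


def linear_to_srgb(y: float) -> float:
    """Encode linear light in [0, 1] with the piecewise sRGB curve."""
    if y <= 0.0031308:
        return y * 12.92
    return 1.055 * y ** (1 / 2.4) - 0.055


def effective_gamma(v: float) -> float:
    """Illustrative helper: exponent g with v ** g == srgb_to_linear(v), for 0 < v < 1."""
    return math.log(srgb_to_linear(v)) / math.log(v)


for v in (0.02, 0.1, 0.5, 0.9):
    print(f"encoded {v:.2f}: effective exponent {effective_gamma(v):.3f}")
```

The printed exponent drifts with the input value, so treating sRGB as "gamma 2.2" is closest in the midtones and furthest off near black.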
Comment by vektorsigma 1 day ago
When doing a smooth fade-to-black in video, you may want to gradually decrease the amount of emitted light from the whole frame in a way that is smooth to a human. Here I think you should consider how a perceptual space can help.
Comment by 20k 1 day ago
The antialiasing value you get represents how much the pixel is covered by the glyph in question, and directly represents a desired change in perceptual brightness. There's no physical underlying lighting process, so it doesn't make sense to use physical light units
Blending in linear RGB models different strength lights being mixed together. This is why you do want to blend images together in linear RGB, but not fonts - because it's not an underlying light-based transport process
To take a direct example: Imagine two cases
1. We blend a white font on a black background
2. We blend a black font on a white background
Using perceptual blending, the antialiasing will be exactly the same efficacy in both cases. Using blending in linear space, these two test cases will look very different and render incorrectly!
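One way to put numbers on the two test cases is sketched below. It assumes CIE 1976 L* is an acceptable stand-in for perceived lightness and models each blending space as a pure power law (1.0 for linear light, 1.42 for the Photoshop value quoted above, 2.2 for a typical display gamma). The function names are invented for this sketch.

```python
def cie_lightness(y: float) -> float:
    """CIE 1976 L* (0..100) from relative luminance y in [0, 1]."""
    delta = 6 / 29
    if y > delta ** 3:
        f = y ** (1 / 3)
    else:
        f = y / (3 * delta ** 2) + 4 / 29
    return 116 * f - 16


def blend(fg: float, bg: float, coverage: float, gamma: float) -> float:
    """Mix linear luminances fg and bg inside a power-law space.

    gamma=1.0 blends in linear light; larger values blend in a more
    compressed encoding. The result is returned as linear luminance.
    """
    mixed = coverage * fg ** (1 / gamma) + (1 - coverage) * bg ** (1 / gamma)
    return mixed ** gamma


def lightness_step(fg: float, bg: float, coverage: float, gamma: float) -> float:
    """How far (in L*) a partially covered pixel moves away from the background."""
    return abs(cie_lightness(blend(fg, bg, coverage, gamma)) - cie_lightness(bg))


for gamma in (1.0, 1.42, 2.2):
    light_on_dark = lightness_step(1.0, 0.0, 0.4, gamma)  # 40% white on black
    dark_on_light = lightness_step(0.0, 1.0, 0.4, gamma)  # 40% black on white
    print(f"gamma {gamma}: white-on-black {light_on_dark:.1f} L*, "
          f"black-on-white {dark_on_light:.1f} L*")
```

If the two numbers on a row are far apart, the same coverage value produces a visibly different edge in light mode and dark mode under that blending space, at least by this L* measure.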
Comment by penteract 21 hours ago
I disagree with you here. Text rendering specifically is incredibly complicated, but for antialiasing in other contexts, the problem can be seen as trying to approximate what would be seen if the display had higher resolution and the viewer has blurry eyesight. In this model, a linear color space makes sense - if 60% of the pixels within a region on a higher dpi display would be lit, then that is best approximated* by a single pixel emitting the same number of photons as those pixels would (which is 60% as many photons as there should be in the situation where all pixels on the higher dpi display would be lit).
See https://en.wikipedia.org/wiki/Spatial_anti-aliasing#Anti-ali... .
*there are better filters if you're looking at more than one pixel at once
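The "higher resolution display seen through blurry eyes" model can be sketched as a box filter: average the light of the hypothetical subpixels in linear space, then encode the result for an sRGB framebuffer. The subpixel list and the function name `downsample_to_srgb8` are assumptions made for this illustration.

```python
def linear_to_srgb(y: float) -> float:
    """Encode linear light in [0, 1] with the piecewise sRGB curve."""
    if y <= 0.0031308:
        return y * 12.92
    return 1.055 * y ** (1 / 2.4) - 0.055


def downsample_to_srgb8(subpixels: list[float]) -> int:
    """Average linear-light subpixel intensities and return one 8-bit sRGB code."""
    mean_light = sum(subpixels) / len(subpixels)
    return round(255 * linear_to_srgb(mean_light))


# 3 of 5 subpixels lit: the single pixel should emit 60% of the full light.
lit = [1.0, 1.0, 1.0, 0.0, 0.0]
print("linear-light average:", downsample_to_srgb8(lit))
print("naive average of encoded values:", round(255 * sum(lit) / len(lit)))
```

With these inputs the linear-light path gives code 203, while averaging the encoded values gives 153, which emits noticeably less than 60% of the light.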
> Using perceptual blending, the antialiasing will be exactly the same efficacy in both cases. Using blending in linear space, these two test cases will look very different and render incorrectly!
Whatever color space you do your blending in, 40% black onto white should look the same as 60% white onto black.
Comment by 20k 19 hours ago
This assumes that the output from the coverage process represents a semi transparent line with a light shining through it, which isn't what font rendering outputs. It outputs a perceptual brightness, because if a cell is 50% covered, we want it to be 50% dark. Not emitting 50% of the photons
>Whatever color space you do your blending in, 40% black onto white should look the same as 60% white onto black.
What you want is 40% black onto white to have a similar difference in intensity as 40% white onto black, otherwise your darkmode font will look significantly different at the same intensity as your lightmode font. This is why it doesn't make sense to do it in a linear colourspace
Note that the Wikipedia article is wrong, given that Photoshop uses a nontrivial gamma exponent
Comment by penteract 11 hours ago
Thanks for putting this clearly. I had not given this argument enough thought and respect previously. Would you agree if I said this is about maximizing the amount of useful information given to the reader (even if it deviates from approximating a printed page) and a perceptual colour space is the way to measure that information?
I should mention that I can find plenty of resources that suggest you should use a different font for dark-on-light vs light-on-dark (although I'm aware I'm not a good judge of the quality of said resources). This is not necessarily opposed to your point, since your reasoning can be extended to conclude that identically shaped printed text subject to blurring in linear colour space would be perceived differently depending on whether it's light-on-dark or dark-on-light (including when it's naturally blurred due to imperfect eyesight).
> Note that the Wikipedia article is wrong, given that photoshop uses a nontrivial gamma exponent
If we set text rendering aside, and consider something like games which prioritize photorealism rather than legibility, would you agree that linear colour space is the sensible one to do antialiasing in? This is for essentially the same reason you should do image resizing in linear color space, for which Wikipedia's citation [6] provides a convincing demonstration.
Comment by brookst 23 hours ago
Comment by adrian_b 1 day ago
The reason why gamma has been preserved in digital television even after CRTs have become obsolete is that it happens to perform a dynamic range compression that allows the use of 8 bits for luminance or for color components without making too visible the steps between adjacent color values.
If you want to encode the color components linearly, you need to use more than 8 bits, preferably the FP16 format, which was originally introduced in GPUs especially for this purpose.
So today the only purpose of gamma is as a method of data size compression that is specific to images, by allowing the reduction of the number of bits per pixel, while keeping acceptable the degradation of the image quality.
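A rough way to see the compression effect is to count how many of the 256 available codes land in the darker part of the range under each encoding. The sketch below uses 18% relative luminance ("mid grey") as the threshold, which is an assumption chosen for illustration, and the piecewise sRGB curve as the example gamma encoding.

```python
def srgb_to_linear(v: float) -> float:
    """Decode an sRGB-encoded value in [0, 1] to linear light."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


MID_GREY = 0.18  # assumed threshold: relative luminance of photographic mid grey


def codes_below(decode, threshold: float) -> int:
    """Count the 8-bit codes whose decoded luminance is below the threshold."""
    return sum(1 for code in range(256) if decode(code / 255) < threshold)


linear_codes = codes_below(lambda v: v, MID_GREY)  # 8-bit linear encoding
srgb_codes = codes_below(srgb_to_linear, MID_GREY)  # 8-bit sRGB encoding
print(f"codes darker than mid grey: linear={linear_codes}, sRGB={srgb_codes}")
```

The gamma-encoded version spends far more of its codes on the shadows, which is where steps between adjacent values are easiest to see.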
It is probable that the standard gamma curves are not optimal for data compression, but the slight improvements in image quality that could have been attained with other curves are not worth the complications that would have been created by abandoning the compatibility with legacy recordings.
Comment by arghwhat 22 hours ago
Even within "SDR"/"sRGB", many mistakes crop up from people erroneously mixing content encoded with the piecewise sRGB transfer function with content encoded according to a plain gamma 2.2 transfer function. And this is before we are getting into e.g., incorrect blending spaces or mismatched primaries.
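To get a feel for the size of that mismatch, the sketch below takes 8-bit codes authored for a pure gamma 2.2 display and computes the code that would produce the same light under the piecewise sRGB curve. The function name `reencode_gamma22_as_srgb` is made up for this example; when software skips this conversion and treats the two encodings as identical, the gap between the two columns is the error.

```python
def linear_to_srgb(y: float) -> float:
    """Encode linear light in [0, 1] with the piecewise sRGB curve."""
    if y <= 0.0031308:
        return y * 12.92
    return 1.055 * y ** (1 / 2.4) - 0.055


def reencode_gamma22_as_srgb(code: int) -> int:
    """Map an 8-bit gamma 2.2 code to the sRGB code with the same linear light."""
    linear = (code / 255) ** 2.2
    return round(255 * linear_to_srgb(linear))


for code in (5, 10, 20, 40, 128, 255):
    print(f"gamma 2.2 code {code:3d} -> sRGB code {reencode_gamma22_as_srgb(code):3d}")
```

The two encodings agree closely in the midtones and highlights and diverge in the darkest codes, so the mistake tends to show up as crushed or lifted shadows.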
But yes, it is purely a matter of compression, with many options for exactly what dynamic range you need and how you want your content defined (e.g., sRGB, gamma2.2, scRGB, HLG, PQ, ...), with linear light primarily reserved as an intermediate space for color conversions and blending - something your display server and any software working with arbitrary color spaces will be using.
Comment by adrian_b 22 hours ago
Such differences in standards already existed in analog television, because, depending on how they were made, the CRTs also had slightly different transfer curves from grid voltage (where the video signal was applied) to anode current (which is proportional to the luminance of the pixel component), and the regional TV standards accounted for the dominant manufacturers of the CRTs sold in that region.
Comment by arghwhat 20 hours ago
NTSC was gamma 2.2, and PAL/SECAM was gamma 2.8, which was indeed initially partly caused by local manufacturing differences before international brands took over, but neither "standard" was really followed by anyone. In the end, concluding that it was all a total mess, we split the difference in the early 90's by formally defining both to gamma 2.4 in BT.709. As such, their curves are the same.
(Manufacturing derivation was outside the scope, as manufacturers did whatever was convenient or sold sets, going all over the place with their response curves regardless of what region they were from or targeted. This remains true today - see any new TV's standard color response.)
Comment by evilturnip 1 day ago
This is why when these older games crashed on your PC, the monitor would look all washed out due to the manual gamma adjustments the game made that didn't get restored.
Comment by Dwedit 1 day ago
Comment by joedrago 1 day ago
Comment by joedrago 1 day ago
Comment by zokier 1 day ago
Comment by raphlinus 1 day ago
The gradient examples between high-chroma colors of similar luminance are highly misleading in my opinion. In that particular case, linear just happens to do well (and device RGB of course poorly), but in other cases linear is not great. For example, blue to white is especially bad, with hue shifts as well as lightness non-uniformity.
You can experiment with this in the interactive tester in my Oklab review[1].
[1]: https://raphlinus.github.io/color/2021/01/18/oklab-critique....
Comment by leni536 1 day ago
Comment by Espressosaurus 1 day ago
Comment by leni536 1 day ago
Comment by shaggie76 1 day ago
Comment by tomhow 1 day ago
What every coder should know about gamma (2016) - https://news.ycombinator.com/item?id=27721094 - July 2021 (50 comments)
What every coder should know about gamma - https://news.ycombinator.com/item?id=12552094 - Sept 2016 (183 comments)
Comment by dmitshur 1 day ago
It's interesting how this part of the trade-off changes when using float16 for color components (as is common when HDR is involved) rather than uint8.
Good timing that Safari 27 adds support for srgb-linear and display-p3-linear color spaces.
Comment by keynha 1 day ago
Comment by dheera 1 day ago
I felt the first one looked more even. On the first I could tell the difference between every two adjacent bars. On the second one I couldn't tell any difference between the first 4-5 bars.
Comment by tobr 1 day ago
Comment by account42 1 day ago
Comment by hatthew 1 day ago
Comment by Const-me 19 hours ago
Can’t reproduce. Tested on two monitors on my desk, a designer-targeted Benq and a cheap laptop. On the Benq, the darkest 3 segments are indistinguishable and the 4th is barely distinguishable. On the laptop, the darkest 4 segments are indistinguishable and the 5th is barely distinguishable. However, on the “emitted light intensity” one, all bars are clearly visible.
> Image resizing
“Unsurprisingly, C gives the correct result” On my computers, B is very similar to A, just a tiny bit darker, while the “correct” C result is a lot lighter than A.
Also from the same section:
> B the result of resizing the pattern by 50% directly in sRGB-space (using bicubic interpolation)
Bicubic interpolation is only applicable when enlarging images; downsampling is a very different problem from interpolation.
Comment by jbritton 1 day ago
Comment by blt 1 day ago
Comment by mianos 1 day ago
An exponential/log function requires arbitrary clamping or offsets because you cannot represent pure black, 0, on a pure log scale without hitting negative infinity.
Basically, it fits better, aside from a good map of human perception
Comment by LoganDark 1 day ago
Technically, this is not always incorrect, if your working color space is linear and 0 is no light. The problem only comes if you hand that same data to routines or surfaces expecting sRGB or another nonlinear color space (or one where 0 is not no light).
Comment by nomel 1 day ago
Oh, interesting. What's an example of this? Some sort of log space?
Comment by Someone 1 day ago
The bevel of a black iPhone is darker than its screen, even when powered off. Similarly, switched off CRT displays aren’t truly black.
Comment by LoganDark 1 day ago
Comment by account42 1 day ago
Comment by LoganDark 23 hours ago
Comment by LoganDark 1 day ago
Comment by pavlov 1 day ago
The most common 8-bit YUV format (e.g. in MPEG-2) uses a 16-235 range for valid luma values, so black is at 16 and white is at 235.
The reason for leaving this “headroom” and “footroom” had to do both with digitizing analog signals and avoiding clipping during processing.
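As a small illustration of that range, here is a sketch of converting 8-bit luma between full range (0-255) and the 16-235 video range. It covers luma only, the function names are invented for this example, and clamping out-of-range input on the way back is one possible policy rather than the only one.

```python
def full_to_video_luma(code: int) -> int:
    """Map full-range luma 0..255 onto the 16..235 video range."""
    return round(16 + code * 219 / 255)


def video_to_full_luma(code: int) -> int:
    """Map 16..235 video-range luma back to 0..255, clamping footroom and headroom."""
    scaled = round((code - 16) * 255 / 219)
    return max(0, min(255, scaled))


for value in (0, 128, 255):
    video = full_to_video_luma(value)
    print(f"full {value:3d} -> video {video:3d} -> full {video_to_full_luma(video):3d}")
```

Values below 16 or above 235 can still appear in a video signal (for example after filtering overshoot), which is why the return path has to decide what to do with them.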
Comment by nicebyte 1 day ago
at the fundamental level, if a surface is illuminated with one lightbulb and we add another light bulb, the difference is extremely noticeable to the human eye. if we add one more lightbulb to a surface that is already illuminated by a hundred light bulbs, there will be no perceptible difference. the exact response can be modeled with a pretty simple power law (with a modification in the low range, as the article mentions).
that's all there really is to "gamma correction". it's a hack that exploits this quirk of the human visual system in order to more efficiently allocate bits for encoding different "lightness" values.
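the lightbulb example can be sketched with a plain power law. the exponent 1/2.2 and the 101-bulb reference level are assumptions picked for illustration, not measured values, and `perceived` is a made-up name.

```python
def perceived(bulbs: float, reference_bulbs: float = 101.0,
              exponent: float = 1 / 2.2) -> float:
    """Toy power-law model of perceived brightness, scaled so the reference maps to 1.0."""
    return (bulbs / reference_bulbs) ** exponent


first_step = perceived(2) - perceived(1)      # adding a second bulb
last_step = perceived(101) - perceived(100)   # adding a bulb to a hundred
print(f"1 -> 2 bulbs:     {first_step:.4f}")
print(f"100 -> 101 bulbs: {last_step:.4f}")
```

the same physical increment of one bulb produces a much larger change at the dark end of the model than at the bright end, which is why a gamma encoding puts its codes closer together near black.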
all of the confusion and bugs stem from one or more of the systems in the chain that forms the final image, making an incorrect assumption about what the others are doing. it's a bit like coordinate spaces in that regard.
Comment by nicebyte 1 day ago