Most people don't realize that bilinear filtering of an RGBA8 (fixed point, 8 bits per channel) texture produces a fixed point value.
What I mean is extremely simple. RGB8/RGBA8 textures are a standard format, used practically everywhere. In shaders, you usually sample a texture like this (GLSL):
vec4 color = texture2D(tex, uv);
If filtering has been enabled on the texture, the values in 'color' are not full fp32 precision. They are still 8 bits.
I blame this on legacy hardware. I have confirmed this behavior on both NVidia and ATI cards, including the modern ones.
Of course, it only becomes visible when you sum up a lot of textures together, or when you scale a value by a huge factor.
I shudder when I think of all the people using lookup tables or scaled textures in RGBA8 who don't realize what they are doing.
Here is an example. First, a picture of a relatively dark area of a texture, heavily magnified (needed to demonstrate filtering, after all). There are blocky square artifacts coming from the JPEG compression; ignore them, and just verify in Photoshop or your preferred image editor that the pixel values form a smooth gradient, decreasing 1 by 1 (actually by 1/255) between adjacent pixels:
No scaling
Now the pixel shader simply multiplies this pixel by a constant value of 5. Notice how the pixel values now jump in steps of 5 (that is, 5/255) instead of 1:
Scaling x5
To fix this problem, I know of two ways:
1. Perform the bilinear filtering yourself: the shader pipeline is fully 32-bit, so although you are sampling 4 texels stored in 8 bits, the bilinear interpolation between them is done in full precision. This has a high performance cost.
2. Use an fp16 or fp32 internal format (at the cost of additional video memory).
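As a rough illustration of the first option, here is a minimal sketch of manual bilinear filtering in a fragment shader. It is not the exact shader used for the screenshots: the names texSize, vUv and bilinearFullPrecision are hypothetical, and it assumes the texture is bound with GL_NEAREST filtering so that each fetch returns an unfiltered texel.

```glsl
// Assumed inputs (hypothetical names, not from the original shader)
uniform sampler2D tex;   // RGBA8 texture, min/mag filter set to GL_NEAREST
uniform vec2 texSize;    // texture dimensions in texels, e.g. (512.0, 512.0)
varying vec2 vUv;        // interpolated texture coordinates

// Fetch the 4 neighbouring texels and interpolate them in the shader,
// where the arithmetic is done in floating point.
vec4 bilinearFullPrecision(sampler2D t, vec2 uv)
{
    // Position in texel space, shifted so integer values land on texel centers
    vec2 pos  = uv * texSize - 0.5;
    vec2 base = floor(pos);
    vec2 f    = pos - base;              // interpolation weights in [0, 1)

    vec2 texel = 1.0 / texSize;          // size of one texel in uv space
    vec2 uv00  = (base + 0.5) * texel;   // center of the lower-left texel

    vec4 c00 = texture2D(t, uv00);
    vec4 c10 = texture2D(t, uv00 + vec2(texel.x, 0.0));
    vec4 c01 = texture2D(t, uv00 + vec2(0.0, texel.y));
    vec4 c11 = texture2D(t, uv00 + texel);

    // Blend horizontally, then vertically
    return mix(mix(c00, c10, f.x), mix(c01, c11, f.x), f.y);
}

void main()
{
    // Same x5 scaling as in the example, applied to the shader-side result
    gl_FragColor = bilinearFullPrecision(tex, vUv) * 5.0;
}
```

This costs four texture fetches plus the extra arithmetic per sample instead of one filtered fetch, which is where the performance hit comes from. Texels at the border are handled by whatever wrap mode is set on the texture.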
Those two solutions both have a serious cost, either in performance or memory. Why NVidia and ATI haven't implemented bilinear filtering of RGBA8 textures in full precision in hardware yet is beyond my understanding.
Are you certain that the banding you see is not due to the fact that there are only 8 bits per channel of information in the source texture? For example, if you were to use an fp16 or fp32 texture with the same values as the sample you provided, wouldn't you get the same result by multiplying by 5?