Tip of the day: bilinear filtering accuracy

posted in Journal of Ysaneya

Published April 16, 2008

Today's entry is a short tip aimed at graphics programmers.

Most people don't realize that bilinear filtering of an RGBA8 (fixed point, 8 bits per channel) texture returns a value that is still quantized to 8-bit fixed point.

What I mean is extremely simple. RGB8/RGBA8 textures are used practically everywhere. In shaders, you usually sample a texture like this (GLSL):

vec4 color = texture2D(tex, uv);

If filtering has been enabled on the texture, the values in 'color' do not have full fp32 precision. They are still quantized to 8 bits per channel.

I blame this on legacy hardware. I have confirmed this behavior on both NVidia and ATI cards, including the modern ones.

Of course, it only becomes visible when you sum up a lot of textures together, or when you scale a value by a huge factor.

I shudder whenever I think of all the people using lookup tables or scaled textures in RGBA8 without realizing what they are doing.

Here is an example. First, a picture of a relatively dark area of a texture, heavily magnified (needed to demonstrate filtering, after all). There are blocky square artifacts coming from the JPEG compression; ignore them, and just verify in Photoshop or your preferred image editor that the pixel values form a smooth gradient, decreasing 1 by 1 (actually 1/255) between adjacent pixels:

No scaling

Now the pixel shader simply multiplies this pixel by a constant value of 5. Notice how the pixel values now jump in steps of 5:

Scaling x5
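For reference, a minimal sketch of the kind of fragment shader that produces the scaled picture. It assumes GLSL 1.20 and a texture bound with linear filtering; the uniform names `tex` and `scale` are illustrative and not taken from the original demo.

```glsl
#version 120

uniform sampler2D tex;   // RGBA8 texture with GL_LINEAR filtering (assumed name)
uniform float scale;     // amplification factor, e.g. 5.0 (assumed name)

void main()
{
    // Hardware-filtered fetch; per the post, the result is still on an n/255 grid.
    vec4 color = texture2D(tex, gl_TexCoord[0].xy);

    // Scaling turns the 1/255 steps into steps of scale/255,
    // which is what shows up as banding in the magnified image.
    gl_FragColor = vec4(color.rgb * scale, 1.0);
}
```

Setting `scale` to 1.0 should reproduce the unscaled picture, which makes it easy to switch between the two cases.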

To fix this problem, there are two ways that I know of:

1. Perform the bilinear filtering yourself: shader arithmetic is full 32-bit floating point, so although each of the 4 texels you sample is still 8 bits, the interpolation between them is done in full precision. This has a high performance cost.

2. Use an fp16 or fp32 internal format (at the cost of additional video memory).
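The first option can be sketched roughly as follows. This is only an illustration under some assumptions, not the code used for the screenshots: it targets GLSL 1.20, expects the texture to be set to GL_NEAREST so each fetch returns a raw texel, ignores mipmapping, and relies on a hypothetical `texSize` uniform holding the texture dimensions in texels. The helper name `bilinearFetch` is made up for this example.

```glsl
#version 120

uniform sampler2D tex;  // RGBA8 texture, min/mag filter set to GL_NEAREST (assumed)
uniform vec2 texSize;   // texture dimensions in texels (hypothetical uniform)

// Manual bilinear filter: 4 point-sampled fetches blended in fp32.
vec4 bilinearFetch(sampler2D t, vec2 uv, vec2 size)
{
    vec2 st   = uv * size - 0.5;       // position relative to texel centers
    vec2 base = floor(st);             // integer index of the lower-left texel
    vec2 f    = st - base;             // fractional blend weights in [0, 1)
    vec2 d    = 1.0 / size;            // size of one texel in uv space
    vec2 uv00 = (base + 0.5) * d;      // center of the lower-left texel

    vec4 c00 = texture2D(t, uv00);
    vec4 c10 = texture2D(t, uv00 + vec2(d.x, 0.0));
    vec4 c01 = texture2D(t, uv00 + vec2(0.0, d.y));
    vec4 c11 = texture2D(t, uv00 + d);

    // The interpolation itself runs at full shader precision.
    return mix(mix(c00, c10, f.x), mix(c01, c11, f.x), f.y);
}

void main()
{
    gl_FragColor = bilinearFetch(tex, gl_TexCoord[0].xy, texSize);
}
```

Texels at the border are handled by whatever wrap mode is set on the texture. The four fetches per sample are where the performance cost comes from.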

Those two solutions both have a serious cost, either in performance or memory. Why NVidia and ATI haven't implemented bilinear filtering of RGBA8 textures in full precision in hardware yet is beyond my understanding.


Comments

Jason Z

Hi Ysaneya,

Are you certain that the banding that you see is not due to the fact that there are only 8 bits per channel of information in the source texture? For example, if you were to use an fp16 or fp32 texture with the same values as the sample you provided, wouldn't you get the same result by multiplying by 5?

April 16, 2008 12:28 PM

Ysaneya

Of course there are only 8 bits of source data; this is the whole point I'm making. But bilinear interpolation also gives a result that is 8 bits.

If you have a texel with a value of 7/255 and its neighbor is 8/255, and you sample near the middle, say at 0.499, you won't get a value of 7.499/255 in the shader as you would expect; you will get 7/255.

April 16, 2008 12:46 PM

Jason Z

I am not saying that your statement about bilinear filtering is incorrect - in fact, I was not aware of this and thus cannot dispute what you are saying.

What I am saying is that your example does not really show the effect that you are talking about. Multiplying an integer by an integer will produce an integer regardless of the underlying data type. I don't understand the connection between multiplying an 8 bit number by an integer (producing banding) and getting lower precision bilinear samples - what else would you expect to happen in your example if the pipeline was full fp32?

To test if bilinear samples are returned in a lower precision format than full fp32, I would expect that you need to have some information that can only be expressed in fp32 - i.e. a repeating decimal value like 0.1111111111, that if represented in fp32 and summed a bunch of times would produce a different integer value than a lower precision version doing the same thing.

Am I missing the point of your statement?

April 16, 2008 02:17 PM

Ysaneya

Quote: Original post by Jason Z
What I am saying is that your example does not really show the effect that you are talking about.

Let me be more precise.


float val = texture2D(tex, uv).x;
val = val * 10.0;

What values do you expect "val" to take?

My point is that for a filtered texel (taking my previous example of 7/255 and 8/255), the only values "val" can take *after the multiplication by 10* are 70/255 *OR* 80/255.

With full-precision bilinear filtering, you'd get every value in the range between 70/255 and 80/255.

Wouldn't you naturally expect (even if the source data is 8 bits) a pixel in the middle of those two texels to have a value of 75/255 (7.5/255 * 10.0)?

April 16, 2008 03:58 PM

Ysaneya

Here is an example of an actual usage that would lead to incorrect values. Imagine that you want to texture a spherical surface with some noise computed in the shader. Let's imagine that you use an RGB8 renormalization cube map:


/// gl_TexCoord[0].xyz is the interpolated vertex position from the vertex shader
vec3 xyz = textureCube(renormCubeMap, gl_TexCoord[0].xyz).xyz;

float noiseValue = noise3D(xyz * frequency);

Then you'll get blocky artifacts in the returned noise value, and they become more and more visible as frequency grows much larger than 1.0.

Any code that uses lookup tables should be treated as suspect.

April 16, 2008 04:05 PM

mbaitoff

it could be checked in the following way:

consider you have a 8-bit int.fmt. texture with dimensions 2x1 pix. left pixel stores black (0), right pixel - white (255). consider there is a triangle on the screen 100 pixels "wide". you say that this triangle textured with the texture will contain only black OR white pixels with no shades?

April 17, 2008 03:26 AM

jhoward

mbaitoff said:
consider you have a 8-bit int.fmt. texture with dimensions 2x1 pix. left pixel stores black (0), right pixel - white (255). consider there is a triangle on the screen 100 pixels "wide". you say that this triangle textured with the texture will contain only black OR white pixels with no shades?

I don't think this is what he's saying. The bilinear filtering would still work to 8-bit precision so from black to white you would get a smooth gradient with variations of grey across the triangle. The point is that if you sampled the texture the precision only lies in 8-bit, i.e 0, 1, 2 etc not 0.1, 0.2, 0.3 etc

I actually noticed this happening in a water shader I was trying to write recently. I was getting some horrible artifacts which made me think that I had bilinear filtering disabled but it was actually the lack of precision (one texture was a coordinate offset lookup into another texture)

What would be the processing impact on using a floating point texture format? Would you recommend this over doing the bilinear filtering yourself?

James

April 17, 2008 03:50 AM

Ysaneya

Quote: Original post by jhoward
mbaitoff said:
consider you have a 8-bit int.fmt. texture with dimensions 2x1 pix. left pixel stores black (0), right pixel - white (255). consider there is a triangle on the screen 100 pixels "wide". you say that this triangle textured with the texture will contain only black OR white pixels with no shades?

I don't think this is what he's saying. The bilinear filtering would still work to 8-bit precision so from black to white you would get a smooth gradient with variations of grey across the triangle. The point is that if you sampled the texture the precision only lies in 8-bit, i.e 0, 1, 2 etc not 0.1, 0.2, 0.3 etc

Correct.

Quote: Original post by jhoward
What would be the processing impact on using a floating point texture format? Would you recommend this over doing the bilinear filtering yourself?

You can test it yourself: switching to an fp16 format only requires changing the internal format argument (in OpenGL), so it's a quick one-line change.

It should be noticeably faster than sampling 4 times and computing the filtering yourself. However, it'll cost twice as much video memory.

April 17, 2008 04:05 AM

Jason Z

I see now - your example is clear to me now. Thinking about it now, it may very well be part of a specification that the sample return the data in the precision of the source data. Clearly NV/AMD have hardware floating point filtering, so I can't imagine why they wouldn't use it for everything if it wasn't required to do something else.

Anyhow, thanks for the clarification. Keep up the good work on Infinity Station!

April 17, 2008 05:12 AM

mbaitoff

my bad, my mistake!

i'd like to correct myself: one should try to use 2x1 texture with pixels (1,0,0) and (2,0,0) for example, then texturing a wide triangle with those colors multiplied by 64. one should get a two-shade (64 and 128 only) triangle, is that the point?

April 17, 2008 10:33 PM

Ysaneya

Quote: Original post by mbaitoff
my bad, my mistake!

i'd like to correct myself: one should try to use 2x1 texture with pixels (1,0,0) and (2,0,0) for example, then texturing a wide triangle with those colors multiplied by 64. one should get a two-shade (64 and 128 only) triangle, is that the point?

Yes, exactly.

April 18, 2008 03:35 AM

JHoule

OK, I'm way past interest time in this post, but I felt I should say something.

Quote: Those two solutions both have a serious cost, either in performance or memory. Why NVidia and ATI haven't implemented bilinear filtering of RGBA8 textures in full precision in hardware yet is beyond my understanding.

It's funny that you would complain about the cost of full precision in your alternatives, yet not realize that this is exactly why hardware actually limits precision... Full FP32 bilinear is really costly compared to FP16, and even more so compared to 8b fixed-point LERP units. Hardware engineers also have restrictions, and need to balance things out when deciding where to allocate gates. What would you rather have? RGBA8 twice (if not four times) as fast? Or full precision for this specific corner case?...

That said, we will eventually get to full 32b texture units. Support has already appeared. I'd argue it's just not quite the time. The way I see it, you are a trailblazer here...! [wink]

P.S. You might be interested to know that the most recent ATI architectures try to improve bilinear precision beyond the native texture format (e.g. 8b in this example). It's not full 32b float precision, but it's a start...!

May 13, 2008 09:13 AM


Ysaneya

Author
