In recent years, there’s been a lot of discussion and interest in “data-oriented design”—a programming style that emphasizes thinking about how your data is laid out in memory, how you access it and how many cache misses it’s going to incur. With memory reads taking orders of magnitude longer for cache misses than hits, the number of misses is often the key metric to optimize. It’s not just about performance-sensitive code—data structures designed without sufficient attention to memory effects may be a big contributor to the general slowness and bloatiness of software.
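As a rough illustration of why layout matters, the sketch below counts how many cache lines get touched when reading a single field from every record, under two layouts. The 64-byte cache line, the 64-byte record, the 4-byte field, and the helper name `lines_touched` are all assumptions made for the example, not figures from any particular hardware or codebase.

```python
CACHE_LINE = 64  # bytes; a common size, assumed here for illustration


def lines_touched(count, stride, field_size, line=CACHE_LINE):
    """Count distinct cache lines touched when reading one field of
    `field_size` bytes from `count` records spaced `stride` bytes apart.
    The first record is assumed to start on a cache-line boundary."""
    touched = set()
    for i in range(count):
        start = i * stride
        end = start + field_size - 1
        touched.update(range(start // line, end // line + 1))
    return len(touched)


N = 1024  # number of records (hypothetical)

# Array of structs: each 64-byte record holds the 4-byte field we want.
aos_lines = lines_touched(N, stride=64, field_size=4)

# Struct of arrays: the 4-byte fields are packed contiguously.
soa_lines = lines_touched(N, stride=4, field_size=4)

print(aos_lines, soa_lines)
```

In this simplified model the packed layout touches 16 times fewer cache lines for the same traversal. Real behavior also depends on prefetching, alignment, and what else the loop reads.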
Note: this post is adapted from an answer I wrote for the Computer Graphics StackExchange beta, which was shut down a few months ago. A dump of all the CGSE site data can be found on Area 51.
To perform antialiasing in synthetic images (whether real-time or offline), we distribute samples over the image plane, and ensure that each pixel gets contributions from many samples with different subpixel locations. This approximates the result of applying a low-pass kernel to the underlying infinite-resolution image, ideally resulting in a finite-resolution image without objectionable artifacts like jaggies, Moiré patterns, ringing, or excessive blurring.
It’s difficult to develop intuition for radiometric units. Radiant power, radiant intensity, irradiance, radiance: on first encountering these terms and their associated mathematical definitions, anyone’s legs would go wobbly! Building technical fluency with these concepts requires one to sit down and practice working with the math directly, and nothing can substitute for that, but reasoning can be greatly accelerated by having some good mental images that capture the essence of things.
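A minimal sketch of the idea, assuming the simplest possible setup: jittered (stratified) samples inside one pixel, averaged with equal weights, which amounts to a box filter over the pixel footprint. The function name `pixel_coverage`, the `inside` callback, and the fixed random seed are hypothetical choices for the example; a production renderer would typically use a better kernel and sample pattern.

```python
import random


def pixel_coverage(inside, px, py, n=4, rng=None):
    """Estimate how much of pixel (px, py) is covered by a shape.

    `inside(x, y)` returns True where the shape covers the image plane.
    The pixel is split into n*n strata with one jittered sample in each,
    and the results are averaged with equal weights (a box filter).
    """
    rng = rng or random.Random(0)  # fixed seed so the example is repeatable
    hits = 0
    for j in range(n):
        for i in range(n):
            x = px + (i + rng.random()) / n
            y = py + (j + rng.random()) / n
            hits += 1 if inside(x, y) else 0
    return hits / (n * n)


# A vertical edge through the middle of pixel (0, 0): left half covered.
def left_half(x, y):
    return x < 0.5


edge = pixel_coverage(left_half, 0, 0)
full = pixel_coverage(lambda x, y: True, 0, 0)
empty = pixel_coverage(lambda x, y: False, 0, 0)
print(edge, full, empty)
```

A single sample at the pixel center would report the edge pixel as either fully covered or fully empty. Averaging many subpixel samples is what produces the intermediate value that softens the jaggy.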
Note: this post is adapted from an answer I wrote for the Computer Graphics StackExchange beta, which was shut down a couple of weeks ago. A dump of all the CGSE site data can be found on Area 51.
In computer graphics, we deal a lot with various radiometric units, such as flux, irradiance, and radiance, which quantify light in various ways. But there’s a whole other set of units for light, called photometric units, that also show up sometimes. It’s important to understand the relationship between radiometry and photometry, and when it’s appropriate to use one or the other.
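For reference, the standard link between the two systems is a weighting of spectral radiometric quantities by the eye's sensitivity. The symbols below are not from the text above: \(\Phi_{e,\lambda}\) is spectral radiant flux, \(V(\lambda)\) is the photopic luminous efficiency function, and \(K_m\) is the maximum luminous efficacy.

\[
\Phi_v = K_m \int V(\lambda)\,\Phi_{e,\lambda}(\lambda)\,d\lambda,
\qquad K_m = 683\ \mathrm{lm/W}
\]

Since \(V(\lambda)\) peaks at 1 near 555 nm, one watt of monochromatic light at that wavelength corresponds to about 683 lumens, and the same power at other visible wavelengths corresponds to fewer.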
When I was in college, I took a one-semester electronics course, and for a final project, we had a few weeks to build pretty much whatever we wanted. As a computer-science major (the only one in the class), obviously I jumped at the chance to do something with microprocessors and digital logic. I ended up building an electronic music box—a device that plays a preprogrammed song over headphones, using a PIC microcontroller.
It was an interesting journey into the (to me) strange and mysterious world of hardware and embedded software, and I got to do some fun things along the way, like etching and soldering my own circuit board. I came across the files again the other day and thought I’d share them.
Unless you’ve been living under a rock, you know that HDR and physically-based shading are practically the defining features of “next-gen” rendering. A new set of industry-standard practices has emerged: RGBA16f framebuffers, the GGX BRDF, filmic tonemapping, and so on. The runtime rendering/shading end of these techniques has been discussed extensively, for example in John Hable’s talk Uncharted 2: HDR Lighting and the SIGGRAPH Physically-Based Shading courses. But what I don’t see talked about so often is the other end of things: how do you acquire and author the HDR assets to render in your spiffy new physically-based engine?
The second coming of VR will be a great thing for the GPU hardware industry. With the requirements for high resolution, good AA, high framerates and low latency, successful VR applications will need substantial GPU horsepower behind them.
The PS4 GPU has 1.84 Tflop/s available, so—making a very rough back-of-the-envelope analysis—for a game running at 1080p / 30 Hz, it has about 30 Kflop per pixel per frame available. If we take this as the benchmark of the compute per pixel needed for “next-gen”-quality visuals, we can apply it to the Oculus DK2. There we want 75 Hz, and we probably want a resolution of at least 2460×1230 (extrapolated from the recommendation of 1820×910 for the DK1—this is the rendering resolution, before resampling to the warped output). That comes to 6.8 Tflop/s. A future VR headset with a 4K display would need 27 Tflop/s—and if you go up to 90 Hz, make it 32 Tflop/s!
The NVIDIA Titan Black offers 5.1 Tflop/s and the AMD R9 290X offers 5.6 Tflop/s. These are the biggest single GPUs on the market at present, but they appear to fall short of the compute power needed for “next-gen”-quality graphics on the DK2. This is very rough analysis, but there’s clearly reason to be interested in multi-GPU rendering for VR.
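The back-of-the-envelope numbers above can be reproduced with a few lines of arithmetic. This is only a sketch: the helper `required_tflops` is a hypothetical name, the per-pixel budget is rounded to 30 Kflop as in the text, and the 4K case assumes the rendering resolution simply doubles in each dimension relative to the DK2 figure (4920×2460), which is an extrapolation made for this example.

```python
def required_tflops(width, height, hz, flop_per_pixel=30_000):
    """Compute throughput (in Tflop/s) needed to spend `flop_per_pixel`
    on every pixel of a width x height image, `hz` times per second."""
    return width * height * hz * flop_per_pixel / 1e12


# PS4 budget: 1.84 Tflop/s spread over 1080p at 30 Hz.
ps4_flop_per_pixel = 1.84e12 / (1920 * 1080 * 30)

# Oculus DK2 at the rendering resolution quoted above, 75 Hz.
dk2 = required_tflops(2460, 1230, 75)

# Assumed 4K headset: twice the DK2 rendering resolution in each dimension.
future_75 = required_tflops(4920, 2460, 75)
future_90 = required_tflops(4920, 2460, 90)

print(f"PS4: {ps4_flop_per_pixel / 1000:.1f} Kflop per pixel per frame")
print(f"DK2: {dk2:.1f} Tflop/s")
print(f"4K:  {future_75:.1f} Tflop/s at 75 Hz, {future_90:.1f} Tflop/s at 90 Hz")
```

The results land close to the figures quoted above. Small differences come from where the per-pixel budget gets rounded, which is well within the noise of an estimate this rough.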
There’s an idea that’s been bouncing around in my head for a while about a kind of deferred renderer that (as far as I know) hasn’t been done before. I’m sure I’m not the first to have this idea, but recently it crystallized in my head a bit more and I wanted to write about it.
For the last several years, graphics folks have been experimenting with a variety of deferred approaches to rendering—deferred shading, light-prepass, tiled deferred, tiled forward, and so on. These techniques improve both the cleanliness and performance of a renderer by doing some or all of the following:
- Elegantly separate lighting code from “material” code (that is, code that controls the properties of the BRDF at each point on a surface), avoiding shader combination explosion
- Only process the scene geometry once per frame (excluding shadow maps and such)
- Reduce unnecessary shading work for pixels that will end up occluded later
- Reduce memory bandwidth costs incurred in reading, modifying and writing screen-sized buffers
Different rendering methods offer different tradeoffs among these goals. For example, light-prepass renderers reduce the memory bandwidth relative to deferred shading, at the cost of requiring an extra geometry pass.
Deferred texturing is another possible point in the deferred constellation—which as far as I know has not been implemented, though it’s been discussed. The idea here, as the name suggests, is to defer sampling textures and doing material calculations until you know what’s on screen.
Quaternions are pretty well-known to graphics programmers at this point, but even though we can use them for rotating our objects and cameras, they’re still pretty mysterious. Why are they 4D instead of 3D? Why do the formulas for constructing them have factors like \(\cos(\theta/2)\) instead of just \(\cos\theta\)? And why do they have the “double-cover” property, where there are two different quaternions (negatives of each other) that represent the same 3D rotation?
These properties need not be inexplicable teachings from on high—and you don’t have to go get a degree in group theory to comprehend them, either. There are actually quite understandable reasons for quaternions to be so weird, and I’m going to have a go at explaining them.
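To make the half-angle and double-cover properties concrete before explaining them, here is a small sketch using the usual axis-angle construction and the rotation formula \(q v q^{*}\). The function names and the `(w, x, y, z)` component order are conventions chosen for this example.

```python
import math


def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians
    about `axis`. Note the half angle in both cos and sin."""
    ax, ay, az = axis
    length = math.sqrt(ax * ax + ay * ay + az * az)
    s = math.sin(angle / 2) / length
    return (math.cos(angle / 2), ax * s, ay * s, az * s)


def quat_mul(a, b):
    """Hamilton product of two quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def rotate(q, v):
    """Rotate 3D vector v by unit quaternion q via q * v * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, rx, ry, rz = quat_mul(quat_mul(q, (0.0, *v)), conj)
    return (rx, ry, rz)


# 90 degrees about the z axis takes the x axis to the y axis.
q = quat_from_axis_angle((0, 0, 1), math.pi / 2)
neg_q = tuple(-c for c in q)
r_pos = rotate(q, (1.0, 0.0, 0.0))
r_neg = rotate(neg_q, (1.0, 0.0, 0.0))

# Adding a full turn to the angle flips the quaternion's sign.
q_full_turn = quat_from_axis_angle((0, 0, 1), math.pi / 2 + 2 * math.pi)
```

Here `q` and `neg_q` rotate the vector identically, and adding a full 360° turn to the angle yields `neg_q` instead of `q`. Because `q` appears twice in \(q v q^{*}\), each factor only needs to carry half the angle, and a sign flip on both factors cancels.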
For graphics coders, vector and matrix math libraries are something we use nearly every day, and in just about every function we write. If you’ve been working in this field for long, you’ve probably used a dozen different vector libs in C++, not to mention the first-class vectors and matrices in HLSL and GLSL—and you’ve probably written your own vector lib, at least twice! It seems like every project and every coder has their own special flavor of this basic utility—nearly identical with, yet always slightly different from, all the other versions out there.
Unfortunately, despite the ubiquity of these libs and the fact that every seasoned graphics coder probably knows their vector lib better than their own spouse, most of the vector math libs I’ve seen are still pretty deficient. Sometimes these deficiencies are obvious, sometimes rather subtle, but vectors and matrices are used so often and so deeply that it’s clearly worth investing time to get them right.