# Normals and the Inverse Transpose, Part 3: Grassmann On Duals

Welcome back! In the last couple of articles, we learned about different ways to understand normal vectors in 3D space—either as bivectors (part 1), or as dual vectors (part 2). Both can be valid interpretations, but they carry different units, and react differently to transformations.

In this third and final installment, we’re going to leave behind the focus on normal vectors, and explore
a couple of other unitful vector quantities. We’ve seen how Grassmann bivectors and trivectors act as
oriented areas and volumes, respectively; and we saw how dual vectors act as oriented *line densities*, with
units of inverse length. Now, we’re going to put these two geometric concepts together, and find out
what they can accomplish with their combined powers. (Get it? Powers? Like powers of a scale factor?
Uh, you know what, never mind.)

I’m going to dive right in, so if you need a refresher on either Grassmann algebra or dual spaces, you may want to re-skim the previous articles.

## Wedge Products of Dual Vectors

Grassmann algebra allows us to take wedge products of vectors, producing higher-grade algebraic
entities such as bivectors and trivectors. Just as we can do this with base vectors, we can do the
same thing on dual vectors, producing *dual bivectors* and *dual trivectors*.

A dual bivector is formed by wedging two dual vectors, like: $$ {\bf e_x^*} \wedge {\bf e_y^*} = {\bf e_{xy}^*} $$ and a dual trivector is the product of three: $$ {\bf e_x^*} \wedge {\bf e_y^*} \wedge {\bf e_z^*} = {\bf e_{xy}^*} \wedge {\bf e_z^*} = {\bf e_{xyz}^*} $$ This works exactly the same way that wedge products of ordinary vectors do; in particular, the same anticommutative law applies.

So what’s the geometric meaning of these dual $k$-vectors? Recall that a dual vector is defined as a linear form—a function from some vector space $V$ to scalars $\Bbb R$. Conveniently, the wedge products of dual vectors turn out to be isomorphic to the duals of wedge products of vectors. (Mathematically, we can say, for finite-dimensional $V$: $$ \textstyle \bigwedge^k \bigl( V^* \bigr) \cong \bigl(\bigwedge^k V \bigr)^* $$ where $\bigwedge^k$ is the operation to construct the set of $k$-vectors over a given base vector space.)

The upshot is that dual $k$-vectors can be understood as *linear forms on $k$-vectors*: a dual
bivector is a linear function from bivectors to scalars, and a dual trivector is a linear function
from trivectors to scalars. Let’s see how this works in more detail.

## Dual Bivectors

In the previous article, we saw how a dual vector can be visualized as a field of parallel, uniformly spaced planes, representing the level sets of a linear form:

You can think of the discrete planes in this picture as representing intervals of one unit
in the output of the linear form. Keep in mind, though, that there are actually a *continuous
infinity* of these planes, filling space—one for every possible output value of the linear form.
When you evaluate the linear form—i.e. pair a dual vector with a vector—the result represents *how
many planes* the vector crosses, from its tail to its tip (in a continuous-measure sense of “how many”).
This will depend on both the length and orientation of the vector: for example, a vector parallel to
the planes will return zero, no matter its length.

A dual *bivector* can be thought of in a similar way—but instead of planes, we now picture a field
of parallel *lines*, uniformly spaced over the plane perpendicular to them.

As suggested by this diagram, when you wedge two dual vectors, the resulting dual bivector consists
of all the *lines of intersection* of the two dual vectors’ respective planes.

What happens when we pair this dual bivector with a base bivector? As before, the
result is a scalar—this time representing *how many lines* the bivector crosses! If you visualize
the bivector as a parallelogram, or circle or any other shape, it will have a certain area. It
will therefore intersect some quantity of the continuous mass of lines. This quantity won’t depend on
the *shape* of the bivector—remember, bivectors don’t actually *have* any defined shape—only on
its area (magnitude) and orientation. A bivector whose plane runs parallel to the lines will return
zero, no matter its area.
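
To make the “how many lines” pairing concrete, here’s a minimal sketch. It assumes the common determinant convention for pairing simple wedge products, $\langle a \wedge b, u \wedge v \rangle = \langle a, u \rangle \langle b, v \rangle - \langle a, v \rangle \langle b, u \rangle$, and stores vectors and dual vectors as plain coordinate triples in matching bases. The function names are hypothetical, chosen just for this illustration.

```python
def pair(w, v):
    """Natural pairing of a dual vector w with a vector v (matching bases)."""
    return sum(wi * vi for wi, vi in zip(w, v))

def pair_dual_bivector(a, b, u, v):
    """Pair the dual bivector a ^ b with the bivector u ^ v (determinant convention)."""
    return pair(a, u) * pair(b, v) - pair(a, v) * pair(b, u)

ex, ey, ez = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

# e_x* ^ e_y* is a field of lines running along z, one line per unit area of the xy plane.
across = pair_dual_bivector(ex, ey, ex, ey)                 # unit square in the xy plane
along = pair_dual_bivector(ex, ey, ex, ez)                  # unit square containing the z direction
doubled = pair_dual_bivector(ex, ey, (2.0, 0.0, 0.0), ey)   # twice the area
sheared = pair_dual_bivector(ex, ey, ex, (5.0, 1.0, 0.0))   # same area, different parallelogram

print(across, along, doubled, sheared)
```

The sheared parallelogram gives the same answer as the unit square, since only area and orientation matter, and the bivector containing the $z$ direction crosses no lines at all.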

Because dual vectors have units of inverse length, and a dual bivector is a product of dual vectors,
**a dual bivector has units of inverse area**. It represents an oriented areal
density, such as a probability density over a surface! When you pair the dual bivector with a
bivector, the result tells you how much probability (or whatever else) is covered by that bivector’s
area. And as implied by their units, dual bivectors scale as $1/a^2$. (If you scale an object *up* by
a factor of $a$, the probability density on its surface goes *down* by a factor of $a^2$, because the
same total probability is now spread over an $a^2$-larger area.)

How about the transformation rule for dual bivectors? Well, we learned in part 1 that bivectors transform as $\text{cofactor}(M)$; and in part 2, we found that dual vectors transform as the inverse transpose, $M^{-T}$. It follows that dual bivectors transform as $\text{cofactor}\bigl(M^{-T}\bigr)$, or equivalently $\bigl(\text{cofactor}(M)\bigr)^{-T}$. Startlingly, for 3×3 matrices these formulas reduce to just $$ \frac{M}{\det M} $$ So, dual bivectors simply transform using $M$ divided by its own determinant.
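
The reduction follows from the identity $\text{cofactor}(M) = (\det M)\, M^{-T}$ applied to $M^{-T}$ itself. Here’s a small numerical sanity check of that claim, as a sketch in plain Python. The helper names and the example matrix are my own choices for illustration, not part of any library.

```python
def cofactor(m):
    """Cofactor matrix of a 3x3 matrix (list of rows); cyclic indices supply the signs."""
    return [[m[(i + 1) % 3][(j + 1) % 3] * m[(i + 2) % 3][(j + 2) % 3]
             - m[(i + 1) % 3][(j + 2) % 3] * m[(i + 2) % 3][(j + 1) % 3]
             for j in range(3)] for i in range(3)]

def det(m):
    """Determinant via cofactor expansion along the first row."""
    c = cofactor(m)
    return sum(m[0][j] * c[0][j] for j in range(3))

def scale(m, s):
    """Multiply every entry of m by the scalar s."""
    return [[s * x for x in row] for row in m]

def inverse_transpose(m):
    """M^{-T} = cofactor(M) / det(M)."""
    return scale(cofactor(m), 1.0 / det(m))

M = [[2.0, 1.0, 0.0],
     [0.0, 3.0, 1.0],
     [1.0, 0.0, 1.0]]   # arbitrary invertible example, with some shear

lhs = cofactor(inverse_transpose(M))   # cofactor(M^{-T})
rhs = scale(M, 1.0 / det(M))           # M / det M
max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
print(max_error)
```

The two sides should agree to within floating-point rounding for any invertible 3×3 matrix.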

## Dual Trivectors

Follow the pattern: if a dual vector in 3D looks like a stack of parallel planes, and a dual bivector
looks like a field of parallel lines, then a dual *trivector* looks like a cloud of parallel *points*.
Well, drop the “parallel”—it doesn’t mean anything. It’s just uniformly spaced points.

As before, the wedge product of three dual vectors—or a dual vector and dual bivector—constructs the continuous point cloud made of all the intersection points of the wedge factors. This quantity scales as $1/a^3$ and represents a volume density. When you pair it with a trivector, the result tells you how much of the point cloud is enclosed in that trivector’s volume.

The transformation rule for this one is easy—dual trivectors in 3D just get multiplied by $1/\det M$.

## A Few More Topics

With the introduction of dual bi- and trivectors, our “scaling zoo” is now complete! We’ve got the full ecosystem of vectorial quantities with scaling powers from −3 to +3, each with its proper units and matching transformation formula.

In the rest of this section, I’ll quickly touch on a few more mathematical aspects of this extended Grassmann algebra with dual spaces.

### The Interior Product

As we saw in part 2, a vector space and its dual have a “natural pairing” operation, much like
an inner product, between vectors and dual vectors. This pairing
extends to $k$-vectors and their duals, too. In fact, we can further extend the natural pairing to
work between $k$-vectors and duals *of different grades*. For example, we can define a
way to “pair” a dual vector $w$ with a bivector $B = u \wedge v$, yielding a vector:
$$
\langle w, B \rangle = \langle w, u \rangle v - u \langle w, v \rangle
$$
Geometrically, the resulting vector lies in the plane of $B$, and runs parallel to the level planes
of $w$. In some sense, $w$ is “eating” the dimension of $B$ that lies along the direction of $w$’s
density, and leaving the leftover dimension behind as a vector.
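
As a quick sketch of the formula above in coordinates (with a hypothetical `interior` helper, and vectors and dual vectors stored as triples in matching bases), we can check that the result really does run parallel to the level planes of $w$, and that it flips sign when $u$ and $v$ are swapped:

```python
def pair(w, v):
    """Natural pairing of a dual vector w with a vector v."""
    return sum(wi * vi for wi, vi in zip(w, v))

def interior(w, u, v):
    """<w, u ^ v> = <w, u> v - u <w, v>, returned as a vector."""
    wu, wv = pair(w, u), pair(w, v)
    return tuple(wu * vi - ui * wv for ui, vi in zip(u, v))

w = (0.0, 0.0, 2.0)   # dual vector: level planes z = const, two per unit length
u = (1.0, 0.0, 1.0)
v = (0.0, 1.0, 3.0)

r = interior(w, u, v)          # 2*v - 6*u, a combination of u and v, so it lies in the plane of B
crossings = pair(w, r)         # number of w's level planes that r crosses
swapped = interior(w, v, u)    # same bivector with opposite orientation

print(r, crossings, swapped)
```

The result is a linear combination of $u$ and $v$ by construction, and pairing it with $w$ gives $\langle w, u \rangle \langle w, v \rangle - \langle w, u \rangle \langle w, v \rangle = 0$.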

This extended pairing operation is known as the interior product or contraction product, although references differ slightly in how they define it (sign and ordering conventions vary). I’m not going to go into it too deeply. The key point is that you can combine a $k$-vector with a dual $\ell$-vector, for any grades $k$ and $\ell$; the result will be a $(k-\ell)$-vector, interpreting negative grades as duals.

### The Hodge Star

In addition to the vector-space duality we’ve been talking about, Grassmann algebra contains another, distinct notion of duality: Hodge duality, represented by the Hodge star operator, $\star$. (Note that this is a different symbol from the asterisk $*$ used for the dual vector space!)

The vector-space notion of duality relates $k$-vectors to duals of *equal grade*—vectors to dual
vectors, bivectors to dual bivectors, and so on. Hodge duality, however, connects things to duals of a
complementary grade. Applying the Hodge star to a $k$-vector produces an element of grade $n - k$,
where $n$ is the dimension of space. In 3D, it interchanges vectors (grade 1) with bivectors (grade 2),
and scalars (grade 0) with trivectors (grade 3).

The way I’ll define the Hodge star initially is a bit different than the standard way. In fact,
there are actually *two* Hodge star operations: one that goes from $k$-vectors to dual $(n-k)$-vectors,
and another that goes the other way. I’ll denote these by $\star$ and $-\star$ respectively. The
two are inverses of each other (in 3D, at least). They’re defined as follows:
$$
\begin{aligned}
\star&: \textstyle\bigwedge^k V \to \textstyle\bigwedge^{n-k}V^* &&:
& v^\star &= \langle {\bf e_{xyz}^*}, v \rangle \\
-\star&: \textstyle\bigwedge^k V^* \to \textstyle\bigwedge^{n-k}V &&:
& v^{-\star} &= \langle v, {\bf e_{xyz}} \rangle
\end{aligned}
$$
The angle brackets on the right here are the interior product. What we’re saying is: to do
the Hodge star on a $k$-vector, we take its interior product with ${\bf e_{xyz}^*}$, the standard
unit dual trivector (or, in $n$ dimensions, the unit dual $n$-vector). This results in a dual
$(n-k)$-vector, which geometrically represents a density over all the dimensions *not* included in
the original $k$-vector.

Conversely, to do the anti-Hodge-star on a dual $k$-vector, we take its interior product with
${\bf e_{xyz}}$, giving an $(n-k)$-vector containing all the dimensions *not* represented by the
original dual $k$-vector, i.e. all the dimensions perpendicular to its level sets.

(These two operations are *almost* defined on disjoint domains, and could therefore be combined into
one “smart” star that automatically knows what to do based on the type of its argument…except for
the $k = 0$ case: when you hodge a scalar, does it go to a trivector, or to a dual trivector? Both
are possible; that’s why we need two distinct operations here.)

For 3D geometry, the interesting cases are vectors interchanging with bivectors:

- A vector $v$ hodges to a dual bivector whose “field lines” run parallel to $v$.
- A bivector $B$ hodges to a dual vector whose level planes are parallel to $B$.
- A dual vector $w$ unhodges to a bivector parallel to $w$’s level planes.
- A dual bivector $D$ unhodges to a vector parallel to $D$’s field lines.
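
In coordinates, these four cases are almost anticlimactic. The sketch below assumes the standard basis, stores bivectors (and dual bivectors) as components on $(e_{yz}, e_{zx}, e_{xy})$, and picks a sign convention under which $e_x$ hodges to $e_{yz}^*$; other references may differ by signs or ordering. Under those assumptions the Hodge operations just copy the three components across, and only the *type* of the object changes. All names here are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

wedge = cross   # u ^ v, as components on (e_yz, e_zx, e_xy)
pair = dot      # pairing of a dual k-vector with a k-vector, in matching bases

def hodge(components):
    """Vector -> dual bivector, or bivector -> dual vector: same numbers, new type."""
    return tuple(components)

unhodge = hodge  # dual vector -> bivector, or dual bivector -> vector

v = (1.0, 2.0, 3.0)
D = hodge(v)                                  # dual bivector with field lines along v
B_containing_v = wedge(v, (0.0, 1.0, 0.0))    # any bivector whose plane contains v
lines_crossed = pair(D, B_containing_v)       # the plane runs along the lines, so none are crossed

w = (0.0, 0.0, 4.0)                           # dual vector with level planes z = const
B = unhodge(w)                                # bivector with only an e_xy component
in_plane = wedge((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))   # e_x ^ e_y, spanning a level plane of w

print(lines_crossed, B, in_plane)
```

Here `B` comes out as a multiple of $e_x \wedge e_y$, i.e. parallel to the level planes of `w`, matching the third bullet above.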

Although the formal definition was somewhat involved, you can see that the geometric result of the Hodge operations is actually pretty simple. It’s all about swapping between the geometry of a $k$-vector and the corresponding level-set geometry of a dual $(n-k)$-vector. The Hodge stars are a very useful tool for working with Grassmann and dual-Grassmann quantities in practice.

### The Inner Product, or Forgetting About Duals

In most treatments of Grassmann or geometric algebra, dual spaces are hardly mentioned. The more conventional definition of the Hodge star has it mapping directly between $k$-vectors and $(n-k)$-vectors—no duals in sight. How does this work?

It turns out that if we have an inner product defined on our vector space, we can use it to convert back and forth between vectors and dual vectors, or $k$-vectors and their duals.

So far, we haven’t discussed any means of mapping individual vectors back and forth between the base and
dual spaces. Although they’re both vector spaces of the same dimension, there’s no natural isomorphism
that would enable us to map them in a non-arbitrary way. However, the presence of an inner
product does pick out a specific isomorphism with the dual space: that which maps each vector $v$ to
a dual vector $v^*$ that implements *dotting with $v$*, using the inner product.

Symbolically, for all vectors $u \in V$, we have $\langle v^*, u \rangle = v \cdot u$. This can be extended to inner products and isomorphisms for all $k$-vectors as well (see Wikipedia for details).

Note, however, that this map is *not* preserved by scaling, or by transformations in general, because
$v^*$ transforms as $M^{-T}$ while $v$ transforms as $M$.
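
A tiny worked example of that mismatch, as a sketch with a nonuniform scale (the matrix and vector are arbitrary choices of mine): $v$ and $v^*$ start out with identical coordinates, but after transforming, they no longer correspond under the dot product, even though their natural pairing is untouched.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a column vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

M = [[2.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]          # nonuniform scale: stretch x by 2
M_inv_T = [[0.5, 0.0, 0.0],
           [0.0, 1.0, 0.0],
           [0.0, 0.0, 1.0]]    # inverse transpose of M (diagonal, so just reciprocals)

v = (1.0, 1.0, 0.0)
v_star = v                     # "dot with v": same coordinates in an orthonormal basis

v_new = mat_vec(M, v)                    # (2.0, 1.0, 0.0)
v_star_new = mat_vec(M_inv_T, v_star)    # (0.5, 1.0, 0.0)

# The natural pairing <v*, v> is preserved by the transformation...
pairing_before = sum(a * b for a, b in zip(v_star, v))
pairing_after = sum(a * b for a, b in zip(v_star_new, v_new))
# ...but v_star_new no longer has the coordinates of v_new.
print(v_new, v_star_new, pairing_before, pairing_after)
```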

With this correspondence, it becomes possible to largely ignore the existence of dual spaces and dual
elements altogether—we have the fiction that they’re not distinct from the base elements. In an
orthonormal basis, even the *coordinates* of a vector and its corresponding dual will be identical.

For an example of “forgetting” about duals: the Hodge star operations can be defined using the inner product to invisibly dualize their input or output as well as hodging it. Then the two Hodge stars I defined above collapse into one operation, mapping between $\bigwedge^k V$ and $\bigwedge^{n-k} V$.

## What’s The Use of All This?

This is kind of a lot. We started with just vectors and normal vectors—two kinds of vector-shaped
things with different rules, which was confusing enough. But now we have *four*: vectors, dual vectors,
bivectors, and dual bivectors. And on top of that we have three scalar-shaped things, too: true
unitless scalars, trivectors, and dual trivectors.

Evidently, lots of people manage to get along well enough without being totally aware of all these distinctions! Even texts on Grassmann or geometric algebra may not fully delve into the “duals” story, instead treating $k$-vectors and their duals as the same thing (implicitly using the isomorphism defined above). Their differing transformation behavior becomes sort of a curiosity, an unsystematic ornamental detail. And this comes at the cost of making some aspects of the algebra require an inner product or a metric, and only work properly in an orthonormal basis. In contrast, when you’re “cooking with duals”, you can derive formulas that work properly in any basis.

As a quick example of this, let’s look at a concrete problem you might encounter in graphics. Let’s say you have a triangle mesh and you want to select a random point on it, chosen uniformly over the surface area. To do this, we must first select a random triangle, with probability proportional to area. The standard technique is to precompute the areas of all the triangles and build a prefix-sum table; then, to select a triangle, we take a uniform random value and binary-search on it in the table.

Let’s throw in another wrinkle, though. What if the triangle mesh is transformed—possibly by a
nonuniform scaling, or a shear? In general, this will alter the areas of all the triangles, in an
orientation-dependent way. A uniform distribution over surface area in the mesh’s *local* space
will no longer be uniform in world space. We could address this by pre-transforming the whole mesh
into world space and doing the sampling process there—but that’s more expensive than necessary.

We can use bivectors to help. Instead of calculating just a scalar area for each triangle, calculate the bivector representing its orientation and area. (If the triangle’s vertices are $p_1, p_2, p_3$, this is $\tfrac{1}{2}(p_2 - p_1) \wedge (p_3 - p_1)$.) Now we can transform all the bivectors into world space, using their transformation rule, and they will accurately represent the areas of the transformed triangles. Then we can calculate their magnitudes and build the prefix-sum table, as before.
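
Here’s a rough sketch of how that could look in code. It assumes bivectors are stored as three components on $(e_{yz}, e_{zx}, e_{xy})$, so that the wedge product is computed like a cross product, and that world space has an orthonormal basis so we can take magnitudes. The function names and the two-triangle “mesh” are made up for illustration.

```python
import bisect
import math
import random

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def mat_vec(m, v):
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

def cofactor(m):
    """Cofactor matrix of a 3x3 matrix; this is the transformation rule for bivectors."""
    return [[m[(i + 1) % 3][(j + 1) % 3] * m[(i + 2) % 3][(j + 2) % 3]
             - m[(i + 1) % 3][(j + 2) % 3] * m[(i + 2) % 3][(j + 1) % 3]
             for j in range(3)] for i in range(3)]

def length(a):
    return math.sqrt(sum(x * x for x in a))

def triangle_bivector(p1, p2, p3):
    """(1/2)(p2 - p1) ^ (p3 - p1), as components on (e_yz, e_zx, e_xy)."""
    return tuple(0.5 * x for x in cross(sub(p2, p1), sub(p3, p1)))

def build_table(bivectors, m):
    """Prefix sums of world-space triangle areas, from local-space bivectors."""
    cof = cofactor(m)
    table, total = [], 0.0
    for b in bivectors:
        total += length(mat_vec(cof, b))
        table.append(total)
    return table

def pick_triangle(table, rng):
    """Index of a triangle, chosen with probability proportional to world-space area."""
    x = rng.random() * table[-1]
    return min(bisect.bisect_right(table, x), len(table) - 1)

# Two local-space triangles of area 0.5: one in the xy plane, one in the yz plane.
triangles = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
             ((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))]
bivectors = [triangle_bivector(*t) for t in triangles]   # computed once, in local space

M = [[3.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]   # stretch x by 3: only the xy triangle grows

table = build_table(bivectors, M)
index = pick_triangle(table, random.Random(0))
print(table, index)
```

The per-triangle bivectors can be cached alongside the mesh; only the cofactor transform and the prefix sums need redoing when the transformation changes, and the vertices themselves never get transformed.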

Conversely, suppose we have an existing, non-uniform areal probability measure defined over our triangle mesh. (Maybe it’s a light source with a texture defining its emissive brightness, and we want to sample with respect to emitted power; or maybe we want to sample with respect to solid angle subtended at some point, or some sort of visual importance, etc.) We can represent these probability densities as dual bivectors, and again we can take them back and forth between local and world space—even in the presence of shear or nonuniform scaling—with confidence that we’re still representing the same distribution.
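
To see why we can be confident about that, here’s a small sketch checking that the probability covered by a triangle (the pairing of the density with the triangle’s bivector) comes out the same in local and world space. It assumes a density that’s constant over the triangle, uses the $(e_{yz}, e_{zx}, e_{xy})$ component ordering for both bivectors and dual bivectors, and the specific numbers are arbitrary.

```python
def cofactor(m):
    """Cofactor matrix of a 3x3 matrix: transformation rule for bivectors."""
    return [[m[(i + 1) % 3][(j + 1) % 3] * m[(i + 2) % 3][(j + 2) % 3]
             - m[(i + 1) % 3][(j + 2) % 3] * m[(i + 2) % 3][(j + 1) % 3]
             for j in range(3)] for i in range(3)]

def mat_vec(m, v):
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

M = [[2.0, 1.0, 0.0],
     [0.0, 3.0, 1.0],
     [1.0, 0.0, 1.0]]                 # local-to-world transform with shear
det_M = dot(M[0], cofactor(M)[0])     # cofactor expansion along the first row

B_local = (0.25, 0.0, 0.5)            # a triangle's bivector (area and orientation)
D_local = (0.0, 0.0, 1.2)             # areal probability density, as a dual bivector

B_world = mat_vec(cofactor(M), B_local)                      # bivectors: cofactor(M)
D_world = tuple(x / det_M for x in mat_vec(M, D_local))      # dual bivectors: M / det M

mass_local = dot(D_local, B_local)    # probability covered by the triangle
mass_world = dot(D_world, B_world)    # same quantity, computed in world space
print(mass_local, mass_world)
```

The agreement isn’t a coincidence of this example: $M^T \, \text{cofactor}(M) = (\det M)\, I$, so the two transformation rules cancel inside the pairing.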

Some other examples where dual $k$-vectors show up:

- The derivative (gradient) of a scalar field, such as an SDF, is naturally a dual vector.
- Dual vectors represent spatial frequencies (wavevectors) in Fourier analysis.
- The radiance carried by a ray is a density with respect to projected area, and can therefore be represented, at least in part, as a dual bivector.
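
The first of these is easy to check numerically. The sketch below takes a made-up scalar field (a stand-in for an SDF), moves it to world space by composing with $M^{-1}$, and compares the finite-difference gradient there against $M^{-T}$ applied to the local gradient. The field, the matrix, and the helper names are all assumptions for the sake of the example.

```python
def cofactor(m):
    return [[m[(i + 1) % 3][(j + 1) % 3] * m[(i + 2) % 3][(j + 2) % 3]
             - m[(i + 1) % 3][(j + 2) % 3] * m[(i + 2) % 3][(j + 1) % 3]
             for j in range(3)] for i in range(3)]

def mat_vec(m, v):
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

def gradient(f, p, h=1e-4):
    """Central-difference gradient of the scalar field f at the point p."""
    g = []
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        g.append((f(hi) - f(lo)) / (2.0 * h))
    return tuple(g)

def field(p):
    """Hypothetical local-space scalar field."""
    return p[0] * p[0] + 3.0 * p[1] - p[0] * p[2]

M = [[2.0, 1.0, 0.0],
     [0.0, 3.0, 1.0],
     [1.0, 0.0, 1.0]]
cof = cofactor(M)
det_M = sum(M[0][j] * cof[0][j] for j in range(3))
M_inv_T = [[c / det_M for c in row] for row in cof]   # inverse transpose
M_inv = [list(col) for col in zip(*M_inv_T)]          # inverse

def world_field(q):
    """The same field, expressed in world-space coordinates."""
    return field(mat_vec(M_inv, q))

p = (1.0, 2.0, 0.5)
grad_local = gradient(field, p)
grad_world = gradient(world_field, mat_vec(M, p))     # gradient at the transformed point
predicted = mat_vec(M_inv_T, grad_local)              # dual-vector transformation rule
max_error = max(abs(a - b) for a, b in zip(grad_world, predicted))
print(grad_world, predicted)
```

This is the same inverse-transpose rule we started the series with: a normal computed as an SDF gradient transforms as a dual vector.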

Like many theoretical math concepts, I think these ideas are mostly useful for enriching your own
mental models of geometry, strengthening your thought process, and deriving results that you can
then use in code in a more “conventional” way. I’m not *necessarily* suggesting we should
all go off and start implementing $k$-vectors and their duals as classes in our math libraries.
(Frankly, our math libraries are enough of a mess already.)

## Organizing the Zoo

One more thing to muse on before I leave you. We’ve seen that there is a “scaling zoo” of mathematical elements with different physical, geometric interpretations and behaviors. Different branches of science and math have distinct ways of conceptually organizing this zoo, and thinking about its denizens and their relationships.

In computer science, for example, we would probably understand vectors, bivectors, dual vectors, and
so forth as different *types*. Each might have an internal structure as a composition of more elementary
values (real numbers), and a suite of allowed operations that define what you can do with them and
how they interact with one another.

Physicists, meanwhile, tend to take a more rough-and-ready approach: geometric elements are
thought of as simply matrices of real (or sometimes complex) numbers, together with *transformation
laws*—rules that define what happens to a given matrix under a change of coordinates. Algebraic
properties such as anticommutativity are obtained by constructing the matrices in such a way that matrix
multiplication implements the desired algebra. For example, a bivector can be represented as an
antisymmetric matrix; wedging two vectors $u, v$ to make a bivector corresponds to calculating the matrix
$$
uv^T - vu^T
$$
which has the same anticommutative property as a wedge product. Multiplying this matrix by a (dual)
vector $w$ then represents the interior product of the bivector with $w$. Meanwhile, a dual
bivector would be structurally similar, but have a different transformation law (“covariant” versus
“contravariant”).
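
Here’s a minimal sketch of that matrix picture, with hypothetical helper names. One detail worth noticing: multiplying by $w$ on the *left* (as a row vector) reproduces the interior product formula $\langle w, u \rangle v - u \langle w, v \rangle$ from earlier, while multiplying on the right (as a column vector) gives its negative. That’s one of those convention differences between references.

```python
def wedge_matrix(u, v):
    """Antisymmetric matrix u v^T - v u^T representing the bivector u ^ v."""
    return [[u[i] * v[j] - v[i] * u[j] for j in range(3)] for i in range(3)]

def row_times_matrix(w, a):
    """w^T A, treating w as a row vector."""
    return tuple(sum(w[i] * a[i][j] for i in range(3)) for j in range(3))

def matrix_times_col(a, w):
    """A w, treating w as a column vector."""
    return tuple(sum(a[i][j] * w[j] for j in range(3)) for i in range(3))

u = (1.0, 0.0, 1.0)
v = (0.0, 1.0, 3.0)
w = (0.0, 0.0, 2.0)   # a (dual) vector

A = wedge_matrix(u, v)
A_swapped = wedge_matrix(v, u)          # swapping the factors negates the matrix

left = row_times_matrix(w, A)           # <w,u> v - <w,v> u
right = matrix_times_col(A, w)          # <w,v> u - <w,u> v, the negative of `left`

print(A, left, right)
```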

Lastly, mathematicians like to formalize things by saying that different geometric
quantities are elements of different *spaces* and/or *algebras*. Both terms ultimately mean a
set (in the mathematical sense), together with some extra structure—such as algebraic operations,
a topology, a norm or metric, and so on—defined on top of the bare set. The exact kind of structures
you need depends on what you’re doing, and there’s a whole menagerie of such structures that
might be invoked in different contexts.

So which structure is behind the scaling zoo? We know we’ve got the vector space structure, and the Grassmann algebraic structure. But neither of these fully accounts for the different scaling and transformation behaviors of dual elements: dual spaces are isomorphic to their base spaces (in finite dimensions), totally identical insofar as the vector and Grassmann structures are concerned.

I don’t have a fully developed answer yet, but I suspect it’s got to do with the representation theory of Lie groups. My guess is that the different types of scaling elements we’ve seen can be codified as vector spaces acted on by different representations of $GL(n)$, the Lie group of all invertible linear maps on $\Bbb R^n$. But I’m not going to get into that here. (If you’d like to read more on this, here are a couple web references: one, two. Also: Peter Woit’s book on the role of representation theory in particle physics.)

## Conclusion

I hope this has been an entertaining and enlightening tour through some of the layers
beneath the surface of your favorite Euclidean geometry. We started with a seemingly simple
question—why do normal vectors transform using the inverse transpose matrix?—and found
that there was a *much* richer structure there than meets the eye.

The “scaling zoo” of $k$-vectors and their duals makes a pleasingly complete and symmetrical whole. Even if I’m not going to be employing these things in practical work every day, I feel that studying them has helped me understand some things that were vague and foggy in my mind before. It’s worth appreciating that these subtle distinctions exist. One of my general axioms in life is that everything is more complicated than it first appears, and nowhere is this more consummately borne out than mathematics!