Hi everyone!
Consider the following problem. You're given two arrays $$$a_1, a_2, \dots, a_{n!}$$$ and $$$b_1, b_2, \dots, b_{n!}$$$.
You need to compute the array $$$c_1, c_2, \dots, c_{n!}$$$, such that
$$$ c_k = \sum\limits_{i \circ j = k} a_i b_j, $$$where $$$i \circ j$$$ denotes the composition of the $$$i$$$-th and the $$$j$$$-th lexicographically smallest permutations of $$$n$$$ elements.
Generally, the typical approach to such problems is to escape into a space in which the convolution corresponds to the component-wise product of two vectors, and then to transform back into the initial space. For example, with the summation rule being $$$i + j \equiv k \pmod n$$$, it is well known that the corresponding transform is exactly the discrete Fourier transform of the vectors $$$a_i$$$ and $$$b_j$$$.
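Before diving into the general theory, here is a quick numerical sanity check of the cyclic case (a toy illustration in Python with numpy; the array size and values are arbitrary):

```python
import numpy as np

# Sanity check for the cyclic case: with the rule i + j = k (mod n),
# the transform that turns convolution into a pointwise product is the DFT.
rng = np.random.default_rng(0)
n = 8
a, b = rng.standard_normal(n), rng.standard_normal(n)

# Direct convolution: c_k = sum over i + j = k (mod n) of a_i * b_j.
c_direct = np.zeros(n)
for i in range(n):
    for j in range(n):
        c_direct[(i + j) % n] += a[i] * b[j]

# Same result via DFT -> pointwise product -> inverse DFT.
c_fft = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real

assert np.allclose(c_direct, c_fft)
print("cyclic convolution via DFT matches")
```

The goal of this blog is to understand what replaces `np.fft.fft` and `np.fft.ifft` when the index arithmetic is composition in an arbitrary finite group.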
But with more general groups, things may be much more complicated. This is the second blog in a series of posts explaining how to deal with them.
Great thanks to Endagorion and Golovanov399 for helping me understand the theory for this part!
In this blog, we will learn the basics of character theory which will allow us to define the inverse Fourier transform on finite groups.
Prerequisites
Difficulty: ★★★★★
One is expected to have some reasonable intuition in group theory, to be well-familiar with linear algebra, and to be familiar with some tangent definitions (e.g. what a ring, a field, a linear span, a vector space, a dot product, a Hermitian product, etc. are). Familiarity with the discrete Fourier transform and similar transforms used to compute convolutions is also essential.
$$$\operatorname{Hom}(V, W)$$$ as a representation
As this section is particularly dense with mathematical notation, I decided to hide it under a spoiler tag. It is highly recommended to familiarize yourself with it, as it provides a whole lot of necessary intuition and motivation as to why exactly we introduce the characters the way we do in the next section. However, if you're more interested in concrete results and are fine with skipping less relevant details, you can proceed to the next section directly.
Or you can see for yourself how deep the rabbit hole goes...
Let $$$x$$$ and $$$y$$$ be representations of a group $$$G$$$ in $$$V$$$ and $$$W$$$ correspondingly.
Recall that we used the notion of equivariant maps to define the isomorphism of representations. In particular, we said that a linear map $$$f : V \to W$$$ is equivariant if $$$f x^g = y^g f$$$ for any $$$g \in G$$$. Now, let's make a deeper dive into how equivariant maps are structured.
Def. 1. The vector space of linear maps $$$f : V \to W$$$ is denoted $$$\operatorname{Hom}(V, W)$$$.
In case you forgot, the sum of linear maps $$$f_1 + f_2$$$ is defined as $$$(f_1 + f_2)(v) = f_1(v) + f_2(v)$$$, and the multiplication by a scalar $$$k$$$ is defined as $$$(k f_1)(v) = k f_1(v)$$$. Under these definitions, one important observation is that if $$$f_1$$$ and $$$f_2$$$ are equivariant maps, then $$$f_1 + f_2$$$ and $$$k f_1$$$ are also equivariant maps. It means that the set of equivariant maps forms a subspace of the vector space of all linear maps.
Def. 2. The space of equivariant maps between representations $$$x$$$ and $$$y$$$ in spaces $$$V$$$ and $$$W$$$ is denoted by $$$\operatorname{Hom}_G(V, W)$$$.
Now, let's look again at the condition $$$f x^g = y^g f$$$. For the isomorphism of representations, we rewrote it as
$$$ x^g = f^{-1} y^g f. $$$The identity above requires that $$$f$$$ is an invertible transform. On the other hand, $$$x^g$$$ is always invertible, hence we could also rewrite it as
$$$ f = y^g f (x^g)^{-1}. $$$The right hand side of the expression above can be perceived as a linear transform $$$z^g$$$ defined on the vector space $$$\operatorname{Hom}(V, W)$$$ as
$$$ z^g(f) = y^g f (x^g)^{-1}. $$$In this way, $$$z$$$ can be treated as a representation of $$$G$$$ in the space $$$\operatorname{Hom}(V, W)$$$. One important property of this representation is that the subspace $$$\operatorname{Hom}_G(V, W)$$$ is exactly the set of fixed points of the representation (that is, of linear maps $$$f : V \to W$$$ such that $$$z^g(f) = f$$$ for every $$$g \in G$$$).
Dimension of $$$\operatorname{Hom}_G(V, W)$$$ for irreducible representations
Let $$$x$$$ and $$$y$$$ be irreducible representations. Consider an equivariant map $$$f : V \to W$$$ between them.
- The kernel $$$\ker f= \{v \in V : f(v) = 0\}$$$ is an invariant subspace of every $$$x^g$$$, as $$$f(x^g(v)) = y^g(f(v)) = y^g(0) = 0$$$;
- The image $$$\operatorname{Im} f = \{f(v) \in W : v \in V\}$$$ is an invariant subspace of every $$$y^g$$$, as $$$y^g(f(v)) = f(x^g(v)) \in \operatorname{Im} f$$$.
Since $$$x$$$ and $$$y$$$ are irreducible, it means that one of the following is true:
- $$$\ker f = V$$$ and $$$\operatorname{Im} f = \{0\}$$$, hence $$$f$$$ is a trivial map;
- $$$\ker f = \{0\}$$$ and $$$\operatorname{Im} f = W$$$, hence $$$f$$$ is an isomorphism.
So, if $$$x$$$ and $$$y$$$ are not isomorphic, the only possible equivariant map is $$$f(v) = 0$$$, meaning that $$$\dim \operatorname{Hom}_G(V, W) = 0$$$.
And what if $$$x$$$ and $$$y$$$ are isomorphic? In this case, we may assume $$$V = W$$$ and $$$x = y$$$. Since $$$\mathbb C$$$ is algebraically closed, $$$f : V \to W$$$ has an eigenvalue $$$\lambda$$$ and a corresponding eigenvector $$$v$$$ such that $$$(f - \lambda I) v = 0$$$, meaning that the equivariant map $$$f' = f - \lambda I$$$ has a non-trivial kernel.
On the other hand, the result above still holds, and implies that $$$f'$$$ is either trivial or an isomorphism; having a non-trivial kernel, it must be trivial, so $$$f' = 0$$$ and $$$f = \lambda I$$$. In other words, all equivariant maps are scalar multiples of the identity map, and $$$\dim \operatorname{Hom}_G(V, W) = 1$$$ in this case.
To summarize, for irreducible representations $$$x$$$ and $$$y$$$ over $$$\mathbb C$$$ it always holds that
$$$ \dim \operatorname{Hom}_G(V, W) = \begin{cases} 0, & x \not\sim y, \\ 1, & x \sim y. \end{cases} $$$This result is commonly known in representation theory as Schur's lemma.
Projection on $$$\operatorname{Hom}_G(V, W)$$$
Consider a linear transformation $$$T$$$ of $$$\operatorname{Hom}(V, W)$$$, defined as
$$$ Tf = \frac{1}{|G|} \sum\limits_{g \in G} z^g f. $$$Defined this way, the result of applying $$$T$$$ to any linear map $$$f : V \to W$$$ lies in $$$\operatorname{Hom}_G(V, W)$$$, as
$$$ z^h T f = \frac{1}{|G|} \sum\limits_{g \in G} z^h z^g f = \frac{1}{|G|} \sum\limits_{g \in G} z^{hg} f = T f. $$$At the same time, $$$Tf = f$$$ for any $$$f$$$ that is already an equivariant map, since $$$z^g(f) = f$$$ in this case. This allows us to conclude that $$$T^2 = T$$$, from which it follows that $$$T$$$ is a projection that maps each element of $$$\operatorname{Hom}(V, W)$$$ into $$$\operatorname{Hom}_G(V, W)$$$.
This is a very important fact for us, because we already know the dimension of $$$\operatorname{Hom}_G(V, W)$$$. On the other hand, the trace of a projection operator is always equal to the dimension $$$d$$$ of the subspace onto which it projects. This follows from the fact that a projection operator can be diagonalized in a basis whose first $$$d$$$ vectors are from the target subspace and whose remaining vectors are from the kernel of the projection. Then the first $$$d$$$ diagonal elements are equal to $$$1$$$, and the remaining ones are equal to $$$0$$$.
At the same time, when $$$x$$$ and $$$y$$$ are irreducible representations, we already know the dimension of $$$\operatorname{Hom}_G(V, W)$$$ from the formula above. So, by finding the trace of $$$T$$$, we can connect it with that result.
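To make this concrete, here is a small numerical experiment. It takes $$$G = S_3$$$ with its classically known sign and $$$2$$$-dimensional standard irreducible representations (the concrete matrices are this sketch's choice, not anything fixed by the text), builds $$$T$$$ as a matrix acting on $$$\operatorname{Hom}(V, W)$$$, and checks that it is a projection whose trace equals $$$\dim \operatorname{Hom}_G(V, W)$$$:

```python
import numpy as np
from itertools import permutations

# G = S_3, listed as tuples (p[0], p[1], p[2]).
perms = list(permutations(range(3)))

def perm_matrix(p):
    m = np.zeros((3, 3))
    for i, pi in enumerate(p):
        m[pi, i] = 1
    return m

def sign(p):
    s = 1
    for i in range(3):
        for j in range(i + 1, 3):
            s *= 1 if p[i] < p[j] else -1
    return s

# Standard 2-dim irrep: restrict the permutation action to the plane
# x + y + z = 0, using an orthonormal basis of that plane (columns of B).
B = np.array([[1, 1], [-1, 1], [0, -2]]) / np.sqrt([2, 6])
std = {p: B.T @ perm_matrix(p) @ B for p in perms}
sgn = {p: np.array([[float(sign(p))]]) for p in perms}

def avg_projector(x, y):
    # Matrix of f -> y^g f (x^g)^{-1} acting on vec(f) is ((x^g)^{-1})^T (x) y^g.
    zs = [np.kron(np.linalg.inv(x[p]).T, y[p]) for p in perms]
    return sum(zs) / len(perms)

T_same = avg_projector(std, std)   # x ~ y
T_diff = avg_projector(std, sgn)   # non-isomorphic irreducibles
assert np.allclose(T_same @ T_same, T_same)   # T is a projection
assert np.isclose(np.trace(T_same), 1)        # dim Hom_G(V, V) = 1
assert np.isclose(np.trace(T_diff), 0)        # dim Hom_G(V, W) = 0
print("traces match Schur's lemma")
```

The Kronecker-product form of $$$z^g$$$ used in `avg_projector` is exactly the one that appears in the next section.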
The trace of $$$T$$$
Since the trace of a linear map is a linear function, we can write it as
$$$ \operatorname{tr} T = \frac{1}{|G|} \sum\limits_{g \in G} \operatorname{tr} z^g. $$$Recall that $$$z^g(f) = y^g f (x^g)^{-1}$$$. How do we find the trace of $$$z^g$$$ from $$$x^g$$$ and $$$y^g$$$? Since $$$x^g$$$ and $$$y^g$$$ are, without loss of generality, unitary, they are diagonalizable. Let $$$v \in V$$$ be an eigenvector of $$$x^g$$$ corresponding to an eigenvalue $$$\lambda$$$, and $$$w \in W$$$ an eigenvector of $$$y^g$$$ corresponding to an eigenvalue $$$\mu$$$. Then the linear map described in matrix form as $$$f = wv^*$$$ is an eigenvector of $$$z^g$$$ corresponding to the eigenvalue $$$\lambda^* \mu$$$. Indeed, $$$(x^g)^{-1} = (x^g)^*$$$ for unitary matrices, and
$$$ z^g (wv^*) = y^g w (x^g v)^* = \mu (wv^*)\lambda^*. $$$From this follows that if $$$\lambda_1, \dots, \lambda_n$$$ are the eigenvalues of $$$x^g$$$ and $$$\mu_1, \dots, \mu_m$$$ are the eigenvalues of $$$y^g$$$, then all the eigenvalues of $$$z^g$$$ have the form $$$\lambda_i^* \mu_j$$$ for all $$$i$$$ and $$$j$$$. This, in turn, means that the trace of $$$z^g$$$ can be obtained as
$$$ \operatorname{tr} z^g = \operatorname{tr} (x^g)^* \operatorname{tr} y^g. $$$This result reflects that $$$z^g$$$ is, in fact, isomorphic to the Kronecker product $$$y^g \otimes (x^g)^*$$$ of the transforms $$$y^g$$$ and $$$(x^g)^*$$$.
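The trace identity behind this is easy to verify numerically; here is a tiny sketch on random matrices (the identity $$$\operatorname{tr}(A \otimes B) = \operatorname{tr} A \cdot \operatorname{tr} B$$$ holds for any square matrices, no unitarity needed):

```python
import numpy as np

# Check tr z^g = (tr x^g)^* * tr y^g via the Kronecker-product form
# z^g ~ y^g (x) (x^g)^*, on arbitrary complex matrices.
rng = np.random.default_rng(2)
x = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
y = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
z = np.kron(y, x.conj())  # elementwise conjugate; tr(x.conj()) = tr(x)^*
assert np.isclose(np.trace(z), np.trace(x).conj() * np.trace(y))
print("tr(y (x) x*) = tr(x)^* tr(y)")
```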
Characters
The result above allows us to introduce the key concept of the representation theory, the characters of the representation.
Def. 3. Let $$$x$$$ be a representation of a group $$$G$$$ in a vector space $$$V$$$. The character of $$$x$$$ at a group element $$$g \in G$$$ is defined as
$$$ \chi(g) = \operatorname{tr} x^g. $$$Let $$$\chi$$$ and $$$\theta$$$ be the characters of irreducible representations $$$x$$$ and $$$y$$$. Then, using the results above, we get
$$$ \frac{1}{|G|} \sum\limits_{g \in G} \chi(g) \cdot \theta(g)^* = \begin{cases} 1, & x \sim y, \\ 0, & x \not\sim y. \end{cases} $$$The expression above can be treated as an inner product $$$\langle \theta, \chi \rangle$$$ on the vector space $$$\mathbb C^{|G|}$$$.
Orthonormality of the characters
One particularly nice property of characters is that if $$$x \sim x_1 + x_2$$$, and the representations $$$x_1$$$ and $$$x_2$$$ have characters $$$\chi_1$$$ and $$$\chi_2$$$, then the character $$$\chi$$$ of $$$x$$$ can be obtained as the sum of the characters of $$$x_1$$$ and $$$x_2$$$, that is
$$$ \chi(g) = \chi_1(g) + \chi_2(g). $$$Let $$$x_1, x_2, \dots$$$ be the set of pairwise non-isomorphic irreducible representations, and let $$$\chi_1, \chi_2, \dots \in \mathbb C^{|G|}$$$ be their character vectors. Then, as was already noted above, $$$\langle \chi_i, \chi_j\rangle = \delta_{ij}$$$, that is, the inner product is equal to $$$1$$$ when $$$i = j$$$ and $$$0$$$ otherwise. In particular, it means that irreducible characters are linearly independent, so there may be at most $$$|G|$$$ pairwise non-isomorphic irreducible representations.
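Let's check this orthonormality on $$$G = S_3$$$, whose three irreducible characters (trivial, sign, and the $$$2$$$-dimensional standard representation) are classically known; this concrete table is the example's assumption, not something derived above:

```python
import numpy as np
from itertools import permutations

perms = list(permutations(range(3)))

def sign(p):
    s = 1
    for i in range(3):
        for j in range(i + 1, 3):
            s *= 1 if p[i] < p[j] else -1
    return s

def fixed_points(p):
    return sum(1 for i in range(3) if p[i] == i)

chi_triv = np.array([1.0 for p in perms])
chi_sign = np.array([float(sign(p)) for p in perms])
# Character of the standard rep: (#fixed points) - 1, since the
# 3-dim permutation rep splits as trivial + standard.
chi_std = np.array([fixed_points(p) - 1.0 for p in perms])

def inner(a, b):
    return (a * b.conj()).sum() / len(perms)

chars = [chi_triv, chi_sign, chi_std]
gram = np.array([[inner(a, b) for b in chars] for a in chars])
assert np.allclose(gram, np.eye(3))
print("irreducible characters of S_3 are orthonormal")
```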
Another corollary here is that every representation $$$x$$$ is uniquely determined (up to isomorphism) by its character. Indeed, let
$$$ x \sim v_1 x_1 + v_2 x_2 + \dots, $$$where $$$v_1, v_2, \dots$$$ are non-negative integers and $$$v_i x_i \sim x_i + x_i + \dots$$$, repeated $$$v_i$$$ times. Then
$$$ \chi = v_1 \chi_1 + v_2 \chi_2 + \dots, $$$and the numbers $$$v_1, v_2, \dots$$$ are uniquely determined as $$$v_i = \langle \chi, \chi_i \rangle$$$ due to orthonormality of $$$\chi_1, \chi_2, \dots$$$. From this also follows that
$$$ \langle \chi, \chi \rangle = v_1^2 + v_2^2 + \dots $$$Then, since any representation of a finite group can be decomposed into a direct sum of irreducible representations, we can deduce that the representation $$$x$$$ of a finite group $$$G$$$ is irreducible if and only if for its characters holds
$$$ \langle \chi, \chi \rangle = \frac{1}{|G|}\sum\limits_{g \in G} \chi(g) \chi(g)^* = 1. $$$
Regular representation
Def. 4. The regular representation of $$$G$$$ is a representation $$$x$$$ in the space $$$\mathbb C^{|G|}$$$ such that $$$x^g(e_h) = e_{gh}$$$ for a basis $$$\{e_h\}$$$ indexed by the elements $$$h \in G$$$.
This representation, while very simple and straightforward in its definition, has a bunch of very interesting properties in terms of its characters. First of all, we may notice that in this representation, the characters are expressed as
$$$ r(g) = \operatorname{tr} x^g = \begin{cases} |G|, & g = 1, \\ 0, & g \neq 1. \end{cases} $$$This is due to the fact that $$$x^1 = I$$$ is the identity matrix, as $$$x^1 e_h = e_{1h} = e_h$$$, while any other $$$x^g$$$, represented as a matrix, has only zero elements on the diagonal, hence it has zero trace. This fact allows us to express the coefficients in the decomposition of the regular representation character $$$r$$$ into irreducible characters as
$$$ \langle r, \chi_i \rangle = \frac{1}{|G|} r(1) \chi_i^*(1) = \chi_i^*(1) = \dim V_i, $$$where $$$V_i$$$ is the space of the irreducible representation $$$x_i$$$. Here we used the fact that $$$\chi(1)$$$ always equals the dimension of $$$V$$$, as $$$x^1$$$ is always the identity matrix. In other words, each unique irreducible representation $$$x_i$$$ is included in the regular representation $$$x$$$ with multiplicity equal to the degree of $$$x_i$$$. From this also follows that
$$$ |G| = \frac{1}{|G|} r(1) r(1)^* = \langle r, r \rangle = (\deg x_1)^2 + (\deg x_2)^2 + \dots, $$$where $$$\deg x_i = \dim V_i$$$. Another important result here is that $$$r$$$ is expressed as
$$$ r = \chi_1 \deg x_1 + \chi_2 \deg x_2 + \dots, $$$which allows us to reformulate initial formula for $$$r(g)$$$ as
$$$ \frac{1}{|G|} \sum\limits_i \chi_i (g) \deg x_i = \begin{cases} 1, & g = 1 \\ 0, & g \neq 1. \end{cases} $$$Note: Rings a bell yet? Recall the identity
$$$ \frac{1}{n} \sum\limits_{j=0}^{n-1} \omega_n^{jk} = \begin{cases} 1, & k \equiv 0 \pmod n, \\ 0, & k \not\equiv 0 \pmod n, \end{cases} $$$where $$$\omega_n$$$ is a primitive $$$n$$$-th root of unity, that is, $$$\omega_n^n = 1$$$ and $$$\omega_n^m \neq 1$$$ for $$$0 < m < n$$$. And, indeed, $$$\chi_j(k) = \omega_n^{kj}$$$ are the $$$1$$$-dimensional irreducible characters of the cyclic group, which gives the connection of the standard discrete Fourier transform with the formula above.
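Both instances of this identity can be checked in a few lines; the $$$S_3$$$ character table below (classes: identity, $$$3$$$ transpositions, $$$2$$$ three-cycles) is assumed known rather than derived here:

```python
import cmath
from math import isclose

# Checking (1/|G|) * sum_i chi_i(g) * deg(x_i) = [g == 1] in two settings.

# S_3: columns of the character table, one row per conjugacy class.
degrees = [1, 1, 2]                       # trivial, sign, standard
table = {
    "id":            [1,  1,  2],
    "transposition": [1, -1,  0],
    "3-cycle":       [1,  1, -1],
}
for cls, chis in table.items():
    val = sum(c * d for c, d in zip(chis, degrees)) / 6
    assert isclose(val, 1.0 if cls == "id" else 0.0)

# Z_n: chi_j(k) = w^(j*k) with w = exp(2*pi*i/n), all degrees are 1,
# so the identity reduces to the root-of-unity sum above.
n = 5
w = cmath.exp(2j * cmath.pi / n)
for k in range(n):
    s = sum(w ** (j * k) for j in range(n)) / n
    assert abs(s - (1.0 if k == 0 else 0.0)) < 1e-9

print("column orthogonality holds for S_3 and Z_5")
```

Note also that the `"id"` row encodes $$$|G| = \sum_i (\deg x_i)^2$$$: for $$$S_3$$$, $$$1^2 + 1^2 + 2^2 = 6$$$.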
Inverse Fourier transform
Now, assume once again that
$$$ F(x) = \sum\limits_{g \in G} f_g x^g, $$$and that we know $$$F(x_1), F(x_2), \dots$$$ for all irreducible representations $$$x_1, x_2, \dots$$$ of $$$G$$$. How do we recover $$$f_1, f_2, \dots, f_{|G|}$$$?
First of all, let's think for a moment why it is possible at all. Above, while analyzing the regular representation, we found out that
$$$ |G| = (\deg x_1)^2 + (\deg x_2)^2 + \dots $$$It is a very reassuring fact for us, because $$$|G|$$$ is the dimension of $$$(f_1, f_2, \dots, f_{|G|})$$$ if we treat it as a vector. On the other hand, each $$$F(x_i)$$$ can be represented as a square matrix of order $$$\deg x_i$$$, meaning that overall the space of all possible tuples $$$F(x_1), F(x_2), \dots$$$ has dimension equal to the sum of the squared $$$\deg x_i$$$. So, at the very least, we clearly see that the dimensions match up, heavily hinting that the inverse transform should exist. What should such a transform look like?
Let's once again consider the regular representation $$$x$$$. As we found out above,
$$$ x \sim x_1 \deg x_1 + x_2 \deg x_2 + \dots $$$So, if we compute $$$F(x)$$$ in the regular representation $$$x$$$ (assuming the basis corresponds to the decomposition), what we get is a block-diagonal matrix, where the diagonal has $$$\deg x_1$$$ blocks that are equal to $$$F(x_1)$$$, then $$$\deg x_2$$$ blocks that are equal to $$$F(x_2)$$$, and so on. We can introduce a natural inner product on the linear space of all possible values of $$$F(x)$$$, defined in the coordinate form as
$$$ \langle F, G \rangle = \sum\limits_{i,j} F(x)_{ij} G(x)_{ij}^*. $$$What is even more important is that the same inner product can be rewritten in a coordinate-agnostic form as
$$$ \langle F, G \rangle = \sum\limits_i \left(\sum\limits_j F(x)_{ij} G(x)_{ij}^* \right) = \sum\limits_i (F(x) G(x)^*)_{ii} = \operatorname{tr} [F(x) G(x)^*]. $$$Note that for $$$F(x) = x^g$$$ and $$$G(x) = x^h$$$, by definition of the regular representation, it holds that
$$$ \langle x^g, x^h \rangle = \operatorname{tr}[x^g (x^h)^*] = \operatorname{tr} x^{gh^{-1}} = \begin{cases} |G|, & g = h, \\ 0, & g \neq h, \end{cases} $$$as $$$(x^h)^* = (x^h)^{-1} = x^{h^{-1}}$$$ for unitary matrices, and $$$x^g x^{h^{-1}} = x^{gh^{-1}}$$$. Hence, if we redefine the inner product to be divided by $$$|G|$$$, the family of transforms $$$x^g$$$ for all $$$g \in G$$$ forms an orthonormal basis in the space of all possible $$$F(x)$$$. On the other hand, since $$$F(x)$$$ and $$$G(x)$$$ are block-diagonal with the same block sizes, we can further rewrite it as
$$$ \langle F, G \rangle = \frac{1}{|G|}\sum\limits_i \deg x_i \operatorname{tr} [F(x_i) G(x_i)^*]. $$$Using this formula, which is also known as the Plancherel formula, the decomposition in this basis is expressed through the inner product as
$$$ \boxed{f_g = \langle F, x^g \rangle = \frac{1}{|G|} \sum\limits_i d_i \operatorname{tr} \left[ y_i x_i^{g^{-1}}\right]} $$$where $$$y_i = F(x_i)$$$, $$$d_i = \deg x_i$$$, and $$$x_1, x_2, \dots$$$ are all pairwise non-isomorphic irreducible representations of $$$G$$$. The formula above is known as the inverse Fourier transform on finite groups.
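To see the whole machinery in action, here is a small round-trip sketch for $$$G = S_3$$$, using its three classically known irreducible representations (trivial, sign, and the $$$2$$$-dimensional standard one; the concrete matrix realization below is this example's choice):

```python
import numpy as np
from itertools import permutations

perms = list(permutations(range(3)))

def perm_matrix(p):
    m = np.zeros((3, 3))
    for i, pi in enumerate(p):
        m[pi, i] = 1
    return m

def sign(p):
    s = 1
    for i in range(3):
        for j in range(i + 1, 3):
            s *= 1 if p[i] < p[j] else -1
    return s

def inverse(p):
    q = [0] * 3
    for i, pi in enumerate(p):
        q[pi] = i
    return tuple(q)

# Standard 2-dim irrep: permutation action restricted to x + y + z = 0.
B = np.array([[1, 1], [-1, 1], [0, -2]]) / np.sqrt([2, 6])
irreps = [
    {p: np.array([[1.0]]) for p in perms},             # trivial
    {p: np.array([[float(sign(p))]]) for p in perms},  # sign
    {p: B.T @ perm_matrix(p) @ B for p in perms},      # standard (unitary)
]

rng = np.random.default_rng(1)
f = {p: rng.standard_normal() for p in perms}

# Forward transform: F(x_i) = sum_g f_g x_i^g.
F = [sum(f[p] * x[p] for p in perms) for x in irreps]

# Inverse transform: f_g = (1/|G|) * sum_i d_i * tr(F(x_i) x_i^{g^{-1}}).
for g in perms:
    recovered = sum(
        x[g].shape[0] * np.trace(Fi @ x[inverse(g)])
        for Fi, x in zip(F, irreps)
    ) / len(perms)
    assert np.isclose(recovered, f[g])
print("inverse Fourier transform on S_3 recovers all f_g")
```

Note that the forward transform here is computed naively; making it (and its inverse) fast for $$$S_n$$$ is a separate story.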
Extra: Conjugacy classes
Let's recall the following definition from group theory:
Def. 5. The elements $$$a, b \in G$$$ are conjugate if there is an element $$$g \in G$$$ such that $$$a = g^{-1} b g$$$.
Conjugacy is an equivalence relation, meaning that we can define the conjugacy classes:
Def. 6. A conjugacy class is an equivalence class in $$$G$$$ under the conjugacy relation.
If $$$a$$$ and $$$b$$$ are conjugate, for the representation $$$x$$$ it means that $$$x^a = (x^g)^{-1} x^b x^g$$$, so the linear maps $$$x^a$$$ and $$$x^b$$$ must be similar. In particular, it means that $$$x^a$$$ and $$$x^b$$$ have the same characteristic polynomial. Thus, $$$\operatorname{tr} x^a = \operatorname{tr} x^b$$$, hence $$$\chi(a) = \chi(b)$$$.
In other words, for any conjugacy class $$$K$$$, the character is the same for all elements of the conjugacy class. Let $$$\chi(K)$$$ be the value of the character on the conjugacy class $$$K$$$, then the inner product of characters rewrites as
$$$ \langle \theta, \chi \rangle = \frac{1}{|G|} \sum\limits_K |K| \cdot \chi(K) \cdot \theta(K)^*, $$$and we may treat the characters as vectors in $$$\mathbb C^k$$$ instead of $$$\mathbb C^{|G|}$$$, where $$$k$$$ is the number of conjugacy classes in $$$G$$$. This allows us to provide a stricter upper bound on the number of irreducible representations of $$$G$$$. Since the characters of irreducible representations are also orthonormal over $$$\mathbb C^k$$$, there may be no more than $$$k$$$ pairwise non-isomorphic irreducible representations of $$$G$$$. And this bound is, in fact, always achieved!
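As a quick illustration, we can compute the conjugacy classes of $$$S_3$$$ directly from Def. 5 and confirm that their count matches the three irreducible representations used in the examples above ($$$S_3$$$ is, again, just this sketch's pick of a small group):

```python
from itertools import permutations

perms = list(permutations(range(3)))

def compose(p, q):          # (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    q = [0] * 3
    for i, pi in enumerate(p):
        q[pi] = i
    return tuple(q)

# Conjugacy class of a: all g^{-1} a g over g in G.
classes = []
seen = set()
for a in perms:
    if a in seen:
        continue
    cls = {compose(inverse(g), compose(a, g)) for g in perms}
    seen |= cls
    classes.append(cls)

assert len(classes) == 3    # identity, transpositions, 3-cycles
print("class sizes:", sorted(len(c) for c in classes))  # -> [1, 2, 3]
```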
Proof that it is always achieved
Above, we have shown that the dimension of the vector space spanned by irreducible characters is at most $$$k$$$. To prove that it is, indeed, $$$k$$$, consider the vector space of all class functions, that is, of functions $$$f : G \to \mathbb C$$$ that are constant on any conjugacy class $$$K$$$. Obviously, it is a linear space of dimension $$$k$$$. The span of the irreducible characters is a subspace of the class function space, hence it suffices to show that the irreducible characters form its basis. To do this, we show that if $$$\langle \chi_j, f \rangle = 0$$$ for all irreducible characters $$$\chi_j$$$, then $$$f = 0$$$.
For this, denoting $$$f_g = f(g)$$$, consider the formal expression
$$$ F(x) = \sum\limits_{g \in G} f_g x^g , $$$similar to the one that we used in the first part of the blog series. One may note that
$$$ (x^h)^{-1} F(x) x^h = \sum\limits_{g \in G} f_g x^{h^{-1} g h} = \sum\limits_{g \in G} f_{hg h^{-1}} x^g = F(x), $$$since $$$f(g) = f(hgh^{-1})$$$ for any $$$g, h \in G$$$, as $$$f(g)$$$ is a class function. Writing it a bit differently, we get
$$$ F(x) x^h = x^h F(x), $$$which means that $$$F(x)$$$ is an equivariant map from $$$V$$$ to itself. From Schur's lemma it follows that $$$F(x)$$$ is, in fact, a homothety (a scalar multiple of the identity) for any irreducible representation $$$x$$$. It means that $$$\operatorname{tr} F(x) = n \lambda$$$, where $$$n$$$ is the dimension of $$$V$$$, and $$$\lambda$$$ is the ratio of the homothety. On the other hand, the trace of $$$F(x)$$$ can be expressed through characters as
$$$ n \lambda = \operatorname{tr} F(x) = \sum\limits_{g \in G} f_g \chi (g) = |G| \langle f, \chi^*\rangle, $$$meaning that the homothety ratio is expressed as
$$$ \lambda = \frac{|G|}{n} \langle f, \chi^* \rangle. $$$But earlier we assumed that $$$\langle f, \chi_j \rangle = 0$$$ for any irreducible character $$$\chi_j$$$, and $$$\chi^*$$$ is itself an irreducible character (the character of the conjugate representation), meaning that, in fact, $$$\lambda = 0$$$, hence $$$F(x) = 0$$$ for any irreducible representation $$$x$$$. Then, since any representation is isomorphic to a direct sum of irreducible representations, it also means that $$$F(x) = 0$$$ for any representation of $$$G$$$. In particular, $$$F(x)=0$$$ for $$$x$$$ being the regular representation of $$$G$$$.
But for the regular representation it means that
$$$ F(x) e_1 = \sum\limits_{g \in G} f_g x^g e_1 = \sum\limits_{g \in G} f_g e_g = 0, $$$meaning that $$$f_g = 0$$$ for all $$$g$$$, hence $$$f = 0$$$.
It means that the number of non-isomorphic irreducible representations over the field $$$\mathbb C$$$ is exactly the number of conjugacy classes in the group $$$G$$$. Then, to find specific representations, it is often possible (in particular, in the case of the symmetric group $$$S_n$$$) to find a bijection between conjugacy classes and pairwise non-isomorphic irreducible representations. Finding the irreducible representations of the symmetric group $$$S_n$$$ will be the main topic of the next blog in the series, so stay tuned!