## Polynomial Multiplication

Given two polynomials $A(x)$ and $B(x)$ of degree-bound $n$:

$$A(x) = \sum_{i=0}^{n-1} a_{i}x^{i}$$ $$B(x) = \sum_{i=0}^{n-1} b_{i}x^{i}$$

Multiplying $A(x)$ and $B(x)$ gives the product $C(x) = \sum_{i=0}^{2n-2} c_{i}x^{i}$ with coefficients:

$$c_{i} = \sum_{j=0}^{i} a_{j}b_{i-j}$$

Here $C(x)$ is also called the convolution of $A(x)$ and $B(x)$ (denoted $A(x) \circledast B(x)$). If we calculate $C(x)$ by multiplying each term of $A(x)$ by each term of $B(x)$ and then combining terms with equal powers (exactly the way we learned in junior high school), the time complexity is $O(n^{2})$.
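To make the baseline concrete, here is a minimal Python sketch of the schoolbook method (the function name is illustrative):

```python
def naive_multiply(a, b):
    """Multiply polynomials in coefficient form in O(n^2) time.

    a[i] and b[i] hold the coefficients of x^i.
    """
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj  # a_i * b_j contributes to the x^(i+j) term
    return c

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
assert naive_multiply([1, 2], [3, 4]) == [3, 10, 8]
```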

## An alternative representation of polynomials

The representation of a polynomial we discussed above is called the coefficient representation. An alternative way of representing a polynomial is the point-value representation. A point-value representation of a polynomial $A(x)$ of degree-bound $n$ is a set of $n$ point-value pairs $\left \{ \left ( x_{0}, y_{0} \right ), \left ( x_{1}, y_{1} \right ), ..., \left ( x_{n-1}, y_{n-1} \right ) \right \}$ such that all of the $x_{k}$ are distinct and $y_{k} = A(x_{k})$ for $k = 0, 1, ..., n-1$.

The point-value representation looks more attractive for polynomial multiplication, since the product $C(x)$ has degree-bound $2n-1$ and is therefore determined by its values at any $2n-1$ distinct points (we will use $2n$ points for convenience). The value of $C(x)$ at any given point $p$ is easy to compute: it is simply $A(p)$ times $B(p)$. Thus polynomial multiplication takes only linear time ($O(n)$) in the point-value representation.

So we have a new algorithm here to get the product of two polynomials:

1. Double degree-bound: create coefficient representations of $A(x)$ and $B(x)$ as degree-bound $2n$ polynomials by adding $n$ high-order zero coefficients to each;
2. Evaluate: calculate point-value representations of $A(x)$ and $B(x)$ of length $2n$;
3. Multiply pointwise: compute a point-value representation of $C(x)=A(x)B(x)$ by multiplying the values at corresponding points;
4. Interpolate: transform $C(x)$ back from the point-value representation to the coefficient representation.

Clearly, steps (1) and (3) take $O(n)$ time. Next we'll work out the time complexity of steps (2) and (4).

## Evaluation

The operation of evaluating the polynomial $A(x)$ at a given point $x_{0}$ takes time $O(n)$ using Horner’s method:

$$A(x_{0}) = a_{0} + x_{0}(a_{1}+x_{0}(a_{2}+ ... +x_{0}(a_{n-2}+x_{0}(a_{n-1})) ... ))$$
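A quick Python sketch of Horner's method (names are illustrative):

```python
def horner_eval(a, x0):
    """Evaluate A(x0) = a[0] + a[1]*x0 + ... + a[n-1]*x0^(n-1) in O(n) time."""
    result = 0
    for coeff in reversed(a):   # fold from the highest-order coefficient inward
        result = result * x0 + coeff
    return result

# A(x) = 2 + 3x + x^2, so A(2) = 2 + 6 + 4 = 12
assert horner_eval([2, 3, 1], 2) == 12
```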

Now consider step (2), evaluation: to transform the polynomial $A(x)$ into the point-value representation, we need to evaluate it at $2n$ points, each taking $O(n)$ time, so evaluation costs $O(n^{2})$ in total.

## Interpolation

The inverse of evaluation — transforming a polynomial from a point-value representation to the coefficient representation — is called interpolation.

For any set $\left \{ (x_{0}, y_{0}), (x_{1}, y_{1}), ..., (x_{n-1}, y_{n-1}) \right \}$ of $n$ point-value pairs, there is a unique polynomial $A(x)$ of degree-bound $n$ such that $y_{k} = A(x_{k})$ for $k = 0, 1, ..., n-1$.

#### Proof

The proof is based on the invertibility of the coefficient matrix in the following system:

$$\begin{bmatrix} 1 & x_{0} & x_{0}^{2} & \ldots & x_{0}^{n-1}\\ 1 & x_{1} & x_{1}^{2} & \ldots & x_{1}^{n-1}\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n-1} & x_{n-1}^{2} & \ldots & x_{n-1}^{n-1} \end{bmatrix} \begin{bmatrix}a_{0}\\ a_{1}\\ \vdots \\ a_{n-1}\end{bmatrix} = \begin{bmatrix}y_{0}\\ y_{1}\\ \vdots \\ y_{n-1}\end{bmatrix}$$

The matrix on the left is denoted $V(x_{0}, x_{1}, ..., x_{n-1})$ and is known as a Vandermonde matrix, which has determinant:

$$\prod_{0 \leq j < k \leq n-1} \left ( x_{k} - x_{j} \right )$$

Therefore, it is invertible (i.e., nonsingular) whenever the $x_{k}$ are distinct. Thus, the coefficient vector $\overrightarrow{a}$ can be solved for uniquely given the point-value representation:

$$\overrightarrow{a}=V(x_{0}, x_{1}, ..., x_{n-1})^{-1} \overrightarrow{y}$$

Using LU decomposition, we can solve this system in $O(n^{3})$ time.
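As a sanity check, here is a small NumPy sketch of this interpolation route; `numpy.linalg.solve` performs an LU-based solve internally, and the sample values are made up:

```python
import numpy as np

# Recover the coefficients of A(x) = 3 + 10x + 8x^2 from three samples.
xs = np.array([0.0, 1.0, 2.0])
ys = 3 + 10 * xs + 8 * xs**2            # y_k = A(x_k)

# increasing=True builds rows [1, x_k, x_k^2, ...], matching V above.
V = np.vander(xs, increasing=True)
coeffs = np.linalg.solve(V, ys)          # LU-based solve, O(n^3)
print(coeffs)                            # ~[ 3. 10.  8.]
```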

This is even slower than the original algorithm, sadly. However, a faster method for $n$-point interpolation, based on Lagrange's formula, reduces the time complexity to $O(n^{2})$:

$$A(x) = \sum_{k=0}^{n-1} y_{k} \frac{\prod_{j \neq k} \left ( x - x_{j} \right ) }{\prod_{j \neq k} \left ( x_{k} - x_{j} \right )}$$
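A direct Python transcription of Lagrange's formula, evaluating the interpolating polynomial at a single query point in $O(n^{2})$ time (names are illustrative):

```python
def lagrange_interpolate(points, x):
    """Evaluate the unique degree-bound-n polynomial through `points` at x.

    points is a list of (x_k, y_k) pairs with distinct x_k; O(n^2) per query.
    """
    total = 0.0
    for k, (xk, yk) in enumerate(points):
        num, den = 1.0, 1.0
        for j, (xj, _) in enumerate(points):
            if j != k:
                num *= x - xj          # prod over j != k of (x - x_j)
                den *= xk - xj         # prod over j != k of (x_k - x_j)
        total += yk * num / den
    return total

# The points (0, 1), (1, 3), (2, 9) lie on 2x^2 + 1, whose value at 3 is 19.
assert abs(lagrange_interpolate([(0, 1), (1, 3), (2, 9)], 3) - 19) < 1e-9
```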

So far, our new algorithm still needs $O(n^{2})$ time to compute convolutions, but if we choose the evaluation points carefully, both evaluation and interpolation can be accelerated to $O(n \log n)$.

## Evaluation by divide-and-conquer

A polynomial with degree-bound $n$ can be split by its even and odd powers into two polynomials of degree-bound $n/2$; for instance:

$$x^{7} + 2x^{6} + 3x^{5} - 4x^{4} - x^{3} + 2x^{2} + 5x - 3 = (2x^{6} - 4x^{4} + 2x^{2} - 3) + x(x^{6} + 3x^{4} - x^{2} + 5)$$

More generally,

$$A(x) = A_{e}(x^{2}) + xA_{o}(x^{2})$$

Here $A_{e}(\cdot)$ is the polynomial formed from the even-indexed coefficients of $A$, while $A_{o}(\cdot)$ is formed from the odd-indexed coefficients. Since the even powers of $x_{i}$ coincide with those of $-x_{i}$, if we choose the evaluation points in positive-negative pairs $\pm x_{i}$:

$$A(x_{i}) = A_{e}(x_{i}^{2}) + x_{i}A_o(x_{i}^{2})$$ $$A(-x_{i}) = A_{e}(x_{i}^{2}) - x_{i}A_o(x_{i}^{2})$$

then evaluating $A(x)$ at $n$ paired points reduces to evaluating $A_{e}$ and $A_{o}$ at the $n/2$ squared points $x_{i}^{2}$. Since combining the two halves (the products $x_{i} \cdot A_{o}(x_{i}^{2})$ and the additions) takes $O(n)$ time, the time complexity of our divide-and-conquer algorithm is:

$$T(n) = 2T(n/2) + O(n)$$

which solves to $O(n \log n)$.
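A sketch of one level of this split, reusing `horner_eval` from above to evaluate the two half-size polynomials:

```python
def eval_pm_pair(a, xi):
    """Evaluate A(xi) and A(-xi) together via A(x) = A_e(x^2) + x * A_o(x^2)."""
    s = xi * xi
    e = horner_eval(a[0::2], s)        # A_e(xi^2): even-indexed coefficients
    o = horner_eval(a[1::2], s)        # A_o(xi^2): odd-indexed coefficients
    return e + xi * o, e - xi * o      # (A(xi), A(-xi)) share all the work on s

# A(x) = 1 + 2x + 3x^2: A(2) = 17 and A(-2) = 9
assert eval_pm_pair([1, 2, 3], 2) == (17, 9)
```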

It seems that the only work left is choosing the evaluation points. For the recursion to keep working, the points at every level must come in $\pm$ pairs, and the points at each deeper level are the squares of those one level above; so, working backward from the deepest level, we must repeatedly take square roots, which leads us to the complex numbers.

If we rewrite those complex numbers in polar coordinates, computing the square roots becomes much easier.

#### The complex plane

A complex number $z = a + b\imath$ is plotted at position $(a, b)$ in the complex plane, and can be rewritten in polar coordinates as $z = r(\cos\theta +\imath\sin\theta)$, denoted $(r, \theta)$, where:

- length $r = \sqrt{a^{2} + b^{2}}$
- angle $\theta \in \left [ 0, 2 \pi \right )$: $\cos \theta = \frac{a}{r}$, $\sin \theta = \frac{b}{r}$
- $\theta$ can always be reduced modulo $2\pi$

Multiplying complex numbers in polar coordinates is much easier than in Cartesian coordinates: just multiply the lengths and add the angles:

$$(r_{1}, \theta_{1}) \times (r_{2} ,\theta_{2}) = (r_{1}r_{2}, \theta_{1} + \theta_{2})$$

Moreover, if $z$ is on the unit circle (i.e., $r = 1$), then $z^{n} = (1,n\theta)$.

The $n$th complex roots of unity are the solutions to the equation $z^{n} = 1$, i.e. the points $(1, \theta)$ on the unit circle where $\theta$ is a multiple of $2\pi/n$ (in our setting $n$ is a power of 2). In Cartesian coordinates, the principal root $z_{n}$ is written:

$$z_{n} = \exp(2 \pi \imath / n)$$

Therefore, the solutions of $z^{n} = 1$, with $n$ a power of 2, are exactly the carefully chosen points we need. The resulting algorithm is the fast Fourier transform (FFT).
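Putting the pieces together, here is a minimal recursive FFT sketch in Python (a production version would be iterative and in-place, but this follows the recursion above directly):

```python
import cmath

def fft(a):
    """Evaluate polynomial a at the n-th roots of unity, n = len(a) a power of 2.

    Returns [A(z^0), A(z^1), ..., A(z^(n-1))] with z = exp(2*pi*i/n).
    """
    n = len(a)
    if n == 1:
        return a[:]                               # a constant polynomial
    even = fft(a[0::2])                           # A_e at the (n/2)-th roots of unity
    odd = fft(a[1::2])                            # A_o at the (n/2)-th roots of unity
    y = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(2j * cmath.pi * k / n)      # z_n^k
        y[k] = even[k] + w * odd[k]               # A(z_n^k)
        y[k + n // 2] = even[k] - w * odd[k]      # A(-z_n^k) = A(z_n^(k + n/2))
    return y
```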

## Interpolation at the complex roots of unity

If we plug the points $x_{k} = z_{n}^{k}$ into the equation $\overrightarrow{y}=V(x_{0}, x_{1}, ..., x_{n-1}) \overrightarrow{a}$, it becomes:

$$\begin{bmatrix} 1 & 1 & 1 & 1 & \ldots & 1\\ 1 & z_{n} & z_{n}^{2} & z_{n}^{3} & \ldots & z_{n}^{n-1}\\ 1 & z_{n}^{2} & z_{n}^{4} & z_{n}^{6} & \ldots & z_{n}^{2(n-1)}\\ 1 & z_{n}^{3} & z_{n}^{6} & z_{n}^{9} & \ldots & z_{n}^{3(n-1)}\\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & z_{n}^{n-1} & z_{n}^{2(n-1)} & z_{n}^{3(n-1)} & \ldots & z_{n}^{(n-1)(n-1)} \end{bmatrix} \begin{bmatrix}a_{0}\\ a_{1}\\ a_{2}\\ a_{3}\\ \vdots \\ a_{n-1}\end{bmatrix} = \begin{bmatrix}y_{0}\\ y_{1}\\ y_{2}\\ y_{3}\\ \vdots \\ y_{n-1}\end{bmatrix}$$

i.e.:

$$V_{n}[k][j] = z_{n}^{k \cdot j}, \text{for } k, j = 0, 1, ..., n-1.$$

For the inverse operation, $\overrightarrow{a}=V(x_{0}, x_{1}, ..., x_{n-1})^{-1} \overrightarrow{y}$, the entries of the inverse matrix turn out to be:

$$V_{n}^{-1}[j][k] = \frac{z_{n}^{-k \cdot j}}{n}, \text{for } k, j = 0, 1, ..., n-1.$$

#### Proof

For any entry $(j, {j}')$ of the product $V_{n}V_{n}^{-1}$:

$$V_{n}V_{n}^{-1}[j][{j}'] = \sum_{k=0}^{n-1} z_{n}^{j \cdot k} \cdot \frac{z_{n}^{-k \cdot {j}'}}{n} = \sum_{k=0}^{n-1} \frac{z_{n}^{k \cdot (j-{j}')}}{n}$$

Clearly, if $j = {j}'$, every term equals $1/n$, so $V_{n}V_{n}^{-1}[j][{j}'] = 1$. If $j \neq {j}'$, write $m = j - {j}'$; since $-(n-1) \leq m \leq n-1$ and $m \neq 0$, we have $z_{n}^{m} \neq 1$, and:

$$\begin{aligned} \because \sum_{k=0}^{n-1} x^{k} &= \frac{x^{n} - 1}{x - 1} \quad (x \neq 1) \\ \therefore \sum_{k=0}^{n-1} (z_{n}^{m})^{k} &= \frac{(z_{n}^{m})^{n} - 1}{z_{n}^{m} - 1} = \frac{(z_{n}^{n})^{m} - 1}{z_{n}^{m} - 1} = \frac{1^{m} - 1}{z_{n}^{m} - 1} = 0 \end{aligned}$$

so every off-diagonal entry is $0$, and $V_{n}V_{n}^{-1}$ is the identity matrix.

Given the inverse matrix $V_{n}^{-1}$, we have

$$a_{j} = \frac{1}{n} \sum_{k = 0}^{n - 1} y_{k} \cdot z_{n}^{-k \cdot j}, \text{for } j = 0, 1, ..., n-1$$

By modifying the FFT algorithm to switch the roles of $a$ and $y$, replace $z_{n}$ by $z_{n}^{-1}$, and divide each element of the result by $n$, we can compute the inverse DFT in $O(n \log n)$ time as well.
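A sketch of that inverse transform, mirroring the `fft` function above (negated angles, final division by $n$):

```python
import cmath

def inverse_fft(y):
    """Recover coefficients from values at the n-th roots of unity, n a power of 2."""
    def rec(v):
        n = len(v)
        if n == 1:
            return v[:]
        even, odd = rec(v[0::2]), rec(v[1::2])
        out = [0j] * n
        for k in range(n // 2):
            w = cmath.exp(-2j * cmath.pi * k / n)   # z_n^{-k} replaces z_n^k
            out[k] = even[k] + w * odd[k]
            out[k + n // 2] = even[k] - w * odd[k]
        return out
    return [v / len(y) for v in rec(y)]             # divide the sums by n
```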

## Conclusions

Now we can finally step back and view the whole algorithm. We first transform the polynomials from their coefficient representations into point-value representations (evaluation) in $O(n \log n)$ time, then perform the task itself (multiplication) in $O(n)$ time, and finally transform back (interpolation). Switching efficiently between the two representations, back and forth, is the province of the FFT, which takes only $O(n \log n)$ time in total.
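Tying everything together, an end-to-end sketch built from the `fft` and `inverse_fft` functions above (the final rounding assumes integer coefficients):

```python
def fast_multiply(a, b):
    """Multiply two coefficient-form polynomials in O(n log n)."""
    m = len(a) + len(b) - 1                       # length of the product
    n = 1
    while n < m:
        n *= 2                                    # pad to a power of 2
    fa = fft(a + [0] * (n - len(a)))              # evaluate A
    fb = fft(b + [0] * (n - len(b)))              # evaluate B
    fc = [x * y for x, y in zip(fa, fb)]          # pointwise multiply
    c = inverse_fft(fc)                           # interpolate
    return [round(v.real) for v in c[:m]]

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
assert fast_multiply([1, 2], [3, 4]) == [3, 10, 8]
```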