In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

Lecture 10: Polynomial Interpolation

Introduction to Mathematical Thinking

November 14, 2018
Suraj Rampure

Announcements

  • Today's lecture: Two interesting topics that you'll see again in future classes, that build upon the material we've seen in this class. Using an iPython notebook since we'll be implementing things in code! Hopefully this material will overlap with what you've seen in courses like CS 61A.
  • Homework 10 won't be due, instead it will be meant for your own practice for the final. Not ready yet, but will be ready in the next few days. Some of the material from today's class will be in Homework 10, and will also be on the final!
  • No class at all next week. The final is in two weeks from today, in class. Our final meeting before then will be on Monday, November 26th.
  • By early next week, your total # of points in this class will be determined.

Polynomial Interpolation

The first key feature is that a polynomial of degree $n$ is uniquely determined by a set of $n+1$ points. Consider our situation in two dimensions:

  • Given any two points, there is exactly one line that passes through them
  • Given just one point, there are infinitely many lines passing through it

For example, there is only one line that passes through $(1, 4)$ and $(3, 10)$: it is the line $y = 3x + 1$. We can then say that $y = 3x + 1$ and $\{ (1, 4), (3, 10) \}$ are equivalent representations of the same function. The first is in the regular, standard form you're used to, and the latter is in point representation.

Here, we see the only degree 1 polynomial, $y = 3x + 1$, that passes through the points $(1, 4)$ and $(3, 10)$.

In [2]:
x = np.linspace(-0, 4)
y = 3 * x + 1
plt.plot(x, y);
plt.scatter([1, 3], [4, 10], color='r');

Here, we see 3 (of infinitely many) degree 2 polynomials that all pass through the points $(0, 4)$ and $(3, 13)$.

In [3]:
x = np.linspace(-3, 6)
y1 = 2*x**2 - 3*x + 4
y2 = x**2 + 4
y3 = -2*x**2 + 9*x + 4
plt.plot(x, y1);
plt.plot(x, y2);
plt.plot(x, y3);
plt.scatter([0, 3], [4, 13], color='red');
plt.ylim(-20, 20);

We can easily convert between standard form and the point representation - given some degree $n$ polynomial, we can plug in $n+1$ points into it and record the $n+1$ pairs $(x_i, y_i)$ and call this our point representation. The question you may be asking, though, is how can we do the opposite - how can we find the standard form of a polynomial given the point representation, without repeated guessing and checking?

For the 2 point, linear polynomial case, this is simple. We first calculate $m = \frac{\Delta y} {\Delta x} = \frac{10 - 4}{3 - 1} = 3$, and substitute either point into the relationship $y = 3x + b$ to solve for $b$.

This doesn't generalize well to polynomials of $n$ degree.

Interpolation – The Naive Approach

Suppose we're given five points, $$S = \{(0, -1), (1, 0), (2, -11), (3, 2), (4, 99)\}$$
Since we know that we're searching for a degree 4 polynomial, we could create $p(x) = ax^4 + bx^3 + cx^2 + dx + e$. Substituting each of the five points into $p(x)$ would give us a solvable system of 5 equations and 5 unknowns. These equations would be as follows:


$$a(0)^4 + b(0)^3 + c(0)^2 + d(0) + e = -1\\\ a(1)^4 + b(1)^3 + c(1)^2 + d(1) + e = 0\\\ a(2)^4 + b(2)^3 + c(2)^2 + d(2) + e = -11\\\ a(3)^4 + b(3)^3 + c(3)^2 + d(3) + e = 2\\\ a(4)^4 + b(4)^3 + c(4)^2 + d(4) + e = 99$$


However, solving this system of equations and unknowns would take quite some time.

no-time

Luckily, there exists a more intuitive way to construct $p(x)$. Since there is only one such degree 4 polynomial that passes through these five points, both methods should (and do) result in the same $p(x)$.

Lagrange Interpolation

Instead of trying to create $p(x)$ at once, let's try and create five smaller polynomials, that we can then sum to create $p(x)$. For each provided point $(x_i, y_i)$, $(x_1, y_1)$ being the first point we were given, we want to craft a sub-polynomial $p_{i}$ with the following properties:

  • $p_i(x_i) = 1$

  • $p_i(x_j) = 0, \forall \: j \neq i$

In other words, sub-polynomial $i$ should evaluate to 1 if $x_i$ is passed in, and to 0 if any of the other four $x_j$s are passed in (we will see why this structure is important very soon). We can create such a sub-polynomial, for each $I$, as follows:


$$ p_{i}(x) = \frac{\Pi_{j \neq i} (x - x_j)}{\Pi_{j \neq i} (x_i - x_j)} $$


For clarity, let's calculate $p_1(x)$ and $p_3(x)$. Recall, we had $S = \{(0, -1), (1, 0), (2, -11), (3, 2), (4, 99)\}$, meaning that $(x_1, y_1) = (0, -1)$ and $(x_3, y_3) = (2, -11)$.


$$ p_1(x) = \frac{\Pi_{j \neq 1}(x - x_j)}{\Pi_{j \neq 1}(0 - x_j)} = \frac{(x-1)(x-2)(x-3)(x-4)}{(0-1)(0-2)(0-3)(0-4)} = \frac{1}{24}(x-1)(x-2)(x-3)(x-4)$$
$$p_3(x) = \frac{\Pi_{j \neq 3}(x - x_j)}{\Pi_{j \neq 3}(0 - x_j)} = \frac{(x-0)(x-1)(x-3)(x-4)}{(2-0)(2-1)(2-3)(2-4)} = \frac{1}{4}(x)(x-1)(x-3)(x-4) $$


The second-to-last step of the above expansions best illustrate why we've chosen to craft our sub-polynomials in this way; if we were to evaluate $p_1(0)$, the numerator and denominator would be exactly the same. If we were to instead evaluate $p_1(1), p_1(2), p_1(3)$ or $p_1(4)$, since $x-1, x-2, x-3, x-4$ are all factors of the numerator the result would be 0.

We're almost done. We can now say that our final polynomial $p(x)$ is constructed as follows:


$$ p(x) = \sum_{i = 1}^n y_i p_i(x) $$


This is where the $y$ values of each of the given points come into play. Looking at our example more closely, we have:


$$ p(x) = -p_1(x) + 0p_2(x) -11p_3(x) + 2p_4(x) + 99p_5(x) $$


From the way each $p_i(x)$ was constructed, $p(0) = (-1) \cdot 1 + 0 \cdot 0 + (-11) \cdot 0 + 2 \cdot 0 + 99 \cdot 0 = -1$, and so on and so forth, as we expected. Doing the arithmetic yields the original polynomial, $p(x) = x^4 - 13x^2 + 13x - 1$.

In [4]:
x = np.linspace(-1, 6)
y = x**4 - 13*x**2 + 13*x - 1
plt.scatter([0, 1, 2, 3, 4], [-1, 0, -11, 2, 99], color='r')
plt.plot(x, y);
plt.ylim(-40, 200);

Implementing Lagrange Interpolation in Code

Let's write a (higher-order) function in Python that will allow us to take a set of points $S$ and return the polynomial that interpolates it.

In [5]:
def sub_polynomial(S, i):
    def f(x):
        num, den = 1, 1
        for p in range(len(S)):
            if p != i:
                num *= (x - S[p][0])
                den *= (S[i][0] - S[p][0])
        return num / den
    return f

def interpolate(S):
    def f(x):
        return sum([S[i][1] * sub_polynomial(S, i)(x) for i in range(len(S))])
    return f

Our function sub_polynomial creates the $p_i(x)$ functions that we saw earlier. interpolate puts all of this together. Let's test out interpolate with the same set S we saw before.

In [6]:
S = [(0, -1), (1, 0), (2, -11), (3, 2), (4, 99)]
x = np.linspace(-2, 6, 1000)
y = [interpolate(S)(i) for i in x]
plt.scatter([s[0] for s in S], [s[1] for s in S], color='r')
plt.plot(x, y);
plt.ylim(-40, 200);

Now, let's write a function that interpolates and plots any arbitrary set.

In [7]:
def interpolate_and_plot(S):
    xs = [s[0] for s in S]
    ys = [s[1] for s in S]
    
    f = interpolate(S)
    
    x = np.linspace(min(xs) - 1, max(xs) + 1, 1000)
    y = [f(i) for i in x]
    
    plt.scatter(xs, ys, color='r');
    plt.plot(x, y);
In [8]:
#interpolate_and_plot([(100, 2), (102, 10), (103, -3)])
interpolate_and_plot([(-5, 2), (1, 5), (2, 17)])

Neat! Notice what happens when we pass in $S = \{ (1, 3), (2, 4), (3, 5) \}$.

In [9]:
interpolate_and_plot([(1, 3), (2, 4), (3, 5)])

Here, we passed in 3 points, so we'd expect the degree of our polynomial to be 2. However, it turns out that these three points were colinear, meaning they all lie on the same line. The $x^2$ term cancelled out in Lagrange Interpolation.


Modular Arithmetic... with Polynomials?


no-time

Now, let's combine our knowledge of polynomial interpolation with modular arithmetic.

Recall, when we dealt with arithmetic $\text{mod } m$, we said that all integers reduced to an integer in the set $\{0, 1, 2, ... , p - 1\}$. The modular inverse of $a$ in $\text{mod } m$ only exists when $\text{gcd}(a, m) = 1$. If we wanted some modular base $m$ such that an inverse exists for every $a$, we needed that $m$ was prime.

Now, suppose we want to find the polynomial that interpolates a set of points $S$, but we consider all points and polynomials to be reduced $\text{mod } m$ (also meaning that we're only looking at integers). The main reason we'd ever do this is if we want to restrict our inputs and outputs to be in the set $\{0, 1, 2, ..., m-1 \}$.

For example, suppose we want to interpolate the polynomial passing through $(1, 3), (2, 5)$ and $(4, 2)$, under $\text{mod } 7$.

Let's start by creating our subpolynomial $p_1(x)$.

$$p_1(x) = \frac{(x-2)(x-4)}{(1-2)(1-4)} = \frac{x^2 - 6x + 8}{3} = \frac{1}{3} (x^2 - 6x + 8)$$

Notice any problems? We can't divide by 3. Instead, we replace $\frac{1}{3}$ with the inverse of 3 in $\text{mod } 7$. You can verify that $3^{-1} \equiv 5 \: (\text{mod } 7)$. To simplify things, we can also rewrite $-6x$ as $+x$, and $8$ as $1$. Then, we have

$$p_1(x) = 5(x^2 + x + 1)$$

Also,

$$p_2(x) = \frac{(x-1)(x-4)}{(2-1)(2-4)} = \frac{x^2 - 5x + 4}{-2} = 3(x^2 - 5x + 4)$$

We rewrite $(-2)^{-1}$ as $5^{-1}$, which is equivalent to $3$ in $\text{mod } 7$.

$$p_3(x) = \frac{(x-1)(x-2)}{(4-1)(4-2)} = \frac{x^2 - 3x + 2}{6} = 6(x^2 - 3x + 2)$$

Putting all of this together - we have

$$p(x) = 3p_1(x) + 5p_2(x) + 2p_3(x)$$ $$= 15(x^2 + x + 1) + 15(x^2 - 5x + 4) + 12(x^2 - 3x + 2)$$

We see that $15 \equiv 1 \: (\text{mod } 7)$, which greatly simplifies our calculations. $12 \equiv 5 \: (\text{mod } 7)$, but we can also say $12 \equiv -2 \: (\text{mod } 7)$. We choose to rewrite $12$ as $-2$ as it will allow us to cancel all $x^2$ terms.

$$p(x) = x^2 + x + 1 + x^2 - 5x + 4 + -2(x^2 - 3x + 2)$$ $$= x + 1 - 5x + 4 + 6x - 4$$ $$= \boxed{2x + 1}$$

We see that $p(x) = 2x + 1$. This $p(x)$ is not the same as the $p(x)$ we'd find by doing standard Lagrange Interpolation – using standard, non-modular interpolation, we'd find a polynomial of degree 2, as we have three points that are not colinear. However, working under $\text{mod } 7$ simplifies our functions; this $p(x)$ is equivalent to the standard Langrange interpolation $p(x)$ for integer inputs $\{0, 1, 2, 3, 4, 5, 6\}$ under $\text{mod } 7$.

For non-integer inputs, our simplified $p(x)$ is not equivalent to the standard non-modular interpolated $p(x)$ we would find without using $\text{mod } 7$.

In [10]:
# This assumes that there is a unique inverse. This will work for our purposes.
def inverse(a, m):
    for i in range(m):
        if (a * i) % m == 1:
            return i
    return "error"
In [11]:
def sub_polynomial_mod(S, i, m):
    def f(x):
        num, den = 1, 1
        for p in range(len(S)):
            if p != i:
                num *= (x - S[p][0]) % m
                den *= (S[i][0] - S[p][0]) % m
        return num * inverse(den, m) % m
    return f

def interpolate_mod(S, m):
    def f(x):
        return round(sum([S[i][1] * sub_polynomial_mod(S, i, m)(x) for i in range(len(S))]) % m)
    return f
In [12]:
def interpolate_and_plot_mod(S, m):
    xs = [s[0] for s in S]
    ys = [s[1] for s in S]
    
    f = interpolate_mod(S, m)
    
    x = np.arange(min(xs) - 2, max(xs) + 2) # Need to use integer only inputs
    y = [f(i) for i in x]
    
    plt.scatter(xs, ys, color='r');
    plt.plot(x, y);
In [13]:
interpolate_and_plot_mod([(1, 3), (2, 5), (4, 2)], 7)
interpolate_and_plot([(1, 3), (2, 5), (4, 2)])

# For fun, we'll also plot y = 2x + 1

xs = np.linspace(-1, 5)
ys = 2*xs + 1
plt.plot(xs, ys)
Out[13]:
[<matplotlib.lines.Line2D at 0x119312b70>]
In [14]:
f = interpolate([(1, 3), (2, 5), (4, 2)])
g = interpolate_mod([(1, 3), (2, 5), (4, 2)], 7)

for i in range(7):
    print(f(i), g(i))
-1.3333333333333335 1
3.0 3
5.0 5
4.666666666666667 0
2.0 2
-3.0 4
-10.333333333333332 6

An Application of Polynomial Interpolation and Modular Arithmetic – Error Correcting Codes

We now have the tools to tackle the following problem.

Suppose I want to text you a message, but I know (for whatever reason) that one of the characters in my message will be deleted. For example, suppose I want to text you BAD; the message you end up receiving could be _AD or B_D or BA_.

How do we proceed?

It turns out we can interpolate the polynomial that passes through our original message, and send one extra character! Since we know one character is going to be dropped, if we send one extra character, we should be able to recover our original message.

Let's pretend "A" corresponds to 0, "B" corresponds to 1, "C" corresponds to 2, and so on and so forth. Note, this means that we have no interpretation of any number greater than 25, or any non-integer. This means we need to reduce our polynomials, modulo some integer.

Furthermore, the modular base we choose needs to be prime, because we need every integer to have an inverse in it. (Think about the denominators of our $p_i(x)$ polynomials.) For that reason, we'll choose $m = 29$, because it is the smallest prime number that is larger than 26. If we instead restricted our alphabet to just be the first five letters, for example, we could choose $m = 7$.

First we'll define a function to convert a message into points.

In [15]:
alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
MOD = 29

def message_to_points(message):
    points = []
    for i in range(len(message)):
        if message[i] != " ":
            points.append((i, alpha.index(message[i].upper())))
    return points

message_to_points('BAD')
Out[15]:
[(0, 1), (1, 0), (2, 3)]

Cool. Now, given our point representation, we need to find one more point so that we can send an extra character to combat the loss.

In [16]:
def find_extra_character(message):
    f = interpolate_mod(message_to_points(message), MOD) # Finds the interpolating polynomial
    alphabet_index = round(f(len(message))) # We know we will be adding a character at the end
    return message + alpha[alphabet_index]

find_extra_character('BAD')
Out[16]:
'BADK'

We'll also define a function that takes a message and randomly deletes a single character.

In [17]:
def delete_character(message):
    i = np.random.randint(len(message))
    message = message[:i] + " " + message[i+1:]
    return message

delete_character(find_extra_character('BAD'))
Out[17]:
'B DK'

Lastly, we need the ability to recover a missing character. Let's write that. (Notice that when we defined message_to_points, we considered the possibility that a certain character was empty.)

Additionally, once we do that, we know our original message is everything except the last character. This means we can chop off the last character before returning it in the following function.

In [18]:
def recover_message(message):
    f = interpolate_mod(message_to_points(message), MOD)
    for i in range(len(message)):
        if message[i] == " ":
            missing_index = i
            break
    missing_char = alpha[round(f(missing_index))]
    return (message[:missing_index] + missing_char + message[missing_index+1:])[:-1]

Just to see if it works, let's try recovering a character on all four possible erasures of BAD.

In [19]:
messages = ["BAD ", "BA K", "B DK", " ADK"]
for message in messages:
    print(recover_message(message))
BAD
BAD
BAD
BAD

Now, we can do the whole process.

  • Start off with a message we want to send.
  • Convert our message to points, and interpolate the polynomial that passes through it.
  • Find one extra character and append it to our message.
  • Send our message, knowing one character will be erased.
  • The receiver can receive the message, interpolate the polynomial again and find the character at the missing index.

We can encapsulate this behavior into two functions, send and receive.

In [20]:
def send(message):
    extended_message = find_extra_character(message)
    dropped_message = delete_character(extended_message)
    return dropped_message

def receive(message):
    return recover_message(message)
In [21]:
receive(send('BAD'))
Out[21]:
'BAD'
In [22]:
sent = send('GORILLA')
sent
Out[22]:
'GOR LLAX'
In [23]:
received = receive(sent)
received
Out[23]:
'GORILLA'