
Numerical Transformations, I

It’s been over a decade since I’ve taught a class where I’ve felt the freedom to really explore transformations with a strong matrix thread.  Whether due to curricular pressures, lack of time, or some other reason, I realized I had drifted away from some nice connections when I recently read Jonathan Dick and Maria Childrey’s Enhancing Understanding of Transformation Matrices in the April 2012 Mathematics Teacher (abstract and complete article here).

Their approach was okay, but I was struck by the absence of a beautiful idea I believe I learned at a UCSMP conference in the early 1990s.  Further, today’s Common Core State Standards for Mathematics explicitly call for students to “Work with 2×2 matrices as transformations of the plane, and interpret the absolute value of the determinant in terms of area” (see Standard N-VM.12 on page 61 of the CCSSM here).  I’m going to take a couple of posts to unpack this standard and describe the pretty connection I’ve unfortunately let slip out of my teaching.

What they almost said

At the end of the MT article, the authors performed a double transformation equivalent to reflecting the points (2,0), (3,-4), and (9,-7) over the line y=x via matrices using \left[ \begin{array}{cc} 0&1 \\ 1&0 \end{array} \right] \cdot  \left[ \begin{array}{ccc} 2 & 3 & 9 \\ 0 & -4 & -7 \end{array} \right] = \left[ \begin{array}{ccc} 0 & -4 & -7 \\ 2 & 3 & 9 \end{array} \right] giving image points (0,2), (-4,3), and (-7,9).  That this matrix multiplication swapped each point’s coordinates is compelling evidence that \left[ \begin{array}{cc} 0 & 1 \\ 1 & 0\end{array} \right] might be a y=x reflection matrix.

Going much deeper

Here’s how this works.  Assume a set of pre-image points, P, undergoes some transformation T to become image points, P’.  For this procedure, T can be almost any transformation except a translation–reflections, dilations, scale changes, rotations, etc.  Translations can be handled using augmentations of these transformation matrices, but that is another story.  If P is a set of n two-dimensional points, it can be written as a 2×n pre-image matrix, [P], with all of the x-coordinates in the top row and the corresponding y-coordinates in the second row.  Likewise, [P’] is a 2×n matrix of the image points, while [T] is a 2×2 matrix unique to the transformation.  In matrix form, this relationship is written [T] \cdot [P] = [P'].
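If you have technology handy, this matrix arithmetic is easy to play with.  Here is a quick numpy sketch (my own illustration; no code appeared in the original article) of the [T] \cdot [P] = [P'] relationship using the reflection example above:

    import numpy as np

    # Pre-image matrix [P]: x-coordinates in the top row, y-coordinates beneath.
    P = np.array([[2,  3,  9],
                  [0, -4, -7]])

    # [T] for a reflection over the line y = x.
    T = np.array([[0, 1],
                  [1, 0]])

    # [T] . [P] = [P']
    print(T @ P)
    # [[ 0 -4 -7]
    #  [ 2  3  9]]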

So what would \left[ \begin{array}{cc} 0 & -1 \\ 1 & 0\end{array} \right] do as a transformation matrix?  To see, transform (2,0), (3,-4), and (9,-7) using this new [T].

\left[ \begin{array}{cc} 0&-1 \\ 1&0 \end{array} \right] \cdot  \left[ \begin{array}{ccc} 2 & 3 & 9 \\ 0 & -4 & -7 \end{array} \right] = \left[ \begin{array}{ccc} 0 & 4 & 7 \\ 2 & 3 & 9 \end{array} \right]

The result might be more easily seen graphically with the points connected to form pre-image and image triangles.

After studying the graphic, hopefully you can see that \left[ \begin{array}{cc} 0 & -1 \\ 1 & 0\end{array} \right] rotated the pre-image points 90 degrees around the origin.

Generalizing

Now you know the effects of two different transformation matrices, but what if you wanted to perform a specific transformation and didn’t know the matrix to use?  If you’re new to transformations via matrices, you may be hoping for something much easier than the experimental approach used thus far.  If you can generalize for a moment, the result will be a stunningly simple way to determine the matrix for any transformation quickly and easily.

Assume you need to find a transformation matrix, [T]= \left[ \begin{array}{cc} a & c \\ b & d \end{array}\right] .  Pick (1,0) and (0,1) as your pre-image points.

\left[ \begin{array}{cc} a&c \\ b&d \end{array} \right] \cdot  \left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right] = \left[ \begin{array}{cc} a & c \\ b & d \end{array} \right]

On the surface, this says the image of (1,0) is (a,b) and the image of (0,1) is (c,d), but there is so much more here!

Because the pre-image matrix for (1,0) and (0,1) is the 2×2 identity matrix, [T]= \left[ \begin{array}{cc} a & c \\ b & d \end{array}\right] will always be BOTH the transformation matrix AND (much more importantly) the image matrix.  This is a major find.  It means that if you know the images of (1,0) and (0,1) under some transformation T, then you automatically know the components of [T]!

For example, when reflecting over the x-axis, (1,0) is unchanged and (0,1) becomes (0,-1), making [T]= \left[ r_{x-axis} \right] = \left[ \begin{array}{cc} 1 & 0 \\ 0 & -1\end{array} \right] .  Remember, coordinates of points are always listed vertically.

Similarly, a scale change that doubles x-coordinates and triples the ys transforms (1,0) to (2,0) and (0,1) to (0,3), making [T]= \left[ S_{2,3} \right] = \left[ \begin{array}{cc} 2 & 0 \\ 0 & 3\end{array} \right] .

In a generic rotation of \theta around the origin, (1,0) becomes (cos(\theta ),sin(\theta )) and (0,1) becomes (-sin(\theta ),cos(\theta )).

Therefore, [T]= \left[ R_\theta \right] = \left[ \begin{array}{cc} cos(\theta ) & -sin(\theta ) \\ sin(\theta ) & cos(\theta ) \end{array} \right] .  Substituting \theta = 90^\circ into this [T] confirms the \left[ R_{90^\circ} \right] = \left[ \begin{array}{cc} 0 & -1 \\ 1 & 0\end{array} \right] matrix from earlier.
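That confirmation is also easy to do numerically.  A short Python sketch (again my own addition, not part of the original post) substitutes \theta = 90^\circ into the general rotation matrix and reproduces both the earlier matrix and the earlier image points:

    import numpy as np

    theta = np.radians(90)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    print(np.round(R, 10))      # [[ 0. -1.]
                                #  [ 1.  0.]]

    P = np.array([[2, 3, 9],
                  [0, -4, -7]])
    print(np.round(R @ P, 10))  # [[ 0.  4.  7.]
                                #  [ 2.  3.  9.]]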

As nice as this is, there is even more beautiful meaning hidden within transformation matrices.  I’ll tackle some of that in my next post.

Trig Identities with a Purpose

Yesterday, I was thinking about some changes I could introduce to a unit on polar functions.  Realizing that almost all of the polar functions traditionally explored in precalculus courses have graphs that are complete over the interval 0\le\theta\le 2\pi, I wondered if there were any interesting curves that took more than 2\pi units to graph.

My first attempt was r=cos\left(\frac{\theta}{2}\right) which produced something like a merged double limaçon with loops over its 4\pi period.

Trying for more of the same, I graphed r=cos\left(\frac{\theta}{3}\right) guessing (without really thinking about it) that I’d get more loops.  I didn’t get what I expected at all.

Wow!  That looks exactly like the image of a standard limaçon with a loop under a translation of 0.5 units to the left.

Further exploration confirms that r=cos\left(\frac{\theta}{3}\right) completes its graph in 3\pi units while r=\frac{1}{2}+cos\left(\theta\right) requires 2\pi units.

As you know, in mathematics, it is never enough to claim things look the same; proof is required.  The acute challenge in this case is that two polar curves (based on angle rotations) appear to be separated by a horizontal translation (a rectangular displacement).  I’m not aware of any clean, general way to apply a rectangular transformation to a polar graph or a rotational transformation to a Cartesian graph.  But what I can do is rewrite the polar equations into a parametric form and translate from there.

For 0\le\theta\le 3\pi , r=cos\left(\frac{\theta}{3}\right) becomes \begin{array}{lcl} x_1 &= &cos\left(\frac{\theta}{3}\right)\cdot cos\left (\theta\right) \\ y_1 &= &cos\left(\frac{\theta}{3}\right)\cdot sin\left (\theta\right) \end{array} .  Sliding this \frac{1}{2} unit to the right makes the parametric equations \begin{array}{lcl} x_2 &= &\frac{1}{2}+cos\left(\frac{\theta}{3}\right)\cdot cos\left (\theta\right) \\ y_2 &= &cos\left(\frac{\theta}{3}\right)\cdot sin\left (\theta\right) \end{array} .

This should align with the standard limaçon, r=\frac{1}{2}+cos\left(\theta\right) , whose parametric equations for 0\le\theta\le 2\pi  are \begin{array}{lcl} x_3 &= &\left(\frac{1}{2}+cos\left(\theta\right)\right)\cdot cos\left (\theta\right) \\ y_3 &= &\left(\frac{1}{2}+cos\left(\theta\right)\right)\cdot sin\left (\theta\right) \end{array} .
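Before any algebra, a quick picture is convincing.  This matplotlib sketch (my addition) overlays the slid version of the first curve on the standard limaçon, and the two traces land exactly on top of each other:

    import numpy as np
    import matplotlib.pyplot as plt

    t2 = np.linspace(0, 3*np.pi, 1000)   # domain for r = cos(theta/3), slid right
    t3 = np.linspace(0, 2*np.pi, 1000)   # domain for r = 1/2 + cos(theta)

    x2 = 0.5 + np.cos(t2/3)*np.cos(t2)
    y2 = np.cos(t2/3)*np.sin(t2)
    x3 = (0.5 + np.cos(t3))*np.cos(t3)
    y3 = (0.5 + np.cos(t3))*np.sin(t3)

    plt.plot(x2, y2, linewidth=5, alpha=0.3)   # thick, pale
    plt.plot(x3, y3, 'k')                      # thin, dark, drawn on top
    plt.gca().set_aspect('equal')
    plt.show()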

The only problem that remains for comparing (x_2,y_2) and (x_3,y_3) is that their domains are different, but a parameter shift can handle that.

If 0\le\beta\le 3\pi , then (x_2,y_2) becomes \begin{array}{lcl} x_4 &= &\frac{1}{2}+cos\left(\frac{\beta}{3}\right)\cdot cos\left (\beta\right) \\ y_4 &= &cos\left(\frac{\beta}{3}\right)\cdot sin\left (\beta\right) \end{array} and (x_3,y_3) becomes \begin{array}{lcl} x_5 &= &\left(\frac{1}{2}+cos\left(\frac{2\beta}{3}\right)\right)\cdot cos\left (\frac{2\beta}{3}\right) \\ y_5 &= &\left(\frac{1}{2}+cos\left(\frac{2\beta}{3}\right)\right)\cdot sin\left (\frac{2\beta}{3}\right) \end{array} .

Now that the translation has been applied and both functions operate over the same domain, the two functions must be identical iff x_4 = x_5 and y_4 = y_5 .  It’s time to prove those trig identities!

Before blindly manipulating the equations, I take some time to develop some strategy.  I notice that the (x_5, y_5) equations contain only one type of angle–double angles of the form 2\cdot\frac{\beta}{3} –while the (x_4, y_4) equations contain angles of two different types, \beta and \frac{\beta}{3} .  It is generally easier to work with a single type of angle, so my strategy is going to be to turn everything into trig functions of double angles of the form 2\cdot\frac{\beta}{3} .

\displaystyle \begin{array}{lcl} x_4 &= &\frac{1}{2}+cos\left(\frac{\beta}{3}\right)\cdot cos\left (\beta\right) \\  &= &\frac{1}{2}+cos\left(\frac{\beta}{3}\right)\cdot cos\left (\frac{\beta}{3}+\frac{2\beta}{3} \right) \\  &= &\frac{1}{2}+cos\left(\frac{\beta}{3}\right)\cdot\left( cos\left(\frac{\beta}{3}\right) cos\left(\frac{2\beta}{3}\right)-sin\left(\frac{\beta}{3}\right) sin\left(\frac{2\beta}{3}\right)\right) \\  &= &\frac{1}{2}+\left[cos^2\left(\frac{\beta}{3}\right)\right] cos\left(\frac{2\beta}{3}\right)-\frac{1}{2}\cdot 2cos\left(\frac{\beta}{3}\right) sin\left(\frac{\beta}{3}\right) sin\left(\frac{2\beta}{3}\right) \\  &= &\frac{1}{2}+\left[\frac{1+cos\left(2\frac{\beta}{3}\right)}{2}\right] cos\left(\frac{2\beta}{3}\right)-\frac{1}{2}\cdot sin^2\left(\frac{2\beta}{3}\right) \\  &= &\frac{1}{2}+\frac{1}{2}cos\left(\frac{2\beta}{3}\right)+\frac{1}{2} cos^2\left(\frac{2\beta}{3}\right)-\frac{1}{2} \left( 1-cos^2\left(\frac{2\beta}{3}\right)\right) \\  &= & \frac{1}{2}cos\left(\frac{2\beta}{3}\right) + cos^2\left(\frac{2\beta}{3}\right) \\  &= & \left(\frac{1}{2}+cos\left(\frac{2\beta}{3}\right)\right)\cdot cos\left(\frac{2\beta}{3}\right) = x_5  \end{array}

This proves that the x expressions are equivalent.  Now for the ys.

\displaystyle \begin{array}{lcl} y_4 &= & cos\left(\frac{\beta}{3}\right)\cdot sin\left(\beta\right) \\  &= & cos\left(\frac{\beta}{3}\right)\cdot sin\left(\frac{\beta}{3}+\frac{2\beta}{3} \right) \\  &= & cos\left(\frac{\beta}{3}\right)\cdot\left( sin\left(\frac{\beta}{3}\right) cos\left(\frac{2\beta}{3}\right)+cos\left(\frac{\beta}{3}\right) sin\left(\frac{2\beta}{3}\right)\right) \\  &= & \frac{1}{2}\cdot 2cos\left(\frac{\beta}{3}\right) sin\left(\frac{\beta}{3}\right) cos\left(\frac{2\beta}{3}\right)+\left[cos^2 \left(\frac{\beta}{3}\right)\right] sin\left(\frac{2\beta}{3}\right) \\  &= & \frac{1}{2}sin\left(2\frac{\beta}{3}\right) cos\left(\frac{2\beta}{3}\right)+\left[\frac{1+cos \left(2\frac{\beta}{3}\right)}{2}\right] sin\left(\frac{2\beta}{3}\right) \\  &= & \left(\frac{1}{2}+cos\left(\frac{2\beta}{3}\right)\right)\cdot sin\left (\frac{2\beta}{3}\right) = y_5  \end{array}

Therefore the graph of r=cos\left(\frac{\theta}{3}\right) is exactly the graph of r=\frac{1}{2}+cos\left(\theta\right) slid \frac{1}{2} unit left.  Nice.
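For anyone who wants a CAS double-check of both identities, here is a sympy sketch of my own, using the same \gamma =\frac{\beta}{3} substitution I mention below:

    import sympy as sp

    g = sp.symbols('gamma', real=True)    # gamma = beta/3

    x4 = sp.Rational(1, 2) + sp.cos(g)*sp.cos(3*g)
    y4 = sp.cos(g)*sp.sin(3*g)
    x5 = (sp.Rational(1, 2) + sp.cos(2*g))*sp.cos(2*g)
    y5 = (sp.Rational(1, 2) + sp.cos(2*g))*sp.sin(2*g)

    print(sp.simplify(sp.expand_trig(x4 - x5)))   # 0
    print(sp.simplify(sp.expand_trig(y4 - y5)))   # 0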

If there are any students reading this, know that it took a few iterations to come up with the versions of the identities proved above.  Remember that published mathematics is almost always cleaner and more concise than the effort it took to create it.  One of the early steps I took used the substitution \gamma =\frac{\beta}{3} to clean up the appearance of the algebra.  In the final proof, I decided that the 2 extra lines of proof to substitute in and then back out were not needed.  I also meandered down a couple unnecessarily long paths that I was able to trim in the proof I presented above.

Despite these changes, my proof still feels cumbersome and inelegant to me.  From one perspective–Who cares?  I proved what I set out to prove.  On the other hand, I’d love to know if someone has a more elegant way to establish this connection.  There is always room to learn more.  Commentary welcome.

In the end, it’s nice to know these two polar curves are, up to a horizontal translation, identical.  It pays to keep one’s eyes eternally open for unexpected connections!

Polar Graphing Surprise

Nurfatimah Merchant and I were playing around with polar graphs, trying to find something that would stretch students beyond simple circles and types of limacons while still being within the conceptual reach of those who had just been introduced to polar coordinates roughly two weeks earlier.

We remembered that Cartesian graphs of trigonometric functions are much more “interesting” with different center lines.  That is, the graph of y=cos(x)+3 is nothing more than a standard cosine graph oscillating around y=3.

Likewise, the graph of y=cos(x)+0.5x is a standard cosine graph oscillating around y=0.5x.

We teach polar graphing the same way.  To graph r=3+cos(2\theta ), we encourage our students to “read” the function as a cosine curve of period \pi oscillating around the polar function r=3.  Because of its period, this curve will complete a cycle in 0\le\theta\le\pi.  The graph begins this interval at \theta =0 (the positive x-axis) with a cosine graph 1 unit “above” r=3, moving to 1 unit “below” the “center line” at \theta =\frac{\pi}{2}, and returning to 1 unit above the center line at \theta =\pi.  This process repeats for \pi\le\theta\le 2\pi.
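If you’d like to see this “center line” reading for yourself, here is a short matplotlib sketch (mine, not part of our class materials) graphing r=3+cos(2\theta ) against its center circle r=3:

    import numpy as np
    import matplotlib.pyplot as plt

    theta = np.linspace(0, 2*np.pi, 500)
    ax = plt.subplot(projection='polar')
    ax.plot(theta, 3 + np.cos(2*theta))            # the curve
    ax.plot(theta, np.full_like(theta, 3), '--')   # center line r = 3
    plt.show()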

Our students graph polar curves far more confidently since we began using this approach (and a couple extensions on it) than those we taught earlier in our careers.  It has become a matter of understanding what functions do and how they interact with each other and almost nothing to do with memorizing particular curve types.

So, now that our students are confidently able to graph polar curves like r=3+cos(2\theta ), we wondered how we could challenge them a bit more.  Remembering variable center lines like the Cartesian y=cos(x)+0.5x, we wondered what a polar curve with a variable center line would look like.  Not knowing where to start, I proposed r=2+cos(\theta )+sin(\theta), thinking I could graph a period 2\pi sine curve around the limacon r=2+cos(\theta ).

There’s a lot going on here, but in its most simplified version, we thought we would get a curve on the center line at \theta =0, 1 unit above at \theta =\frac{\pi}{2}, on at \theta =\pi, 1 unit below at \theta =\frac{3\pi}{2}, and returning to its starting point at \theta =2\pi.  We had a very rough “by hand” sketch, and were quite surprised by the image we got when we turned to our grapher for confirmation.  The oscillation behavior we predicted was certainly there, but there was more!  What do you see in the graph of r=2+cos(\theta )+sin(\theta) below?

This looked to us like some version of a cardioid.  Given the symmetry of the axis intercepts, we suspected it was rotated \frac{\pi}{4} from the x-axis.  An initially x-axis symmetric polar curve rotated \frac{\pi}{4} would contain the term cos(\theta-\frac{\pi}{4}) which expands using a trig identity.

\begin{array}{ccc} cos(\theta-\frac{\pi}{4})&=&cos(\theta )cos(\frac{\pi}{4})+sin(\theta )sin(\frac{\pi}{4}) \\ &=&\frac{1}{\sqrt{2}}(cos(\theta )+sin(\theta )) \end{array}

Eureka!  This identity let us rewrite the original polar equation.

\begin{array}{ccc} r=2+cos(\theta )+sin(\theta )&=&2+\sqrt{2}\cdot\frac{1}{\sqrt{2}} (cos(\theta )+sin(\theta )) \\ &=&2+\sqrt{2}\cdot cos(\theta -\frac{\pi}{4}) \end{array}

And this last form says our original polar function is equivalent to r=2+\sqrt{2}\cdot cos(\theta -\frac{\pi}{4}), or a \frac{\pi}{4} rotated cosine curve of amplitude \sqrt{2} and period 2\pi oscillating around center line r=2.
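A quick numeric sanity check of the rewrite (a numpy sketch, my addition):

    import numpy as np

    theta = np.linspace(0, 2*np.pi, 100)
    r_original = 2 + np.cos(theta) + np.sin(theta)
    r_rewritten = 2 + np.sqrt(2)*np.cos(theta - np.pi/4)
    print(np.allclose(r_original, r_rewritten))   # True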

This last image shows a cosine curve starting at \theta=\frac{\pi}{4} beginning \sqrt{2} above the center circle r=2, crossing the center circle \frac{\pi}{2} later at \theta=\frac{3\pi}{4}, dropping to \sqrt{2} below the center circle at \theta=\frac{5\pi}{4}, back to the center circle at \theta=\frac{7\pi}{4} before finally returning to the starting point at \theta=\frac{9\pi}{4}.  Because the radius is always positive, this also convinced us that this curve is actually a rotated limacon without a loop and not the cardioid that drove our initial investigation.

So, we thought we were departing into some new territory and found ourselves looking back at earlier work from a different angle.  What a nice surprise!

One more added observation:  We got a little lucky in guessing the angle of rotation, but even if it wasn’t known, it is always possible to compute an angle of rotation (or translation in Cartesian) for a sum of two sinusoids with identical periods.  This particular topic is covered in some texts, including Precalculus Transformed.

Generalized Pythagoras through Vectors

Here’s a proof of the Pythagorean Theorem by way of vectors.  Of course, if your students already know vectors, they’re already way past the Pythagorean Theorem, but I thought Richard Pennington’s statement of this on LinkedIn gave a pretty and stunningly brief (after all the definitions) proof of one of mathematics’ greatest equations.

Let O be the origin, and let \overrightarrow A and \overrightarrow B be two position vectors starting at O. The vector from \overrightarrow A to \overrightarrow B is simply \overrightarrow {B-A}, which I will call \overrightarrow C. Using properties of dot products,

\overrightarrow C\cdot\overrightarrow C = \overrightarrow {B-A}\cdot\overrightarrow {B-A} = \overrightarrow B\cdot\overrightarrow B-2\overrightarrow A\cdot\overrightarrow B+\overrightarrow A\cdot\overrightarrow A

The dot product of a vector with itself is the square of its magnitude, so

|\overrightarrow C|^2=|\overrightarrow B|^2-2|\overrightarrow A||\overrightarrow B|cos\theta+|\overrightarrow A|^2

where \theta is the angle between \overrightarrow A and \overrightarrow B.

This is the Law of Cosines–in my classes, I call it the generalized Pythagorean Theorem for all triangles.  If \theta=\frac{\pi}{2}, then \overrightarrow A and \overrightarrow B are the legs of a right triangle with hypotenuse \overrightarrow C which makes cos(\theta)=cos(\frac{\pi}{2})=0 and

|\overrightarrow C|^2=|\overrightarrow A|^2+|\overrightarrow B|^2

Pretty.
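If you’d like a numeric confirmation with concrete vectors, here is a numpy sketch of my own (any non-parallel \overrightarrow A and \overrightarrow B will do):

    import numpy as np

    A = np.array([3.0, 1.0])
    B = np.array([-1.0, 4.0])
    C = B - A

    cos_theta = (A @ B) / (np.linalg.norm(A) * np.linalg.norm(B))
    lhs = np.linalg.norm(C)**2
    rhs = (np.linalg.norm(B)**2
           - 2*np.linalg.norm(A)*np.linalg.norm(B)*cos_theta
           + np.linalg.norm(A)**2)
    print(np.isclose(lhs, rhs))   # True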

 

An unexpected identity

Typical high school mathematics considers only two types of transformations:  translations & scale changes.  From a broader perspective, a transformation is an operation that changes its input into an output.  A function is also an operation that changes inputs.  Therefore, functions and transformations are the same thing from different perspectives.  This post will explore an unexpected discovery Nurfatimah Merchant and I made when applying the squaring function (transformation) to trigonometric functions, an idea we didn’t fully realize until after the initial publication of PreCalculus Transformed.

When a function is transformed, some points are unchanged (invariant) while others aren’t.  But what makes a point invariant in a transformation?  From a function perspective, point a is invariant under transformation T if T(a)=a.  Using this, a squaring transformation is invariant for an input, a, when a^2=a\Rightarrow a*(a-1)=0 \Rightarrow a=\{0,1\}.

Therefore, input values of  0 and 1 are invariant under squaring, and all other inputs are changed as follows.

  • Negative inputs become positive,
  • a^2<a for any 0<a<1, and
  • a^2>a for any a>1.

So what happens when the squaring transformation is applied to the graph of y=sin(x) (the input) to get the graph of y=(sin(x))^2 (the output)?  Notice that the output of sin(x) is the input to the squaring transformation, so we are transforming y values.  The invariant points in this case are all points where y=0 or y=1.  Because squaring transforms all negative inputs into positive outputs, the first image shows a dashed graph of y=sin(x) with the invariant points marked as black points and the negative inputs made positive with the absolute value function.

All non-invariant points on y=|sin(x)| have magnitude<1 and become smaller in magnitude when squared, as noted above.  Because the original x-intercepts of y=sin(x) are all locally linear, squaring these creates local “bounce” x-intercepts on the output function looking locally similar to the graphs of polynomial double roots.  The result is shown below.

While proof that the final output is precisely another sinusoid comes later, the visual image is very compelling.  This looks like a graph of y=cos(x) under a simple scale change (S_{0.5,-0.5}) and translation (T_{0,0.5}), in that order, giving equation \displaystyle\frac{y-0.5}{-0.5}=cos(\frac{x}{0.5}) or y=\frac{1}{2}-\frac{1}{2}cos(2x).  Therefore,

sin^2(x)=\frac{1}{2}-\frac{1}{2}cos(2x).

We later rewrote this equation to get

cos(2x)=1-2sin^2(x).

The initial equation was a nice enough exercise, but what we realized in the rewriting was that we had just “discovered” the half-angle identity for sine and a double-angle identity for cosine using a graphical variation on the squaring transformation!  No manipulation of angle sum identities was required!  (OK, they really are required for an honest proof, but this is pretty compelling evidence.)

Apply the squaring transformation to the graph of y=cos(x) and you get the half-angle identity for cosine and another variation on the double-angle identity for cosine.
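Both “discovered” identities are easy to confirm with a CAS.  A sympy sketch (my addition):

    import sympy as sp

    x = sp.symbols('x', real=True)
    half = sp.Rational(1, 2)

    print(sp.simplify(sp.sin(x)**2 - (half - half*sp.cos(2*x))))   # 0
    print(sp.simplify(sp.cos(x)**2 - (half + half*sp.cos(2*x))))   # 0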

We thought this was a nice way to sneak up on trigonometric identities.  Enjoy.

Thought Variations and Tests as Learning Tools

I love seeing the different ways students think about solving problems.  Many of my classes involve students analyzing the pros and cons of different approaches.

As an example, a recent question on my first trigonometry test in my precalculus class asked students to find all exact solutions to 3sin^2x-cos^2x=2 in the interval 0\leq x\leq 2\pi.  Admittedly, this is not a complicated problem, but after grading several standard approaches to a solution, one student’s answer (Method 3 below) provided a neat thinking alternative.

As an assessment tool, I don’t view any test as a final product.  Corrections are optional, but all of my students are encouraged to complete them on any test question that didn’t receive full credit.  For me, corrections always require two parts:

  1. Specifically identify the error(s) you made on the problem.
  2. Provide a correct solution to the problem.

My students usually take their tests on their own, but after they are returned, they are encouraged to reference any sources they want (classmates, notes, me, the Web, anyone or anything …) to address the two requirements of test corrections.  The point is for my students to learn from their misunderstandings using any source (or sources) that work for them.  Because students are supposed to do self-assessments, I intentionally don’t provide lots of detail on my initial evaluation of their work.

To show their different approaches, I’ve included the solutions of three students.  Complete solutions are shown so that you can see the initial feedback I offer.  If there’s interest, I’m happy to provide examples of student test corrections in a future post.

Method 1:  Substitution–By far the most common approach taken.  This student solved sin^2x+cos^2x=1 for sin^2x and substituted.  Others substituted for cos^2x.

This solution started well, but she had an algebra error and an angle identification problem.

Method 2:  Elimination–The same Pythagorean identity could be added or subtracted from the given equation.  After talking yesterday with the student who created this particular solution, I was told that he initially completed the left column and attempted the work in the right column as a check at the end of the period.  After committing the same algebra error as the student in method one, he realized at the end of the test that something was amiss when the cosine approach provided an answer different from the two he initially found using the sine approach.

After conversations with classmates yesterday, he caught his algebra error and found the missing answer.  He also corrected the units issue.

[I’m not sure whether I should even care about the units here and am seriously considering removing the 0\leq x\leq 2\pi restriction from future questions.  With enough use in class, they’ll eventually catch on to radian measure.]

Method 3:  Creation–This approach was used by only one student in the class and uses the same Pythagorean identity.  The difference here is that he initially moved the cos^2x term to the other side and then added an additional 3cos^2x to both sides to create a 3 on the left using the identity.  Nothing like this had been discussed in class, and I was quite interested to learn that the student wasn’t even sure his approach was valid.  What I particularly liked was that this student created an expression in his solution rather than eliminating expressions given in the initial equation as every other student in the class had done.  It reflected a mantra I often repeat in class:  If you don’t like the form of a problem (or want a different form), change it!
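Since the student’s paper isn’t reproduced here, the algebra as he described it runs roughly like this (my reconstruction of his steps):

\begin{array}{rcl} 3sin^2x-cos^2x &=& 2 \\ 3sin^2x &=& 2+cos^2x \\ 3sin^2x+3cos^2x &=& 2+4cos^2x \\ 3 &=& 2+4cos^2x \\ cos^2x &=& \frac{1}{4} \\ |cos(x)| &=& \frac{1}{2} \\ x &=& \frac{\pi}{3},\frac{2\pi}{3},\frac{4\pi}{3},\frac{5\pi}{3} \end{array}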

Also notice how he used an absolute value in the penultimate line rather than the more common \pm.

Again, nothing especially deep about any of these, but I learn so much from watching how students solve problems.  Hopefully they gain at least as much from each other when comparing each other’s solutions during corrections.

A student taught me

About 10 years ago, I was introducing a lesson on series in a calculus class. This course was the first for which I had a Computer Algebra System (CAS, in this case, a TI-89) in the hands of every student. I’ve learned that “off task” students who are still engaged in the class are typically the most creative sources of ideas.

In this case, I was using the CAS as a support and verification tool on a lesson reviewing geometric series and introducing the harmonic series. At the end of class, a student approached me with his CAS in hand and an intensely puzzled look on his face. He asked, “How did this happen?” and showed me his calculator screen.

Note the power of the CAS in this instance to handle way more math than the student understood; the result definitely piqued his interest and mine with its handling of an infinite bound, and especially with the completely unexpected appearance of \pi^2. What in the world was going on here? Surely there must be some error.

I had no clue, but promised to get back to him within the next few days. After a week of trying to solve it on my own and not really knowing how to engage a network for help, I humbly returned to my student and confessed that I simply didn’t have an answer for him even though the problem looked amazingly cool. I played with the problem off-and-on over the ensuing months until one Saturday afternoon two years later when I was reading a math article on Euler that offered his ideas on product series. [I had never studied such things. Whether my background should have been broader is an open question.]

If you don’t have the stomach or time or inclination to consume the proof that follows, please scroll to the last few paragraphs of this post for my general comments on the reach of this problem.

Spoiler Alert: If you want to explore why the series sum my student re-discovered is true, stop reading now!

The following proof may not be absolutely water-tight, but it is how I’ve come to understand Euler’s stunning solution.

If you remember Maclaurin series from differential calculus, you might recall that the polynomial equivalent for the sine function at x=0 is sin(x)=\sum_{n=1}^{\infty}\frac{(-1)^{n-1}x^{2n-1}}{(2n-1)!}=x-\frac{x^3}{3!}+\frac{x^5}{5!}-\frac{x^7}{7!}+...

The connection I missed for two years was that this is just a polynomial, and polynomials can be factored. OK, sine’s not really a polynomial, but if you can approximate it as an expanded polynomial (the Maclaurin series above), why can’t you do the same in factored form?

So, the x-intercepts of sine are x=n\pi for every integer n, and by the typical manner of writing factored polynomials today, one would write sin(x)=A\cdot x\cdot (x-\pi )(x+\pi )(x-2\pi )(x+2\pi )... for some stretch coefficient A. But polynomial factors can also be written in the slightly more complicated form \left( 1-\frac{x}{a}\right) for each root a, making sin(x)=A\cdot x\cdot\left( 1-\frac{x}{\pi}\right)\left( 1+\frac{x}{\pi}\right)\left( 1-\frac{x}{2\pi}\right)\left( 1+\frac{x}{2\pi}\right)... , which actually ends up simplifying this problem significantly.

The coefficient A allows the factor (product) series to vertically stretch to fit any curve with the given roots. Notice, too, that the equidistant terms on either side of the x term are conjugates, so the series can be further rewritten as an infinite product of differences of squares:

sin(x)=A\cdot x\cdot\left( 1-\frac{x^2}{\pi^2}\right)\left( 1-\frac{x^2}{4\pi^2}\right)\left( 1-\frac{x^2}{9\pi^2}\right)...=A\cdot x\cdot\prod_{n=1}^{\infty}\left( 1-\frac{x^2}{n^2\pi^2}\right)

But now there are two polynomials representing the same function, so the polynomials must be equivalent. Therefore,

x-\frac{x^3}{3!}+\frac{x^5}{5!}-\frac{x^7}{7!}+...=A\cdot x\cdot\prod_{n=1}^{\infty}\left( 1-\frac{x^2}{n^2\pi^2}\right) ,

a pretty amazing equivalence in its own right.

Whenever two polynomials are equivalent, their coefficients must be equal when the polynomials are written in the same form. To explore this, you can expand the product form. Notice that the Maclaurin polynomial doesn’t have a constant term, and neither does the factored form because every term in the expansion of the product must include the leading Ax factor. Continuing this reasoning, the only way to get a linear term is to take the Ax term and then multiply by the 1 from every single binomial factor. Anything else would yield a term with degree higher than 1. Equating the linear terms from the product expansion (Ax) and the Maclaurin series (x) proves that A=1.

You can reason why the product series cannot produce quadratic terms, so turn your attention now to the cubic terms. The only cubic terms in the expansion of the product series come from multiplying the leading x term by a quadratic term in ONE of the binomial factors and by the 1 terms in all of the remaining binomial factors. Collecting all of these cubic terms and equating them to the known Maclaurin cubic term gives

-\frac{1}{3!}x^3=\left( -\frac{1}{1^2\pi^2}-\frac{1}{2^2\pi^2}-\frac{1}{3^2\pi^2}-...\right) x^3

Factoring out -\frac{1}{\pi^2} allows the coefficients to be compared, \frac{1}{3!}=\frac{1}{\pi^2}\cdot\left( \frac{1}{1^2}+\frac{1}{2^2}+\frac{1}{3^2}+...\right) , and a minor re-arrangement from there proves my student’s result:

\sum_{n=1}^{\infty}\frac{1}{n^2}=\frac{\pi^2}{6}
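For the skeptical, here is a quick numeric check of the result (a numpy sketch, my addition); the partial sums crawl toward \pi^2/6:

    import numpy as np

    n = np.arange(1, 100001)
    print(np.sum(1.0 / n**2))   # about 1.64492...
    print(np.pi**2 / 6)         # about 1.64493...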

I’ve always been blown away by how quickly this proof collapses from very cumbersome equations into such a beautiful result. It also amazes me that while a sine function was used to leverage the proof, the final result conveys nothing of its presence. Equating higher order terms from the two series allows you to compute the value of \sum_{n=1}^{\infty}\frac{1}{n^k} for any even value of k, but closed forms for odd values of k greater than 1 are still an outstanding problem in mathematics. We know those series converge; we simply don’t have closed form values for their sums. Anyone up for the challenge?

(By the way, you can read here some historical context and connections from this problem to much more substantial mathematics today, including a connection to the zeta function, which underlies the still-unsolved Riemann Hypothesis.)

This is some pretty heady stuff for a student who just wanted to know what happened if you tweaked an expression a little bit. But then, isn’t that how all good math starts? My only regret is that the student who initially posed the problem graduated high school before we could find the answer to his question. The positive side is that hundreds of my students since then have been able to see this gorgeous connection.

I hope all my students know that I value their questions (no matter how shallow, deep, or off-topic) and that I learn more and become a better teacher with every new perspective they share.

Post-script (9/5/11) from communications with David Doster:  I knew that I was walking in the footsteps of others as I created the proof above on my own; a serendipitous communication from David, a colleague I met at a 1993 Woodrow Wilson institute on the Mathematics of Change, showed me just how well-traveled the path was.  David pointed me toward Robin Chapman’s Home Page, which includes 14 different proofs of this result–Euler’s (and my) proof is the 7th in that list.  This result has also been a favorite of William Dunham, as published in his books Journey Through Genius and Euler: The Master of Us All, both definitely worth reading.