Category Archives: Lessons

Powers of i

I was discussing integer powers of i in my summer Algebra 2 class last month and started with the “standard” modulus-4 pattern I learned as a student and have always taught.  While it is not a particularly deep insight, my students and I found another approach that might prove simpler for some.

TRADITIONAL APPROACH:

I began with the obvious i^0 and i^1 before invoking the definition of i to get i^2.  From these three you can see that every time the power of i increases by 1, you multiply the previous result by i and simplify, if possible, using these first 3 terms.  The result for i^3 is simple, extending the known results to

i_1

But i^4=-i^2=-(-1)=1, cycling back to the value initially found with i^0.  Continuing this procedure creates a modulus-4 pattern:

i_2

They noticed that i to any multiple of 4 was 1, and other powers were i, -1, or –i, depending on how far removed they were from a multiple of 4.  For an algorithm to compute a simplified form of i to an integer power, divide the power by 4, and raise i to the remainder (0, 1, 2, or 3) from that division.
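The remainder algorithm above can be sketched in a few lines of Python.  This is only an illustration, not something we used in class, and the function name `power_of_i` is my own invention.

```python
def power_of_i(n: int) -> str:
    """Simplify i**n with the 4-cycle: i**n equals i**(n mod 4)."""
    cycle = ["1", "i", "-1", "-i"]  # i^0, i^1, i^2, i^3
    # Python's % always returns 0, 1, 2, or 3 here, even for negative n.
    return cycle[n % 4]


print(power_of_i(148))  # 148 = 4*37 + 0
print(power_of_i(567))  # 567 = 4*141 + 3
```

Note that the four base values still have to be written into the list, which is exactly the memorization the student objected to.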

They got the pattern and were ready to move on when one student who had glimpsed this in a math competition at some point noted he could “do it”, but it seemed to him that memorizing the list of 4 base powers was a necessary requirement to invoking the pattern.

Then he recalled a comment I made on the first day of class.  I value memorizing as little mathematics as possible and using the mathematics we do know as widely as possible.  His challenge was clear:  Wasn’t asking students to use this 4-cycle approach just a memorization task in disguise?  If I believed in my non-memorization claim, shouldn’t there be another way to achieve our results using nothing more than the definition of i?

A POTENTIAL IMPROVEMENT:

By definition, i = \sqrt{-1}, so it’s a very small logical stretch with inverse operations to claim i^2=-1.

Even Powers:  After trying some different examples, one student had an easy way to handle even powers.  For example, if n=148, she invoked an exponent rule “in reverse” to extract an i^2 term which she turned into a -1.  Because -1 to any integer power is either 1 or -1, she used the properties of negative numbers to odd and even powers to determine the sign of her answer.

i_3

Because any even power can always be written as the product of 2 and another number, this gave an easy way to handle half of all cases using nothing more than the definition of i and exponents of -1.

A third student pointed out another efficiency.  Because the final result depended only on whether the integer multiplied by 2 was even or odd, only the last two digits of n were even relevant.  That pattern also exists in the 4-cycle approach, but it felt more natural here.

Odd Powers:  Even powers were so simple, they were initially frustrated that odd powers didn’t seem to be, too.  Then the student who’d issued the memorization challenge said that any odd power of i was just the product of i and an even power of i.  Invoking the efficiency in the last paragraph for n=567, he found

i_4
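For comparison, here is a rough Python sketch of the students’ even/odd approach.  It uses only i^2=-1 and the sign of a power of -1; the name `power_of_i_parity` is hypothetical, and the code is my after-the-fact illustration rather than anything from the class.

```python
def power_of_i_parity(n: int) -> str:
    """Simplify i**n using only i**2 = -1 and powers of -1."""
    # Write n = 2*half + odd, so i**n = (i**2)**half * i**odd = (-1)**half * i**odd.
    half, odd = divmod(n, 2)
    sign = "-" if half % 2 else ""  # (-1)**half is -1 exactly when half is odd
    return sign + ("i" if odd else "1")


print(power_of_i_parity(148))  # half = 74 (even), no leftover i
print(power_of_i_parity(567))  # half = 283 (odd), one leftover i

# The last-two-digits shortcut: only n mod 100 matters.
print(power_of_i_parity(567) == power_of_i_parity(67))
```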

CONCLUSION:

In the end, powers of i had become nothing more complicated than exponent properties and powers of -1.  The students seemed to have greater comfort with finding powers of complex numbers, but I have begun to question why algebra courses have placed so much emphasis on powers of i.

From one perspective, a surprising property of complex numbers for many students is that any operation on complex numbers creates another complex number.  While they are told that complex numbers are a closed set, to see complex numbers simplify so conveniently surprises many.

Another cool aspect of complex number operations is the stretch-and-rotate graphical property of complex number multiplication.   This is the basis of DeMoivre’s Theorem and explains why there are exactly 4 results when you repeatedly multiply any complex number by i–equivalent to stretching by a factor of 1 and rotating \frac{\pi}{2}.  Multiplying by 1 doesn’t change the magnitude of a number, and after 4 rotations of \frac{\pi}{2}, you are back at the original number.

So, depending on the future goals or needs of your students, there is certainly a reason to explore the 4-cycle nature of repeated multiplication by i.  If the point is just to compute a result, perhaps the 4-cycle approach is unnecessarily “complex”, and the odd/even powers of -1 is less computationally intense.  In the end, maybe it’s all about number sense.

My students discovered a more basic algorithm, but I’m more uncomfortable.  Just because we can ask our students a question doesn’t mean we should.  I can see connections from my longer studies, but do they see or care?  In this case, should they?


Helping Students Visualize Skew

This week during my statistics classes’ final review time for their term final, I had a small idea I wish I’d had years ago.

Early in the course, we talked about means and medians as centers of data and how these values were nearly identical in roughly symmetric data sets.  However, when there are extreme or outlier values on only one side of the center, the more extreme-sensitive mean would be “pulled” in the direction of those outliers.  In simple cases like this, the pull of the extreme values is said to “skew” the data in the direction of the extremes.  Admittedly, in more complicated data sets with one long tail and one heavy tail, skew can be difficult to visualize, but in early statistics classes with appropriate warnings, I’ve found it sufficient to discuss basic skew in terms of the pull of extreme values on the mean.
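A tiny Python sketch of that “pull” on the mean follows.  The numbers are made up purely for illustration and are not the Skittles data discussed below.

```python
from statistics import mean, median

# Hypothetical, roughly symmetric data: mean and median agree.
symmetric = [48, 49, 50, 51, 52]

# Add two extreme values on the right side only.
right_skewed = symmetric + [90, 120]

print(mean(symmetric), median(symmetric))
print(mean(right_skewed), median(right_skewed))
```

The median barely moves when the extreme values are added, while the mean is dragged well to the right, which is the direction of the skew.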

Despite my efforts to help students understand the direction of skew relative to the tails of a data set, I’ve noticed that many still describe first the side where the “data piles up” before declaring the skew to be the opposite.  For example, using a histogram (below) of data from my school last week where students were given a week to guess the number of Skittles in a glass jar, several of my students continued to note that the data “piled up” on the left, so the skew was right, or positive.

skittles1

SIDE NOTE:  There were actually 838 Skittles in the jar.  Clearly most of the students seriously underestimated the total with a few extremely hopeful outliers to the far right.

While my students can properly identify the right skew of this data set, I remained bothered by the mildly convoluted approach they persistently used to determine the skew.  I absolutely understand why their eyes are drawn to the left-side pile up, but I wondered if there was another way I could visualize skew that might get them to look first in the skewed direction.  That’s when I wondered about the possibility of describing the “stretchiness” of skew via boxplots.  Nearly symmetric data have nearly symmetric box plots, but extreme or outlier values would notably pull the whiskers of the boxplot or appear as outlier dots on the ends.

If my first visualization of the Skittles data was with a boxplot (below) would that have made it any easier to see the extreme right-side pull on the Skittles data or would they see the box on the left and then declare right skew?

skittles2

Is it possible the longer right whisker and several right-side outliers in this boxplot make it easier to see right skewness directly rather than as the opposite of the data’s left-side pile-up?  It’ll be the start of class in Fall 2016 before I can try out this idea.

In the meantime, I wonder how others approach helping students see and understand skewness directly rather than as a consequence of something else.  Ideas, anyone?

Probability, Polynomials, and Sicherman Dice

Three years ago, I encountered a question on the TI-Nspire Google group asking if there was a way to use CAS to solve probability problems.  The ideas I pitched in my initial response and follow-up a year later (after first using it with students in a statistics class) have been thoroughly re-confirmed in my first year teaching AP Statistics.  I’ll quickly re-share them below before extending the concept with ideas I picked up a couple weeks ago from Steve Phelps’ session on Probability, Polynomials, and CAS at the 64th annual OCTM conference earlier this month in Cleveland, OH.

BINOMIALS:  FROM POLYNOMIALS TO SAMPLE SPACES

Once you understand them, binomial probability distributions aren’t that difficult, but the initial conjoining of combinatorics and probability makes this a perennially difficult topic for many students.  The standard formula for the probability of K successes in N attempts of a binomial situation, where p is the probability of success in a single attempt, is no less daunting:

\displaystyle \left( \begin{matrix} N \\ K \end{matrix} \right) p^K (1-p)^{N-K} = \frac{N!}{K! (N-K)!} p^K (1-p)^{N-K}

But that is almost exactly the same result one gets by raising binomials to whole number powers, so why not use a CAS to expand a polynomial and at least compute the \displaystyle \left( \begin{matrix} N \\ K \end{matrix} \right) portion of the probability?  One added advantage of using a CAS is that you could use full event names instead of abbreviations, making it even easier to identify the meaning of each event.

prob1

The TI-Nspire output above shows the entire sample space resulting from flipping a coin 6 times.  Each term is an event.  Within each term, the exponent of each variable notes the number of times that variable occurs and the coefficient is the number of times that combination occurs.  The overall exponent in the expand command is the number of trials.  For example, the middle term– 20\cdot heads^3 \cdot tails^3 –says that there are 20 ways you could get 3 heads and 3 tails when tossing a coin 6 times. The last term is just tails^6, and its implied coefficient is 1, meaning there is just one way to flip 6 tails in 6 tosses.
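For readers without a CAS, the same sample space can be listed with a short Python sketch.  This swaps the symbolic expand command for direct binomial coefficients, and the function name `coin_sample_space` is just an illustrative choice.

```python
from math import comb


def coin_sample_space(trials: int) -> list[tuple[int, int, int]]:
    """Return (ways, heads, tails) for each term of (heads + tails)**trials."""
    # Terms are listed from all heads down to all tails.
    return [(comb(trials, h), h, trials - h) for h in range(trials, -1, -1)]


for ways, h, t in coin_sample_space(6):
    print(f"{ways} * heads^{h} * tails^{t}")
```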

The expand command makes more sense than memorized algorithms and provides context to students until they gain a deeper understanding of what’s actually going on.

FROM POLYNOMIALS TO PROBABILITY

Still using the expand command, if each variable is preceded by its probability, the CAS result combines the entire sample space AND the corresponding probability distribution function.  For example, when rolling a fair die four times, the distribution for 1s vs. not 1s (2, 3, 4, 5, or 6) is given by

prob2

The highlighted term says there is a 38.58% chance that there will be exactly one 1 and any three other numbers (2, 3, 4, 5, or 6) in four rolls of a fair 6-sided die.  The probabilities of the other four events in the sample space are also shown.  Within the TI-Nspire (CAS or non-CAS), one could use a command to give all of these probabilities simultaneously (below), but then one has to remember whether the non-contextualized probabilities are for increasing or decreasing values of which binomial outcome.

prob3
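As a rough cross-check outside the TI-Nspire, the same distribution can be computed in Python straight from the binomial formula.  The helper name `binomial_pmf` is hypothetical, and labeling each probability in the printout is my attempt to keep the context that the bare list loses.

```python
from math import comb


def binomial_pmf(n: int, p: float) -> list[float]:
    """P(exactly k successes in n trials) for k = 0, 1, ..., n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


# Rolling a fair die four times: "success" means rolling a 1.
ones_in_four_rolls = binomial_pmf(4, 1 / 6)
for k, prob in enumerate(ones_in_four_rolls):
    print(f"{k} ones and {4 - k} others: {prob:.4f}")
```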

Particularly early on in their explorations of binomial probabilities, students I’ve taught have shown a very clear preference for the polynomial approach, even when allowed to choose any approach that makes sense to them.

TAKING POLYNOMIALS FROM ONE DIE TO MANY

Given these earlier thoughts, I was naturally drawn to Steve Phelps’ “Probability, Polynomials, and CAS” session at the November 2014 OCTM annual meeting in Cleveland, OH.  Among the ideas he shared was using polynomials to create the distribution function for the sum of two fair 6-sided dice.  My immediate thought was to apply my earlier ideas.  As noted in my initial post, the expansion approach above is not limited to binomial situations.  My first reflexive CAS command in Steve’s session before he shared anything was this.

prob4

By writing the outcomes in words, the CAS interprets them as variables.  I got the entire sample space, but didn’t gain anything beyond a long polynomial.  The first output– five^2 –with its implied coefficient says there is 1 way to get 2 fives.  The second term– 2\cdot five \cdot four –says there are 2 ways to get 1 five and 1 four.  Nice that the technology gives me all the terms so quickly, but it doesn’t help me get a distribution function of the sum.  I got the distributions of the specific outcomes, but the way I defined the variables didn’t permit summing their actual numerical values.  Time to listen to the speaker.

He suggested using a common variable, X, for all faces with the value of each face expressed as an exponent.  That is, a standard 6-sided die would be represented by X^1+X^2+ X^3+X^4+X^5+X^6 where the six different exponents represent the numbers on the six faces of a typical 6-sided die.  Rolling two such dice simultaneously is handled as I did earlier with the binomial cases.

NOTE:  Exponents are handled in TWO different ways here.  1) Within a single polynomial, an exponent is an event value, and 2) Outside a polynomial, an exponent indicates the number of times that polynomial is applied within the specific event.  Coefficients have the same meaning as before.

Because the variables are now the same, when specific terms are multiplied, their exponents (face values) will be added–exactly what I wanted to happen.  That means the sum of the faces when you roll two dice is determined by the following.

prob5

Notice that the output is a single polynomial.  Therefore, the exponents are the values of individual cases.  For a couple examples, there are 3 ways to get a sum of 10 \left( 3 \cdot x^{10} \right) , 2 ways to get a sum of 3 \left( 2 \cdot x^3 \right) , etc.  The most commonly occurring outcome is the term with the largest coefficient.  For rolling two standard fair 6-sided dice, a sum of 7 is the most common outcome, occurring 6 times \left( 6 \cdot x^7 \right) .  That certainly simplifies the typical 6×6 tables used to compute the sums and probabilities resulting from rolling two dice.
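The exponent-adding idea can be sketched without a CAS.  In the Python below, a polynomial is stored as a mapping from exponent (face value or sum) to coefficient (number of ways); the helper names `die` and `multiply` are my own, not anything from Steve’s session.

```python
from collections import Counter


def die(faces):
    """Polynomial for one die as {exponent: coefficient}: one x**face per face."""
    return Counter(faces)


def multiply(p, q):
    """Multiply two polynomials: exponents add, coefficients multiply."""
    product = Counter()
    for exp_p, coef_p in p.items():
        for exp_q, coef_q in q.items():
            product[exp_p + exp_q] += coef_p * coef_q
    return product


d6 = die(range(1, 7))          # x + x^2 + ... + x^6
two_dice = multiply(d6, d6)    # distribution of the sum of two dice
for total, ways in sorted(two_dice.items()):
    print(f"{ways} * x^{total}")
```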

While not the point of Steve’s talk, I immediately saw that technology had just opened the door to problems that had been computationally inaccessible in the past.  For example, what is the most common sum when rolling 5 dice and what is the probability of that sum?  On my CAS, I entered this.

prob6

In the middle of the expanded polynomial are two terms with the largest coefficients, 780 \cdot x^{17} and 780 \cdot x^{18}, meaning sums of 17 and 18 are the most common, equally likely outcomes when rolling 5 dice.  As there are 6^5=7776 possible outcomes when rolling a die 5 times, the probability of each of these is \frac{780}{7776} \approx 0.1003, or about a 10.03% chance each for a sum of 17 or 18.  This can be verified by inserting the probabilities as coefficients before each term before CAS expanding.

prob7

With thought, this shouldn’t be surprising as the expected mean value of rolling a 6-sided die many times is 3.5, and 5 \cdot 3.5 = 17.5, so the integers on either side of 17.5 (17 & 18) should be the most common.  Technology confirms intuition.
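A brute-force count gives an independent check on the five-dice result.  This Python sketch simply lists all 7776 ordered rolls instead of expanding a polynomial, so it is a different technique from the CAS approach; the variable names are arbitrary.

```python
from collections import Counter
from itertools import product

# Every ordered outcome of rolling five 6-sided dice: 6**5 = 7776 of them.
rolls = product(range(1, 7), repeat=5)
counts = Counter(sum(roll) for roll in rolls)

most = max(counts.values())
modes = sorted(total for total, ways in counts.items() if ways == most)
print(modes, most, round(most / 6**5, 4))
```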

ROLLING DIFFERENT DICE SIMULTANEOUSLY

What is the distribution of sums when rolling a 4-sided and a 6-sided die together?  No problem.  Just multiply two different polynomials, one representative of each die.

prob8

The output shows that sums of 5, 6, and 7 would be the most common, each occurring four times with probability \frac{1}{6} and together accounting for half of all outcomes of rolling these two dice together.
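Here is a minimal Python sketch of the same product, this time storing each polynomial as a list in which position k holds the coefficient of x^k.  The name `poly_mul` is hypothetical.

```python
def poly_mul(p, q):
    """Multiply coefficient lists, where p[k] is the coefficient of x**k."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


d4 = [0, 1, 1, 1, 1]          # x + x^2 + x^3 + x^4
d6 = [0, 1, 1, 1, 1, 1, 1]    # x + x^2 + ... + x^6
sums = poly_mul(d4, d6)

for total, ways in enumerate(sums):
    if ways:
        print(f"sum {total}: {ways} of {sum(sums)} outcomes")
```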

A BEAUTIFUL EXTENSION–SICHERMAN DICE

My most unexpected gain from Steve’s talk happened when he asked if we could get the same distribution of sums as “normal” 6-sided dice, but from two different 6-sided dice.  The only restriction he gave was that all of the faces of the new dice had to have positive values.  This can be approached by realizing that the distribution of sums of the two normal dice can be found by multiplying two representative polynomials to get

x^{12}+2x^{11}+3x^{10}+4x^9+5x^8+6x^7+5x^6+4x^5+3x^4+2x^3+x^2.

Restating the question in the terms of this post, are there two other polynomials that could be multiplied to give the same product?  That is, does this polynomial factor into other polynomials that could multiply to the same product?  A CAS factor command gives

prob9

Any rearrangement of these eight (four distinct) sub-polynomials would create the same distribution as the sum of two dice, but what would the separate sub-products mean in terms of the dice?  As a first example, what if the first two expressions were used for one die (line 1 below) and the two squared trinomials comprised a second die (line 2)?

prob10

Line 1 actually describes a 4-sided die with one face of 4, two faces with 3s, and one face of 2.  Line 2 describes a 9-sided die (whatever that is) with one face of 8, two faces of 6, three faces of 4, two faces of 2, and one face with a 0 ( 1=1 \cdot x^0).  This means rolling a 4-sided and a 9-sided die as described would give exactly the same sum distribution.  Cool, but not what I wanted.  Now what?

Factorization gave four distinct sub-polynomials, each with multiplicity 2.  One die could contain 0, 1, or 2 of each of these with the remaining factors on the other die.  That means there are 3^4=81 different possible dice combinations.  I could continue with a trial-and-error approach, but I wanted to be more efficient and elegant.

What follows is the result of thinking about the problem for a while.  Like most math solutions to interesting problems, ultimate solutions are typically much cleaner and more elegant than the thoughts that went into them.  Problem solving is a messy–but very rewarding–business.

SOLUTION

Here are my insights over time:

1) I realized that the x^2 factor would raise the powers (face values) of the desired dice, but would not change the coefficients (number of faces).  Because Steve asked for dice with all positive face values, each desired die had to have at least one factor of x to prevent non-positive face values.

2) My first attempt didn’t create 6-sided dice.  The sums of the coefficients of the sub-polynomials determined the number of sides.  That sum could also be found by substituting x=1 into the sub-polynomial.  I want 6-sided dice, so the final coefficients must add to 6.  Because substituting x=1 into a product multiplies the values of its factors, the coefficient sums of the factors assigned to any one die must have a product of 6.  The coefficients of (x+1) add to 2, \left( x^2+x+1 \right) add to 3, and \left( x^2-x+1 \right) add to 1.  The only way to get a polynomial coefficient sum of 6 (and thereby create 6-sided dice) is for each die to have one (x+1) factor and one \left( x^2+x+1 \right) factor.

3) That leaves the two \left( x^2-x+1 \right) factors.  They could split between the two dice or both could be on one die, leaving none on the other.  We’ve already determined that each die already had to have one each of the x, (x+1), and \left( x^2+x+1 \right) factors.  To also split the \left( x^2-x+1 \right) factors would result in the original dice:  Two normal 6-sided dice.  If I want different dice, I have to load both of these factors on one die.

That means there is ONLY ONE POSSIBLE alternative for two 6-sided dice that have the same sum distribution as two normal 6-sided dice.

prob11

One die would have single faces of 8, 6, 5, 4, 3, and 1.  The other die would have one 4, two 3s, two 2s, and one 1.  And this is exactly the result of the famous(?) Sicherman Dice.
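The trial-and-error route I avoided is easy to hand to a computer.  The Python sketch below tries all 81 ways of splitting the factors between two dice and keeps only the pairs that are genuine 6-sided dice with positive faces.  The helper names (`poly_mul`, `build`, `faces`) are my own, and this is a check on the reasoning above rather than the method I actually used.

```python
from itertools import product


def poly_mul(p, q):
    """Multiply coefficient lists, where p[k] is the coefficient of x**k."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


# The four distinct factors: x, x+1, x^2+x+1, x^2-x+1 (each appears twice).
FACTORS = [[0, 1], [1, 1], [1, 1, 1], [1, -1, 1]]


def build(exponents):
    """Multiply the factors together, each raised to its given exponent."""
    poly = [1]
    for factor, power in zip(FACTORS, exponents):
        for _ in range(power):
            poly = poly_mul(poly, factor)
    return poly


def faces(poly):
    """Face values of a die polynomial, or None if it is not a valid die."""
    # Valid: no negative coefficients, exactly 6 faces, and no face of 0.
    if min(poly) < 0 or sum(poly) != 6 or poly[0] != 0:
        return None
    return tuple(face for face, count in enumerate(poly) for _ in range(count))


pairs = set()
for split in product(range(3), repeat=4):      # 3**4 = 81 possible splits
    first = faces(build(split))
    second = faces(build([2 - e for e in split]))
    if first and second:
        pairs.add(tuple(sorted([first, second])))

for pair in sorted(pairs):
    print(pair)
```

Only two pairs survive: the normal dice and the pair described above.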

If a 0 face value was allowed, shift one factor of x from one polynomial to the other.  This can be done two ways.

prob12

The first possibility has dice with faces {9, 7, 6, 5, 4, 2} and {3, 2, 2, 1, 1, 0}, and the second has faces {7, 5, 4, 3, 2, 0} and {5, 4, 4, 3, 3, 2}, giving the only other two non-negative solutions to the Sicherman Dice.
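These claims are simple to confirm by brute force.  The Python sketch below compares the sum distribution of each pair against two normal dice; `sum_distribution` is a hypothetical helper name.

```python
from collections import Counter
from itertools import product


def sum_distribution(die_a, die_b):
    """Count how many ways each total occurs when the two dice are rolled."""
    return Counter(a + b for a, b in product(die_a, die_b))


standard = sum_distribution(range(1, 7), range(1, 7))

candidates = [
    ([8, 6, 5, 4, 3, 1], [4, 3, 3, 2, 2, 1]),   # positive faces only
    ([9, 7, 6, 5, 4, 2], [3, 2, 2, 1, 1, 0]),   # one 0 face allowed
    ([7, 5, 4, 3, 2, 0], [5, 4, 4, 3, 3, 2]),   # one 0 face allowed
]

for die_a, die_b in candidates:
    print(die_a, die_b, sum_distribution(die_a, die_b) == standard)
```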

Both of these are nothing more than adding one to all faces of one die and subtracting one from all faces of the other.  While it is not necessary to use polynomials to compute these, they are equivalent to multiplying the polynomial of one die by x and the other by \frac{1}{x} as many times as desired. That means there are an infinite number of 6-sided dice with the same sum distribution as normal 6-sided dice if you allow the sides to have negative faces.  One of these is

prob13

corresponding to a pair of Sicherman Dice with faces {6, 4, 3, 2, 1, -1} and {6, 5, 5, 4, 4, 3}.

CONCLUSION:

There are other very interesting properties of Sicherman Dice, but this is already a very long post.  In the end, there are tremendous connections between probability and polynomials that are accessible to students at the secondary level and beyond.  And CAS keeps the focus on student learning and away from the manipulations that aren’t even the point in these explorations.

Enjoy.

Monty Hall Continued

In my recent post describing a Monty Hall activity in my AP Statistics class, I shared an amazingly crystal-clear explanation of how one of my new students conceived of the solution:

If your strategy is staying, what’s your chance of winning?  You’d have to miraculously pick the money on the first shot, which is a 1/3 chance.  But if your strategy is switching, you’d have to pick a goat on the first shot.  Then that’s a 2/3 chance of winning.  

Then I got a good follow-up question from @SteveWyborney on Twitter:

Returning to my student’s conclusion about the 3-door version of the problem, she said,

The fact that there are TWO goats actually can help you, which is counterintuitive on first glance. 

Extending her insight and expanding the problem to any number of doors, including Steve’s proposed 1,000,000 doors, the more goats one adds to the problem statement, the more likely it becomes to win the treasure with a switching doors strategy.  This is very counterintuitive, I think.

For Steve’s formulation, only 1 initial guess from the 1,000,000 possible doors would have selected the treasure–the additional goats seem to diminish one’s hopes of ever finding the prize.  Each of the other 999,999 initial doors would have chosen a goat.  So if 999,998 goat-doors then are opened until all that remains is the original door and one other, the contestant would win by not switching doors iff the prize was initially randomly selected, giving P(win by staying) = 1/1000000.  The probability of winning with the switching strategy is the complement, 999999/1000000.  
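The same reasoning works for any number of doors, as this small Python sketch shows.  It assumes, as above, that the host opens every goat door except one; the function name `monty_hall_probabilities` is hypothetical.

```python
from fractions import Fraction


def monty_hall_probabilities(doors: int) -> tuple[Fraction, Fraction]:
    """Return (P(win by staying), P(win by switching)) for a given door count."""
    # Staying wins only if the first pick was the prize: 1 chance in `doors`.
    stay = Fraction(1, doors)
    # Switching wins in every other case, so it is the complement.
    return stay, 1 - stay


for n in (3, 10, 1_000_000):
    stay, switch = monty_hall_probabilities(n)
    print(n, stay, switch)
```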

IN RETROSPECT:

My student’s solution statement reminds me on one hand how critically important it is for teachers to always listen to and celebrate their students’ clever new insights and questions, many possessing depth beyond what students realize.  

The solution reminds me of several variations on “Everything is obvious in retrospect.”  I once read an even better version but can’t track down the exact wording.  A crude paraphrasing is

The more profound a discovery or insight, the more obvious it appears after.

I’d love a lead from anyone with the original wording.

REALLY COOL FOOTNOTE:

Adding to the mystique of this problem, I read in the Wikipedia description that even the great problem poser and solver Paul Erdős didn’t believe the solution until he saw a computer simulation result detailing the solution.  

Probability and Monty Hall

I’m teaching AP Statistics for the first time this year, and my first week just ended.  I’ve taught statistics as portions of other secondary math courses and as a semester-long community college class, but never under the “AP” moniker.  The first week was a blast.  

To connect even the very beginning of the course to previous knowledge of all of my students, I decided to start the year with a probability unit.  For an early class activity, I played the classic Monty Hall game with the classes.  Some readers will recall the rules, but here they are just in case you don’t know them.

  1. A contestant faces three closed doors.  Behind one is a new car. There is a goat behind each of the other two. 
  2. The contestant chooses one of the doors and announces her choice.  
  3. The game show host then opens one of the other two doors to reveal a goat.
  4. Now the contestant has a choice to make.  Should she
    1. Always stay with the door she initially chose, or
    2. Always change to the remaining unopened door, or
    3. Flip a coin to choose which door because the problem essentially has become a 50-50 chance of pure luck.

Historically, many people (including many very highly educated, degree flaunting PhDs) intuit the solution to be “pure luck”.  After all, don’t you have just two doors to choose from at the end?

In one class this week, I tried a few simulations before I posed the question about strategy.  In the other, I posed the question of strategy before any simulations.  In the end, very few students intuitively believed that staying was a good strategy, with the remainder more or less equally split between the “switch” and “pure luck” options.  I suspect the greater number of “switch” believers (and dearth of stays) may have been because of earlier exposure to the problem.  

I ran my class simulation this way:  

  • Students split into pairs (one class had a single group of 3).  
  • One student was the host and secretly recorded a door number.  
  • The class decided in advance to always follow the “switch” strategy.  [Ultimately, following either stay or switch is irrelevant, but having all groups follow the same strategy gives you the same data in the end.]
  • The contestant then chose a door, the host announced an open door, and the contestant switched doors.
  • The host then declared a win or loss based on his initial door choice in step two.
  • Each group repeated this 10 times and reported their final number of wins to the entire class.
  • This accomplished a reasonably large number of trials from the entire class in a very short time via division of labor.  Because they chose the switch strategy, my two classes ultimately reported 58% and 68% winning percentages.
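The class procedure above can be mimicked with a short Python simulation.  This is only a sketch: the seed and the choice of 12 groups are arbitrary assumptions of mine, not my actual class sizes, and `play_switch` is a hypothetical name.

```python
import random


def play_switch(rng: random.Random) -> bool:
    """Play one game with the always-switch strategy; True means a win."""
    prize = rng.randrange(3)        # host secretly records the prize door
    first_pick = rng.randrange(3)   # contestant's initial choice
    # The host opens a goat door that is neither the prize nor the first pick.
    opened = rng.choice([d for d in range(3) if d not in (prize, first_pick)])
    # The contestant switches to the one remaining closed door.
    final_pick = next(d for d in range(3) if d not in (first_pick, opened))
    return final_pick == prize


rng = random.Random(2014)  # arbitrary seed so the run is repeatable
# Assumed setup: 12 groups, each playing 10 games.
group_wins = [sum(play_switch(rng) for _ in range(10)) for _ in range(12)]
print(group_wins)
print(sum(group_wins) / 120)
```

Individual groups of 10 games can vary a lot, but the pooled winning percentage should settle near 2/3 as more games are played.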

Curiously, the class that had the 58% percentage had one group with just 1 win out of 10 and another winning only 4 of 10. It also had a group that reported winning 10 of 10.  Strange, but even with the low, unexpected probabilities, the long-run behavior from all groups still led to a plurality winning percentage for switching.

Here’s a verbatim explanation from one of my students written after class for why switching is the winning strategy.  It’s perhaps the cleanest reason I’ve ever heard.

The faster, logical explanation would be: if your strategy is staying, what’s your chance of winning?  You’d have to miraculously pick the money on the first shot, which is a 1/3 chance.  But if your strategy is switching, you’d have to pick a goat on the first shot.  Then that’s a 2/3 chance of winning.  In a sense, the fact that there are TWO goats actually can help you, which is counterintuitive on first glance. 

Engaging students hands-on in the experiment made for a phenomenal pair of classes and discussions. While many left still a bit disturbed that the answer wasn’t 50-50, this was a spectacular introduction to simulations, conditional probability, and cool conversations about the inevitability of streaks in chance events. 

For those who are interested, here’s another good YouTube demonstration & explanation.

Traveling Dots, Parabolas, and Elegant Math

Toward the end of last week, I read a description of a variation on a paper-folding strategy to create parabolas.  Paraphrased, it said:

  1. On a piece of wax paper, use a pen to draw a line near one edge.  (I used a Sharpie on regular copy paper and got enough ink bleed that I’m convinced any standard copy or notebook paper will do.  I don’t think the expense of wax paper is required!)
  2. All along the line, place additional dots 0.5 to 1 inch apart.
  3. Finally, draw a point F between 0.5 and 2 inches from the line roughly along the midline of the paper toward the center of the paper.
  4. Fold the paper over so one of the dots on the line is on top of point F.  Crease the paper along the fold and open the paper back up.
  5. Repeat step 4 for every dot you drew in step 2.
  6. All of the creases from steps 4 & 5 outline a curve.  Trace that curve to see a parabola.

parabola1

I’d seen and done this before, but I had too passively trusted that the procedure must have been true just because the resulting curve “looked like a parabola.”  I read the proof some time ago, but I consumed it too quickly and didn’t remember it when I read the above procedure.  I shamefully admitted to myself that I was doing exactly what we insist our students NEVER do–blindly accepting a “truth” based on its appearance.  So I spent part of that afternoon thinking about how to understand completely what was going on here.

What follows is the chronological redevelopment of my chain of reasoning for this activity, hopefully showing others that the prettiest explanations rarely occur without effort, time, and refinement.  At the end of this post, I offer what I think is an even smoother version of the activity, freed from some of what I consider overly structured instructions above.

CONIC DEFINITION AND WHAT WASN’T OBVIOUS TO ME

A parabola is the locus of points equidistant from a given  point (focus) and line (directrix).

parabola2

What makes the parabola interesting, in my opinion, is the interplay between the distance from a line (always perpendicular to some point C on the directrix) and the focus point (theoretically could point in any direction like a radius from a circle center).

What initially bothered me about the paper folding approach last week was that it focused entirely on perpendicular bisectors of the Focus-to-C segment (using the image above).  It was not immediately obvious to me at all that perpendicular bisectors of the Focus-to-C segment were 100% logically equivalent to the parabola’s definition.

SIMILARITY ADVANTAGES AND PEDAGOGY

I think I had two major advantages approaching this.

  1. I knew without a doubt that all parabolas are similar (there is a one-to-one mapping between every single point on any parabola and every single point on any other parabola), so I didn’t need to prove lots of cases.  Instead, I focused on the simplest version of a parabola (from my perspective), knowing that whatever I proved from that example was true for all parabolas.
  2. I am quite comfortable with my algebra, geometry, and technology skills.  Being able to wield a wide range of powerful exploration tools means I’m rarely intimidated by problems–even those I don’t initially understand.  I have the patience to persevere through lots of data and explorations until I find patterns and eventually solutions.

I love to understand ideas from multiple perspectives, so I rarely quit with my initial solution.  Perseverance helps me re-phrase ideas and explore them from alternative perspectives until I find prettier ways of understanding.

In my opinion, it is precisely this willingness to play, persevere, and explore that formalized education is broadly failing to instill in students and teachers.  “What if?” is the most brilliant question, and the one we sadly forget to ask often enough.

ALGEBRAIC PROOF

While I’m comfortable handling math in almost any representation, my mind most often jumps to algebraic perspectives first.  My first inclination was a coordinate proof.

PROOF 1:  As all parabolas are similar, it was enough to use a single, upward facing parabola with its vertex at the origin.  I placed the focus at (0,f), making the directrix the line y=-f.  If any point P on the parabola was (x_0,y_0), then the corresponding point C on the directrix, directly below P, was at (x_0,-f).

parabola3

From the parabola’s definition, the distance from the focus to P was identical to the length of CP:

\sqrt{(x_0-0)^2+(y_0-f)^2}=y_0+f

Squaring and combining common terms gives

x_0 ^2+y_0 ^2-2y_0f+f^2=y_0 ^2+2y_0f+f^2
x_0 ^2=4fy_0

But the construction above made lines (creases) on the perpendicular bisector of the focus-to-C segment.  This segment has midpoint \displaystyle \left( \frac{x_0}{2},0 \right) and slope \displaystyle -\frac{2f}{x_0}, so an equation for its perpendicular bisector is \displaystyle y=\frac{x_0}{2f} \left( x-\frac{x_0}{2} \right).

parabola4

Finding the point of intersection of the perpendicular bisector with the parabola involves solving a system of equations.

\displaystyle y=\frac{x_0}{2f} \left( x-\frac{x_0}{2} \right)=\frac{x^2}{4f}
\displaystyle \frac{1}{4f} \left( x^2-2x_0x+x_0 ^2 \right) =0
\displaystyle \frac{1}{4f} \left( x-x_0 \right) ^2 =0

So the only point where the line and parabola meet is at \displaystyle x=x_0–the very same point named by the parabola’s definition.  QED
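The tangency claim can also be checked numerically with exact arithmetic.  The sketch below is my own illustration rather than part of the original argument: the function names crease_line and tangency_check are hypothetical, and it assumes the same setup as Proof 1 (focus at (0,f), directrix y=-f, f not zero).  It rebuilds the crease from the two endpoints of the focus-to-C segment, then examines the quadratic obtained by setting the parabola equal to the crease.

```python
from fractions import Fraction


def crease_line(f, x0):
    """Slope and intercept of the perpendicular bisector of the segment
    from the focus F = (0, f) to C = (x0, -f)."""
    f, x0 = Fraction(f), Fraction(x0)
    dx, dy = x0 - 0, -f - f          # direction of segment FC
    mid_x, mid_y = x0 / 2, (f - f) / 2  # midpoint of FC is (x0/2, 0)
    slope = -dx / dy                 # perpendicular to FC (dy is never 0)
    intercept = mid_y - slope * mid_x
    return slope, intercept


def tangency_check(f, x0):
    """Solve x^2/(4f) = slope*x + intercept as a*x^2 + b*x + c = 0.

    Returns (discriminant, vertex_x).  A zero discriminant means the
    crease touches the parabola exactly once, at x = vertex_x."""
    f = Fraction(f)
    slope, intercept = crease_line(f, x0)
    a, b, c = 1 / (4 * f), -slope, -intercept
    discriminant = b * b - 4 * a * c
    return discriminant, -b / (2 * a)
```

For any nonzero f and any x0 you try, the discriminant should come back as exactly 0 and the single root as x0, matching the repeated factor in the last line of the algebra.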

PROOF 2:  All of this could have been brilliantly handled on a CAS to save time and avoid the manipulations.

parabola5

Notice that the y-coordinate of the final solution line is the same y_0 from above.

MORE ELEGANT GEOMETRIC PROOFS

I had a proof, but the algebra seemed more than necessary.  Surely there was a cleaner approach.

Parabola6

In the image above, F is the focus, and I is a point on the parabola.  If D is the midpoint of \overline{FC}, can I conclude \overline{ID} \perp \overline{FC} , proving that the perpendicular bisector of \overline{FC} always intersects the parabola?

PROOF 3:  The definition of the parabola gives \overline{FI} \cong \overline{IC}, and the midpoint gives \overline{FD} \cong \overline{DC}.  Because \overline{ID} is self-congruent, \Delta IDF \cong \Delta IDC by SSS, and corresponding parts make the supplementary \angle IDF \cong \angle IDC, so both must be right angles.  QED

PROOF 4:  Nice enough, but it still felt a little complicated.  I put the problem away to have dinner with my daughters and when I came back, I was able to see the construction not as two congruent triangles, but as the single isosceles \Delta FIC with base \overline{FC}.  In isosceles triangles, altitudes and medians coincide, automatically making \overline{ID} the perpendicular bisector of \overline{FC} .  QED

Admittedly, Proof 4 ultimately relies on the results of Proof 3, but the higher-level isosceles connection felt much more elegant.  I was satisfied.
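For readers who like to see numbers behind a geometric argument, here is a small sanity check of the facts Proofs 3 and 4 rely on.  It is only a sketch: the function name check_fold_geometry is hypothetical, and it borrows the coordinates from Proof 1 (F at (0,f), C at (x_0,-f), I on the parabola directly above C), which the geometric proofs themselves do not need.

```python
from fractions import Fraction


def check_fold_geometry(f, x0):
    """Return (FI^2 - IC^2, ID . FC) for the point I on y = x^2/(4f).

    Both values should be 0: triangle FIC is isosceles, and the
    median ID is perpendicular to the base FC."""
    f, x0 = Fraction(f), Fraction(x0)
    y0 = x0 * x0 / (4 * f)           # I = (x0, y0) lies on the parabola
    F, C, I = (Fraction(0), f), (x0, -f), (x0, y0)
    D = ((F[0] + C[0]) / 2, (F[1] + C[1]) / 2)  # midpoint of FC

    fi_sq = (I[0] - F[0]) ** 2 + (I[1] - F[1]) ** 2
    ic_sq = (I[0] - C[0]) ** 2 + (I[1] - C[1]) ** 2

    id_vec = (D[0] - I[0], D[1] - I[1])
    fc_vec = (C[0] - F[0], C[1] - F[1])
    dot = id_vec[0] * fc_vec[0] + id_vec[1] * fc_vec[1]
    return fi_sq - ic_sq, dot
```

Calling check_fold_geometry(2, 5), for instance, should return a pair of zeros.  A numeric check like this proves nothing by itself, but it is a quick way to confirm you have set up the picture correctly before writing the proof.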

TWO DYNAMIC GEOMETRY SOFTWARE VARIATIONS

Thinking how I could prompt students along this path, I first considered tracing the perpendicular lines from the initial procedure above (actually tangent lines to the parabola) to outline the parabola.  A video is below, and the GeoGebra file is here.

http://vimeo.com/89759785

It is a lovely approach, and I particularly love the way the parabola appears as a digital form of “string art.”  Still, I think it requires some additional thinking for users to believe the approach really does adhere to the parabola’s definition.

I created a second version that lets users set the location of the focus on the positive y-axis and use a slider to set the distances; it then constructs the parabola directly from the parabola’s definition.  [In the GeoGebra worksheet (here), you can turn on the hidden circle and lines to see how I constructed it.]  A video shows the symmetric points traced out as you drag the distance slider.

http://vimeo.com/89861149

A SIMPLIFIED PAPER PROCEDURE

Throughout this process, I realized that the location and spacing of the initial points on the directrix were irrelevant.  Creating the software versions of the problem helped me realize that if I could fold a point on the directrix to the focus, why not reverse the process and fold F to the directrix?  In fact, I could fold the paper so that F touched anywhere on the directrix and it would work.  So, here is the simplest version I could develop for the paper version.

  1. Use a straightedge and a Sharpie or thin marker to draw a line near the edge of a piece of paper.
  2. Place a point F roughly above the middle of the line toward the center of the paper.
  3. Fold the paper over so point F is on the line from step 1 and crease the paper along the fold.
  4. Open the paper back up and repeat step 3 several more times with F touching other parts of the step 1 line.
  5. All of the creases from steps 3 & 4 outline a curve.  Trace that curve to see a parabola.

This procedure works because you can fold the focus onto the directrix anywhere you like and the resulting crease will be tangent to the parabola defined by the directrix and focus.  By allowing the focus to “Travel along the Directrix”, you create the parabola’s locus.  Quite elegant, I thought.
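The paper procedure can be imitated in a few lines of code.  The sketch below is an illustration under my own assumptions, not the construction used in the GeoGebra files: it places F at (0,f) with the step 1 line at y=-f, treats each fold as the perpendicular bisector of the segment from F to the touch point (c,-f), and uses hypothetical names crease and crease_outline.  It finds where neighboring creases cross and measures how far each corner sits below the parabola y=x^2/(4f).

```python
from fractions import Fraction


def crease(f, c):
    """Slope and intercept of the crease made by folding F = (0, f)
    onto the point (c, -f) of the directrix."""
    f, c = Fraction(f), Fraction(c)
    slope = c / (2 * f)
    intercept = -slope * (c / 2)     # the crease passes through (c/2, 0)
    return slope, intercept


def crease_outline(f, touch_points):
    """Intersect each pair of neighboring creases.

    touch_points must be distinct x-coordinates on the directrix.
    Returns (x, y, gap) triples, where gap is the vertical distance
    from the corner up to the parabola y = x^2/(4f)."""
    f = Fraction(f)
    corners = []
    for c1, c2 in zip(touch_points, touch_points[1:]):
        m1, b1 = crease(f, c1)
        m2, b2 = crease(f, c2)
        x = (b2 - b1) / (m1 - m2)    # where the two creases cross
        y = m1 * x + b1
        gap = x * x / (4 * f) - y
        corners.append((x, y, gap))
    return corners
```

With f=1 and folds one unit apart, crease_outline(1, [-2, -1, 0, 1, 2]) should report the same gap of 1/16 at every corner.  Working the algebra by hand suggests the gap is (c1-c2)^2/(16f), so halving the spacing between folds cuts the gap to a quarter, which is why more creases give a crisper curve.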

ADDITIONAL POSSIBLE QUESTIONS

As I was playing with the different ways to create the parabola and thinking about the interplay between the two distances in the parabola’s definition, I wondered about the potential positions of the distance segments.

parabola2

  1. What is the shortest length of segment CP and where could it be located at that length?  What is the longest length of segment CP and where could it be located at that length?
  2. Obviously, point C can be anywhere along the directrix.  While the focus-to-P segment is theoretically free to rotate in any direction, the parabola definition makes that seem not practically possible.  So, through what size angle is the focus-to-P segment practically able to rotate?
  3. Assuming a horizontal directrix, what is the maximum slope the focus-to-P segment can achieve?
  4. Can you develop a single solution to questions 2 and 3 that doesn’t require any computations or constructions?

CONCLUSIONS

I fully realize that none of this is new mathematics, but I enjoyed the walk through pure mathematics and the satisfaction of developing ever simpler and more elegant solutions to the problem.  In the end, I now have a deeper and richer understanding of parabolas, and that was certainly worth the journey.

Teaching Creativity in Mathematics

This will be the first of two ‘blog posts on an activity that could promote creativity for elementary, middle school, and high school students.  A suggestion for parents and teachers is in the middle of this post.

ABOUT A DECADE AGO, I first discovered what I call the Four 4s activity.  In brief, the game says that using exactly four 4s (no more, no less, and no other digits) and any mathematical operation you want, you can create every integer from 1 to 100.  Two quick simple examples are \displaystyle 3= \frac{4+4+4}{4} and \displaystyle 16= 4\cdot 4+4-4.

As for mathematical operations, anything goes!  The basic +, -, *, / along with exponents, roots, decimals (4.4 or .4), concatenation (44), percentages, repeating decimals (.\overline{4}), and many more are legal.
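If you are curious how far the four basic operations alone can take you, a brute-force search is easy to sketch.  The code below is an illustration only, not how my students played: the function name four_fours is hypothetical, and it deliberately allows just +, -, *, and / with arbitrary grouping (no roots, decimals, concatenation, or factorials).

```python
from fractions import Fraction
from itertools import combinations


def four_fours(digit=4, count=4):
    """Map every value reachable from `count` copies of `digit` using
    only +, -, *, / to one expression that produces it."""
    found = {}

    def combine(items):
        # items is a list of (value, expression) pairs not yet merged
        if len(items) == 1:
            value, expr = items[0]
            found.setdefault(value, expr)
            return
        for i, j in combinations(range(len(items)), 2):
            (a, ea), (b, eb) = items[i], items[j]
            rest = [items[k] for k in range(len(items)) if k not in (i, j)]
            candidates = [
                (a + b, f"({ea}+{eb})"),
                (a - b, f"({ea}-{eb})"),
                (b - a, f"({eb}-{ea})"),
                (a * b, f"({ea}*{eb})"),
            ]
            if b != 0:
                candidates.append((a / b, f"({ea}/{eb})"))
            if a != 0:
                candidates.append((b / a, f"({eb}/{ea})"))
            for value, expr in candidates:
                combine(rest + [(value, expr)])

    combine([(Fraction(digit), str(digit))] * count)
    return found
```

After found = four_fours(), the reachable targets are sorted(int(v) for v in found if v.denominator == 1 and 1 <= v <= 100).  Expect gaps in that list; those gaps are exactly where the more creative operations earn their keep.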

At the time, I was teaching a 7th grade prealgebra course with several students who were struggling to master order of operations–that pesky, but critical mathematical grammar topic that bedevils some students through high school and beyond.  I thought it would be a good way to motivate some of my students to 1) be creative, and 2) improve their order of operations abilities to find numbers others hadn’t found or to find unique approaches to some numbers.

My students learned that even within the strict rules of mathematical grammar, there is lots of room for creativity.  Sometimes (often? usually?) there are multiple ways of thinking about a problem, some clever and some blunt but effective.  People deserve respect and congratulations for clever, simple, and elegant solutions.  Seeing how others solve one problem (or number) can often grant insights into how to find other nearby solutions.  Perhaps most importantly, they learned to a small degree how to deal with frustration and to not give up just because an answer didn’t immediately reveal itself.  It took us a few weeks, but we eventually completed with great communal satisfaction our 1-100 integer list.

PARENTS and TEACHERS:  Try this game with your young ones or pursue it just for the fun of a mental challenge.  See what variations you can create.  Compare your solutions with your child, children, or student(s).  From my experiences, this activity has led many younger students to ask how repeating decimals, factorials, and other mathematical operations work.  After all, now there’s a clear purpose to learning, even if only for a “game.”

I’ve created an easy page for you to record your solutions.

A FEW WEEKS AGO, I read a recent post from the always great MathMunch about the IntegerMania site and its additional restriction on the activity–an exquisiteness scale.  My interpretation of “exquisiteness” is that a ‘premium’ is awarded to solutions that express an integer in the simplest, cleanest way possible.  Just like a simple, elegant explanation that gets to the heart of a problem is often considered “better”, the exquisiteness scale rewards simple, elegant formulations of integers over more complex forms.  The scale also includes surcharges for functions which presume the presence of other numbers not required to be explicitly written in common notation (like the 1, 2, & 3 in 4!, the 0 in front of .4, and the infinite 4s in .\overline{4}).

In the past, I simply asked students to create solutions of any kind.  I recorded their variations on a class Web site.  Over the past three weeks, I renamed exquisiteness to “complexity” and re-ran Four 4s across all of my high school junior and senior classes, always accepting new formulations of numbers that hadn’t been found yet, and (paralleling IntegerMania’s example) allowed a maximum of 3 submissions per student per week to prevent a few super-active students from dominating the board.  Also following IntegerMania’s lead, I allowed any new submission to remain on the board for at least a week before it could be “sniped” by a “less complex” formulation.  I used differently colored index cards to indicate the base level of each submission.

Here are a few images of my students’ progress.  I opted for the physical bulletin board to keep the game and its advancements visible.  In the latter two images, you can see that, unlike IntegerMania, I layered later snipes of numbers so that the names of earlier submissions were still on the board, preserving the “first found” credit of the earliest formulations.  The boxed number in the upper left of each card is the complexity rating.

4s_1

4s_3

4s_2

The creativity output was strong, with contributions even from some who weren’t in my classes–friends of students curious about what their friends were so animatedly discussing.  Even my 3rd grade daughter offered some contributions, including a level 1.0 snipe, \displaystyle 5=\frac{4\cdot 4+4}{4} of a senior’s level 3.0 \displaystyle 5=4+\left( \frac{4}{4} \right)^4.  The 4th grade son of a colleague added several other formulations.

When obviously complicated solutions were posted early in a week, I heard several students discussing ways to snipe in less complex solutions.  Occasionally, students would find an integer using only three 4s and had to find ways to cleverly dispose of the extra digit.  One of my sometimes struggling regular calculus students did this by adding 4′, the derivative of a constant.  Another had already used a repeating decimal ( . \overline{4}), and realized she could just bury the extra 4 there ( .\overline{44}).  Two juniors dove into the complexity scale and learned more mathematics so they could deliberately create some of the most complicated solutions possible, even if just for a week before they were sniped.  Their ventures are the topic of my next post.

AFTERTHOUGHTS:  When I next use Four 4s with elementary or middle school students, I’m not sure I’d want to use the complexity scale.  I think getting lots of solutions visible and discussing the pros, cons, and insights of different approaches for those learning the grammar of mathematical operations would be far more valuable for that age.

The addition of the complexity scale definitely changed the game for my high school students.  Mine is a pretty academically competitive school, so most of the early energy went into finding snipes rather than new numbers.  I also liked how this game drove several conversations about mathematical elegance.

One conversation was particularly insightful.  My colleague’s 4th grade son proposed \displaystyle 1=\frac{44}{44} and argued that from his perspective, it was simpler than the level 1.0 \displaystyle \frac{4+4}{4+4} already on the board because his solution required two fewer operations.    From the complexity scale established at the start of the activity, his solution was a level 2.0 because it used concatenated 4s, but his larger point is definitely hard to refute and taught me that the next time I use this activity, I should engage my students in defining the complexity levels.

ADDENDA:

1) IntegerMania’s collection has extended the Four 4s list from 1 to well past 2000.  I wouldn’t have thought it possible to extend the streak so far, but the collection there shows a potential arrangement of Four 4s for every single integer from 1 up to 1137 before breaking.  Impressive.  Click here to see the list, but don’t look quite yet if you want to explore for yourself.

As a colleague noted, it would be cool for those involved in the contest to see how their potential solutions stacked up against those submitted from around the world.  Can you create solutions to rival those already posted?

2) IntegerMania has several other ongoing and semi-retired competitions along the same lines, including ones using Four 1s and Four 9s, and another using Ramanujan’s ‘famous’ taxi cab number, 1729.  I’ve convinced some of my students to make contributions.

Play these yourself or with colleagues, students, and/or your children.  Above all, have fun, be creative, and learn something new.

It’s amazing what can be built from the simplest of assumptions.  That, after all, is what mathematics is all about.