Tag Archives: factoring

Infinite Ways to an Infinite Geometric Sum

One of my students, K, and I were reviewing Taylor Series last Friday when she asked for a reminder why an infinite geometric series summed to $\displaystyle \frac{g}{1-r}$ for first term g and common ratio r when $\left| r \right| < 1$.  I was glad she was dissatisfied with blind use of a formula and dove into a familiar (to me) derivation.  In the end, she shook me free from my routine just as she made sure she didn’t fall into her own.

STANDARD INFINITE GEOMETRIC SUM DERIVATION

My standard explanation starts with a generic infinite geometric series.

$S = g+g\cdot r+g\cdot r^2+g\cdot r^3+...$  (1)

We can reason this series converges iff $\left| r \right| <1$ (see Footnote 1 for an explanation).  Assume this is true for (1).  Notice the terms on the right keep multiplying by r.

The annoying part of summing any infinite series is the ellipsis (…).  Any finite number of terms always has a finite sum, but that simply written yet vague ellipsis is logically difficult.  In the geometric series case, we might be able to handle the ellipsis by aligning terms in a similar series.  You can accomplish this by continuing the pattern on the right: multiplying both sides by r.

$r\cdot S = r\cdot \left( g+g\cdot r+g\cdot r^2+... \right)$

$r\cdot S = g\cdot r+g\cdot r^2+g\cdot r^3+...$  (2)

This seems to make the right side of (2) identical to the right side of (1) except for the leading g term of (1), but the ellipsis requires some careful treatment. Footnote 2 explains how the ellipses of (1) and (2) are identical.  After that is established, subtracting (2) from (1), factoring, and rearranging some terms leads to the infinite geometric sum formula.

$(1)-(2) = S-S\cdot r = S\cdot (1-r)=g$

$\displaystyle S=\frac{g}{1-r}$
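For skeptics of the algebra, a quick numeric check makes the formula concrete.  This is a stdlib-only Python sketch (the values g = 5 and r = 0.3 are arbitrary illustrative choices, not from the derivation above):

```python
# Numeric check of S = g / (1 - r): with |r| < 1, the tail of the series
# shrinks geometrically, so a few hundred terms land within float precision.
g, r = 5.0, 0.3  # arbitrary example values with |r| < 1
partial = sum(g * r ** n for n in range(200))
closed_form = g / (1 - r)
print(partial, closed_form)  # both approximately 7.142857
```

The 200-term cutoff stands in for the ellipsis; the omitted tail is smaller than float round-off here.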

STUDENT PREFERENCES

I despise giving any formula to any of my classes without at least exploring its genesis.  I also allow my students to use any legitimate mathematics to solve problems so long as reasoning is justified.

In my experiences, about half of my students opt for a formulaic approach to infinite geometric sums while an equal number prefer the quick “multiply-by-r-and-subtract” approach used to derive the summation formula.  For many, apparently, the dynamic manipulation is more meaningful than a static rule.  It’s very cool to watch student preferences at play.

K’s VARIATION

K understood the proof, and then asked a question I hadn’t thought to ask.  Why did we have to multiply by r?  Could multiplication by $r^2$ also determine the summation formula?

I had three nearly simultaneous thoughts followed quickly by a fourth.  First, why hadn’t I ever thought to ask that?  Second, geometric series for $\left| r \right|<1$ are absolutely convergent, so K’s suggestion should work.  Third, while the formula would initially look different, absolute convergence guaranteed that whatever the “$r^2$ formula” looked like, it had to be algebraically equivalent to the standard form.  While I considered those conscious questions, my math subconscious quickly saw the easy resolution to K’s question and the equivalence from Thought #3.

Multiplying (1) by $r^2$ gives

$r^2 \cdot S = g\cdot r^2 + g\cdot r^3 + ...$ (3)

and the ellipses of (1) and (3) partner perfectly (Footnote 2), so K subtracted, factored, and simplified to get the inevitable result.

$(1)-(3) = S-S\cdot r^2 = g+g\cdot r$

$S\cdot \left( 1-r^2 \right) = g\cdot (1+r)$

$\displaystyle S=\frac{g\cdot (1+r)}{1-r^2} = \frac{g\cdot (1+r)}{(1+r)(1-r)} = \frac{g}{1-r}$

That was cool, but this success meant that there were surely many more options.

EXTENDING

Why stop at multiplying by r or $r^2$?  Why not multiply both sides of (1) by a generic $r^N$ for any natural number N?   That would give

$r^N \cdot S = g\cdot r^N + g\cdot r^{N+1} + ...$ (4)

where the ellipses of (1) and (4) are again identical by the method of Footnote 2.  Subtracting (4) from (1) gives

$(1)-(4) = S-S\cdot r^N = g+g\cdot r + g\cdot r^2+...+ g\cdot r^{N-1}$

$S\cdot \left( 1-r^N \right) = g\cdot \left( 1+r+r^2+...+r^{N-1} \right)$  (5)

There are two ways to proceed from (5).  You could recognize the right side as a finite geometric sum with first term 1 and ratio r.  Substituting that formula and dividing by $\left( 1-r^N \right)$ would give the general result.

Alternatively, I could see students exploring $\left( 1-r^N \right)$, and discovering by hand or by CAS that $(1-r)$ is always a factor.  I got the following TI-Nspire CAS result in about 10-15 seconds, clearly suggesting that

$1-r^N = (1-r)\left( 1+r+r^2+...+r^{N-1} \right)$.  (6)

Math induction or a careful polynomial expansion of (6) would prove the pattern suggested by the CAS.  From there, dividing both sides of (5) by $\left( 1-r^N \right)$ gives the generic result.

$\displaystyle S = \frac{g\cdot \left( 1+r+r^2+...+r^{N-1} \right)}{\left( 1-r^N \right)}$

$\displaystyle S = \frac{g\cdot \left( 1+r+r^2+...+r^{N-1} \right) }{(1-r) \cdot \left( 1+r+r^2+...+r^{N-1} \right)} = \frac{g}{1-r}$
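Identity (6) is also easy to confirm computationally for small N.  This stdlib-only Python sketch multiplies coefficient lists (the list index plays the role of the power of r):

```python
# Verify 1 - r^N = (1 - r)(1 + r + ... + r^(N-1)) by expanding coefficient
# lists, where the index of each list entry is the power of r.
def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

for N in range(1, 11):
    lhs = [1] + [0] * (N - 1) + [-1]      # 1 - r^N
    rhs = poly_mul([1, -1], [1] * N)      # (1 - r)(1 + r + ... + r^(N-1))
    assert lhs == rhs
print("identity (6) verified for N = 1 to 10")
```

Of course, this only checks finitely many cases; the induction or careful expansion mentioned above is what proves the pattern for all N.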

In the end, K helped me see there wasn’t just my stock approach to an infinite geometric sum, but really an infinite number of parallel ways.  Nice.

FOOTNOTES

1) RESTRICTING r:  Obviously an infinite geometric series diverges for $\left| r \right| >1$ because that would make $\left| g\cdot r^n \right| \rightarrow \infty$ as $n\rightarrow \infty$, and adding infinitely large terms (positive or negative) to any sum ruins any chance of finding a finite sum.

For $r=1$, the sum converges iff $g=0$ (a rather boring series).  If $g \ne 0$, you get a sum of infinitely many copies of some nonzero quantity, and that is always infinite, no matter how small or large the nonzero quantity.

The last case, $r=-1$, is more subtle.  For $g \ne 0$, the terms of this series alternate between positive and negative g, making the partial sums of the series add to either g or 0, depending on whether you have summed an even or an odd number of terms.  Since the partial sums alternate, the overall sum is divergent.  Remember that series sums and limits are functions; without a single numeric output at a particular point, the function value at that point is considered to be non-existent.

2) NOT ALL INFINITIES ARE THE SAME:  There are two ways to show two groups are the same size.  The obvious way is to count the elements in each group and find out there is the same number of elements in each, but this works only if you have a finite group size.  Alternatively, you could a) match every element in group 1 with a unique element from group 2, and b) match every element in group 2 with a unique element from group 1.  It is important to do both steps here to show that there are no left-over, unpaired elements in either group.

So do the ellipses in (1) and (2) represent the same sets?  As the ellipses represent sets with an infinite number of elements, the first comparison technique is irrelevant.  For the second approach using pairing, we need to compare individual elements.  For every element in the ellipsis of (1), there is obviously a “partner” in (2), as the multiplication of (1) by r visually shifts all of the terms of the series right one position, creating the necessary matches.

Students often are troubled by the second matching as it appears the ellipsis in (2) contains an “extra term” from the right shift.  But, for every specific term you identify in (2), its identical twin exists in (1).  In the weirdness of infinity, that “extra term” appears to have been absorbed without changing the “size” of the infinity.

Since there is a 1:1 mapping of all elements in the ellipses of (1) and (2), you can conclude they are identical, and their difference is zero.

Probability, Polynomials, and Sicherman Dice

Three years ago, I encountered a question on the TI-Nspire Google group asking if there was a way to use CAS to solve probability problems.  The ideas I pitched in my initial response and follow-up a year later (after first using it with students in a statistics class) have been thoroughly re-confirmed in my first year teaching AP Statistics.  I’ll quickly re-share them below before extending the concept with ideas I picked up a couple weeks ago from Steve Phelps’ session on Probability, Polynomials, and CAS at the 64th annual OCTM conference earlier this month in Cleveland, OH.

BINOMIALS:  FROM POLYNOMIALS TO SAMPLE SPACES

Once you understand them, binomial probability distributions aren’t that difficult, but the initial conjoining of combinatorics and probability makes this a perennially difficult topic for many students.  The standard formula for the probability of K successes in N attempts of a binomial situation, where p is the probability of a single success in a single attempt, is no less daunting:

$\displaystyle \left( \begin{matrix} N \\ K \end{matrix} \right) p^K (1-p)^{N-K} = \frac{N!}{K! (N-K)!} p^K (1-p)^{N-K}$

But that is almost exactly the same result one gets by raising binomials to whole number powers, so why not use a CAS to expand a polynomial and at least compute the $\displaystyle \left( \begin{matrix} N \\ K \end{matrix} \right)$ portion of the probability?  One added advantage of using a CAS is that you could use full event names instead of abbreviations, making it even easier to identify the meaning of each event.

The TI-Nspire output above shows the entire sample space resulting from flipping a coin 6 times.  Each term is an event.  Within each term, the exponent of each variable notes the number of times that variable occurs and the coefficient is the number of times that combination occurs.  The overall exponent in the expand command is the number of trials.  For example, the middle term– $20\cdot heads^3 \cdot tails^3$ –says that there are 20 ways you could get 3 heads and 3 tails when tossing a coin 6 times. The last term is just $tails^6$, and its implied coefficient is 1, meaning there is just one way to flip 6 tails in 6 tosses.
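Readers without a CAS handy can recover the same coefficients from Python's standard library (a sketch of the counts only, not the full symbolic expansion):

```python
from math import comb

# Coefficients of (heads + tails)^6: comb(6, k) counts the ways to choose
# which k of the 6 flips land heads.
coeffs = [comb(6, k) for k in range(7)]
print(coeffs)  # [1, 6, 15, 20, 15, 6, 1]
```

The middle entry, 20, is the coefficient of $heads^3 \cdot tails^3$ described above, and the final 1 is the lone way to flip 6 tails.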

The expand command makes more sense than memorized algorithms and provides context to students until they gain a deeper understanding of what’s actually going on.

FROM POLYNOMIALS TO PROBABILITY

Still using the expand command, if each variable is preceded by its probability, the CAS result combines the entire sample space AND the corresponding probability distribution function.  For example, when rolling a fair die four times, the distribution for 1s vs. not 1s (2, 3, 4, 5, or 6) is given by

The highlighted term says there is a 38.58% chance that there will be exactly one 1 and any three other numbers (2, 3, 4, 5, or 6) in four rolls of a fair 6-sided die.  The probabilities of the other four events in the sample space are also shown.  Within the TI-Nspire (CAS or non-CAS), one could use a command to give all of these probabilities simultaneously (below), but then one has to remember whether the non-contextualized probabilities are for increasing or decreasing values of which binomial outcome.
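The same distribution can be reproduced without a CAS.  This stdlib-only sketch mirrors the expansion of $\left( \frac{1}{6}\cdot one + \frac{5}{6}\cdot other \right)^4$, using exact fractions to avoid rounding:

```python
from fractions import Fraction
from math import comb

# P(exactly k ones in 4 rolls) = C(4, k) * (1/6)^k * (5/6)^(4-k)
p = Fraction(1, 6)
dist = {k: comb(4, k) * p ** k * (1 - p) ** (4 - k) for k in range(5)}
print(round(float(dist[1]), 4))  # 0.3858 -- exactly one 1 in four rolls
```

Like the highlighted CAS term, the k = 1 entry gives the 38.58% chance of exactly one 1 in four rolls, and the five probabilities sum to 1.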

Particularly early on in their explorations of binomial probabilities, students I’ve taught have shown a very clear preference for the polynomial approach, even when allowed to choose any approach that makes sense to them.

TAKING POLYNOMIALS FROM ONE DIE TO MANY

Given these earlier thoughts, I was naturally drawn to Steve Phelps “Probability, Polynomials, and CAS” session at the November 2014 OCTM annual meeting in Cleveland, OH.  Among the ideas he shared was using polynomials to create the distribution function for the sum of two fair 6-sided dice.  My immediate thought was to apply my earlier ideas.  As noted in my initial post, the expansion approach above is not limited to binomial situations.  My first reflexive CAS command in Steve’s session, before he shared anything, was this.

By writing the outcomes in words, the CAS interprets them as variables.  I got the entire sample space, but didn’t gain anything beyond a long polynomial.  The first output– $five^2$ –with its implied coefficient of 1 says there is 1 way to get 2 fives.  The second term– $2\cdot five \cdot four$ –says there are 2 ways to get 1 five and 1 four.  It’s nice that the technology gives me all the terms so quickly, but it doesn’t help me get a distribution function of the sum.  I got the distributions of the specific outcomes, but the way I defined the variables didn’t permit summing their actual numerical values.  Time to listen to the speaker.

He suggested using a common variable, X, for all faces with the value of each face expressed as an exponent.  That is, a standard 6-sided die would be represented by $X^1+X^2+ X^3+X^4+X^5+X^6$ where the six different exponents represent the numbers on the six faces of a typical 6-sided die.  Rolling two such dice simultaneously is handled as I did earlier with the binomial cases.

NOTE:  Exponents are handled in TWO different ways here.  1) Within a single polynomial, an exponent is an event value, and 2) Outside a polynomial, an exponent indicates the number of times that polynomial is applied within the specific event.  Coefficients have the same meaning as before.

Because the variables are now the same, when specific terms are multiplied, their exponents (face values) will be added–exactly what I wanted to happen.  That means the sum of the faces when you roll two dice is determined by the following.

Notice that the output is a single polynomial.  Therefore, the exponents are the values of individual cases.  For a couple examples, there are 3 ways to get a sum of 10 $\left( 3 \cdot x^{10} \right)$, 2 ways to get a sum of 3 $\left( 2 \cdot x^3 \right)$, etc.  The most commonly occurring outcome is the term with the largest coefficient.  For rolling two standard fair 6-sided dice, a sum of 7 is the most common outcome, occurring 6 times $\left( 6 \cdot x^7 \right)$.  That certainly simplifies the typical 6×6 tables used to compute the sums and probabilities resulting from rolling two dice.
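For anyone without a CAS at hand, the same expansion is a short exercise in multiplying coefficient lists (a stdlib-only Python sketch; the list index plays the role of the exponent of x):

```python
# Distribution of the sum of two fair 6-sided dice as a polynomial product;
# index = exponent of x = sum of the faces, entry = number of ways.
def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

die = [0, 1, 1, 1, 1, 1, 1]      # x + x^2 + ... + x^6
two_dice = poly_mul(die, die)
print(two_dice[7], two_dice[10], two_dice[3])  # 6 3 2 ways for sums 7, 10, 3
```

The convolution inside `poly_mul` is exactly the “exponents add when terms multiply” behavior that makes this representation work.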

While not the point of Steve’s talk, I immediately saw that technology had just opened the door to problems that had been computationally inaccessible in the past.  For example, what is the most common sum when rolling 5 dice and what is the probability of that sum?  On my CAS, I entered this.

In the middle of the expanded polynomial are two terms with the largest coefficients, $780 \cdot x^{17}$ and $780 \cdot x^{18}$, meaning sums of 17 and 18 are the most common, equally likely outcomes when rolling 5 dice.  As there are $6^5=7776$ possible outcomes when rolling a die 5 times, the probability of each of these is $\frac{780}{7776} \approx 0.1003$, or about a 10.03% chance each for a sum of 17 or 18.  This can be verified by inserting the probabilities as coefficients before each term before CAS expanding.
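A brute-force tally over all $6^5$ outcomes (a stdlib-only sketch, independent of the CAS) confirms the count of 780 for each of the two most common sums:

```python
from collections import Counter
from itertools import product

# Tally the sum of every one of the 6^5 = 7776 possible rolls of 5 dice.
sums = Counter(sum(roll) for roll in product(range(1, 7), repeat=5))
print(sums[17], sums[18], sum(sums.values()))  # 780 780 7776
```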

With thought, this shouldn’t be surprising as the expected mean value of rolling a 6-sided die many times is 3.5, and $5 \cdot 3.5 = 17.5$, so the integers on either side of 17.5 (17 & 18) should be the most common.  Technology confirms intuition.

ROLLING DIFFERENT DICE SIMULTANEOUSLY

What is the distribution of sums when rolling a 4-sided and a 6-sided die together?  No problem.  Just multiply two different polynomials, one representative of each die.

The output shows that sums of 5, 6, and 7 would be the most common, each occurring four times with probability $\frac{1}{6}$ and together accounting for half of all outcomes of rolling these two dice together.
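The same coefficient-list multiplication works for mismatched dice; this stdlib sketch reproduces that output:

```python
# Sum distribution for a 4-sided die rolled with a 6-sided die;
# index = exponent of x (the sum), entry = number of ways.
def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

d4 = [0, 1, 1, 1, 1]              # x + x^2 + x^3 + x^4
d6 = [0, 1, 1, 1, 1, 1, 1]        # x + x^2 + ... + x^6
mix = poly_mul(d4, d6)
print(mix[5], mix[6], mix[7], sum(mix))  # 4 4 4 24: each sum is 4/24 = 1/6
```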

A BEAUTIFUL EXTENSION–SICHERMAN DICE

My most unexpected gain from Steve’s talk happened when he asked if we could get the same distribution of sums as “normal” 6-sided dice, but from two different 6-sided dice.  The only restriction he gave was that all of the faces of the new dice had to have positive values.  This can be approached by realizing that the distribution of sums of the two normal dice can be found by multiplying two representative polynomials to get

$x^{12}+2x^{11}+3x^{10}+4x^9+5x^8+6x^7+5x^6+4x^5+3x^4+2x^3+x^2$.

Restating the question in the terms of this post, are there two other polynomials that could be multiplied to give the same product?  That is, does this polynomial factor into other polynomials that could multiply to the same product?  A CAS factor command gives

Any rearrangement of these eight (four distinct) sub-polynomials would create the same distribution as the sum of two dice, but what would the separate sub-products mean in terms of the dice?  As a first example, what if the first two expressions were used for one die (line 1 below) and the two squared trinomials comprised a second die (line 2)?

Line 1 actually describes a 4-sided die with one face of 4, two faces with 3s, and one face of 2.  Line 2 describes a 9-sided die (whatever that is) with one face of 8, two faces of 6, three faces of 4, two faces of 2, and one face with a 0 ( $1=1 \cdot x^0$).  This means rolling a 4-sided and a 9-sided die as described would give exactly the same sum distribution.  Cool, but not what I wanted.  Now what?

Factorization gave four distinct sub-polynomials, each with multiplicity 2.  One die could contain 0, 1, or 2 of each of these with the remaining factors on the other die.  That means there are $3^4=81$ different possible dice combinations.  I could continue with a trial-and-error approach, but I wanted to be more efficient and elegant.

What follows is the result of thinking about the problem for a while.  Like most math solutions to interesting problems, ultimate solutions are typically much cleaner and more elegant than the thoughts that went into them.  Problem solving is a messy–but very rewarding–business.

SOLUTION

Here are my insights over time:

1) I realized that the $x^2$ term would raise the power (face values) of the desired dice, but would not change the coefficients (number of faces).  Because Steve asked for dice with all positive face values, each desired die had to have at least one factor of x to prevent non-positive face values.

2) My first attempt didn’t create 6-sided dice.  The sum of the coefficients of a die’s sub-polynomial determines its number of sides, and that sum can also be found by substituting $x=1$ into the sub-polynomial.  I want 6-sided dice, so the final coefficients of each die must add to 6, which means the coefficient sums of the factors assigned to each die must multiply to 6.  The coefficients of $(x+1)$ add to 2, those of $\left( x^2+x+1 \right)$ add to 3, and those of $\left( x^2-x+1 \right)$ add to 1.  The only way to get a polynomial coefficient sum of 6 (and thereby create 6-sided dice) is for each die to have one $(x+1)$ factor and one $\left( x^2+x+1 \right)$ factor.

3) That leaves the two $\left( x^2-x+1 \right)$ factors.  They could split between the two dice or both could be on one die, leaving none on the other.  We’ve already determined that each die had to have one each of the x, $(x+1)$, and $\left( x^2+x+1 \right)$ factors.  To also split the $\left( x^2-x+1 \right)$ factors would result in the original dice:  two normal 6-sided dice.  If I want different dice, I have to load both of these factors on one die.

That means there is ONLY ONE POSSIBLE alternative for two 6-sided dice that have the same sum distribution as two normal 6-sided dice.

One die would have single faces of 8, 6, 5, 4, 3, and 1.  The other die would have one 4, two 3s, two 2s, and one 1.  And this is exactly the result of the famous(?) Sicherman Dice.
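That claim is easy to double-check with coefficient lists (stdlib only): the product of the two Sicherman polynomials matches the standard two-dice polynomial exactly.

```python
# Sicherman dice {1,2,2,3,3,4} and {1,3,4,5,6,8} versus two standard dice;
# the index of each list is a face value, the entry is how many faces show it.
def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

sicherman_a = [0, 1, 2, 2, 1]                  # x + 2x^2 + 2x^3 + x^4
sicherman_b = [0, 1, 0, 1, 1, 1, 1, 0, 1]      # x + x^3 + x^4 + x^5 + x^6 + x^8
standard = [0, 1, 1, 1, 1, 1, 1]
assert poly_mul(sicherman_a, sicherman_b) == poly_mul(standard, standard)
print("same sum distribution as two normal dice")
```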

If a 0 face value was allowed, shift one factor of x from one polynomial to the other.  This can be done two ways.

The first possibility has dice with faces {9, 7, 6, 5, 4, 2} and {3, 2, 2, 1, 1, 0}, and the second has faces {7, 5, 4, 3, 2, 0} and {5, 4, 4, 3, 3, 2}, giving the only other two non-negative solutions to the Sicherman Dice.

Both of these are nothing more than adding one to all faces of one die and subtracting one from all faces of the other.  While polynomials are not necessary to compute these variations, the shifts are equivalent to multiplying the polynomial of one die by x and the other by $\frac{1}{x}$ as many times as desired. That means there are an infinite number of 6-sided dice with the same sum distribution as normal 6-sided dice if you allow the sides to have negative faces.  One of these is

corresponding to a pair of Sicherman Dice with faces {6, 4, 3, 2, 1, -1} and {6, 5, 5, 4, 4, 3}.

CONCLUSION:

There are other very interesting properties of Sicherman Dice, but this is already a very long post.  In the end, there are tremendous connections between probability and polynomials that are accessible to students at the secondary level and beyond.  And CAS keeps the focus on student learning and away from the manipulations that aren’t even the point in these explorations.

Enjoy.

Powers of 2

Yesterday, James Tanton posted a fun little problem on Twitter:

So, 2 is one more than $1=1^2$, and 8 is one less than $9=3^2$, and Dr. Tanton wants to know if there are any other powers of two that are within one unit of a perfect square.

While this problem may not have any “real-life relevance”, it demonstrates what I describe as the power and creativity of mathematics.  Among the infinite number of powers of two, how can someone know for certain if any others are or are not within one unit of a perfect square?  No one will ever be able to see every number in the list of powers of two, but variables and mathematics give you the tools to deal with all possibilities at once.

For this problem, let D and N be positive integers.  Translated into mathematical language, Dr. Tanton’s problem is equivalent to asking if there are values of D and N for which $2^D=N^2 \pm 1$.  With a single equation in two unknowns, this is where observation and creativity come into play.  I suspect there may be more than one way to approach this, but my solution follows.  Don’t read any further if you want to solve this for yourself.

Because D and N are positive integers, the left side of $2^D=N^2 \pm 1$ is always even.  That means $N^2$, and therefore N, must be odd.

Because N is odd, I know $N=2k+1$ for some whole number k.  Rewriting our equation gives $2^D=(2k+1)^2 \pm 1$, and the right side equals either $4k^2+4k$ or $4k^2+4k+2$.

Factoring the first expression gives $2^D=4k^2+4k=4k(k+1)$.   Notice that this right side is the product of two consecutive integers, k and $k+1$, so one of these factors (even though I don’t know which one) must be an odd number.  The only odd number that is a factor of a power of two is 1, so either $k=1$ or $k+1=1 \rightarrow k=0$.  The case $k=0$ makes the right side 0, which is not a power of two, so it gives no solution.  That leaves $k=1 \longrightarrow N=3 \longrightarrow 2^D=8 \longrightarrow D=3$, one of the solutions Dr. Tanton gave.  No other possibilities arise from this expression, no matter how far down the list of powers of two you want to go.

But what about the other expression?  Factoring again gives $2^D=4k^2+4k+2=2 \cdot \left( 2k^2+2k+1 \right)$.  The expression in parentheses must be odd because its first two terms are both multiplied by 2 (making them even) and then one is added (making the overall sum odd).  Again, 1 is the only odd factor of a power of two, and here that happens only when $2k^2+2k+1=1 \longrightarrow k=0 \longrightarrow N=1 \longrightarrow 2^D=2 \longrightarrow D=1$, giving Dr. Tanton’s other solution, $2=1^2+1$.

Because no other algebraic solutions are possible, the two solutions Dr. Tanton gave in the problem statement are the only two times in the entire universe of perfect squares and powers of two where elements of those two lists are within a single unit of each other.
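For the computationally inclined, a quick brute-force scan (stdlib only) over the first 200 powers of two agrees with the algebra:

```python
from math import isqrt

# Find every D for which 2^D is within one unit of a positive perfect square.
hits = [
    D for D in range(1, 201)
    if any(c > 0 and isqrt(c) ** 2 == c for c in (2 ** D - 1, 2 ** D + 1))
]
print(hits)  # [1, 3] -> only 2^1 = 2 and 2^3 = 8
```

Unlike the proof above, the scan only covers finitely many powers of two, but it is a reassuring check that nothing was missed in that range.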

Math is sweet.