From Coins to Magic

Here’s a great problem or trick for a class exploration … or magic for parties.

DO THIS YOURSELF.

Grab a small handful of coins (it doesn’t matter how many), randomly flip them onto a flat surface, and count the number of tails.

Randomly pull off from the group into a separate pile the number of coins equal to the number of tails you just counted.  Turn over every coin in this new pile.

Count the number of tails in each pile.

You got the same number both times!

Why?

Marilyn vos Savant posed a similar problem:

Say that a hundred pennies are on a table. Ninety of them are heads. With your eyes closed, can you separate all the coins into two groups so that each group has the same number of tails?

Savant’s solution is to pull any random 10 coins from the 100 and make a second pile.  Turn all the coins in the new pile over, et voila!  Both piles have an equal number of tails.

While Savant’s approach is much more prescriptive than mine, both solutions work.  Every time.  WHY?

THIS IS STRANGE:

You have no idea the state (heads or tails) of any of the coins you pull into the second pile.  It’s counterintuitive that the two piles could ever contain the same number of tails.

Also, flipping the coins in the new pile seems completely arbitrary, and yet after any random pull & flip, the two resulting piles always hold the same number of tails.

Enter the power (and for young people, the mystery) of algebra to generalize a problem, supporting an argument that holds for all possibilities simultaneously.

HOW IT WORKS:

The first clue to this is the misdirection in Savant’s question.  Told that there are 90 heads, you are asked to make the number of tails equivalent.  In both versions, the number of TAILS in the original pile is the number of coins pulled into the second pile.  This isn’t a coincidence; it’s the key to the solution.

In any pile of randomly flipped coins (they needn’t be all or even part pennies), let N be the number of tails.  Create your second pile by pulling N random coins from the initial pile.  Because the coins are randomly selected, you don’t know how many tails are in the new pile, so let that unknown number of tails be X.  That means $0 \le X \le N$, leaving $N-X$ tails in the first pile, and $N-X$ heads in the new pile.  (Make sure you understand that last bit: the new pile holds N coins, X of them tails, so the other $N-X$ must be heads.)  That means if you flip all the coins in the second pile, those heads will become tails, and you are guaranteed exactly $N-X$ tails in both piles.
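The argument above can be tried out numerically.  Here is a minimal Python sketch, assuming fair coins and a uniformly random pull; the function name `coin_trick` and the pile sizes are my own choices for illustration.

```python
import random


def coin_trick(num_coins, rng):
    """Run the trick once; return (tails in first pile, tails in second pile)."""
    # Randomly flip every coin: "H" for heads, "T" for tails.
    coins = [rng.choice("HT") for _ in range(num_coins)]
    n_tails = coins.count("T")

    # Shuffling and taking the first n_tails coins is a random pull of N coins.
    rng.shuffle(coins)
    pulled, remaining = coins[:n_tails], coins[n_tails:]

    # Turn over every coin in the new pile.
    flipped = ["T" if coin == "H" else "H" for coin in pulled]
    return remaining.count("T"), flipped.count("T")


rng = random.Random(0)
results = [coin_trick(rng.randint(1, 50), rng) for _ in range(1000)]
print(all(first == second for first, second in results))
```

Every trial should report matching tail counts, whatever the starting number of coins.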

Cool facts:

• You can’t say with certainty how many tails will be in both piles, but you know they will be the same.
• The total number of coins you start with is completely irrelevant.
• While the given two versions of the problem make piles with equal numbers of tails, this “trick” can balance heads or tails.  To balance heads instead, pull from the initial coins into a second pile the number of heads.  When you flip all the coins in the second pile, both piles will now contain the same number of heads.

A PARTY WONDER or SOLID PROBLEM FOR AN ALGEBRA CLASS:

If you work on your showmanship, you can baffle others with this.  For my middle school daughter, I counted off the “leave alone” pile and then flipped the second pile.  I also let her flip the initial set of coins and tell me each time whether she wanted me to get equal numbers of heads or tails.  I looked away as she shuffled the coins and pulled off the requisite number of coins without looking.

She’s figured out HOW I do it, but as she is just starting algebra, she doesn’t have the abstractness yet to fully generalize the big solution.  She’ll get there.

I could see this becoming a fun data-gathering project for an algebra class.  It would be cool to see how someone approaches this with a group of students.

PowerBall Redux

Donate to a charity instead.  Let me explain.
The majority of responses to my PowerBall description/warnings yesterday have been, “If you don’t play, you can’t win.”  Unfortunately, I know many, many people are buying many lottery tickets, way more than they should.

OK.  For almost everyone, there’s little harm in spending $2 on a ticket for the entertainment, but don’t expect to win, and don’t buy multiple tickets unless you can afford to do without every dollar you spend.  I worry about those who are “investing” tens or hundreds of dollars on any lottery.

Two of my school colleagues captured the idea of a lottery yesterday with their analogies.

Steve:  Suppose you go to the beach and grab a handful of sand and bring it back to your house.  And you do that every single day.  Then your odds of winning the powerball are still slightly worse than picking out one particular grain of sand from all the sand you accumulated over an entire year.

Or, more simply put, from the perspective of a lottery official,

Patrick:  Here’s our idea.  You guys all throw your money in a big pile.  Then, after we take some of it, we’ll give the pile to just one of you.

WHY YOU SHOULDN’T BUY MULTIPLE TICKETS:

For perspective, a football field is 120 yards long, or 703.6 US dollars long using the logic of my last post.  Rounding up to $704, that would buy you 352 PowerBall tickets.  That means investing $704 would buy you a single football field length of chances in 10.5 coast-to-coast traverses of the entire United States.  There’s going to be an incredibly large number of disappointed people tomorrow.

MORAL:  Even an incredibly large multiple of a less-than-microscopic chance is still a less-than-microscopic chance.

BETTER IDEA:  Assume you have the resources and are willing to part with tens or hundreds of dollars for no likelihood of tangible personal gain.  Using the $704 football example, buy 2 tickets and donate the other $700 to charity.  You’ll do much more good.
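The football-field arithmetic is quick to reproduce.  This is a small sketch using only the figures quoted in these posts (a 6.14-inch dollar bill, $2 per ticket, and 292,201,338 possible outcomes); the variable names are mine.

```python
import math

BILL_LENGTH_INCHES = 6.14        # dollar bill length used in these posts
TICKET_PRICE_DOLLARS = 2
POWERBALL_OUTCOMES = 292_201_338

# A 120-yard football field measured in dollar bills laid end-to-end.
field_in_bills = 120 * 36 / BILL_LENGTH_INCHES

# Round up to whole dollars, then convert dollars to tickets.
dollars = math.ceil(field_in_bills)
tickets = dollars // TICKET_PRICE_DOLLARS

# Chance that at least one of those distinct tickets hits the jackpot.
chance = tickets / POWERBALL_OUTCOMES

print(round(field_in_bills, 1), dollars, tickets, chance)
```

Even with all 352 tickets, the chance of a jackpot stays near 1.2 in a million.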

PowerBall Math

Given the record size and mania surrounding the current PowerBall Lottery, I thought some of you might be interested in bringing that game into perspective.  This could be an interesting application with some teachers and students.

It certainly is entertaining for many to dream about what you would do if you happened to be lucky enough to win an astronomical lottery.  And lottery vendors are quick to note that your dreams can’t come true if you don’t play.  Nice advertising.  I’ll let the numbers speak to the veracity of the Lottery’s encouragement.

PowerBall is played by picking any 5 different numbers between 1 & 69, and then one PowerBall number between 1 & 26.  So there are $nCr(69,5)*26=292,201,338$ outcomes for this game.  Unfortunately, humans have a particularly difficult time understanding extremely large numbers, so I offer an analogy to bring it a little into perspective.

• The horizontal width of the United States is generally reported to be 2680 miles, and a U.S. dollar bill is 6.14 inches wide.  That means the U.S. is approximately 27,655,505 dollar bills wide.
• If I have 292,201,338 dollar bills (one for every possible PowerBall outcome), I could make a line of dollar bills placed end-to-end from the U.S. East Coast all the way to the West Coast, back to the East, back to the West, and so forth, passing back and forth between the two coasts just over 10.5 times.
• Now imagine that exactly one of those dollar bills was replaced with a replica dollar bill made from gold colored paper.

Your chances of winning the PowerBall lottery are the same as randomly selecting that single gold note from all of those dollar bills laid end-to-end and crossing the entire breadth of the United States 10.5 times.
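For anyone who wants to check the numbers in the analogy, here is a short Python sketch.  It assumes only the figures given above (2680 miles, 6.14 inches per bill); the variable names are my own.

```python
from math import comb

# 5 distinct numbers from 69, then 1 PowerBall number from 26.
outcomes = comb(69, 5) * 26

# Width of the United States measured in dollar bills laid end-to-end.
INCHES_PER_MILE = 5280 * 12
bills_across_us = 2680 * INCHES_PER_MILE / 6.14

# How many coast-to-coast passes one bill per outcome would cover.
crossings = outcomes / bills_across_us

print(outcomes, round(bills_across_us), round(crossings, 2))
```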

Dreaming is fun, but how likely is this particular dream to become real?

Play the lottery if doing so is entertaining to you, but like going to the movie theater, don’t expect to get any money back in return.

Mistakes are Good

Confession #1:  My answers on my last post were WRONG.

I briefly thought about taking that post down, but discarded that idea when I thought about the reality that almost all published mathematics is polished, cleaned, and optimized.  Many students struggle with mathematics under the misconception that their first attempts at any topic should be as polished as what they read in published sources.

While not precisely from the same perspective, Dan Teague recently wrote an excellent, short piece of advice to new teachers on NCTM’s ‘blog entitled Demonstrating Competence by Making Mistakes.  I argue Dan’s advice actually applies to all teachers, so in the spirit of showing how to stick with a problem and not just walking away saying “I was wrong”, I’m going to keep my original post up, add an advisory note at the start about the error, and show below how I corrected my error.

Confession #2:  My approach was a much longer and far less elegant solution than the identical approaches offered by a comment by “P” on my last post and the solution offered on FiveThirtyEight.  Rather than just accepting the alternative solution, as too many students are wont to do, I acknowledged the more efficient approach of others before proceeding to find a way to get the answer through my initial idea.

I’ll also admit that I didn’t immediately see the simple approach to the answer and rushed my post in the time I had available to get it up before the answer went live on FiveThirtyEight.

GENERAL STRATEGY and GOALS:

1-Use a PDF:  The original FiveThirtyEight post asked for the expected time before the siblings simultaneously finished their tasks.  I interpreted this as expected value, and I knew how to compute the expected value of a pdf of a random variable.  All I needed was the potential wait times, t, and their corresponding probabilities.  My approach was solid, but a few of my computations were off.

2-Use Self-Similarity:  I don’t see many people employing the self-similarity tactic I used in my initial solution.  Resolving my initial solution would allow me to continue using what I consider a pretty elegant strategy for handling cumbersome infinite sums.

A CORRECTED SOLUTION:

Stage 1:  My table for the distribution of initial choices was correct, as were my conclusions about the probability and expected time if they chose the same initial app.

My first mistake was in my calculation of the expected time if they did not choose the same initial app.  The 20 numbers in blue above represent that sample space.  Notice that there are 8 times where one sibling chose a 5-minute app, leaving 6 other times where one sibling chose a 4-minute app while the other chose something shorter.  Similarly, there are 4 choices of an at most 3-minute app, and 2 choices of an at most 2-minute app.  So the expected length of time spent by the longer app if the same app was not chosen for both is

$E(Round1) = \frac{1}{20}*(8*5+6*4+4*3+2*2)=4$ minutes,

a notably longer time than I initially reported.

For the initial app choice, there is a $\frac{1}{5}$ chance they choose the same app for an average time of 3 minutes, and a $\frac{4}{5}$ chance they choose different apps for an average time of 4 minutes.

Stage 2:  My biggest error was a rushed assumption that all of the entries I gave in the Round 2 table were equally likely.  That is clearly false as you can see from Table 1 above.  There are only two instances of a time difference of 4, while there are eight instances of a time difference of 1.  A correct solution using my approach needs to account for these varied probabilities.  Here is a revised version of Table 2 with these probabilities included.

Conveniently–as I had noted without full realization in my last post–the revised Table 2 still shows the distribution for the 2nd and all future potential rounds until the siblings finally align, including the probabilities.  This proved to be a critical feature of the problem.

Another oversight was not fully recognizing which events would contribute to increasing the time before parity.  The yellow highlighted cells in Table 2 are those for which the next app choice was longer than the current time difference, and any of these would increase the length of a trial.

I was initially correct in concluding there was a $\frac{1}{5}$ probability of the second app choice achieving a simultaneous finish and that this would not result in any additional total time.  I missed the fact that the six non-highlighted values also did not result in additional time and that there was a $\frac{1}{5}$ chance of this happening.

That leaves a $\frac{3}{5}$ chance of the trial time extending by selecting one of the highlighted events.  If that happens, the expected time the trial would continue is

$\displaystyle \frac{4*4+(4+3)*3+(4+3+2)*2+(4+3+2+1)*1}{4+(4+3)+(4+3+2)+(4+3+2+1)}=\frac{13}{6}$ minutes.

Iterating:  So now I recognized there were 3 potential outcomes at Stage 2–a $\frac{1}{5}$ chance of matching and ending, a $\frac{1}{5}$ chance of not matching but not adding time, and a $\frac{3}{5}$ chance of not matching and adding an average $\frac{13}{6}$ minutes.  Conveniently, the last two possibilities still combined to recreate perfectly the outcomes and probabilities of the original Stage 2, creating a self-similar, pseudo-fractal situation.  Here’s the revised flowchart for time.

Invoking the similarity, if there were T minutes remaining after arriving at Stage 2, then there was a $\frac{1}{5}$ chance of adding 0 minutes, a $\frac{1}{5}$ chance of remaining at T minutes, and a $\frac{3}{5}$ chance of adding $\frac{13}{6}$ minutes–that is being at $T+\frac{13}{6}$ minutes.  Equating all of this allows me to solve for T.

$T=\frac{1}{5}*0+\frac{1}{5}*T+\frac{3}{5}*\left( T+\frac{13}{6} \right) \longrightarrow T=6.5$ minutes
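The pieces feeding this equation can be double-checked by brute-force enumeration with exact fractions.  The following is a minimal sketch, assuming (as in the tables) that each app length from 1 to 5 minutes is equally likely at every choice; the variable names are mine.

```python
from fractions import Fraction
from itertools import product

apps = range(1, 6)  # app lengths in minutes, each equally likely

# Stage 1: the 20 ordered pairs in which the siblings pick different apps.
unequal = [(a, b) for a, b in product(apps, repeat=2) if a != b]
e_round1 = Fraction(sum(max(a, b) for a, b in unequal), len(unequal))

# Stage 2: pair every time difference d with every possible next app c.
outcomes = [(abs(a - b), c) for a, b in unequal for c in apps]
total = len(outcomes)
p_match = Fraction(sum(1 for d, c in outcomes if c == d), total)
p_no_extra = Fraction(sum(1 for d, c in outcomes if c < d), total)
extensions = [c - d for d, c in outcomes if c > d]
p_extend = Fraction(len(extensions), total)
mean_extra = Fraction(sum(extensions), len(extensions))

# Solve T = p_match*0 + p_no_extra*T + p_extend*(T + mean_extra) for T.
T = p_extend * mean_extra / (1 - p_no_extra - p_extend)

print(e_round1, p_match, p_no_extra, p_extend, mean_extra, T)
```

The enumeration returns 4 minutes for Round 1, probabilities of 1/5, 1/5, and 3/5, a mean extension of 13/6 minutes, and T = 13/2 minutes, matching the values above.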

Time Solution:  As noted above, at the start, there was a $\frac{1}{5}$ chance of immediately matching with an average 3 minutes, and there was a $\frac{4}{5}$ chance of not matching while using an average 4 minutes.  I just showed that from this latter stage, one would expect to need to use an additional mean 6.5 minutes for the siblings to end simultaneously, for a mean total of 10.5 minutes.  That means the overall expected time spent is

Total Expected Time $=\frac{1}{5}*3 + \frac{4}{5}*10.5 = 9$ minutes.
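A quick Monte Carlo run is one way to sanity-check the 9-minute figure.  This sketch assumes my reading of the puzzle: both siblings start together, every app choice is uniform on 1 to 5 minutes, and whoever finishes first immediately starts another app.  The function names, trial count, and seed are arbitrary choices of mine.

```python
import random


def time_until_simultaneous_finish(rng):
    """Simulate one trial; return the minute at which both siblings finish together."""
    a = rng.randint(1, 5)  # finishing time of sibling A's current app
    b = rng.randint(1, 5)  # finishing time of sibling B's current app
    while a != b:
        # Whoever finishes first starts a new, randomly chosen app.
        if a < b:
            a += rng.randint(1, 5)
        else:
            b += rng.randint(1, 5)
    return a


def mean_time(trials=100_000, seed=2016):
    """Average finishing time over many independent trials."""
    rng = random.Random(seed)
    return sum(time_until_simultaneous_finish(rng) for _ in range(trials)) / trials
```

Calling `mean_time()` should land close to 9, with the usual sampling wobble in the second decimal place.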

Number of Rounds Solution:  My initial computation of the number of rounds was actually correct–despite the comment from “P” in my last post–but I think the explanation could have been clearer.  I’ll try again.

One round is obviously required for the first choice, and in the $\frac{4}{5}$ chance the siblings don’t match, let N be the average number of rounds remaining.  In Stage 2, there’s a $\frac{1}{5}$ chance the trial will end with the next choice, and a $\frac{4}{5}$ chance there will still be N rounds remaining.  This second situation is correct because both the no time added and time added possibilities combine to reset Table 2 with a combined probability of $\frac{4}{5}$.  As before, I invoke self-similarity to find N.

$N = \frac{1}{5}*1 + \frac{4}{5}*(1+N) \longrightarrow N=5$

Therefore, the expected number of rounds is $\frac{1}{5}*1 + \frac{4}{5}*5 = 4.2$ rounds.

It would be cool if someone could confirm this prediction by simulation.

CONCLUSION:

I corrected my work and found the exact solution proposed by others and simulated by Steve!   Even better, I have shown my approach works and, while notably less elegant, one could solve this expected value problem by invoking the definition of expected value.

Best of all, I learned from a mistake and didn’t give up on a problem.  Now that’s the real lesson I hope all of my students get.

Happy New Year, everyone!

How One Data Point Destroyed a Study

Statistics are powerful tools.  Well-implemented, they tease out underlying patterns from the noise of raw data and improve our understanding.  But those who use statistics must take care to avoid misstatements.  Unfortunately, statistics can also be used, deliberately or not, to distort relationships, declaring patterns where none exist.  In my AP Statistics classes, I hope my students learn to extract meaning from well-designed studies, and to spot instances of Benjamin Disraeli’s “three kinds of lies:  lies, damned lies, and statistics.”

This post explores part of a study published August 12, 2015, exposing what I believe to be examples of four critical ways statistics are misunderstood and misused:

• Not recognizing the distortion power of outliers in means, standard deviations, and in the case of the study below, regressions,
• Distorting graphs to create the impression of patterns different from what actually exist,
• Cherry-picking data to show only favorable results, and
• Misunderstanding the p-value in inferential studies.

THE STUDY:

I was searching online for examples of research I could use with my AP Statistics classes when I found on the page of a math teacher organization a link to an article entitled, “Cardiorespiratory fitness linked to thinner gray matter and better math skills in kids.”  Following the URL trail, I found a description of the referenced article in an August, 2015 summary article by Science Daily and the actual research posted on August 12, 2015 by the journal, PLOS ONE.

As a middle and high school teacher, I’ve read multiple studies connecting physical fitness to brain health.  I was sure I had hit paydirt with an article offering multiple, valuable lessons for my students!  I read the claims of the Science Daily research summary correlating the physical fitness of 9- and 10-year-old children to performance on a test of arithmetic.  It was careful not to declare cause-and-effect,  but did say

The team found differences in math skills and cortical brain structure between the higher-fit and lower-fit children. In particular, thinner gray matter corresponded to better math performance in the higher-fit kids. No significant fitness-associated differences in reading or spelling aptitude were detected. (source)

The researchers described plausible connections for the aerobic fitness of children and the thickness of cortical gray matter for each participating child.  The study went astray when they attempted to connect their findings to the academic performance of the participants.

Independent t-tests were employed to compare WRAT-3 scores in higher fit and lower fit children. Pearson correlations were also conducted to determine associations between cortical thickness and academic achievement. The alpha level for all tests was set at p < .05. (source)

All of the remaining images, quotes, and data in this post are pulled directly from the primary article on PLOS ONE.  The URLs are provided above, and bibliographic references are at the end.

To address questions raised by the study, I had to access the original data and recreate the researchers’ analyses.  Thankfully, PLOS ONE is an open-access journal, and I was able to download the research data.  In case you want to review the data yourself or use it with your classes, here is the original SPSS file which I converted into Excel and TI-Nspire CAS formats.

My suspicions were piqued when I saw the following two graphs–the only scatterplots offered in their research publication.

Scatterplot 1:  Attempt to connect Anterior Frontal Gray Matter thickness with WRAT-3 Arithmetic performance

The right side of the top scatterplot looked like an uncorrelated cloud of data with one data point on the far left seeming to pull the left side of the linear regression upwards, creating a more negative slope.  Because the study reported only two statistically significant correlations between the WRAT tests and cortical thickness in two areas of the brain, I was now concerned that the single extreme data point may have distorted results.

My initial scatterplot (below) confirmed the published graph, but fit to the entire window, the data now looked even less correlated.

In this scale, the farthest left data point (WRAT Arithmetic score = 66, Anterior Frontal thickness = 3.9) looked much more like an outlier.  I confirmed that the point fell more than 1.5 IQRs below the lower quartile, as indicated visually in a boxplot of the WRAT-Arithmetic scores.

Also note from my rescaled scatterplot that the Anterior Frontal measure (y-coordinate) was higher than any of the next five ordered pairs to its right.  Its horizontal outlier location, coupled with its notably higher vertical component, suggested that the single point could have significant influence on any regression on the data.  There was sufficient evidence for me to investigate the study results excluding the (66, 3.9) data point.

The original linear regression on the 48 (WRAT Arithmetic, AF thickness) data was $AF=-0.007817(WRAT_A)+4.350$.  Excluding (66, 3.9), the new scatterplot above shows the revised linear regression on the remaining 47 points:  $AF=-0.007460(WRAT_A)+4.308$.  This and the original equation are close, but the revised slope is 4.6% smaller in magnitude relative to the published result. With the two published results reported significant at p=0.04, the influence of the outlier (66, 3.9) has a reasonable possibility of changing the study results.

Scatterplot 2:  Attempt to connect Superior Frontal Gray Matter thickness with WRAT-3 Arithmetic performance

The tightly compressed scale of the second published scatterplot made me deeply suspicious the (WRAT Arithmetic, Superior Frontal thickness) data was being vertically compressed to create the illusion of a linear relationship where one possibly did not exist.

Rescaling the graphing window (below) made the data appear notably less linear than the publication implied.  Also, the data point corresponding to the WRAT-Arithmetic score of 66 appeared to suffer from the same outlier-influences as the first data set.  It was still an outlier, but now its vertical component was higher than the next eight data points to its right, with some of them notably lower.  Again, there was sufficient evidence to investigate results excluding the outlier data point.

The linear regression on the original 48 (WRAT Arithmetic, SF thickness) data points was $SF=-0.002767(WRAT_A)+4.113$ (above).  Excluding the outlier, the new scatterplot (below) had the revised linear regression $SF=-0.002391(WRAT_A)+4.069$.  This time, the revised slope was 13.6% smaller in magnitude relative to the original slope.  With the published significance also at p=0.04, omitting the outlier was almost certain to change the published results.

THE OUTLIER BROKE THE STUDY

The findings above strongly suggest the published study results are not as reliable as reported.  It is time to rerun the significance tests.

For the first data set–(WRAT Arithmetic, AF thickness) —run an independent t-test on the regression slope with and without the outlier.

• INCLUDING OUTLIER:  For all 48 samples, the researchers reported a slope of -0.007817, $r=-0.292$, and $p=0.04$.  This was reported as a significant result.
• EXCLUDING OUTLIER:  For the remaining 47 samples, the slope is -0.007460, r=-0.252, and p=0.087.  The r confirms the visual impression that the data was less linear and, most importantly, the correlation is no longer significant at $\alpha = 0.05$.

For the second data set–(WRAT Arithmetic, SF thickness):

• INCLUDING OUTLIER:  For all 48 samples, the researchers reported a slope of -0.002767, r=-0.291, and p=0.04.  This was reported as a significant result.
• EXCLUDING OUTLIER:  For the remaining 47 samples, the slope is -0.002391, r=-0.229, and p=0.121.  This revision is even less linear and, most importantly, the correlation is no longer significant for any standard significance level.
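The p-values above can be approximated from r and the sample size alone.  This is a minimal sketch, not the software I actually used: it applies the standard test statistic for a Pearson correlation, $t=r\sqrt{n-2}/\sqrt{1-r^2}$ with $n-2$ degrees of freedom, and gets the two-sided p-value by numerically integrating Student’s t density with Simpson’s rule.  The function names are mine, and because the inputs are the rounded r values, the third decimal place of p may differ slightly from the figures quoted.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def correlation_p_value(r, n, steps=2000):
    """Return (t, two-sided p) for testing zero correlation from r and n."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r * r)

    # Simpson's rule (steps must be even) for the area between 0 and |t|.
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3

    # Two-sided p-value: everything outside (-|t|, |t|).
    return t, 1 - 2 * central_area


for label, r, n in [("AF with outlier", -0.292, 48),
                    ("AF without outlier", -0.252, 47),
                    ("SF with outlier", -0.291, 48),
                    ("SF without outlier", -0.229, 47)]:
    t, p = correlation_p_value(r, n)
    print(f"{label}: t = {t:.3f}, p = {p:.3f}")
```

Both "with outlier" rows should come out just under 0.05, and both "without outlier" rows above it.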

In brief, the researchers’ arguable decision to include the single, clear outlier data point was the source of any significant results at all.  Whatever correlation exists between gray matter thickness and WRAT-Arithmetic as measured by this study is tenuous, at best, and almost certainly not significant.

THE DANGERS OF CHERRY-PICKING RESULTS:

So, let’s set aside the entire questionable decision to keep an outlier in the data set to achieve significant findings.  There is still a subtle, potential problem with this study’s result that actually impacts many published studies.

The researchers understandably were seeking connections between the thickness of a brain’s gray matter and the academic performance of that brain as measured by various WRAT instruments.  They computed independent t-tests of linear regression slopes between thickness measures at nine different locations in the brain against three WRAT test measures for a total of 27 separate t-tests.  The next table shows the correlation coefficient and p-value from each test.

This approach is commonly used, with researchers reporting only the tests found to be significant.  But in doing so, the researchers may have overlooked a fundamental property of the confidence intervals that underlie p-values.  Using the typical significance level of 0.05 corresponds to a 95% confidence interval, and one interpretation of a 95% confidence interval is that under the conditions of the assumed null hypothesis, results that occur in the most extreme 5% of outcomes will NOT be considered as resulting from the null hypothesis, even though they are.

In other words, even under the typical conditions for which the null hypothesis is true, 5% of correct results would be deemed different enough to be statistically significant–a Type I Error.  Within this study, and treating the tests as independent, this defines a binomial probability situation with 27 trials for which the probability of any one trial producing a significant result, even though the null hypothesis is correct, is p=0.05.

The binomial probability of finding exactly 2 significant results at p=0.05 over 27 trials is 0.243, and the probability of producing 2 or more significant results when the null hypothesis is true is 39.4%.

That means there is a 39.4% probability in any study testing 27 trials at a p<0.05 critical value that at least 2 of those trials would report a result that would INCORRECTLY be interpreted as contradicting the null hypothesis.  And if more conditions than 27 are tested, the probability of a Type I Error is even higher.
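These binomial figures are easy to reproduce.  Here is a short sketch, assuming the 27 tests are independent and each has a 0.05 chance of a false positive; the helper name `binom_pmf` is my own.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


TESTS = 27
ALPHA = 0.05

exactly_two = binom_pmf(2, TESTS, ALPHA)
# Complement of "zero or one false positive".
at_least_two = 1 - binom_pmf(0, TESTS, ALPHA) - binom_pmf(1, TESTS, ALPHA)

print(round(exactly_two, 3), round(at_least_two, 3))
```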

Whenever you have a large number of inference trials, there is an increasingly large probability that at least some of the “significant” trials are actually just random, undetected occurrences of the null hypothesis.

It just happens.

THE ELUSIVE MEANING OF A p-VALUE:

For more on the difficulty of understanding p-values, check out this nice recent article on FiveThirtyEight Science–Not Even Scientists Can Easily Explain P-Values.

CONCLUSION:

Personally, I’m a little disappointed that this study didn’t find significant results.  There are many recent studies showing the connection between physical activity and brain health, but this study didn’t achieve its goal of finding a biological source to explain the correlation.

It is the responsibility of researchers to know their studies and their resulting data sets.  Not finding significant results is not a problem.  But I do expect research to disclaim when its significant results hang entirely on a choice to retain an outlier in its data set.

REFERENCES:

Chaddock-Heyman L, Erickson KI, Kienzler C, King M, Pontifex MB, Raine LB, et al. (2015) The Role of Aerobic Fitness in Cortical Thickness and Mathematics Achievement in Preadolescent Children. PLoS ONE 10(8): e0134115. doi:10.1371/journal.pone.0134115

University of Illinois at Urbana-Champaign. “Cardiorespiratory fitness linked to thinner gray matter and better math skills in kids.” ScienceDaily. http://www.sciencedaily.com/releases/2015/08/150812151229.htm (accessed December 8, 2015).

Chemistry, CAS, and Balancing Equations

Here’s a cool application of linear equations I first encountered about 20 years ago working with chemistry colleague Penney Sconzo at my former school in Atlanta, GA.  Many students struggle early in their first chemistry classes with balancing equations.  Thinking about these as generalized systems of linear equations gives a universal approach to balancing chemical equations, including ionic equations.

This idea makes a brilliant connection if you teach algebra 2 students concurrently enrolled in chemistry, or vice versa.

FROM CHEMISTRY TO ALGEBRA

Consider burning ethanol:  the chemical combination of ethanol and oxygen creates carbon dioxide and water.

$C_2H_6O+3O_2 \longrightarrow 2CO_2+3H_2O$     (1)

But what if you didn’t know that 1 molecule of ethanol combined with 3 molecules of oxygen gas to create 2 molecules of carbon dioxide and 3 molecules of water?  This specific set of coefficients (or multiples of the set) exists for this reaction because of the Law of Conservation of Matter.  While elements may rearrange in a chemical reaction, they do not become something else.  So how do you determine the unknown coefficients of a generic chemical reaction?

Using the ethanol example, assume you started with

$wC_2H_6O+xO_2 \longrightarrow yCO_2+zH_2O$     (2)

for some unknown values of w, x, y, and z.  Conservation of Matter guarantees that the amount of carbon, hydrogen, and oxygen are the same before and after the reaction.  Tallying the amount of each element on each side of the equation gives three linear equations:

Carbon:  $2w=y$
Hydrogen:  $6w=2z$
Oxygen:  $w+2x=2y+z$

where the coefficients come from the subscripts within the compound notations.  As one example, the carbon subscript in ethanol ( $C_2H_6O$ ) is 2, indicating two carbon atoms in each ethanol molecule.  There must have been 2w carbon atoms in the w ethanol molecules.

This system of 3 equations in 4 variables won’t have a unique solution, but let’s see what my Nspire CAS says.  (NOTE:  On the TI-Nspire, you can solve for any one of the four variables.  Because the presence of more variables than equations makes the solution non-unique, some results may appear cleaner than others.  For me, w was more complicated than z, so I chose to use the z solution.)

All three equations have y in the numerator and denominators of 2.  The presence of the y indicates the expected non-unique solution.  But it also gives me the freedom to select any convenient value of y I want to use.  I’ll pick $y=2$ to simplify the fractions.  Plugging in gives me values for the other coefficients.

Substituting these into (2) above gives the original equation (1).
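If you don’t have a CAS handy, the same idea can be sketched in a few lines of Python.  This is not what the Nspire does internally; it is a plain Gauss-Jordan elimination with exact fractions that I am swapping in for illustration.  It assumes each element’s equation is rewritten with everything on one side (so $2w=y$ becomes the row 2, 0, -1, 0 for w, x, y, z) and that exactly one variable is left free.  The function name `balance` is hypothetical, and the multi-argument `gcd` and `lcm` calls need Python 3.9 or later.

```python
from fractions import Fraction
from math import gcd, lcm


def balance(matrix):
    """Smallest whole-number solution of matrix @ coeffs = 0 (one free variable)."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0])
    pivot_cols = []
    r = 0
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [v / rows[r][c] for v in rows[r]]
        # Clear column c from every other row.
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
        if r == len(rows):
            break

    free_cols = [c for c in range(n_cols) if c not in pivot_cols]
    if len(free_cols) != 1:
        raise ValueError("expected exactly one free variable")
    free = free_cols[0]

    # Set the free variable to 1 and read off the others.
    solution = [Fraction(0)] * n_cols
    solution[free] = Fraction(1)
    for row_index, c in enumerate(pivot_cols):
        solution[c] = -rows[row_index][free]

    # Scale to the smallest whole numbers.
    scale = lcm(*(v.denominator for v in solution))
    whole = [int(v * scale) for v in solution]
    divisor = gcd(*whole)
    return [v // divisor for v in whole]


# Columns are w, x, y, z; rows are carbon, hydrogen, oxygen.
ethanol = [
    [2, 0, -1, 0],    # 2w = y
    [6, 0, 0, -2],    # 6w = 2z
    [1, 2, -2, -1],   # w + 2x = 2y + z
]
print(balance(ethanol))
```

For the ethanol system this should return the coefficients 1, 3, 2, 3 of equation (1).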

VARIABILITY EXISTS

Traditionally, chemists write these equations with the lowest possible natural number coefficients, but thinking of them as systems of linear equations makes another reality obvious.  If 1 molecule of ethanol combines with 3 molecules of oxygen gas to make 2 molecules of carbon dioxide and 3 molecules of water, surely 10 molecules of ethanol combine with 30 molecules of oxygen gas to make 20 molecules of carbon dioxide and 30 molecules of water (the result of substituting $y=20$ instead of the $y=2$ used above).

You could even let $y=1$ to get $z=\frac{3}{2}$, $w=\frac{1}{2}$, and $x=\frac{3}{2}$.  Shifting units, this could mean a half-mole of ethanol and 1.5 moles of oxygen make a mole of carbon dioxide and 1.5 moles of water.  The point is, the ratios are constant.  A good lesson.

ANOTHER QUICK EXAMPLE:

Now let’s try a harder one to balance:  Reacting carbon monoxide and hydrogen gas to create octane and water.

$wCO + xH_2 \longrightarrow y C_8 H_{18} + z H_2 O$

Setting up equations for each element gives

Carbon:  $w=8y$
Oxygen:  $w=z$
Hydrogen:  $2x=18y+2z$

I could simplify the hydrogen equation, but that’s not required.  Solving this system of equations gives

Nice.  No fractions this time.  Using $y=1$ gives $w=8$, $x=17$, and $z=8$, or

$8CO + 17H_2 \longrightarrow C_8 H_{18} + 8H_2 O$

Simple.
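A balanced equation is also easy to verify by counting atoms on each side.  Here is a small sketch of that check for the octane result; the dictionary layout and the helper name `atom_totals` are my own choices.

```python
def atom_totals(side):
    """Total atoms of each element for a list of (coefficient, formula) pairs."""
    totals = {}
    for coefficient, formula in side:
        for element, count in formula.items():
            totals[element] = totals.get(element, 0) + coefficient * count
    return totals


# 8 CO + 17 H2 -> C8H18 + 8 H2O
reactants = [(8, {"C": 1, "O": 1}), (17, {"H": 2})]
products = [(1, {"C": 8, "H": 18}), (8, {"H": 2, "O": 1})]

print(atom_totals(reactants), atom_totals(products))
```

Both sides should tally 8 carbon, 8 oxygen, and 34 hydrogen atoms.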

EXTENSIONS TO IONIC EQUATIONS:

Now let’s balance an ionic equation with unknown coefficients a, b, c, d, e, and f:

$a Ba^{2+} + b OH^- + c H^- + d PO_4^{3-} \longrightarrow eH_2O + fBa_3(PO_4)_2$

In addition to writing equations for barium, oxygen, hydrogen, and phosphorus, Conservation of Charge allows me to write one more equation to reflect the balancing of charge in the reaction.

Barium:  $a = 3f$
Oxygen:  $b +4d = e+8f$
Hydrogen:  $b+c=2e$
Phosphorus:  $d=2f$
CHARGE (+/-):  $2a-b-c-3d=0$

Solving the system gives

$a=\frac{3d}{2}$, $f=\frac{d}{2}$, and $b=c=e=0$

Now that’s a curious result.  I’ll deal with the zeros in a moment.  Letting $d=2$ gives $f=1$ and $a=3$, indicating that 3 molecules of ionic barium combine with 2 molecules of ionic phosphate to create a single uncharged molecule of barium phosphate precipitate.

The zeros here indicate the presence of “spectator ions”.  Basically, the hydroxide and hydrogen ions on the left are in equal measure to the liquid water molecule on the right.  Since they are in equal measure, one solution is

$3Ba^{2+}+6OH^- +6H^-+2PO_4^{3-} \longrightarrow 6H_2O + Ba_3(PO_4)_2$

CONCLUSION:

You still need to understand chemistry and algebra to interpret the results, but combining chemistry with algebra (and especially a CAS) makes it much easier to balance chemical equations and ionic chemical equations, particularly those with non-trivial solutions not easily found by inspection.

The minor connection between science (chemistry) and math (algebra) is nice.

As many others have noted, CAS enables you to keep your mind on the problem while avoiding getting lost in the algebra.

Measuring Calculator Speed

Two weeks ago, my summer school Algebra 2 students were exploring sequences and series.  A problem I thought would be a routine check on students’ ability to compute the sum of a finite arithmetic series morphed into an experimental measure of the computational speed of the TI-Nspire CX handheld calculator.  This experiment can be replicated on any calculator that can compute sums of arithmetic series.

PHILOSOPHY

Teaching this topic in prior years, I’ve seen students find series sums by actually adding all of the individual sequence terms.  Some former students have solved problems involving the addition of more than 50 terms, in sequence order, to find their sums.  That’s a valid but computationally painful approach.  I wanted my students to practice less brute-force series manipulations.  Despite my intentions, we ended up measuring brute force anyway!

Readers of this ‘blog hopefully know that I’m not at all a fan of memorizing formulas.  One of my class mantras is

“Memorize as little as possible.  Use what you know as broadly as possible.”

Formulas can be mis-remembered and typically apply only in very particular scenarios.  Learning WHY a procedure works allows you to apply or adapt it to any situation.

THE PROBLEM I POSED AND STUDENT RESPONSES

Not wanting students to add terms, I allowed use of their Nspire handheld calculators and asked a question that couldn’t feasibly be solved without technological assistance.

The first two terms of a sequence are $t_1=3$ and $t_2=6$.  Another term farther down the sequence is $t_k=25165824$.

A)  If the sequence is arithmetic, what is k?

B)  Compute $\sum_{n=1}^{k}t_n$ where $t_n$ is the arithmetic sequence defined above, and k is the number you computed in part A.

Part A was easy.  They quickly recognized the terms were multiples of 3, so $t_k=25165824=3\cdot k$, or $k=8388608$.

For Part B, I expected students to use the Gaussian approach to summing long arithmetic series that we had explored/discovered the day before.  For arithmetic series, rearrange the terms in pairs:  the first with last, the second with next-to-last, the third with next-to-next-to-last, etc.  Each such pair will have a constant sum, so the sum of any arithmetic series can be computed by multiplying that constant sum by the number of pairs.

Unfortunately, I think I led my students astray by phrasing part B in summation notation.  They were working in pairs and (unexpectedly for me) every partnership tried to answer part B by entering $\sum_{n=1}^{8388608}(3n)$ into their calculators.  All became frustrated when their calculators appeared to freeze.  That’s when the fun began.

Multiple groups began reporting identical calculator “freezes”; it took me a few moments to realize what was happening.  That’s when I reminded students what I say at the start of every course:  Your graphing calculator will become your best, most loyal, hardworking, non-judgemental mathematical friend, but you should have some concept of what you are asking it to do.  Whatever you ask, the calculator will diligently attempt to answer until it finds a solution or runs out of energy, no matter how long it takes.  In this case, the students had asked their calculators to compute values of 8,388,608 terms and add them all up.  The machines hadn’t frozen; they were diligently computing and adding 8+ million terms, just as requested.  Nice calculator friends!

A few “Oh”s sounded around the room as they recognized the enormity of the task they had absentmindedly asked of their machines.  When I asked if there was another way to get the answer, most remembered what I had hoped they’d use in the first place.  Using a partner’s machine, they used Gauss’s approach to find $\sum_{n=1}^{8388608}(3n)=(3+25165824)\cdot (8388608/2)=105553128849408$ in an imperceptible fraction of a second.  Nice connections happened when, minutes later, the hard-working Nspires returned the same 15-digit result by the computationally painful approach.  My question phrasing hadn’t eliminated the term-by-term addition I’d hoped to avoid, but I did unintentionally create reinforcement of a concept.  Better yet, I got an idea for a data analysis lab.
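The contrast between the two approaches is easy to reproduce off the calculator.  Here is a minimal Python sketch (the function names are mine, not anything from the Nspire) that computes the sum by Gauss’s pairing and checks it against term-by-term addition on a much shorter series.

```python
def gauss_sum(first, last, count):
    """Sum an arithmetic series by pairing first with last.

    Each pair sums to (first + last) and there are count / 2 pairs.
    The product (first + last) * count is always even for an integer
    arithmetic series, so integer division is exact here.
    """
    return (first + last) * count // 2


def brute_force_sum(k):
    """Add the terms 3, 6, ..., 3k one at a time."""
    return sum(3 * n for n in range(1, k + 1))


k = 25165824 // 3                   # Part A: term number of 25165824
total = gauss_sum(3, 25165824, k)   # Part B: one multiplication, no loop
print(k, total)                     # 8388608 105553128849408

# The two methods agree on a series short enough to add directly.
assert gauss_sum(3, 3 * 1000, 1000) == brute_force_sum(1000)
```

The pairing version does the same small amount of work no matter how large k is, which is why it returns instantly while the term-by-term version grinds through every term.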

LINEAR TIME

They had some fundamental understanding that their calculators were “fast”, but couldn’t quantify what “fast” meant.  The question I posed them the next day was to compute $\sum_{n=1}^k(3n)$ for various values of k, record the amount of time it took for the Nspire to return a solution, determine any pattern, and make predictions.

Recognizing the machine’s speed, one group said “K needs to be a large number, otherwise the calculator would be done before you even started to time.”  Here’s their data.

They graphed the first 5 values on a second Nspire and used the results to estimate how long it would take their first machine to compute the even more monumental task of adding up the first 50 million terms of the series–a task they had set their “loyal mathematical friend” to computing while they calculated their estimate.

Some claimed to be initially surprised that the data was so linear.  With some additional thought, they realized that every time k increased by 1, the Nspire had to do 2 additional computations:  one multiplication and one addition–a perfectly linear pattern.  They used a regression to find a quick linear model and checked residuals to make sure nothing strange was lurking in the background.

The lack of pattern and maximum residual magnitude of about 0.30 seconds over times as long as 390 seconds completely dispelled any remaining doubts of underlying linearity.  Using the linear regression, they estimated their first Nspire would be working for 32 minutes 29 seconds.

They looked at the calculator at 32 minutes, noted that it was still running, and unfortunately were briefly distracted.  When they looked back at 32 minutes, 48 seconds, the calculator had stopped.  It wasn’t worth it to them to re-run the experiment.  They were VERY IMPRESSED that even with the error, their estimate was off just 19 seconds (arguably up to 29 seconds off if the machine had stopped running right after their 32 minute observation).

The units of the linear regression slope (approximately 0.000039) were seconds per unit increase in k.  Reciprocating gave approximately 25,657 terms computed and summed per second.  As every increase in k required the calculator to multiply the next term number by 3 and add that new term value to the existing sum, each k represented 2 Nspire calculations.  Doubling the last result meant their Nspire was performing about 51,314 calculations per second when calculating the sum of an arithmetic series.
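The same lab can be tried on a computer.  The sketch below is a Python stand-in, not the students’ Nspire procedure: it times the brute-force sum for several values of k and fits a least-squares line by hand.  The function names and the particular k values are my own choices, and the measured times will depend entirely on the machine running it.

```python
import time


def time_sum(k):
    """Seconds taken to add the terms 3, 6, ..., 3k one at a time."""
    start = time.perf_counter()
    sum(3 * n for n in range(1, k + 1))
    return time.perf_counter() - start


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    ks = [200_000, 400_000, 600_000, 800_000, 1_000_000]
    times = [time_sum(k) for k in ks]
    slope, intercept = fit_line(ks, times)
    # Slope is in seconds per term, so its reciprocal is terms per second.
    print(f"slope = {slope:.3e} s per term")
    print(f"about {1 / slope:,.0f} terms per second")
    # Each term costs one multiplication and one addition.
    print(f"about {2 / slope:,.0f} calculations per second")
```

As in the students’ analysis, the reciprocal of the slope estimates terms per second, and doubling it estimates calculations per second.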

My students were impressed by the speed, the lurking linear function, and their ability to predict computation times within seconds for very long arithmetic series calculations.

Not a bad diversion from unexpected student work, I thought.

Innumeracy and Sharks

Here’s a brief snippet from a conversation about the recent spate of shark attacks in North Carolina as I heard yesterday morning (approx 6AM, 7/4/15) on CNN.

George Burgess (Director, Florida Program for Shark Research):  “One thing is going to happen and that is there are going to be more [shark] attacks year in and year out simply because the human population continues to rise and with it a concurrent interest in aquatic recreation.  So one of the few things I, as a scientist, can predict with some certainty is more attacks in the future because there’s more people.”

Alison Kosik (CNN anchor):  “That is scary and I just started surfing so I may dial that back a bit.”

This marks another great teaching moment spinning out of innumeracy in the media.  I plan to drop just those two paragraphs on my classes when school restarts this fall and open the discussion.  I wonder how many will question the implied, but irrational probability in Kosik’s reply.

TOO MUCH COVERAGE?

Burgess argued elsewhere that

Increased documentation of the incidents may also make people believe attacks are more prevalent.  (Source here.)

It’s certainly plausible that some people think shark attacks are more common than they really are.  But that raises the question of just how nervous a swimmer should be.

MEDIA MANIPULATION

CNN–like almost all mass media, but not nearly as bad as some–shamelessly hyper-focuses on catchy news banners, and what could be catchier than something like “Shark attacks spike just as tourists crowd beaches on busy July 4th weekend”?  Was Kosik reading a prepared script that distorts the underlying probability, or was she showing signs of innumeracy? I hope it’s not both, but neither is good.

IRRATIONAL PROBABILITY

So just how uncommon is a shark attack?  In a few minutes of Web research, I found that there were 25 shark attacks in North Carolina from 2005-2014.  There was at least one every year with a maximum of 5 attacks in 2010 (source).  So this year’s 7 attacks is certainly unusually high compared with the recent annual average of 2.5, but John Allen Paulos reminded us in Innumeracy that a multiple [in this case about 3 times] of a very small probability is still a very small probability.

In another place, Burgess noted

“It’s amazing, given the billions of hours humans spend in the water, how uncommon attacks are,” Burgess said, “but that doesn’t make you feel better if you’re one of them.”  (Source here.)

18.9% of NC visitors went to the beach (source).  In 2012, there were approximately 45.4 million visitors to NC (source).  To overestimate the number of beachgoers, let’s say 19% of 46 million visitors, or 8.7 million people, went to NC beaches.  Seriously underestimating the number of beachgoers who enter the ocean, assume only 1 in 8 beachgoers entered the ocean.  That’s still a very small 7 attacks out of about 1 million people in the ocean.  Because beachgoers almost always enter the ocean at some point (in my experience), the average likely is much closer to 2 or fewer attacks per million.

To put that in perspective, 110,406 people were injured in car accidents in 2012 in NC (source).  The probability of getting injured driving to the beach is many orders of magnitude larger than the likelihood of ever being attacked by a shark.
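The beach estimate is simple enough to re-run with different assumptions.  Here is a short Python sketch of that arithmetic using only the figures quoted above (46 million visitors, a 19% beach share, 7 attacks); the variable and function names are mine.

```python
def attacks_per_million(attacks, people):
    """Attack rate scaled to one million people."""
    return attacks / people * 1_000_000


visitors = 46_000_000        # rounded up from 45.4 million
beach_share = 0.19           # rounded up from 18.9%
attacks = 7

beachgoers = visitors * beach_share   # about 8.7 million
swimmers_low = beachgoers / 8         # deliberate underestimate: 1 in 8 swim

rate_if_one_in_eight = attacks_per_million(attacks, swimmers_low)
rate_if_all_swim = attacks_per_million(attacks, beachgoers)

print(f"{beachgoers:,.0f} beachgoers")
print(f"{rate_if_one_in_eight:.1f} attacks per million if 1 in 8 enter the ocean")
print(f"{rate_if_all_swim:.1f} attacks per million if every beachgoer does")
```

Changing the 1-in-8 assumption moves the rate between those two bounds, and every value in that range is still a handful per million.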

Alison Kosik should keep up her surfing.

If you made it to a NC beach safely, enjoy the swim.  It’s safer than your trip there was or your trip home is going to be.  But even those trips are reasonably safe.

I certainly am not diminishing the anguish of accident victims (shark, auto, or otherwise), but accidents happen.  Don’t make too much of any one of them.  Be intelligent, be reasonable, and enjoy life.

In the end, I hope my students learn to question facts and probabilities.  I hope they always question “How reasonable is what I’m being told?”

Here’s a much more balanced article on shark attacks from NPR:
Don’t Blame the Sharks For ‘Perfect Storm’ of Attacks In North Carolina.

Book suggestions:
1)  Innumeracy, John Allen Paulos
2) Predictably Irrational, Dan Ariely

Next Steps from a Triangle

Watching the news a couple mornings ago, an impossible triangle appeared on the screen.  Hopefully some readers might be able to turn some first ideas a colleague and I had into a great applied geometry lesson.  What follows are some teacher thoughts.  My colleagues and I hope to cultivate classes where students become curious enough to raise some of these questions themselves.

WHAT’S WRONG?

At first glance, the labeling seems off.  In Euclidean geometry, the Triangle Inequality says the sum of the lengths of any two sides of a triangle must exceed the length of the third side.  Unfortunately, the shorter two sides sum to 34 miles, so the longest side of 40 miles seems physically impossible.  Someone must have made a typo.  Right?

But to dismiss this as a simple typo would be to miss out on some spectacular mathematical conversations.  I’m also a big fan of taking problems or situations with prima facie flaws and trying to recover either the problem or some aspects of it (see two of my previous posts here and here).

WHAT DOES APPROXIMATELY MEAN?

Without confirming any actual map distances, I first was drawn to the vagueness of the approximated side lengths.  Was it possible that this triangle was actually correct under some level of round-off adjustment?  Hopefully, students would try to determine the degree of rounding the graphic creator used.  Two sides are rounded to a multiple of 10, but the left side appears rounded to the nearest integer with two significant digits.  Assuming the image creator was consistent (is that reasonable?), that last side suggests the sides were rounded to the nearest integer.  That means the largest the left side could be would be 14.5 miles and the bottom side 20.5 miles.  Unfortunately, that means the third side can be no longer than 14.5+20.5=35 miles.  Still not enough to justify the 40 miles, but this does open one possible save.

But what if all three sides were measured to the nearest 10 instead of my assumed ones place?  In this case the sides would be approximately 10, 20, and 40.  Again, this looks bad at first, but a 10 could have been rounded from a 14.9, a 20 from a 24.9, making the third side a possible 14.9+24.9=39.8, completely justifying a third side of 40.    This wasn’t the given labeling, but it would have potentially saved the graphic’s legitimacy.
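The rounding argument can be turned into a quick test.  This is a sketch under one assumption: every label was rounded to the nearest multiple of the same step, so each true length lies within half a step of its label.  The function name `can_form_triangle` is mine.

```python
def can_form_triangle(labels, step):
    """Could lengths that round to these labels form a triangle?

    Best case for the triangle inequality: the two shorter sides are as
    long as rounding allows and the longest side is as short as allowed.
    """
    half = step / 2
    a, b, c = sorted(labels)
    return (a + half) + (b + half) > (c - half)


# Labels as shown, rounded to the nearest mile: 14.5 + 20.5 = 35 < 39.5
print(can_form_triangle([14, 20, 40], step=1))    # False

# All three rounded to the nearest 10 miles: 15 + 25 = 40 > 35
print(can_form_triangle([10, 20, 40], step=10))   # True
```

Students could extend the check to mixed rounding, where each side carries its own step, to see which combinations rescue the graphic.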

GEOMETRY ALTERNATIVE

Is there another way the triangle might be correct?  Rarely do pre-collegiate geometry classes explore anything beyond Euclidean geometry.  One of my colleagues, Steve, proposed spherical geometry:

Does the fact that the earth is round play a part in these seemingly wrong values (it turns out “not really”… Although it’s not immediately clear, the only way to violate the triangle inequality in spherical geometry is to connect points the long way around the earth. And based on my admittedly poor geographical knowledge, I’m pretty sure that’s not the case here!)

SHORTEST DISTANCE

Perhaps students eventually realize that the distances involved are especially small relative to the Earth’s surface, so they might conclude that the Euclidean geometry approximation in the graphic is likely fine.

Then again, why is the image drawn “as the crow flies”?  The difficult mountainous terrain in upstate New York makes surface distances much longer than air distances between the same points.  Steve asked,

in the context of this problem (known location of escaped prisoners), why is the shortest distance between these points being shown? Wouldn’t the walking/driving distance by paths be more relevant?  (Unless the prisoners had access to a gyrocopter…)

The value of a Euclidean triangle drawn over mountainous terrain has become questionable, at best.

FROM PERIMETER TO AREA

I suspect the triangle awkwardly tried to show the distances the escapees might have traveled.  Potentially interesting, but when searching for a missing person in the mountains–the police and news focus at the time of the graphic–you don’t walk the perimeter of the suspected zone, you have to explore the area inside.

A day later, I saw the search area around Malone, NY shown as a perfect circle.  (I wish I had grabbed that image, too.)  Around the same time, the news reported that the search area was 22 square miles.

• Was the authorities’ 22 measure an approximation of a circle’s area, a polygon based on surface roads, or some other shape?
• Going back to the idea of a spherical triangle, Steve hoped students would ask if they could “compute that from just knowing the side lengths? Is there a spherical Herons Formula?”
• If the search area was a more complicated shape, could you determine its area through some sort of decomposition into simpler shapes?  Would spherical geometry change how you approach that question?
• At one point near the end of the search, I heard there were about 1400 police officers in the immediate vicinity searching for the escapee.  If you were directing the search for a prison escapee or a lost hiker, how would you deploy those officers?  How long would it take them to explore the entire search zone?  How would the shape of the potential search zone affect your deployment plan?
• If you spread out the searchers in an area, what is the probability that an escapee or missing person could avoid detection?  How would you compute such a probability?
• Ultimately, I propose that Euclidean or spherical approximations seriously underestimated the actual surface area.  The dense mountainous terrain significantly complicated this search.  Could students extrapolate a given search area shape to different terrains?  How would the number of necessary searchers change with different terrains?
• I think there are some lovely openings to fractal measures of surface roughness in the questions in the last bullet point.

ERROR ANALYSIS

Ultimately, we hope students would ask

• What caused the graphic’s errors?  Based on analyses above and some Google mapping, we think “a liberal interpretation of the ‘approximately’ label on each leg might actually be the culprit.”  What do the triangle inequality violations suggest about round-off errors or the use of significant digits?
• The map appeared to be another iteration of a map used a few days earlier.  Is it possible that compounded rounding errors were partially to blame?
• Surely the image’s designer knew the triangle was an oversimplification of the reality.  Assuming so, why was this graphic used anyway?  Does it have any news value?  Could you design a more meaningful infographic?

APPRECIATION

Many thanks to Steve Earth for his multiple comments and thoughts that helped fill out this post.

Birthdays, CAS, Probability, and Student Creativity

Many readers are familiar with the very counter-intuitive Birthday Problem:

It is always fun to be in a group when two people suddenly discover that they share a birthday.  Should we be surprised when this happens?  Asked a different way, how large a group of randomly selected people is required to have at least a 50% probability of having a birthday match within the group?

I posed this question to both of my sections of AP Statistics in the first week of school this year.  In a quick poll, one section had a birthday match–two students who had taken classes together for a few years without even realizing what they had in common.  Was I lucky, or was this a commonplace occurrence?

Intrigue over this question motivated our early study of probability.  The remainder of this post follows what I believe is the traditional approach to the problem, supplemented by the computational power of a computer algebra system (CAS)–the TI Nspire CX CAS–available on each of my students’ laptops.

Initial Attempt:

Their first try at a solution was direct.  The difficulty was the number of ways a common birthday could occur.  After establishing that we wanted any common birthday to count as a match and not just an a priori specific birthday, we tried to find the number of ways birthday matches could happen for different sized groups.  Starting small, they reasoned that

• If there were 2 people in a room, there was only 1 possible birthday connection.
• If there were 3 people (A, B, and C), there were 4 possible birthday connections–three pairs (A-B, A-C, and B-C) and one triple (A-B-C).
• For four people (A, B, C, and D), they realized they had to look for pair, triple, and quad connections.  The latter two were easiest:  one quad (A-B-C-D) and four triples (A-B-C, A-B-D, A-C-D, and B-C-D).  For the pairs, we considered the problem as four points and looked for all the ways we could create segments.  That gave (A-B, A-C, A-D, B-C, B-D, and C-D).  These could also occur as double pairs in three ways (A-B & C-D, A-C & B-D, and A-D & B-C).  All together, this made 1+4+6+3=14 ways.

This required lots of support from me and was becoming VERY COMPLICATED VERY QUICKLY.  Two people had 1 connection, 3 people had 4 connections, and 4 people had 14 connections.  Tracking all of the possible connections as the group size expanded–and especially not losing track of any possibilities–was making this approach difficult.  This created a perfect opportunity to use complement probabilities.
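To see just how quickly the direct count grows, here is a small Python sketch.  It rests on my reading of the class’s counting: each “connection” pattern is a way to split the group into shared-birthday clusters with at least one cluster of two or more people.  Under that reading, the count for n people is the Bell number B(n) minus 1 (the one excluded arrangement being “everyone different”).  The function name is mine.

```python
def bell_numbers(n_max):
    """Bell numbers B(0)..B(n_max), built with the Bell triangle."""
    row = [1]
    bells = [1]
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row;
        # every later entry adds the entry above-left to its left neighbor.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


bells = bell_numbers(6)
connections = {n: bells[n] - 1 for n in range(2, 7)}
print(connections)  # {2: 1, 3: 4, 4: 14, 5: 51, 6: 202}
```

The first three values match the 1, 4, and 14 found by hand, and the next two suggest why tracking every case for larger groups was not going to be practical.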

While there were MANY ways to have a shared birthday, for every sized group, there is one and only one way to not have any shared birthdays–they all had to be different.  And computing a probability for a single possibility was a much simpler task.

We imagined an empty room with random people entering one at a time.  The first person entering could have any birthday without matching anyone, so $P \left( \text{no match with 1 person} \right) = \frac{365}{365}$ .  When the second person entered, there were 364 unchosen birthdays remaining, giving $P \left( \text{no match with 2 people} \right) = \frac{365}{365} \cdot \frac{364}{365}$, and $P \left( \text{no match with 3 people} \right) = \frac{365}{365} \cdot \frac{364}{365} \cdot \frac{363}{365}$.  And the complements to each of these are the probabilities we sought:

$P \left( \text{birthday match with 1 person} \right) = 1- \frac{365}{365} = 0$
$P \left( \text{birthday match with 2 people} \right) = 1- \frac{365}{365} \cdot \frac{364}{365} \approx 0.002740$
$P \left( \text{birthday match with 3 people} \right) = 1- \frac{365}{365} \cdot \frac{364}{365} \cdot \frac{363}{365} \approx 0.008204$.

The probabilities were small, but with persistent data entry from a few classmates, they found that the 50% threshold was reached with 23 people.

The hard work was finished, but some wanted to find an easier way to compute the solution.  A few students noticed that the numerator looked like the start of a factorial and revised the equation:

$\begin{matrix} \displaystyle P \left( \text{birthday match with n people} \right ) & = & 1- \frac{365}{365} \cdot \frac{364}{365} \dots \frac{(366-n)}{365} \\ \\ & = & 1- \frac{365 \cdot 364 \dots (366-n)}{365^n} \\ \\ & = & 1- \frac{365\cdot 364 \dots (366-n)\cdot (366-n-1)!}{365^n \cdot (366-n-1)!} \\ \\ & = & 1- \frac{365!}{365^n \cdot (365-n)!} \end{matrix}$

It was much simpler to plug in values to this simplified equation, confirming the earlier result.

Not everyone saw the “complete the factorial” manipulation, but one noticed in the first solution the linear pattern in the numerators of the probability fractions.  While it was easy enough to write a formula for the fractions, he didn’t know an easy way to multiply all the fractions together.  He had experience with Sigma Notation for sums, so I introduced him to Pi Notation–it works exactly the same as Sigma Notation, except Pi multiplies the individual terms instead of adding them.  On the TI-Nspire, the Pi Notation command is available in the template menu or under the calculus menu.
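Both student formulations translate directly into code.  The Python sketch below (standing in for the Nspire CAS, with function names of my own choosing) computes the match probability once as a Pi Notation product and once with the completed factorial, using exact fractions so the two can be compared without rounding.

```python
from fractions import Fraction
from math import factorial, prod


def p_match_product(n):
    """1 minus the product of (366 - k)/365 for k = 1..n (Pi Notation form)."""
    return 1 - prod(Fraction(366 - k, 365) for k in range(1, n + 1))


def p_match_factorial(n):
    """1 - 365! / (365^n * (365 - n)!) (completed-factorial form)."""
    return 1 - Fraction(factorial(365), 365 ** n * factorial(365 - n))


# The two forms are the same number for every group size tried here.
assert all(p_match_product(n) == p_match_factorial(n) for n in range(1, 60))

# Smallest group with at least a 50% chance of a shared birthday.
threshold = next(n for n in range(1, 366)
                 if p_match_product(n) >= Fraction(1, 2))
print(threshold)                                # 23
print(round(float(p_match_product(2)), 6))      # 0.00274
print(round(float(p_match_product(3)), 6))      # 0.008204
```

Like the hand computations, this assumes 365 equally likely birthdays and ignores leap days.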

Conclusion:

I really like two things about this problem:  the extremely counterintuitive result (just 23 people gives a 50% chance of a birthday match) and discovering the multiple ways you could determine the solution.  Between student pattern recognition and my support in formalizing computation suggestions, students learned that translating different recognized patterns into mathematical symbols, supported by technology, can provide different equally valid ways to solve a problem.

Now I can answer the question I posed about the likelihood of me finding a birthday match among my two statistics classes.  The two sections have 15 and 21 students, respectively.  The probability of having at least one match within either section is the complement of not having any matches in both.  Using the Pi Notation version of the solution gives

$1- \left( \prod_{k=1}^{15} \frac{366-k}{365} \right) \cdot \left( \prod_{k=1}^{21} \frac{366-k}{365} \right) \approx 0.584$

I wasn’t guaranteed a match, but the 58.4% probability gave me a decent chance of having a nice punch line to start the class.  It worked pretty well this time!
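The 58.4% figure can be checked in a few lines of Python.  This sketch assumes the question is whether at least one of the two sections, treated separately, contains a match; that reading is what reproduces 58.4% for sections of 15 and 21.  The function name is mine.

```python
from math import prod


def p_no_match(n):
    """Probability that n people all have different birthdays."""
    return prod((365 - k) / 365 for k in range(n))


# No match anywhere means no match in the section of 15
# AND no match in the section of 21.
p_at_least_one = 1 - p_no_match(15) * p_no_match(21)
print(f"{p_at_least_one:.3f}")
```

Treating all 36 students as a single group would be a different question with a noticeably higher probability.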

Extension:

My students are currently working on their first project, determining a way to simulate groups of people entering a room with randomly determined birthdays to see if the 23 person theoretical threshold bears out with experimental results.
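One way such a simulation might look is sketched below in Python.  This is not my students’ project, just an illustration: it assumes 365 equally likely birthdays, and the function names, trial count, and seed are arbitrary choices.

```python
import random


def has_match(group_size, rng):
    """Fill a room with random birthdays and report whether any repeat."""
    birthdays = [rng.randrange(365) for _ in range(group_size)]
    return len(set(birthdays)) < group_size


def estimate_match_probability(group_size, trials=20_000, seed=0):
    """Fraction of simulated rooms containing a shared birthday."""
    rng = random.Random(seed)
    hits = sum(has_match(group_size, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for size in (10, 22, 23, 30):
        estimate = estimate_match_probability(size)
        print(f"{size} people: estimated P(match) = {estimate:.3f}")
```

With enough trials the estimate for 23 people should settle near the theoretical value just above 50%, while 22 people should land just below it.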