# Tag Archives: probability

## Painting and Probability

Here’s a cool probability problem whose opening is accessible to middle and high school students who reason carefully.  It was posted by Alexander Bogomolny on Cut the Knot a week ago.  Following the initial credits, I offer what I think is a cool extension of the problem.  The next day, Mike Lawler tweeted another phenomenal video solution worked out with his boys during Family Math.

Mike’s videos show his boys thinking through simpler discrete cases of the problem and their eventual case-by-case solution to the n×n×n question.  The last video on the page gives a lovely alternative solution that completely bypasses the case-by-case analyses.  This is also “Solution 1” on Alexander’s page.

EXTENDING THE PROBLEM:

When I first encountered the problem, I wanted to think about it before reading any solutions.  As with Mike’s boys, I was surprised early on when the probability for a 2×2×2 cube was $\displaystyle \frac{1}{2}$ and the probability for a 3×3×3 cube was $\displaystyle \frac{1}{3}$.  That was far too pretty to be a coincidence.  My solution exactly mirrored the n×n×n case-by-case analysis in Mike’s videos:  the probability of rolling a painted red face up from a randomly selected smaller cube is $\displaystyle \frac{1}{n}$.

Surely something this simple and clean could be generalized.  Since a cube can be considered a “3-dimensional square”, I re-phrased Alexander’s question into two dimensions.  The trickier part was thinking what it would mean to “roll” a 2-dimensional shape.

The outside of an n×n square is painted red and is chopped into $n^2$ unit squares.  The latter are thoroughly mixed up and put into a bag.  One small square is withdrawn at random from the bag and spun on a flat surface.  What is the probability that the spinner stops with a red side facing you?

Shown below is a 4×4 square, but for a square of any size, there are three possible types of unit square in the bag:  those with 2, 1, or 0 sides painted.  I solved my first variation case by case.  In any n×n square,

• There are 4 corner squares with 2 sides painted.  The probability of picking one of those squares and then spinning a red side is $\displaystyle \frac{4}{n^2} \cdot \frac{2}{4} = \frac{2}{n^2}$.
• There are $4(n-2)$ edge squares not in a corner with 1 side painted.  The probability of picking one of those squares and then spinning a red side is $\displaystyle \frac{4(n-2)}{n^2} \cdot \frac{1}{4} = \frac{n-2}{n^2}$.
• All other squares have 0 sides painted, so the probability of picking one of those squares and then spinning a red side is 0.
• Adding the probabilities for the separate cases gives the total probability: $\displaystyle \frac{2}{n^2}+\frac{n-2}{n^2}=\frac{1}{n}$
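This case-by-case count is easy to sanity-check with a quick simulation (a sketch in Python; the function name is mine, and the “spin” is modeled as picking one of the drawn square’s four edges at random):

```python
import random

def spin_probability(n, trials=100_000, seed=1):
    """Estimate P(red side faces you) for an n×n painted square
    chopped into n² unit squares."""
    rng = random.Random(seed)
    red = 0
    for _ in range(trials):
        # pick a random unit square by its (row, column) position
        r, c = rng.randrange(n), rng.randrange(n)
        # count its painted edges: one for each side on the boundary
        painted = [r == 0, r == n - 1, c == 0, c == n - 1].count(True)
        # "spin": each of the 4 edges is equally likely to face you
        if rng.randrange(4) < painted:
            red += 1
    return red / trials

for n in (2, 3, 4, 10):
    print(n, round(spin_probability(n), 3))  # each estimate is near 1/n
```

Corner squares contribute 2 painted edges, non-corner boundary squares 1, and interior squares 0, exactly as in the bullets above.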

After reading Mike’s and Alexander’s posts, I saw a much easier approach.

• Paint all 4 edges of an n×n square, and divide each painted edge into n painted unit segments.  This creates $4 \cdot n$ total painted unit segments.
• Decompose the original n×n square into $n^2$ unit squares.  Each unit square has 4 edges, giving $4 \cdot n^2$ total edges.
• Because every edge of every unit square is equally likely to be spun, the total probability of randomly selecting a smaller square and spinning a red side is $\displaystyle \frac{4n}{4n^2}=\frac{1}{n}$.

The dimensions of the “square” don’t seem to matter!

WARNING:

Oliver Wendell Holmes noted, “A mind that is stretched by a new experience can never go back to its old dimensions.”  The math after this point has the potential to stretch…

EXTENDING THE PROBLEM MORE:

I now wondered whether this was a general property.

In the 2-dimensional square, 1-dimensional edges were painted and the square was spun to find the probability of a red edge facing you.  With the originally posed cube, 2-dimensional faces were painted and the cube was tossed to find the probability of an upward-facing red face.  These cases suggest that when a cube of some dimension with edge length n is painted, is decomposed into unit cubes of the same dimension, and is spun/tossed to show a cube of one smaller dimension, the probability of getting a painted smaller-dimensional cube is always $\displaystyle \frac{1}{n}$, independent of the dimension occupied by the cube.

Going beyond the experiences of typical middle or high school students, I calculated this probability for a 4-dimensional hypercube (a tesseract).

• The exterior of a tesseract is 8 cubes.  Ignore the philosophical difficulty of what it means to “paint” (perhaps fill?) an entire cube.  After all, we’re already beyond the experience of our 3 dimensions.
• Paint/fill all 8 cubes on the surface of the tesseract, and divide each painted cube into $n^3$ painted unit cubes.  This creates $8 \cdot n^3$ total painted unit cubes.
• Decompose the original tesseract into $n^4$ unit tesseracts.  Each unit tesseract has 8 cubes giving $8 \cdot n^4$ total unit cubes.
• Because every unit cube on every unit tesseract is equally likely to be “rolled”, the total probability of randomly selecting a smaller tesseract and rolling a red cube is $\displaystyle \frac{8n^3}{8n^4}=\frac{1}{n}$.

The probability is independent of dimension!

More formally,

The exterior of a d-dimensional hypercube with edge length n is painted red and is chopped into $n^d$ unit d-dimensional hypercubes.  The latter are put into a bag of sufficient dimension to hold them and thoroughly mixed up.  A unit d-dimensional hypercube is withdrawn at random from the bag and tossed.  The probability that the unit d-dimensional hypercube lands with a red (d-1)-dimensional hypercube showing is $\displaystyle \frac{1}{n}$ .

PROOF:

• The exterior of a d-dimensional hypercube is comprised of 2d hypercubes of dimension (d-1).  Paint/fill all 2d surface hypercubes and divide each painted (d-1)-dimensional hypercube into $n^{d-1}$ painted unit hypercubes.  This creates $2d \cdot n^{d-1}$ total painted unit hypercubes.
• Decompose the original d-dimensional hypercube into $n^d$ unit d-dimensional hypercubes.  Each unit d-dimensional hypercube has 2d surface (d-1)-dimensional hypercubes, giving $2d \cdot n^d$ total surface unit (d-1)-dimensional hypercubes.
• Because every unit (d-1)-dimensional hypercube on the surface of every unit d-dimensional hypercube is equally likely to be “rolled”, the total probability of randomly selecting a unit d-dimensional hypercube and rolling a (d-1)-dimensional red-painted hypercube is $\displaystyle \frac{2d \cdot n^{d-1}}{2d \cdot n^d}=\frac{1}{n}$.
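The counting in this proof can be checked exactly in small cases by enumerating every facet of every unit hypercube (a sketch; the function name is mine):

```python
from itertools import product

def painted_facet_fraction(n, d):
    """Exactly count (painted facets, total facets) of the unit hypercubes
    when the exterior of an n-edged d-dimensional hypercube is painted."""
    painted = total = 0
    # positions of the unit hypercubes run over {0, ..., n-1}^d
    for pos in product(range(n), repeat=d):
        for axis in range(d):
            for side in (0, n - 1):    # the two facets along this axis
                total += 1
                if pos[axis] == side:  # facet lies on the painted exterior
                    painted += 1
    return painted, total

# painted/total = (2d·n^(d-1)) / (2d·n^d) = 1/n in every dimension
print(painted_facet_fraction(4, 2))  # (16, 64)
print(painted_facet_fraction(3, 3))  # (54, 162)
```

For d = 2 this reproduces the square, for d = 3 the original cube, and for d = 4 the tesseract, all with painted fraction 1/n.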

## From Coins to Magic

Here’s a great problem or trick for a class exploration … or magic for parties.

DO THIS YOURSELF.

Grab a small handful of coins (it doesn’t matter how many), randomly flip them onto a flat surface, and count the number of tails.

From the group, randomly pull off into a separate pile a number of coins equal to the number of tails you just counted.  Turn over every coin in this new pile.

Count the number of tails in each pile.

You got the same number both times!

Why?  Marilyn vos Savant posed a similar problem:

Say that a hundred pennies are on a table. Ninety of them are heads. With your eyes closed, can you separate all the coins into two groups so that each group has the same number of tails?

Vos Savant’s solution is to pull any random 10 coins from the 100 and make a second pile.  Turn all the coins in the new pile over, et voilà!  Both piles have an equal number of tails.

While vos Savant’s approach is much more prescriptive than mine, both solutions work.  Every time.  WHY?

THIS IS STRANGE:

You have no idea of the state (heads or tails) of any of the coins you pull into the second pile.  It’s counterintuitive that the two piles could ever contain the same number of tails.

Also, flipping the coins in the new pile seems completely arbitrary, and yet after any random pull & flip, the two resulting piles always hold the same number of tails.

Enter the power (and for young people, the mystery) of algebra to generalize a problem, supporting an argument that holds for all possibilities simultaneously.

HOW IT WORKS:

The first clue to this is the misdirection in vos Savant’s question.  Told that there are 90 heads, you are asked to make the number of tails equal.  In both versions, the number of TAILS in the original pile is the number of coins pulled into the second pile.  This isn’t a coincidence; it’s the key to the solution.

In any pile of randomly flipped coins (they needn’t be pennies, or even all the same denomination), let N be the number of tails.  Create your second pile by pulling N random coins from the initial pile.  Because the coins are randomly selected, you don’t know how many tails are in the new pile, so let that unknown number of tails be X.  That means $0 \le X \le N$, leaving $N-X$ tails in the first pile, and $N-X$ heads among the N coins of the new pile.  (Make sure you understand that last bit!)  So if you flip all the coins in the second pile, those heads become tails, and you are guaranteed exactly $N-X$ tails in both piles.
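If the algebra doesn’t convince you, a quick simulation will (a sketch; the function name is mine):

```python
import random

def coin_trick(num_coins, rng):
    """Run the trick once; return (tails in pile 1, tails in pile 2)."""
    coins = [rng.choice("HT") for _ in range(num_coins)]
    n_tails = coins.count("T")           # N in the argument above
    rng.shuffle(coins)
    pile2 = coins[:n_tails]              # pull N random coins...
    pile1 = coins[n_tails:]
    pile2 = ["H" if c == "T" else "T" for c in pile2]  # ...and flip them all
    return pile1.count("T"), pile2.count("T")

rng = random.Random(2024)
for _ in range(5):
    t1, t2 = coin_trick(rng.randint(5, 50), rng)
    print(t1, t2)  # the two counts always match
```

No matter the handful size or the random pull, the two printed counts agree every time.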

Cool facts:

• You can’t say with certainty how many tails will be in both piles, but you know they will be the same.
• The total number of coins you start with is completely irrelevant.
• While the two given versions of the problem make piles with equal numbers of tails, this “trick” can balance heads or tails.  To balance heads instead, pull from the initial coins into a second pile a number of coins equal to the number of heads.  When you flip all the coins in the second pile, both piles will now contain the same number of heads.

A PARTY WONDER or SOLID PROBLEM FOR AN ALGEBRA CLASS:

If you work on your showmanship, you can baffle others with this.  For my middle school daughter, I counted off the “leave alone” pile and then flipped the second pile.  I also let her flip the initial set of coins and tell me each time whether she wanted me to get equal numbers of heads or tails.  I looked away as she shuffled the coins and pulled off the requisite number of coins without looking.

She’s figured out HOW I do it, but as she is just starting algebra, she doesn’t yet have the abstraction to fully generalize the solution.  She’ll get there.

I could see this becoming a fun data-gathering project for an algebra class.  It would be cool to see how someone approaches this with a group of students.

## PowerBall Redux

Donate to a charity instead.  Let me explain.
The majority of responses to my PowerBall description/warnings yesterday have been, “If you don’t play, you can’t win.”  Unfortunately, I know many, many people are buying far more lottery tickets than they should.

OK.  For almost everyone, there’s little harm in spending $2 on a ticket for the entertainment, but don’t expect to win, and don’t buy multiple tickets unless you can afford to do without every dollar you spend.  I worry about those who are “investing” tens or hundreds of dollars in any lottery.

Two of my school colleagues captured the idea of a lottery yesterday with their analogies.

Steve:  Suppose you go to the beach and grab a handful of sand and bring it back to your house.  And you do that every single day.  Then your odds of winning the PowerBall are still slightly worse than picking out one particular grain of sand from all the sand you accumulated over an entire year.

Or, more simply put from the perspective of a lottery official,

Patrick:  Here’s our idea.  You guys all throw your money in a big pile.  Then, after we take some of it, we’ll give the pile to just one of you.

WHY YOU SHOULDN’T BUY MULTIPLE TICKETS:

For perspective, a football field is 120 yards long, or 703.6 dollar bills long using the logic of my last post.  Rounding up, that many dollars would buy you 352 PowerBall tickets.  That means investing $704 (a football field of dollar bills laid end to end) buys you just 352 chances among the 10.5 coast-to-coast traverses of the entire United States.  There’s going to be an incredibly large number of disappointed people tomorrow.
MORAL:  Even an incredibly large multiple of a less-than-microscopic chance is still a less-than-microscopic chance.
BETTER IDEA:  Assume you have the resources and are willing to part with tens or hundreds of dollars for no likelihood of tangible personal gain.  Using the $704 football example, buy 2 tickets and donate the other $700 to charity.  You’ll do much more good.

## PowerBall Math

Given the record size and mania surrounding the current PowerBall Lottery, I thought some of you might be interested in bringing that game into perspective.  This could be an interesting application with some teachers and students.

It certainly is entertaining for many to dream about what you would do if you happened to be lucky enough to win an astronomical lottery.  And lottery vendors are quick to note that your dreams can’t come true if you don’t play.  Nice advertising.  I’ll let the numbers speak to the veracity of the Lottery’s encouragement.

PowerBall is played by picking any 5 different numbers between 1 & 69, and then one PowerBall number between 1 & 26.  So there are $\binom{69}{5} \cdot 26 = 292,201,338$ outcomes for this game.  Unfortunately, humans have a particularly difficult time understanding extremely large numbers, so I offer an analogy to bring it a little into perspective.

• The horizontal width of the United States is generally reported to be 2680 miles, and a U.S. dollar bill is 6.14 inches wide.  That means the U.S. is approximately 27,655,505 dollar bills wide.
• If I have 292,201,338 dollar bills (one for every possible PowerBall outcome), I could make a line of dollar bills placed end-to-end from the U.S. East Coast all the way to the West Coast, back to the East, back to the West, and so forth, passing back and forth between the two coasts just over 10.5 times.
• Now imagine that exactly one of those dollar bills was replaced with a replica dollar bill made from gold colored paper.

Your chances of winning the PowerBall lottery are the same as randomly selecting that single gold note from all of those dollar bills laid end-to-end and crossing the entire breadth of the United States 10.5 times.
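The numbers in that analogy are easy to verify (a sketch; it assumes the same 2680-mile width and 6.14-inch bill length used above):

```python
from math import comb

# pick 5 of 69 numbers, then 1 of 26 PowerBalls
outcomes = comb(69, 5) * 26
print(outcomes)  # 292201338

# width of the U.S. measured in dollar bills laid end to end
inches_across_us = 2680 * 5280 * 12        # miles -> feet -> inches
bills_across_us = inches_across_us / 6.14  # each bill is 6.14 inches wide
print(round(bills_across_us))              # 27655505

# how many coast-to-coast crossings the outcome line makes
print(round(outcomes / bills_across_us, 2))  # 10.57
```

That final figure is the “just over 10.5 times” crossing count in the bullets.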

Dreaming is fun, but how likely is this particular dream to become real?

Play the lottery if doing so is entertaining to you, but like going to the movie theater, don’t expect to get any money back in return.

## Mistakes are Good

Confession #1:  My answers on my last post were WRONG.

I briefly thought about taking that post down, but discarded that idea when I thought about the reality that almost all published mathematics is polished, cleaned, and optimized.  Many students struggle with mathematics under the misconception that their first attempts at any topic should be as polished as what they read in published sources.

While not precisely from the same perspective, Dan Teague recently wrote an excellent, short piece of advice to new teachers on NCTM’s blog entitled Demonstrating Competence by Making Mistakes.  I argue Dan’s advice actually applies to all teachers, so in the spirit of showing how to stick with a problem rather than just walking away saying “I was wrong”, I’m going to keep my original post up, add an advisory note at the start about the error, and show below how I corrected my error.

Confession #2:  My approach was a much longer and far less elegant solution than the identical approaches offered by a comment by “P” on my last post and the solution offered on FiveThirtyEight.  Rather than just accepting the alternative solution, as too many students are wont to do, I acknowledged the more efficient approach of others before proceeding to find a way to get the answer through my initial idea.

I’ll also admit that I didn’t immediately see the simple approach to the answer and rushed my post in the time I had available to get it up before the answer went live on FiveThirtyEight.

GENERAL STRATEGY and GOALS:

1-Use a PDF:  The original FiveThirtyEight post asked for the expected time before the siblings simultaneously finished their tasks.  I interpreted this as an expected value, and I knew how to compute the expected value of a random variable from its pdf.  All I needed were the potential wait times, t, and their corresponding probabilities.  My approach was solid, but a few of my computations were off.

2-Use Self-Similarity:  I don’t see many people employing the self-similarity tactic I used in my initial solution.  Resolving my initial solution would allow me to continue using what I consider a pretty elegant strategy for handling cumbersome infinite sums.

A CORRECTED SOLUTION:

Stage 1:  My table for the distribution of initial choices was correct, as were my conclusions about the probability and expected time if they chose the same initial app. My first mistake was in my calculation of the expected time if they did not choose the same initial app.  The 20 numbers in blue above represent that sample space.  Notice that there are 8 times where one sibling chose a 5-minute app, leaving 6 other times where one sibling chose a 4-minute app while the other chose something shorter.  Similarly, there are 4 choices of an at most 3-minute app, and 2 choices of an at most 2-minute app.  So the expected length of time spent by the longer app if the same was not chosen for both is $E(Round1) = \frac{1}{20}*(8*5+6*4+4*3+2*2)=4$ minutes,

a notably longer time than I initially reported.

For the initial app choice, there is a $\frac{1}{5}$ chance they choose the same app for an average time of 3 minutes, and a $\frac{4}{5}$ chance they choose different apps for an average time of 4 minutes.

Stage 2:  My biggest error was a rushed assumption that all of the entries I gave in the Round 2 table were equally likely.  That is clearly false as you can see from Table 1 above.  There are only two instances of a time difference of 4, while there are eight instances of a time difference of 1.  A correct solution using my approach needs to account for these varied probabilities.  Here is a revised version of Table 2 with these probabilities included. Conveniently–as I had noted without full realization in my last post–the revised Table 2 still shows the distribution for the 2nd and all future potential rounds until the siblings finally align, including the probabilities.  This proved to be a critical feature of the problem.

Another oversight was not fully recognizing which events would contribute to increasing the time before parity.  The yellow highlighted cells in Table 2 are those for which the next app choice was longer than the current time difference, and any of these would increase the length of a trial.

I was initially correct in concluding there was a $\frac{1}{5}$ probability of the second app choice achieving a simultaneous finish and that this would not result in any additional total time.  I missed the fact that the six non-highlighted values also did not result in additional time and that there was a $\frac{1}{5}$ chance of this happening.

That leaves a $\frac{3}{5}$ chance of the trial time extending by selecting one of the highlighted events.  If that happens, the expected time the trial would continue is $\displaystyle \frac{4*4+(4+3)*3+(4+3+2)*2+(4+3+2+1)*1}{4+(4+3)+(4+3+2)+(4+3+2+1)}=\frac{13}{6}$ minutes.

Iterating:  So now I recognized there were 3 potential outcomes at Stage 2–a $\frac{1}{5}$ chance of matching and ending, a $\frac{1}{5}$ chance of not matching but not adding time, and a $\frac{3}{5}$ chance of not matching and adding an average $\frac{13}{6}$ minutes.  Conveniently, the last two possibilities still combined to recreate perfectly the outcomes and probabilities of the original Stage 2, creating a self-similar, pseudo-fractal situation.  Here’s the revised flowchart for time. Invoking the similarity, if there were T minutes remaining after arriving at Stage 2, then there was a $\frac{1}{5}$ chance of adding 0 minutes, a $\frac{1}{5}$ chance of remaining at T minutes, and a $\frac{3}{5}$ chance of adding $\frac{13}{6}$ minutes–that is being at $T+\frac{13}{6}$ minutes.  Equating all of this allows me to solve for T. $T=\frac{1}{5}*0+\frac{1}{5}*T+\frac{3}{5}*\left( T+\frac{13}{6} \right) \longrightarrow T=6.5$ minutes

Time Solution:  As noted above, at the start, there was a $\frac{1}{5}$ chance of immediately matching with an average 3 minutes, and there was a $\frac{4}{5}$ chance of not matching while using an average 4 minutes.  I just showed that from this latter stage, one would expect to need to use an additional mean 6.5 minutes for the siblings to end simultaneously, for a mean total of 10.5 minutes.  That means the overall expected time spent is

Total Expected Time $=\frac{1}{5}*3 + \frac{4}{5}*10.5 = 9$ minutes.
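The corrected bookkeeping can be double-checked with exact rational arithmetic (a sketch using Python’s fractions module; variable names are mine, and each quantity mirrors a step above):

```python
from fractions import Fraction as F

# Stage 1: expected longer-app time given the initial choices differ
round1_mismatch = F(8*5 + 6*4 + 4*3 + 2*2, 20)        # = 4 minutes

# expected added time per extending (highlighted) choice
extend = F(4*4 + (4+3)*3 + (4+3+2)*2 + (4+3+2+1)*1,
           4 + (4+3) + (4+3+2) + (4+3+2+1))           # = 13/6 minutes

# self-similarity: T = (1/5)*0 + (1/5)*T + (3/5)*(T + 13/6)
T = (F(3, 5) * extend) / (1 - F(1, 5) - F(3, 5))      # = 13/2 = 6.5 minutes

total = F(1, 5) * 3 + F(4, 5) * (round1_mismatch + T)
print(total)  # 9
```

Every intermediate value lands exactly on the fractions quoted in the text, with no floating-point fuzz.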

Number of Rounds Solution:  My initial computation of the number of rounds was actually correct–despite the comment from “P” in my last post–but I think the explanation could have been clearer.  I’ll try again.

One round is obviously required for the first choice, and in the $\frac{4}{5}$ chance the siblings don’t match, let N be the average number of rounds remaining.  In Stage 2, there’s a $\frac{1}{5}$ chance the trial will end with the next choice, and a $\frac{4}{5}$ chance that, after that round, there will still be an average of N rounds remaining.  This second situation is correct because both the no-time-added and time-added possibilities combine to reset Table 2 with a combined probability of $\frac{4}{5}$.  As before, I invoke self-similarity to find N. $N = \frac{1}{5}*1 + \frac{4}{5}*(1+N) \longrightarrow N=5$

Therefore, the expected number of rounds is $\frac{1}{5}*1 + \frac{4}{5}*5 = 4.2$ rounds.

It would be cool if someone could confirm this prediction by simulation.
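Taking up that invitation, here is a Monte Carlo sketch in Python (names are mine).  It tracks each sibling’s cumulative finish times and stops the first moment both are between tasks, estimating the expected total time:

```python
import random

def one_trial(rng):
    """Return the time until both siblings are simultaneously between tasks."""
    a = rng.randint(1, 5)   # your cumulative finish time
    b = rng.randint(1, 5)   # your sister's cumulative finish time
    while a != b:
        # whoever is free first starts another 1-5 minute task
        if a < b:
            a += rng.randint(1, 5)
        else:
            b += rng.randint(1, 5)
    return a

rng = random.Random(538)
trials = 200_000
mean_time = sum(one_trial(rng) for _ in range(trials)) / trials
print(round(mean_time, 2))  # hovers near the predicted 9 minutes
```

With a couple hundred thousand trials, the sample mean settles very close to the 9-minute answer derived above.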

CONCLUSION:

I corrected my work and found the exact solution proposed by others and simulated by Steve!  Even better, I have shown my approach works:  while notably less elegant, this expected value problem can be solved by invoking the definition of expected value directly.

Best of all, I learned from a mistake and didn’t give up on a problem.  Now that’s the real lesson I hope all of my students get.

Happy New Year, everyone!

## Great Probability Problems

UPDATE:  Unfortunately, there are a couple errors in my computations below that I found after this post went live.  In my next post, Mistakes are Good, I fix those errors and reflect on the process of learning from them.

ORIGINAL POST:

A post last week to the AP Statistics Teacher Community by David Bock alerted me to the weekly Puzzler on Nate Silver’s new Web site, http://fivethirtyeight.com/.  As David noted, with their focus on probability, this new feature offers some great possibilities for AP Statistics probability and simulation.

I describe below FiveThirtyEight’s first three Puzzlers along with a potential solution to the last one.  If you’re searching for some great problems for your classes or challenges for some, try these out!

THE FIRST THREE PUZZLERS:

The first Puzzler asked a variation on a great engineering question:

You work for a tech firm developing the newest smartphone that supposedly can survive falls from great heights. Your firm wants to advertise the maximum height from which the phone can be dropped without breaking.

You are given two of the smartphones and access to a 100-story tower from which you can drop either phone from whatever story you want. If it doesn’t break when it falls, you can retrieve it and use it for future drops. But if it breaks, you don’t get a replacement phone.

Using the two phones, what is the minimum number of drops you need to ensure that you can determine exactly the highest story from which a dropped phone does not break? (Assume you know that it breaks when dropped from the very top.) What if, instead, the tower were 1,000 stories high?

The second Puzzler investigated random geyser eruptions:

You arrive at the beautiful Three Geysers National Park. You read a placard explaining that the three eponymous geysers — creatively named A, B and C — erupt at intervals of precisely two hours, four hours and six hours, respectively. However, you just got there, so you have no idea how the three eruptions are staggered. Assuming they each started erupting at some independently random point in history, what are the probabilities that A, B and C, respectively, will be the first to erupt after your arrival?

Both very cool problems with solutions on the FiveThirtyEight site.  The current Puzzler talked about siblings playing with new phone apps.

You’ve just finished unwrapping your holiday presents. You and your sister got brand-new smartphones, opening them at the same moment. You immediately both start doing important tasks on the Internet, and each task you do takes one to five minutes. (All tasks take exactly one, two, three, four or five minutes, with an equal probability of each). After each task, you have a brief moment of clarity. During these, you remember that you and your sister are supposed to join the rest of the family for dinner and that you promised each other you’d arrive together. You ask if your sister is ready to eat, but if she is still in the middle of a task, she asks for time to finish it. In that case, you now have time to kill, so you start a new task (again, it will take one, two, three, four or five minutes, exactly, with an equal probability of each). If she asks you if it’s time for dinner while you’re still busy, you ask for time to finish up and she starts a new task and so on. From the moment you first open your gifts, how long on average does it take for both of you to be between tasks at the same time so you can finally eat? (You can assume the “moments of clarity” are so brief as to take no measurable time at all.)

SOLVING THE CURRENT PUZZLER:

Before I started, I saw Nick Brown’s interesting Tweet of his simulation. If Nick’s correct, it looks like a mode of 5 minutes and an understandable right skew.  I approached the solution by first considering the distribution of initial random app choices. There is a $\displaystyle \frac{5}{25}$ chance the siblings choose the same app and head to dinner after the first round.  The expected length of that round is $\frac{1}{5} \cdot \left( 1+2+3+4+5 \right) = 3$ minutes.

That means there is a $\displaystyle \frac{4}{5}$ chance different length apps are chosen with time differences between 1 and 4 minutes.  In the case of unequal apps, the average time spent before the shorter app finishes is $\frac{1}{25} \cdot \left( 8*1+6*2+4*3+2*4 \right) = 1.6$ minutes.

It doesn’t matter which sibling chose the shorter app.  That sibling chooses next with distribution as follows. While the distributions are different, conveniently, there is still a time difference between 1 and 4 minutes when the total times aren’t equal.  That means the second table shows the distribution for the 2nd and all future potential rounds until the siblings finally align.  While this problem has the potential to extend for quite some time, this adds a nice pseudo-fractal self-similarity to the scenario.

As noted, there is a $\displaystyle \frac{4}{20}=\frac{1}{5}$ chance they complete their apps on any round after the first, and this would not add any additional time to the total as the sibling making the choice at this time would have initially chosen the shorter total app time(s).  Each round after the first will take an expected time of $\frac{1}{20} \cdot \left( 7*1+5*2+3*3+1*4 \right) = 1.5$ minutes.

The only remaining question is the expected number of rounds of app choices the siblings will take if they don’t align on their first choice.  This is where I invoked self-similarity.

In the initial choice there was a $\frac{4}{5}$ chance one sibling would take an average 1.6 minutes using a shorter app than the other.  From there, some unknown average N choices remain.  There is a $\frac{1}{5}$ chance the choosing sibling ends the experiment with no additional time, and a $\frac{4}{5}$ chance s/he takes an average 1.5 minutes to end up back at the Table 2 distribution, still needing an average N choices to finish the experiment (the pseudo-fractal self-similarity connection).  All of this is simulated in the flowchart below. Recognizing the self-similarity allows me to solve for N. $\displaystyle N = \frac{1}{5} \cdot 1 + \frac{4}{5} \cdot (1+N) \longrightarrow N=5$

Number of Rounds – Starting from the beginning, there is a $\frac{1}{5}$ chance of ending in 1 round and a $\frac{4}{5}$ chance of ending in an average 5 rounds, so the expected number of rounds of app choices before the siblings simultaneously end is $\frac{1}{5} *1 + \frac{4}{5}*5=4.2$ rounds

Time until Eating – In the first choice, there is a $\frac{1}{5}$ chance of ending in 3 minutes.  If that doesn’t happen, there is a subsequent $\frac{1}{5}$ chance of ending with the second choice with no additional time.  If neither of those events happen, there will be 1.6 minutes on the first choice plus an average 5 more rounds, each taking an average 1.5 minutes, for a total average $1.6+5*1.5=9.1$ minutes.  So the total average time until both siblings finish simultaneously will be $\frac{1}{5}*3+\frac{4}{5}*9.1 = 7.88$ minutes

CONCLUSION:

My 7.88 minute mean is reasonably to the right of Nick’s 5 minute mode shown above.  We’ll see tomorrow if I match the FiveThirtyEight solution.

Anyone else want to give it a go?  I’d love to hear other approaches.

## Marilyn vos Savant Conditional Probability Follow Up

In the Marilyn vos Savant problem I posted yesterday, I focused on the subtle shift from simple to conditional probability the writer of the question appeared to miss.  Two of my students took a different approach.

The majority of my students, typical of AP Statistics students’ tendencies very early in the course, tried to use a “wall of words” to explain away the discrepancy rather than providing quantitative evidence.  But two fully embraced the probabilities and developed the following probability tree to incorporate all of the given information.  Each branch shows the probability of a short or long straw given the present state of the system.  Notice that it includes both of the apparently confounding 1/3 and 1/2 probabilities.  The uncontested probability of the first person drawing the short straw is 1/4.

The probability of the second person drawing it is then (3/4)(1/3) = 1/4, exactly as expected.  The probabilities of the 3rd and 4th people can be similarly computed to arrive at the same 1/4 final result.

My students argued essentially that the writer was correct in saying the probability of the second person having the short straw was 1/3 in the instant after it was revealed that the first person didn’t have the straw, but that they had forgotten to incorporate the probability of arriving in that state.  When you use all of the information, the probability of each person receiving the short straw remains at 1/4, just as expected.
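My students’ tree amounts to a chain-rule computation that is easy to script with exact fractions (a sketch; the function name and argument convention are mine):

```python
from fractions import Fraction as F

def p_short_straw(k, people=4):
    """P(person k draws the short straw) when straws are drawn in order:
    everyone before person k must miss, then person k must draw it."""
    p = F(1)
    for i in range(k - 1):                # persons 1..k-1 each miss
        remaining = people - i
        p *= F(remaining - 1, remaining)  # e.g., 3/4, then 2/3, then 1/2
    remaining = people - (k - 1)
    return p * F(1, remaining)            # person k draws the short straw

print([p_short_straw(k) for k in (1, 2, 3, 4)])  # all equal 1/4
```

Each conditional factor (the 1/3, the 1/2) appears in the product, yet every position comes out to exactly 1/4.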

## Marilyn vos Savant and Conditional Probability

The following question appeared in the “Ask Marilyn” column in the August 16, 2015 issue of Parade magazine.  The writer seems stuck between two probabilities. (Click here for a cleaned-up online version if you don’t like the newspaper look.)

I just pitched this question to my statistics class (we start the year with a probability unit).  I thought some of you might like it for your classes, too.

I asked them to do two things.  1) Answer the writer’s question, AND 2) Use precise probability terminology to identify the source of the writer’s conundrum.  Can you answer both before reading further?

Very briefly, the writer is correct in both situations.  If each of the four people draws a random straw, there is absolutely a 1 in 4 chance of each drawing the short straw.  Think about shuffling the straws and “dealing” one to each person, much like shuffling a deck of cards and dealing out all of the cards.  Any given straw or card is equally likely to land in any player’s hand.

Now let the first person look at his or her straw.  It is either short or not.  The author is then correct in claiming that the probability of the others holding the short straw is now 0 (if the first person found the short straw) or 1/3 (if the first person did not).  And this is precisely the source of the writer’s conundrum.  She’s actually asking two different questions but thinks she’s asking only one.

The 1/4 result is from a pure, simple probability scenario.  There are four possible equally-likely locations for the short straw.

The 0 and 1/3 results happen only after the first (or any other) person looks at his or her straw.  At that point, the problem shifts from simple probability to conditional probability.  After observing a straw, the question shifts to determining the probability that one of the remaining people has the short straw GIVEN that you know the result of one person’s draw.

So, the writer was correct in all of her claims; she just didn’t realize she was asking two fundamentally different questions.  That’s a pretty excusable lapse, in my opinion.  Slips into conditional probability are often missed.

Perhaps the most famous of these misses is the solution to the Monty Hall scenario that vos Savant famously posited years ago.  What I particularly love about this is the number of very-well-educated mathematicians who missed the conditional and wrote flaming retorts to vos Savant brandishing their PhDs and ultimately found themselves publicly supporting errant conclusions.  You can read the original question, errant responses, and vos Savant’s very clear explanation here.
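For anyone who would rather convince a skeptic empirically than by argument, a quick Monte Carlo simulation settles the Monty Hall question.  This Python sketch (the function name, seed, and trial count are my own choices) plays the game many times under each strategy:

```python
import random

def monty_hall(switch, trials=100_000, seed=42):
    """Simulate the Monty Hall game; return the empirical win rate."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)               # door hiding the car
        pick = rng.randrange(3)              # contestant's first choice
        # Host opens a door that is neither the pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:                           # switch to the lone remaining door
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

print(monty_hall(switch=False))  # hovers near 1/3
print(monty_hall(switch=True))   # hovers near 2/3
```

Switching wins about twice as often as staying, exactly as vos Savant claimed.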

CONCLUSION:

Probability is subtle and catches all of us at some point.  Even so, the careful thinking required to dissect and answer subtle probability questions is arguably one of the best exercises of logical reasoning around.

RANDOM(?) CONNECTION:

As a completely different connection, I think this is very much like Heisenberg’s Uncertainty Principle.  Until the first straw is observed, the short straw really could (does?) exist in all hands simultaneously.  Observing the system (looking at one person’s straw) permanently changes its state, forever bifurcating the system into one of two potential future states:  the short straw is found in the first hand, or it is not.

CORRECTION (3 hours after posting):

I knew I was likely to overstate or misname something in my final connection.  Thanks to Mike Lawler (@mikeandallie) for a quick correction via Twitter.  I should have called this quantum superposition and not the uncertainty principle.  Thanks so much, Mike.

## Innumeracy and Sharks

Here’s a brief snippet from a conversation about the recent spate of shark attacks in North Carolina as I heard yesterday morning (approx 6AM, 7/4/15) on CNN.

George Burgess (Director, Florida Program for Shark Research):  “One thing is going to happen and that is there are going to be more [shark] attacks year in and year out simply because the human population continues to rise and with it a concurrent interest in aquatic recreation.  So one of the few things I, as a scientist, can predict with some certainty is more attacks in the future because there’s more people.”

Alison Kosik (CNN anchor):  “That is scary and I just started surfing so I may dial that back a bit.”

This marks another great teaching moment spinning out of innumeracy in the media.  I plan to drop just those two paragraphs on my classes when school restarts this fall and open the discussion.  I wonder how many will question the implied but irrational probability in Kosik’s reply.

TOO MUCH COVERAGE?

Burgess argued elsewhere that

Increased documentation of the incidents may also make people believe attacks are more prevalent.  (Source here.)

It’s certainly plausible that some people think shark attacks are more common than they really are.  But that raises the question of just how nervous a swimmer should be.

MEDIA MANIPULATION

CNN–like almost all mass media, but not nearly as bad as some–shamelessly hyper-focuses on catchy news banners, and what could be catchier than something like “Shark attacks spike just as tourists crowd beaches on busy July 4th weekend”?  Was Kosik reading a prepared script that distorts the underlying probability, or was she showing signs of innumeracy?  I hope it’s not both, but neither is good.

IRRATIONAL PROBABILITY

So just how uncommon is a shark attack?  In a few minutes of Web research, I found that there were 25 shark attacks in North Carolina from 2005-2014.  There was at least one every year, with a maximum of 5 attacks in 2010 (source).  So this year’s 7 attacks is certainly unusually high compared to the recent annual average of 2.5, but John Allen Paulos reminded us in Innumeracy that a multiple of a very small probability (in this case, about 3 times) is still a very small probability.

In another place, Burgess noted

“It’s amazing, given the billions of hours humans spend in the water, how uncommon attacks are,” Burgess said, “but that doesn’t make you feel better if you’re one of them.”  (Source here.)

18.9% of NC visitors went to the beach (source).  In 2012, there were approximately 45.4 million visitors to NC (source).  To overestimate the number of beachgoers, let’s say 19% of 46 million visitors, or about 8.7 million people, went to NC beaches.  Seriously underestimating the number of beachgoers who enter the ocean, assume only 1 in 8 of them entered the ocean.  That’s still a very small 7 attacks out of roughly 1 million people in the ocean.  Because beachgoers almost always enter the ocean at some point (in my experience), the average likely is much closer to 2 or fewer attacks per million.
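The back-of-the-envelope arithmetic above is easy to mirror in a few lines of Python (every input below is a rounded estimate from this paragraph, not new data):

```python
visitors = 46_000_000            # rounded-up 2012 NC visitors
beach_share = 0.19               # rounded-up fraction visiting a beach
ocean_share = 1 / 8              # deliberately low fraction entering the ocean
attacks = 7                      # 2015 NC attacks at the time of writing

beachgoers = visitors * beach_share
in_ocean = beachgoers * ocean_share
rate = attacks / (in_ocean / 1_000_000)   # attacks per million ocean-goers

print(round(beachgoers), round(in_ocean), round(rate, 1))
```

That works out to roughly 6-7 attacks per million ocean-goers even under these deliberately pessimistic estimates, and under 1 per million if nearly all 8.7 million beachgoers entered the water.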

To put that in perspective, 110,406 people were injured in car accidents in 2012 in NC (source).  The probability of getting injured driving to the beach is many orders of magnitude larger than the likelihood of ever being attacked by a shark.

Alison Kosik should keep up her surfing.

If you made it to a NC beach safely, enjoy the swim.  It’s safer than your trip there was or your trip home is going to be.  But even those trips are reasonably safe.

I certainly am not diminishing the anguish of accident victims (shark, auto, or otherwise); accidents happen, but don’t make too much of any one of them.  Be intelligent, be reasonable, and enjoy life.

In the end, I hope my students learn to question facts and probabilities.  I hope they always question “How reasonable is what I’m being told?”

Here’s a much more balanced article on shark attacks from NPR:
Don’t Blame the Sharks For ‘Perfect Storm’ of Attacks In North Carolina.

Book suggestions:
1)  Innumeracy, John Allen Paulos
2) Predictably Irrational, Dan Ariely

## CAS and Normal Probability Distributions

My presentation this past Saturday at the 2015 T^3 International Conference in Dallas, TX was on the underappreciated applicability of CAS to statistics.  This post covers some of what I presented there from my first year teaching AP Statistics.

MOVING PAST OUTDATED PEDAGOGY

It’s been decades since we’ve required students to use tables of values to compute trigonometric and radical values by hand.  It seems odd to me that we continue to do exactly that today in so many statistics classes, including AP Statistics.  While the College Board permits statistics-capable calculators, it still provides probability tables with every exam.  The messaging is clear:  it is still “acceptable” to teach statistics using outdated probability tables.

In this, my first year teaching AP Statistics, I decided it was time for my students and me to break completely from this lingering past.  My statistics classes this year have been 100% software-enabled.  Not one of my students has been required to use, or even see, a table of probability values.

My classes also have been fortunate to have complete CAS availability on their laptops.  My school’s math department deliberately adopted the TI-Nspire platform in part because that software looks and operates exactly the same on tablet, computer, and handheld platforms.  We primarily use the computer-based version for learning because of the speed and visualization of the large “real estate” there.  We are shifting to school-owned handhelds in our last month before the AP Exam to gain practice on the platform required on the AP.

The remainder of this post shares ways my students and I have learned to apply the TI-Nspire CAS to some statistical questions around normal distributions.

FINDING NORMAL AREAS AND PROBABILITIES

Assume a manufacturer makes golf balls whose distances traveled under identical testing conditions are approximately normally distributed with a mean of 295 yards and a standard deviation of 3 yards.  What is the probability that one such randomly selected ball travels more than 300 yards?

Traditional statistics courses teach students to transform the 300 yards into a z-score to look up in a probability table.  That approach obviously works, but with appropriate technology, I believe there is far less need to use or even compute z-scores, in much the same way that converting logarithms to base-10 or base-e in order to use logarithmic tables is anachronistic on modern scientific calculators.

TI calculators and other technologies allow computations of non-standard normal curves.  Notice the Nspire CAS calculation below the curve uses both bounds of the area of interest along with the mean and standard deviation of the distribution to accomplish the computation in a single step.  So the probability of a randomly selected ball from the population described above going more than 300 yards is 4.779%.
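For readers without an Nspire handy, the same single-step computation can be sketched in Python using only the standard library's error function (the `normCdf` name here simply mimics the Nspire command; it is not a Python built-in):

```python
from math import erf, sqrt

def normCdf(lower, upper, mu=0.0, sigma=1.0):
    """P(lower < X < upper) for X ~ Normal(mu, sigma)."""
    Phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))
    return Phi((upper - mu) / sigma) - Phi((lower - mu) / sigma)

# Golf balls: mean 295 yards, sd 3 yards; probability of exceeding 300 yards.
p = normCdf(300, float("inf"), mu=295, sigma=3)
print(round(p, 5))  # 0.04779
```

As with the Nspire command, both bounds and the distribution's parameters go in, and the probability comes out in one step, with no z-score ever appearing.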

GOING BACKWARDS

Now assume the manufacturing process can control the mean distance traveled.  What mean should it use so that no more than 1% of the golf balls travel more than 300 yards?

Depending on the available normal probability tables, the traditional approach to this problem is again to work with z-scores.  A modified CAS version of this is shown below.  Therefore, the manufacturer should produce a ball that travels a mean of 293.021 yards under the given conditions.
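Outside a CAS, the same "backwards" question can be answered by inverting the normal cdf numerically.  In this Python sketch (the bisection-based `inv_Phi` is my own stand-in for the Nspire's invNorm()), we solve P(X > 300) = 0.01 for the mean with the standard deviation fixed at 3:

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def inv_Phi(p, lo=-10.0, hi=10.0):
    """Invert Phi by bisection; far more precision than we need here."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# P(X > 300) = 0.01 means (300 - mu)/3 is the 99th-percentile z-score.
mu = 300 - 3 * inv_Phi(0.99)
print(round(mu, 3))  # 293.021
```

The result matches the z-score derivation: the mean must drop to about 293.021 yards.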

The approach is legitimate, and I shared it with my students.  Several of them ultimately chose a more efficient single-line command.  But remember that the invNorm() and normCdf() commands on the Nspire are themselves functions, so their internal parameters are available to solve commands.  A pure CAS “forward solution,” still incorporating only the normCdf() command to which my students were first introduced, makes use of this to determine the missing center.

DIFFERENTIATING INSTRUCTION

While calculus techniques definitely are NOT part of the AP Statistics curriculum, I do have several students jointly enrolled in various calculus classes.  Some of them astutely noted the similarity between the area-based arguments above and the area-under-a-curve techniques they were learning in their calculus classes.  Never being one to pass up a teaching moment, I pulled a few of these students aside to show them that the previous solutions also could have been derived via integration.

I can’t recall any instances of my students actually employing integrals to solve statistics problems this year, but just having the connection verified completely solidified the mathematics they were learning in my class.
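For those calculus students, the golf-ball probability is literally an area under the normal pdf, and even crude numeric integration recovers it.  Here's a Python sketch using the trapezoid rule, with a far-right cutoff standing in for infinity (the cutoff and step count are my own choices):

```python
from math import exp, pi, sqrt

def npdf(x, mu=295.0, sigma=3.0):
    """Normal probability density function."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

# Integrate the pdf from 300 out to mu + 10*sigma; the tail beyond is negligible.
a, b, n = 300.0, 295.0 + 30.0, 100_000
h = (b - a) / n
area = h * ((npdf(a) + npdf(b)) / 2 + sum(npdf(a + i * h) for i in range(1, n)))
print(round(area, 5))  # 0.04779
```

The integral agrees with the earlier cdf-based probability to five decimal places, which is exactly the connection those students spotted.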

CONFIDENCE INTERVALS

The mean lead level of 35 crows in a random sample from a region was 4.90 ppm and the standard deviation was 1.12 ppm.  Construct a 95 percent confidence interval for the mean lead level of crows in the region.

Many students–mine included–have difficulty comprehending confidence intervals and resort to “black box” confidence interval tools available in most (all?) statistics-capable calculators, including the TI-Nspire.

Since n is greater than 30, I can compute the requested z-interval by filling in just four entries in a pop-up window and pressing Enter.

Convenient, for sure, but this approach doesn’t help confused students understand that the confidence interval is nothing more than the bounds of the middle 95% of the normal pdf described in the problem, a fact crystallized by applying the tools the students have been using for weeks by that point in the course.  Notice in the solve+normCdf() combination commands that the unknown this time was a bound, not the mean as in the previous example.
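To make the "middle 95%" idea concrete without any black box, here is a Python sketch of the z-interval computation for the crow data (the bisection-based `inv_Phi` is my own substitute for invNorm()):

```python
from math import erf, sqrt

xbar, s, n = 4.90, 1.12, 35      # sample mean, sd, and size from the problem

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def inv_Phi(p, lo=-10.0, hi=10.0):
    """Invert Phi by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z_star = inv_Phi(0.975)          # about 1.960 for a 95% interval
half = z_star * s / sqrt(n)      # margin of error, using s/sqrt(n)
print(round(xbar - half, 3), round(xbar + half, 3))  # 4.529 5.271
```

The two printed numbers are precisely the bounds of the middle 95% of the sampling distribution of the mean.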

EXTENDING THE RULE OF FOUR

I’ve used the “Rule of Four” in every math class I’ve taught for over two decades, explaining that every mathematical concept can be explained or expressed four different ways:  Numerically, Algebraically, Graphically (including graphs and geometric figures), and Verbally.  While not the contextual point of his quote, I often cite MIT’s Marvin Minsky here:

“You don’t understand anything until you learn it more than one way.”

Learning to translate between the four representations grants deeper understanding of concepts and often gives access to solutions in one form that may be difficult or impossible in other forms.

After my decades-long work with CAS, I now believe there is actually a 5th representation of mathematical ideas:  Tools.  Knowing how to translate a question into a form that your tool (for CAS, a computer) can manage or compute creates a different representation of the problem and requires deeper insight to manage the translation.

I knew some of my students this year had deeply embraced this “5th Way” when one showed me his alternative approach to the confidence interval question.

I found this solution particularly lovely for several reasons.

• The student knew about lists and statistical commands and on a whim tried combining them in a novel way to produce the desired solution.
• He found the confidence interval directly using a normal distribution command rather than the arguably more convenient black box confidence interval tool.  He also showed explicitly his understanding of the distribution of sample means by adjusting the given standard deviation for the sample size.
• Finally, while using a CAS sometimes means getting answers in forms you didn’t expect, in this case I think the CAS command and list output actually provide a cleaner, interval-looking result that is much more intuitively connected to the actual meaning of a confidence interval than the black box command’s output.
• While I haven’t tried it out, it seems to me that this approach also should work on non-CAS statistical calculators that can handle lists.
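The student's list trick translates almost word for word into Python (here I hard-code the familiar z* ≈ 1.96 for 95%, i.e. invNorm(0.975); the list comprehension plays the role of the Nspire's {-1, 1} list):

```python
from math import sqrt

xbar, s, n = 4.90, 1.12, 35      # crow lead-level sample statistics
z_star = 1.959964                # 95% critical value, invNorm(0.975)

# xbar + {-1, 1} * z* * s/sqrt(n), mirroring the Nspire list idea
interval = [xbar + k * z_star * s / sqrt(n) for k in (-1, 1)]
print([round(v, 3) for v in interval])  # [4.529, 5.271]
```

The output is already shaped like an interval, which is much of what made the student's version so appealing.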

(a very minor disappointment, quickly overcome)

Returning to my multiple approaches, I tried my student’s newfound approach with a normCdf() command.  Alas, my Nspire returned the very command I had entered, indicating that it didn’t understand the question I had posed.  While a bit disappointed that this approach didn’t work, I was actually excited to have discovered a boundary in the current programming of the Nspire.  Perhaps someday this approach will also work, but my students and I have many other directions to exploit to find what we need.

Leaving the probability tables behind in their appropriate historical dust while fully embracing the power of modern classroom technology to enhance my students’ statistical learning and understanding, I’m convinced I made the right decision at the start of this school year.  They know more, understand the foundations of statistics better, and as a group feel much more confident and flexible.  Whether their scores on next month’s AP exam will reflect their growth, I can’t say, but they’ve definitely learned more statistics this year than any previous statistics class I’ve taught.

COMPLETE FILES FROM MY 2015 T3 PRESENTATION

If you are interested, you can download here the PowerPoint file for my entire Nspired Statistics and CAS presentation from last week’s 2015 T3 International Conference in Dallas, TX.  While not the point of this post, the presentation started with a non-calculus derivation/explanation of linear regressions.  Using some great feedback from Jeff McCalla, here is an Nspire CAS document creating the linear regression computation updated from what I presented in Dallas.  I hope you found this post and these files helpful, or at least thought-provoking.