Mixed Number Curiosity

The first part of this is not my work, but I offer an intriguing extension.


This appeared on Twitter recently. (source)


Despite its apparent notational confusion, it is a true statement.  Since both sides are positive, you can square both sides without producing extraneous results.  Doing so proves the statement.
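For a concrete illustration of that squaring argument, take one instance of the pattern discussed below (my own example, not necessarily the tweeted one):

\displaystyle \sqrt{2\frac{2}{3}} = 2 \cdot \sqrt{\frac{2}{3}} \quad \text{because} \quad \left( \sqrt{2\frac{2}{3}} \right)^2 = \frac{8}{3} = 4 \cdot \frac{2}{3} = \left( 2\sqrt{\frac{2}{3}} \right)^2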

It’s a lovely but curious piece of arithmetic trivia.  A more mathematical question:

Does this pattern hold for any other numbers?

Thomas Oléron Evans has a proof on his blog here in which he solves the equation \sqrt{a+\frac{b}{c}} = a \cdot \sqrt{\frac{b}{c}} under the assumptions that a, b, and c are natural numbers and \frac{b}{c} is a fraction in lowest terms.  Doing so leads to the equation

\displaystyle \sqrt{A+\frac{A}{A^2-1}} = A \cdot \sqrt{\frac{A}{A^2-1}}

where A is any natural number larger than 1.  Nice.

While the derivation may be too complicated for middle and upper school students, proving that the formula works is straightforward.

A>1, so all terms are positive.  Square both sides, find a common denominator, et voilà!
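Written out, that verification is

\displaystyle \left( \sqrt{A+\frac{A}{A^2-1}} \right)^2 = \frac{A(A^2-1)+A}{A^2-1} = \frac{A^3}{A^2-1} = A^2 \cdot \frac{A}{A^2-1} = \left( A\sqrt{\frac{A}{A^2-1}} \right)^2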


Using Evans’ assumptions, the formula is inevitable, but any math rests on its assumptions.  I wondered if there are more numbers out there for which the original number pattern was true.

Using Evans’ formula, my very first thought was to violate the integer assumption.  I let A=1.1 and grabbed my Nspire.


Checking the fractional term, I see that I also violated the “simplest form” assumption.  Converting everything to fractions to make sure there wasn’t a stray rounding error somewhere down the line, I got


So it is true for more than Evans claimed.
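Since the only real requirement for the verification above was A > 1, the same check extends numerically to non-integer values.  A minimal sketch in Python (my stand-in for the Nspire screens above):

```python
from math import sqrt, isclose

def identity_holds(A):
    """Numerically check sqrt(A + A/(A^2 - 1)) == A * sqrt(A/(A^2 - 1))."""
    frac = A / (A**2 - 1)
    return isclose(sqrt(A + frac), A * sqrt(frac))

# works for naturals and, apparently, for any real A > 1
for A in [1.1, 1.5, 2, 3, 10, 3.14159]:
    print(A, identity_holds(A))   # all print True
```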

I don’t have time to investigate this further right now, so I throw it out to you.  How far does this property go?

Confidence Intervals via graphs and CAS

Confidence intervals (CIs) are a challenging topic for many students, a task made more challenging, in my opinion, because many (most?) statistics texts approach CIs via z-scores.  While repeatedly calculating CI endpoints from standard deviations explains the underlying mathematical structure, it relies on an (admittedly simple) algebraic technique that predates classroom technology currently available for students on the AP Statistics exam.

Many (most?) statistics packages now include automatic CI commands.  Unfortunately for students just learning what a CI means, automatic commands can become computational “black boxes.”  Both CAS and graphing techniques offer a strong middle ground–enough foundation to reinforce what CIs mean with enough automation to avoid unnecessary symbol manipulation time.

In most cases, this is accomplished by understanding a normal cumulative distribution function (cdf) as a function, not just as an electronic substitute for normal probability tables of values.  In this post, I share two alternatives each for three approaches to determining CIs using a TI-Nspire CAS.


In 2010, the mean ACT mathematics score for all tests was 21.0 with standard deviation 5.3.  Determine a 90% confidence interval for the math ACT score of an individual chosen at random from all 2010 ACT test takers.


A 90% CI excludes the extreme 5% on each end of the normal distribution.  Using an inverse normal command gives the z-scores at the corresponding 5% and 95% locations on the normal cdf.


Of course, utilizing symmetry would have required only one command.  To find the actual boundary points of the CI, standardize an endpoint x and set that expression equal to each of the two z-scores.

\displaystyle \frac{x-21.0}{5.3} = \pm 1.64485

Solving these rational equations for x gives x=12.28 and x=29.72, or CI = \left[ 12.28,29.72 \right] .
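For readers without an Nspire handy, the same computation can be sketched in Python with scipy.stats standing in for the calculator's inverse normal command:

```python
from scipy.stats import norm

z_low, z_high = norm.ppf(0.05), norm.ppf(0.95)      # roughly -1.64485 and +1.64485
ci = (21.0 + z_low * 5.3, 21.0 + z_high * 5.3)      # roughly (12.28, 29.72)
print(ci)
```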

Most statistics software lets users avoid this computation with optional parameters for the mean and standard deviation of non-standard normal curves.  One of my students last year used this in the next variation.


After using lists as shortcuts on our TI-Nspires last year to evaluate functions at several points simultaneously, one of my students creatively applied them to the inverse normal command, entering the separate 0.05 and 0.95 cdf probabilities as a single list.  I particularly like how the output of this approach looks exactly like a CI.
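Roughly the same idea in scipy (a sketch, not the Nspire syntax): passing both probabilities at once, along with the mean and standard deviation, returns both endpoints together.

```python
from scipy.stats import norm

# both endpoints at once -- the output already reads like a CI
print(norm.ppf([0.05, 0.95], loc=21.0, scale=5.3))   # approximately [12.28  29.72]
```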



The endpoints of a CI are just endpoints of an interval on a normal cdf, so why not avoid the algebra and the additional inverse normal command and determine the endpoints via CAS commands?  My students know the solve command from previous math classes, so after learning the normal cdf command, there are very few situations in which they even need an inverse normal command.


This approach keeps my students connected to the normal cdf, and solving for the bounds quickly reproduces the previous CI endpoints.
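Outside a CAS, the same "solve the cdf equation" idea can be sketched with a numerical root finder standing in for the solve command (scipy here, as an assumed equivalent of the Nspire syntax):

```python
from scipy.stats import norm
from scipy.optimize import brentq

# solve normCdf(-inf, x, 21, 5.3) = 0.05 and = 0.95 for x
lower = brentq(lambda x: norm.cdf(x, loc=21.0, scale=5.3) - 0.05, 0, 50)
upper = brentq(lambda x: norm.cdf(x, loc=21.0, scale=5.3) - 0.95, 0, 50)
print(lower, upper)   # approximately 12.28 and 29.72
```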

METHOD 2b (Alas, not yet) — CAS and LISTS:

Currently, the numerical techniques the TI-Nspire family uses to solve equations involving statistics commands don’t work well with lists in all situations.  Curiously, the Nspire can’t yet handle the solve+lists equivalent of the inverse normal+lists approach in METHOD 1b.


But, I’ve also learned that problems not easily solved in an Nspire CAS calculator window typically crack pretty easily when translated to their graphical equivalents.


This approach should work for any CAS or non-CAS graphing calculator or software with statistics commands.

Remember the “f” in cdf.  A cumulative distribution function is a function, and graphing calculators/software treat it as such.  Replacing the normCdf upper bound with an x in standard graphing syntax lets one graph the normal cdf (below).

Also remember that any algebraic equation can be solved graphically by independently graphing each side of the equation and treating the result as a system of equations.  In this case, graphing y=0.05 and y=0.95 along with the cdf and finding the points of intersection gives the endpoints of the CI.
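A rough software equivalent of that graphical setup (a matplotlib/scipy sketch; the intersections can be read from the graph or refined numerically):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

x = np.linspace(0, 42, 500)
plt.plot(x, norm.cdf(x, loc=21.0, scale=5.3), label="y = normCdf(-inf, x, 21, 5.3)")
plt.axhline(0.05, linestyle="--", label="y = 0.05")
plt.axhline(0.95, linestyle="--", label="y = 0.95")
plt.legend()
plt.show()
# the x-coordinates of the two intersections are the CI endpoints
```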



SIDENOTE:  While lists didn’t work with the CAS in the case of METHOD 2b, the next screen shows the syntax to graph both ends of the CI using lists with a single endpoint equation.


The lists obviously weren’t necessary here, but the ability to use lists is a very convenient feature on the TI-Nspire that I’ve leveraged countless times to represent families of functions.  In my opinion, using them in METHOD 3b again leverages that same idea, that the endpoints you seek are different aspects of the same family–the CI.


There are many ways for students in their first statistics courses to use what they already know to determine the endpoints of a confidence interval.  Keeping students’ attention focused on new ways to use old information solidifies both old and new content.  Eliminating unnecessary computations that aren’t the point of introductory statistics anyway is an added bonus.

Happy learning everyone…

Low Floor and High Ceiling Math

Whether you like to solve problems yourself, or are looking for some tidbits for your children or students, I hope this post is informative.

I’ve been reading Jo Boaler‘s brilliant new book, Mathematical Mindsets.  While there’s tons of great information and research there, I’ve been thinking lots lately about her charge to develop more “low floor, high ceiling” tasks into math lessons–problems that are “challenging, but accessible” to a much broader spectrum of students than typical exercises.  In particular, Boaler encourages teachers to use problems that are easily understood, relatively simple to begin, and yet hold deep potential for advanced exploration.  Boaler notes that these problems tend to be very difficult to find.

Here I offer an adaptation of an Ask Marilyn post toward this goal.  While the problem was initially posed in terms of singles at a party, I rephrased it for younger students.  Solving it helped me see variations that I hope address Boaler’s low floor, high ceiling call.


Paraphrasing the original:

Say 100 students stop by the lunchroom for a snack.  Of these, 90 like apples, 80 like pears, 70 like bananas, and 60 like peaches.  At the very least, how many students like all four fruits?


The Ask Marilyn post offered only the answer–zero–but not a solution.  To prove it, I made a picture.  Since the question asked for the least number of commonly liked fruits, I needed to spread out the likes as much as possible.  Ninety liked apples, so when I added the pears, I made sure to include the 10 non-apple-likers among the 80 who did like pears, giving


That made 30 (at the bottom) who liked only one of apples or pears, so when I added the 70 bananas, I first added them to those 30, leaving


That made 40 who liked all three, so the 60 peaches could match up to the other 60 who liked only two of the first three, confirming vos Savant’s claim that it was possible in this setup to have no one liking all four.



FIRST:  As a minor extension, one of my students last year would have said the problem could be “complexified” slightly by changing the numbers to percentages.  (I loved my conversations with that student about complexifying vs. simplifying problems to find deeper connections and extensions.)  With enough number sense, students should eventually be able to work with absolute numbers and relative percentages with equal ease.  Mathematically, it doesn’t change anything about the problem.

SECOND:  The problem doesn’t have to be about a single minimum number of students to like all four fruits.  While there is a unique minimum, there are many other non-optimal arrangements.  I wonder how students with developing problem solving skills would approach this.

THIRD:  In my initial attempts, I had used many different variations on my tabular solution above.  Only in the writing of this post did I actually use the above arrangement, and that happened only because I was trying to come up with a visually simple representation.  In doing so, I realized that the critical information here was not what was told, but what was not said.  Where 90, 80, 70, & 60 liked the given fruits, that meant a respective 10, 20, 30, & 40 did not.  And those added up to 100, so I knew that any variation of “not-likes” that also added to 100 could be distributed so that the minimum number who liked all four would also be zero.  So there is an infinite number (if I use percentages) of variations of this problem that have the same answer.  I also realized that any combination of 2 or more fruits whose “not likes” added to 100 could produce the same results.  My ceiling just rose!
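That “not-likes” insight can be tested quickly in code.  A minimal sketch (a hypothetical helper of my own, not part of the original post), using the fact that each student who fails to like every fruit must account for at least one “not-like”:

```python
def min_liking_all(total_students, like_counts):
    # total "not-likes" available; each can knock out at most one student
    # from the group who like everything
    dislikes = sum(total_students - n for n in like_counts)
    return max(0, total_students - dislikes)

print(min_liking_all(100, [90, 80, 70, 60]))   # 0, matching vos Savant's answer
print(min_liking_all(100, [85, 80, 75, 70]))   # 10
```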

FOURTH:  To make the problem more accessible, I could rephrase it in terms of setting out fruit and exploring many different possible arrangements.  I could also encourage learners to support their developing problem solving by translating the problem into pictures.

I’m ready to pose a new variation.  I’d love to hear your thoughts, insights, and variations for raising the ceiling in this problem.


Say 100 students stop by the lunchroom for a snack.  Of these, 90 like apples, 80 like pears, 70 like bananas, and 60 like peaches.  The lunchroom staff knows these numbers, but doesn’t know how much of each fruit to put out.  But putting out too much fruit would be wasteful.

  • What advice can you give them?  Show how you know your solutions are correct.
  • Draw some pictures of the possible numbers of students who like the different fruits.
  • Is there more than one possible answer?
  • It is possible that some students might not like any fruits offered.  How many students might this describe?
  • Some students might like all four fruits.  For how many students might this be true?  How many answers are there to this?

For students who manage all of these, you can pose some further challenges:

  • How can you change the initial numbers in this problem without changing most of the answers?
  • Can you create the same scenarios with more or fewer types of fruit?
  • If the numbers are too big for very young students, you could drop all of the initial numbers by a factor of 10.  How many will see this scaling down simplification (or its scaling up complexification)?

Helping Students Visualize Skew

This week during my statistics classes’ final review time for their term final, I had a small idea I wish I’d had years ago.

Early in the course, we talked about means and medians as centers of data and how these values were nearly identical in roughly symmetric data sets.  However, when there are extreme or outlier values on only one side of the center, the more extreme-sensitive mean would be “pulled” in the direction of those outliers.  In simple cases like this, the pull of the extreme values is said to “skew” the data in the direction of the extremes.  Admittedly, in more complicated data sets with one long tail and one heavy tail, skew can be difficult to visualize, but in early statistics classes with appropriate warnings, I’ve found it sufficient to discuss basic skew in terms of the pull of extreme values on the mean.
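As a tiny numerical illustration of that pull (my own toy data, not the class data discussed below):

```python
import numpy as np

# a right-skewed toy data set: most values small, a few extreme values far right
data = np.array([2, 3, 3, 4, 4, 5, 5, 6, 40, 55])
print(np.median(data), np.mean(data))   # median 4.5, mean 12.7 -- pulled toward the outliers
```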

Despite my efforts to help students understand the direction of skew relative to the tails of a data set, I’ve noticed that many still describe first the side where the “data piles up” before declaring the skew to be the opposite.  For example, using a histogram (below) of data from my school last week, where students were given a week to guess the number of Skittles in a glass jar, several of my students continued to note that the data “piled up” on the left, so the skew was right, or positive.


SIDE NOTE:  There were actually 838 Skittles in the jar.  Clearly most of the students seriously underestimated the total with a few extremely hopeful outliers to the far right.

While my students can properly identify the right skew of this data set, I remained bothered by the mildly convoluted approach they persistently used to determine the skew.  I absolutely understand why their eyes are drawn to the left-side pile up, but I wondered if there was another way I could visualize skew that might get them to look first in the skewed direction.  That’s when I wondered about the possibility of describing the “stretchiness” of skew via boxplots.  Nearly symmetric data have nearly symmetric box plots, but extreme or outlier values would notably pull the whiskers of the boxplot or appear as outlier dots on the ends.

If my first visualization of the Skittles data had been a boxplot (below), would that have made it any easier for students to see the extreme right-side pull, or would they see the box on the left and then declare right skew?


Is it possible the longer right whisker and several right-side outliers in this boxplot make it easier to see the right skewness directly rather than as the opposite of the data’s left-side pile-up?  It’ll be the start of class in Fall 2016 before I can try out this idea.

In the meantime, I wonder how others approach helping students see and understand skewness directly rather than as a consequence of something else.  Ideas, anyone?

Recentering a Normal Curve with CAS

Sometimes, knowing how to ask a question in a different way using appropriate tools can dramatically simplify a solution.  For context, I’ll use an AP Statistics question from the last decade about a fictitious railway.


After two set-up questions, students were asked to compute how long to delay one train’s departure to create a very small chance of delay while waiting for a second train to arrive.  I’ll share an abbreviated version of the suggested solution before giving what I think is a much more elegant approach using the full power of CAS technology.


Initially, students were told that X was the normally distributed time Train B took to travel to city C, and Y was the normally distributed time Train D took to travel to C.  The first question asked for the distribution of Y-X if the mean and standard deviation of X are 170 and 20, respectively, and the mean and standard deviation of Y are 200 and 10.  Knowing how to transform normally distributed variables quickly gives that Y-X is normally distributed with mean 30 and standard deviation \sqrt{500}.
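Explicitly, assuming X and Y are independent (as the problem intends), the means subtract and the variances add:

\displaystyle E(Y-X) = 200 - 170 = 30 \quad \text{and} \quad SD(Y-X) = \sqrt{10^2 + 20^2} = \sqrt{500}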

Due to damage to a part of the railroad, if Train B arrived at C before Train D, B would have to wait for D to clear the tracks before proceeding.  In part 2, you had to find the probability that B would wait for D.  Translating from English to math, if B arrives before D, then X \le Y.  So the probability of Train B waiting on Train D is equivalent to P(0 \le Y-X).  Using the distribution information in part 1 and a statistics command on my Nspire, this probability is



Under the given conditions, there’s about a 91.0% chance that Train B will have to wait at C for Train D to clear the tracks.  Clearly, that’s not a good railway management situation, setting up the final question.  Paraphrasing,

How long should Train B be delayed so that its probability of being delayed is only 0.01?


Delaying Train B means the mean arrival time of Train D, Y, remains unchanged at 200, while the mean arrival time of Train B, X, increases by some unknown amount.  Call the delayed version of X \hat{X}, with mean 170+delay.  That makes the new mean of the difference in arrival times

Y - \hat{X} = 200-(170+delay) = 30-delay

As this is just a translation, the distribution of Y - \hat{X} is congruent to the distribution of Y-X, but recentered.  The standard deviation of both curves is \sqrt{500}.  You want to find the value of delay so that P \left(0 \le Y - \hat{X} \right) = 0.01.  That’s equivalent to knowing the location on the standard normal distribution where the area to the right is 0.01, or equivalently, the area to the left is 0.99.  One way that can be determined is with an inverse normal command.


 The proposed solution used z-scores to suggest finding the value of delay by solving

\displaystyle \frac{0-(30-delay)}{\sqrt{500}} = 2.32635

A little algebra gives delay=82.0187, so the railway should delay Train B just a hair over 82 minutes.


From part 2, the initial conditions suggest Train B has a 91.0% chance of delay, and part 3 asks for the amount of recentering required to change that probability to 0.01.  Rephrasing this as a CAS command (using TI-Nspire syntax), that’s equivalent to solving

\displaystyle normCdf(0,\infty,30-delay,\sqrt{500}) = 0.01
Notice that this is precisely the command used in part 2, re-expressed as an equation with a variable adjustment to the mean.  And since I’m using a CAS, I recognize the left side of this equation as a function of delay, making it something that can easily be “solved”.


Notice that I got exactly the same solution without the algebraic manipulation of a rational expression.
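For readers without CAS access, here is a rough translation of that one-command approach (a scipy sketch; a numerical root finder plays the role of the CAS solve):

```python
from math import sqrt
from scipy.stats import norm
from scipy.optimize import brentq

sd = sqrt(500)

# part 2: P(0 <= Y - X) with Y - X ~ N(30, sqrt(500))
print(1 - norm.cdf(0, loc=30, scale=sd))                  # about 0.910

# part 3: solve 1 - normCdf(-inf, 0, 30 - delay, sqrt(500)) = 0.01 for delay
delay = brentq(lambda d: (1 - norm.cdf(0, loc=30 - d, scale=sd)) - 0.01, 0, 200)
print(delay)                                              # about 82.02
```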

My big point here is not that use of a CAS simplifies the algebra (that wasn’t that hard in the first place), but rather that DEEP KNOWLEDGE of the mathematical situation allows one to rephrase a question in a way that enables incredibly efficient use of the technology.  CAS aren’t replacements for basic algebra skills, they are enhancements for mathematical thinking.


The CAS solve command is certainly nice, but many teachers and students don’t yet have CAS access, even though it is 100% legal for the PSAT, SAT, and all AP math exams.   But that’s OK.  If you recognize the normCdf command as a function, you can essentially use a graph menu to accomplish the same end.

Too often, I think teachers and students treat the normCdf and invNorm commands as nothing more than glorified “lookup commands”–essentially electronic versions of the probability tables they replaced.  But when one of the parameters is missing, replacing it with X makes the command graphable.  In fact, whenever you have an equation that is difficult (or impossible) to solve algebraically, graph both sides and find the intersection, just like the solution to a system of equations.  Using this strategy, graphing y=normCdf(0,\infty,30-X,\sqrt{500}) and y=0.01 and finding the intersection gives the required solution.
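In the same spirit, a quick sketch of that graph outside the calculator (matplotlib/scipy, as an assumed stand-in for the Nspire’s graph screen):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

d = np.linspace(0, 150, 500)
plt.plot(d, 1 - norm.cdf(0, loc=30 - d, scale=np.sqrt(500)),
         label="y = normCdf(0, inf, 30 - X, sqrt(500))")
plt.axhline(0.01, linestyle="--", label="y = 0.01")
plt.xlabel("delay")
plt.legend()
plt.show()
# the intersection sits near delay = 82
```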



Whether you can access a CAS or not, think more deeply about what questions ask and find creative alternatives to symbolic manipulations.

A Generic Approach to Arclength in Calculus

Earlier this week, a teacher posted in the College Board’s AP Calculus Community a request for an explanation of computing the arclength of a curve without relying on formulas.

The following video is my proposed answer to that question.  In it, I derive the fundamental arclength relationship before computing the length of y=x^2 from x=0 to x=3 four different ways:

  • As a function of x,
  • As a function of y,
  • Parametrically, and
  • As a polar function.

In summary, the length of any differentiable curve can be thought of as

\displaystyle \int_a^b \sqrt{(dx)^2+(dy)^2}

where a and b are the bounds of the curve, the square root is just the local linearity application of the Pythagorean Theorem, and the integral sums the infinitesimal roots over the length of the curve.

To determine the length of any differentiable curve, factor out the form of the differential that matches the independent variable of the curve’s definition.
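For instance, two of the four computations for y=x^2 in the video factor out dx and dt respectively (my write-up of the standard setup):

With y=x^2, dy=2x \, dx, so \displaystyle \int \sqrt{(dx)^2+(dy)^2} = \int_0^3 \sqrt{1+(2x)^2} \, dx \approx 9.747

With the parametrization x=t, y=t^2, \displaystyle \int \sqrt{(dx)^2+(dy)^2} = \int_0^3 \sqrt{\left( \frac{dx}{dt} \right)^2 + \left( \frac{dy}{dt} \right)^2} \, dt = \int_0^3 \sqrt{1+4t^2} \, dt \approx 9.747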

Tell a Friend

I’ve been in several conversations over these first couple weeks of school with colleagues in our lower and middle schools about what students need to do to convince others they understand an idea.

On our first pre-assessments, some teachers noted that many students showed good computation skills, but struggled when they had to explain relationships.  Frankly, I’m never surprised by revelations that students find explanations more difficult than formulas and computations.  That’s tough for learners of all ages.  But, in my opinion, it’s also the most important part of developing the ability to communicate mathematically.

In the other direction, I frequently hear students complain that they just don’t know what to write and that teachers seem to arbitrarily ask for “more explanation”, but they just can’t figure out what that means.


Just as in writing for humanities classes, a math learner needs to consider the “audience” seriously.  Who’s going to read your solution?  I think too many students write for a classroom teacher, expecting him or her to fill in any potential logical gaps.

Instead, I tell my students that I expect all of their explanations to be understandable by every classmate. In short,

Don’t write your answer to me; write it to a friend who’s been absent for a couple days.

If a random classmate who’s been out a couple days can get it just based on your written work, then you’re good.