Category Archives: CAS

Computer Algebra Systems

Computers vs. People: Writing Math

Readers of this ‘blog know I actively use many forms of technology in my teaching and personal explorations.  Yesterday, a thread started on the AP-Calculus community discussion board with some expressing discomfort that most math software accepts sin(x)^2 as an acceptable equivalent to the “traditional” handwritten sin^2 (x).

From Desmos: [image: sine1]

Some AP readers spoke up to declare that sin(x)^2 would always be read as sin(x^2).  While I can’t speak to the veracity of that last claim, I found it a bit troubling and missing out on some very real difficulties users face when interpreting between paper- and computer-based versions of math expressions.  Following is an edited version of my response to the AP Calculus discussion board.


I believe there’s something at the core of all of this that isn’t being explicitly named:  The differences between computer-based 1-dimensional input (left-to-right text-based commands) vs. paper-and-pencil 2-dimensional input (handwritten notation moves vertically–exponents, limits, sigma notation–and horizontally).  Two-dimensional traditional math writing simply doesn’t convert directly to computer syntax.  Computers are a brilliant tool for mathematics exploration and calculation, but they require a different type of input formatting.  To overlook and not explicitly name this for our students leaves them in the unenviable position of trying to “creatively” translate between two types of writing with occasional interpretation differences.

Our students are unintentionally set up for this confusion when they first learn about the order of operations–typically in middle school in the US.  They learn the sequencing:  parentheses then exponents, then multiplication & division, and finally addition and subtraction.  Notice that functions aren’t mentioned here.  This thread [on the AP Calculus discussion board] has helped me realize that all or almost all of the sources I routinely reference never explicitly redefine order of operations after the introduction of the function concept and notation.  That means our students are left with the insidious and oft-misunderstood PEMDAS (or BIDMAS in the UK) as their sole guide for operation sequencing.  When they encounter squaring or reciprocating or any other operations applied to function notation, they’re stuck trying to make sense of this new dissonance in their old notation and creating their own interpretation of it.  This is easily evidenced by the struggles many have when inputting computer expressions requiring lots of nested parentheses or when first trying to code in LaTeX.

While the sin(x)^2 notation is admittedly uncomfortable for traditional “by hand” notation, it is 100% logical from a computer’s perspective:  evaluate the function, then square the result.

We also need to recognize that part of the fault for the confusion lies in the by-hand notation.  What we traditionalists understand by the notational convenience of sin^2(x) on paper is technically incorrect.  We know what we MEAN, but the notation implies an incorrect order of computation.  The computer notation of sin(x)^2 is actually closer to the truth.
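The left-to-right reading is easy to test in any programming language. The sketch below uses Python's standard `math` module, which is an illustrative choice and not one of the tools from the discussion thread; `**` is Python's exponent operator and the test angle is arbitrary.

```python
import math

x = 1.2  # arbitrary test angle in radians

# Linear input: the function call is evaluated first, then the result is squared.
linear = math.sin(x) ** 2

# What the handwritten sin^2(x) means.
square_of_sine = (math.sin(x)) ** 2

# The alternative reading: the sine of x squared.
sine_of_square = math.sin(x ** 2)

print(linear == square_of_sine)   # the first two are the same computation
print(linear, sine_of_square)     # these are two different numbers
```

To get the sine of x squared, the square has to be typed inside the parentheses, which is the same "evaluate the function, then square the result" logic described above.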

I particularly like the way the TI-Nspire CAS handles this point.  As is often the case with this software, it accepts computer input (next image), while its output converts it to the more commonly understood written WYSIWYG formatting (2nd image below).



Further recent (?) development:  Students have long struggled with the by-hand notation of sin^2(x) needing to be converted to (sin(x))^2 for computers.  Personally, I’ve always liked both because the computer notation emphasizes the squaring of the function output while the by-hand version was a notational convenience.  My students pointed out to me recently that Desmos now accepts the sin^2(x) notation while TI Calculators still do not.

Desmos: [image: sine4]

The enhancement of WYSIWYG computer input formatting means that while some of the differences in 2-dimensional hand writing and computer inputs are narrowing, common classroom technologies no longer accept the same linear formatting — but then that was possibly always the case….

To rail against the fact that many software packages interpret sin(x)^2 as (sin(x))^2 or sin^2(x) misses the point that 1-dimensional computer input is not necessarily the same as 2-dimensional paper writing.  We don’t complain when two human speakers misunderstand each other when they speak different languages or dialects.  Instead, we should focus on what each is trying to say and learn how to communicate clearly and efficiently in both venues.

In short, “When in Rome, …”.

Roots of Complex Numbers without DeMoivre

Finding roots of complex numbers can be … complex.

This post describes a way to compute roots of any number–real or complex–via systems of equations without any conversions to polar form or use of DeMoivre’s Theorem.  Following a “traditional approach,” one non-technology example is followed by a CAS simplification of the process.


Most sources describe the following procedure to compute the roots of complex numbers (obviously including the real number subset).

  • Write the complex number whose root is sought in generic polar form.  If necessary, convert from Cartesian form.
  • Invoke DeMoivre’s Theorem to get the polar form of all of the roots.
  • If necessary, convert the numbers from polar form back to Cartesian.

As a very quick example,

Compute all square roots of -16.

Rephrased, this asks for all complex numbers, z, that satisfy  z^2=-16.  The Fundamental Theorem of Algebra guarantees two solutions to this quadratic equation.

The complex Cartesian number, -16+0i, converts to polar form, 16cis( \pi ), where cis(\theta ) = cos( \theta ) +i*sin( \theta ).  Unlike Cartesian form, polar representations of numbers are not unique, so any full rotation from the initial representation would be coincident, and therefore equivalent if converted to Cartesian.  For any integer n, this means

-16 = 16cis( \pi ) = 16 cis \left( \pi + 2 \pi n \right)

Invoking DeMoivre’s Theorem,

\sqrt{-16} = (-16)^{1/2} = \left( 16 cis \left( \pi + 2 \pi n \right) \right) ^{1/2}
= 16^{1/2} * cis \left( \frac{1}{2} \left( \pi + 2 \pi n \right) \right)
= 4 * cis \left( \frac{ \pi }{2} + \pi * n \right)

For n= \{ 0, 1 \} , this gives polar solutions, 4cis \left( \frac{ \pi }{2} \right) and 4cis \left( \frac{ 3 \pi }{2} \right) .  Each can be converted back to Cartesian form, giving the two square roots of -16:   4i and -4i .  Squaring either gives -16, confirming the result.
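The polar procedure above can be sketched in a few lines of Python with the standard `cmath` module. The function name `demoivre_roots` is a hypothetical helper invented for this illustration; `cmath.polar` and `cmath.rect` handle the conversions to and from polar form.

```python
import cmath
import math

def demoivre_roots(z, n):
    """Return the n complex n-th roots of z using polar form and DeMoivre's Theorem."""
    r, theta = cmath.polar(z)          # z = r * cis(theta)
    root_r = r ** (1.0 / n)            # every root shares this modulus
    return [
        cmath.rect(root_r, (theta + 2 * math.pi * k) / n)  # k = 0, 1, ..., n-1
        for k in range(n)
    ]

roots = demoivre_roots(-16, 2)
for w in roots:
    print(w, w ** 2)   # each square should be close to -16, up to rounding
```

Floating-point output will show tiny real parts instead of an exact 0, so the results are 4i and -4i only up to rounding.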

I’ve always found the rotational symmetry of the complex roots of any number beautiful, particularly for higher order roots.  This symmetry is perfectly captured by DeMoivre’s Theorem, but there is arguably a simpler way to compute them.


Because the solution to every complex number computation can be written in a+bi form, new possibilities open.  The original example can be rephrased:

Determine the simultaneous real values of x and y for which -16=(x+yi)^2.

Start by expanding and simplifying the right side back into a+bi form.  (I wrote about a potentially easier approach to simplifying powers of i in my last post.)

-16+0i = \left( x+yi \right)^2 = x^2 +2xyi+y^2 i^2=(x^2-y^2)+(2xy)i

Notice that the two ends of the previous line are two different expressions for the same complex number(s).  Therefore, equating the real and imaginary coefficients gives a system of equations:

-16 = x^2-y^2

0 = 2xy


Solving the system gives the square roots of -16.

From the latter equation, either x=0 or y=0.  Substituting y=0 into the first equation gives -16=x^2, an impossible equation because x & y are both real numbers, as stated above.

Substituting x=0 into the first equation gives -16=-y^2, leading to y= \pm 4.  So, x=0 and y=-4 -OR- x=0 and y=4 are the only solutions–x+yi=0-4i and x+yi=0+4i–the same solutions found earlier, but this time without using polar form or DeMoivre!  Notice, too, that the presence of TWO solutions emerged naturally.

Higher order roots could lead to much more complicated systems of equations, but a CAS can solve that problem.


Determine all fourth roots of 1+2i.

That’s equivalent to finding all simultaneous x and y values that satisfy 1+2i=(x+yi)^4.  Expanding the right side is quickly accomplished on a CAS.  From my TI-Nspire CAS:


Notice that the output is simplified to a+bi form that, in the context of this particular example, gives the system of equations,

x^4-6x^2y^2+y^4 = 1

4x^3y-4xy^3 = 2


Using my CAS to solve the system,


First, note there are four solutions, as expected.  Rewriting the approximated numerical output gives the four complex fourth roots of 1+2i:  -1.176-0.334i, -0.334+1.176i, 0.334-1.176i, and 1.176+0.334i.  Each can be quickly confirmed on the CAS:



Given proper technology, finding the multiple roots of a complex number need not invoke polar representations or DeMoivre’s Theorem.  It really is as “simple” as expanding (x+yi)^n where n is the given root, simplifying the expansion into a+bi form, and solving the resulting 2×2 system of equations.
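For readers without a CAS, the same 2×2 system can be solved numerically. The sketch below is an assumption-laden stand-in for the CAS solve step, not a reproduction of it: it applies Newton's method to the real and imaginary equations from expanding (x+yi)^4 = 1+2i, starting from four hand-picked guesses. The function names and starting points are hypothetical choices for this illustration.

```python
def fourth_root_system(x, y):
    """Real and imaginary parts of (x + yi)^4 - (1 + 2i)."""
    f = x**4 - 6 * x**2 * y**2 + y**4 - 1      # real-part equation
    g = 4 * x**3 * y - 4 * x * y**3 - 2        # imaginary-part equation
    return f, g

def newton_solve(x, y, steps=50):
    """Newton's method on the 2x2 real system, using the exact Jacobian."""
    for _ in range(steps):
        f, g = fourth_root_system(x, y)
        fx = 4 * x**3 - 12 * x * y**2          # df/dx
        fy = -12 * x**2 * y + 4 * y**3         # df/dy
        gx = 12 * x**2 * y - 4 * y**3          # dg/dx
        gy = 4 * x**3 - 12 * x * y**2          # dg/dy
        det = fx * gy - fy * gx
        # Solve J * (dx, dy) = (f, g) by Cramer's rule, then take the step.
        dx = (f * gy - fy * g) / det
        dy = (fx * g - f * gx) / det
        x, y = x - dx, y - dy
    return x, y

# One starting guess near each axis direction; these are illustrative choices.
starts = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
solutions = [newton_solve(x0, y0) for x0, y0 in starts]
for x, y in solutions:
    print(round(x, 3), round(y, 3), complex(x, y) ** 4)
```

Unlike a CAS, a numeric solver like this only finds the solution nearest each starting guess, so the number of guesses has to be supplied by the user.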

At the point when such problems would be introduced to students, their algebraic awareness should be such that using a CAS to do all the algebraic heavy lifting is entirely appropriate.

As one final glimpse at the beauty of complex roots, I entered the two equations from the last system into Desmos to take advantage of its very good implicit graphing capabilities.  You can see the four intersections corresponding to the four solutions of the system.  Solutions to systems of implicit equations are notoriously difficult to compute, so I wasn’t surprised when Desmos didn’t compute the coordinates of the points of intersection, even though the graph was pretty and surprisingly quick to generate.


Define Your Own Math Rule

My friend, Knox S., introduced me to this problem.  According to a post on The Telegraph’s Education page, this was originally  posted on Facebook by Randall Jones.

The puzzle lists four “sums”:  1+4=5, 2+5=12, 3+6=21, and 8+11=?.  The first line is fine by the standard rules of arithmetic, but as soon as you read the 2nd and 3rd lines, you know something is amiss.  What could be the output of line 4?

The Telegraph post above claims there are only two answers.  The reality is that there is an infinite number of correct answers.

I first share the two most commonly proffered solutions suggested by the Telegraph as the only answers.  I follow this with Knox’s clever use of an incremental number base.  Finally, I offer a more generalized approach to support my claim of many more solutions.


  • THE ANSWER IS 40:  After the first line, add the previous answer to next sum.


Consistent with the first three lines, the same rule to line 4 “proves” the answer is 40:


While nothing requires it, this approach is recursive.  I’ve not seen anyone say this, but the 40 approach requires the equations to appear in the given order.  If you give the equations in a different order, the rule is no longer consistent.  In particular, if you wanted a 5th line, what would it be?  There’s nothing clear about how to extend this solution.

  • THE ANSWER IS 96:  Alternatively, you can multiply the two numbers on the left and add that product to the first number.  This procedure is consistent with the first three lines, so the solution to line 4 must be 96:


The nice thing about this approach is that the solution is explicit, not recursive.  What’s obviously counter-intuitive is why you would first multiply the given numbers, and then why you would add the result to the first number, not the second.  This approach is consistent with the given information, so it is valid.

Unlike the first solution, this multiplicative approach is not commutative.  By this rule, 1+4 yields 5, as shown, but 4+1 would be 4+(4*1)=8.  Nothing in the problem statement required commutativity, so no worries.

Another good aspect of this algorithm is that the order of the equations is now irrelevant.  It applies no matter what numbers are “added” on the left side of the equation.  This is definitely more satisfying.


  • THE ANSWER IS 201:  Knox noticed that if you changed the number base, you could find another legit pattern.  The first line is standard arithmetic, but how could the next lines be consistent, too?  You know 2+5 doesn’t give 12 in standard base-10 arithmetic, but if you use base-5, 2+5=7=1*5^1+2*5^0=12_5.

    Unfortunately, in base-5, line 1 would be (1+4)_5=10_5 and line 3 would be (3+6)_5=14_5, both inconsistent.  Knox’s cleverest move was to vary the number base.  The 3rd line is true in base-4; since the 1st line is true in any base larger than five, he found a consistent pattern by applying base-6 to line 1:


Following this pattern, the next line would be base-3, giving 201 as the answer:


The best part of Knox’s solution is that he maintains the addition integrity of the left side.  The down-side is that this approach works for only one more line.  Any 5th line would give a base-2 (binary) answer, and since base-1 does not exist, the problem would end there.

Knox’s approach also allows you to use any numbers you want for the left-hand sums.  But notice that answers depend on where you write the sum.  For example, if (2+5) was in any other line, you would not get 12.  In line 1, (2+5)_6=11_6; in line 3, you’d get (2+5)_4=13_4.
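The three rules described so far can be written as short Python functions. This is a minimal sketch: the function names are hypothetical, and line 4 is taken to be 8+11, which is the pair that all three quoted answers (40, 96, and 201) imply.

```python
def recursive_rule(pairs):
    """Rule for 40: each answer is the ordinary sum plus the previous answer."""
    answers, previous = [], 0
    for a, b in pairs:
        previous = previous + a + b
        answers.append(previous)
    return answers

def product_rule(a, b):
    """Rule for 96: add the product of the two numbers to the first number."""
    return a + a * b

def to_base(n, base):
    """Write the non-negative integer n as a digit string in a base from 2 to 10."""
    digits = ""
    while True:
        n, r = divmod(n, base)
        digits = str(r) + digits
        if n == 0:
            return digits

def knox_rule(pairs, first_base=6):
    """Knox's rule: ordinary sums, written in bases 6, 5, 4, 3 going down the lines."""
    return [to_base(a + b, first_base - i) for i, (a, b) in enumerate(pairs)]

lines = [(1, 4), (2, 5), (3, 6), (8, 11)]
print(recursive_rule(lines))                    # [5, 12, 21, 40]
print([product_rule(a, b) for a, b in lines])   # [5, 12, 21, 96]
print(knox_rule(lines))                         # ['5', '12', '21', '201']
```

Reordering `lines` changes the output of `recursive_rule` and `knox_rule` but not `product_rule`, which is the order dependence noted above.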


By now, you should see that any rule could work so long as you are consistent.  Because standard arithmetic does not apply, solvers should feel free to invoke any functions or algorithms desired.  One way to do this is to think of each line as the inputs (left side) and output (right side) of a three-variable function.

  • THE ANSWER IS 96:  One possible function is z=f(x,y)=a*x^2+b*y^2+c for some values of a, b, and c that passes through (1,4,5), (2,5,12), and (3,6,21).  I used my TI-Nspire CAS to solve the resulting system:


    That means if x and y are the given left-side numbers and z is the right-side answer, the equation \frac{1}{3}*x^2+\frac{2}{3}*y^2-6=z satisfies the first three lines and the answer to line 4 is 96.


  • THE ANSWER IS \displaystyle \frac{2574}{29}:  If you can square the inputs, why not cube them?  That means another possible function is z=f(x,y)=a*x^3+b*y^3+c.  My CAS solution of the resulting system leads to the fractional answer:


The first three given equations essentially define three ordered triples–(1,4,5), (2,5,12), and (3,6,21)–so almost any equation you conceive with three unknown coefficients can be used to create a 3×3 system of equations.  The fractional solution for line 4 may not be as satisfying as any of the earlier approaches using only integers, but these last two examples make it clear that there should be an infinite number of solutions.

These last two solutions are especially nice because they are explicit and don’t depend on the order of the given information.  You can choose any two numbers to “add”, and the algorithms will work.
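The two 3×3 systems can also be solved exactly without a CAS. The sketch below uses Python's `fractions` module and a small Gauss-Jordan routine; `solve_linear` and `fit_power_rule` are hypothetical helper names, and line 4 is again assumed to be 8+11.

```python
from fractions import Fraction

def solve_linear(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination over Fractions.

    Raises StopIteration if no nonzero pivot exists (a singular system).
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]

points = [(1, 4, 5), (2, 5, 12), (3, 6, 21)]

def fit_power_rule(power):
    """Fit z = a*x^power + b*y^power + c through the three given triples."""
    A = [[x**power, y**power, 1] for x, y, _ in points]
    rhs = [z for _, _, z in points]
    return solve_linear(A, rhs)

for power in (2, 3):
    a, b, c = fit_power_rule(power)
    line4 = a * 8**power + b * 11**power + c
    print(power, a, b, c, line4)
```

With `power=2` the coefficients come out as 1/3, 2/3, and -6, giving 96 for line 4; with `power=3` line 4 evaluates to 2574/29.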

Notice also that all of these functions, except for Knox’s, are non-commutative.  No worries, the problem already broke free of standard rules in line 2.


The last two examples prove the existence of quadratic and cubic solutions, so why not a linear solution?  In other words, is there a 3D plane in the form z=a*x+b*y+c containing the given points?


Unfortunately, the resulting 3×3 system didn’t solve. The determinant of the coefficient matrix is zero, suggesting an inconsistent or dependent system.  Upon further inspection, subtracting line 1 from line 2 in the planar system gives a+b=7.  Similarly, subtracting line 2 from line 3 gives a+b=9.  Since both can’t be simultaneously true, the system is inconsistent and has no solution.  It was worth the effort.
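The zero determinant is quick to confirm by hand or in code. This is a small illustrative check in Python; `det3` is a hypothetical helper, and the quadratic matrix is included only for contrast.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Coefficient matrix of a*x + b*y + c = z at (1,4), (2,5), (3,6).
plane = [[1, 4, 1], [2, 5, 1], [3, 6, 1]]

# Coefficient matrix of a*x^2 + b*y^2 + c = z at the same points, for contrast.
quadratic = [[1, 16, 1], [4, 25, 1], [9, 36, 1]]

print(det3(plane))       # 0: no unique plane through the three points
print(det3(quadratic))   # -12: nonzero, so the quadratic system has one solution
```

A zero determinant alone does not distinguish inconsistent from dependent systems; the subtraction argument above (a+b=7 versus a+b=9) is what settles it as inconsistent.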


Since standard arithmetic didn’t apply after the first line and no other restrictions were in play, that opened the door to lots of creativity.  The many different solutions to this problem all hinge on finding some function–any function–that satisfied the first three lines.  Find one of these, and the last line is simple.  That some attempts won’t work is no hindrance.  Even when standard algorithms seem to apply, there is almost always the possibility of some creative twist when working with numerical sequences.

So, whenever you’re faced with a non-standard system, have fun, be creative, and develop something unexpected.

Many Roads Give Same Derivative

A recent post in the AP Calculus Community expressed some confusion about different ways to compute \displaystyle \frac{dy}{dx} at (0,4) for the function x=2ln(y-3).  I share below the two approaches suggested in the original post, proffer two more, and a slightly more in-depth activity I’ve used in my calculus classes for years.  I conclude with an alternative to derivatives of inverses.

Two Approaches Initially Proposed

1 – Accept the function as posed and differentiate implicitly.

\displaystyle \frac{d}{dx} \left( x = 2 ln(y-3) \right)

\displaystyle 1 = 2*\frac{1}{y-3} * \frac{dy}{dx}

\displaystyle \frac{dy}{dx} = \frac{y-3}{2}

Which gives \displaystyle \frac{dy}{dx} = \frac{1}{2} at (x,y)=(0,4).

2 – Solve for y and differentiate explicitly.

\displaystyle x = 2ln(y-3) \longrightarrow y = 3 + e^{x/2}

\displaystyle \frac{dy}{dx} = e^{x/2} * \frac{1}{2}

Evaluating this at (x,y)=(0,4) gives \displaystyle \frac{dy}{dx} = \frac{1}{2} .

Two Alternative Approaches

3 – Substitute early.

The question never asked for an algebraic expression of \frac{dy}{dx}, only the numerical value of this slope.  Because students tend to make more silly mistakes manipulating algebraic expressions than numeric ones, the additional algebra steps are unnecessary and potentially error-prone.  Admittedly, the manipulations are pretty straightforward here, but in more algebraically complicated cases, early substitutions could significantly simplify work.  Using approach #1 and substituting directly into the second line gives

\displaystyle 1 = 2 * \frac{1}{y-3} * \frac{dy}{dx} .

At (x,y)=(0,4), this is

\displaystyle 1 = 2 * \frac{1}{4-3}*\frac{dy}{dx}

\displaystyle \frac{dy}{dx} = \frac{1}{2}

The numeric manipulations on the right side are obviously easier than the earlier algebra.

4 – Solve for \frac{dx}{dy} and reciprocate.

There’s nothing sacred about solving for \frac{dy}{dx} directly.  Why not compute the derivative of the inverse and reciprocate at the end? Differentiating first with respect to y eventually leads to the same solution.

\displaystyle \frac{d}{dy} \left( x = 2 ln(y-3) \right)

\displaystyle \frac{dx}{dy} = 2 * \frac{1}{y-3}

At (x,y)=(0,4), this is

\displaystyle \frac{dx}{dy} = \frac{2}{4-3} = 2 , so

\displaystyle \frac{dy}{dx} = \frac{1}{2}.
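All four routes can be checked against a purely numerical slope. The sketch below is a Python illustration, not part of the original discussion: it estimates the slope at (0,4) with a central difference on the explicit form and compares it with the formulas from approaches 1, 2, and 4. The step size `h` is an arbitrary small value.

```python
import math

def y_of_x(x):
    """Explicit form of x = 2 ln(y - 3), solved for y."""
    return 3 + math.exp(x / 2)

x0, y0 = 0.0, 4.0
h = 1e-6  # small step for the numerical estimate

numeric_slope = (y_of_x(x0 + h) - y_of_x(x0 - h)) / (2 * h)  # central difference
implicit_slope = (y0 - 3) / 2            # approach 1: dy/dx = (y - 3)/2
explicit_slope = math.exp(x0 / 2) / 2    # approach 2: dy/dx = e^(x/2)/2
reciprocal_slope = 1 / (2 / (y0 - 3))    # approach 4: 1 / (dx/dy)

print(numeric_slope, implicit_slope, explicit_slope, reciprocal_slope)
```

The three formula-based values are exactly 0.5; the numerical estimate agrees to within rounding error.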

Equivalence = A fundamental mathematical concept

I sometimes wonder if teachers should place much more emphasis on equivalence.  We spend so much time manipulating expressions in mathematics classes at all levels, changing mathematical objects (shapes, expressions, equations, etc.) into different, but equivalent, objects.  Many times, these manipulations are completed under the guise of “simplification.”  (Here is a brilliant Dan Teague post cautioning against taking this idea too far.)

But it is critical for students to recognize that proper application of manipulations creates equivalent expressions, even when the resulting expressions don’t look the same.   The reason we manipulate mathematical objects is to discover features about the object in one form that may not be immediately obvious in another.

For the function x = 2 ln(y-3), the slope at (0,4) must be the same, no matter how that slope is calculated.  If you get a different looking answer while using correct manipulations, the final answers must be equivalent.

Another Example

A similar question appeared on the AP Calculus email list-server almost a decade ago right at the moment I was introducing implicit differentiation.  A teacher had tried to find \displaystyle \frac{dy}{dx} for

\displaystyle x^2 = \frac{x+y}{x-y}

using implicit differentiation on the quotient, manipulating to a product before using implicit differentiation, and finally solving for y in terms of x to use an explicit derivative.

1 – Implicit on a quotient

Take the derivative as given:

\displaystyle \frac{d}{dx} \left( x^2 = \frac{x+y}{x-y} \right)

\displaystyle 2x = \frac{(x-y) \left( 1 + \frac{dy}{dx} \right) - (x+y) \left( 1 - \frac{dy}{dx} \right) }{(x-y)^2}

\displaystyle 2x * (x-y)^2 = (x-y) + (x-y)*\frac{dy}{dx} - (x+y) + (x+y)*\frac{dy}{dx}

\displaystyle 2x * (x-y)^2 = -2y + 2x * \frac{dy}{dx}

\displaystyle \frac{dy}{dx} = \frac{2x * (x-y)^2 + 2y}{2x}

2 – Implicit on a product

Multiplying the original equation by its denominator gives

x^2 * (x - y) = x + y .

Differentiating with respect to x gives

\displaystyle 2x * (x - y) + x^2 * \left( 1 - \frac{dy}{dx} \right) = 1 + \frac{dy}{dx}

\displaystyle 2x * (x-y) + x^2 - 1 = x^2 * \frac{dy}{dx} + \frac{dy}{dx}

\displaystyle \frac{dy}{dx} = \frac{2x * (x-y) + x^2 - 1}{x^2 + 1}

3 – Explicit

Solving the equation at the start of method 2 for y gives

\displaystyle y = \frac{x^3 - x}{x^2 + 1} .

Differentiating with respect to x gives

\displaystyle \frac{dy}{dx} = \frac {\left( x^2+1 \right) \left( 3x^2 - 1\right) - \left( x^3 - x \right) (2x+0)}{\left( x^2 + 1 \right) ^2}


Those 3 forms of the derivative look VERY DIFFERENT.  Assuming no errors in the algebra, they MUST be equivalent because they are nothing more than the same derivative of different forms of the same function, and a function’s rate of change doesn’t vary just because you alter the look of its algebraic representation.

Substituting the y-as-a-function-of-x equation from method 3 into the first two derivative forms converts all three into functions of x.  Lots of by-hand algebra or a quick check on a CAS establishes the suspected equivalence.  Here’s my TI-Nspire CAS check.
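Without a CAS, a numerical spot check gives good (though not conclusive) evidence of equivalence. The Python sketch below evaluates all three derivative forms at a few arbitrary x-values, with y computed from the explicit equation of method 3. The quotient form is solved directly from the line 2x(x-y)^2 = -2y + 2x*dy/dx, and x=0 is avoided because that form divides by 2x. Function names are hypothetical.

```python
def y_explicit(x):
    """y as a function of x, from method 3."""
    return (x**3 - x) / (x**2 + 1)

def slope_quotient(x, y):
    """Method 1: from 2x(x - y)^2 = -2y + 2x*dy/dx, solved for dy/dx."""
    return (2 * x * (x - y) ** 2 + 2 * y) / (2 * x)

def slope_product(x, y):
    """Method 2: implicit differentiation of x^2 (x - y) = x + y."""
    return (2 * x * (x - y) + x**2 - 1) / (x**2 + 1)

def slope_explicit(x):
    """Method 3: quotient rule on the explicit form."""
    return ((x**2 + 1) * (3 * x**2 - 1) - (x**3 - x) * (2 * x)) / (x**2 + 1) ** 2

samples = [0.5, 1.0, 2.0, -3.0]  # arbitrary nonzero test values
for x in samples:
    y = y_explicit(x)
    print(x, slope_quotient(x, y), slope_product(x, y), slope_explicit(x))
```

Agreement at a handful of points is not a proof, but a disagreement at even one point would reveal an algebra error immediately.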


Here’s the form of this investigation I gave my students.

Final Example

I’m not a big fan of memorizing anything without a VERY GOOD reason.  My teachers telling me to do so never held much weight for me.  I memorized as little as possible and used that information as long as I could until a scenario arose to convince me to memorize more.  One thing I managed to avoid almost completely were the annoying derivative formulas for inverse trig functions.

For example, find the derivative of y = arcsin(x) at x = \frac{1}{2}.

Since arc-trig functions annoy me, I always rewrite them.  Taking sine of both sides and then differentiating with respect to x gives

sin(y) = x

\displaystyle cos(y) * \frac{dy}{dx} = 1

I could rewrite this equation to give \frac{dy}{dx} = \frac{1}{cos(y)}, a perfectly reasonable form of the derivative, albeit as a less-common  expression in terms of y.  But I don’t even do that unnecessary algebra.  From the original function, x=\frac{1}{2} \longrightarrow y=\frac{\pi}{6}, and I substitute that immediately after the differentiation step to give a much cleaner numeric route to my answer.

\displaystyle cos \left( \frac{\pi}{6} \right) * \frac{dy}{dx} = 1

\displaystyle \frac{\sqrt{3}}{2} * \frac{dy}{dx} = 1

\displaystyle \frac{dy}{dx} = \frac{2}{\sqrt{3}}

And this is the same result as plugging x = \frac{1}{2} into the memorized form of the derivative of arcsine.  If you like memorizing, go ahead, but my mind remains more nimble and less cluttered.

One final equivalent approach would have been differentiating sin(y) = x with respect to y and reciprocating at the end.
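The agreement between the implicit route and the memorized formula can be confirmed numerically. This short Python check is an illustration only; it uses the standard formula 1/sqrt(1-x^2) for the derivative of arcsine.

```python
import math

x = 0.5
y = math.asin(x)                               # y = pi/6

slope_from_implicit = 1 / math.cos(y)          # from cos(y) * dy/dx = 1
slope_from_formula = 1 / math.sqrt(1 - x**2)   # memorized derivative of arcsin(x)

print(slope_from_implicit, slope_from_formula, 2 / math.sqrt(3))
```

All three printed values agree up to floating-point rounding.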


There are MANY ways to compute derivatives.  For any problem or scenario, use the one that makes sense or is computationally easiest for YOU.  If your resulting algebra is correct, you know you have a correct answer, even if it looks different.  Be strong!

Straightening Standard Deviations

This post describes a bivariate data problem I introduced last month in my AP Statistics class, but it easily could have appeared in any Algebra 2 or PreCalculus course, particularly for those classes adapting to the statistics strands of the CCSSM and new SAT standards.  While I used the lab to introduce standard deviations of random samples, the approach also could be used if your bivariate statistics unit occurs later in your sequencing.

My class started a unit on sampling when we returned in January.  They needed to understand how larger sample sizes tended to shrink standard deviations, but I didn’t want to just give them the formula

\displaystyle \sigma_{\overline{x}}=\frac{\sigma}{\sqrt{n}} .

I know many teachers introduce this relationship by selecting samples with perfect square sizes and see the standard deviations of the sample means shrink by integer factors (quadruple the sample size = halve the standard deviation, multiply the sample size by 9 = standard deviation divides by 3, etc.), but I didn’t want to exert that much control.  My students had explored data straightening techniques in the fall and were used to sampling and simulations, so I wanted to see how successfully they could leverage that background to “discover” the sample standard deviation relationship.

My AP Statistics students use TI Nspire CAS software on their laptops, so I wrote their lab using that technology.  The lab could easily be adapted to whatever statistics technology you use in your class.  You can download a pdf of my lab here.


The activity drew samples from a normal distribution for which students were able to define their own means and standard deviations.  Students could choose any values, but those who chose integers tended to make the later connections more easily.

Their first step was to draw 2500 different random samples of sizes n=1, 4, 10, 25, 50, 100.  From each 2500 point data set, students computed sample means and standard deviations.  In retrospect, I should have let students select all or most of their own sample sizes, but I’m still quite satisfied with the results.  If you do experiment with different sample sizes, definitely run the larger potential sizes on your technology to check computation times.

One student chose \mu = 7 and \sigma = 13.  Her sample means and standard deviations are


It was pretty obvious to her that no matter what the sample size, \overline{x} \approx \mu, but the standard deviations were shrinking as the sample sizes grew.  Determining that relationship was the heart of the activity.  Obviously, the sample size (SS) seemed to drive the sample standard deviation (SD), so my student graphed her (SS, SD) data to get


We had explored bivariate data-straightening techniques at the end of the fall semester, so she tried semi-log and log-log transformations to check for the possibilities that these data might be represented by an exponential or power function, respectively.  Her semi-log transformation was still curved, but the log-log was very straight.  That transformation and its accompanying linear regression are below.


Her residuals were small, balanced, and roughly random, so she knew she had a reasonable fit.  From there, she used her CAS to transform (re-curve) the linear regression back to an equation for the original data.


It made sense that this resulting formula not only depended on the sample size, but also originally on the population standard deviation my student had earlier chosen to be \sigma = 13.  Within reasonable round-off deviations, the numerator appeared to be the population standard deviation and the exponent of the denominator was very close to \frac{1}{2}, indicating a square root.  That gave her the expected sample standard deviation formula, \displaystyle \sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}}.
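The whole lab can be approximated outside the Nspire. The sketch below is a Python stand-in under stated assumptions, not the lab itself: it uses the `random` and `statistics` modules, the student's values of 7 and 13, the lab's sample sizes, an arbitrary seed, and a hand-written least-squares fit on the log-log data.

```python
import math
import random
import statistics

random.seed(1)                     # arbitrary fixed seed so reruns match
mu, sigma = 7, 13                  # the student's chosen population parameters
sizes = [1, 4, 10, 25, 50, 100]
trials = 2500                      # samples drawn per sample size

def sd_of_sample_means(n):
    """Standard deviation of the means of `trials` random samples of size n."""
    means = [
        sum(random.gauss(mu, sigma) for _ in range(n)) / n
        for _ in range(trials)
    ]
    return statistics.stdev(means)

log_n = [math.log(n) for n in sizes]
log_sd = [math.log(sd_of_sample_means(n)) for n in sizes]

# Least-squares line through the log-log points: log(SD) = intercept + slope*log(n)
mean_x = sum(log_n) / len(log_n)
mean_y = sum(log_sd) / len(log_sd)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_sd)) / sum(
    (x - mean_x) ** 2 for x in log_n
)
intercept = mean_y - slope * mean_x

print(slope)                 # expected to land near -1/2
print(math.exp(intercept))   # expected to land near sigma = 13
```

Undoing the logarithms turns the fitted line into SD = e^intercept * n^slope, which is the "re-curving" step described above.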

I know this formula is provided on the AP Statistics Exam, but the simulation, curve straightening, linear regression, and statistical confirmation of the formula were a great review and exercise.  I hope you find it useful, too.

Stats Exploration Yields Deeper Understanding

or “A lesson I wouldn’t have learned without technology”

Last November, some of my AP Statistics students were solving a problem involving a normal distribution with an unknown mean.  Leveraging the TI Nspire CAS calculators we use for all computations, they crafted a logical command that should have worked.  Their unexpected result initially left us scratching heads.  After some conversations with the great folks at TI, we realized that what at first seemed perfectly reasonable for a single answer, in fact had two solutions.  And it took until the end of this week for another student to finally identify and resolve the mysterious results.  This ‘blog post recounts our journey from a questionable normal probability result to a rich approach to confidence intervals.


I had assigned an AP Statistics free response question about a manufacturing process that could be manipulated to control the mean distance its golf balls would travel.  We were told that the process created balls whose distances were normally distributed with a mean of 288 yards and a standard deviation of 2.8 yards.  The first part asked students to find the probability of balls traveling more than an allowable 291.2 yards.  This was straightforward.  Find the area under a normal curve with a mean of 288 and a standard deviation of 2.8 from 291.2 to infinity.  The Nspire (CAS and non-CAS) syntax for this is:


[Post publishing note: See Dennis’ comment below for a small correction for the non-CAS Nspires.  I forgot that those machines don’t accept “infinity” as a bound.]

As 12.7% of the golf balls traveling too far is obviously an unacceptably high percentage, the next part asked for the mean distance needed so that 99% of the balls traveled allowable distances.  That’s when things got interesting.
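The 12.7% figure can be reproduced outside the Nspire. This is a minimal Python sketch using `statistics.NormalDist` from the standard library in place of the calculator's normCdf command.

```python
from statistics import NormalDist

distance = NormalDist(mu=288, sigma=2.8)   # model for golf ball distances in yards
p_too_far = 1 - distance.cdf(291.2)        # area to the right of 291.2 yards
print(p_too_far)                           # about 0.127, matching the 12.7% above
```

Subtracting the cumulative area from 1 plays the role of the infinite upper bound.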


Their initial thought was that even though they didn’t know the mean, they now knew the output of their normCdf command.  Since the balls couldn’t travel a negative distance and zero was many standard deviations from the unknown mean, the following equation, with x representing the unknown mean, should define the scenario nicely.

normCdf(0,291.2,x,2.8)=0.99


Because this was an equation with a single unknown, we could now use our CAS calculators to solve for the missing parameter.


Something was wrong.  How could the mean distance possibly be just 6.5 yards?  The Nspires are great, reliable machines.  What happened?

I had encountered something like this before with unexpected answers when a solve command was applied to a Normal cdf with dual finite bounds.  While it didn’t seem logical to me why this should make a difference, I asked them to try an infinite lower bound and also to try computing the area on the other side of 291.2.  Both of these provided the expected solution.


The caution symbol on the last line should have been a warning, but I honestly didn’t see it at the time.  I was happy to see the expected solution, but quite frustrated that infinite bounds seemed to be required.  Beyond three standard deviations from the mean of any normal distribution, almost no area exists, so how could extending the lower bound from 0 to negative infinity make any difference in the solution when 0 was already \frac{291.2}{2.8}=104 standard deviations away from 291.2?  I couldn’t make sense of it.

My initial assumption was that something was wrong with the programming in the Nspire, so I emailed some colleagues I knew within CAS development at TI.


They reminded me that statistical computations in the Nspire CAS were resolved through numeric algorithms–an understandable approach given the algebraic definition of the normal and other probability distribution functions.  The downside to this is that numeric solvers may not pick up on (or are incapable of finding) difficult to locate or multiple solutions.  Their suggestion was to employ a graph whenever we got stuck.  This, too, made sense because graphing a function forced the machine to evaluate multiple values of the unknown variable over a predefined domain.

It was also a good reminder for my students that any algebraic equation can be thought of as the result of the first substitution step in solving a system of equations.  Going back to the initially troublesome input, I rewrote normCdf(0,291.2,x,2.8)=0.99 as the system

y=normCdf(0,291.2,x,2.8)
y=0.99

and “the point” of intersection of that system would be the solution we sought.  Notice my emphasis indicating my still lingering assumptions about the problem.  Graphing both equations shone a clear light on what was my persistent misunderstanding.


I was stunned to see two intersection solutions on the screen.  Asking the Nspire for the points of intersection revealed BOTH ANSWERS my students and I had found earlier.


If both solutions were correct, then there really were two different normal pdfs that could solve the finite bounded problem.  Graphing these two pdfs finally explained what was happening.

By equating the normCdf result to 0.99 with FINITE bounds, I never specified on which end the additional 0.01 existed–left or right.  This graph showed the 0.01 could have been at either end, one with a mean near the expected 284 yards and the other with a mean near the unexpected 6.5 yards.  The graph below shows both normal curves, with the 6.5 solution having the additional 0.01 on the left and the 284 solution having the 0.01 on the right.
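For readers without an Nspire, the two-solution behavior can be reproduced with a minimal Python sketch.  This is not the calculator's algorithm; it assumes Python's standard-library NormalDist in place of normCdf and a plain bisection search, and the helper names area_gap and bisect are made up for this illustration.

```python
from statistics import NormalDist

SIGMA = 2.8      # standard deviation from the problem (yards)
UPPER = 291.2    # upper bound on distance (yards)
TARGET = 0.99    # required area between 0 and UPPER


def area_gap(mean):
    """Area under N(mean, SIGMA) between 0 and UPPER, minus TARGET."""
    dist = NormalDist(mean, SIGMA)
    return dist.cdf(UPPER) - dist.cdf(0.0) - TARGET


def bisect(f, lo, hi, tol=1e-9):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return (lo + hi) / 2


# Halfway between the bounds the area is essentially 1, so area_gap > 0 there,
# while it is negative at both ends.  That brackets one root on each side.
midpoint = UPPER / 2
low_mean = bisect(area_gap, 0.0, midpoint)
high_mean = bisect(area_gap, midpoint, UPPER)
print(round(low_mean, 2), round(high_mean, 2))  # 6.51 284.69
```

Searching each half of the interval separately is what exposes both means; a solver started from a single guess would report only whichever one it happened to reach.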


The CAS wasn’t wrong in the beginning.  I was.  And as has happened several times before, the machine didn’t rely on the same sometimes errant assumptions I did.  My students had made a very reasonable assumption that the area under the normal pdf for the golf balls should start at 0 (no negative distances) and inadvertently stumbled into a much richer problem.


The reason the infinity-bounded solutions didn’t give the unexpected second solution is that it is impossible to have the unspecified extra 0.01 area beyond an infinite lower or upper bound.
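With an infinite lower bound the equation has a single solution that can even be written in closed form.  Here is a short Python sketch, again assuming the standard-library NormalDist as a stand-in for the calculator commands:

```python
from statistics import NormalDist

SIGMA = 2.8
UPPER = 291.2

# P(X < UPPER) = 0.99 places UPPER exactly z standard deviations above the
# mean, where z is the 99th percentile of the standard normal distribution.
z = NormalDist().inv_cdf(0.99)
mean = UPPER - z * SIGMA
print(round(mean, 2))  # 284.69
```

Because all of the leftover 0.01 is forced into the right tail, there is no second mean to find.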

To avoid unexpected multiple solutions, I resolved to tell my students to use infinite bounds whenever solving for an unknown parameter.  It was a little dissatisfying to not be able to use my students’ “intuitive” lower bound of 0 for this problem, but at least they wouldn’t have to deal with unexpected, counterintuitive results.

Surprisingly, the permanent solution arrived weeks later when another student shared his fix for a similar problem when computing confidence interval bounds.


I really don’t like the way almost all statistics textbooks provide complicated formulas for computing confidence intervals using standardized z- and t-distribution critical scores.  Ultimately a 95% confidence interval is nothing more than the bounds of the middle 95% of a probability distribution whose mean and standard deviation are defined by a sample from the overall population.  Where the problem above solved for an unknown mean, computing a confidence interval on a CAS follows essentially the same reasoning to determine the missing endpoints.

My theme in every math class I teach is to memorize as little as you can, and use what you know as widely as possible.  Applying this to AP Statistics, I never reveal the existence of confidence interval commands on calculators until we’re 1-2 weeks past their initial introduction.  This lets my students develop a solid understanding of confidence intervals using a variation on calculator commands they already know.

For example, assume you need a 95% confidence interval of the percentage of votes Bernie Sanders is likely to receive in Monday’s Iowa Caucus.  The CNN-ORC poll released January 21 showed Sanders leading Clinton 51% to 43% among 280 likely Democratic caucus-goers.  (Read the article for a glimpse at the much more complicated reality behind this statistic.)  In this sample, the proportion supporting Sanders is approximately normally distributed with a sample p=0.51 and sample standard deviation of p of \sqrt{(.51)(.49)/280}=0.0299.  The 95% confidence interval is then defined by the bounds containing the middle 95% of the data of this normal distribution.

Using the earlier lesson, one student suggested finding the bounds on his CAS by focusing on the tails.


giving a confidence interval of (0.45, 0.57) for Sanders for Monday’s caucus, according to the method of the CNN-ORC poll from mid-January.  Using a CAS keeps my students focused on what a confidence interval actually means without burying them in the underlying computations.
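The same "middle 95%" reasoning can be sketched in Python.  This assumes the standard-library NormalDist and its inv_cdf method rather than the student's CAS commands, and the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

p_hat = 0.51   # sample proportion supporting Sanders
n = 280        # likely caucus-goers in the sample

std_err = sqrt(p_hat * (1 - p_hat) / n)
sampling = NormalDist(p_hat, std_err)

# The middle 95% leaves 2.5% in each tail.
lower = sampling.inv_cdf(0.025)
upper = sampling.inv_cdf(0.975)
print(round(std_err, 4), round(lower, 2), round(upper, 2))  # 0.0299 0.45 0.57
```

Asking for the 2.5th and 97.5th percentiles directly is the tail-focused idea: each endpoint is pinned down by a single one-sided area, so no second solution can appear.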

That’s nice, but what if you needed a confidence interval for a sample mean?  Unfortunately, the t-distribution on the Nspire is completely standardized, so confidence intervals need to be built from critical t-values.  As with a normal distribution, a 95% confidence interval is defined by the bounds containing the middle 95% of the data.  One student reasonably suggested the following for a 95% confidence interval with 23 degrees of freedom.  I really liked the explicit syntax definition of the confidence interval.


Alas, the CAS returned the input.  It couldn’t find the answer in that form.  Cognizant of the lessons learned above, I suggested reframing the query with an infinite bound.


That gave the proper endpoint, but I was again dissatisfied with the need to alter the input, even though I knew why.

That’s when another of my students spoke up to say that he got the solution to work with the initial commands by including a domain restriction.


Of course!  When more than one solution is possible, restrict the bounds to the solution range you want.  Then you can use the commands that make sense.
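Here is a rough Python sketch of the same idea: find a with the area between -a and a equal to 0.95, searching only over positive a.  Python's standard library has no t-distribution, so this sketch integrates the density with Simpson's rule; the function names t_pdf, central_area, and critical_t are hypothetical helpers, not calculator commands.

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)


def central_area(a, df, steps=2000):
    """Area under the t density between -a and a (composite Simpson's rule)."""
    h = 2 * a / steps
    total = t_pdf(-a, df) + t_pdf(a, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(-a + i * h, df)
    return total * h / 3


def critical_t(level, df, lo=0.0, hi=10.0, tol=1e-8):
    """Bisection restricted to lo <= a <= hi, so only a positive endpoint is found."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if central_area(mid, df) < level:
            lo = mid   # interval too narrow: widen it
        else:
            hi = mid   # interval wide enough: narrow it
    return (lo + hi) / 2


t_star = critical_t(0.95, 23)
print(round(t_star, 3))  # 2.069
```

The default search range of 0 to 10 plays the role of the domain restriction: it tells the solver which solution is wanted before it starts looking.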


That small fix finally gave me the solution to the earlier syntax issue with the golf ball problem.  There were two solutions to the initial problem, so if I bounded the output, they could use their intuitive approach and get the answer they needed.

If a mean of 288 yards and a standard deviation of 2.8 yards resulted in 12.7% of the area above 291.2, then it wouldn’t take much of a left shift in the mean to leave just 1% of the area above 291.2. Surely that unknown mean would be no lower than 3 standard deviations below the current 288, somewhere above 280 yards.  Adding that single restriction to my students’ original syntax solved their problem.
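As a sanity check on that reasoning, here is a small Python sketch that keeps the students' finite lower bound of 0 but searches only above 280 yards.  It assumes the standard-library NormalDist as a stand-in for normCdf; the bracket [280, 291.2] is this sketch's version of the added restriction.

```python
from statistics import NormalDist

SIGMA, UPPER, TARGET = 2.8, 291.2, 0.99

# Area above 291.2 for the original mean of 288 yards.
tail = 1 - NormalDist(288, SIGMA).cdf(UPPER)


def area(mean):
    """Area between 0 and UPPER for a normal curve centered at mean."""
    dist = NormalDist(mean, SIGMA)
    return dist.cdf(UPPER) - dist.cdf(0.0)


# Restrict the search to means above 280.  On this interval the area only
# decreases as the mean moves right, so there is exactly one crossing.
lo, hi = 280.0, UPPER
while hi - lo > 1e-9:
    mid = (lo + hi) / 2
    if area(mid) > TARGET:
        lo = mid   # still too much area: the mean must move right
    else:
        hi = mid
restricted_mean = (lo + hi) / 2
print(round(restricted_mean, 2))  # 284.69
```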




By encouraging a deep understanding of both the underlying statistical content AND of their CAS tool, students are increasingly able to find creative solutions using flexible methods and expressions intuitive to them.  And shouldn’t intellectual strength, creativity, and flexibility be the goals of every learning experience?


Value Process over Answers

Most of my thinking about teaching lately has been about the priceless, timeless value of process in problem solving over the ephemeral worth of answers.  While an answer to a problem puts a period at the end of a sentence, the beauty and worth of the sentence was the construction, word choice, and elegance employed in sharing the idea at the heart of the sentence.

Just as there are many ways to craft a sentence–from cumbersome plodding to poetic imagery–there are equally many ways to solve problems in mathematics.  Just as great writing reaches, explores, and surprises, great problem solving often starts with the solver not really knowing where the story will lead, taking different paths depending on the experience of the solver, and ending with even more questions.

I experienced that yesterday reading through tweets from one of my favorite middle and upper school problem sources, Five Triangles.  The valuable part of what follows is, in my opinion, the multiple paths I tried before settling on something productive.  My hope is that students learn the value in exploration, even when initially unproductive.

At the end of this post, I offer a few variations on the problem.

The Problem


Try this for yourself before reading further.  I’d LOVE to hear others’ approaches.

First Thoughts and Inherent Variability

My teaching career has been steeped in transformations, and I’ve been playing with origami lately, so my brain immediately translated the setup:

Fold vertex A of equilateral triangle ABC onto side BC.  Let segment DE be the resulting crease with endpoints on sides AB and AC with measurements as given above.

So DF is the folding image of AD and EF is the folding image of AE.  That is, ADFE is a kite and segment DE is a perpendicular bisector of (undrawn) segment AF.  That gave \Delta ADE \cong \Delta FDE.

I also knew that there were lots of possible locations for point F, even though this set-up chose the specific orientation defined by BF=3.

Lovely, but what could I do with all of that?

Trigonometry Solution Eventually Leads to Simpler Insights

Because FD=7, I knew AD=7.  Combining this with the given DB=8 gave AB=15, so now I knew the side of the original equilateral triangle and could quickly compute its perimeter or area if needed.  Because BF=3, I got FC=12.

At this point, I had thoughts of employing Heron’s Formula to connect the side lengths of a triangle with its area.  I let AE=x, making EF=x and EC=15-x.  With all of the sides of \Delta EFC defined, its perimeter was 27, and I could use Heron’s Formula to define its area:

Area(\Delta EFC) = \sqrt{13.5(1.5)(13.5-x)(x-1.5)}

But I didn’t know the exact area, so that was a dead end.

Since \Delta ABC is equilateral, m \angle C=60^{\circ}, so I then thought about expressing the area using trigonometry.  With trig, the area of a triangle is half the product of any two sides multiplied by the sine of the contained angle.  That meant Area(\Delta EFC) = \frac{1}{2} \cdot 12 \cdot (15-x) \cdot \sin(60^{\circ}) = 3(15-x) \sqrt{3}.

Now I had two expressions for the same area, so I could solve for x.

3\sqrt{3}(15-x) = \sqrt{13.5(1.5)(13.5-x)(x-1.5)}

Squaring both sides revealed a quadratic in x.  I could do this algebra, if necessary, but this was clearly a CAS moment.
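For anyone who wants to see the algebra the CAS handled, here is a minimal Python sketch.  The coefficient names a, b, c are this sketch's own; they come from squaring both sides and collecting terms.

```python
from math import sqrt

# Squaring 3*sqrt(3)*(15 - x) = sqrt(13.5 * 1.5 * (13.5 - x) * (x - 1.5)) gives
#   27 * (15 - x)**2 = 20.25 * (13.5 - x) * (x - 1.5)
# Expanding both sides and moving everything left: a*x**2 + b*x + c = 0 with
a = 27 + 20.25                    # x**2 terms
b = -(27 * 30 + 20.25 * 15)       # x terms
c = 27 * 225 + 20.25 * 20.25      # constant terms

disc = b * b - 4 * a * c
roots = sorted([(-b - sqrt(disc)) / (2 * a), (-b + sqrt(disc)) / (2 * a)])
print([round(r, 4) for r in roots])  # [10.5, 13.0714]
```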


I had two solutions, but this felt WAY too complicated.  Also, Five Triangles problems are generally accessible to middle school students.  The trigonometric form of a triangle’s area is not standard middle school fare.  There had to be an easier way.

A Quicker Ending

Thinking trig opened me up to angle measures.  If I let m \angle CEF = \theta, then m \angle EFC = 120^{\circ}-\theta.  Because \angle DFE is the folded image of \angle A, it measures 60^{\circ}, so m \angle DFB = 180^{\circ}-60^{\circ}-(120^{\circ}-\theta) = \theta, and I suddenly had my simple breakthrough!  Because their angles were congruent, I knew \Delta CEF \sim \Delta BFD.

Because the triangles were similar, I could employ similarity ratios.
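Written out, one version of those ratios looks like this.  It assumes AE=EF=x as before and matches vertices C with B, E with F, and F with D, following the congruent angles.

```latex
\frac{EF}{FD} = \frac{CF}{BD}
\quad\Longrightarrow\quad
\frac{x}{7} = \frac{12}{8}
\quad\Longrightarrow\quad
x = AE = 10.5
```

The third pair of sides agrees: \frac{CE}{BF} = \frac{15-10.5}{3} = 1.5, the same ratio as \frac{12}{8}.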


And that is one of the CAS solutions by a MUCH SIMPLER approach.

Extensions and Variations

Following are five variations on the original Five Triangles problem.  What other possible variations can you find?

1)  Why did the CAS give two solutions?  Because \Delta BDF had all three sides explicitly given, by SSS there should be only one solution.  So is the 13.0714 solution real or extraneous?  Can you prove your claim?  If that solution is extraneous, identify the moment when the solution became “real”.

2)  Eliminating the initial condition that BF=3 gives another possibility.  Using only the remaining information, how long is \overline{BF} ?

\Delta BDF now has SSA information, making it an ambiguous case situation.  Let BF=x and invoke the Law of Cosines.

7^2=x^2+8^2-2 \cdot x \cdot 8 \cdot \cos(60^{\circ})

This simplifies to x^2-8x+15=0, giving the original BF=3 solution and a second possible answer:  BF=5.
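The ambiguous case generalizes nicely.  Below is a small Python sketch of a Law of Cosines solver for SSA information; the function name ssa_third_side and its parameter names are hypothetical, chosen only for this illustration.

```python
from math import cos, radians, sqrt


def ssa_third_side(opposite, adjacent, angle_deg):
    """Possible third-side lengths when the known angle is opposite `opposite`.

    Law of Cosines: opposite^2 = x^2 + adjacent^2 - 2*x*adjacent*cos(angle),
    rearranged to x^2 + b*x + c = 0 with b and c as computed below.
    """
    b = -2 * adjacent * cos(radians(angle_deg))
    c = adjacent ** 2 - opposite ** 2
    disc = b * b - 4 * c
    if disc < 0:
        return []   # no triangle fits the given measurements
    roots = {(-b - sqrt(disc)) / 2, (-b + sqrt(disc)) / 2}
    return sorted(r for r in roots if r > 0)


# Triangle BDF: DF = 7 is opposite the 60-degree angle at B, and BD = 8.
bf_lengths = ssa_third_side(opposite=7, adjacent=8, angle_deg=60)
print([round(x, 6) for x in bf_lengths])  # [3.0, 5.0]
```

Keeping only the positive roots mirrors the geometry: a negative root would satisfy the equation but could not be a side length.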

3)  You could also stay with the original problem asking for AE.

From above, the solution for BF=3 is AE=10.5.  But if BF=5 from the ambiguous case, then FC=10 and the similarity ratio above becomes \frac{AE}{7}=\frac{10}{8}, giving AE=8.75.


4)  Under what conditions is \overline{DE} \parallel \overline{BC} ?

5)  Consider all possible locations of folding point A onto \overline{BC}.  What are all possible lengths of \overline{DE}?