Statistics and Series

I was inspired by the article “Errors in Mathematics Aren’t Always Bad” (Sheldon Gordon, Mathematics Teacher, August 2011, Volume 105, Issue 1) to think about an innovative way to introduce series to my precalculus class without using any of the calculus that’s typically required to derive them.  It’s not a proof, but it’s certainly compelling, and it gives my students a much less demanding environment in which to meet an idea that many find challenging.

Following is a paraphrase of an activity I took my students through in January.  They started by computing and graphing a few points on y=e^x near x=0.

The global shape is exponential, but this image convinced them to try a linear fit.

Simplifying a bit, this linear regression suggests that e^x\approx x+1 for values of x near x=0.  Despite the “strength” of the correlation coefficient, we teach our students always to look at the residuals from any attempted fit.  If you have ever relied solely on correlation coefficients to determine “the best fit” for a set of data, the “strength” of r \approx0.998402 and the following residual plot should convince you to be more careful.

The values are very small, but these residuals (res1=e^x-(x+1))  look pretty close to quadratic even though the correlation coefficient was nearly 1.  Fitting a quadratic to (xval,res1) gives another great fit.

The linear and constant coefficients are nearly zero, making res1\approx\frac{1}{2}x^2.  Therefore, a quadratic approximation to the original exponential is e^x \approx\frac{1}{2}x^2+x+1.  But even with another great correlation coefficient, hopefully the last step has convinced you to investigate the new residuals, res2=e^x-(\frac{1}{2}x^2+x+1).
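For readers without a TI-Nspire, the first two fits can be sketched in Python with NumPy. The sample points are an assumption (21 evenly spaced values on [-0.2, 0.2]), since the post does not list the x-values my students used, so the decimals will not match the calculator screens exactly. The names xval and res1 follow the text above; the other variable names are only illustrative.

```python
import numpy as np

# Assumed sample points near x = 0; the post does not list the originals.
xval = np.linspace(-0.2, 0.2, 21)
yval = np.exp(xval)

# Step 1: least-squares line through (xval, yval) and its correlation coefficient.
slope, intercept = np.polyfit(xval, yval, 1)
r = np.corrcoef(xval, yval)[0, 1]

# Step 2: residuals from the simplified line y = x + 1, as in the post.
res1 = yval - (xval + 1)

# Step 3: quadratic regression on the residuals.
# np.polyfit returns coefficients with the highest power first.
a, b, c = np.polyfit(xval, res1, 2)

print(f"line: y = {slope:.4f}x + {intercept:.4f}, r = {r:.6f}")
print(f"res1 fit: {a:.4f}x^2 + {b:.4f}x + {c:.6f}")
```

On points like these, r comes out very close to 1 even though res1 is visibly parabolic, which is the whole point of looking at the residuals.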

And that looks cubic.  Fitting a cubic to (xval,res2) gives yet another great fit.

This time, the quadratic, linear, and constant coefficients are all nearly zero, making res2\approx0.167x^3.  The simplest fraction close to this coefficient is \frac{1}{6}, making the cubic approximation e^x \approx\frac{1}{6}x^3+\frac{1}{2}x^2+x+1.  One more time, check the new residuals, res3=e^x-(\frac{1}{6}x^3+\frac{1}{2}x^2+x+1).

Given this progression and the “flatter” vertex, my students were ready to explore a quartic fit to the res3 data.

As before, only the highest degree term seems non-zero, giving res3\approx0.04175x^4.  Some of my students called this coefficient \frac{1}{25} and others went for \frac{1}{24}.  At this point, either approximation was acceptable, leading to e^x \approx\frac{1}{24}x^4+\frac{1}{6}x^3+\frac{1}{2}x^2+x+1.
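The fit-the-residuals loop can be written once and repeated. The sketch below is an approximation of the classroom process, not a transcript of it: it assumes the same made-up sample points on [-0.2, 0.2], starts from the simplified line 1 + x, and keeps the fitted decimal at each step instead of rounding it to a simple fraction the way my students did. The function name leading_coefficients is hypothetical.

```python
import math
import numpy as np

def leading_coefficients(xval, max_degree=4):
    """Repeat the fit-the-residuals step for degrees 2..max_degree.

    Starts from the simplified linear fit 1 + x. At each degree n, a
    degree-n regression is run on the current residuals and only its
    leading coefficient is kept. Returns a dict {n: leading coefficient}.
    """
    approx = 1 + xval
    found = {}
    for n in range(2, max_degree + 1):
        residual = np.exp(xval) - approx
        lead = np.polyfit(xval, residual, n)[0]  # coefficient of x^n
        found[n] = lead
        approx = approx + lead * xval**n
    return found

xval = np.linspace(-0.2, 0.2, 21)  # assumed sample points, not the class data
coeffs = leading_coefficients(xval)
for n, lead in coeffs.items():
    print(f"x^{n}: fitted {lead:.5f}   1/{n}! = {1 / math.factorial(n):.5f}")
```

Because each regression refits all of the lower-degree terms, a small error left over from an earlier coefficient is absorbed by those terms and does not disturb the next leading coefficient.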

My students clearly got the idea that this approach could be continued as far as desired, but since quartic is the highest-degree polynomial regression our TI-Nspire offers and the decimals were getting harder to approximate, we stopped there.  As a final check, they computed a quartic regression on the original data, showing that the progression above could have been collapsed into a single step.
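The single-step check is short enough to show in full. This is a sketch under the same assumption as before (invented sample points on [-0.2, 0.2]); it compares the quartic regression coefficients on the original data with the fractions found step by step.

```python
import numpy as np

xval = np.linspace(-0.2, 0.2, 21)  # assumed sample points, not the class data

# One quartic regression on the original (x, e^x) data; highest power first.
quartic = np.polyfit(xval, np.exp(xval), 4)

# Coefficients found one residual plot at a time: x^4, x^3, x^2, x, constant.
stepwise = np.array([1 / 24, 1 / 6, 1 / 2, 1.0, 1.0])

for power, fitted, target in zip(range(4, -1, -1), quartic, stepwise):
    print(f"x^{power}: regression {fitted:.5f}   stepwise {target:.5f}")
```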

If you try this with your classes, I recommend NOT starting with the quartic regression.  Students historically have difficulty understanding what series are and from where they come.  My anecdotal experiences from using this approach for the first time this year suggest that, as a group, my students are far more comfortable with series than ever before.

Ultimately, this activity established for my students the idea that polynomials can be great approximations for other functions, and at the same time we crudely developed the first terms of the Maclaurin series for e^x, namely e^x\approx\frac{1}{4!}x^4+\frac{1}{3!}x^3+\frac{1}{2!}x^2+x+1, a topic I’m revisiting soon as we explore derivatives.  We also learned that even very strong correlation coefficients can hide some pretty math.

Quadratics on the Sun

Thursday, I read (via @wired) about a really interesting tornado on the surface of the sun.

This twister was reported at nearly 186,000 mph with temperatures “between 90,000 and 3.6 million degrees Fahrenheit.”  Pretty stunning.

I thought the twisting winds might make a really interesting multivariable calculus problem, but a physics colleague, John Burk (@occam98), asked, “I wonder what the F-number is on a tornado with 186,000 mi/hr winds?”

OK, a new direction.  I started by finding the Fujita Scale for tornado categories:

  F0: 40-72 mph
  F1: 73-112 mph
  F2: 113-157 mph
  F3: 158-206 mph
  F4: 207-260 mph
  F5: 261-318 mph
  F6: 319-379 mph

That reminded me that the damage caused by a tornado is connected to the square of the wind velocity.  John found a “phenomenal quadratic fit” to the Fujita Scale for tornadoes by using the midpoint of each range.  I repeated John’s analysis and found basically the same results.  I also added a residual analysis.
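The midpoint fit is easy to reproduce outside a CAS. The sketch below assumes the seven wind-speed ranges correspond to F0 through F6 and that wind speed is modeled as a quadratic function of the F-number. Both are my reading of the analysis, chosen because they are consistent with the residuals reported in this post. The variable names are illustrative.

```python
import numpy as np

# Fujita wind-speed ranges in mph, assumed to correspond to F0 through F6.
ranges = [(40, 72), (73, 112), (113, 157), (158, 206),
          (207, 260), (261, 318), (319, 379)]
f_number = np.arange(len(ranges))  # 0, 1, ..., 6
midpoint = np.array([(lo + hi) / 2 for lo, hi in ranges])

# Quadratic regression of midpoint speed on F-number; highest power first.
a, b, c = np.polyfit(f_number, midpoint, 2)
residuals = midpoint - np.polyval([a, b, c], f_number)

print(f"speed = {a:.4f}F^2 + {b:.4f}F + {c:.4f}")
for f, res in zip(f_number, residuals):
    print(f"F{f}: residual {res:+.3f} mph")
```

With only seven points, any pattern in the printed residuals should be read cautiously.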

There are very few data points here, so I shouldn’t have been all that surprised by a residual pattern.  As John’s analysis suggested, the quadratic is very close to the data points: residuals\in{[-0.607,0.476]}, a very small interval relative to the dependent range.  But as a math teacher, I began to wonder:

  1. How would you set the range of each category if you use midpoints?
  2. Can the fit be any better?
  3. Once you get an equation, do the coefficients tell you anything?

I’m not sure how to answer the first question in any sort of non-arbitrary way, so I turned my focus to two other approaches:  using the minimum and maximum wind speeds for each F-category.  Those graphs and their residuals follow.

So all three appear to fit the F-scale data very well, with maybe a slightly less obvious residual pattern in the min and max curves (although I wouldn’t stake any deep claims on that assertion based on so few data points).  The residual range is residuals_{min}\in{[-0.929,0.667]} for the minimum speeds and residuals_{max}\in{[-0.429,0.571]} for the maximum speeds.  If forced to make a call, perhaps the maximum speeds are better for their smaller overall residual range and possibly less defined pattern.  Also, using the equation for the maximum (or minimum) wind speeds avoids the vague-endpoints issue that the midpoint approach encountered.
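The three choices can be compared side by side with one helper. As a sketch, this again assumes the ranges are F0 through F6 with speed modeled as a quadratic in the F-number; residual_range is a hypothetical helper name, not something from my CAS.

```python
import numpy as np

# Fujita wind-speed ranges in mph, assumed to correspond to F0 through F6.
ranges = [(40, 72), (73, 112), (113, 157), (158, 206),
          (207, 260), (261, 318), (319, 379)]
f_number = np.arange(len(ranges))
low = np.array([lo for lo, _ in ranges], dtype=float)
high = np.array([hi for _, hi in ranges], dtype=float)

def residual_range(speeds):
    """Fit speed = a*F^2 + b*F + c and return (min, max) of the residuals."""
    coeffs = np.polyfit(f_number, speeds, 2)
    residuals = speeds - np.polyval(coeffs, f_number)
    return residuals.min(), residuals.max()

summary = {
    "min": residual_range(low),
    "mid": residual_range((low + high) / 2),
    "max": residual_range(high),
}
for name, (lo, hi) in summary.items():
    print(f"{name}: residuals in [{lo:.3f}, {hi:.3f}], width {hi - lo:.3f}")
```

The width column makes the comparison concrete: the maximum-speed fit has the narrowest residual interval of the three.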

I don’t recognize anything about the coefficient values, so I tried converting the three equations into factored and vertex forms with nothing really enlightening there either.  I don’t know what I was hoping to find, but none of the coefficients in any of the forms seem to be sharing any secrets.

For now, I think I’ve run to the end of my reverse-engineering of the Fujita Scale.

Returning to the solar tornado that inspired all of this, I used my CAS to solve for the minimum and maximum Fujita scales based on these data, getting that the solar tornado would be rated somewhere between an F-270 and an F-285 tornado if it happened on Earth.  Wow.
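Solving for the F-number amounts to inverting each fitted quadratic with the quadratic formula. The sketch below makes the same assumptions as the fits (ranges labeled F0 through F6, speed quadratic in F) and keeps the positive root; f_rating is a hypothetical helper name. Keep in mind that this extrapolates a seven-point fit far beyond its data.

```python
import math
import numpy as np

# Fujita wind-speed ranges in mph, assumed to correspond to F0 through F6.
ranges = [(40, 72), (73, 112), (113, 157), (158, 206),
          (207, 260), (261, 318), (319, 379)]
f_number = np.arange(len(ranges))
low = np.array([lo for lo, _ in ranges], dtype=float)
high = np.array([hi for _, hi in ranges], dtype=float)

SOLAR_SPEED = 186_000  # mph, the reported solar tornado wind speed

def f_rating(speeds, target):
    """Fit speed = a*F^2 + b*F + c, then solve speed == target for F > 0."""
    a, b, c = np.polyfit(f_number, speeds, 2)
    disc = b * b - 4 * a * (c - target)
    return (-b + math.sqrt(disc)) / (2 * a)  # positive root of the quadratic

f_from_min = f_rating(low, SOLAR_SPEED)
f_from_max = f_rating(high, SOLAR_SPEED)
print(f"minimum-speed fit: F-{f_from_min:.1f}")
print(f"maximum-speed fit: F-{f_from_max:.1f}")
```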

In the end, perhaps it’s time for me to study the development of the Fujita Scale, but I’m pretty convinced from these tight fits that its wind-speed boundaries were not chosen arbitrarily.

In the meantime, I hope you find here a (not so surprising) connection to quadratic functions and possibly something to provide a deeper connection between mathematics and science, something woefully underrepresented for too many students.