How to Teach Scatter Plots, Line of Best Fit, and Data Analysis in 8th Grade

By 8th grade, your child has made bar graphs and calculated averages. Now the data standards jump to a new level: two variables at once. Can taller people jump farther? Does more study time predict better test scores? Do ice cream sales cause drownings? These are bivariate data questions, and answering them requires scatter plots, trend lines, and careful reasoning about what data can and cannot prove.

What the research says

The Common Core domain 8.SP (Statistics and Probability) focuses on bivariate data for good reason. Research on statistical reasoning suggests that students who work with real two-variable datasets develop stronger critical thinking skills than those who only compute summary statistics. The ability to look at a scatter plot and describe the relationship, including its direction, strength, and limitations, is foundational for science, social studies, and everyday data literacy. Studies also show that treating correlation as causation is one of the most persistent misconceptions in statistics, and explicit instruction is needed to combat it.

Scatter plots: plotting two variables

What they are

A scatter plot shows two measurements for each individual in a dataset. Each point represents one person, object, or event. The horizontal axis is one variable, the vertical axis is the other.

Teaching sequence

Step 1: Build a scatter plot from real data.

Collect data your child can relate to. Good options:

  • Height (inches) vs. shoe size for family members and friends (8-10 data points)
  • Hours of sleep vs. self-reported energy level (1-10) over two weeks
  • Temperature outside vs. number of people at a local park (estimate from observation)

Have your child plot the points on graph paper. Label both axes with the variable name and units. Title the graph.

Step 2: Describe what you see.

Before any math, ask these three questions:

  1. Direction: As one variable increases, does the other tend to increase, decrease, or do neither?
  2. Strength: Are the points clustered tightly or scattered loosely?
  3. Outliers: Are there any points that do not fit the general pattern?

Introduce the vocabulary:

Pattern                         | Term                 | Example
--------------------------------|----------------------|------------------------------------------------
Both increase together          | Positive association | Height vs. shoe size
One increases, other decreases  | Negative association | Price vs. quantity sold
No clear pattern                | No association       | Birthday month vs. test score
Points hug a line closely       | Strong association   | Study time vs. quiz score (in a motivated class)
Points are spread loosely       | Weak association     | Hours of TV vs. height

Activity: "What's the story?" Show your child 4-5 scatter plots (you can find these in any 8th grade textbook or create simple ones). For each, ask them to describe the direction, strength, and what it means in context. No numbers — just reading the picture.

Sample dialogue

You: "Here is a scatter plot of students' study time versus their test scores. What do you notice?"

Child: "The points go up from left to right. More studying, higher scores."

You: "Is it a strong or weak pattern?"

Child: "Pretty strong. Most points are close together, but there are a couple of students who studied a lot and still scored low."

You: "Those are outliers. Why might they exist?"

Child: "Maybe they were studying the wrong material, or they were distracted."

You: "Good. The pattern is real, but it does not apply to every single person."

Line of best fit

What it is

A line of best fit (or trend line) is a straight line drawn through a scatter plot that best represents the overall trend. It does not need to pass through any specific point. Roughly half the data points should be above the line and half below.

Teaching it without formulas

In 8th grade, students draw the line of best fit by eye — they are not expected to use the least-squares regression formula. Here is how to teach it:

Step 1: Eyeball method.

  1. Look at the scatter plot and imagine a line that "splits" the data evenly.
  2. Use a ruler or straightedge to draw it.
  3. Check: are there roughly equal numbers of points above and below the line? Is the line following the general direction of the points?
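If you would like to double-check a line drawn by eye, the "equal numbers above and below" test can be sketched in a few lines of Python. This is only an illustration: the ten data points, the slope of 5, and the intercept of 50 are invented for the example, and `count_above_below` is a hypothetical helper name, not part of any curriculum.

```python
# Hypothetical data: (hours studied, test score) for 10 students.
points = [(1, 55), (2, 62), (2, 58), (3, 70), (4, 68),
          (5, 78), (5, 74), (6, 85), (7, 83), (8, 92)]

def count_above_below(points, slope, intercept):
    """Count points above, below, and exactly on the line y = slope*x + intercept."""
    above = below = on = 0
    for x, y in points:
        line_y = slope * x + intercept  # height of the line at this x
        if y > line_y:
            above += 1
        elif y < line_y:
            below += 1
        else:
            on += 1
    return above, below, on

# A line drawn by eye: starts near 50 and rises about 5 points per hour.
above, below, on = count_above_below(points, slope=5, intercept=50)
print(f"above: {above}, below: {below}, on the line: {on}")
# prints: above: 5, below: 4, on the line: 1
```

A flat line at a score of 60 (slope 0, intercept 60) would leave 8 points above and 2 below, which is a quick sign that the line is not following the trend.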

Step 2: Use the line to make predictions.

Once the line is drawn, your child can use it to estimate values.

Based on the scatter plot, if a student studies for 4 hours, what score would you predict?

Find 4 on the x-axis, go up to the line, read across to the y-axis. That is the prediction.

Step 3: Discuss the limits of prediction.

  • Interpolation (predicting within the range of the data) is reasonable.
  • Extrapolation (predicting far beyond the data) is risky. If the data covers 1-8 hours of study, predicting the score for 20 hours of study is unreliable — the relationship may not continue in a straight line.
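The difference between the two kinds of prediction can be made concrete with a small Python sketch. It assumes a by-eye line of score = 5 × hours + 50 drawn from data covering 1 to 8 hours; those numbers and the `predict` helper are made up for illustration.

```python
def predict(x, slope, intercept, x_min, x_max):
    """Return (predicted y, kind); kind says whether x lies inside the data range."""
    y = slope * x + intercept
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return y, kind

# Assumed line: score = 5 * hours + 50, from data covering 1 to 8 hours of study.
for hours in (4, 20):
    score, kind = predict(hours, slope=5, intercept=50, x_min=1, x_max=8)
    print(f"{hours} hours -> predicted score {score} ({kind})")
# prints:
# 4 hours -> predicted score 70 (interpolation)
# 20 hours -> predicted score 150 (extrapolation)
```

The 20-hour prediction lands at 150 on a 100-point test, which is a good conversation starter about why a straight line cannot be trusted far outside the data.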

Activity: "Draw and predict." Give your child a scatter plot with 12-15 points. Have them draw the line of best fit, then use it to answer 3-4 prediction questions. Include at least one extrapolation question so you can discuss why it is less reliable.

Two-way frequency tables

Two-way tables show how two categorical variables relate. They are a different kind of bivariate data — categories instead of numbers.

Example

A survey asked 200 students about their preferred subject and whether they play a sport.

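Preferred subject | Plays a sport | Does not play | Total
------------------|---------------|---------------|------
Prefers math      | 45            | 30            | 75
Prefers English   | 35            | 40            | 75
Prefers science   | 40            | 10            | 50
Total             | 120           | 80            | 200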

What your child should be able to do

  1. Read the table. "How many students prefer science and play a sport?" → 40.

  2. Calculate relative frequencies. "What fraction of all students prefer math?" → 75/200 = 37.5%. "What fraction of sport players prefer science?" → 40/120 ≈ 33.3%.

  3. Look for associations. "Is there an association between playing a sport and preferring science?" Compare: 40/120 ≈ 33.3% of sport players prefer science, but only 10/80 = 12.5% of non-sport players do. Yes, there appears to be an association.
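The same arithmetic can be checked with a short Python sketch. The counts come straight from the survey table above; the variable names and the `share` helper are just illustrative choices.

```python
# Counts from the survey table: subject -> (plays a sport, does not play).
table = {
    "math":    (45, 30),
    "English": (35, 40),
    "science": (40, 10),
}

sport_total = sum(plays for plays, _ in table.values())    # column total: 120
no_sport_total = sum(no for _, no in table.values())       # column total: 80
grand_total = sport_total + no_sport_total                 # 200

def share(count, total):
    """Relative frequency as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)

math_share = share(sum(table["math"]), grand_total)              # 75 of 200
science_among_sport = share(table["science"][0], sport_total)    # 40 of 120
science_among_no_sport = share(table["science"][1], no_sport_total)  # 10 of 80

print(f"Prefer math (all students): {math_share}%")
print(f"Prefer science (sport players): {science_among_sport}%")
print(f"Prefer science (non-players): {science_among_no_sport}%")
# prints 37.5%, 33.3%, and 12.5% respectively
```

The key habit is choosing the right total: the grand total for "all students" questions, and the column total when comparing sport players with non-players.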

Activity: "Survey your family." Have your child create a two-way table from a real survey. Ask 10-15 people two yes/no or categorical questions (e.g., "Do you prefer morning or evening?" and "Do you drink coffee?"). Build the table, calculate relative frequencies, and discuss whether there is an association.

Correlation vs. causation

This is the most important conceptual lesson in the entire data unit.

Correlation means two variables tend to move together. Causation means one variable actually makes the other change.

The classic examples

  • Ice cream sales and drowning rates are positively correlated. Ice cream does not cause drowning. Both increase in summer because of heat — a lurking variable.
  • Shoe size and reading ability are positively correlated in children. Big feet do not cause better reading. Both increase with age.
  • Countries with more TVs per capita have higher life expectancies. TVs do not make people live longer. Wealth is the lurking variable.
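For parents who want to see a lurking variable at work, here is a small Python sketch. Every number in it is invented: monthly temperatures drive both ice cream sales and drownings, and neither of those two appears in the other's formula. The strength of the relationship is measured with the correlation coefficient r, a tool students do not meet until high school, so treat this as background for you rather than a lesson for your child.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented monthly average temperatures (degrees F), January to December.
temps = [35, 40, 50, 60, 70, 80, 85, 83, 75, 62, 48, 38]

# Small made-up month-to-month variation so the data is not perfectly tidy.
wiggle_a = [3, -2, 1, -3, 2, 0, -1, 2, -2, 1, -1, 0]
wiggle_b = [0, 1, -1, 0, 1, -1, 0, -1, 1, 0, 1, -1]

# Both quantities depend ONLY on temperature (plus a little variation).
ice_cream = [4 * t + 5 * a for t, a in zip(temps, wiggle_a)]
drownings = [0.2 * t + 0.5 * b for t, b in zip(temps, wiggle_b)]

r = pearson_r(ice_cream, drownings)
print(f"r between ice cream sales and drownings: {r:.2f}")
```

The two series come out strongly correlated even though the code never lets one influence the other. Temperature, the lurking variable, does all the work.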

Teaching it

Rule your child should memorize: Correlation does not prove causation. To establish causation, you need a controlled experiment, not just an observed pattern.

Activity: "Explain the real reason." Give your child 5-6 correlation statements and have them identify the lurking variable.

  1. "Students who eat breakfast score higher on tests." (Lurking variable: overall health habits and family stability.)
  2. "Cities with more police officers have more crime." (Lurking variable: population size — bigger cities have both.)
  3. "People who exercise more report less stress." (Could be causal, but also: people with lower stress may have more time and energy to exercise.)

For each one, ask: "Does A cause B, does B cause A, or is there a third factor causing both?"

Common mistakes to watch for

  • Drawing the line of best fit through the first and last points. The line should represent the overall trend, not connect two specific points.
  • Confusing "no association" with "negative association." A random scatter has no association. A clear downward trend is a negative association — it is still a pattern.
  • Ignoring context when making predictions. A line of best fit might predict a negative test score or a 200-hour study session. Your child should always ask: "Does this prediction make sense in the real world?"
  • Claiming causation from a scatter plot. A scatter plot shows association. Period. Causation requires experimental evidence.

When to move on

Your child is ready for high school statistics when they can:

  • Plot bivariate data accurately and describe the association (direction, strength, outliers)
  • Draw a reasonable line of best fit and use it for interpolation
  • Read and interpret two-way frequency tables, including relative frequencies
  • Identify lurking variables in at least 3 out of 5 correlation-vs-causation scenarios
  • Explain in their own words why correlation does not prove causation

What comes next

In high school, your child will formalize the line of best fit using the least-squares regression equation, learn correlation coefficients (r-values), and work with non-linear data models. The conceptual foundation built here — reading patterns, questioning causation, and making careful predictions — is exactly what makes those formal tools meaningful rather than mechanical. These skills also connect to algebra and functions, since every line of best fit is a linear function used to model real data.
