The CBS News Battleground Tracker is back, explaining what’s on voters’ minds and regularly providing detailed snapshots of the U.S. presidential election in each state throughout the 2020 campaign. In addition to the specific polls we conduct in key states in a given week, the Battleground Tracker map includes our best estimates and presidential race ratings in every state. This includes states we’ve polled extensively and states where we’ve surveyed few voters but have lots of other data.
What exactly is the Battleground Tracker, and where do the numbers come from? Here are five things to know.
1. Go state by state to understand this election
We take a state-by-state approach to describing the race and measuring public opinion, since the presidency is won in the Electoral College, not by the national popular vote. Indeed, relying too much on national polling can be misleading, as 2016 reminded us.
The Battleground Tracker looks at individual states, focusing on competitive ones. And we translate each candidate’s current support to the electoral vote scoreboard. Our state-by-state approach also gives a sense of what voters in different parts of the country think and feel about this year’s candidates, national issues, and local matters.
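To make the scoreboard idea concrete, here is a minimal Python sketch of how state-level support could be turned into electoral votes. The support numbers are invented for illustration and are not Battleground Tracker estimates; the function name `tally_electoral_votes` is hypothetical, and the sketch assumes every state is winner-take-all.

```python
# 2020 electoral vote allocations for three states.
ELECTORAL_VOTES = {"Michigan": 16, "Wisconsin": 10, "Georgia": 16}


def tally_electoral_votes(support, electoral_votes):
    """Give each state's electoral votes to the candidate with the highest support."""
    totals = {}
    for state, shares in support.items():
        leader = max(shares, key=shares.get)
        totals[leader] = totals.get(leader, 0) + electoral_votes[state]
    return totals


# Hypothetical support estimates in percent (illustration only).
example_support = {
    "Michigan": {"Democrat": 50.0, "Republican": 46.0},
    "Wisconsin": {"Democrat": 48.0, "Republican": 47.0},
    "Georgia": {"Democrat": 46.0, "Republican": 49.0},
}

scoreboard = tally_electoral_votes(example_support, ELECTORAL_VOTES)
print(scoreboard)
```

A real scoreboard would also need to handle states rated as toss-ups and the district-level allocation in Maine and Nebraska, which this sketch ignores.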
2. It’s more than just a poll
While surveying voters across the country is an integral part of the Battleground Tracker, this is more than your typical poll. It’s really a big data project. We combine polling, voter files (from L2 Political), U.S. Census data, and historical trends to get a clear picture of what’s going on in each state.
Here’s how we put together that combination:
- We know which candidates different types of voters are supporting from our polling, which includes much larger sample sizes — tens of thousands — than a typical poll;
- We know how many people like them are in each state and county, as well as their turnout history, from voter files and Census data;
- And we know each state’s previous election results, which enables us to anchor our 2020 estimates to recent history.
This approach achieves better estimates in states without as much polling. In 2016, for example, scarce polling in certain states like Michigan or Wisconsin led some to believe they were not as competitive as they turned out to be. That picture might have been improved by considering that these states were full of the same kinds of voters shifting to the Republican Party elsewhere.
We combine all of this data using a statistical technique called multilevel regression with post-stratification (MRP). One advantage of this method is that we use trends across the entire country to inform our picture of a particular state. For example, if we see white working-class voters across the Midwest shifting their support, we use this information to more precisely estimate support in Michigan. We collaborate on data collection and modeling with YouGov, building on our joint U.S. House model in 2018 and delegate model during this year’s presidential primaries. (You can scroll to the bottom for more details about MRP.)
3. Think snapshots, not forecasts
Our job is to tell you where races stand today and explain why, as well as what might change. Over the course of the campaign, voters will become more familiar with the candidates, and some will change their minds. In 2016, for instance, late-deciding voters propelled Donald Trump to victory in key states.
Unlike an electoral forecast, we’re estimating each candidate’s current support, incorporating all the data we’ve collected up to this point. For instance, if we estimate Joe Biden at 49% in a state with a margin of error of 3 points, we’re confident that his support there is between 46% and 52% today, not that the final result will be in that range.
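The arithmetic behind that range is simple enough to sketch in a few lines of Python. The helper name `support_interval` is hypothetical, and the sketch only adds and subtracts the margin of error; it says nothing about how the margin itself is calculated.

```python
def support_interval(estimate, margin_of_error):
    """Return the (low, high) range implied by an estimate and its margin of error."""
    return (estimate - margin_of_error, estimate + margin_of_error)


# The example from the text: 49% support with a 3-point margin of error.
low, high = support_interval(49.0, 3.0)
print(f"Current support is estimated between {low:.0f}% and {high:.0f}%")
```

The range describes support as of today. It is not a prediction of the final result.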
Similarly, our ratings reflect the current state of the race. A state that’s leaning toward one party today may be reclassified as a toss-up down the road if it becomes more competitive. Our race ratings are based in part on our estimate of each candidate’s current support, on the size of the potential error in that estimate, and on trends in each state’s recent history and the current campaign.
There’s nothing here to account for forward-looking uncertainty — nothing about changing economic conditions or rollercoaster debates, for example. We fully expect movement before the first vote is cast, so we’ll update everything regularly throughout the fall.
4. Electoral scenarios
That brings us to scenarios. Down the road, the Battleground Tracker will offer plausible scenarios for how the election might unfold. We’ll do this using a combination of statistical simulation and tweaking some of the assumptions underlying our model, resulting in a range of possible outcomes.
Here’s an example. One of the most challenging things to figure out will be turnout: who’s actually going to bother to vote? Modeling who is likely to vote is a perennial challenge that the coronavirus pandemic is likely to make even thornier this year.
In our baseline model, we estimate which voters are casting ballots based on both what they tell us they’re planning to do and historical patterns in their states. In our scenarios, we’ll slightly alter the model’s parameters to explore what could happen if, for example, large swaths of voters stay home (perhaps for fear of getting sick) or if there’s a surge in voting by mail (also possible given intense interest in this election). We’ll roll out our scenarios later in the campaign, so check back for them!
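As a rough illustration of how a turnout scenario can move an estimate, the Python sketch below recomputes a candidate's vote share after scaling one group's turnout. The groups, turnout rates, and support levels are all invented, and `candidate_share` is a hypothetical helper rather than part of the actual model.

```python
def candidate_share(groups, turnout_multiplier=None):
    """Candidate's share of votes cast, optionally scaling turnout for some groups."""
    multiplier = turnout_multiplier or {}
    votes_for = 0.0
    votes_cast = 0.0
    for name, group in groups.items():
        # Scale the baseline turnout rate, capping it at 100%.
        turnout = min(1.0, group["turnout"] * multiplier.get(name, 1.0))
        cast = group["voters"] * turnout
        votes_cast += cast
        votes_for += cast * group["support"]
    return votes_for / votes_cast


# Hypothetical electorate: two groups with different turnout and support.
groups = {
    "younger": {"voters": 1000, "turnout": 0.5, "support": 0.60},
    "older": {"voters": 1000, "turnout": 0.7, "support": 0.45},
}

baseline = candidate_share(groups)
# Scenario: older voters turn out at 80% of their baseline rate.
low_older_turnout = candidate_share(groups, {"older": 0.8})
print(f"baseline: {baseline:.4f}, scenario: {low_older_turnout:.4f}")
```

In this toy example the candidate does better when older voters stay home, because that group supports the candidate at a lower rate. The same function can be rerun with other multipliers to explore other scenarios.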
5. Solid track record
While we’re taking a different approach than traditional polling, the Battleground Tracker is based on rigorous methods from the worlds of political science, survey research, and statistics. Moreover, we have a strong track record employing similar models at CBS News over the past few years.
Our 2018 model performed particularly well, steadily tracking Democratic candidates’ improvement in key congressional races and the eventual Blue Wave in the U.S. House. In fact, our high-turnout scenario pretty much nailed the final result, as historically high turnout powered Democratic gains.
How does multilevel regression with post-stratification work?
If you want to know more about the data and statistical model we use — and don’t mind a bit of jargon — then keep reading…
First, we survey thousands of registered voters across the country and make sure to draw larger samples in battleground states, which we expect to be more competitive. The most important survey questions we ask for estimation purposes are how likely respondents are to vote and which candidate they would vote for today.
We then determine how people’s vote intentions are related to characteristics like age, gender, race, education, past vote, where they live, and so on. Each voter has a certain combination, which we’ll call a “profile” for shorthand. For example, one possible profile is a 36-year-old, black, college-educated female who votes in DeKalb County, Georgia. Change any one of these characteristics and you get another profile. We run a multilevel regression on the survey data to estimate, for each possible profile, the share of voters with that profile who intend to vote for each major candidate. The regression includes the voter characteristics above, as well as state and county effects (the levels in “multilevel”).
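The key intuition behind the “multilevel” part is partial pooling: groups with little data borrow strength from the broader pattern. The Python sketch below is a deliberately simplified stand-in for that idea, not the regression itself. It shrinks each state's observed support toward the overall rate, with `prior_strength` as an assumed tuning value and all sample counts invented.

```python
def partially_pooled_rate(supporters, sample_size, overall_rate, prior_strength=50.0):
    """Shrink a group's observed rate toward the overall rate.

    Small samples are pulled strongly toward overall_rate; large samples barely move.
    prior_strength is an assumed value acting like extra "pseudo-respondents".
    """
    return (supporters + prior_strength * overall_rate) / (sample_size + prior_strength)


# Hypothetical survey counts: (supporters, respondents) per state.
samples = {"State A": (12, 20), "State B": (540, 1000)}

total_supporters = sum(s for s, n in samples.values())
total_respondents = sum(n for s, n in samples.values())
overall = total_supporters / total_respondents

pooled = {
    state: partially_pooled_rate(s, n, overall)
    for state, (s, n) in samples.items()
}
for state, (s, n) in samples.items():
    print(f"{state}: raw {s / n:.3f}, pooled {pooled[state]:.3f}")
```

State A, with only 20 respondents, moves noticeably toward the overall rate, while State B, with 1,000, hardly changes. A real multilevel regression estimates the amount of pooling from the data and does so across many characteristics at once.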
The next step is estimating how many people of each voter profile live in each state. For this we use a combination of U.S. Census data and voter files, which includes counts of voters at very granular levels, such as voting precincts. In each state, we multiply the total number of voters of a given profile by the proportion of voters with that profile choosing a candidate (the “post-stratification” step). Aggregating across all voter profiles in a state, we finally get the estimate of that candidate’s statewide vote share. In Maine and Nebraska — the two states that award electoral votes by congressional district — we also estimate candidate support in each district.
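The post-stratification step described above amounts to a weighted average, which can be sketched in Python as follows. The profiles, voter counts, and support shares are invented for illustration, and `poststratify` is a hypothetical helper name.

```python
def poststratify(cells):
    """Statewide vote share: profile-level shares weighted by voter counts."""
    total_voters = sum(cell["voters"] for cell in cells)
    supporters = sum(cell["voters"] * cell["share"] for cell in cells)
    return supporters / total_voters


# Hypothetical state with three voter profiles.
# "voters" = number of voters with that profile (from Census data and voter files);
# "share"  = modeled proportion of that profile supporting the candidate.
cells = [
    {"profile": "profile 1", "voters": 200_000, "share": 0.60},
    {"profile": "profile 2", "voters": 300_000, "share": 0.40},
    {"profile": "profile 3", "voters": 500_000, "share": 0.50},
]

statewide = poststratify(cells)
print(f"Estimated statewide share: {statewide:.1%}")
```

The same calculation, run over only the profiles in a single congressional district, would give a district-level estimate of the kind needed for Maine and Nebraska.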