One day last November, after the 2009 elections, Philip Stark found himself in the back room of the Yolo County Administration Building doing something statisticians seldom do — testing a theory in the real world.
For four hours, Stark hand counted 1,400 randomly selected ballots to test his mathematical technique for auditing voting results, the first rigorous method since such audits or canvasses were required in every county in 1965.
Stark’s technique passed the test and five others, impressing California Secretary of State Debra Bowen, the state’s chief elections officer, and spurring her to sponsor a bill, AB 2023, that would set up a statewide experiment in 2011 with this kind of “risk-limiting audit.”
Endorsed by the American Statistical Association, the Brennan Center for Justice, Verified Voting, Citizens for Election Integrity Minnesota and California Common Cause, the bill breezed through a hearing last week, on April 20, in the California Assembly’s Committee on Elections and Redistricting, garnering unanimous, bipartisan support. A hearing in the California Senate is scheduled for June.
Stark, who testified at the hearing, was happy with the outcome, as were election officials around the state.
“Although the canvass (we do today) is a comforting thing, and it’s certainly necessary, it is not state-of-the-art,” said Freddie Oakley, the county clerk recorder of Yolo County and the person responsible for double-checking voting machine counts as well as certifying the final results. “We will be close to state-of-the-art with the bill that the secretary of state is sponsoring.”
“This bill will allow a pilot project, and I hope Marin County can be one of those pilot counties,” said Marin County’s registrar of voters, Elaine Ginnold.
New pilot program
Oakley and Ginnold worked closely with Stark in 2008 and 2009 to test post-election audit methods on top of the legally mandated hand audit of the ballots in 1 percent of precincts. AB 2023 will encourage more counties to volunteer to conduct an additional risk-limiting audit and give them extra time to do so, beyond the 28-day period legally mandated for a ballot count, hand audit and certification. A broader range of counties, with different sizes, voting systems and hand-tally procedures, will then be able to test Stark’s methods, and because the results must be reported to the state, the methods can be compared with current procedures.
If the experiment pans out, Stark’s technique, or one like it, may be mandated statewide, finally giving election officials (typically registrars of voters or county clerks) a reliable method to check not only machine counts but also the outcomes of close races.
“California’s 1965 law was the first to mandate audits before the election results were certified,” said Stark, a UC Berkeley professor of statistics. “But the problem with the law is that it only says, count 1 percent of the ballots by hand and tell the state what happened. It doesn’t say that, if you find a problem, you need to count more votes, or how to use the audit to determine whether the outcome was right.”
As a result, election officials have typically followed the letter of the law, and nothing more: count 1 percent of precincts by hand and inform the state of the results within a 28-day window after the election. The audits or canvasses occasionally found mechanical or computer programming errors — the original intent was to double-check machine counts — but little else.
Stark hopes to change that. His auditing methods can guide election officials on what to do if they discover a discrepancy, and how to judge whether they have counted enough votes by hand to confirm that the election outcome is right. The methods provide voters with a statistically meaningful measure of confidence that electoral results are accurate.
His own analysis of the problem, published in half a dozen scientific papers, shows that the current way of selecting ballots to audit is unlikely to find errors unless the problems are widespread across many precincts. If the problems, ranging from programming errors and voter errors to fraud, are restricted to a few precincts, such sampling would rarely detect them.
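A rough calculation shows why. The sketch below is an illustration, not a calculation taken from Stark’s papers. It assumes a hypothetical county with 500 precincts, a simple random sample of 1 percent of them (five precincts), and an audit that always notices the problem whenever it opens an affected precinct. The function name and all of the numbers are made up for the example.

```python
from math import comb


def detection_probability(total_precincts: int, bad_precincts: int, sampled: int) -> float:
    """Chance that a simple random sample of precincts contains at least one
    precinct with a problem (assumes every problem in a sampled precinct is noticed)."""
    # Probability that every sampled precinct comes from the problem-free ones.
    miss = comb(total_precincts - bad_precincts, sampled) / comb(total_precincts, sampled)
    return 1.0 - miss


# Hypothetical county: 500 precincts, 5 of them (1 percent) audited by hand.
for bad in (1, 5, 50, 250):
    chance = detection_probability(500, bad, 5)
    print(f"{bad:3d} affected precincts -> detection chance {chance:.3f}")
```

Under these assumptions, a problem confined to a single precinct is caught only 1 percent of the time, and the chance of detection becomes high only when a large share of precincts is affected.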
Legacy of 2000 election
In the face of recent vote counting problems, ranging from the Florida “hanging chad” fiasco that resulted in the 2000 election of President George W. Bush to questions surrounding the accuracy of electronic voting machines, Bowen created in 2007 the Post-Election Audit Standards Working Group to, in her words, “take a fresh look at whether there’s a way to improve the auditing process to increase the chances of catching any errors and improving the public’s confidence in the election results.” Stark was appointed the group’s statistician.
Only after joining the group did Stark think seriously about the auditing problem and realize that little work had been done to find the best, statistically sound way to conduct a post-election audit. In fact, he decided that previous researchers had been asking the wrong question. Rather than asking what percentage of the ballots you should count by hand to be confident of finding at least one discrepancy if the outcome is wrong, he said, it makes more sense to ask how confident you are that the outcome is right, in light of the discrepancies you find.
The important question, then, is not how much to count at first, but when you can stop counting, knowing that there is strong evidence that the outcome is right.
“That reformulation was key,” Stark said. “A risk-limiting audit should be able to correct a wrong outcome, but count as few ballots as possible if the outcome is right. You want to stop counting when you can be confident that counting more won’t change the answer.”
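The idea of a stopping rule can be illustrated with a short sketch. This is not one of Stark’s methods as described here. It swaps in a generic textbook tool, Wald’s sequential probability ratio test, and assumes a two-candidate race in which individual ballots can be drawn at random, with replacement. The function name, the 10 percent risk limit and the rule of giving up after a fixed number of draws are all assumptions made for the example.

```python
import random


def sequential_audit(ballots, reported_winner_share, risk_limit=0.10, max_draws=None, seed=0):
    """Draw ballots at random until the evidence for the reported outcome is strong.

    ballots: list of booleans, True if the ballot shows a vote for the reported winner.
    reported_winner_share: the winner's reported share of the vote (must exceed 0.5).
    Returns (confirmed, ballots_examined). If confirmed is False, this sketch
    assumes the audit would escalate to a full hand count.
    """
    if not 0.5 < reported_winner_share < 1.0:
        raise ValueError("reported_winner_share must be between 0.5 and 1")
    rng = random.Random(seed)
    if max_draws is None:
        max_draws = len(ballots)
    ratio = 1.0  # evidence for "the reported result is right" versus "it is a tie or worse"
    for examined in range(1, max_draws + 1):
        ballot = ballots[rng.randrange(len(ballots))]  # sample with replacement
        if ballot:
            ratio *= reported_winner_share / 0.5
        else:
            ratio *= (1.0 - reported_winner_share) / 0.5
        if ratio >= 1.0 / risk_limit:
            return True, examined  # strong evidence: stop counting
    return False, max_draws  # evidence never became strong enough


# Hypothetical contest: 10,000 ballots, reported and actual winner share of 60 percent.
contest = [True] * 6000 + [False] * 4000
print(sequential_audit(contest, reported_winner_share=0.60))
```

The design point is the one Stark describes: the audit does not fix the sample size in advance. A wide, correct margin lets it stop after a small number of ballots, while a narrow or wrong result keeps it counting.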
Stark has come up with nearly half a dozen methods, each requiring fewer and fewer ballot counts to achieve the goal of confidence in the outcome. Depending on the number of people available to conduct the audit and the sophistication of the voting machines, election officials can choose the best scheme.
The standard method now is to choose 1 percent of the precincts in a county and hand tally every ballot in those precincts, typically 500 to 1,000 ballots per precinct. Stark’s analysis, however, shows that tallying smaller batches (10 to 50 ballots, for example) achieves a higher level of confidence with less effort and cost. In fact, the best method statistically, and the one requiring the fewest ballot counts, is to lump all ballots together and choose randomly from those, he said.
Counting jelly beans
Stark draws an analogy between voting errors and jelly beans. Say you have 100 four-ounce bags of jelly beans, where each bag has a mix of flavors, possibly even all one flavor. The best way to determine the number of coconut jelly beans in the 100 bags is also the best way to determine the number of errors among all the ballots in 100 precincts.
Counting a single precinct has the same disadvantage as opening and counting the jelly beans in one four-ounce bag: the answer may or may not reflect the average among all 100 bags. You’d have to open many bags to get a reliable estimate of the number of coconut jelly beans in the mix. Or count many precincts to estimate the number of voting errors.
If you open all bags and mix the beans, however, then a four-ounce scoop is much more likely to yield an accurate estimate of the percentage of coconut jelly beans. Breaking the jelly beans into smaller batches, similar to lumping ballots into smaller batches, is better than counting by bag or precinct, but not better than randomly drawing from all beans or ballots.
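The jelly bean comparison can be tried in a small simulation. The sketch below assumes an extreme, hypothetical case: 100 bags of 100 beans each, with all the coconut beans packed into five bags, so the true share is 5 percent. It repeatedly estimates that share two ways, by opening one whole bag and by scooping the same number of beans from the full mix, and reports how widely each estimate swings. The names and numbers are illustrative only.

```python
import random
import statistics


def simulate(num_bags=100, beans_per_bag=100, coconut_bags=5, trials=2000, seed=1):
    """Return the spread (standard deviation) of the estimated coconut share
    when sampling one whole bag versus one scoop from all beans mixed together."""
    rng = random.Random(seed)
    # 1 marks a coconut bean (or a ballot with an error), 0 marks anything else.
    bags = [[1] * beans_per_bag if b < coconut_bags else [0] * beans_per_bag
            for b in range(num_bags)]
    all_beans = [bean for bag in bags for bean in bag]

    by_bag, by_scoop = [], []
    for _ in range(trials):
        bag = rng.choice(bags)                        # open one bag (one precinct)
        by_bag.append(sum(bag) / beans_per_bag)
        scoop = rng.sample(all_beans, beans_per_bag)  # scoop from the full mix
        by_scoop.append(sum(scoop) / beans_per_bag)
    return statistics.pstdev(by_bag), statistics.pstdev(by_scoop)


bag_spread, scoop_spread = simulate()
print(f"spread when opening one bag:  {bag_spread:.3f}")
print(f"spread when scooping the mix: {scoop_spread:.3f}")
```

In this setup a single bag reports either no coconut beans or nothing but coconut beans, so its estimate is far less stable than a scoop of the same size taken from the mix.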
Both Oakley and Ginnold think that pooling all the ballots and randomly selecting ones to hand-tally will be feasible as new vote-counting machines come online that can digitally scan ballots and assign them a number, thereby allowing a random selection of ballots.
“It’s a very intriguing concept,” Ginnold said. “I like the idea very much, and I certainly think it’s something that is going to be done in the future.”
While close races may always require a full or nearly full recount, Ginnold said, she is encouraged that academics have finally taken an interest in voting and audit issues.
“Academics like Professor Stark bring an unbiased, fact-based approach to solving problems, unlike some election reform activists that promote changes based on superstition and emotion,” she said. “It is the more objective approach that will result in meaningful election reform such as the proposal in this election audit bill.”
Meanwhile, Stark continues to refine his methods — so far the only risk-limiting schemes around — while keeping in mind his own experience with the tedium of hand counting.
“Professor Stark is really outstanding, because he understands the time, financial and even political constraints that we have,” Ginnold said. “He knows that we can’t abandon our method of vote counting, and has stuck with us to come up with some very good ideas to limit the risk of the wrong candidate being elected.”
“Working with Philip has been extremely interesting,” Oakley added. “We have learned a lot about outcomes and what affects outcomes. Our bottom line is that we want to count every vote the way the voter intended, and better canvasses are a way of getting close to that goal.”