joftius

May 29 2016

FiveThirtyEight is wrong: the system IS rigged against Sanders

FiveThirtyEight is wrong

Nate Silver and Harry Enten claim The System Isn’t ‘Rigged’ Against Sanders. I’ve written at length already debunking their argument and drawing attention to the statistical malpractice they rely on to make it. To summarize, their argument is that caucuses have favored Sanders by suppressing the vote, and that somehow this disadvantages Clinton supporters more than Sanders supporters. Using a severely flawed statistical model they estimate that Clinton would have done 20-25% better in caucus states if they held primaries instead. To their credit, Silver and Enten attempted to address the question of having open vs closed primaries. But despite the sweeping title of their article (the system!), their focus is entirely too narrow. They identified two possible mechanisms by which the system could be influencing votes: caucuses vs primaries and whether or not the vote is open to independents.

The system IS rigged against Sanders

I conducted my own analysis to address some problems with theirs. Their model included percent of population that is black, percent Hispanic, whether the vote was a primary or caucus, whether it was open or closed to independents, and the national polling margin at the time of the vote. I’ll do several things slightly differently. Instead of national polling margin, I’ll just use the date–this is highly correlated with the national polling margin anyway. I did this more out of convenience than anything else, because my data already had date but not national polls. This difference is not important. Next, I’ll include just one more variable: whether or not the state has same-day registration. It just so happens that almost every caucus state also has same-day registration. Here are the coefficients of the resulting model:

Variable	Estimate	Std. Err.	p-value
(Intercept)	69.2271	4.2055	<0.00
Date	7.5544	7.1372	0.3
Deadline	-5.6614	5.1451	0.28
Type	-5.946	5.2812	0.27
Independents	2.232	2.7884	0.43
RaceBlack	-1.1415	0.1488	<0.00
RaceHispanic	-0.3431	0.1974	0.09

Let me break this down for you. Ignore the (Intercept) variable. The Date variable estimate of roughly 7.5 means that, on average, Sanders has gained 7.5% comparing the most recent votes to the first votes early on. The Deadline variable at about -5.7 means Sanders loses about 5.7% on average when states do not allow same-day registration. The Type variable means Sanders loses 5.9% in primaries compared to caucuses, again on average. (As an aside, if I leave out Deadline and have almost the same model as 538, the Type variable estimate is about -10.46, still not quite the absurd estimate Silver and Enten present). The Std. Err. and p-value columns tell us roughly how certain we can be that the estimate is good and that the effect isn’t really just zero. Many of the p-values are above the traditional 0.05 “significance” cutoff because this model is not very good.
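To make the Std. Err. column concrete, here is a minimal Python sketch (it is not part of Rigged.R) that divides each estimate in the table above by its standard error to get a t-statistic and forms a rough 95% interval. The multiplier of 2 is an assumption: a common rule of thumb, not the exact critical value for a sample this small.

```python
# Estimates and standard errors copied from the first coefficient table.
coefs = {
    "Date": (7.5544, 7.1372),
    "Deadline": (-5.6614, 5.1451),
    "Type": (-5.9460, 5.2812),
    "Independents": (2.2320, 2.7884),
    "RaceBlack": (-1.1415, 0.1488),
    "RaceHispanic": (-0.3431, 0.1974),
}

# Assumption: rule-of-thumb stand-in for the exact t critical value.
ROUGH_MULTIPLIER = 2.0


def summarize(table, multiplier=ROUGH_MULTIPLIER):
    """Return {name: (t_statistic, interval_low, interval_high)}."""
    out = {}
    for name, (estimate, std_err) in table.items():
        t_stat = estimate / std_err
        low = estimate - multiplier * std_err
        high = estimate + multiplier * std_err
        out[name] = (t_stat, low, high)
    return out


summary = summarize(coefs)
for name, (t_stat, low, high) in summary.items():
    print(f"{name:13s} t = {t_stat:6.2f}   rough interval: [{low:7.2f}, {high:7.2f}]")
```

With this rough multiplier, every interval except RaceBlack's straddles zero, which is the same story the p-value column tells.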

Let’s try a better model. As I described in a previous post, Silver and Enten are not adjusting for other important demographic variables like age, income, and so on. Due to the limited sample size (I have 44 rows in my data), it’s not realistic to simultaneously estimate many demographic effects. I’ll just include two more variables: median age and percent of population having a high school degree or less.

Variable	Estimate	Std. Err.	p-value
(Intercept)	130.7826	23.6031	<0.00
Date	6.9082	6.3529	0.28
Deadline	-5.3206	4.4755	0.24
Type	0.3334	4.9385	0.95
Independents	0.983	2.5311	0.7
RaceBlack	-1.1497	0.1502	<0.00
RaceHispanic	-0.8587	0.2384	<0.00
MedianAge	-0.8648	0.5918	0.15
EduHSorless	-0.7326	0.2329	<0.00

Surprise! The Type estimate is now only 0.33, meaning that after even a slight adjustment for age and education the caucus effect all but vanishes: the estimated difference between primaries and caucuses is 0.33% (nominally in favor of primaries, given the sign convention above), far smaller than its standard error of about 4.9. The Deadline estimate is still roughly the same. The fact that the Deadline estimate is stable to this change in the model gives me more confidence that its effect is real. If I include another variable, InternetAccess (an estimate of what percent of the population has access to high-speed internet), the Deadline estimate becomes -4.87 and Type is -0.25, consistent with the above. If I also include some regional indicators for the Southeast, Northeast, and West (leaving the Midwest as part of the intercept), Deadline becomes -5.74 and Type becomes 0.55, which taken at face value means Sanders actually benefits from primaries relative to caucuses.
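The stability argument can be checked with simple arithmetic. The sketch below is an illustration only: it collects the Deadline and Type estimates quoted in the text for each model specification and compares how far each one moves. The specification labels are my own shorthand, not names from Rigged.R.

```python
# (label, Deadline estimate, Type estimate) for each model described in the text.
# Labels are illustrative shorthand, not identifiers from the original code.
specs = [
    ("baseline", -5.6614, -5.9460),
    ("+ age, education", -5.3206, 0.3334),
    ("+ internet access", -4.87, -0.25),
    ("+ region indicators", -5.74, 0.55),
]


def spread(values):
    """Range (max minus min) of a list of estimates."""
    return max(values) - min(values)


deadline = [d for _, d, _ in specs]
type_est = [t for _, _, t in specs]

deadline_spread = spread(deadline)
type_spread = spread(type_est)

print(f"Deadline moves by {deadline_spread:.2f} points across models")
print(f"Type moves by {type_spread:.2f} points across models")
print("Deadline keeps its sign:", all(d < 0 for d in deadline))
print("Type keeps its sign:", all(t < 0 for t in type_est) or all(t > 0 for t in type_est))
```

The Deadline estimate stays negative and moves by less than one point, while the Type estimate swings by more than six points and changes sign.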

The data and code for this analysis are available in this GitHub repo in the files DemPrimaryData.csv and Rigged.R

It’s the voter registration deadlines, stupid

I shouldn’t be writing any of this. I’m supposed to be finishing my thesis right now. So I’m not going to spend the time to find data for primary turnout this year and do a regression to show that turnout is depressed by early registration deadlines. Instead, I will cite several facts which are either obvious or easy to verify with Google.

Young people are more likely to be first time voters.
Young people and first time voters are less likely to be registered, and if they are registered they are more likely to be registered as Independents.
Young people and first time voters are less likely to know that registration deadlines exist and can be surprisingly early.
Some states with closed primaries, like New York, have even earlier deadlines for party affiliation changes. New York’s was back in October of 2015, four days before the first Democratic debate. New York’s turnout was also second lowest of any state…

Nate Silver and Harry Enten ignored all of this. They conducted a highly flawed statistical analysis that left out important demographic controls and had no data at all related to registration deadlines or other forms of voter suppression. Enten in particular with his background in political science should know there is a vast literature of research on voter suppression involving things like registration deadlines and voter ID requirements. By pretending that the caucus effect is the only one that matters, they claim to answer a far bigger and far more important question than they actually do, and the answer they give for their limited question is still flawed.

Bernie might have been winning…

My own analysis, controlling for more demographic variables and checking that my results are stable when I add or remove several of these controls, shows that Sanders probably lost at least 5% on average in states that did not allow same-day registration. Sanders currently has about 45% of the delegates. It’s impossible to say anything counter-factual about this with certainty, but try to imagine how different things would be. The first Super Tuesday would have been far less devastating, and we may never have seen the widespread media narrative that developed about Clinton’s commanding lead in “delegate math.” The following states might have switched from a loss/tie to a tie/victory.

State	Registration deadline (days before vote)	Vote % Bernie
North Carolina	23	40.76
Arizona	28	41.39
New York	23	42.01
Ohio	27	43.13
Pennsylvania	28	43.56
Kentucky	28	46.33
Connecticut	1	46.42
Illinois	27	48.61
Massachusetts	19	48.69
Missouri	26	49.36
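As a rough check on what a 5-point effect would mean for the states in the table above, here is a small Python sketch. It rests on two loud assumptions that are mine, not results from the regression: the shift is applied uniformly to every state, and crossing 50% is used as a crude proxy for winning.

```python
# Sanders vote share (%) copied from the table above.
shares = {
    "North Carolina": 40.76,
    "Arizona": 41.39,
    "New York": 42.01,
    "Ohio": 43.13,
    "Pennsylvania": 43.56,
    "Kentucky": 46.33,
    "Connecticut": 46.42,
    "Illinois": 48.61,
    "Massachusetts": 48.69,
    "Missouri": 49.36,
}

SHIFT = 5.0       # assumption: the "at least 5%" effect applied uniformly
THRESHOLD = 50.0  # assumption: crude proxy for winning the state


def flipped(table, shift=SHIFT, threshold=THRESHOLD):
    """States whose shifted share exceeds the threshold, sorted by name."""
    return sorted(state for state, share in table.items() if share + shift > threshold)


flipped_states = flipped(shares)
print("States above 50% after the shift:", flipped_states)
print(f"New York after the shift: {shares['New York'] + SHIFT:.2f}%")
```

Under these assumptions the states that cross 50% are Connecticut, Illinois, Kentucky, Massachusetts, and Missouri, and New York rises to about 47%.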

Conservatively, Bernie might have won 4 or 5 more states, and might have come close to a tie in New York. The clear change-point in this graph might not have happened:

[Figure: "depressed"]

I think it’s safe to say that the lack of same-day registration is a very significant factor in Clinton’s lead. In all of this, I did not even begin to ask how it might have been different if closed primaries were open to independents.

Feb 22 2016


Despite lower turnout than 2008, Bernie’s political revolution is still on track

In the first three Democratic primary/caucus contests of 2016, voter turnout was lower than in 2008. Various news outlets have run articles saying this is a bad sign for Bernie’s political revolution, for example this Vox article by Jeff Stein or this MSNBC article by Steve Benen. In this post I present a counter-argument.

Comparison to 2008 is misleading.

It’s true that turnout so far in 2016 has been lower than it was in 2008. However, this is because 2008 was an outlier year. Consider turnout in the Iowa caucus. If we only look at the years 2008 and 2016, as the aforementioned articles do, it would be a bleak picture for Bernie indeed (and Hillary…). But look what happens when we expand the window to include a few more presidential election years:

[Figure: Iowa Democratic caucus turnout by presidential election year]

Years not displayed were either midterms with lower turnout, or 1996 when Bill Clinton ran unopposed. These numbers were taken from Des Moines Public Library. It is clear that 2008 was an outlier, so we should be careful in trying to compare the present. We need to understand why it was an outlier before making conclusions about 2016.

First and most importantly, the 2008 election followed one of the most disastrous presidencies in modern US history. At the end of his term, Dubya was one of the most unpopular presidents ever with an abysmal 22 percent final approval rating. Also, in 2008 the Iowa caucus was held early in January when most college students were still on break. Sure enough, in 2008 voters aged 18-27 were 22% of Iowa Democratic caucus attendees, compared to just 18% in 2016.

In New Hampshire, turnout was about 118,000 in 1988, 154,639 in 2000, 287,557 in 2008, and 250,974 in 2016. This is less damning than Iowa. There is less justification for making comparisons in Nevada because it was recently changed from a primary to an early caucus, so previous years could be different for any number of reasons. At any rate, the 2/3 Nevada turnout cited in the Vox article is also less damning than the Iowa numbers, where turnout was only about 60% of what it was in 2008.
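For readers who want the New Hampshire comparison as ratios, here is a short Python sketch using only the turnout figures quoted above (the 1988 figure is approximate in the text, so it is approximate here too).

```python
# New Hampshire turnout figures quoted in the paragraph above.
nh_turnout = {
    1988: 118_000,  # "about 118,000" in the text
    2000: 154_639,
    2008: 287_557,
    2016: 250_974,
}


def ratio(year, base_year, table=nh_turnout):
    """Turnout in `year` as a fraction of turnout in `base_year`."""
    return table[year] / table[base_year]


print(f"2016 vs 2008: {ratio(2016, 2008):.1%}")
print(f"2016 vs 2000: {ratio(2016, 2000):.1%}")
```

By these figures, 2016 turnout was roughly 87% of the 2008 level but about 162% of the 2000 level, which is why New Hampshire looks less damning than Iowa.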

In summary, 2008 was an outlier for Democratic turnout because of the unpopular Republican president in office at the time. Sure enough, there were only 118,000 Republicans participating in the Iowa caucus in 2008 compared to 180,000 this year. Arguably the fairest comparison for Democratic turnout in Iowa in 2016 is Iowa in 1980 or 2000, when the incumbent president was a Democrat. In that case, the political revolution effect is yooge, about the same size as the incumbent reversal effect Republicans experienced this year. Even compared to 2004, when Bush was still in office, 2016 had a significantly larger turnout.

The revolution really needs to happen in the general election, not the primary.

As long as Bernie can secure the nomination, it doesn’t matter if turnout in the primaries and caucuses is not at all time record highs. The place where he needs to win big in order to have a shot at implementing his agenda is in the general election. One might worry that disappointing turnout in primaries will lead to disappointing turnout in the general election. That’s a valid concern, so let’s address it by asking the fundamental question: Can Bernie win the general election by a large margin?

My answer: all signs currently point to a resounding yes. First let’s consider what drives high voter turnout, specifically what happened in 2008. The answer is that young people, blacks, and Hispanics all had higher turnout that year. Bernie’s support among young people is higher than Obama’s was in 2008. Some analysts have persistently pointed to Clinton’s high levels of support among non-whites as a sign of trouble for Sanders. It’s true that Bernie will have to win over more non-whites to win the nomination, and I believe he can and will. But it’s naive to suggest that a large margin of support among minorities in primaries for Clinton would translate to the kind of minority turnout that Obama was able to generate in 2008. I don’t know if either candidate will bring as many minorities out to vote because they are both white. But it is abundantly clear that Clinton would have a big millennial problem in the general election.

Another demographic trend that could have a big impact in the general election is the rise of independents. Look at this Gallup polling data:

Registered Republicans have plummeted, Democrats have declined but not as much, and Independents now dominate either group by a large margin. This trend also helps explain why this election cycle has political outsiders like Sanders and Trump doing so well. Clinton is perhaps the most Democratic party establishment figure to run for president in recent history, having picked up more insider endorsements earlier than anyone else:

Can she draw support from Independents in the general election? I strongly doubt it. Can Bernie? You better believe it. And dangerously, even Donald Trump and Ted Cruz get a much greater proportion of independent voters than Clinton:

In case you missed her in that graph, she’s right below Carly Fiorina. Clinton’s demographic issues with young voters and independents should be enough to make Democrats think twice before choosing her as a nominee. In fact, I would argue they should be downright terrified about her prospects. However, that’s not the point of this post… The demographic story here about Sanders is that he absolutely can carry the general election by a large margin. If he is the nominee, the voters who would have chosen Hillary will mostly still show up and vote for Bernie against Trump, Cruz, or Rubio. Some small minority will remain obsessed over the “socialist” label, and the 1% might be upset about their taxes. But I’m far less worried about those issues, considering Obama has weathered the same labeling and ran on a similar platform of increasing taxes on those with income over 250k, than I am about Clinton’s troubles.

Aside from speculation about demographic issues there’s another way to look at electability: favorability. This chart shows that high net favorability is an excellent predictor of the general election outcome. In fact, the only counter-example was a case where Gore won the popular vote and the outcome was decided by the Supreme Court…

So what does the favorability picture look like for Clinton and Sanders? These graphs show HuffPost’s average of polls over time:

Bernie’s favorability is good and looks like it’s getting better. Hillary’s favorability is bad and looks like it’s getting worse. This isn’t absolute proof that Bernie would fare better than Hillary in the general. But there is still more evidence available in hypothetical matchup polls. In those polls Sanders does better than Clinton against Trump, Cruz, Rubio, and Kasich, and about the same versus Carson. Of course, such polls also have their flaws and are not absolute proof. However, given the earlier discussion of demographics, this paints a consistent picture. Even if it’s optimistic for me to believe Bernie’s political revolution will allow him to enact a significant portion of his agenda, it seems clear that he is at least better positioned to win the general election than Clinton. Considering the Republican field, that should be terrifying to Clinton’s supporters.

One more difference between primaries and the general election: time. Sanders started way, way behind Clinton, and has had to spend all his time catching up. He doesn’t have the advantage of a huge political machine working on his behalf, and still he is keeping up and giving them a run for their money (literally!). If he wins the nomination, his organization will have expanded significantly by that time and he’ll be in a much better position to GOTV nationwide. Arguably, one more reason why turnout is low this year is that the machine which generated Obama-level turnout in 2008 is actually not trying to generate such turnout now because that would hurt their candidate. Again, if Bernie gets the nomination then this machine can do general election GOTV efforts among young people, pulling out all the stops.

Low turnout is also a problem for Clinton.

Sanders is proposing a more ambitious agenda so low turnout in the general would be a bigger problem for him. But if the drop in turnout from 2008 reflects poorly on anyone, it should be the political establishment. The Iowa and Nevada state Democratic Parties should be ashamed at how disorganized and chaotic their caucuses were. The DNC and all its superdelegates should reflect on how their actions make voters feel shut out of the decision process.

Focusing on Iowa again as an example, let’s ask which candidate is actually drawing in new voters. In 2008, 43% of Democratic caucus-goers were first-time participants. Given the incredibly high turnout that year, and the 4% drop in the age 18-27 group this year, we should expect in 2016 a much lower proportion who never participated before (they probably participated in 2008). In fact, it held steady at 44% this year, and Sanders carried the first-timer group with 59% to Clinton’s 37%. Even the Editorial Board of the NYTimes is feeling this Bern:

The youth vote’s biggest beneficiary by far is Bernie Sanders, who filled venues in Las Vegas with cheering young admirers last week, after winning more than 80 percent of this group in both Iowa and New Hampshire. On Saturday young people made up 18 percent of voters in Nevada’s Democratic caucus, five percentage points more than in 2008.

There is also evidence that Sanders drew out a greater proportion of Latino voters in Nevada. According to the WCV Institute, Latinos made up 19% of the caucus compared to just 13% in 2008, and Sanders won in that demographic by about 8%, larger than Clinton’s margin in the state overall.

In summary, Sanders really is drawing more people into the political process. Clinton has much less of a claim to that, so if anyone in this race is to blame for the drop in turnout since 2008 it’s her.

In conclusion…

Is it really optimistic for me to think Bernie’s political revolution could succeed? Consider how much success he has had already, coming from so far behind, struggling against a monolithic establishment that placed all its weight behind his opponent before he even announced. If he manages to overcome that and win the nomination, imagine what he can do in the general election with the Democratic establishment riding his coattails rather than actively opposing him.

Oct 01 2012

Academia

Universities, the purpose and future of

MOOC stands for massive open online course. MOOCs may or may not drastically change higher education. There are some obvious good aspects to them, like making high quality learning material available to people who otherwise can’t access it. But some of my colleagues who, like me, aspire to be professors some day are worried that MOOCs may put them out of a job. Why should a university hire them to teach a class year after year when students can see recorded lectures instead? And the clincher: these lectures are given by top professors from top universities.

Discussing these concerns led me to think about the purpose of universities. I’m writing this post to gather my thoughts on that topic, sort of “thinking aloud.” Here are several different views on the purpose of universities and the resulting predictions each view gives about the effect of MOOCs on academia. (The views vary in how descriptive or prescriptive they are.)

The classical economics view (circa the World Wars) is that universities train the labor force and conduct research that improves technology. Both of these things increase productivity (hence GDP), and that’s (supposedly) best for everyone (the underlying utilitarian philosophy or assumptions of economic theory are just not open for discussion). In this case it is certainly true that the work force could be trained with far fewer professors than we currently have (if for no other reason than that we can cut out subjects/departments like philosophy which don’t improve worker productivity), so we aspiring academics should despair. I don’t find this view very convincing, especially the part about training the labor force. I think the vast majority of college graduates will end up working in positions that are not closely related to their degree and could have learned the relevant skills on the job.

The technological superiority view (circa the Cold War) is a modification of the above which places less emphasis on teaching and more on research. Here the purpose of universities is to advance technology, giving their host nations an advantage in arms races or space races, or improving medicine so we can all live longer, etc. In this view MOOCs have basically no negative effect on teacher employment. The number of years of required education is increasing as domains of knowledge become deeper. So even if all the intro courses are MOOCs we will still need teachers for upper level, highly specialized courses. And the number of specializations is growing fast. I find this view deficient because better technology doesn’t always leave us better off. Remember that time we almost wiped each other out with cutting-edge technology? I also have the opinion that much research is a waste of time, not because it isn’t eminently useful, but because it’s actually low quality scholarship done mostly for the sake of increasing the number of publications.

The humanist view (circa before the World Wars, but humanities profs/majors will never let go) is what most liberally-minded people want to believe about their own reason for going to college. “Higher education” doesn’t just mean it’s a level above “high” school; it means our minds or spirits or quality of life (or whatever) are improved by learning. People with this view usually espouse “liberal arts” education, because even if you’re studying to be an engineer you should take an art class and learn to better appreciate the arts because that improves you as a human being (it might even inform your choices as an engineer, e.g. Steve Jobs and the Apple aesthetic). MOOCs are either bad because they are placing even less emphasis on humanities (it’s difficult to grade things like long papers, art projects, etc., in a MOOC format), or they are good because they are making education free and more available. I am very sympathetic to this view, if for no other reason than that I am a contrarian at an engineering school. Some people with this view (but not all) tend to undervalue the other benefits of universities like scientific research.

Before I offer my own view, notice some things about universities that none of the views above explain. None of those views mention anything about maximizing the university endowment, increasing the prestige of the school, or anything like that. However, all universities behave (organizationally) as though those types of things are their most important goals. This is despite the fact that most universities explicitly state in their charters that they exist to serve the betterment of humanity or something like that. The school could have more endowment money than it knows what to do with, donations constantly rolling in, and tuition rates already scheduled to increase, and they will still re-re-subcontract their custodial service, firing all the janitors and re-hiring them at an even lower wage. Why? (The cheap and easy explanation is that many university administrators are business school graduates and they are simply behaving the way they learned to behave in business school.)

Also note that the demand for education from “top” schools is much larger than those schools meet. Every year they receive far more qualified applications than they admit. Most of them could easily spend part of their large endowments to expand and accommodate more students, and also create more academic jobs in the process. Why don’t they?

My own view (descriptive) is that universities serve all the purposes above to varying degrees, but they are also a pyramid scheme of cushy jobs and fierce guardians of elitist credentials. They provide security and leisure to the class of people who can succeed at academia. The pyramid structure (both within and between schools) enhances prestige through intense competition at the lowest levels, and prestige is important because it justifies the credential elitism. MOOC certificates can never compete with actual degrees precisely because they are open (so they fail at being elitist). MOOCs may take jobs away from the people with less job security, such as adjuncts. But they will not be allowed to threaten the job security of the people with cushy jobs, because those cushy jobs are one of the main points of the entire system. And though (top) universities could provide more cushy jobs by using their large endowments to grow, that might dilute their credentials. So my theory also explains why they don’t do that: they are protecting their elite status and therefore the class of people who already succeed in the system.

I am a big fan of cushy jobs, but elitism makes me sad (if I didn’t benefit from it, it would probably make me more mad than sad). I hope that the humanist view becomes more popular (it’s close to what I think the purpose of universities should be), because I think the lessons of great literature, for example, will help us choose systems that work much better for all of us. I want everyone to have nice jobs, not work too much, and have more time to spend enriching their lives (by taking MOOCs, for example). And I think these goals are realistic (they might all be accomplished by just having shorter work weeks).

Tagged academia, MOOCs, thinking aloud