August 29, 2006

Slicing and dicing the Senate
Posted by Teresa at 11:27 AM *

From the weblog Ascription is an Anathema to any Enthusiasm, which is big on graphical representations of statistical data, comes Slicing and Dicing the Senate (via). It displays and explains a series of charts from VoteView.com, titled The Current U.S. Senate: Optimal Classification Estimates of Ideal Points and Cutting Lines for the Last 50 votes.

Don’t that just sound like salted peanuts? But it really is interesting:

Each point on the charts represents one US Senator; they don’t move. The idea of this technique, “Optimal Classification Estimates”, is to reduce each legislator to two numbers and then pin them onto that graph. It’s extremely reductionist, but it works. The lines slicing thru the chart illustrate how voting proceeds on various bills. A line running from top to bottom reveals that a bill was decided largely on economic issues; while a line running left to right is a bill that was decided on social issues. The Democrats on the left are economically liberal, i.e. they tend to look out for the weaker and more numerous economic actors. The Republicans on the right are economically conservative; they look after the economically large, but few.

Because the majority of my readers live in the United States: for “weaker and more numerous economic actors,” read “you; as in, you personally.”

The chart on the right shows how well the model works. The handful of points shown there are Senators whose votes didn’t fit the model. …

The model is extremely accurate: around 95% these days. Amazingly, you don’t actually need two axes; you will get 90% accuracy with a single axis that runs almost top to bottom, but slices slightly at an angle. You can see the entire Senate sorted into that ordering here. For example, Joe Lieberman isn’t the most conservative Democratic Senator; there are a handful who are more to the right than he is.

I’ve written about this model before, and I keep coming back to it because it totally changed the way I think about politics. It’s all economic; all the noise about social issues never actually flows thru into the legislative agenda.

That’s a profound point. I’ve strongly suspected it was so, but didn’t have the numbers.

This also explodes the idea that there’s no real difference between the Democrats and the Republicans. That’s what Enthusiasm was writing about in that earlier essay. There is a difference. It is durable and consistent. Republicans make a lot of distracting noise about social issues that command the attention of many middle-class and working-class families; but when they turn from talking to acting, they cater to a small number of very rich people. Odds are, none of them are you.

Comments on Slicing and dicing the Senate:
#1 ::: Ailsa Ek ::: (view all by) ::: August 29, 2006, 01:09 PM:

I am not at all surprised to see Sununu at the very bottom of the list. Hmm, one plus if we move to NH - I get to vote against him.

#2 ::: Greg London ::: (view all by) ::: August 29, 2006, 01:20 PM:

I must be dense. I had to read the post, look at the link with the graphs, read the post, look at the graphs again, read the post, and look at the graphs one more time before I finally got it. At least, I think I got it.

If I in fact did get it, it's pretty wild.

Now I want them to go back through time and do it for the entire history of the US Senate so I can see the markers shift over time, or if they shift.

#3 ::: Skwid ::: (view all by) ::: August 29, 2006, 01:29 PM:

If ever there were a series of charts that would benefit from some hypertextual markup, this is them.

#4 ::: P J Evans ::: (view all by) ::: August 29, 2006, 01:45 PM:

I followed the linky from the blog post, and it looks like they have done it for some of the earlier Congresses (Congressi?). The farther back in time you go, the more difficult it may get, so I'm not expecting lots of backward analysis on this one.

It would be nice if it had more title to each vote than just 'roll call number n'.

#5 ::: Bill Higgins-- Beam Jockey ::: (view all by) ::: August 29, 2006, 02:19 PM:

The thought arises that my representatives could be replaced, at least for the purposes of voting, by a box following a simple algorithm.

I don't like that thought.

#6 ::: Todd Larason ::: (view all by) ::: August 29, 2006, 02:25 PM:

#2: There's _lots_ of data and charts at voteview.com, including lots of historical data. I don't have a single page to point you to which fits what you want (which may just mean I haven't found it yet), but the discussion of Strom Thurmond's career has some nice latter-20th-C charts:
http://voteview.com/Thurmond_Lott_Frist.htm

#7 ::: Dave Bell ::: (view all by) ::: August 29, 2006, 02:38 PM:

The UK system has much stronger parties, much more voting on party instruction, so this wouldn't work well.

#8 ::: Claude Muncey ::: (view all by) ::: August 29, 2006, 02:45 PM:

I do find these results intriguing and I want to agree with them. Just a question about methods with this first.

Following the various links, I found their description of what I feel is a basic issue with their work, how legislators and bills get classified as liberal or conservative in the first place. Each legislator has one or more scores (depending on whether or not you are using a one variable or two variable model) and each bill has the same. The analysis is mapping one set of values against another in a sophisticated way. But how are the legislators and bills scored for each of these values?

According to the paper above, what they did was assign what they thought was a reasonable value to each legislator, then used that value to work out a score for each bill (if more liberals voted for it, it was more liberal). Then these values for each bill are used to adjust the scores for each legislator, and so forth until a stable set of values results.
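The loop described above (guess legislator scores, score the bills from them, re-score the legislators, repeat until stable) can be tried on a toy example. The sketch below is not the VoteView authors' procedure: the vote matrix, the simple averaging rule, and the name `alternate_scores` are all assumptions made up for illustration. It only shows that, for this kind of alternating update, two quite different starting guesses can settle on the same final scores.

```python
# Toy vote matrix: rows are hypothetical legislators A-D, columns are five
# roll calls. +1 = yea, -1 = nay. Entirely made-up data.
VOTES = [
    [ 1,  1,  1, -1, 1],  # A
    [ 1,  1, -1, -1, 1],  # B
    [-1, -1,  1,  1, 1],  # C
    [-1, -1, -1,  1, 1],  # D
]


def alternate_scores(votes, start, rounds=50):
    """Alternate between scoring bills and re-scoring legislators."""
    n, m = len(votes), len(votes[0])
    scores = list(start)
    for _ in range(rounds):
        # A bill's score is the average score of its yea voters minus nay voters.
        bills = [sum(votes[i][j] * scores[i] for i in range(n)) / n
                 for j in range(m)]
        # A legislator's score is re-derived from the bills they backed or opposed.
        scores = [sum(votes[i][j] * bills[j] for j in range(m)) / m
                  for i in range(n)]
        # Rescale so the numbers neither blow up nor shrink to zero.
        scale = max(abs(s) for s in scores) or 1.0
        scores = [s / scale for s in scores]
    # Left/right orientation is arbitrary; fix it so legislator A is positive.
    if scores[0] < 0:
        scores = [-s for s in scores]
    return scores


first = alternate_scores(VOTES, [0.3, -0.2, 0.1, 0.5])
second = alternate_scores(VOTES, [1.0, 0.0, 0.0, 0.0])
print(first)
print(second)
```

On this made-up data both starting guesses should end up putting A and B on one side and C and D on the other. That is reassuring for the toy, but it says nothing either way about the safeguards in the real method.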

I wonder about that kind of approach. It may work well, and I simply do not understand the statistical safeguards in place. But I am slightly uneasy about the possibility of an algorithmic artifact in this case. Now, my statistics are limited to what a public administration graduate student is expected to know -- not bad but I am no mathematician, and the word "eigenvalue" is a null for me. Would someone with a bit more background in this kind of work comment on how reliable this kind of approach is, and how well these researchers used it?

#9 ::: Nick Brooke ::: (view all by) ::: August 29, 2006, 02:50 PM:

@#7 Dave Bell: The UK system has much stronger parties, much more voting on party instruction, so this wouldn't work well.

But it's still pretty to look at if somebody tries (link is to Chris Lightfoot's excellent weblog). NB: the links at the bottom give you an exciting interactive experience exploring what our UK members are up to, which is more than you usually get from Parliament.

#10 ::: Greg London ::: (view all by) ::: August 29, 2006, 02:58 PM:

#8, I think if the folks looked at each individual roll call, decided that the legislation has a "social" rating of Y and a "liberal/conservative" rating of X, then looked to see who voted, and how, then you would establish if a senator is less than or greater than X, and you'd also know if they were less than or greater than Y.

If you looked at enough bills with a sufficient range of ratings on the bills, then you could nail down the approximate threshold of each senator. Alice might be at (-2, 3), meaning any bill labeled on one side of that point she would vote "Aye" and any bill on the other side she would vote "Nay".

If this is the approach, the only thing you need to do is rate the bills appropriately. If you fail to rate them consistently, then you won't find a threshold for individual senators, since they'll appear to vote for something on one side, and then vote against something else on that same side.

The accuracy of the statistics would be reflected in the amount of "error" that they show on the right, i.e. when they draw a line, if the numbers are right, everyone will fall into the graph as predicted by previous bills. If there is a large "error", then (1) some prior bill used to determine the senator's coordinates wasn't assigned an accurate value, (2) the current bill wasn't assigned an accurate value, or (3) free will and human inaccuracy caused the senator to vote outside their norm.

Given the error seems fairly small for many individual votes, I'd say they are on to something.

#11 ::: Greg London ::: (view all by) ::: August 29, 2006, 03:01 PM:

#9 wow. That's pretty wild.

Is there a name for this sort of analysis, that isn't a mouthful of syllables?

#12 ::: Greg London ::: (view all by) ::: August 29, 2006, 03:03 PM:

Someone ought to graph the Supreme Court folks, and then compare them with, say, Bush's* appointees.

* (or, King George the Mad, as I like to say)

#13 ::: Greg London ::: (view all by) ::: August 29, 2006, 03:08 PM:

Ya know, this would also explain why trying to "appease" the other side won't work in many cases. Depending on where the line for the legislation gets drawn, there may be huge gaps to switch from "Nay" to "Aye", which simply can't be justified, which then loses your base voters way back on the other side of the line.

If a particular senator were attempting to "straddle" the line, trying to appease both sides of issues, they should show up more often than other senators as being flagged as an "error". Since the senators' coordinates don't move, a lot of errors would mean they switch sides often. Wonder if they have a list of senators in order of largest error to smallest.
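A list like that is easy to sketch once you have, for each roll call, how everyone actually voted and how the model predicted they would vote. The names, the data, and the function `rank_by_errors` below are hypothetical; this is a minimal illustration, not anything VoteView is known to publish in this form.

```python
from collections import Counter


def rank_by_errors(roll_calls):
    """roll_calls: list of (actual, predicted) dicts mapping name -> True for yea."""
    errors = Counter()
    for actual, predicted in roll_calls:
        for name, vote in actual.items():
            # Count one error each time the model's call disagrees with the vote.
            errors[name] += int(vote != predicted[name])
    # Largest error count first; ties broken alphabetically.
    return sorted(errors.items(), key=lambda item: (-item[1], item[0]))


# Made-up example: three roll calls, three hypothetical senators.
roll_calls = [
    ({"Alice": True, "Bob": False, "Carol": True},
     {"Alice": True, "Bob": True, "Carol": True}),
    ({"Alice": False, "Bob": True, "Carol": True},
     {"Alice": False, "Bob": False, "Carol": True}),
    ({"Alice": True, "Bob": True, "Carol": False},
     {"Alice": True, "Bob": True, "Carol": True}),
]
ranking = rank_by_errors(roll_calls)
print(ranking)
```

In this toy data Bob is the straddler: the model misses him twice, Carol once, and Alice never.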

#14 ::: James Angove ::: (view all by) ::: August 29, 2006, 03:10 PM:

Dave Bell: Actually what this indicates to me is that our parties are getting stronger, which I don't have much of a problem with as such. (That is, where the line falls tells you something about the nature of the parties. But the fact of the line tells you that the parties are getting stronger)

#15 ::: Nick Brooke ::: (view all by) ::: August 29, 2006, 03:19 PM:

#11: Is principal components analysis too syllabic for you, Greg? Chris L. says, "Principal components analysis picks out the combinations of the divisions which (in some sense) best explain the variations in the data" in that blog-entry I linked to above.

#16 ::: Greg London ::: (view all by) ::: August 29, 2006, 03:32 PM:

I'll have to read the Wikipedia entry for principal component analysis at some point. Right now, most of it is going over my head.

#17 ::: Lizzy L ::: (view all by) ::: August 29, 2006, 04:09 PM:

Thanks for this post -- I would never have found this stuff myself. It makes a pretty stark picture. And it makes what we have to do in November very very clear. I am damn proud to see Barbara Boxer's name where it is.

#18 ::: Magenta Griffith ::: (view all by) ::: August 29, 2006, 04:18 PM:

Gee, I was hoping this was a cooking thread. I'd love a recipe for what to do with various politicos. Hanging's too good for them, but broiling and serving to the starving might make some use of them after all.

Okay, they'd have to be braised. Too tough otherwise. My mind must be twisted by that Republican version of the NYT.

#19 ::: John Aspinall ::: (view all by) ::: August 29, 2006, 04:30 PM:

Here are two more polysyllabic terms for you to Google, since principal components analysis doesn't tell the whole story.

1. Expectation Maximization. The Wikipedia article has the density of, say... lead here, so don't start with that one. Some of the applications may be more suitable to give you a flavor of its capabilities than the general tutorials.

2. Support Vector Machines. Which is not a rallying cry for downtrodden parallel processors, but actually a statistical technique that was developed in the hard AI "machine learning" community.

Both these terms, as well as principal components analysis, are mentioned in a tutorial whose URL I cannot post because it triggers the "questionable content" filter. But search for it.

#20 ::: Thalia ::: (view all by) ::: August 29, 2006, 04:36 PM:

Anyone else notice that the supposed centrist McCain is the 4th most conservative voter on that rank ordering? More consistent than Frist and Santorum in voting conservatively.

#21 ::: JC ::: (view all by) ::: August 29, 2006, 05:05 PM:

McCain has never struck me as being centrist. Aside from campaign finance reform, he strikes me as having solidly conservative positions. The list also points out how ridiculous the idea floated during the 2004 debates that Kerry was the more liberal of the MA senators is.

However, I haven't decided if this means that I now have to think of Hillary (and John Kerry) as liberals or if this analysis says that the senate, on the whole, is a rather conservative body (such that solid moderates end up ranked 23.5).

#22 ::: Greg London ::: (view all by) ::: August 29, 2006, 05:15 PM:

woah, that's wild, the most liberal senators, in order, are

109 15011 71 CALIFOR D BOXER 27 485 0.944 1.000
109 15021 21 ILLINOI D DURBIN 23 502 0.954 2.000
109 49309 25 WISCONS D FEINGOLD 62 500 0.876 3.000
109 14230 31 IOWA D HARKIN 37 505 0.927 4.000
109 10808 3 MASSACH D KENNEDY ED 17 500 0.966 5.000

The Midwest is more liberal than massa-f-ing-chusetts???

That's crazy talk.

#23 ::: Greg London ::: (view all by) ::: August 29, 2006, 05:17 PM:

#19, John, thanks. I'll take a google gander after work.

#24 ::: P J Evans ::: (view all by) ::: August 29, 2006, 05:17 PM:

Durbin, Feingold, and Harkin are more liberal than most senators. Although I'm still not sure how reliable these numbers are. Feinstein and Boxer are, AFAIK, more conservative than Kennedy on a lot of things.

#25 ::: James D. Macdonald ::: (view all by) ::: August 29, 2006, 06:10 PM:

If voting were all that it's about, I suppose you could replace Congress with a simple algorithm.

But they also propose new legislation, which is the main source of their importance.

#26 ::: Mitch Wagner ::: (view all by) ::: August 29, 2006, 06:39 PM:

Like others in this thread, I found the article interesting, but mathematically complex enough that I didn't really follow it entirely.

In particular, I'm confused about what algorithm they used to determine where the bills and legislators fell on the liberal/conservative axis. What distinguishes a 2.3 from a 2.5 rating?

Thalia (20) Anyone else notice that the supposed centrist McCain is the 4th most conservative voter on that rank ordering? More consistent than Frist and Santorum in voting conservatively.

When McCain first began getting a lot of buzz about his presidential campaign, I noticed that the media and punditry seemed to discuss him as if he were a liberal Republican, a throwback to Jacob Javits.

Later, they seemed shocked that he espoused conservative positions.

He was a self-described conservative all along, though.

#27 ::: Teresa Nielsen Hayden ::: (view all by) ::: August 29, 2006, 07:19 PM:

A good deal of McCain's "liberal" reputation comes from his ability to use clear plain English, as though he were speaking honestly and directly to his audience. This makes him sound more like the orators of the Left, who do speak directly to their audiences. By contrast, most of the High Right speaks in more or less distanced, abstract code phrases.

I will admit that a lot of my early liking for McCain was based on his straightforward language. Then he got savaged by the Rove machine during the 2000 primaries. He was wounded and angry for a year or more, after which he went Stockholm Syndrome with the Bush administration. I don't know what the moral of the story is, except perhaps that Bush & Co. are better torturers than the North Vietnamese.

#28 ::: Fragano Ledgister ::: (view all by) ::: August 29, 2006, 07:28 PM:

Teresa #27: Bush has something McCain wants, control of an essential bloc of Republican votes.

#29 ::: Greg London ::: (view all by) ::: August 29, 2006, 07:30 PM:

I can't remember exactly what killed my respect for McCain. I have a horrible memory. But bits of gray matter say it had something to do with Kerry running for president, and McCain (1) didn't run as his VP, (2) didn't demand Bush renounce a bunch of SwiftBoatLiarsForMadBush crap, and (3) endorsed King George the Mad. I don't even know if any of that happened, but that's how I recall it. Actually, how I recall it is like this: every time I hear the name McCain, my brain mumbles:

Maverick, my ass.

#30 ::: Bill Higgins-- Beam Jockey ::: (view all by) ::: August 29, 2006, 07:49 PM:

I'm no great hand at statistics, but if, as I believe, "principal components analysis" is the same thing as "factor analysis," I think Stephen Jay Gould explains it well in The Mismeasure of Man.

As it happens, I was recently referring in another thread to a book-length factor analysis of science fiction readers' preferences for various authors.

#31 ::: Bill Higgins-- Beam Jockey ::: (view all by) ::: August 29, 2006, 07:56 PM:

Another, quicker, illustrated introduction to principal-components analysis may be found here. (I got this link from the bottom of the impenetrable Wikipedia article.)

#32 ::: Randolph Fritz ::: (view all by) ::: August 29, 2006, 08:05 PM:

Unless I misunderstand this, it looks very much like most of the business of the Senate is the economic management of the USA. That's quite sobering, since I doubt that the Senate is well-equipped for the job and I don't think I've ever heard any senator mention it.

Greg, Massachusetts liberalism is of the noblesse oblige sort; Midwestern liberalism, when it emerges, is usually populist.

#33 ::: Mitch Wagner ::: (view all by) ::: August 29, 2006, 08:14 PM:

I've been troubled by all the things mentioned here about McCain, but what I find most troubling is McCain-Feingold.

I'm troubled by money dominating politics, but letting the federal government take over the purse strings is not the answer. It allows people now in power to protect their allies and select their successors even more so than they've been able to all along.

#34 ::: Claude Muncey ::: (view all by) ::: August 29, 2006, 09:12 PM:

Bill, I was thinking specifically of Gould's work in looking at this. As I said above, I may misunderstand the detail of this process and the specific safeguards involved here, but the twin problems of reification and ranking that Gould saw in the concepts of IQ and g are possible in this case.

I wondered about something similar years ago while digging into some political science research. A couple of times I saw someone, often by how a variable was operationalized, take something that was a nominal value, ordinal at best, and treat it like an interval or ratio value. Once you managed to get numbers, you could do anything you wanted with them, no matter what the underlying original information was like. An arbitrary example could be calculating the standard deviation of the house numbers on your block. The operation can be carried out correctly, but does the number really mean anything useful?

(Math geek note: Yes I understand that there is some rather high powered criticism of Stevens' system of value typologies -- by Tukey, for example. But I think they still are useful, if not abused.)

As I said above, I can easily imagine that the conclusions drawn are correct. But this is a process where the researchers first "reasonably" score the senators, then use those numbers to score the bills, then turn around and use those numbers to re-score or adjust the scores of those same senators. How does their approach ensure that all of this is not simply dependent on the original "reasonable" (their term) scoring of the senators? If their approach does do that, it would help if someone could explain it to some of us denser types around here.

#35 ::: mds ::: (view all by) ::: August 29, 2006, 10:43 PM:

Durbin, Feingold, and Harkin are more liberal than most senators. Although I'm still not sure how reliable these numbers are. Feinstein and Boxer are, AFAIK, more conservative than Kennedy on a lot of things.

Go, Midwest! Senator Wellstone hogged the top of the liberal list, too. Though we've certainly been a contradictory lot. Remember who knocked out populist giant Robert La Follette, Jr. in the 1946 primary: Joseph McCarthy. And for Senator Harkin's home state to switch from blue to red in 2004 makes me slightly embarrassed to be from Iowa.

(And Senator Feinstein does indeed frequently sound like a moderate Republican, but Senator Boxer is pretty unequivocally a liberal.)

#36 ::: Ben Hyde ::: (view all by) ::: August 30, 2006, 08:33 AM:

Thanks for the link!

Just to clarify: "But this is a process where the researchers first "reasonably" score the senators, then use those numbers to score the bills, then turn around and use those numbers to re-score or adjust the scores of those same senators."

One of the things that blew me away about this approach was that they didn't start from a "reasonable" scoring. If you tunnel back to the longer, older posting I wrote about this, I explain that they started in a time when computers were very weak and they wanted to fit a model to the voting data. The simplest model they could imagine was that every senator would be a point on a line and every vote would split the line at some point. They then attempted to compute a best fit for all the senators (assuming they don't move) and every vote. Amazingly, it worked, and its output was the reductionist ranking of every senator and a position for each bill. Later they were able to try the same trick with more dimensions.

It actually took them a while to figure out what qualitative measures the dimensions were reflecting, since it's often hard to puzzle out why a bill is actually about a social issue. For example, one of the votes shown in that drawing was about the time limits on the guest worker proposal; it was a pure social vote, but it's not obvious to me why.
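To make the one-dimensional picture concrete, here is a minimal sketch of just one piece of it: given senators already placed on a line, find where a single vote best splits that line. The positions, the votes, and the function `best_cut` are invented for illustration, and the real fitting procedure (which also has to place the senators) is more involved than this.

```python
def best_cut(positions, votes):
    """Return (cut, yea_side, errors) for one roll call.

    positions: each senator's place on the line (floats).
    votes: True for yea, False for nay, in the same order.
    """
    points = sorted(set(positions))
    # Candidate cuts: midpoints between neighbours, plus one beyond each end.
    candidates = ([points[0] - 1.0]
                  + [(a + b) / 2 for a, b in zip(points, points[1:])]
                  + [points[-1] + 1.0])
    best = None
    for cut in candidates:
        for yea_side in ("left", "right"):
            # Predicted yea if the senator sits on the yea side of the cut.
            errors = sum(
                ((p < cut) == (yea_side == "left")) != v
                for p, v in zip(positions, votes)
            )
            if best is None or errors < best[2]:
                best = (cut, yea_side, errors)
    return best


# Made-up example: six senators, one of whom votes against type.
positions = [-0.9, -0.5, -0.1, 0.2, 0.6, 0.8]
votes = [True, True, True, False, True, False]
cut, yea_side, errors = best_cut(positions, votes)
print(cut, yea_side, errors)
```

Here the best split falls between the third and fourth senators with the yeas on the left, and the senator at 0.6 is the single "error".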

They have data now for many legislative bodies over very long periods. The book on voteview is just mind boggling.

They can also show that we have grown very polarized over the last few decades as the Republican legislators (though not the public) have shifted to the right. Their new book on that is just depressing.

#37 ::: Laurence ::: (view all by) ::: August 30, 2006, 09:49 AM:

Teresa said: This also explodes the idea that there's no real difference between the Democrats and the Republicans.

I don't think I understand these charts, but I did notice a couple things:

1. On each chart, the D's and R's are usually in two separate groups. Is that what gives the impression that there's a difference between Democrats and Republicans?

2. Even if the vote was close to unanimous, the D's and R's are still shown in two separate groups. Take a look at:

Number: 226 (Session 2)
Yeas-Nays: 96-1

Number: 230 (Session 2)
Yeas-Nays: 93-5

Why is that?

I can well believe that there is a difference in voting patterns between D's and R's, but the two groups in each chart don't appear to reflect the voting patterns for each question.

#38 ::: Clark E Myers ::: (view all by) ::: August 30, 2006, 12:26 PM:

#34: An arbitrary example could be calculating the standard deviation of the house numbers on your block. The operation can be carried out correctly, but does the number really mean anything useful?

Sure, at least in some communities the number arrived at is a useful, if preliminary, proxy for housing density against road density (how long are the blocks and how far apart are the house numbers), derived by a few keystrokes from published data. Granted, it's more useful for being quick and dirty from data already available in digitized form. Examples and details on request - nod to JVP.

For a first approximation - I may well be completely off base pending detailed analysis and an explanation of hypothesis, algorithm, model and process not completely in evidence (just maybe there is some confusion of hypothesis and model? or maybe economics and political science just use the words differently?) - I see an interesting view of the data, but as I read the hypothesis to be "Republicans favor legislation that Republicans favor and Democrats favor legislation that Democrats favor," I don't see that the graphics justify any further conclusion. That is, I don't see more here than the tabular rankings or gradings of legislators in which a given special interest group grades or ranks legislators according to the given special interest. That is, I suggest special interests could be divided Republican or Democrat by how the special interest ranks legislators - or legislators could be divided Republican or Democrat by how they are ranked by different special interests - but the information is circular.

Very good pictures and in that sense useful just as a Power Point presentation may be more useful than plain text - there is a little bit of Temple Grandin in all of us and a picture is often worth many words.

Just the same, on the axis I count most, John McCain and Larry Craig are far apart - see also Craig from (formerly - the water is going to Ketchum/Sun Valley mansions instead of to farmers) agricultural Idaho on immigration/guest workers or on my pet issues.

#39 ::: Del Cotter ::: (view all by) ::: August 31, 2006, 11:06 AM:

On each chart, the D's and R's are usually in two separate groups. Is that what gives the impression that there's a difference between Democrats and Republicans?

Laurence, no 'usually' about it, what you're seeing is a static feature of the graphs: look closely, it's the exact same two groups each time. So you can't infer from the unchanging position of those two groups that there's something fishy about the impression of difference.

D and R signify party; red and blue signify how they voted in that vote; the straight black line is a best straight-line fit drawn through the space; "errors" are reds on the blue side of the line, or blues on the red side of the line; the space is an unchanging backdrop against which to display the lines for each vote.

Each line is designed to minimize the errors in that space; the space is designed to minimize the sum of the errors incurred by all the lines; the existence of the clusters in that one space is the clue that there really is a difference between Rs and Ds. The vertical nature of the gap between them is the clue that that difference is about economic interest groups, not about Social Issues.
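The "errors" for a single vote can be counted with very little code once the points and the line are known. The sketch below assumes a line written as a*x + b*y = offset, with yeas predicted on the side where a*x + b*y is smaller; the senator labels, coordinates, and the function `misclassified` are all made up for illustration.

```python
def misclassified(senators, normal, offset):
    """List senators who voted on the 'wrong' side of a cutting line.

    senators: dict of name -> (x, y, voted_yea).
    normal, offset: the line a*x + b*y = offset, with normal = (a, b).
    A yea is predicted wherever a*x + b*y < offset.
    """
    a, b = normal
    return [name for name, (x, y, voted_yea) in senators.items()
            if ((a * x + b * y) < offset) != voted_yea]


# Made-up chart: x is the economic axis, y the social axis.
senators = {
    "S1": (-0.6, 0.2, True),
    "S2": (-0.3, -0.4, True),
    "S3": (0.4, 0.1, False),
    "S4": (0.7, -0.2, True),    # yea from the predicted-nay side
    "S5": (-0.1, 0.5, False),   # nay from the predicted-yea side
}

# A vertical line at x = 0, i.e. a vote decided on the economic axis.
wrong = misclassified(senators, normal=(1.0, 0.0), offset=0.0)
accuracy = 1 - len(wrong) / len(senators)
print(wrong, accuracy)
```

A horizontal line (normal of (0.0, 1.0)) would instead split the same unchanging points along the social axis.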

I'd be interested to see how many of the remaining errors can be eliminated if the black lines are permitted to be parabolas, or some other dog-legged function, and not constrained to be straight lines.

#40 ::: Greg London ::: (view all by) ::: September 11, 2006, 07:06 PM:

Hm, it would seem that one could create a relative comparison of any group of people's personal views if you have some yes/no answers to an assortment of questions of varying subjective points on the spectrum.

I just had the weirdest thought that this could somehow be applied to Wikipedia.

You could use edit-histories as a source of yes/no "answers". Putting text in is a "yes" to that text. Taking text out is a "no" to that text. A single edit might have a number of yes/no answers built into it as text is removed and replaced with new text.

Within an article, this should be automatically extractable into some form of relative coordinate system. With the result being a graph showing the location of editors, the moderates towards the middle, and the extremists towards the edges.
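The first step, turning one edit into yes/no answers, might look something like the sketch below. It uses Python's standard difflib to compare the text before and after an edit word by word; the function `edit_to_answers` and the sample sentences are hypothetical, and treating each word as a separate "question" is a crude assumption. Building the coordinate system from those answers would be a separate and much harder job.

```python
from difflib import SequenceMatcher


def edit_to_answers(before, after):
    """Map each word touched by an edit to True (added) or False (removed)."""
    old, new = before.split(), after.split()
    answers = {}
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            for word in old[i1:i2]:
                answers[word] = False   # taking text out is a "no"
        if tag in ("insert", "replace"):
            for word in new[j1:j2]:
                answers[word] = True    # putting text in is a "yes"
    return answers


answers = edit_to_answers(
    "the senator is a moderate",
    "the senator is an extremist",
)
print(answers)
```

For this one edit the editor says "no" to "a" and "moderate" and "yes" to "an" and "extremist"; untouched words produce no answer at all.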

Now if I only had an infinite amount of time and an infinite number of monkeys, I could code this up as a program and ship it.

Dire legal notice
Making Light copyright 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 by Patrick & Teresa Nielsen Hayden. All rights reserved.