Determining Authorship of Ron Paul Newsletters through Text Analysis
Update: Part 2 is here.
Update: Part 3 is here.
Ron Paul sold newsletters in the 1980s and 1990s. The content of these newsletters was appalling though unsurprising. Here’s a sample:
“We don’t think a child of 13 should be held responsible as a man of 23. That’s true for most people, but black males age 13 who have been raised on the streets and who have joined criminal gangs are as big, strong, tough, scary and culpable as any adult and should be treated as such.”
“And Stanford, Michigan, and many other universities have banned speech that offends privileged groups. Anti-white, anti-male, anti-heterosexual or anti-Christian remarks are perfectly OK, of course.” You can imagine, then, what a relief it must be to minorities, homosexuals, women and non-Christians to find themselves the privileged people of America. The rest of this page and part of the second details a cabal of homosexuals in the Bush administration who like to lead “the young” astray.
“Boy, it sure burns me to have a national holiday for that pro-communist philanderer, Martin Luther King. I voted against this outrage time and time again as a Congressman. What an infamy that Ronald Reagan approved it! We can thank him for our annual Hate Whitey Day. Listen to a black radio talk show in any major city. The racial hatred makes a KKK rally look tame.”
“Dr. Douglass believes that AIDS is a deliberately engineered hybrid of these two animal viruses cultured in human tissue, and he blames World Health Organization experimentation at Ft. Detrick, Maryland…. Could the government have experimented with it in the civilian population, as it did in the 1950s with LSD, and had things get out of control? I don’t know, but these sure are interesting questions.”
“A well-known libertarian editor just back from New York told me: ‘The ACT-UP slogan, on stickers plastered all over Manhattan, is “Silence = Death.” But shouldn’t it be “Sodomy = Death”?’”
Paul claims not only that he did not write the trash in his newsletters, but also that he did not know of their content. Given Paul’s prolific written output, I find it highly unlikely that he would not have had the time to write the content of newsletters and signatures that bear his name. I also find it unlikely that he wouldn’t have read them, given that he drew a portion of his income from their continued sale.
Regardless, the claim that Paul did not write the content of his own newsletters needs to be put to a rigorous test. Clearly, Paul himself is of no use in this venture, given his precarious position as a Presidential candidate.
PhiloComp.net offers “The Signature Stylometric System,” a free text analysis software package. One can use the package, for example, to determine if the same author wrote all of Shakespeare’s plays or to determine the authorship of the Federalist Papers. It compares word and sentence length between texts, and determines the frequency of letter usage and punctuation. Authors have particular styles. For example, one author may often use three-letter words (or four-letter words!). We may take a disputed work, compare its word lengths against those of all other works by said author, and then statistically test whether there is evidence to suggest that the work came from a different author.
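For readers who want to see the idea in code, here is a minimal sketch of a word-length comparison. It is not Signature’s implementation: the function names, the choice to pool words longer than ten letters into one bin, and the use of a Pearson chi-squared statistic are all my assumptions, chosen because they are a common way to compare two frequency tables.

```python
import re

MAX_LEN = 10  # assumption: words longer than this share the last bin


def word_length_counts(text, max_len=MAX_LEN):
    """Return a list where entry n-1 is the number of words with n letters."""
    # A word is a run of letters, optionally joined by apostrophes ("don't").
    words = re.findall(r"[A-Za-z]+(?:['’][A-Za-z]+)*", text)
    counts = [0] * max_len
    for word in words:
        n_letters = sum(ch.isalpha() for ch in word)  # ignore the apostrophe
        counts[min(n_letters, max_len) - 1] += 1
    return counts


def chi_squared_statistic(counts_a, counts_b):
    """Pearson chi-squared statistic for a 2 x K table of word-length counts."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("both texts must contain at least one word")
    grand_total = total_a + total_b
    statistic = 0.0
    for a, b in zip(counts_a, counts_b):
        column = a + b
        if column == 0:
            continue  # a length that appears in neither text adds nothing
        expected_a = total_a * column / grand_total
        expected_b = total_b * column / grand_total
        statistic += (a - expected_a) ** 2 / expected_a
        statistic += (b - expected_b) ** 2 / expected_b
    return statistic


if __name__ == "__main__":
    known = "We may take a disputed work and compare it against other works."
    disputed = "Authors have particular styles, and word length is one of them."
    print(chi_squared_statistic(word_length_counts(known),
                                word_length_counts(disputed)))
```

A larger statistic means the two word-length profiles differ more. Turning it into a p-value requires the chi-squared distribution with the appropriate degrees of freedom, which this sketch leaves out.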
I collected a number of works known for a fact to be written by Paul. I included a couple of chapters from “End the Fed,” a number of his speeches, and more than 20 articles and compiled them into a single corpus. On the internet, I then found four articles from his newsletters: one asking readers to assist in his re-election to office (his present seat in Congress, actually), one on the supposed government conspiracy to create and spread AIDS (partially quoted above), one on the coming race war, and one particularly deplorable article on carjacking and the need for an armed populace.
A graph of the distribution of word length in Paul’s output can be seen below.
Using the software, I compared the word length and sentence length of each of the four newsletter articles to works known to be written by Paul. The results are below. For those unfamiliar with stats and/or p-values, the gist is this: if the p-value is less than, say, .05, there is reason to believe that the author of the newsletter article is someone other than Paul. If the p-value is greater than .05, we might conclude that there is not enough evidence to suggest that Paul did not write the article, and move on to other methods of testing (as is seen in the next post).
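The decision rule above can be sketched in a few lines. To keep the example self-contained I have swapped in a permutation test on mean word length, which is not the test Signature runs; the function names, the number of shuffles, and the fixed random seed are hypothetical choices for illustration only.

```python
import random


def permutation_p_value(lengths_a, lengths_b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for a difference in mean word length.

    The null hypothesis is that both lists of word lengths come from the
    same author, so the labels "a" and "b" are interchangeable.
    """
    def mean(values):
        return sum(values) / len(values)

    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(lengths_a) - mean(lengths_b))
    pooled = list(lengths_a) + list(lengths_b)
    n_a = len(lengths_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if shuffled >= observed - 1e-12:  # tolerance for float rounding
            at_least_as_extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


def interpret(p_value, alpha=0.05):
    """Apply the rule of thumb described in the post."""
    if p_value < alpha:
        return "evidence that the authors differ"
    return "not enough evidence that the authors differ"


if __name__ == "__main__":
    known = [3, 4, 5, 4, 3, 6, 4, 5, 3, 4]
    disputed = [4, 4, 5, 3, 3, 6, 5, 4, 3, 4]
    p = permutation_p_value(known, disputed)
    print(p, interpret(p))
```

Note the asymmetry in the two outcomes: a small p-value counts against common authorship, while a large one only means the test found nothing, which is why a second method is needed.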
The results are interesting. There is not enough evidence to suggest that someone other than Paul wrote the piece on AIDS and the piece speculating on a coming race war, though to confirm (or refute) Paul’s authorship, we may have to resort to other methodologies. On the other hand, there is reason to believe that someone else may have written the other two articles, the one on carjacking and the re-election piece.
I have also included a comparison with a piece on health care that is known to be written by Paul. The tests confirm that it compares nicely with the rest of Paul’s known writings (or at least provide no evidence that it is significantly different). For reference, I have also included the results of a test between Paul’s writings and the entire text of this blog starting in 2007. Again, the test confirms that the authors are likely different people (which I already knew).
A visual comparison of word length between the feature on the coming race war and the rest of Paul’s works shows that the two are very similar. For reference, I have included a comparison of Paul’s works with my last blog post, which, incidentally, was also statistically different from Paul’s writing on all measures.
Obviously, we will never know without a doubt who did or did not write the trash that appeared in Paul’s incendiary newsletters, though results like these and more casual spot-check analyses indicate that the case is hardly closed. I am convinced that Paul happily exploited the worst elements of the American political landscape. He willfully mixes with racists, conspiracy nuts and paranoid gun freaks for nothing more than political gain, political contributions and worse yet, book sales. I am also convinced that he was aware of the newsletters that he has “disavowed” though the results above indicate that he may, in fact, have farmed out some of the writing to other people.
Subjecting myself to his writing was one of the most painful and useless experiences of my life. I really wanted to give the man a chance, particularly after his impressive display at the Republican foreign policy debate. “End the Fed” read more like a paper from freshman comp than a serious book, though it somehow attempts to pass itself off as a work of deep economic analysis. Not to disparage people I know who may support Paul (and I do apologize), but I think that Krugman’s recent quip that Newt Gingrich is “a stupid man’s idea of what a smart man sounds like” is actually more true of Ron Paul.
It doesn’t take a piece of software to know that it is possible that Paul at least signed off on some of the nonsense in his newsletters. The jury on whether he did or did not write these articles may be out, but a reading of his works shows that philosophically, it doesn’t take a great leap of faith to move from Paul’s public persona to some of the ugliest portions of right wing politics.
Update: Please see further analysis in the next post that expands upon these results. If Paul didn’t write these letters, who did?
Further discussion of methods and criticism of this post on another blog can be found here.