Wikileaks Iraq War Data Part 2
In the limited time I have, I will engage in a rudimentary time series analysis of the Wikileaks Iraq War Diaries. As I stated in the previous posts, I have limited the dataset to include only those entries which report deaths of some kind. There are three different classes of casualties I will be working with: coalition, civilian and enemy. I have left Iraqi armed forces out, because I question the classification. I will start by providing a list of main points for those who just want to breeze through:
1. Civilian casualties are through the roof compared to casualties among those actually fighting the war.
2. Things got really bad around the time of the surge, but have since quieted down. US and insurgent forces seem to have gotten better at attacking one another without actually killing one another, but civilians keep on dying.
3. While all deaths and injuries have decreased to levels not seen since the invasion, there were disturbing spikes in civilian deaths and injuries in the later part of 2009.
Time Series 101: Time series are just that: a series of observations made over the course of time. Here we have daily reports covering the calendar years 2004 through 2009. A time series can be thought of as being made up of three components: trend, seasonality and random noise. If we were to look at a successful business, we could spot these three components in daily sales data, for example. Sales may be going up over the course of a few years (trend), sales may regularly fluctuate within a year due to seasonal events like Christmas (seasonality), and there may be some daily random ups and downs that just happen for reasons unknown (noise). I will examine these three components separately, and then move on to questions of relationships between the series.
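To make the three components concrete, here is a minimal sketch of a classical additive decomposition in plain Python. It is not the method used to produce the plots in this post. The sales figures are synthetic, and the function names (moving_average, decompose) and the weekly period of 7 are assumptions chosen for illustration.

```python
import math
import random

def moving_average(series, window):
    """Centered moving average (window must be odd).
    Positions without a full window are left as None."""
    half = window // 2
    out = [None] * len(series)
    for i in range(half, len(series) - half):
        out[i] = sum(series[i - half:i + half + 1]) / window
    return out

def decompose(series, period):
    """Additive decomposition: series = trend + seasonal + noise."""
    # Trend: averaging over one full period cancels the seasonal swing.
    trend = moving_average(series, period)
    # Seasonality: mean detrended value at each position in the cycle.
    buckets = [[] for _ in range(period)]
    for i, (y, t) in enumerate(zip(series, trend)):
        if t is not None:
            buckets[i % period].append(y - t)
    means = [sum(b) / len(b) for b in buckets]
    center = sum(means) / period
    means = [m - center for m in means]  # force the seasonal part to sum to zero
    seasonal = [means[i % period] for i in range(len(series))]
    # Noise: whatever is left once trend and seasonality are removed.
    noise = [None if t is None else y - t - s
             for y, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, noise

# Synthetic daily sales: upward trend + weekly cycle + random noise.
random.seed(0)
sales = [100 + 0.5 * d + 10 * math.sin(2 * math.pi * d / 7) + random.gauss(0, 2)
         for d in range(140)]
trend, seasonal, noise = decompose(sales, period=7)
```

For daily casualty counts the same idea applies, though a yearly cycle would call for a period of about 365 days and a correspondingly long series.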
The Data: First, let’s look at the series themselves. As I said, I have three series of wounded and killed: one for civilians, one for coalition members and one for enemy combatants. Civilian casualties are the most striking. While casualties were low at the beginning of the war, they peaked around the start of 2007, reaching an all-time high of 972 wounded and killed on February 3rd, 2007. The mean number of wounded and killed for 2004-2009 is 58 people per day. That’s 1.87 wounded or killed per 1,000,000 people per day in Iraq. In America, .59 people per 1,000,000 people per day are wounded or killed by guns. One’s chance of being killed or wounded in Iraq over the course of the war is roughly three times one’s chance of being killed or wounded by guns in the States, which still has the highest number of gun-related injuries and deaths in the developed world. Notice that, at the end of the series, civilian casualties have again gone up.
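The comparison above is simple per-capita arithmetic, sketched below. The helper rate_per and the numbers in its example call are hypothetical and are not taken from the war logs. Only the two rates in the final ratio come from the text, and the ratio is meaningful only if both rates use the same population denominator.

```python
def rate_per(count, population, per):
    """Events per `per` people, given a raw count and a population size."""
    return count / population * per

# Hypothetical numbers, purely to show the conversion:
# 50 events in a population of 1,000,000 is 0.5 events per 10,000 people.
example_rate = rate_per(count=50, population=1_000_000, per=10_000)

# Ratio of the two per-capita rates quoted in the text.
iraq_rate = 1.87
us_rate = 0.59
ratio = iraq_rate / us_rate
print(round(ratio, 2))  # about 3.17, i.e. roughly three times
```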
On average, 5.26 coalition members are wounded or killed every day. The bloodiest day for coalition forces was January 26th, 2005, when 36 coalition service people were killed. On average, 12.33 enemy fighters are killed or wounded each day, and the worst day for them was Valentine’s Day, 2005, when there were 411 casualties. Note the extreme difference in scales between the three series. Civilians get killed and wounded the most, US and coalition members the least.
Trend: Below I have produced three plots representing the trend component of the time series. The patterns are interesting. At the start of the war, very few civilians died in combat incidents, but very many US and enemy combatants did. As insurgents and US forces got worse at killing each other, the rate and scale at which civilians died rose considerably. The surge began in early 2007, bringing a severe uptick in civilian and enemy casualties, but US and coalition deaths did not reach the levels seen at the start of the invasion. Post-surge, all three casualty levels plummeted, and by 2009 things had begun to calm throughout the country. It is interesting to me that civilians were largely spared at the beginning of the conflict. It is possible that events at this time were largely limited to areas outside Baghdad, and that civilian deaths and injuries rose as fighting moved into the city.
Seasonality: To the left, I have produced plots of seasonality. You can click on them to check them out, but they are not entirely enlightening. A better way to check for seasonality is the autocorrelation plot. We expect that tomorrow will be somewhat like today. However, two days from now will be less like today than tomorrow will. This is the concept of autocorrelation: things that are close together in time are more alike than things that are far apart. If a series is seasonal, then we expect a given time period to be correlated with the same time period in the following year. The plots below show the autocorrelation for lags of up to 3 years. There is no real evidence for seasonality in any of these plots, but there do appear to be some blips in coalition and enemy casualties at the 4 and 6 month marks.
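For readers who want to see what an autocorrelation plot is measuring, here is a minimal sketch of the sample autocorrelation at a single lag. It is not necessarily how the plots here were computed. The autocorrelation function and the noise-free toy series with a 12-step cycle are assumptions for illustration.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation: how strongly the series resembles
    a copy of itself shifted by `lag` steps (1.0 at lag 0)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag))
    return cov / var

# Toy seasonal series: a pure cycle that repeats every 12 steps.
cycle = [math.sin(2 * math.pi * t / 12) for t in range(120)]

acf_full = autocorrelation(cycle, 12)  # one full cycle apart: strongly positive
acf_half = autocorrelation(cycle, 6)   # half a cycle apart: strongly negative
```

A seasonal series shows a positive spike at the lag matching its cycle length. For daily data with a yearly season, that would be a lag of about 365 days.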
Next: Cross correlation between civilian, enemy and coalition deaths.