Archive | Pointless Data Exercises

Leeches for sale: The economy of bloodletting in France, 1827-1836

Leech use, historical artwork

I found this great post on drug shortages that appeared in BMJ today. Among the other gems in it was this incredibly interesting article on the creation of a mechanical bloodletting device. Jean-Baptiste Sarlandière, a French anatomist and inventor, created the “mechanical leech”, a device intended to extract a controlled amount of blood from the body. Sarlandière intended the device to replace leeches, which were subject to increasing demand, were becoming expensive, were difficult to cultivate, and were subject to shortages in the Netherlands, which was a large producer of leeches at the time.

A paper was written on the device back in 2009, and within it there is a dataset of leech imports to and exports from France, which includes data on the monetary value of leeches and public consumption. Of course, I couldn’t resist pulling this data out and doing something with it (despite having better things to do).

Here is the data, pulled from yet another paper (Alexandre E Baudrimont, Adolphe J Blanqui, et al., Dictionnaire de l’industrie manufacturière, commerciale et agricole, Paris, J-B Baillière, 1833–1841, pp. 25–30):

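| Year | Leeches imported | Value (francs) | National consumption | Exports | Import/export ratio | Value per leech (francs) |
|------|------------------|----------------|----------------------|---------|---------------------|--------------------------|
| 1827 | 33653694 | 1009611 | 33456744 | 196950 | 170.87 | 0.03 |
| 1828 | 26981900 | 809457 | 26689100 | 292800 | 92.15 | 0.03 |
| 1829 | 44573754 | 1337212 | 44069848 | 503906 | 88.46 | 0.03 |
| 1830 | 35485000 | 1064550 | 34745848 | 739250 | 48.00 | 0.03 |
| 1831 | 36487975 | 1094639 | 35245875 | 1242100 | 29.38 | 0.03 |
| 1832 | 57487000 | 1724610 | 55591700 | 1895300 | 30.33 | 0.03 |
| 1833 | 41654300 | 1249629 | 40785650 | 868650 | 47.95 | 0.03 |
| 1834 | 21885965 | 656579 | 21006865 | 879100 | 24.90 | 0.03 |
| 1835 | 22560440 | 676813 | 21323910 | 1236530 | 18.24 | 0.03 |
| 1836 | 19736800 | 592104 | 18721555 | 1015245 | 19.44 | 0.03 |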
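The two derived columns can be recomputed from the raw counts. This is a minimal sketch in base R, assuming the ratio is imports divided by exports and the per-leech value is francs divided by leeches imported. The data frame and column names (`leeches`, `imported`, and so on) are made up for illustration.

```r
# Raw columns copied from the table above; names are illustrative only.
leeches <- data.frame(
  year = 1827:1836,
  imported = c(33653694, 26981900, 44573754, 35485000, 36487975,
               57487000, 41654300, 21885965, 22560440, 19736800),
  value_francs = c(1009611, 809457, 1337212, 1064550, 1094639,
                   1724610, 1249629, 656579, 676813, 592104),
  exported = c(196950, 292800, 503906, 739250, 1242100,
               1895300, 868650, 879100, 1236530, 1015245)
)

# Assumed definitions of the derived columns
leeches$import_export_ratio <- round(leeches$imported / leeches$exported, 2)
leeches$value_per_leech <- round(leeches$value_francs / leeches$imported, 2)

print(leeches[, c("year", "import_export_ratio", "value_per_leech")])
```

Under these definitions the declared value works out to a flat 0.03 francs per leech in every year, while the import/export ratio falls as exports grow.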

Leeches

Of course, I am fascinated with this. The number of leeches exported from France rose during this period and, according to the article, so did the market price of leeches (though the declared import value in the table above stays flat at 0.03 francs per leech). Though the entire industry would eventually collapse as other medical advances of the nineteenth century superseded it, it is clear that increased demand and expense led to innovation to create devices to replace leeches. I don’t know whether the “mechanical leech” led to the development of other medical devices, but I would like to think that even batshit ideas, like swamp worms drawing out blood to cure any and all medical conditions, can lead to the creation of methods which do improve health.

Mapmaking with ggmap

I am always looking for free alternatives to ArcGIS for making pretty maps. R is great for graphics and the new-to-me ggmap package is no exception.

I’m working with some data from Botswana for a contract and needed to plot maps for several years of count based data, where the GPS coordinates for facilities were known. ArcGIS is unwieldy for creating multiple maps of the same type of data based on time points, so R is an ideal choice. The trouble is that the maps I can easily make don’t look all that good (though with tweaking they can be made to look better).

ggmap offered me an easy solution. It downloads a topographic base map from Google, and I can easily overlay proportionally sized points representing counts at various geo-located points. This is just a map of Botswanan health facilities (downloaded from Humanitarian Data Exchange), with counts generated as the square of draws from a normal distribution. The results are rather nice.

BotswanaHF

library(rgdal)
library(ggmap)
library(scales)

# read in geographic extent and boundary for bots
# (rgdal has since been retired from CRAN; sf::st_read() is the usual replacement)
btw <- readOGR("GIS Layers/Admin", "BWA_adm2") # from DIVA-GIS

# fortify bots boundary for ggplot
btw_df <- fortify(btw)

# get a basemap (current ggmap versions need a Google API key, set with register_google())
btw_basemap <- get_map(location = "botswana", zoom = 6)

# get the hf data
HFs.open.street.map <- read.csv("BotswanaHealthFacilitiesOpenStreetMap.csv")
# create random counts: one squared normal draw per facility (112 in this file)
HFs.open.street.map$Counts <- round(rnorm(nrow(HFs.open.street.map), mean = 10, sd = 5)^2, 0)

# Plot this dog
ggmap(btw_basemap) +
  geom_polygon(data = btw_df, aes(x = long, y = lat, group = group), fill = "red", alpha = 0.1) +
  geom_point(data = HFs.open.street.map, aes(x = X, y = Y, size = Counts, fill = Counts), shape = 21, alpha = 0.8) +
  scale_size_continuous(range = c(2, 12), breaks = pretty_breaks(5)) +
  scale_fill_distiller(breaks = pretty_breaks(5))

Does sampling design impact socio-economic classification?

Doing research in developing countries is not easy. However, with a bit of care and planning, one can do quality work which can have an impact on how much we know about public health in poor countries and provide quality data where data is sadly scarce.

The root of a survey, however, is sampling. A good sample does its best to successfully represent a population of interest and can at least qualify all of the ways in which it does not. A bad sample either 1) does not represent the population (bias), with no way to account for it, or 2) comes with no idea of what it represents.

Without being a hater, my least favorite study design is the “school based survey.” Researchers like this design for a number of reasons.

First, it is logistically simple to conduct. If one is interested in kids, it helps to have a large number of them in one place. Visiting households individually is time consuming, expensive and one only has a small window of opportunity to catch kids at home since they are probably at school!

Second, since the time required to conduct a school based survey is short, researchers aren’t required to make extensive time commitments in developing countries. They can simply helicopter in for a couple of days and run away to the safety of wherever. Also, there is no need to manage large teams of survey workers over the long term. Data can be collected within a few days under the supervision of foreign researchers.

Third, school based surveys don’t require teams to lug around large diagnostic or sampling supplies (e.g. coolers for serum samples).

However, from a sampling perspective, assuming that one wishes to say something about the greater community, the “school based survey” is a TERRIBLE design.

The biases should be obvious. Schools tend to concentrate students who are similar to one another. Students are of similar socio-economic backgrounds, ethnicity or religion. Given the fee based structure of most schools in most African countries, sampling from schools will necessarily exclude the absolute poorest of the poor. Moreover, if one does not go out of the way to select more privileged private schools, one will exclude the wealthy, an important control if one wants to draw conclusions about socio-economic status and health.

Further, school based surveys are terrible for studies of health since the sickest kids won’t attend school. School based surveys are biased in favor of healthy children.

So, after this long intro (assuming anyone has read this far) how does this work in practice?

I have a full dataset of socio-economic indicators for approximately 17,000 households in an area of western Kenya. We collect information on basic household assets such as possession of TVs, cars, radios and type of house construction (a la DHS). I boiled these down into a single continuous measure, where each household gets a wealth “score” so that we can compare one or more households to others in the community (a la Filmer & Pritchett).
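As a rough illustration of how a set of asset indicators can be boiled down to one score, here is a minimal sketch that takes the first principal component of simulated binary asset data, in the spirit of Filmer & Pritchett. Everything in it is an assumption for illustration: the asset names, the simulated ownership rates and the use of PCA (the index discussed in this post was built with multiple correspondence analysis, not this code).

```r
set.seed(1)
n <- 500

# Hypothetical latent wealth that drives asset ownership (simulation only)
latent <- rnorm(n)
assets <- data.frame(
  tv        = rbinom(n, 1, plogis(latent - 1)),
  radio     = rbinom(n, 1, plogis(latent + 1)),
  car       = rbinom(n, 1, plogis(latent - 3)),
  iron_roof = rbinom(n, 1, plogis(latent))
)

# First principal component of the standardized indicators
pca <- prcomp(assets, center = TRUE, scale. = TRUE)
score <- pca$x[, 1]

# The sign of a principal component is arbitrary, so orient the score
# so that households owning more assets get higher values
if (cor(score, rowSums(assets)) < 0) score <- -score

summary(score)
```

The score is centered on zero by construction, so a household's value only has meaning relative to the other households used to build the index.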

Distributions

We also have a data set of school based samples from a malaria survey which comprises ~800 primary school kids. I compared the SES scores for the school based survey to the entire data set to see if the distribution of wealth for the school based sample matched the distribution of wealth for the entire community. If they are the same, we have no problems of socio-economic bias.

We can see, however, from the above plot that the distributions differ. The distribution of SES scores for the school based survey is far more bottom heavy than that of the greater community; the school based survey excludes wealthier households. The mean wealth score for the school based survey is well under that of the community as a whole (-.025 vs. -.004, t=-19.32, p<.0001).
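A comparison of this kind can be sketched with a two-sample t-test. The scores below are simulated stand-ins, not the Kenyan data, and the choice of Welch's test (the default in R's `t.test`) is an assumption, since the post does not say which variant was used.

```r
set.seed(2)

# Simulated wealth scores: a full community and a poorer, more
# homogeneous school-based sample (hypothetical parameters)
community <- rnorm(17000, mean = 0, sd = 1)
school    <- rnorm(800, mean = -0.5, sd = 0.7)

# Welch two-sample t-test on the mean scores
tt <- t.test(school, community)
print(tt)

# Spread of each sample; a smaller sd means a more homogeneous sample
c(sd_school = sd(school), sd_community = sd(community))
```

A difference in means is only part of the story. Comparing the spread, or plotting the two densities, shows whether the sample is also more homogeneous than the population it was drawn from.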

Just from this, we can see that the school based survey is likely NOT representative of the community and that the school based sample is far more homogeneous than the community from which the kids are drawn.

Researchers find working with continuous measures of SES unwieldy and difficult to present. To solve this problem, they will often place households into socio-economic “classes” by dividing the data set up into quantiles. These will represent households which range from “ultra poor” to “wealthy.” A problem with samples is that these classifications may not be the same from one sample to the next, and only some of them will accurately reflect the true population level classification.

In this case, when looking at a table of how these classes correspond to one another, we find the following:

Misclassification of households in school based sample

Assuming that these SES “classes” are at all meaningful (another discussion), we can see that for all but the wealthiest households more than 80% of households have been misclassified! Further, due to the sensitivity of the method (multiple correspondence analysis) used to create the composite, 17 of households classified as “ultra poor” in the full survey have suddenly become “wealthy.”
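The mechanism behind this kind of misclassification can be sketched with simulated scores. The example assumes quintile classes and a sample drawn preferentially from poorer households. The class labels between “ultra poor” and “wealthy”, the sampling weights and the sample sizes are all hypothetical. Unlike the analysis above, it keeps each household's score fixed and only recomputes the cut points, so it cannot reproduce the rank reversals that come from rebuilding the composite index itself.

```r
set.seed(42)

# Simulated community wealth scores (stand-in for the full data set)
community <- rnorm(17000)

# Hypothetical school-based sample, biased toward poorer households
w <- plogis(-1.5 * community)
idx <- sample(seq_along(community), 800, prob = w)

labels <- c("ultra poor", "poor", "middle", "better off", "wealthy")

# Assign quintile classes to x using cut points taken from ref
quintile <- function(x, ref = x) {
  cut(x, breaks = quantile(ref, probs = seq(0, 1, 0.2)),
      labels = labels, include.lowest = TRUE)
}

class_full   <- quintile(community)[idx]   # class within the whole community
class_school <- quintile(community[idx])   # class recomputed within the sample

tab <- table(full = class_full, school = class_school)
print(tab)

# Share of sampled households whose class changes between the two schemes
misclassified <- 1 - sum(diag(tab)) / sum(tab)
misclassified
```

Because the sample is poorer than the community, its cut points sit lower, and households are pushed into higher classes than they would occupy in the full population.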

Now, whether these misclassifications impact the results of the study remains to be seen. It may be that they do not. It also may be the case that investigators may not be interested in drawing conclusions about the community and may only want to say something about children who attend particular types of schools (though this distinction is often vague in practice). Regardless, sampling matters. A properly designed survey can improve data quality vastly.

Terror in the Mid-East: It’s never been worse

Terror

We are entering into one of the most chaotic chapters of modern history, though the geographic space of this chaos is smaller than it has ever been. While most countries are experiencing less terror, Mid-Eastern terrorists have never been busier or more successful.

I downloaded data from the Global Terrorism Database, which comprises more than 125,000 individual acts of terror, and found that, since 2010, the number of weekly terror events went from somewhere around 10 to more than 40, and the trend doesn’t look like it’s ending anytime soon.
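Weekly counts like these can be computed by binning event dates into weeks. This is a minimal sketch in base R using simulated dates, not the actual Global Terrorism Database records. With the real data, one would first need to build a date column from the database's own date fields.

```r
set.seed(7)

# Hypothetical event dates standing in for real records
dates <- as.Date("2010-01-04") + sample(0:363, 1000, replace = TRUE)

# Bin into calendar weeks (cut.Date starts weeks on Monday by default)
week_start <- cut(dates, breaks = "week")

# One row per week with the number of events in that week
weekly <- as.data.frame(table(week = week_start))
head(weekly)
```

Plotting `weekly$Freq` against the week, with a smoother on top, is one simple way to look at both the trend and how erratic the counts are around it.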

Moreover, while terror events are becoming more frequent, they are becoming more and more unpredictable.

While the world was shocked over Charlie Hebdo, the troubling scale up in the number of terror events seems to have mostly gone unnoticed. Terrorists strike Islamic countries far more than they do France, and kill more than just cartoonists and policemen.

It is unproductive to view all terror groups and even acts of terror as being the same. Terror has turned into a morass of competing groups with differing political aims, and the loose nature of Al Qaeda has led to an outsourcing of terror to any local thug with a gun.

It is also unproductive to view Mid-East terror as simply restricted to the angry victims of drone attacks. Islamic terrorism has a deep history with roots going back decades, a history which seems to be widely ignored. It is also important to note that ISIS’ membership consists of a frighteningly large number of Westerners and a careful watch of their videos reveals that English, rather than Arabic, is a common language among its followers.

Where will this go? No one knows, but Charlie Hebdo will be just a blip in the pattern of terror.

TerrorPlot

A brief thought on evolution: multi-generational survival

Often people will mention that we are “adapted” to do this or another thing, either indicating some crime of modernity (of course, ignoring the fact that a larger percentage of babies are surviving and people are living longer and healthier than at any time in human history) or trying to point out some example of the glaring perfection of our creation, with either an implicit or vocal reference to divine creation.

For example, obesity is attributed to fat and protein rich modern diets since we aren’t “adapted” to eat these types of foods (despite having found the food in East Africa so unpalatable that we had to learn to crush or cook it to digest it efficiently). Our bad disposition is blamed on a lack of sleep since we aren’t “adapted” to sleep as little as we do (this might be true). Most recently, a book writer blamed our problems with depression on a divorced relationship to nature, given that we are “adapted” to hunt and kill for food and then revel over the blood stained corpse (of course, the writer doesn’t consider that people in antiquity might have been depressed, too).

There may be some truth to some of this. However, “adaptation” implies something about the individual, when evolution, in fact, is about reproduction. We aren’t “adapted” to anything. Rather, certain traits are selected for based on the survival of at least two generations of living things, at least for complex social animals like ourselves.

Simply surviving as an individual does not ensure the survival of a species. Living things must first survive long enough to reproduce and then, at least in humans, ensure that the children make it to reproductive age. Human babies are horribly weak in contrast to sharks, which are ready to go even before they leave the mother. Further, in the case of humans, a full three generations must live at once to ensure long term survival.

Thus, we maintain a tenuous relationship with our environment, where traits do not necessarily favor a single individual, but rather an entire family unit, and these traits may or may not imply perfect harmony with an environment, but rather do the job at least satisfactorily.

Nature cares little for quality, as numerous examples throughout our physiology show. To claim that we are somehow “perfectly suited” to a specific environment is simply wrong. Rather, we have come to a brokered peace with wherever we live (after generations of brutal trial and error; what we eat today is thanks to the deaths of millions, mostly children, who had to die to allow us to do so) in order to allow a few of our kids and grandkids to survive.

This, of course, has deep implications for public health. Some public health problems are known to be passed down from parents to children, but in a multi-generational evolutionary framework, it is possible that certain public health problems can be passed through 3 or more generations at a time, complicating interventions. Certainly, the multi-generational health problems of the descendants of African slaves can be an example of this. How can we intervene to protect the public health over a full century?

OK, back to work.

2014 in review: I’m not sure what this all really says

It is pretty obvious that after July, something happened and I stopped posting with any sort of regularity. I really need to fix this or whatever is keeping me from posting. I don’t get a whole lot of traffic on this blog, but it seems that every day I don’t post is a missed opportunity for me.

Anyway, to all of you who read this blog in 2014, I thank you. It’s great to have you around. I wish everyone a great 2015.

Pete

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here's an excerpt:

Madison Square Garden can seat 20,000 people for a concert. This blog was viewed about 62,000 times in 2014. If it were a concert at Madison Square Garden, it would take about 3 sold-out performances for that many people to see it.


Does the environment cause poverty?

SESKwale

African countries are blessed with ample cropland and resources, but suffer from crippling and unforgivable levels of poverty, have some of the shortest lifespans on the planet and the highest rates of infant mortality in the world. Meanwhile, Japan, Korea, Sweden, Switzerland and Singapore are wholly the opposite, yet mostly lacking in everything that Africa has. Clearly, the picture is more complicated than merely having access to natural resources.

However, within countries, the picture might be different. African countries are complex and diverse places. Poverty is often confined to the most unproductive regions, areas with poor soils, poor rainfalls or dangerous terrains.

I was just working with some socio-economic data from one of our field sites, and noticed some interesting patterns (note the map up top). In Kwale, a small area along the Coast, socio-economic levels vary widely, but neighbors tend to be like neighbors and patterns of socio-economic clustering emerge.

Note that the poorest of the poor are concentrated in an area in the middle, which I know to be extremely dry, difficult to get to, difficult to farm and generally tough to live in.

I tried to see if socio-economic status (as measured through a composite material wealth index a la Filmer and Pritchett but using multiple correspondence analysis rather than PCA) was related to any environmental variables that I might have data for.

I fit a generalized additive model using the continuous measure of wealth from the MCA as an outcome. Knowing that very few things in nature or human societies are linear, I also applied smoothing to the predictors to relax the assumption of linearity. The results can be seen in the plot at the bottom.
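A model of this general shape can be sketched with the `mgcv` package. The data here are simulated, and the predictor names (`ndvi`, `elevation`, `dist_reserve`), the use of `mgcv` and the REML smoothing choice are all assumptions for illustration, since the post does not give the actual model specification.

```r
library(mgcv)

set.seed(123)
n <- 400

# Simulated environmental predictors (hypothetical names and ranges)
d <- data.frame(
  ndvi         = runif(n, 0.1, 0.8),  # vegetation index
  elevation    = runif(n, 0, 500),    # metres
  dist_reserve = runif(n, 0, 30)      # km to a wildlife reserve
)

# Simulated wealth score with non-linear dependence on two predictors
d$wealth <- 0.5 * sin(4 * d$ndvi) +
  0.3 * log1p(d$elevation / 100) +
  rnorm(n, sd = 0.2)

# GAM with a smooth term for each predictor
fit <- gam(wealth ~ s(ndvi) + s(elevation) + s(dist_reserve),
           data = d, method = "REML")

summary(fit)
```

Calling `plot(fit, pages = 1)` draws one panel per smooth term, each showing the estimated partial effect of that predictor on the wealth score.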

A few interesting things came out. While it is hard to tell much about the poorest of the poor, we can tell something about the most wealthy. The richest in this poor area tend to live in areas with the richest vegetation (possibly representing water), a high altitude (low temperature), high relief (no standing water) and in locations distant from a wildlife reserve (far from annoying and dangerous wildlife).

I’m not sure the wildlife reserve is meaningful (unless the reserve was an area undesirable for human habitation to begin with), but the others might be and represent a trend seen in other Sub-Saharan contexts. Areas with ample farm land and without malarious swamps tend to do the best. Central Province, one of the most developed areas of Kenya, would be an example.

But the question has to be, does a harsh environment doom people to poverty, or do people sort themselves into and compete for access to more favorable areas? Is environmentally determined poverty (or wealth) an accident of birth, or the result of competitive selection?

Alright, back to work. Oh wait, this is my work. Well….

Results of GAM model of SES in Kwale. Y axis is the continuous measure of socio-economic status.
