Does sampling design impact socio-economic classification?

Doing research in developing countries is not easy. However, with a bit of care and planning, one can do quality work which can have an impact on how much we know about public health in poor countries and provide quality data where data is sadly scarce.

The root of a survey, however, is sampling. A good sample does its best to represent a population of interest and can at least qualify all of the ways in which it does not. A bad sample either 1) does not represent the population (bias) and offers no way to account for that, or 2) leaves us with no idea what it represents.

Without being a hater, my least favorite study design is the “school based survey.” Researchers like this design for a number of reasons.

First, it is logistically simple to conduct. If one is interested in kids, it helps to have a large number of them in one place. Visiting households individually is time consuming, expensive and one only has a small window of opportunity to catch kids at home since they are probably at school!

Second, since the time required to conduct a school based survey is short, researchers aren’t required to make extensive time commitments in developing countries. They can simply helicopter in for a couple of days and run away to the safety of wherever. Also, there is no need to manage large teams of survey workers over the long term. Data can be collected within a few days under the supervision of foreign researchers.

Third, school based surveys don’t require teams to lug around large diagnostic or sampling supplies (e.g. coolers for serum samples).

However, from a sampling perspective, assuming that one wishes to say something about the greater community, the “school based survey” is a TERRIBLE design.

The biases should be obvious. Schools tend to concentrate students who are similar to one another. Students are of similar socio-economic backgrounds, ethnicity or religion. Given the fee based structure of most schools in most African countries, sampling from schools will necessarily exclude the absolute poorest of the poor. Moreover, if one does not go out of the way to select more privileged private schools, one will exclude the wealthy, an important control if one wants to draw conclusions about socio-economic status and health.

Further, school based surveys are terrible for studies of health since the sickest kids won’t attend school. School based surveys are biased in favor of healthy children.

So, after this long intro (assuming anyone has read this far) how does this work in practice?

I have a full dataset of socio-economic indicators for approximately 17,000 households in an area of western Kenya. We collect information on basic household assets such as possession of TVs, cars, radios and type of house construction (a la DHS). I boiled these down into a single continuous measure, where each household gets a wealth “score” so that we can compare one or more households to others in the community (a la Filmer & Pritchett).

[Figure: Distributions]

We also have a data set of school based samples from a malaria survey which comprises ~800 primary school kids. I compared the SES scores for the school based survey to the entire data set to see if the distribution of wealth for the school based sample compared with the distribution of wealth for the entire community. If they are the same, we have no problems of socio-economic bias.

We can see, however, from the above plot that the distributions differ. The distribution of SES scores for the school based survey is far more bottom heavy than that of the greater community; the school based survey excludes wealthier households. The mean wealth score for the school based survey is well under that of the community as a whole (-.025 vs. -.004, t=-19.32, p<.0001).
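For anyone who wants to see what goes into a comparison of means like this one, here is a minimal sketch. It assumes Welch's unequal-variance t statistic, which may not be the exact test behind the numbers above, and the scores are invented for illustration. They are not the survey data.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic for a difference in means (unequal variances)."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Invented wealth scores, for illustration only.
school_scores = [-0.03, -0.02, -0.04, -0.01, -0.03]
community_scores = [-0.02, 0.01, 0.03, -0.04, 0.00, 0.02]

# A negative value means the school sample sits below the community mean.
t_stat = welch_t(school_scores, community_scores)
```

A large negative t only says the means differ; it does not say why. That part still comes down to the sampling design.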

Just from this, we can see that the school based survey is likely NOT representative of the community and that the school based sample is far more homogeneous than the community from which the kids are drawn.

Researchers find working with continuous measures of SES unwieldy and difficult to present. To solve this problem, they will often place households into socio-economic "classes" by dividing the data set up into quantiles. These will represent households which range from "ultra poor" to "wealthy." A problem with samples is that these classifications may not be the same over the range of samples, and only some of them will accurately reflect the true population level classification.

In this case, when looking at a table of how these classes correspond to one another, we find the following:

Misclassification of households in school based sample

Assuming that these SES “classes” are at all meaningful (another discussion), we can see that for all but the wealthiest households more than 80% of households have been misclassified! Further, due to the sensitivity of the method (multiple correspondence analysis) used to create the composite, 17 of households classified as “ultra poor” in the full survey have suddenly become “wealthy.”
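To see how quantile classes can shift under a biased sample, here is a rough simulation. Everything in it is an assumption made for illustration: the community scores are random draws, and the "school sample" is simply the poorest half of the community, which is far cruder than the real survey. It also only re-cuts the quintiles and does not re-run the multiple correspondence analysis, so it captures one source of misclassification, not all of them.

```python
import random
import statistics


def assign_class(score, cutpoints):
    """Class 0 (poorest) to 4 (wealthiest): count the cut-points below the score."""
    return sum(score > cut for cut in cutpoints)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical community wealth scores, sorted from poorest to wealthiest.
community = sorted(rng.gauss(0, 1) for _ in range(5000))

# Hypothetical biased sample: only the poorest half of the community.
school_sample = community[:2500]

# Quintile cut-points computed separately on each data set.
community_cuts = statistics.quantiles(community, n=5)
sample_cuts = statistics.quantiles(school_sample, n=5)

# A household is misclassified when its class under the sample cut-points
# differs from its class under the community cut-points.
misclassified = sum(
    assign_class(score, sample_cuts) != assign_class(score, community_cuts)
    for score in school_sample
)
misclassified_share = misclassified / len(school_sample)
```

In this toy setup only the very poorest fifth of the sample keeps its community-level class, so roughly four in five households end up in the wrong class. The wealth scores never changed, only the reference distribution did.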

Now, whether these misclassifications impact the results of the study remains to be seen. It may be that they do not. It also may be the case that investigators may not be interested in drawing conclusions about the community and may only want to say something about children who attend particular types of schools (though this distinction is often vague in practice). Regardless, sampling matters. A properly designed survey can improve data quality vastly.

At the KEMRI Scientific and Health Conference: what is the way forward for African research?

I didn’t hear about this until the very last minute, but was lucky enough to get the invitation letter in time to at least make it to the last day.

The Kenya Medical Research Institute (KEMRI) has, for the past five years, held a research dissemination event intended to highlight KEMRI sponsored and Kenya based research.

Research led by Africans is sadly scarce. R&D funding in SSA is the lowest in the world. In a context where so few people are able to receive an education of sufficient quality to allow post graduate studies, African researchers are few and the resources available to them are low.

Kenya has committed 2% of GDP to R&D. Contrast this with South Korea, which at one point committed 23% of GDP to R&D efforts. While KEMRI is truly a leader in the context of African research, the low level of commitment on the part of the national government makes it tiny in the context of worldwide research.

The presentations I have seen so far have been excellent, but of course, much of this research survives on the good graces of international funding and training. Most of the research presented was performed within the CDC.

So this begs the question, when will and can African countries take ownership of their research? Is this even possible given the dysfunctional nature of politics here?

The story of Africa and African identity (in a global context) is written by the rest of the world. As a foreign researcher, I am quite aware that I am part of this phenomenon.

Presenters have pointed to two main issues (which I agree with). First, African countries cannot proceed to develop their research sectors (or any other sector really) unless Africans take charge of in country and continent wide research priorities. It is important to note that foreign research often takes on issues which were of importance in the colonial period (childhood infectious diseases) despite a growing burden of chronic diseases and diseases of aging which will break the budgets and economies of African countries.

While I do not suggest that attention be diverted from the incredible burden of infectious disease in African countries, it is telling that research priorities are still driven by the international community. Central Province in Kenya is quite well developed. Even my taxi drivers ask me why we don’t do research in Central, given the incredible problems of heart disease, cancer and alcoholism up there. Unless Kenyans spearhead the main issues impacting their country, these problems will go unaddressed.

Second, as noted before, governments have to make firm commitments to support domestic research. As of now, African countries wait for international funding to support their projects, which shifts the conversation away from domestic priorities to international priorities. This is a tall order here, of course.

Of interest, though, besides the macro level problems of funding and support, presenters passionately call for people with Masters and PhDs to use their degrees. “Why don’t you do research? What is wrong with you?”

I can’t speak to this issue effectively. But my sense is that many capable people don’t sense the urgency of doing research and lack the personal initiative to make it happen. I’ve seen it happen that researchers wait to have foreigners write their research for them, and simply wait to have their name rubber stamped on the paper, taking credit for work that they did not do. This is an unacceptable situation that we, unfortunately, enable. Certainly there are issues of experience and capability, but we shouldn’t handle capable African researchers with kid gloves, particularly this well educated young generation.

Sadly, the history of aid and foreign involvement here has set this precedent. This is an era that needs to come to an end. In the private sector, it has. In the public sector, these problems persist. Older researchers here, many of whom came of age during the beginnings of the post-independence era, are screaming that point at the top of their lungs.

Pigs and jiggers: could wild swine be spreading the awful foot burrowing flea?

I wanted to go and see what this jigger thing was really about so I had my guys rent a car and we drove into Mtsangatamu town. Mtsangatamu (I still can’t pronounce it properly) lies along the edge of the Shimba Hills Wildlife Reserve and, according to my data, is a hot spot for tungiasis, or infections from the so called “jigger flea.”

It is a beautiful area. Filled with tropical trees and overgrowth, the landscape looks almost uncontrollable, despite the soil being so sandy that not a drop of water stands anywhere. The air is blistering hot.

People don’t get out here much, though the packed buses that pass by every few minutes indicate that the area isn’t entirely isolated. We drop off some gas for one of our drivers, who has to slowly fill his tank, drop by drop, with the tiniest of plastic funnels. Some development project should provide proper plastic funnels to these guys.

For some reason, we drive into the bush along a foot path, until we find ourselves wedged between a number of small pine trees. “We have to walk now,” I am told while I wonder why we drove this far anyway. Walking would have been easier.

We exit the car and walk through a patch of neatly arranged trees. A tiny tree farm. I never see this in Western, ever. Coming out, we walk into a compound laid out in a manner wholly uncharacteristic of Kenya. With a two story building sporting an upstairs patio complete with a winding staircase to the top, the place looks like the type of patchwork architecture that you associate with off-gridders in the US rather than Kenyan peasants.

“The Mighty Paraffee: If in need call XXXXXXXXXXX”

The Mighty Paraffee turns out to be a kid of about 24, chilling out in the shade. He built this place himself, installed power, has a guest room and an upstairs shower and toilet. His room is decorated with reggae stars and pictures of the saints. Indian music is blaring out of the building. I’ve seen creative interiors from reggae fans in Kenya, but this is something else. This kid should be in architectural school. He even made sure to place the building under a giant tree to keep it cool.

I never figure out what the family does for money and no one can tell me, but the mother is exceedingly proud.

No jiggers here. We walk on. After about a kilometer, we find a poor family sitting outside their house. Children aren’t in school and no one speaks any English, indicating that none of them go.

Hassan (one of our workers) brings over a little girl and tells me to look at her feet. Fatuma is 10 years old and her feet are infested with jiggers. She says they don’t hurt much in the day, but they itch at night. Her brother apparently has them, too. Her mother and her aunt do not.

Everyone is barefoot and they all sleep in the same house. I’m wondering if there might be something about the skin which makes kids susceptible while adults are spared.

I notice a group of goats in a pen and start asking questions about animals.

Tungiasis is a zoonotic disease. It is passed from wildlife to domesticated animals to people who bring it into the household and infect their other family members. Or so it is thought. Not many people have really explored the question sufficiently. Of course, this is why I’m here.

They have about 15 goats, a few chickens and I notice a young dog and a cat walking around. I ask if they ever notice whether the dog ever has jiggers. They say no.

“What kinds of wildlife do you see around here?” One of the kids was killed by an elephant last year. There are wild dogs and hyenas which come and try to get the goats. Wild pigs dig up the cassava at night.

Pigs. That has to be it. A big mystery has been why there is such a tight relationship between distance to the park and jigger infections. Wild pigs come out of the forest, raid the fields of the locals and get water from the river, and then recede back into the darkness before morning. 5km is approximately the distance that a pig could feasibly travel and return home in one night.

Pigs travel through and around the compound, dropping eggs. The eggs mature and the fleas are probably picked up by dogs, but are most likely picked up by kids walking in the bush. The kids then bring them back home and pass them on to their family members.

Hassan associates jiggers with mango flowers, but I probe him further and find that the flowers coincide with the very dry season, which could explain why pigs are making the trek to the river and why they prefer the fields since both water and food are probably scarce in the forest.

I have to send a student out to investigate this further.

An old man comes out. He looks nearly 90, but is most likely only 60 at most. He has arthritis in his back. He shows me his feet which are moderately infected, mostly only between the toes. He asks for medicine. I tell him I’ll send some along. He offers me some boiled cassava which I graciously take. My colleague refuses because there are no cashew nuts with it, but I suspect that he’s worried about getting sick. I become concerned.

We take some pictures and go.

On the way back, we run into an elderly lady. She’s sitting next to her husband, who is busy getting lit on homemade beer at 11 in the morning. She shows me her feet. The spaces around her feet are infested with jiggers. It must be horribly painful.

She points out that she doesn’t have a whole lot of feeling in her left foot. I notice that her skin in this area is clear; the bone is visible through her skin. I ask what happened. She says that she got bitten by a snake 40 years ago. She was pregnant. I ask her if the baby was ok. “The baby is standing there!”

I consider making a joke about a snake baby, but think better of it. I’m just amazed that both of them survived. The wound was horrible looking.

Somehow, we manage to pull ourselves out of the trees and move on. There are some baboons removing mites from one another on the road on the way back, and I take some pictures. My colleague is about to pass out from the heat. I offer to drive.


Links I liked January 23, 2015

Some public/global health things that caught my eye today:

1. A visit to the sickest town in America, a coal mining town in Virginia. Dear Republicans, pay for health care now and abandon “clean coal” or pay more later. It’s up to you. (The Atlantic)

2. How paid sick leave could boost American productivity. (CEPR)

3. Dealing with antibiotic resistance is going to take more than just technology. We can’t sit by and watch everything burn around us while we wait for new drugs to come down the pipe…. because they aren’t coming. (Project Syndicate)

4. I want to deny vaccine deniers. Generally speaking, I don’t like people who are willing to sacrifice kids for politics. Vaccine deniers stick together and increase risks for everyone. (WP) and this one, which puts it all into a nice picture for you. (WP)

5. Diseases without borders: ignoring the problem of piss poor health care in developing countries won’t help us, from Jim Kim of the World Bank. (Project Syndicate)

What are we talking about when we discuss socio-economic position and health in developing countries?

A wide body of literature has found that socio-economic position (SEP) has profound impacts on the health status of individuals. Poor people are sicker than rich people. We find this relationship all over the world and in countries like the United States, it couldn’t be more apparent.

Poor people, particularly poor minorities, are more likely to see their children die, are more likely to be obese, have worse cardiac outcomes, develop cancer more often, are disproportionately afflicted by infectious diseases and die earlier than people who are not poor. There is ample evidence to support this.

However, the exact factors which lead to this disparity are up for debate. Some focus on issues of lifestyle, diet, neighborhood effects and access to health care. Poor people, particularly minorities, live hard, eat worse, live in dangerous or toxic environments and have low access to quality care, all contributing to a perfect storm of dangerous health risks.

However, even when controlling for all or any of these factors, we still find that poor people, and particularly African-Americans, still get sick more often, get sicker and die earlier. This leads us to speculate that health disparities are not simply a matter of access to material goods which promote good health, but are tightly related to something less tangible, such as social marginalization and racism, which are both incredibly difficult to measure. Though difficult to quantify, however, we do have plenty of well documented qualitative and historical data which indicate that these relationships are entirely plausible.

The awful history of slavery and apartheid, however, is somewhat (but not completely) unique to the United States. Further, our ideas of class come from another Western idea, the Marxist concept of one privileged group exploiting the weak for their own financial gain, particularly in the context of manufacturing.

Yet, though these ways of conceiving of race and class are so specific to the West, they are applied liberally to analyses of developing country health, with little consideration of their validity.

It is not uncommon to see studies of socio-economic status and health. The typical method of measuring socio-economic status in developing countries is to examine the collection of household assets such as TVs, radios, bicycles, etc. and, using statistically derived weights, sum up all of the things a household owns and call that sum a total measure of wealth. The total measures for all households are then divided into categories, with the implication that they roughly approximate our conception of class.

Not surprisingly, it is usually found that people who don’t own much are, compared with people who do, at higher risk for malaria, TB, diarrheal disease, infant and maternal mortality and a host of other things that one wouldn’t wish on anyone.

But this measure is problematic. First, there is often little care taken to parse out which items are related to the disease of interest. For example, we would expect that better housing conditions are associated with a decreased risk for malaria, since mosquitoes aren’t able to enter a house at night. We would also expect that people with access to clean water would be more likely to not get cholera. If we find relationships of SEP with malaria or diarrheal disease which include these items, these associations should be treated with suspicion.
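The asset score described above is just a weighted sum, and a small sketch makes the first problem easier to see. The weights and item names here are hypothetical (in practice the weights would come out of something like PCA or MCA), and the `exclude` argument is my own illustration of dropping items tied to the disease of interest before building the score.

```python
# Hypothetical weights, for illustration only; real ones are statistically derived.
ASSET_WEIGHTS = {"tv": 0.6, "radio": 0.2, "bicycle": 0.3, "improved_housing": 0.5}


def wealth_score(household, weights=ASSET_WEIGHTS, exclude=()):
    """Sum the weights of the assets a household owns (1 = owns, 0 = does not).

    Items listed in `exclude` are left out of the score entirely.
    """
    return sum(
        weight * household.get(item, 0)
        for item, weight in weights.items()
        if item not in exclude
    )


# A made-up household that owns a TV and a radio and lives in improved housing.
household = {"tv": 1, "radio": 1, "improved_housing": 1}

# Score using every asset, including housing.
full_score = wealth_score(household)

# Score for a malaria analysis, with housing removed because better housing
# plausibly lowers malaria risk directly.
malaria_score = wealth_score(household, exclude=("improved_housing",))
```

If an association between the score and malaria weakens a lot once housing is excluded, the original finding may have been telling us about houses rather than about wealth.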

Second, if we do find a relationship of “class” with health, can we view it in the same way in which we might view this relationship in the United States? A Marxist approach, with a few exploiting the many for profit, in sub-Saharan Africa doesn’t make a whole lot of sense. The manufacturing capacity of African countries is tiny, and most people are sole entrepreneurs operating in an economy that hasn’t changed appreciably from pre-colonial times. Stripping away any requirements of legal protection of property rights, Africa looks incredibly libertarian.

Further, the elite in Africa hardly profit financially from the poor, receiving their cash flows mainly from abroad in the form of foreign aid or bribery, and foreign activity is mostly limited to resource exploitation, which doesn’t make a dent into Africa’s vast levels of unemployment. While the West is certainly complicit in Africa’s economic woes, post slavery, the West rarely engages Africans themselves.

So, is it valid to attempt to apply the same ideas of class to African health problems? Is there a way to attribute health disparities to class in societies with limited economic capacity and where the “citizenry” is only marginally engaged and groups suffer mainly from a reluctance to cooperate and engage people of other tribes or neighboring countries?

Certainly, the causes of poverty and marginalization in Africa need to be examined, but I don’t think that we can approach them in the same way we do in the States.

A brief thought on evolution: multi-generational survival


Often people will mention that we are “adapted” to do this or another thing, either indicating some crime of modernity (of course, ignoring the fact that a larger percentage of babies are surviving and people are living longer and healthier than at any time in human history) or trying to point out some example of the glaring perfection of our creation, with either an implicit or vocal reference to divine creation.

For example, obesity is attributed to fat and protein rich modern diets since we aren’t “adapted” to eat these types of foods (despite having found the food in East Africa so unpalatable that we had to learn to crush or cook it to digest it efficiently). Our bad disposition is blamed on a lack of sleep since we aren’t “adapted” to sleep as little as we do (this might be true). Most recently, a book writer blamed our problems with depression on a divorced relationship to nature, given that we are “adapted” to hunt and kill for food and then revel over the blood stained corpse (of course, the writer doesn’t consider that people in antiquity might have been depressed, too).

There may be some truth to some of this. However, “adaptation” implies something about the individual, when evolution, in fact, is about reproduction. We aren’t “adapted” to anything. Rather, certain traits are selected for based on the survival of at least two generations of living things, at least for complex social animals like ourselves.

Simply surviving as an individual does not ensure the survival of a species. Living things must first survive long enough to reproduce and then, at least in humans, ensure that the children make it to reproductive age. Human babies are horribly weak in contrast to sharks, which are ready to go even before they leave the mother. Further, in the case of humans, a full three generations must live at once to ensure long term survival.

Thus, we maintain a tenuous relationship with our environment, where traits do not necessarily favor a single individual, but rather an entire family unit, and these traits may or may not imply perfect harmony with an environment, but rather do the job at least satisfactorily.

Nature cares little for quality as numerous examples throughout our physiology show. To claim that we are somehow “perfectly suited” to a specific environment is just simply wrong. Merely, we have come to a brokered peace (after generations of brutal trial and error, what we eat today is thanks to the deaths of millions, mostly children, who had to die to allow us to do so) with wherever we live in order to allow a few of our kids and grandkids to survive.

This, of course, has deep implications for public health. Some public health problems are known to be passed down from parents to children, but in a multi-generational evolutionary framework, it is possible that certain public health problems can be passed through 3 or more generations at a time, complicating interventions. Certainly, the multi-generational health problems of the descendants of African slaves can be an example of this. How can we intervene to protect the public health over a full century?

OK, back to work.

Development as a faith-based activity: the role of the RCT in alleviating poverty

The essence of epidemiologic field trials is the RCT (randomized controlled trial). A random set of people get some sort of treatment (like a new drug), another random set of people don’t and we compare the results. It’s pretty simple stuff.

The trouble with RCTs is that they don’t necessarily work well when people from the two groups are able to influence each other’s outcomes. As a simple example, a trial of a vaccine which prevents people from getting infected with some pathogen might have impacts on people who don’t get the vaccine, since the number of opportunities for transmission is reduced. This is a welcome outcome (and may even be the point of the study), but it doesn’t help us to understand exactly how effective the vaccine is in the individuals who actually receive the vaccine.
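A toy calculation shows what spillover does to the comparison. The risk model and every number in it are assumptions of mine, chosen only to make the arithmetic easy to follow: exposure falls linearly with vaccine coverage, and the vaccine cuts whatever risk remains by a fixed fraction.

```python
def infection_risk(vaccinated, coverage, baseline_risk=0.30,
                   direct_efficacy=0.60, herd_reduction=0.50):
    """Hypothetical risk model, not fitted to any real data.

    Exposure falls linearly with coverage (the spillover), and the vaccine
    removes a fixed share of the remaining risk (the direct effect).
    """
    exposure_risk = baseline_risk * (1 - herd_reduction * coverage)
    if vaccinated:
        return exposure_risk * (1 - direct_efficacy)
    return exposure_risk


coverage = 0.5  # half of the community is randomized to the vaccine

risk_vaccinated = infection_risk(True, coverage)
risk_control = infection_risk(False, coverage)
risk_no_program = infection_risk(False, 0.0)  # a world with no trial at all

# What the trial measures: control arm minus vaccinated arm.
within_trial_difference = risk_control - risk_vaccinated

# What a vaccinated person actually gained compared with no program.
gain_for_vaccinated = risk_no_program - risk_vaccinated

# What the controls gained without ever receiving the vaccine.
spillover_to_controls = risk_no_program - risk_control
```

Under these made-up numbers the trial arms differ by 0.135, while a vaccinated person is 0.21 better off than in a world with no program, because the control arm has already been helped by 0.075. The control group is no longer a picture of life without the vaccine, which is exactly the independence problem.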

Many RCTs make the (flawed) assumption that individuals are independent entities, following a long tradition of statistical analysis. This is a reasonable assumption to make in some cases, but entirely wrong in others (i.e. most public health outcomes).

Development economists have recently adopted the RCT as a means of evaluating the effectiveness of programs intended to relieve poverty or improve human well-being. On the surface, there’s nothing wrong with adopting public health methods to deal with economic problems, as most public health problems have their roots in economics. Jeff Sachs, of course, would argue that many economic problems have their roots in public health problems.

The major problem with RCTs is that while we do our best to control for all of the possible other factors that might impact outcomes given a particular treatment, without a trove of detailed data and prior knowledge of context and contingencies, we really have no idea at all whether and how some public health intervention works. Epidemiology tends to fall back on the “reasonable suspicion” argument, backing up claims of effectiveness with potentially reasonable assumptions of causal pathways. This is clearly quite easy when doing drug trials, where animal models and a century-plus of medical research have given us a reasonably clear picture of the pathophysiological pathways that might lie between drug and outcome.

But with issues of human behavior and economics (which is essentially a science which seeks to uncover mysteries of human behavior), the causal pathways are much more difficult to assess and the factors which lie between intervention and outcome are far more difficult to measure. For example, assessing the outcome of an education program on reproductive behavior is really, really difficult without monitoring all of the possible things that happened between the time that a woman attended an NGO sponsored event at a clinic and the time when she chose to use or not use a condom. In fact, we can’t even really verify that she used the condom, since we weren’t around to observe it.

But we assume, and assume to the point of falling back to faith, that our efforts did what we intended them to do.

Lant Pritchett, a Harvard economist that I’m a great fan of for his work on economic measurement in developing countries, penned an interesting article on the website of the Center for Global Development seemingly questioning the merit of the RCT as a rigorous and necessary evaluation tool for poverty alleviation development programs.

First of all, the argument that RCTs had, until recently, been used sparingly, if at all, and yet are important in achieving good outcomes sits in kind of embarrassing counterpoint with the obvious fact that lots of countries have really good outcomes. That is whether one uses the Human Development Index or the OECD Better Life Index or any social indicator—from poverty to education to health to life satisfaction—there is a similar set of countries near the top. (In the HDI the top five are Norway, Australia, USA, Netherlands, and Germany. In the OECD Better Life Index they are Australia, Sweden, Canada, Norway, and Switzerland.) No one has ever made the arguments that these countries are developed and prosperous because they used rigorous evidence—much less RCTs—in formulating policy and programs. While one might have faith that RCTs can help along the path to development, RCTs didn’t help for those that are there now.

It is very true that development in the United States occurred without the help of RCTs. In fact, malaria elimination in the United States occurred without any of the complex set of interventions that we’re so desperately selling to malaria-endemic countries. It’s even true that, despite more than a decade of research on ITNs, we aren’t really sure whether the declines in malaria that we’ve seen all over Sub-Saharan Africa are due to ITNs or just simply due to processes associated with urbanization and development (as in the US). Actually, a lot of research is telling us that the declines in malaria might be false and that we are simply suffering from a paucity of accurate measurement in malaria endemic countries.

And this is where Pritchett comes in. He’s right. Research in developing countries is inherently challenging to the point where the conclusions we draw from research are somewhat contentious at best, and the result of blind faith at worst.

But coarse and incomplete data and loose assumptions shouldn’t discourage public health (or even economic) professionals from doing research in developing countries. While I have issues with the condescending, neo-classical nature of RCTs in economics (another discussion, but can a peasant lady’s behavior in Western Kenya be reduced to that of Homo economicus?), the truth is that policy makers don’t care about data. They care that people are making the case for action in an impassioned and convincing way. While academics should strive to be as rigorous as possible, the sell won’t happen based on our complex data collection strategies and statistical methodologies. They (and the public) are convinced through impassioned calls for action.


