Doing research in developing countries is not easy. However, with a bit of care and planning, one can do quality work that expands what we know about public health in poor countries and provides quality data where data are sadly scarce.
The root of a survey, however, is sampling. A good sample does its best to represent a population of interest and can at least qualify all of the ways in which it does not. A bad sample either 1) fails to represent the population (bias) with no way to account for it or 2) represents no population we can identify at all.
Without being a hater, my least favorite study design is the “school based survey.” Researchers like this design for a number of reasons.
First, it is logistically simple to conduct. If one is interested in kids, it helps to have a large number of them in one place. Visiting households individually is time consuming, expensive and one only has a small window of opportunity to catch kids at home since they are probably at school!
Second, since the time required to conduct a school based survey is short, researchers aren’t required to make extensive time commitments in developing countries. They can simply helicopter in for a couple of days and run away to the safety of wherever. Also, there is no need to manage large teams of survey workers over the long term. Data can be collected within a few days under the supervision of foreign researchers.
Third, school based surveys don’t require teams to lug around large diagnostic or sampling supplies (e.g. coolers for serum samples).
However, from a sampling perspective, assuming that one wishes to say something about the greater community, the “school based survey” is a TERRIBLE design.
The biases should be obvious. Schools tend to concentrate students who are similar to one another: students of similar socio-economic backgrounds, ethnicities or religions. Given the fee based structure of most schools in most African countries, sampling from schools will necessarily exclude the absolute poorest of the poor. Moreover, if one does not go out of one's way to select more privileged private schools, one will exclude the wealthy, an important control if one wants to draw conclusions about socio-economic status and health.
Further, school based surveys are terrible for studies of health since the sickest kids won't attend school. School based surveys are biased in favor of healthy children.
So, after this long intro (assuming anyone has read this far) how does this work in practice?
I have a full dataset of socio-economic indicators for approximately 17,000 households in an area of western Kenya. We collect information on basic household assets such as possession of TVs, cars and radios and the type of house construction (a la DHS). I boiled these down into a single continuous measure, where each household gets a wealth “score,” so that we can compare one or more households to others in the community (a la Filmer & Pritchett).
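As a rough illustration of how such an asset-based composite can be built, here is a generic first-principal-component sketch on made-up binary asset data. This is not the actual survey pipeline (the post later notes the real composite used multiple correspondence analysis); it follows the Filmer & Pritchett PCA approach, and the households and assets are invented:

```python
import numpy as np

def wealth_scores(assets):
    """First-principal-component wealth index (a la Filmer & Pritchett).

    assets: (n_households, n_assets) array of 0/1 asset indicators.
    Returns one continuous 'wealth score' per household."""
    raw = np.asarray(assets, dtype=float)
    X = (raw - raw.mean(axis=0)) / raw.std(axis=0)  # standardize each asset
    _, _, vt = np.linalg.svd(X, full_matrices=False)  # PCA via SVD
    scores = X @ vt[0]
    # The sign of a principal component is arbitrary; orient the score so
    # that owning more assets means a higher value.
    if np.corrcoef(scores, raw.sum(axis=1))[0, 1] < 0:
        scores = -scores
    return scores

# Made-up example: 6 households, 3 assets (TV, radio, car)
assets = [[1, 1, 1], [1, 1, 0], [0, 1, 0], [0, 0, 0], [1, 0, 0], [0, 1, 1]]
scores = wealth_scores(assets)
```

Because each asset column is standardized before the decomposition, the resulting scores are centered on zero, which is why community means in this kind of index hover near zero.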
We also have a data set of school based samples from a malaria survey comprising ~800 primary school kids. I compared the SES scores from the school based survey to the entire data set to see whether the distribution of wealth for the school based sample matched the distribution of wealth for the entire community. If they are the same, we have no problem of socio-economic bias.
We can see from the above plot, however, that the distributions differ. The distribution of SES scores for the school based survey is far more bottom heavy than that of the greater community; the school based survey excludes wealthier households. The mean wealth score for the school based survey is well under that of the community as a whole (-.025 vs. -.004, t=-19.32, p<.0001).
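In code, that comparison might look like the following sketch. The scores here are simulated with numpy (the two means echo the figures above, but the variances and the normality assumption are mine), and a Welch t-test plus a Kolmogorov-Smirnov test stand in for the full analysis:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated wealth scores: the full community vs. a school-based subsample
community = rng.normal(loc=-0.004, scale=0.10, size=17000)
school = rng.normal(loc=-0.025, scale=0.06, size=800)

# Welch's t-test compares means without assuming equal variances
t, p = stats.ttest_ind(school, community, equal_var=False)

# A two-sample Kolmogorov-Smirnov test compares the whole distributions,
# catching differences in shape and spread as well as location
ks_stat, ks_p = stats.ks_2samp(school, community)
```

The KS test is worth running alongside the t-test here, since the complaint about the school sample is not just a lower mean but a more compressed, bottom-heavy distribution.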
Just from this, we can see that the school based survey is likely NOT representative of the community and that the school based sample is far more homogeneous than the community from which the kids are drawn.
Researchers find working with continuous measures of SES unwieldy and difficult to present. To solve this problem, they will often place households into socio-economic "classes" by dividing the data set into quantiles. These classes range from "ultra poor" to "wealthy." A problem with samples is that these classifications may not be stable across samples, and only some of them will accurately reflect the true population level classification.
In this case, when looking at a table of how these classes correspond to one another, we find the following:
Assuming that these SES “classes” are at all meaningful (another discussion), we can see that for all but the wealthiest households, more than 80% of households have been misclassified! Further, due to the sensitivity of the method (multiple correspondence analysis) used to create the composite, 17 of the households classified as “ultra poor” in the full survey have suddenly become “wealthy.”
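A toy version of this misclassification check might look like the sketch below: quantile "classes" are cut separately within the full data and within a deliberately biased subsample, then compared household by household. The scores and the bias mechanism are invented for illustration, not taken from the survey:

```python
import numpy as np

def quantile_class(scores, n_classes=5):
    """Assign each score to a quantile-based class (0 = poorest)."""
    cuts = np.quantile(scores, np.linspace(0, 1, n_classes + 1)[1:-1])
    return np.searchsorted(cuts, scores, side="right")

rng = np.random.default_rng(1)
full = rng.normal(size=17000)  # stand-in for the full-survey wealth scores

# Draw a subsample biased toward poorer households, like a school sample
weights = np.exp(-full)
weights /= weights.sum()
idx = rng.choice(full.size, size=800, replace=False, p=weights)

# Class under the full survey's cut points vs. class cut within the subsample
full_class = quantile_class(full)[idx]
sub_class = quantile_class(full[idx])

misclassified = np.mean(full_class != sub_class)
```

Because the quantile cut points are recomputed inside the biased subsample, the same household can land in a different "class" than it would in the full survey, which is exactly the kind of disagreement the cross-tabulation above exposes.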
Now, whether these misclassifications impact the results of the study remains to be seen. It may be that they do not. It also may be the case that investigators may not be interested in drawing conclusions about the community and may only want to say something about children who attend particular types of schools (though this distinction is often vague in practice). Regardless, sampling matters. A properly designed survey can improve data quality vastly.
I was just reading a post from development economist Ed Carr’s blog, where he reflects on a book he wrote almost five years ago. Reflection is a pretty depressing exercise for any academic, but Carr seems to remain positive about his book.
He sums it up in three points:
“1. Most of the time, we have no idea what the global poor are doing or why they are doing it.
2. Because of this, most of our projects are designed for what we think is going on, which rarely aligns with reality
3. This is why so many development projects fail, and if we keep doing this, the consequences will get dire”
Well, yeah. This is a huge problem. In academia, we filter the experiences of the poor through a lens of academic frameworks, which we haphazardly impose, often with no consultation with our subjects. Granted, this is likely inevitable, but when designing public health interventions, it helps to have some idea of what the poorest of the poor do and why, or our efforts are doomed to fail.
I remember a set of arguments a few years back on bed nets. Development and public health people were all upset because people were seen using nets for fishing. The reaction, particularly from in-country workers, was that poor people are stupid and will shoot themselves in the foot at any opportunity.
I couldn’t really understand the condescension and was rather fascinated that people were taking a new product and adapting it to their own needs. Business would see this as an opportunity and would seek to figure out why people were using nets for things other than malaria prevention and attempt to develop some new strategy to satisfy both needs (fishing and malaria prevention) at once. Academics simply weren’t interested.
To work with the poor, we have to understand them and understanding them requires that we respect their agency. If we don’t do this, we risk alienating the people we seek to help.
New Publication (from me): “Insecticide-treated net use before and after mass distribution in a fishing community along Lake Victoria, Kenya: successes and unavoidable pitfalls”
This one was years in the making, but it is finally out in Malaria Journal and ready for the world’s perusal. Done.
Insecticide-treated net use before and after mass distribution in a fishing community along Lake Victoria, Kenya: successes and unavoidable pitfalls
Peter S Larson, Noboru Minakawa, Gabriel O Dida, Sammy M Njenga, Edward L Ionides and Mark L Wilson
Insecticide-treated nets (ITNs) have proven instrumental in the successful reduction of malaria incidence in holoendemic regions during the past decade. As distribution of ITNs throughout sub-Saharan Africa (SSA) is being scaled up, maintaining maximal levels of coverage will be necessary to sustain current gains. The effectiveness of mass distribution of ITNs requires careful analysis of successes and failures if impacts are to be sustained over the long term.
Mass distribution of ITNs to a rural Kenyan community along Lake Victoria was performed in early 2011. Surveyors collected data on ITN use both before and one year following this distribution. At both times, household representatives were asked to provide a complete accounting of ITNs within the dwelling, the location of each net, and the ages and genders of each person who slept under that net the previous night. Other data on household material possessions, education levels and occupations were recorded. Information on malaria preventative factors such as ceiling nets and indoor residual spraying was noted. Basic information on malaria knowledge and health-seeking behaviours was also collected. Patterns of ITN use before and one year following net distribution were compared using spatial and multi-variable statistical methods. Associations of ITN use with various individual, household, demographic and malaria related factors were tested using logistic regression.
After infancy (<1 year), ITN use sharply declined until the late teenage years then began to rise again, plateauing at 30 years of age. Males were less likely to use ITNs than females. Prior to distribution, socio-economic factors such as parental education and occupation were associated with ITN use. Following distribution, ITN use was similar across social groups. Household factors such as availability of nets and sleeping arrangements still reduced consistent net use, however.
Comprehensive, direct-to-household, mass distribution of ITNs was effective in rapidly scaling up coverage, with use being maintained at a high level at least one year following the intervention. Free distribution of ITNs through direct-to-household distribution method can eliminate important constraints in determining consistent ITN use, thus enhancing the sustainability of effective intervention campaigns.
In 2012, my friend Akira and I went hiking in the mountains outside Osaka. It was a pretty easy hike, but on the way down Akira twisted his ankle and sort of lumbered down the rest of the trail. After a few days, the pain got worse and he had to cancel an upcoming research trip to Vanuatu. He asked me to go in his place and offered to pay my expenses. I was due to go on a couple of other research trips that summer so I couldn’t commit, but the only other gringo on the trip begged me and at the last minute I decided to go.
Long story short: it was a crazy set of interpersonal dynamics, we suffered bacterial infections, got stuck on an island for ten days because a plane needed to be repaired, one of us didn’t eat or drink water for ten days, much fish was eaten (by those who ate), much kava was drunk and stories were told. Our diet alternated between delicious seafood with fresh fruits and ramen noodles over rice.
It was a surreal experience. I lost ~16 pounds, down from 175 to 159, came back with numerous skin infections and was a general physical wreck for months, more so than usual. It was challenging, but an experience I am unlikely to forget. I hope to go back one day.
The paper can be found here.
Pictures from Vanuatu (back when I took pictures) are here.
Insecticide-treated nets (ITNs) are an integral piece of any malaria elimination strategy, but compliance remains a challenge and determinants of use vary by location and context. The Health Belief Model (HBM) is a tool to explore perceptions and beliefs about malaria and ITN use. Insights from the model can be used to increase coverage to control malaria transmission in island contexts.
A mixed methods study consisting of a questionnaire and interviews was carried out in July 2012 on two islands of Vanuatu: Ambae Island where malaria transmission continues to occur at low levels, and Aneityum Island, where an elimination programme initiated in 1991 has halted transmission for several years.
For most HBM constructs, no significant difference was found in the findings between the two islands: the fear of malaria (99%), severity of malaria (55%), malaria-prevention benefits of ITN use (79%) and willingness to use ITNs (93%). ITN use the previous night on Aneityum (73%) was higher than that on Ambae (68%) though not statistically significant. Results from interviews and group discussions showed that participants on Ambae tended to believe that risk was low due to the perceived absence of malaria, while participants on Aneityum believed that they were still at risk despite the long absence of malaria. On both islands, seasonal variation in perceived risk, thermal discomfort, costs of replacing nets, a lack of money, a lack of nets, nets in poor condition and the inconvenience of hanging had negative influences, while free mass distribution with awareness campaigns and the malaria-prevention benefits had positive influences on ITN use.
The results on Ambae highlight the challenges of motivating communities to engage in elimination efforts when transmission continues to occur, while the results from Aneityum suggest the possibility of continued compliance to malaria elimination efforts given the threat of resurgence. Where a high degree of community engagement is possible, malaria elimination programmes may prove successful.
In my seminal paper, “Distance to health services influences insecticide-treated net possession and use among six to 59 month-old children in Malawi,” I indicated that Euclidean (straight line) measures of distance were just as good as more complicated, network based measures.
I didn’t include the graph showing how correlated the two were, but I wish I had, and I can’t find it on my computer.
Every time I’ve presented research on the association between distance to various things and health outcomes, someone inevitably asks why I didn’t use a more complex measure of actual travel paths. The idea is that no one walks anywhere in a straight line, but rather follows a road network, or even uses a number of transportation options which might be lost in a simple measure.
I always respond that a straight line distance is as good as any other when investigating relationships on a coarse scale. Inevitably, audiences are never convinced.
A new paper came out today, “Methods to measure potential spatial access to delivery care in low- and middle-income countries: a case study in rural Ghana” which compared the Euclidean measure with a number of more complex measurements.
The conclusion confirmed what I already knew, that the Euclidean measure is just as good in most cases, and the pain and cost of producing sexy and complicated ways of calculating distance just isn’t worth it.
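The kind of comparison that paper runs can be sketched quickly. In the toy version below, the "network" distance is faked as a noisy detour factor on top of the straight-line distance (the coordinates, detour range and noise level are all invented), and Spearman's rank correlation measures how closely the two measures agree:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Invented household and clinic coordinates on a 50 x 50 km local grid
households = rng.uniform(0, 50, size=(500, 2))
clinic = np.array([25.0, 25.0])

# Euclidean (straight-line) distance from each household to the clinic
euclid = np.linalg.norm(households - clinic, axis=1)

# Stand-in for a road-network distance: a random detour factor plus noise
network = euclid * rng.uniform(1.2, 1.5, size=500) + rng.normal(0, 0.5, 500)

# If the rank correlation is near 1, the two measures sort households
# almost identically, so coarse-scale associations come out the same
rho, p = stats.spearmanr(euclid, network)
```

Roads inflate distances but tend to inflate them roughly proportionally, so the ranking of households by distance barely changes, which is exactly why the simple Euclidean measure performs so well in these comparisons.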
It’s a pretty decent paper, but I wish they had put some graphs in to illustrate their points. It would be good to see exactly where the measures disagree.
Access to skilled attendance at childbirth is crucial to reduce maternal and newborn mortality. Several different measures of geographic access are used concurrently in public health research, with the assumption that sophisticated methods are generally better. Most of the evidence for this assumption comes from methodological comparisons in high-income countries. We compare different measures of travel impedance in a case study in Ghana’s Brong Ahafo region to determine if straight-line distance can be an adequate proxy for access to delivery care in certain low- and middle-income country (LMIC) settings.
We created a geospatial database, mapping population location in both compounds and village centroids, service locations for all health facilities offering delivery care, land-cover and a detailed road network. Six different measures were used to calculate travel impedance to health facilities (straight-line distance, network distance, network travel time and raster travel time, the latter two both mechanized and non-mechanized). The measures were compared using Spearman rank correlation coefficients, absolute differences, and the percentage of the same facilities identified as closest. We used logistic regression with robust standard errors to model the association of the different measures with health facility use for delivery in 9,306 births.
Non-mechanized measures were highly correlated with each other, and identified the same facilities as closest for approximately 80% of villages. Measures calculated from compounds identified the same closest facility as measures from village centroids for over 85% of births. For 90% of births, the aggregation error from using village centroids instead of compound locations was less than 35 minutes and less than 1.12 km. All non-mechanized measures showed an inverse association with facility use of similar magnitude, an approximately 67% reduction in odds of facility delivery per standard deviation increase in each measure (OR = 0.33).
Different data models and population locations produced comparable results in our case study, thus demonstrating that straight-line distance can be reasonably used as a proxy for potential spatial access in certain LMIC settings. The cost of obtaining individually geocoded population location and sophisticated measures of travel impedance should be weighed against the gain in accuracy.
Was reading Chris Blattman’s list of books that development people should read but don’t and found this in the Amazon description of “The Anti-Politics Machine: Development, Depoliticization, and Bureaucratic Power in Lesotho.”
Development, it is generally assumed, is good and necessary, and in its name the West has intervened, implementing all manner of projects in the impoverished regions of the world. When these projects fail, as they do with astonishing regularity, they nonetheless produce a host of regular and unacknowledged effects, including the expansion of bureaucratic state power and the translation of the political realities of poverty and powerlessness into “technical” problems awaiting solution by “development” agencies and experts.
Note that I do not harbor any ill will toward development or even, as a general rule, “technical solutions.” Having been involved with bed net distributions and having watched the outcomes of reproductive health interventions, for example, I can say that there are many positive outcomes of development projects. In my area, fewer kids are dying and women are becoming pregnant a whole lot less, decreasing the risk of maternal mortality.
Disclaimers aside, there is no doubt that development projects often fail for a number of reasons, the first of which is that leaders have no interest in seeing that they succeed. While leaders are indifferent to the outcomes, they happily take on the power that comes with them, embracing bureaucratic reforms, which are mostly just expansions of power at all levels of government.
This wouldn’t necessarily be a bad thing, except that African countries never embraced many of the protections of individual rights which restrict the powers of the state. Independence movements in much of Africa were predicated on an eventual return of power to the majority. Not many (none?) of these movements sought to protect the rights of the minority, much less the individual. Thus, there is little restriction on the types of rules which may be created, and since many of these development projects influence policy, development projects unwittingly feed into the autocracy machine.
In the past, surveys were done on paper, either through a designed questionnaire or by someone frantically writing down interview responses. When computers came around, people would be hired to type in responses for later analysis.
Nowadays, with the advent of cheap and portable computing, research projects are rapidly moving toward fully digital methods of data collection. Tablet computers are easy to operate, can be cheaply replaced, and now can access the internet for easy uploading of data from the field.
Surveyors like them because large teams can be spread out over a wide space, data can be completely standardized and the tedious process of data entry can be avoided.
Of interest to me, however, is whether the technology is influencing the nature of the responses given. That is, will someone provide that same set of responses in a survey using digital data collection methods as in a paper survey?
Recently, we tried a tablet based software package for a small project on livestock possession and management at Mbita Point in Western Kenya. I intended it as a test to see if a particular software package might be a good fit for another project I'm working on (the one that's paying the bills).
We had only limited success. The survey workers found the tablets clunky, and a number of problems with the Android operating system made them more trouble than the survey was actually worth. Of interest, though, was how the technology distracted the enumerators from their principal task, which was to collect data.
Enumerators would become so wrapped up in trying to navigate the various buttons and options of the software that they couldn't effectively concentrate on performing the survey. Often they appeared to skip questions out of frustration or would just frantically select one of the many options in the hope of moving on to the next question.
In a survey of more than 100 questions, the process started taking far more time than households were willing to give. We eventually had to abandon the software and revert to a paper based method.
Surveys went from lasting more than one hour to taking under 30 minutes. Workers were more confident and had more time to interact with the respondents. Respondents had more of an opportunity to ask questions and consider the meaning of what they were being asked. They offered far more information than we expected and felt that they were participating in the survey as partners and not just as passive victims.
One of our enumerators noted that people react differently to a surveyor collecting data on the tablets than with paper. She described collecting data with technology as being “self absorbed” and alienating to the respondents. Collecting data on paper, however, was seen as a plus. “They can see me writing down what they say and feel like their words are important.”
I'm thinking that the nature of the responses themselves might be different as well. Particularly with complex questions of health and disease, the surveyors will often have to explain the question and give a respondent a chance to ask for further clarification. Technology appears to inhibit this process, perhaps compromising the chance for a truly reasoned response.
While I am absolutely not opposed to the use of technology in surveys, I think that the survey strategy has to be properly thought through and the challenges considered. At the same time, however, data collection is a team effort and requires a proper rapport between community members and surveyors who often know each other.
Is technology restricting our ability to gather good data? Could the use of technology even impact the nature of the response by pushing them in ways which really only tell us what we want to believe rather than what actually exists?