As it is something I don’t know a whole lot about, I recently got the bright idea to start working with social network analysis in infection transmission. A search of the literature turned up a few interesting gems, mainly of infection transmission through sexual networks, but little in the way of actual data. There were plenty of boiled-down examples of other people’s data, but the authors don’t post the data for people to play with. I could easily simulate some data. A network analysis software package, UCINET, has a feature to create a random network. However, I felt this to be cheating and wanted to get my hands on something real.
In a rare moment of spontaneity, I posted a call for study subjects through the School of Public Health’s open student mailing list. Surprisingly, I got about 65 responses from people willing to expose their contact networks for a day. Picking a single day, I asked the group of volunteers to fill out a form stating whether they had contact with any of the other volunteers on the list and how many people they may have had contact with who were not on the list. I was intrigued by how many people were willing to answer the survey and return it to me without compensation of any kind. I was also surprised at how difficult it is to create a survey that provides you with exactly the type of data you want in the format you wish.
The basic network of people who were on the list looks like this:
At first I was worried that the data might be worthless, due to the lack of overlap in volunteers or possibly due to too much overlap, as might be the case if all of the people on the list had a class together on the study day. However, the network appears intuitive, and knowing the individuals on the list, the clustering present is logical. The circles in the top corner are isolates who had no contact with people in the group. The red dot in the center is me. I had a wide variety of contacts since I was the one doing the survey. Although it might not make scientific sense to include myself in the contact network, I do have contact with many of the people on the list regularly, so I count as a member. The clump to the left of me is primarily Epid PhD students, of which I’m a member. It has to be said that they provided the most concise data.
Including the contacts people had that were not on the list, we can see that the results get a little more interesting:
In addition to the cool looking patterns, we can see that many people have contacts well outside the immediate study group. In fact, the people in this study had a mean of 20 contacts per person for the single day. The contact distribution was highly skewed, with some people having as many as 150 contacts for the single day. Contact rates varied by department and by degree. The colored circles represent what are called “K-cores”, that is, groups that are more connected to one another than with other groups. In this case, the K-cores roughly turn out to represent the different departments represented by the study group. In fact, it is fairly surprising how well it pegged individuals into their respective groups. It even positioned me right between Epid and Biostat. I am one of the larger blue dots up top, and Epid (blue) and Biostat (green) are to the left and right of me. HBHE is mostly scattered black dots. The size of the dot represents the relative proportion that the member constitutes of the K-core.
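For anyone who wants to try the K-core idea without UCINET, here is a minimal sketch in plain Python. More formally, a k-core is the largest group in which every member has at least k contacts inside the group. The function name `core_numbers` and the toy contact list are my own illustrations, not the survey data, and this is not necessarily how UCINET computes it.

```python
from collections import defaultdict

def core_numbers(edges):
    """Return {person: core number} for an undirected contact list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-contacts
            neighbors[a].add(b)
            neighbors[b].add(a)
    degree = {n: len(nb) for n, nb in neighbors.items()}
    core = {}
    remaining = set(neighbors)
    k = 0
    while remaining:
        # Peel off the least-connected person still in the network
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nb in neighbors[node]:
            if nb in remaining:
                degree[nb] -= 1
    return core

# Hypothetical contacts: A, B and C all met each other, D only met C
contacts = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(core_numbers(contacts))
```

In the toy example, A, B and C end up in the 2-core while D only reaches the 1-core. Isolates never appear in an edge list, so they would have to be added separately with a core number of 0.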
Most of the study group were folks from the School of Public Health. HBHE is by far the most connected of all the departments. Other (a mixture of departments spread around campus) was the least connected of the people who bothered to report their affiliation. Not surprisingly, PhD students reported a lower level of contact than Master’s students, a difference confirmed by a Wilcoxon test with a p-value of .033.
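As a rough illustration of that kind of comparison, here is a sketch of a two-sided Wilcoxon rank-sum (Mann-Whitney) test using the normal approximation with a tie correction and no continuity correction. I am assuming the rank-sum form because the two groups are independent. The contact counts below are made up for the example and are not the survey data, so the output will not reproduce the p-value above.

```python
import math

def rank_sum_test(x, y):
    """Two-sided rank-sum test via the normal approximation. Returns (U, z, p)."""
    values = sorted(list(x) + list(y))
    rank_of = {}      # average rank for each distinct value (handles ties)
    tie_term = 0.0
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        rank_of[values[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        i = j
    n1, n2 = len(x), len(y)
    n = n1 + n2
    w = sum(rank_of[v] for v in x)   # rank sum of the first group
    u = w - n1 * (n1 + 1) / 2        # Mann-Whitney U for the first group
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return u, z, p

# Hypothetical daily contact counts, for illustration only
phd = [5, 8, 12, 3, 10]
masters = [20, 15, 30, 9, 25]
u, z, p = rank_sum_test(phd, masters)
print(f"U = {u}, z = {z:.2f}, p = {p:.3f}")
```

With samples this small an exact test would be more appropriate; the approximation is only meant to show the mechanics.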
Grad school is marked by long periods of isolation and silence. To make up for a lack of social life (and skills), I, like many others, resort to creating semblances of social circles through social networking sites like Facebook. In doing research for a paper I’m working on, I am reading up on network theory and happened upon some software that allows me to export my friends list from FB with all of the internal connections between them. UCINET (http://www.analytictech.com/ucinet/) is a small but powerful program designed to perform social network analysis. Basically, it takes all the connections between people you know and is able to draw them for display using a number of criteria. The raw list read into it ends up looking like this:
I have approximately 300 friends, most of whom I actually know. Using a neutral criterion for display, I can see distinct groups from various points of my life. There’s an EMU group in the bottom right corner, a University of Michigan group on the left, and a huge clump of people that are mostly music related, i.e. pre-grad school. Within the pre-grad school group, I can see distinct clouds from varying time points: Boston, Noise related folks, Mississippi folks and some others. There are some isolates, basically people I know from various disconnected events such as the time I spent in Germany and students of mine from JCC, among others. I find it interesting that Joe Kacemi is my bridge from UM to my cloudy music world. Thom Klepach is my link from EMU to my music cloud. Without Thom and Joe, there would not be a single link between my pre-grad school music world and my grad school life.
Using the software to isolate 6 specific “Factions” within the entire list I am able to produce this:
Now, it’s much more clear. There is one group of complete isolates, a random bag that combines Boston and EMU, some odd group consisting of people that I rarely talk to, a Mississippi group, a University of Michigan group, a collection of music related people that I knew from Boston and San Francisco (basically, 99-2002) and a group of Michigan music people. Mostly, it makes sense in the grand scheme of my life. The amalgam of Boston, EMU and Canada is quite strange, however. I think that the level of connections between the major music-related groups is fascinating. The program divides them up into two distinct factions, despite the large number of connections between the two.
Finally, I did an eigenvalue analysis and got this one:
The only reason I present this is because, oddly, all three members of Wolf Eyes appear to form the base of the organized list. I’m not sure why. Basically, according to this, all of my social relationships start with Aaron Dilloway.
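I can’t say exactly what the program computed, but one common eigen-based measure is eigenvector centrality, where a person scores highly if they are connected to other high-scoring people. Assuming that is roughly what is going on, here is a minimal power-iteration sketch. The function name and the tiny star-shaped network are hypothetical.

```python
def eigenvector_centrality(edges, iterations=200):
    """Approximate eigenvector centrality of an undirected network."""
    nodes = sorted({n for edge in edges for n in edge})
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Each person keeps their own score and adds their friends' scores.
        # Keeping the own score (A + I instead of A) leaves the leading
        # eigenvector unchanged and avoids oscillation on bipartite networks.
        new = {n: score[n] + sum(score[m] for m in neighbors[n]) for n in nodes}
        top = max(new.values())
        score = {n: v / top for n, v in new.items()}  # rescale so the max is 1
    return score

# Hypothetical network: one hub who knows three people who don't know each other
friends = [("hub", "a"), ("hub", "b"), ("hub", "c")]
scores = eigenvector_centrality(friends)
print(max(scores, key=scores.get))
```

On this toy network the hub comes out on top, which is the sense in which one or two well-connected people can end up anchoring the whole list.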
In trying to decide what courses to take next semester, I did some exploring on complex systems modeling. I created a lame model of two distinct classes of people (or things) who have a certain dislike for one another and prefer to be surrounded by a certain number of like individuals within an arbitrary radius. A fixed number of individuals of both categories is initially placed at random on a grid. Each individual then scans the area of a certain radius around them, figures the percentage of individuals like them relative to the total number of individuals within the scan radius, then makes a decision to move based on an arbitrary threshold. The individual then moves to some randomly chosen open space on the grid within a certain move radius.
I am assuming that the model represents two distinguishable groups who have some dislike for one another. Consider African Americans and white people, Hutus and Tutsis, poor people and rich people, Republicans and Democrats, etc. Upon seeing that an unacceptable percentage of the people around them are of the other category, they can only move within a certain distance, assuming that resources are limited or that moving too far will remove them from some desirable geographic proximity to work, resources, etc.
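The scan-decide-move rule described above can be sketched roughly as follows. This is not my original model code, just a minimal Python illustration under a few assumptions: the "radius" is a square window around the individual, the grid does not wrap around at the edges, the individual does not count themselves, and someone with no neighbors in range stays put. The names `like_fraction` and `maybe_move` are made up for the example.

```python
import random

EMPTY = 0  # cell values: 0 = empty, 1 and 2 = the two groups

def like_fraction(grid, row, col, radius):
    """Share of occupied cells within `radius` that match the occupant of (row, col)."""
    me = grid[row][col]
    same = total = 0
    for r in range(max(0, row - radius), min(len(grid), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(grid[0]), col + radius + 1)):
            if (r, c) != (row, col) and grid[r][c] != EMPTY:
                total += 1
                same += grid[r][c] == me
    return same / total if total else 1.0  # nobody nearby: treated as content

def maybe_move(grid, row, col, scan_radius, move_radius, threshold, rng=random):
    """Move the occupant to a random empty cell within `move_radius` if unhappy."""
    if like_fraction(grid, row, col, scan_radius) >= threshold:
        return (row, col)  # content, stays where they are
    open_cells = [
        (r, c)
        for r in range(max(0, row - move_radius), min(len(grid), row + move_radius + 1))
        for c in range(max(0, col - move_radius), min(len(grid[0]), col + move_radius + 1))
        if grid[r][c] == EMPTY
    ]
    if not open_cells:
        return (row, col)  # nowhere to go
    r, c = rng.choice(open_cells)
    grid[r][c], grid[row][col] = grid[row][col], EMPTY
    return (r, c)

# Tiny hypothetical grid: the 1 in the corner is surrounded only by 2s
grid = [[1, 2, 2],
        [2, 0, 0],
        [0, 0, 0]]
print(maybe_move(grid, 0, 0, scan_radius=1, move_radius=1, threshold=0.5))
```

One full step of the model would apply `maybe_move` to every occupied cell, ideally in random order so that no corner of the grid gets to move first every time.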
The model is quite simple, but the results are rather interesting. First we start with a 100 x 100 grid, yielding 10,000 possible occupable spaces. We assume 5000 total individuals, and a 50/50 distribution of each group. Placing them randomly, we obtain an initial grid that looks something like this:
Blue represents an unoccupied space, and yellow and red represent spaces occupied by one of the groups.
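A starting grid like this one can be generated in a few lines. The sketch below assumes the same encoding of 0 for an empty space and 1 and 2 for the two groups; the function name `make_grid` and its parameters are hypothetical, with defaults matching the numbers above.

```python
import random

def make_grid(size=100, population=5000, share_first=0.5, seed=None):
    """Randomly place `population` individuals on a size x size grid."""
    rng = random.Random(seed)
    n_first = round(population * share_first)
    n_second = population - n_first
    n_empty = size * size - population
    cells = [1] * n_first + [2] * n_second + [0] * n_empty
    rng.shuffle(cells)  # random placement of both groups and the empty spaces
    return [cells[r * size:(r + 1) * size] for r in range(size)]

grid = make_grid(seed=1)
flat = [cell for row in grid for cell in row]
print(len(flat), flat.count(1), flat.count(2), flat.count(0))
```

With the defaults this gives 10,000 spaces, 2,500 individuals in each group and 5,000 empty spaces. Changing `share_first` gives an uneven split between the groups.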
I started by assuming that individuals would not tolerate any less than 50 percent representation of their group within a radius of 10 squares. If they happen to occupy a square where the percentage of their own group compared to the total number of individuals within 10 squares is less than 50 percent, they will move to a randomly selected open square somewhere within a 20 square radius of their present position. I repeat this process 25 times. At the end, we see that even after 25 steps, people have already formed segregated clusters of individuals that are not necessarily contiguous. In fact, the entire grid is completely segregated after a mere 10 steps:
Adjusting the parameters a bit, we raise the share of their own group that people require around them to 80%. We can see that given a higher level of “racism” and a dense population, groups have a more difficult time clustering and are thus relegated to a life of constant movement and avoidance, with no resolution. I found this behavior to hold even with a smaller population and a wider radius of movement. Given a high level of intolerance for the other group, individuals have a difficult time forming clusters and there is little stability.
Assuming a high tolerance for the other group leads to the opposite effect, leaving more individuals happy with their present position and less willing to move. This leads to high stability and less segregation, as one would expect.
The “sweet spot” for total segregation appeared to be approximately 50% tolerance. Individuals are happy as long as they make up 50% of the community, but this level of tolerance leads to the highest level of segregation overall. It is assumed that if an individual were to randomly move to an area occupied by the other group, they would immediately move as they felt overwhelmed by the presence of a majority that consisted of individuals of a group other than their own.
I also ran a model assuming that one group made up only a quarter of the overall population, with a moderate level of racism. The results were interesting. The minority group was forced to maintain a nomadic existence while the majority group hardly moved at all. When adjusting for extreme levels of racism, the majority group clusters almost immediately and the whole grid basically becomes a segregated urban area after approximately 25 steps. I call this the Jackson, Mississippi model.
My conclusions were simple and expected, but I was surprised that even this simple model was able to bear them out. High levels of racism lead to high levels of instability but low clustering, due to the random nature of movement in the model. Low levels of racism lead to low clustering but high stability of movement. Moderate levels of racism, which we likely see within the US, lead to high levels of segregation and clustering with high levels of stability, as people, once they have clustered, are unlikely to leave what they consider to be a favorable situation. When the groups are unevenly distributed, the minority group must maintain a nomadic state of existence, while the majority group remains fixed. Having a high level of racism under these conditions creates a segregated society as seen above, demonstrating the interplay between racist attitudes and imbalance in group representation.