Modeling Human Behavior and The Rise of Big Data

In the wake of the 2012 Presidential election, hundreds of Internet users asked a very important question: “Is Nate Silver a witch?” The accusations of witchcraft came after Silver, statistician and blogger for the New York Times, predicted the winner of all 50 states with 100% accuracy [1]. Coming after his perfect prediction of the outcome of all 38 Senate races and his correct call of 49 out of 50 states in the 2008 presidential election cycle, this victory for the power of statistics has led many to declare that the age of Big Data has arrived [2].

The age of Big Data refers to the increasing tendency for scientists and statisticians to utilize vast amounts of data to computationally model and predict human behavior. Data is drawn from an array of sources ranging from the chatter of the Twitterverse to the meticulously gathered databases of academic and government organizations. Analysis that utilizes big data is a shift away from theories and models based on a small number of case studies, towards the usage of as much data as possible to try and tease out patterns of human behavior. Elections are well suited to statistical analysis and have long been a target of prediction. The mantra of Big Data is that all human behavior is a subject for creative and innovative statistical analysis. The scientists of this new age examine correlation and create statistical models to predict and analyze everything from human voting patterns to—even more interestingly—musical creativity and human intelligence.

The idea behind this new trend is that if we can gather enough data, a sufficiently advanced statistical model will allow scientists and statisticians to tease out an explanation of human behavior from the numbers that we leave in the wake of our everyday lives. The hope is that a complex understanding of humans, from patterns of behavior to creativity, is waiting to be uncovered.

The increasing ability of statistical models to predict human behavior has caught the eye of both the public at large and the United States Military. The Department of Defense is currently investing in several programs aimed at predicting insurgent behavior in Afghanistan and Pakistan. One of the more successful programs, the Spatio-Cultural Abductive Reasoning Engine, or SCARE, is able to predict the areas most vulnerable to attack as well as the likely location of weapons and ammunition caches [3].

Insurgents often mobilize support by tapping into preexisting social and religious networks. Doing so allows the insurgents to hide operatives and weapons within certain cultural regions in different cities. SCARE works by analyzing data from street maps, previous bombings, and the ethnic and religious makeup of individual neighborhoods in different regions to map out the areas most likely to be hiding grounds for insurgents and the areas at the most risk of an insurgent attack [4]. So far, SCARE has been shown to be able to predict the location of insurgent weapons caches to within a 700-meter range of accuracy.
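The reasoning described above, working backward from past attacks to the places that could explain them, can be illustrated with a small sketch. This is not the actual SCARE algorithm: it is a simplified greedy stand-in that assumes attacks and candidate sites are given as (x, y) coordinates in meters, that the candidate sites have already been filtered by neighborhood data, and that a cache lies within some distance band of the attacks it supports. The function names, the sample coordinates, and the 100 to 800 meter band are all hypothetical.

```python
import math


def distance(a, b):
    """Straight-line distance between two (x, y) points, in meters."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def explain_attacks(attacks, candidates, min_dist, max_dist):
    """Greedily choose candidate sites until every attack that can be
    explained lies between min_dist and max_dist of a chosen site.

    Returns the chosen sites and the indices of attacks left unexplained.
    """
    uncovered = set(range(len(attacks)))
    chosen = []
    while uncovered:
        best_site, best_covered = None, set()
        for site in candidates:
            if site in chosen:
                continue
            covered = {
                i for i in uncovered
                if min_dist <= distance(attacks[i], site) <= max_dist
            }
            if len(covered) > len(best_covered):
                best_site, best_covered = site, covered
        if best_site is None:
            break  # no remaining site explains any remaining attack
        chosen.append(best_site)
        uncovered -= best_covered
    return chosen, sorted(uncovered)


# Hypothetical data: three past attacks and three plausible hiding sites.
attacks = [(0, 0), (1000, 0), (5000, 5000)]
candidates = [(500, 0), (5000, 4400), (9000, 9000)]
sites, unexplained = explain_attacks(attacks, candidates, 100, 800)
print(sites, unexplained)
```

The sketch also shows the weakness discussed later in this article: the distance band and the candidate list are fixed assumptions drawn from past behavior, so a change in how insurgents hide would leave attacks unexplained until a human revises those inputs.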

Haunted by the disastrous counterinsurgency tactics of Vietnam and the early years of the War on Terror, the military is more invested than ever in trying to lead a smarter, leaner, and more successful counterinsurgency. But as doomed efforts in Vietnam have shown, counterinsurgencies that become mired in rigid formulations and fail to creatively adapt to new situations ultimately fail. Insurgencies thrive on resourcefulness and on adapting to circumstances far more quickly than the bureaucratic American military can. If the military comes to rely solely on programs like SCARE, then a simple tactical change by the insurgents renders the program useless. The creative element of human tactics is lost on the machine, no matter how advanced it is. SCARE works on the premise of predicting insurgent locations based on past behavior. If the insurgents were to shift where and how they hide within an urban environment, then SCARE would need to fundamentally change the way in which it analyzes the data provided. And as more and more programs and models attempt to predict the actions of future insurgents, this element of human creativity becomes more important.

If a computer program such as SCARE could have the creative capacity of the native human mind, if it could change and react in the same way as well-trained counterinsurgents, then many of the limitations of SCARE could be surpassed. But are computer programs able to capture the unpredictability and creativity of the human mind? One of the more interesting results of the advent of Big Data is the success that researchers have had in building creative AI programs.

David Cope, a music professor at the University of California, Santa Cruz, is a well-respected composer, but when he hit a block in the 1980s, he turned to his other love, computer science. He began experimenting with composing music using a program called EMI, or Experiments in Musical Intelligence. At first, the program just generated random musical tones and melodies. Dissatisfied, Cope kept tinkering and refining the program. He realized humans don’t compose in a vacuum; they draw on centuries of musical teaching and tradition. Approaching the problem in the same way that SCARE approaches the problem of insurgency, Cope turned to the world of Big Data. He built a massive database of thousands of musical compositions and wrote algorithms to analyze the music, starting first with short chorales, and later moving on to analyze more complex and original works. Cope “taught” the program the patterns that make great music great and was soon generating original compositions through EMI [5]. An updated EMI program, named Emily Howell, has released two albums [6].
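The general idea of learning patterns from a body of existing music and then generating something new from them can be sketched in a few lines. This is not Cope's method, which the article does not describe in detail. It is a first-order Markov chain, a much simpler technique swapped in for illustration, and the note names, the tiny two-melody corpus, and the function names are all hypothetical.

```python
import random
from collections import defaultdict


def learn_transitions(melodies):
    """Record, for each note, every note that follows it in the corpus."""
    transitions = defaultdict(list)
    for melody in melodies:
        for current, following in zip(melody, melody[1:]):
            transitions[current].append(following)
    return transitions


def compose(transitions, start, length, rng):
    """Build a melody by repeatedly picking a note that followed the
    previous note somewhere in the corpus."""
    melody = [start]
    while len(melody) < length:
        options = transitions.get(melody[-1])
        if not options:
            break  # the corpus never continues from this note
        melody.append(rng.choice(options))
    return melody


# Hypothetical corpus: two short melodies written as note names.
corpus = [["C", "D", "E", "C"], ["C", "E", "G", "E"]]
transitions = learn_transitions(corpus)
print(compose(transitions, "C", 8, random.Random(0)))
```

Every step in the generated melody is one that already occurs somewhere in the corpus. A model of this kind can recombine what it was given, but it cannot produce a transition no composer in its database ever wrote.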

Cope created something extraordinary, a program capable of modeling and emulating the uniquely human pursuit of musical creativity. Many researchers have followed him into the growing field of artificial creativity. As this technology grows and learns, it has serious potential to change the face of science, social science, war, and even art and music. Yet, while Cope uses EMI to assist with his compositions, he is ultimately still at the helm of the creation process. No matter how advanced the programs become, they are still being guided and taught by humans. The computers learn only with the human hand guiding the process. Creativity and originality originate not in the program itself, but are brought about by the human hand. No matter how much data is gathered, it can’t account for the unpredictability of the truly original. It is tempting to turn the world over to the models created through Big Data, and awe-inspiring predictions do happen when the models are correct. But the human hand is needed to ensure that the models remain accurate.

1. Silver, Nate. “FiveThirtyEight’s 2012 Forecast.” The New York Times. November 6, 2012.
2. “Nate Silver.” Wikipedia.
3. P. Shakarian, M. Nagel, B. Schuetzle, V.S. Subrahmanian. “Abductive Inference for Combat: Using SCARE-S2 to Find High-Value Targets in Afghanistan.” IAAI, 2011.
4. “The science of civil war: What makes heroic strife.” The Economist. April 21, 2012.
5. Wilson, Chris. “I’ll Be Bach.” Slate Magazine. May 19, 2010.
6. “Emily Howell.” Wikipedia.

Will Craft is a second year at The University of Chicago majoring in Political Science and minoring in Physics. He is also interested in philosophy, cognitive science, and artificial intelligence.