Big Data

Why Big Data Is Good At Puzzles, But Bad At Mysteries

You might say I’m a contrarian. Maybe it was my studies in philosophy when I was in college; maybe it was my strict upbringing; maybe it was 11 years of Catholic school. If you asked my wife, she would just say that I like to argue, regardless of the subject. Whatever the source, when I’m confronted by irrational exuberance and unbridled certainty about an idea, my immediate response is to bristle and offer a counter.

This happens a lot when I’m faced with the regular proclamations in the digital media industry. Some fabulous new technology or app is anointed a hedge against the fundamental alteration of human behavior slipping along the shifting sands of media consumption. Nothing gets me fired up more than the wild-eyed, People’s Temple-esque declarations about Big Data.

Big Data knows all. It sees all. And it will cure all. Because of big data we will serve better, more relevant ads to more people in ways that will make them love us.

This is the kind of thing we say to each other in the business. The problem is that it misses the fundamental other half of the question marketing seeks to address. Big data gets at the “what,” but it doesn’t do much to get at the “why.”

The “consumers want better, more targeted, highly relevant ads” mantra is a post-facto rationalization for an approach to implementing big data ad technology, not a pre-facto rationale for doing so, and it falsifies what big data really accomplishes in the realm beyond the countable. The regular human being’s relationship with advertising is something between passive ennui and managed hostility; big data doesn’t alter that condition by enabling better or more relevant ads.

Because the larger question about the effectiveness of advertising in the modern age didn’t have an answer that could easily be understood and applied (how do people feel, what makes them tick, what rhetorical exercises of persuasion will have the impact we want?), the focus of digital, and now of the rest of advertising, has moved to the tangible elements that can be manipulated to more forcefully and decisively mitigate risk: data. The industry opted to work on solving puzzles and gave up on solving mysteries. The forest was abandoned for the sake of counting trees.

Now this exercise, by itself, is valuable… but only when it isn’t done in isolation.

We’ve built a system that is too good at collecting data, average at recognizing patterns, and anemic at interpretation. It is akin to John Searle’s Chinese room argument. John Searle is a professor of the philosophy of language and a founder of the Cognitive Studies department at the University of California at Berkeley.

“Suppose that a person were given a set of purely formal rules for manipulating Chinese symbols. The person does not speak or understand written Chinese, and so he does not know what the symbols mean, though he can distinguish them by their differing shapes. The rules do not tell him what the symbols mean: they simply state that if a symbol of a certain shape comes into the room, then he should write down a symbol with a certain other shape on a piece of paper. The rules also state which groups of symbols can accompany one another, and in which order. The person sits in a room, and someone hands in a set of Chinese symbols. The person applies the rules, writes down a different set of Chinese symbols as specified by the rules on a sheet of paper, and hands the result to a person waiting outside the room. Unknown to the person in the room, the rules that he applies result in a grammatically correct conversation in Chinese… In sum, the rules are a complete set of instructions that might be implemented on a computer designed to engage in grammatically correct conversations in Chinese. The person in the room, however, does not know this. He does not understand Chinese.”

The current system the digital advertising industry has developed is geared in this way. This doesn’t mean that the system by itself is wrong, just that it is incomplete. The present systems more or less use “syntactic” rules to manipulate symbol strings, but have no understanding of meaning or semantics. Back to missing the forest for the trees: it just counts trees and records their placement in the forest, but it doesn’t tell us much more about the landscape.

The lopsided system we have now yields success only insofar as it identifies those who are already familiar with the brand being advertised and may already be prospective or current customers.

Much of advertising looks like what it’s trying to accomplish is getting a non-buyer to be a buyer. But mostly what advertising actually does is help those who bought feel better about having done so.

What can be of greater value than simply turning a non-buyer into a buyer is the data gleaned from the buyer who was already going to buy, and how that data can be used to expand on business objectives.

However, that’s still only half the picture. The current machine-based system is only going to give you what it can based on positive response data; it’s going to give you almost nothing about the non-responding non-buyers. Marketing has undergone a foundational shift that existentially repositions its tools of communication, advertising qua advertising chief among them. The import of marketing now is more “ambient,” or what I’ve described over the years as being more a part of our “flow experiences.” Approaches that are only data-driven are going to have precision without accuracy and insinuation without subtlety.

What the data need to be used for is taking some of the guesswork out of where to start: whom to reach, where to reach them, and what rhetorical flourishes have the best chance of persuading. It’s the last part that’s been the most neglected in most of digital advertising’s history because, first, it lacked (and to a large degree still lacks) the artistic glamor reserved for traditional media platforms and, second, it is the one thing that can’t be easily rendered into machine-readable form. That is, you can’t easily “pro forma” results on it.

This isn’t really a Mad Men/Math Men conundrum, as it’s sometimes been termed over the last couple of years, but a resource allocation one: smart technology for solving puzzles, and smart humans for solving mysteries, and a marketing discipline that allows for synthesis of the two.

Jim Meskauskas is a co-founder and Chief Strategic Officer of Media Darwin, a consultancy specializing in strategic planning of commercial communicative action. He’s a medialogist who has spent the last 20 years living, breathing and thinking about how to use media to move people to action. Outside of that, his likes are horror movies, Southeast Asian cuisine, his wife and his cat — not necessarily in that order. His dislikes are mean people, people who text while walking in or out of the subway entrances, pestilence, war, famine and death.