Facebook Researcher: 'People Are More Predictable than Particles'
British scientist Stephen Wolfram has already developed two highly influential computational systems: Mathematica, an algebraic software program, and the popular search engine Wolfram Alpha. Now he's taking on Facebook's treasure trove of data, with results that would interest Mark Zuckerberg.
SPIEGEL ONLINE: Wolfram Alpha, the computational search engine you developed, just released a detailed report based on what people reveal about themselves on Facebook. How many people gave you access to their Facebook data for this research?
Wolfram: I don't have the precise number, but it's certainly over a million. You can now work out things like "how does the number of friends you have vary with your age?" or "given that I'm a particular age, what are the ages of my friends?" You find that with very young people, it peaks very sharply around their own age, and then it broadens out as people get older. The distribution of friends' ages is a function of a person's age. You can work out relationship status as a function of age -- with increasing age, the singles go down and the marrieds go up. When you compare that with data from the census bureau, it tracks very nicely, with some differences. There are people who claim they're married on Facebook even though they are 13, and that's obviously just for fun. You can also track the migration of people from different countries, based on what they say on Facebook about where they're from and where they're currently living.
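The idea of treating the distribution of friends' ages as a function of a person's own age can be sketched in a few lines of Python. This is only an illustration, not Wolfram Alpha's actual analysis: the function name `friend_age_profile`, the input format and the toy numbers are all assumptions made up for this example.

```python
from collections import defaultdict
from statistics import median, pstdev

def friend_age_profile(users):
    """Pool friends' ages by the user's own age (hypothetical helper).

    users: iterable of (age, list_of_friend_ages) pairs.
    Returns {age: (median friend age, spread of friend ages)}.
    """
    pooled = defaultdict(list)
    for age, friend_ages in users:
        pooled[age].extend(friend_ages)
    return {
        age: (median(ages), pstdev(ages))
        for age, ages in sorted(pooled.items())
        if ages  # skip users with no listed friends
    }

# Made-up toy sample, not real Facebook data.
sample = [
    (15, [14, 15, 15, 16]),
    (15, [15, 16, 14]),
    (40, [25, 33, 41, 52, 60]),
]
profile = friend_age_profile(sample)
for age, (mid, spread) in profile.items():
    print(age, mid, round(spread, 1))
```

In this toy sample the spread is narrow for the 15-year-olds and much wider for the 40-year-old, which is the shape of the pattern described in the answer above.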
SPIEGEL ONLINE: What else does the data tell you?
Wolfram: There are topics people discuss on Facebook, based on their gender and age, like movies or politics. Men are more interested in politics, and the amount men talk about politics increases with age. Compared with men, women seem to become less interested in writing about travel as they get older. And people talk about the weather more and more as they get older.
SPIEGEL ONLINE: You actually decipher what people talk about in their posts?
Wolfram: Yes, based on natural language processing. We're training a natural language classifier using large text corpora. It's a technology we haven't actually released. We get to use it before the rest of the world gets to use it. For example, if someone uses the word "movie" in a post, it's fairly probable that they're talking about movies.
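The "movie" example can be illustrated with a deliberately naive keyword lookup in Python. This is not the unreleased classifier mentioned above, which is trained on large text corpora; the topic names, keyword sets and the function `guess_topics` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical keyword lists; a trained classifier would learn such cues from data.
TOPIC_KEYWORDS = {
    "movies": {"movie", "film", "cinema"},
    "politics": {"election", "politics", "senate"},
    "weather": {"weather", "rain", "snow", "sunny"},
}

def guess_topics(post):
    """Return the sorted list of topics whose keywords appear in the post."""
    words = set(re.findall(r"[a-z']+", post.lower()))
    return sorted(topic for topic, kws in TOPIC_KEYWORDS.items() if words & kws)

print(guess_topics("Saw a great movie last night"))
print(guess_topics("Rain again, and the election is close"))
```

A lookup like this misses anything outside its word list, such as the title of a newly released film, which is where a much larger knowledge base would be needed.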
SPIEGEL ONLINE: Would you be able to spot the title of a newly released film as well?
Wolfram: That's the benefit of having this large natural language understanding system: We know the name of every movie. That helps in being able to decipher things like this.
SPIEGEL ONLINE: Has Facebook asked to license this? Your sample is one million or so, but theirs would be one billion people.
Wolfram: This isn't yet out and about in the world. But we obviously know the people at Facebook, so ... we'll see what happens. The good thing is: There is nothing in this data that could be embarrassing to Facebook. It's just a snapshot of the world. The interesting thing is: Generating all this stuff took about three weeks. This is just an example of what we can do with data.
SPIEGEL ONLINE: What's your take on Facebook's Graph Search?
Wolfram: There's an interesting endpoint, when you combine computational knowledge with this kind of personal analytics stuff. What's been done so far is an interesting start; they have some interesting user experience ideas. We'll see what happens in the future with our computational knowledge stack combined with that type of data. One thing that's been so satisfying about this is seeing how easy it is to do it. People have had access to various parts of this data for a while, but for whatever reason the tools they've been using have not made it easy enough for them to do the kinds of exploration that have been easy for us.
SPIEGEL ONLINE: Obviously the first people who would be interested in this sort of analysis are advertisers.
Wolfram: Yes. There are some interesting directions.
SPIEGEL ONLINE: You come from a hardcore natural science background; you started out as a particle physicist. Are the social sciences becoming interesting to you for the first time now because the available data allows a different, more computational approach?
Wolfram: I'm interested in science, but also, quite separately, I'm interested in people. I've been building a company for a long time and I'm interested in how talented people manage to do great stuff. I watch some of these things over the course of people's lives, and I have all kinds of theories on these things. And now I realize that on this project, for the first time, what I know in science intersects with what I've been interested in about people -- the trajectories of their lives. It's tantalizing that when you look at data about the dynamics of people, you get very precise curves.
SPIEGEL ONLINE: What does that mean?
Wolfram: We had a joke at our company a while ago: Our web analytics team was full of former experimental particle physicists. They were used to doing experiments on neutrinos or something, where they get data at some rate and make these plots on the behavior of particles and so on. The data rate in our web analytics system is about the same as the one they got in their particle physics experiments; the number of clicks is about the same as the number of particles going through a detector. The surprising thing is: The curves in web analytics are actually smoother than those they were used to in particle physics. People are, in a sense, more predictable than the quantum mechanics of particles.
Interview conducted by Christian Stöcker