If ever there was a story that illustrates the importance of mathematics in everyday life, this is it. Like many people looking for love in L.A., Chris McKinlay had joined OKCupid. The way the site matches people up is simple: it lets users answer any number of questions out of a pool of thousands and rate how relevant each one is to their lives. Then, using this data, the site combs other profiles looking for like-minded matches. This makes sense, unless you happen to answer questions that no one else in your area deems very important. Then you don’t ever show up as a potential match for them. Answering the wrong questions, in effect, makes you invisible.
McKinlay’s searches turned up fewer than 100 women who were over 90 percent compatible with him according to their questions. One hundred, out of 2 million in Los Angeles, 80,000 of whom are using the site. So McKinlay decided to boost his chances by applying some math to the problem. First, he had to determine which questions mattered to the kind of women he could connect with. Once he knew these questions, he could answer them honestly and see who in that group of women was a match for him.
With a bit of Python scripting, he created 12 different accounts and started going through hundreds of profiles of straight and bisexual women between 25 and 45, looking for the questions they had answered. Then he started grouping the women into clusters based on their characteristics. Because OKCupid only lets you see answers to questions you yourself have answered, McKinlay set up his dummy profiles to randomly answer a lot of these questions. But OKCupid doesn’t like data-harvesting, so as soon as the dummy profiles had accumulated data for a thousand accounts, OKCupid shut them down. McKinlay upgraded the operation, making his dummy profiles simulate human click-through rates and answering speed.
Three weeks later, he had harvested 6 million questions (and answers) from some 20,000 women in the United States. Of course, information is useless unless you know what to do with it. What McKinlay did with it is nothing short of brilliant — he applied a modified Bell Labs algorithm called K-Modes that clumps data into clusters.
“First used in 1998 to analyze diseased soybean crops, it takes categorical data and clumps it like the colored wax swimming in a Lava Lamp. With some fine-tuning he could adjust the viscosity of the results, thinning it into a slick or coagulating it into a single, solid glob,” writes Wired’s Kevin Poulsen. “He played with the dial and found a natural resting point where the 20,000 women clumped into seven statistically distinct clusters based on their questions and answers.”
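For the curious, here is roughly what clumping categorical data looks like in code. This is a minimal sketch of plain k-modes in Python, not McKinlay’s modified version, and the tuning “dial” Poulsen describes is not modeled here. The function names, the toy answers, and the choice of starting points are all made up for illustration. The idea: measure how different two people are by counting the questions they answered differently, assign each person to the nearest cluster “mode” (the most common answer to each question), then recompute the modes and repeat until nothing changes.

```python
from collections import Counter


def mismatch(a, b):
    """Count the positions where two answer records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(records):
    """Most common value in each column of a non-empty list of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(records, initial_modes, max_iter=20):
    """Plain k-modes: assign records to the nearest mode, then update modes."""
    modes = list(initial_modes)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(modes)), key=lambda j: mismatch(r, modes[j]))
            for r in records
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(len(modes)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # keep the old mode if a cluster is empty
                modes[j] = column_modes(members)
    return labels, modes


# Hypothetical toy data: each tuple is one person's answers to three questions.
answers = [
    ("yes", "no", "often"),
    ("yes", "no", "sometimes"),
    ("yes", "yes", "often"),
    ("no", "yes", "never"),
    ("no", "yes", "never"),
    ("no", "no", "never"),
]

# Assumption: start from two hand-picked records instead of random ones.
labels, modes = k_modes(answers, initial_modes=[answers[0], answers[3]])
print(labels)
print(modes)
```

On this toy data the first three people land in one cluster and the last three in the other. Real runs usually start from random modes and are repeated several times, since the result depends on where you start.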
Among these clusters, he found two that contained people he could really relate to. These clusters would give him the questions he needed to answer in order to find a woman who was compatible. Using his sample, he picked out the 500 questions most popular with both clusters and created two profiles, one optimized for each cluster. Because he was looking for a real connection, McKinlay answered the survey questions honestly. The key here was knowing which questions to answer, not how to answer them. Obviously he could have lied, but what’s the point of building a mathematical model to find love if you’re going to settle for someone who’s not really a match?
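The question-picking step is easy to picture in code, too. The sketch below is one possible reading, not McKinlay’s actual script: it assumes “most popular” means “answered by the most women in the cluster,” and the function name and question IDs are invented for the example.

```python
from collections import Counter


def top_questions(answered_by_member, n):
    """Return the n question IDs answered by the most members of a cluster."""
    counts = Counter(
        q for questions in answered_by_member for q in set(questions)
    )
    # Sort by descending count, then by ID so ties come out in a stable order.
    ranked = sorted(counts, key=lambda q: (-counts[q], q))
    return ranked[:n]


# Hypothetical cluster: each inner list is the questions one member answered.
cluster_a = [["q1", "q2", "q3"], ["q1", "q3"], ["q1", "q4"]]
popular = top_questions(cluster_a, 2)
print(popular)
```

Here `q1` was answered by all three members and `q3` by two, so those are the two a profile aimed at this cluster would answer first.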
He did take cues from his sample on how much importance to assign each question, though. And he also used the characteristics of the clusters to determine what to highlight in his bio and what photo of himself to use on his profile. Anyone who’s ever made a dating profile will probably agree: that’s not lying. That’s working a competitive edge. Hey, this is L.A. We do what we can to stand out. We have to.
The moment of truth had arrived. He ran a search and voila: women appeared matched at 99 percent. Thousands of profiles scrolled by before the match percentage dropped below the 90s. Next he built a program to visit the pages of the women who matched him best with each profile. OKCupid pings users whenever someone visits their profile, so each visit would signal his interest. Would anyone bite? We’ve all heard that women don’t lift a finger online, but that wasn’t the case for McKinlay. The messages poured in, around twenty per day. He responded to the ones that came from women with interesting profiles, or women who said something engaging. Bad one-liners work on no one.
Now he had to go on some dates.
“It was scary,” McKinlay admits. “Up until this point it had almost been an academic exercise.”
The dates were exhausting, though. He ended up deleting one of the profiles to keep up, since too many of its matches lived in East L.A. Of the 55 dates he’d been on over the three-month period, only three had led to second dates. Then at some point before date 90, he got a ping from Christine Tien Wang, who’d gotten him as a 91 percent match during a search for tall, blue-eyed dudes near UCLA. You can read the rest of the story at Wired, but I’ll tell you this much: it’s a really sweet ending.
Header image by Jimmie.