Image by Casey Chin
Dual degree students share how they went from writing code to writing for WIRED

How We Got Published in WIRED

When news broke that NBC had identified thousands of accounts tied to the Internet Research Agency (IRA), the Russian “troll army” accused of interfering in the 2016 U.S. election, students in the Spring 2018 session of Investigative Techniques began a class project on New Yorkers who had unknowingly interacted with the bots on Twitter. But as they mined data from thousands of conversations related to the accounts, they quickly realized they had something larger on their hands.

With tweets on topics ranging from schoolwork to music, a number of the conversations just didn’t seem like they came from Russian election bots. Nonetheless, these Twitter handles joined a list of more than 2,500 accounts that were not only suspended, but also submitted to Congress and later published in the public record as being potentially linked to the IRA.

As the students dove deeper to track down the real-life users behind these accounts, what emerged was a larger story about the gaps in big data and the need for congressional oversight and digital due diligence on the part of the platforms. Before the semester was over, the students were working with editors from WIRED, which published the story in July 2018.

Here, two of the story’s writers, Shreya Vaidyanathan and Erin Riglin, share how their class project came to be published.

How did the story develop?

ER: For our Investigative Techniques class we had to do a long-form data investigation of our choosing. The night before pitches were due, NBC released over 200,000 tweets from Russian propagandist accounts that had already been removed from Twitter; they recovered them by scraping internet archives. Although the dataset was limited, representing only about 10% of the accounts Twitter had declared to be associated with the Internet Research Agency at the time, it revealed how they operated.

We wanted to do a story about how New Yorkers interacted with those accounts, but when my teammates and I started to build a database of conversations people were having with them, we noticed a lot of routine topics about schoolwork, making plans to go out, and family, topics uncharacteristic of the IRA. A new story emerged.
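As a rough illustration only, a database of conversations like the one described here could be sketched with Python's built-in `sqlite3` module. The table layout, column names, and sample rows below are hypothetical and made up for the example; they are not the students' actual schema or data.

```python
import sqlite3

# Hypothetical rows: (who replied, which flagged account they replied to, text).
# These values are invented for illustration and are not from the real dataset.
sample_rows = [
    ("user_1", "account_a", "did you finish the homework?"),
    ("user_2", "account_a", "are we still going out tonight?"),
    ("user_3", "account_b", "say hi to your mom for me"),
]


def build_conversation_db(rows):
    """Load reply rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE replies (author TEXT, target_account TEXT, text TEXT)"
    )
    conn.executemany("INSERT INTO replies VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def replies_per_account(conn):
    """Count replies received by each flagged account, busiest first."""
    cursor = conn.execute(
        "SELECT target_account, COUNT(*) FROM replies "
        "GROUP BY target_account "
        "ORDER BY COUNT(*) DESC, target_account"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    db = build_conversation_db(sample_rows)
    for account, count in replies_per_account(db):
        print(account, count)
```

Grouping replies by the account they were addressed to is one simple way to pull up every conversation around a single handle so a reporter can read it in context.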

SV: We finally came across a tweet from the new account of Rebecca, one of our sources, referencing her old account, which she had lost access to. That was when we knew we had landed on a possible story. We followed up with her and began finding others who had similar experiences, and we detailed those in our final piece.

Did you experience any roadblocks or challenges along the way?

SV: Initially, we hit a technical roadblock in accessing Twitter data, since Twitter imposes a rate limit. We also had some challenges in contacting Twitter for comment.
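A common way to work within a rate limit is to retry with increasing pauses. The sketch below is a generic illustration, not the students' code: `RateLimitError` and `fetch_with_backoff` are hypothetical names, and `fetch` stands in for whatever function actually requests the data.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error raised by a client when the service says to slow down."""


def fetch_with_backoff(
    fetch: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), doubling the wait after each rate-limit error."""
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries; let the caller decide what to do
            sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter makes it possible to try the function without actually waiting, for example by handing it a list's `append` method to record the delays.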

ER: One of the biggest challenges was tracking down the people who owned the now-defunct accounts. We scoured all social media platforms for matches, spoke with their followers, and used clues in their limited tweets to direct searches. The first person we were able to get ahold of had also been contacted by researchers at Clemson University, whom she then put us in touch with. They were able to obtain a full archive of over 3 million tweets from the full list of IRA accounts, which we worked together to analyze.

There were many Twitter profiles we had suspicions about, but we couldn’t track down the people behind them. We suspect the scope of misidentification was larger than we were able to detail in the final story, but because of the challenges we faced in tracking people down, we couldn’t corroborate them all.

How did your professor help in the story’s development?

ER: Susan McGregor, our professor and program advisor, helped us ask the right questions about what the larger picture was beyond simple cases of misidentification.

It is quite common for algorithms that classify large amounts of information to produce a few false positives. But the fact that this list, containing what were in our opinion conspicuous errors, was provided to Congress, which then published it in the public record, raised larger questions about the trust our society places in tech companies. Having access to a wealth of experience from the journalism school’s faculty helped us pursue the proper channels to articulate the bigger picture.

How did you apply what you learned in your courses to the story?

SV: I was able to apply what I learned through my reporting class in our interviews. Our understanding of social networks and databases from a technical standpoint was also crucial in analyzing the data for the story.

ER: We had to quickly write a lot of code and build databases to find the leads, but the basics of reporting and investigation we learned in our journalism courses were key to bringing the story to fruition. The computational components will always supplement a story, but will never make it. 

How did you connect with the editors at WIRED?

SV: We were interested in pitching to WIRED, and Professor McGregor connected us to Scott Thurm, an editor to whom we could pitch the story. We sent a synopsis and a draft of the story and they were immediately interested in the piece.

How long did it take for the story to be published?

SV: It took about two months to get the final story published. We had to fine-tune and fact-check and make sure we were doing everything correctly.

ER: Although we already had 3,000 words, WIRED wanted to make sure we got all of the facts right. We worked with WIRED fact checkers, followed up with sources and pulled more data to ensure everything was airtight. It was the first time any of us had worked with a major publication and we were pleasantly surprised by the meticulous standards they required.

What impact has working on the story had on your life?

SV: I was really motivated to pursue this intersection of computer science and journalism with more vigor. It reinforced why this is important and that I am capable of making some difference in the world with my storytelling. The dual degree has given me a lot of exposure and opportunities that I think will be instrumental in my career going forward.

 

What made you pursue the dual master’s degree program at Columbia Journalism School?

SV: I was always interested in storytelling and in applying my skills in computer science/engineering in a meaningful and fulfilling manner. I came across the dual degree program at Columbia years earlier and decided that I wanted to pursue this combination. As soon as I received the admittance letter, there was no looking back.

ER: I was working as a software engineer and consultant and had reached the point where my career started to gain what felt like irreversible momentum. The work I was doing was challenging and rewarding, but I felt I wasn’t leveraging my luck to make a real impact. I was also deeply influenced by Anthony Bourdain’s raw accounts of the world at the time. So I quit my job and took off with a backpack to travel and do some volunteer work for a year, admittedly seeking adventure, but also to put a pause on things to reflect and figure out what I wanted to do with my life.

What I didn’t expect was the enthusiasm with which people shared intimate details of their lives, their culture and, often, the persecution they experienced. I wondered: with the explosion of technological interconnectivity, why can’t these people have their voices heard?

That’s when I decided to become a journalist. And when I looked into different universities, Columbia’s dual degree program with computer science introduced me to this young and exciting field of computational journalism. It was a perfect fit.
