Tell stories with data.
Journalism in the 21st century involves finding, collecting, analyzing and visualizing data for stories. The Journalism School offers foundational courses in data‐driven journalism as well as a three-semester M.S. degree in Data Journalism for students interested in advanced skills.
We are in the midst of a data explosion. It’s estimated that 2.5 quintillion bytes of data are created every day. Successful reporters must be equipped to make sense of data, critically assess its origins and limitations, and use it to deliver accurate and compelling stories to a variety of audiences.
The Journalism School’s data offerings continue to increase and advance, allowing students to take a variety of data and computation courses along with more traditional journalism classes. Foundational courses allow those with no data experience to build skills in data acquisition, extraction and analysis. More advanced courses include data visualization, journalistic computing and data‐driven journalism on specific topics, such as science, U.S. politics and cross‐border reporting.
In addition, the Journalism School offers an M.S. in Data Journalism for students who want deeper and more advanced training in data and computational skills. The School also has a dual degree program in Journalism and Computer Science in which students learn the fundamentals of reporting and writing while developing a working background in computer science and software design. The Lede certification program, meanwhile, equips students with the computational skills to turn data into narrative.
Whatever their degree choice, students have many opportunities to incorporate data and data analysis into their reporting. M.A. students study statistics in Evidence & Inference and learn about data reporting in M.A. Essentials. The result is a strong grounding for all students to use data to advance their journalistic work.
In the fall, a foundational, seven-week Data I class teaches students enrolled in the 10-month M.S. degree the basics of using spreadsheets and databases for reporting. For more advanced M.S. students, a Data II class explores various tools for accessing, manipulating and publishing data.
The following are introductory classes for students in the 10-month M.S. program:
This course teaches students how to evaluate and analyze data for appropriateness, context and meaning. Students leave the class knowing how to apply basic statistical methods to numerical data sets. They will also learn how to obtain, clean and load various types of commonly encountered data. They will be drilled on devising interesting, thoughtful and answerable questions to ask of data sets. They will also be taught how to translate the results of their data analysis into clear and concise findings. Visualization in this course will be used primarily for data analysis and story formation, not publication.
This course is designed to give students who have taken and passed (and hopefully enjoyed) an introductory statistics course a more advanced treatment of the process of storytelling with data. This includes frameworks and tools for finding, accessing, manipulating and publishing data (APIs, various databases, and some techniques for data "cleaning"); simulation-based approaches to statistical inference when data have special designs (surveys, A/B testing); "models" for data and the stories they tell (regression, trees); and advanced tools for visualization (to explore data, the effects of data processing, and models). Throughout, we will emphasize best practices for documenting your code and analysis ("showing your work").
Professor: Mark Hansen
In the spring, M.S. students can choose from a selection of 15-week data journalism and computation workshops. In addition, several of the spring classes incorporate a data component in their coursework. The following are introductory data classes for students enrolled in the 10-month M.S. degree.
Students in the data degree take a separate menu of classes throughout the three semesters they are enrolled in the School. Students in the dual degree program with Computer Science have a mandatory fall class, The Frontiers of Computational Journalism, a hands-on introduction to the areas of computer science that are directly relevant to journalism.
The following are 15-week data classes offered in the spring semester and open to all master's students:
Please note: The classes listed here represent recent offerings at the Journalism School. Choices vary each semester depending on faculty availability and other considerations. Classes described now may change or be dropped to make room for new additions.
Lydia Namubiru, '16 M.S. Data Concentration, is back in her native Uganda, working on a technology platform that will use text messages to connect professional journalists with sources in small towns and to aggregate the news from these communities. She is also teaching data journalism in Kampala.
Students in Professor Mark Hansen's computational journalism class contributed to "The Follower Factory," a New York Times report on fake accounts in social media networks. It found that some 48 million Twitter accounts may be automated and designed to simulate real people.
While enrolled in the data concentration, Kevin Sun, '17 M.S., examined how Australia is making covert propaganda videos to discourage asylum seekers from reaching its shores. What started as a Medium blog for one of his classes ended up running as a story for Quartz.
Columbia fellows and Toronto Star reporters reviewed thousands of pages of bankruptcy and other documents in three countries and interviewed dozens of people in Canada, the U.S., the U.K. and Israel to uncover "How every investor lost money on Trump Tower Toronto (but Donald Trump made millions anyway)."
David and Helen Gurley Brown Professor of Journalism and Innovation; Director, David and Helen Gurley Brown Institute of Media Innovation
John S. and James L. Knight Professor of Professional Practice in Data Journalism
Adjunct Faculty; Director, The LEDE Program