
M.S. Data Journalism
Journalism in the 21st century involves finding, collecting and analyzing data for storytelling, presentation and investigative reporting. The journalism school offers a Master of Science in Data Journalism for students interested in advanced skills.
"Before coming to Columbia, I didn't know anything about web scraping; I had never used Sequel before. Those are all great things to have on your resume, along with anything about Python. I hadn't gone too deep into any of those data programming languages before studying here."
Curriculum
Data Journalism students begin the program in the fall with foundational computational and data courses alongside courses on the fundamentals of reporting. In the second semester, they continue honing their journalistic skills in Writing with Data and the Data, Computation, Innovation workshop, where they explore cutting-edge storytelling using data and computation, and they take a 15-week seminar and production course with the Master of Science students. In the final semester, students work on the Master’s Project, a substantive piece of data-driven journalism, and take Storytelling with Data and Data, Computation, Innovation II. They also join the Master of Science students in Journalism Essentials, a suite of courses covering the business, historical, legal and ethical issues of the field.
Students who enroll in the data journalism degree are not eligible for admission to the Stabile investigative or documentary programs. They may, however, take investigative, video and other classes. Data journalism courses include, but are not limited to:
Courses Available:
- Data Visualization
  - Taught by Mark Hansen, David and Helen Gurley Brown Professor of Journalism and Innovation and Director of the Brown Institute for Media Innovation
- Algorithms
  - Taught by Dhrumil Mehta, Associate Professor of Professional Practice and Deputy Director of the Tow Center
- Data, Computation, Innovation I
  - Taught by Jonathan Soma, John S. and James L. Knight Professor of Professional Practice in Data Journalism and Director of the Data Degree Program
- Information Warfare
  - Taught by Emily Bell, Leonard Tow Professor of Professional Practice and Director of the Tow Center for Digital Journalism
- Writing with Data
  - Taught by Justin Elliott, Adjunct Assistant Professor of Journalism
Who Should Apply
The M.S. in Data Journalism provides the hands-on training needed to tell deeply reported data-driven stories in the public interest.
The current era needs journalists who can extract stories and meaning from data and massive information flows. The program trains students to confidently use data to report compelling stories.
Applicants do not need to have experience with data or computation to enroll in this three-semester program. All students are required to attend foundational courses that allow those with no data experience to hone their skills in data acquisition, extraction and analysis.
More Classes
Semester 1
Reporting I
In this introductory reporting course, each student will be assigned a beat and will be expected to produce news stories on deadline. Students will learn to think like reporters and to practice the core skills of the trade: developing sources, conducting interviews, structuring a story, writing clearly and getting the facts right. As data journalists, they will also seek out and analyze data, both to deepen their reporting and to identify promising leads. In this way, the tools and techniques learned during the summer will be immediately applicable as data students begin to develop a journalistic mindset and the capacity to find and produce journalistic stories.
Foundations of Computing
This course introduces programming and data analysis in Python, giving students a foundation for future coding-intensive classes and journalistic work. After this course, students will be able to find and execute solutions to most of the coding- or data-related problems they encounter in the newsroom. The course focuses on data cleaning and analysis using Python, the command line, Jupyter Notebooks and the data package pandas.
Reporting II
Students will continue to learn how to apply their data and computational skills to real-world journalism. They will hone their ability to construct a narrative from both quantitative and qualitative sources, think critically, report under deadline and document their work so that others can replicate and critique it.
Data & Databases
Students will become familiar with a variety of data formats and methods for storing, accessing and processing information. Topics covered include comma-separated values (CSV) files, interaction with website APIs and JSON, raw-text document dumps, regular expressions, SQL databases and more. Students will also tackle less accessible data by building web scrapers and converting difficult-to-use PDFs into usable information.
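As a rough illustration of the formats this course names, the sketch below parses comma-separated text, reads a JSON payload, pulls phone numbers out of raw text with a regular expression, and totals a column with a SQL query. It is not taken from the course materials: the sample data, the agency names and every function name are hypothetical, and it uses only the Python standard library (`csv`, `json`, `re`, `sqlite3`).

```python
import csv
import io
import json
import re
import sqlite3

# Hypothetical sample data, invented purely for illustration.
CSV_TEXT = """agency,year,complaints
Parks,2022,14
Transit,2022,87
Transit,2023,95
"""
JSON_TEXT = '{"results": [{"agency": "Parks", "budget": 1200}, {"agency": "Transit", "budget": 5400}]}'
RAW_TEXT = "Contact Transit at 212-555-0142 or Parks at 212-555-0199."


def load_complaints(csv_text):
    """Parse comma-separated text into dicts with numeric fields converted to int."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {"agency": row["agency"], "year": int(row["year"]), "complaints": int(row["complaints"])}
        for row in reader
    ]


def load_budgets(json_text):
    """Read an API-style JSON payload into an {agency: budget} mapping."""
    return {item["agency"]: item["budget"] for item in json.loads(json_text)["results"]}


def find_phone_numbers(text):
    """Extract US-style phone numbers (NNN-NNN-NNNN) from raw text."""
    return re.findall(r"\b\d{3}-\d{3}-\d{4}\b", text)


def total_complaints_by_agency(rows):
    """Load rows into an in-memory SQLite table and total complaints per agency."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE complaints (agency TEXT, year INTEGER, complaints INTEGER)")
    conn.executemany("INSERT INTO complaints VALUES (:agency, :year, :complaints)", rows)
    totals = conn.execute(
        "SELECT agency, SUM(complaints) FROM complaints GROUP BY agency ORDER BY agency"
    ).fetchall()
    conn.close()
    return totals


if __name__ == "__main__":
    print(total_complaints_by_agency(load_complaints(CSV_TEXT)))
    print(load_budgets(JSON_TEXT))
    print(find_phone_numbers(RAW_TEXT))
```

In practice the CSV and JSON would come from files or a website API rather than inline strings; the parsing steps stay the same.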
Semester 2
Data Studio
In this project-driven course, students work on their own projects and learn everything from obtaining and cleaning data to data analysis and final presentation. Data is explored not only as the basis for visualization, but also as a lead-generating foundation, requiring further investigative or research-oriented work. Regular critiques from instructors and visiting professionals are a critical piece of the course.
Writing with Data
This class will build upon the introductory reporting class and focus on honing the use of data and computation to find and tell stories.
Data, Computation, Innovation I
Students build on the lessons from the Data Analysis Studio class and deepen their learning of the concepts and foundations of data visualization, from chart-building to human perception. They will use tools such as the D3 JavaScript visualization framework for building custom interactive graphics and web-friendly maps.
Semester 3
Storytelling with Data
By dissecting pieces ranging from prize-winning stories to their own work, students will learn the standard of work an editor expects when they pitch and execute a data story. Students will apply their advanced technical skills to ask the right questions of data sets and to communicate findings accurately and accessibly, while avoiding pitfalls common to data-driven pieces.
Data, Computation, Innovation II
Machine learning and data science are integral to processing and understanding large data sets. Whether you're clustering schools or crime data, analyzing relationships between people or businesses, or searching for a needle in a haystack of documents, algorithms can help. Students will generate leads, create insights, and evaluate how to best focus their efforts with large data sets. Topics will include building and managing servers, linear regression, clustering, classification, natural language processing, and tools such as scikit-learn and Mechanical Turk.
Journalism History & The Business of Journalism
You will take one of these modules in the first half of the semester and the other in the second half. Journalism History explores the historical development of the values, practices and social roles that cluster around the institution of journalism. The Business of Journalism will help you understand the challenges and vicissitudes of this period of historic flux in the journalism industry.