

Data and Analysis Come to the Newsroom
Posted by J.T. Johnson
The digital information revolution presents new opportunities and new responsibilities for journalists. In the past 15 years, the often-limited application of digital information resources and tools has come to be called Computer-Assisted Journalism. The term is somewhat confusing because users of the phrase often neglect to differentiate mere number crunching from the theoretical knowledge necessary to know what numbers to crunch. Moreover, the term mistakenly implies that computer-assisted journalism is somehow different in form and function from computer-assisted data and information management used by other disciplines.

A more exact term is Analytic Journalism, wherein data is identified, retrieved, analyzed, and communicated via a variety of media. Here, the emphasis is not on computers, but on analysis—drawn from a broad spectrum of disciplines—and communication of the results of that analysis in both traditional and evolving journalistic styles and media.

A Little Background
Analytic Journalism was born in the early evening of the first Tuesday in November 1952. The presidential election that year pitted Dwight Eisenhower against Adlai Stevenson. Reporters on the campaign trail traveled on trains or propeller-driven planes. They wrote their stories on portable Underwoods and Royals or dictated them over the phone to rewrite men back in their newsrooms. That is, they dictated when long-distance lines were available and the cost was bearable. If not, they sent telegrams by Western Union.

On that Tuesday night in November, however, when the vote counts started coming in, journalists at CBS News had a new tool in their New York City "Election Central" TV studios. It was the UNIVAC I computer. The machine could store only 1,200 characters (compared to the millions of characters that can be stored in today’s PDA), but it was of impressive size and speed for its day.

CBS’s Charles Collingwood enthused about its ability to project the results of the election based on early returns and statistical analysis, and that night, UNIVAC I delivered the goods. With a mere 8 percent of the votes tabulated, it predicted Eisenhower would win 43 states with 438 electoral votes. That projection would prove to be off by just four electoral votes from the final result of 442.

UNIVAC I was a visible, tangible tool. But to focus on vacuum tubes and switches misses its true importance to contemporary journalism. The machine was not, and is not, the important entity or concept. Instead, journalists today must recognize that data of all sorts, from the full text of the Patriot Act to vote counts to SEC filings, exists somewhere in the world in the form of 1s and 0s. And as 1s and 0s, that data can be easily moved at the speed of light, reformatted for textual, visual, and quantitative analysis, and redistributed essentially without cost, save for the journalist’s time.

Digital Data
The practice of storing, retrieving, and analyzing data in digital form is actually more than 100 years old. The U.S. government used a punch card system designed by Herman Hollerith to record, store, and analyze data gathered in the 1890 Census. Hollerith's binary system, in which particular spots on the cards were either punched or not punched, was the direct forerunner of today’s electronic databases. The system marked not only the beginning of a dramatic change in the storage of data, but also the beginning of a specialized language that could not be read without the aid of a codebook and mechanical devices.

Today, in the United States, journalists must assume that the "paper trail," as it applies to contemporary government, no longer exists. The business of government is conducted with word processors, spreadsheet files, and database records. Since mid-1993, all U.S. government "open records" have been available in computer-readable form. Corporate filings to the Securities and Exchange Commission are now on the Internet, as are the records pertaining to legislation in the U.S. Congress and in multiple state and local jurisdictions. The 1990 Census was published in digital form a year before printed versions reached depository libraries, and the 2000 Census will never be found in ink-on-paper format, at least not from the federal government. Simply put, journalists who cannot retrieve and analyze digital information lack the literacy to function in this age of digital information.

Consider: David Bloom of the Los Angeles Daily News regularly adds to his own database names of people appointed to commissions and agencies in his city. "I'm interested in how the people in power are doing appointing men versus women to commissions. In the agency I cover, it’s easy: The form that accompanies each appointee approval includes a check box for gender. I can see if certain members of the Board of Supervisors are naming too many men or women, and how things are going overall in gender fairness, a big issue on a board that only recently added its second female member. Not a major story, but an easy one. At the same time, the data can also be married to campaign contribution information for a story about who gets appointed: unsurprisingly, those who give, get. Just have your spreadsheet sort appointees by name and the official who named them, and compare it to contributions. Semi-instant story."
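The two comparisons Bloom describes can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the records, field names, and function names below are invented for illustration and do not come from his database, and matching appointees to donors by exact name is a simplification that real records (nicknames, middle initials, misspellings) would quickly defeat.

```python
from collections import Counter, defaultdict

# Hypothetical sample records; every name and amount here is invented.
appointments = [
    {"appointee": "Alice Rivera", "gender": "F", "official": "Supervisor X"},
    {"appointee": "Bob Chen", "gender": "M", "official": "Supervisor X"},
    {"appointee": "Carl Diaz", "gender": "M", "official": "Supervisor X"},
    {"appointee": "Dana Lee", "gender": "F", "official": "Supervisor Y"},
]
contributions = [
    {"donor": "Bob Chen", "recipient": "Supervisor X", "amount": 500},
    {"donor": "Bob Chen", "recipient": "Supervisor X", "amount": 250},
    {"donor": "Dana Lee", "recipient": "Supervisor X", "amount": 100},
    {"donor": "Eve Park", "recipient": "Supervisor Y", "amount": 1000},
]


def gender_counts_by_official(appointments):
    """Tally appointees by gender for each appointing official."""
    counts = defaultdict(Counter)
    for row in appointments:
        counts[row["official"]][row["gender"]] += 1
    return {official: dict(tally) for official, tally in counts.items()}


def appointees_who_gave(appointments, contributions):
    """List appointees who contributed to the official who appointed them.

    Assumes an exact name match between appointee and donor.
    """
    # Total contributions for each (donor, recipient) pair.
    totals = defaultdict(int)
    for gift in contributions:
        totals[(gift["donor"], gift["recipient"])] += gift["amount"]

    matches = []
    # Sort by appointee name, then by appointing official, as in a spreadsheet sort.
    for row in sorted(appointments, key=lambda r: (r["appointee"], r["official"])):
        key = (row["appointee"], row["official"])
        if key in totals:
            matches.append((row["appointee"], row["official"], totals[key]))
    return matches


if __name__ == "__main__":
    print(gender_counts_by_official(appointments))
    print(appointees_who_gave(appointments, contributions))
```

In practice the two lists would be loaded from the reporter's own spreadsheet or database export rather than typed in, and any match would be a lead to verify, not a finding.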

The Concept
Analytic Journalism is the merger of intellectual and journalistic methodologies. For example, six months after the 1992 riots in South Central Los Angeles, the Centers for Disease Control and Prevention dispatched a team of six medical epidemiologists to help sift clues, pore over records, interview witnesses, and search for patterns in the five days of violence. First, the investigators hoped to describe what happened in an epidemiological sense: how many people were injured, how they got hurt, and what patterns showed up in the chain of violent events. By studying these patterns, they hoped to glean insights into how cities might break that chain, or prevent it. This methodological search for patterns is an important aspect of Analytic Journalism. At the very least, a journalist today must understand the limits and potential of the methodologies used by the epidemiologists. And the best journalists will be familiar enough with such methodologies to employ aspects of them in their own research and reporting.

Journalists will always need to know how to write, but they have to have something to say, something that is grounded in data and analytic methods that go far beyond Who, What, When, Where, Why, and How.

---------

Mr. Johnson currently serves as a co-director of the Institute for Analytic Journalism and is a Professor of Journalism at San Francisco State University.