Building the bridge between journalism and computer science

Computational Thinking in Journalism: Recently Asked Questions

My Poynter.org essay on the need to foster computational thinking in journalism has generated responses ranging from skeptical to intrigued. In this post, I want to respond in particular to two of the questions and observations that have been raised.

I’ll take the comments at Poynter first. One commenter said that the essay should have been entitled, “Digital and Internet Tools Augment Analytical Thinking.”  This was my response:

Thanks for your comment. I don’t pretend to have a lock on this, and the responses help me refine my thoughts.

Your proposed title, “Digital and Internet Tools Augment Analytical Thinking,” accurately describes some of the traditional ways in which journalists have used computing technologies. What I’m suggesting, though, is that computational thinking can also help journalists conceive new tools, or new ways of applying existing tools.

This isn’t limited to the collection and presentation of editorial content. For example, BlogHer, Inc. runs a Flickr RSS feed of member photos tagged “BlogHer” below the nameplate. It’s a design feature that reinforces the company’s identity as an online community. That’s an example of computational thinking in design.

It’s largely a matter of understanding how to use the computing technologies in ways that are most effective in accomplishing your goals. Does that make sense?

Another Poynter commenter scoffed that the essay was “a mashup of every conceivable digital buzzword.”  She added,

Is ‘deconstructing algorithms’ some kind of code for managing searchablity?
Not sure why reasoning “abstractly” is a new goal – isn’t that a given?

The buzzword charge tells me I have to work a little harder at translating my thoughts.  I don’t know what “managing searchability” means, but I can tell you what I meant when I talked about deconstructing algorithms.  At a base level, I’m talking about having the computer literacy to understand that a search engine locates and categorizes information based on a set of rules.  If you understand the rules, you can do a better job of querying the search database and grouping the results.  For example, some search engines sort results on the basis of popularity, while others sort on the basis of some sort of credibility scheme.  Some search engines try to decipher the meaning of search terms and factor in other items that weren’t explicitly requested but might be related. These are called semantic search engines.
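To make the point about ranking rules concrete, here is a minimal sketch in Python. It is not how any real search engine works; the pages, the `inbound_links` popularity signal and the `source_score` credibility signal are all made up for illustration. It only shows that the same set of pages comes back in a different order when the rule changes.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    inbound_links: int   # hypothetical popularity signal
    source_score: float  # hypothetical credibility signal, 0.0 to 1.0


# Invented sample data, not real search results.
PAGES = [
    Page("Celebrity blog post on the budget", inbound_links=900, source_score=0.2),
    Page("City budget office report", inbound_links=120, source_score=0.9),
    Page("Local paper's budget analysis", inbound_links=400, source_score=0.7),
]


def rank_by_popularity(pages):
    """Order pages by how many other pages link to them."""
    return sorted(pages, key=lambda p: p.inbound_links, reverse=True)


def rank_by_credibility(pages):
    """Order pages by an assumed credibility score for the source."""
    return sorted(pages, key=lambda p: p.source_score, reverse=True)


if __name__ == "__main__":
    for label, rule in [("popularity", rank_by_popularity),
                        ("credibility", rank_by_credibility)]:
        print(label, [p.title for p in rule(PAGES)])
```

A reporter who knows which rule is in play can judge whether the top result is the most linked-to page or the most authoritative one, and adjust the query accordingly.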

This has very practical implications for the editorial and business sides of news organizations. As writers and editors craft headlines and article text with the goal of search engine optimization in mind, the differences in search algorithms might require different search strategies. This 2006 article comparing the Google, Yahoo and MSN search algorithms is instructive in that regard. This May 2009 blog post discusses Google’s plan to change its search algorithm to make it harder for spammers to achieve high rankings. The bottom line is that these differences in algorithms can cause the same site to have different rankings on different search engines.

The search and ranking algorithms within sites also matter. A February 2009 article from CNET reported on the controversy engulfing Yelp, a website that posts customer reviews of businesses and services, after a newspaper investigation disclosed charges that the company accepted bribes from businesses to delete negative reviews. According to the article, Yelp blames the problem on its algorithms.

There is also some debate about the biases that may be inadvertently reflected in sites that use popularity to determine what content is promoted. The arguments over sexism at digg.com are a prime example of this debate.

Finally, there is the question about “abstract thinking.” Of course, teaching abstract thinking is one of the goals of college education in any discipline. However, the concept of abstraction has a particular meaning in computer science. It’s a way of grouping similar types of information or procedures in order to simplify a computing operation. This is part of what Adrian Holovaty was getting at in his 2006 blog post, “A fundamental way newspaper sites need to change,” when he urged journalists to learn to think in terms of structured data. Bambooweb explains the concept well:

“In software development, abstraction is the process of combining multiple smaller operations into a single unit that can be referred to by name. It is a technique to factor out details and ease use of code and data. It is by analogy with abstraction in mathematics. …

Abstraction can be either that of control or data. Roughly speaking, control abstraction is the abstraction of actions while data abstraction is that of data structures. Control abstraction, seen in structured programming, is use of subprograms and control flows. Data abstraction is primary motivation of introducing datatype and subsequently abstract data types.”
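For readers who have never written code, a small Python sketch may help. The `Story` type and the two functions are hypothetical names I chose for illustration; they are not taken from Bambooweb or from Holovaty’s post. `Story` is a data abstraction (a story treated as structured fields rather than a blob of text), and `word_count` and `summarize` are control abstractions (several small steps bundled under one name).

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    """Data abstraction: a story as named, structured fields."""
    headline: str
    dateline: str
    paragraphs: list = field(default_factory=list)


def word_count(story):
    """Control abstraction: hides the loop over paragraphs behind one name."""
    return sum(len(p.split()) for p in story.paragraphs)


def summarize(story):
    """Builds on word_count without needing to know how it works."""
    return f"{story.dateline}: {story.headline} ({word_count(story)} words)"


if __name__ == "__main__":
    story = Story(
        headline="Council passes budget",
        dateline="TRENTON, N.J.",
        paragraphs=["The council voted Tuesday.", "The vote was 5-2."],
    )
    print(summarize(story))
```

Once a story is structured this way, the same fields can feed a web page, an RSS feed or a database query without anyone retyping them.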

The concept of abstraction can be challenging for those of us who have not been trained in mathematics or computer science. Part of my challenge in this collection of writings is to help translate the concept in ways that those of us with more traditional journalism backgrounds can understand.

A foundational concept for the new news economy

A journalist’s introduction to computational thinking

Kim Pearson, Department of English, Program in Interactive Multimedia, The College of New Jersey

Note:  A revised version of this essay has been published by the Poynter Institute’s E-Media Tidbits weblog.

I’m part of the post-Watergate generation of journalism school graduates, and right now I’m watching my peers struggle to master digital tools in an effort to stay relevant to an industry that is shifting ground under their feet. After years of working and collaborating with computer scientists at the forefront of the digital transformation of our culture, I’ve come to understand that what we need, most of all, is to master the fundamentals of what computer scientists have begun to identify as “computational thinking.” The good news is that there are so many parallels between computational thinking and the ways of knowing that are embedded in the practice of journalism that one of my collaborators, computer scientist Ursula Wolz, argues that there is an “isomorphism,” or functional equivalence, between the two fields.

What is computational thinking?

It’s a way of reasoning, and a way of defining problems, processes and relationships through which those problems are resolved. Jeannette Wing, a computer science professor at Carnegie Mellon University who also works at the National Science Foundation as Assistant Director for its Computer and Information Science and Engineering Directorate, has argued that:

Computational thinking involves solving problems, designing systems, and understanding human behavior, by drawing on the concepts fundamental to computer science. Computational thinking includes a range of mental tools that reflect the breadth of the field of computer science.

The website at the CMU Center for Computational Thinking elaborates concisely on Wing’s concept:

  • Computational thinking means creating and making use of different levels of abstraction, to understand and solve problems more effectively.
  • Computational thinking means thinking algorithmically and with the ability to apply mathematical concepts such as induction to develop more efficient, fair, and secure solutions.
  • Computational thinking means understanding the consequences of scale, not only for reasons of efficiency but also for economic and social reasons.

Computational thinking is more than digital literacy

Let’s begin with the obvious. Journalism had become a computing-dependent profession long before the online revolution upended the business models that had sustained the industry since the 1830s. Investigative journalists, particularly, have been using government databases for decades. They have been creating databases since the early 1990s, and it’s no accident that many of the Pulitzer Prize-winning stories over the last 15 years rely heavily on database reporting.

There’s no longer an argument about whether journalists need to be digitally literate. Today, newsgathering requires the ability to write programs that scrape public records databases and design interfaces that make the information in those databases interesting, relevant and accessible. It requires the programming and design skills to create interactive presentations that model complex public policy issues or explain social processes. It requires the mastery of social media technologies used to organize online communities around shared interests, issues and concerns.  It requires the ethical grounding needed to ensure that the content generated by these advanced tools is accurate, fair, comprehensive and proportional.

However, the digital transformation of newsgathering and delivery requires that journalists become creators, not just consumers, of computing technologies. I’m not saying that journalists need to become programmers. I’m saying that we need to be able to reason abstractly about what we do, understand the full palette of computational tools at our disposal, and collaborate to deploy those tools with maximum efficiency and effectiveness. That means understanding the underlying structures and processes of media creation.

What does that mean in practice?

Think about one of the basic functions of a local news operation: delivering occasional major breaking news bulletins. In the old days, an editor would tell a page make-up editor to tear up a front page to make space for a banner headline above the fold, along with a fast write-up of whatever information is available at the time, in inverted-pyramid style. There are rules – algorithms, if you will – that govern the entire process, from the fact that the headline has to contain a subject and predicate to the fact that there should be a dateline, and that sources should be authoritative and quotes should be pithy.

Now envision the same task in a modern newsroom. A programming-savvy editor will likely have worked with the site’s interactive editor to define a field within the site’s content management system called “Breaking News.” The most efficient policy would be to constrain headlines to 140 characters, and to have the RSS feed for the headlines linked to Twitter via an API. Similarly, the Twitter feed should dump to a Facebook status message, as well as to SMS subscribers’ news alerts. However, suppose the news site is a hyperlocal site without a full-time staff to actually develop the breaking news story. Assuming that the site is a member of the Associated Press or a similarly credible pool service, the programming-savvy editor can create a function (or have one created) that will post an AP story that meets pre-defined criteria for a breaking news story to its content management system as a draft for approval, then alert the editor. After vetting the story, the editor can release the story as-is, or quickly get additional value-added content. The editor’s knowledge of underlying computing structures and processes enhances the productivity and efficiency of the news operation.
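The workflow just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a real integration: the `WireStory` fields, the “urgent” priority flag and the region test are hypothetical stand-ins for whatever criteria a newsroom would define, and no actual AP, Twitter or content management system API is called.

```python
from dataclasses import dataclass

MAX_HEADLINE_CHARS = 140  # the headline limit suggested in the text


@dataclass
class WireStory:
    headline: str
    body: str
    priority: str  # hypothetical wire-service flag, e.g. "urgent"
    region: str    # hypothetical region code, e.g. "NJ"


def fit_headline(headline, limit=MAX_HEADLINE_CHARS):
    """Trim a headline so it fits the character limit."""
    if len(headline) <= limit:
        return headline
    return headline[: limit - 3].rstrip() + "..."


def is_breaking(story, home_region):
    """Assumed criteria: flagged urgent and about our coverage area."""
    return story.priority == "urgent" and story.region == home_region


def queue_breaking_drafts(stories, home_region, notify=print):
    """Save qualifying stories as drafts and alert the editor.

    Nothing is published here; every draft waits for human approval.
    """
    drafts = []
    for story in stories:
        if not is_breaking(story, home_region):
            continue
        draft = {
            "headline": fit_headline(story.headline),
            "body": story.body,
            "status": "draft",
        }
        drafts.append(draft)
        notify(f"Draft awaiting approval: {draft['headline']}")
    return drafts


if __name__ == "__main__":
    feed = [
        WireStory("Governor declares flood emergency", "...", "urgent", "NJ"),
        WireStory("Markets close higher", "...", "routine", "US"),
    ]
    queue_breaking_drafts(feed, home_region="NJ")
```

The design choice that matters is the `"status": "draft"` line: the program does the sorting, but the decision to publish stays with the editor.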

Here are some additional examples of how computational thinking is already changing the way we do journalism:

Traditional practice | Practice informed by computational thinking
Getting news tips from sources | Crowd-sourcing
Vetting information through multiple sources | Not only vetting information through multiple sources, but also deconstructing the algorithms used to assign credibility to said sources
Text stories in inverted pyramid or narrative format | Text stories “chunked,” with lede grafs, subheads and titles optimized for search engines
Headline writing for clarity and reader engagement | In addition, headlines are optimized for search engines and RSS readers
Spot news photos | Interactive photo slide shows, perhaps with audio narration, that might allow panning, zooming, or remixing the content depending upon the editorial intent
Layout for news value, advertising placement | Layout also based on requirements of multiple platforms, eyetracking, accessibility standards, microformats, and usability research
Information graphics | Interactive, database-driven information graphics segmented for easy blogging, tagging, twittering, embedding or mashing-up
Investigative reporting and analysis: text, images and other static, linear or tabular content | Pulling aggregated, time-stamped, geo-tagged data as part of the reporting process; creating or using social networks; user-generated content, appropriately vetted and sourced; interactive information graphics development using an appropriate web-development framework, database structure and user-centered interface design as part of the news presentation; text, audio, video (perhaps annotated and/or linked) and still images
Editorial art | Interactive web comics, games
Letters to the editor | Comments, social media functions, APIs and other tools for community-building and reader engagement (need to balance editorial judgment with community-building needs)

Best practices for computational journalism: a researchable question

Infusing computational thinking into journalism alters the epistemology of the field as fundamentally as the advent of objective reporting did 100 years ago. Formal journalism education emerged as part of the effort to codify and institutionalize the best practices of that day, and to serve a news industry oriented to an assembly-line-based manufacturing culture. A new journalism is emerging, grounded in computational thinking, that mimics the values and processes of knowledge production in the information age, what some experts call remix culture. (See Lessig, Navas, and Jenkins for more on that concept.) As Clay Shirky has argued, that new journalism requires prolific experimentation to help us discover sustainable business models that will support the civic functions of news.

Obviously, the marketplace will answer some of our questions. At the same time, scholars need to develop ethnographic models to help us understand how these emerging news practices work and how they affect our culture. We need assessment models to help us understand how the creation and presentation of online and interactive news and information affect learning, civic participation and community cohesion. Some of this is happening, of course; witness the work of MIT’s Center for Future Civic Media, for example. Our Interactive Journalism Institute for Middle Schoolers at The College of New Jersey, a National Science Foundation-funded demonstration project that uses interactive journalism to infuse computational thinking into the language arts curriculum, is another example.

This combination of marketplace experimentation and systematic documentation and reflection will yield a new set of best practices that will become the bedrock of journalism education in the future.  The actual tools that we use to implement those practices will continue to change.  However, if we educate ourselves properly, we can help to lead that change, ensuring that those evolving practices serve the best interests of democracy.