What is a computational journalist?

A friend posed this question on Facebook in response to my last blog post, and I was tempted to respond, “We’re still figuring it out.” Then I was tempted to be glib and say, “It’s CAR (computer assisted reporting) on the Information Superhighway.” There’s a sense in which both of these statements are true, and yet, some things can be said with a fair degree of confidence.

Computational journalism is the application of computing tools and processes to the traditional tasks of defining, gathering and presenting news. This definition is what I was reaching for in my May 2009 essay, “How Computational Thinking is Changing Journalism and What’s Next.” As Adrian Holovaty explained in a September 2006 blog post, computers aggregate and manipulate structured data, so we make the best use of the technology when we organize our content accordingly. This not only means cataloging our content in ways that make it easier to find (SEO metadata, tags, links and trackbacks, for example), but choosing the most efficient and effective forms of information-gathering and presentation for the task and audience at hand.

One example that I used in my essay involved building a module into a local newspaper’s content management system that would pick up specific pieces of metadata from a wire service’s RSS feed (such as the time stamp and the dateline) and automatically dump the headline into a breaking news field that loads on the front page.
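For readers who want a concrete picture, here is a rough sketch of what the feed-reading half of such a module might look like. This is a toy, not the system I described: the sample feed, the field names and the `latest_breaking_item` function are all invented for illustration, and a production module would handle multiple feeds, malformed XML and de-duplication.

```python
import xml.etree.ElementTree as ET

# A made-up wire-service feed, standing in for a real RSS URL.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Wire Service</title>
    <item>
      <title>Storm closes area schools</title>
      <pubDate>Mon, 15 Nov 2010 06:12:00 GMT</pubDate>
      <description>TRENTON - Schools across the county closed early...</description>
    </item>
  </channel>
</rss>"""

def latest_breaking_item(feed_xml):
    """Pull the newest item's headline and timestamp out of an RSS feed."""
    root = ET.fromstring(feed_xml)
    item = root.find("./channel/item")  # first item assumed to be newest
    return {
        "headline": item.findtext("title"),
        "timestamp": item.findtext("pubDate"),
    }

breaking = latest_breaking_item(SAMPLE_FEED)
# In the CMS module, this value would populate the breaking-news field.
print(breaking["headline"])
```

The point is how little code the core task takes once the feed's structure is known; the real work, as always, is in the error handling and the editorial rules around it.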

This kind of automation is one way in which computing technologies can help make the newsgathering process more efficient and timely. Megan Taylor’s July 2010 post for Poynter reported on how companies such as the New York Times are building applications that automate the retrieval and manipulation of certain kinds of information, such as congressional votes. Taylor also noted that news operations routinely employ algorithms, or step-by-step procedures that can be codified and sometimes translated into software applications that aid reporting and editing. The third important quality is abstraction, which is a way of generalizing about objects or processes. For example, this web page is governed by a cascading style sheet that is built on a set of abstractions such as “text,” “header,” “link,” “post” and “footer.” Each of these “objects” has properties, such as font, color and alignment, that define its “style.” The webpage interacts with a database organized according to its own set of abstractions.
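To make the idea of abstraction a little more concrete, here is a toy sketch (in Python rather than CSS itself) of how a style sheet’s abstractions might be modeled. The object names and property values are invented for illustration; they are not the actual rules governing this page.

```python
# Each page "object" (header, post, footer...) is an abstraction whose
# style properties (font, color, alignment) are defined in one place,
# much as a cascading style sheet defines them for every matching element.
STYLES = {
    "header": {"font": "Georgia", "color": "#222222", "alignment": "center"},
    "post":   {"font": "Verdana", "color": "#000000", "alignment": "left"},
    "footer": {"font": "Verdana", "color": "#666666", "alignment": "center"},
}

def style_of(element):
    """Look up the style rules that govern a given page element."""
    return STYLES[element]

# Change the "post" abstraction once, and every post on the site changes.
print(style_of("post")["font"])
```

That is the payoff of abstraction: the designer reasons about “posts” and “headers,” not about thousands of individual paragraphs.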

Why is this useful for the non-programmer journalist to understand? For one thing, I’ve found it helps me understand what programmers are talking about when we are collaborating. For example, when I worked with my computer science colleague Monisha Pulimood and our students to create the content management systems for our campus online magazine Unbound and our Interactive Journalism Institute for Middle Schoolers, our programmers had to ask detailed questions about the journalists’ workflow in order to create the databases and interfaces for each system. It took a while to understand what was most useful and relevant on both sides when we worked on Unbound, but the process was much smoother during the IJIMS project because we were more practiced at the conversation.

Computational journalism includes, but is not limited to, computer-assisted reporting.

In her 2009 report, “Accountability through Algorithm: Developing the Field of Computational Journalism” (.pdf), Sarah Cohen, Duke University’s Knight Foundation Chair in Computational Journalism, envisions new tools that will help reporters gather, analyze and present data and interact with news consumers and sources in more efficient, useful and engaging ways.

One simple example is Gumshoe, the database manager that Pulimood and her students built to help another TCNJ journalism colleague, Donna Shaw, analyze data she’d obtained about the disposition of gun crimes in the Philadelphia municipal courts. Using a sample of data from just a two-month period in 2006, Shaw and her students were able to document the fact that hundreds of cases weren’t going to trial, often because evidence and/or witnesses disappeared. Shaw’s findings were part of the document trail that led to “Justice: Delayed, Dismissed, Denied,” a multi-part Philadelphia Inquirer series on problems in the Philadelphia court system that ran in 2009. (One of the reporters on that project, Emilie Lounsberry, has since joined our TCNJ journalism faculty.) (Reference)
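To give a flavor of the kind of question a tool like Gumshoe answers, here is a toy sketch using Python’s built-in SQLite library. The table layout, docket numbers and dismissal reasons are all invented for illustration; this is not the project’s actual schema or data.

```python
import sqlite3

# A tiny in-memory stand-in for a court-disposition database.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE cases (
    docket TEXT, charge TEXT, disposition TEXT, reason TEXT)""")
conn.executemany("INSERT INTO cases VALUES (?, ?, ?, ?)", [
    ("06-1001", "gun possession", "dismissed", "witness failed to appear"),
    ("06-1002", "gun possession", "trial",     ""),
    ("06-1003", "armed robbery",  "dismissed", "evidence unavailable"),
    ("06-1004", "gun possession", "dismissed", "witness failed to appear"),
])

def dismissal_reasons(db):
    """Count, for each stated reason, the cases that never reached trial."""
    return db.execute("""
        SELECT reason, COUNT(*) AS n FROM cases
        WHERE disposition = 'dismissed'
        GROUP BY reason ORDER BY n DESC""").fetchall()

for reason, n in dismissal_reasons(conn):
    print(f"{n} case(s) dismissed: {reason}")
```

A single grouped query like this is what turns a pile of individual case records into a pattern a reporter can write about.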

Social network analysis is another great computational tool. I really like this 2006 project created by students from Emerson College that illuminated how social networks affected the transmission of health information in Boston’s Chinatown. The network maps are accompanied by a series of video podcasts about health care issues in the neighborhood.
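For readers curious about the mechanics, here is a toy sketch of the simplest network measure, degree centrality: ranking people or places by how many direct ties they have. The network below is invented for illustration, not the Emerson students’ data, which came from actual reporting in the neighborhood.

```python
# A made-up map of who shares health information with whom.
network = {
    "clinic":     ["elder_assn", "grocer", "pharmacy"],
    "elder_assn": ["clinic", "grocer"],
    "grocer":     ["clinic", "elder_assn", "pharmacy"],
    "pharmacy":   ["clinic", "grocer"],
}

def degree_centrality(graph):
    """Rank nodes by number of direct ties: a first-pass measure of who
    is best placed to spread (or bottleneck) information."""
    return sorted(graph, key=lambda node: len(graph[node]), reverse=True)

# The most-connected hubs come first.
print(degree_centrality(network))
```

Even this crude measure can point a reporter toward the people worth interviewing first; real projects layer on richer measures and, crucially, the shoe-leather reporting behind the data.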

News games are another important area of development, and I think that collaboration between journalists and game developers is going to lead to the emergence of multithreaded interactive non-fiction narratives. Another TCNJ colleague, Ursula Wolz, has been helping me think about the possibilities of this field for the last several years. In 2007, we published a paper and a Poynter.org post outlining our idea for a multi-threaded non-fiction storytelling engine. We’ve made progress since then, which I hope to be able to demonstrate in more detail in the coming months. For the moment, here is a very primitive example of a fictional multithreaded story that I wrote in Scratch using a simple storytelling engine that Wolz wrote for my interactive storytelling class last spring. (This was actually part of a larger collaboration supported by the CPATH Distributed Expertise project, which Wolz and I will be presenting, along with our Villanova colleagues, Tom Way and Lillian Cassel, at the SIGCSE conference next March.)
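To give a sense of what “multithreaded” means here, this is a toy branching-story engine, sketched in Python rather than Scratch: each node holds a passage of text plus the choices that lead onward. The story content is invented, and this is far simpler than the engine Wolz actually wrote.

```python
# Each node: (passage text, {choice label: next node}).
# An empty choice dict marks an ending. Story content is invented.
STORY = {
    "start":   ("You arrive at the council meeting.",
                {"stay": "meeting", "leave": "street"}),
    "meeting": ("The vote on the budget begins.", {}),
    "street":  ("Outside, protesters are gathering.", {}),
}

def run_path(story, choices, node="start"):
    """Follow one reader's sequence of choices; return the passages seen."""
    passages = [story[node][0]]
    for choice in choices:
        node = story[node][1][choice]   # branch to the chosen next node
        passages.append(story[node][0])
    return passages

# Two readers, two different stories from the same reporting.
print(run_path(STORY, ["stay"]))
print(run_path(STORY, ["leave"]))
```

The journalistic challenge isn’t the engine, which is trivial; it’s keeping every path through the graph factually accurate and narratively honest.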


Endnotes

  1. Shaw, Donna, Sarah Monisha Pulimood, and Emilie Lounsberry. “The Gumshoe Project: A Model for Collaboration Between a Small College and a Large Newspaper.” Paper presented at the annual meeting of the Association for Education in Journalism and Mass Communication, Denver Sheraton, Denver, CO, Aug. 4, 2010.
  2. (with U. Wolz) “Multi-threaded Interactive Storytelling for Literary Journalism,” The New Media Consortium Summer Conference 2007: Sparking Innovative Learning and Creativity, invited expanded paper, http://www.nmc.org/publications, pp. 38–45, 2007.

When artists and scientists collide: Steve Harrison on collaboration

Steve Harrison is an architect by training whose work in academia and industry has crossed into engineering, computer science and interactive media. He is also a provocative thinker about the value of cross-disciplinary collaboration in research and teaching. Steve is my co-PI on the CPATH Distributed Expertise project funded by the National Science Foundation; the PI is Lillian (Boots) Cassel. In this video, Steve talks about what it’s been like to build collaborations between scientists, engineers and artists at Xerox PARC, and between computer science, art, design and media students at Virginia Tech.

[bubblecast id=290102 thumbnail=475×375 player=475×375]

Electricity for Haiti: Wisdom from Africa

Not surprisingly, a dearth of electrical power is one of the major obstacles to rescue and recovery efforts in Haiti. While rescuers struggle to get emergency power in place, I’m thinking that low-tech inventions coming out of Africa might be helpful. For years, Afrigadget.com has been tracking such inventions. Here are a few, drawn from that site and links from there, that might be helpful in Haiti right now:

Recycled car batteries as generators:

Bicycle powered cell-phone charger:

The solar FLAP (Flexible Light and Power) messenger bag:

Wearable flexible solar-paneled vest

Distributed Expertise in Enhancing Computing Education With Connections to the Arts

I’ve written quite a bit about my work on the IJIMS project, but it’s not my only major research project. I’m also co-PI on another exciting NSF-funded project (Award #0829616) that involves creating model curricula and resources that connect computer science education with other disciplines. The formal name of the project is Distributed Expertise in Enhancing Computer Science Education With Connections to the Arts, or Distributed Expertise for short.

The PI for the project, Lillian Cassel, has been thinking about these issues for a long time.

Last spring, I team-taught a game production class with my TCNJ colleague Ursula Wolz, in parallel with a game development class at Villanova taught by our colleague Tom Way. We used a PBworks Wiki and Skype to manage the distance collaboration. You can explore the documentation here:

Meanwhile, our colleague at Virginia Tech, Deborah Tatar, team-taught an ethics class with a colleague in Ireland. I’ll post a link to more information about that project soon.

This semester, I’m working with Wolz and Way again, coordinating my interactive storytelling class with Wolz’s game production class and Way’s software engineering class. Wolz will also be working with Way’s computing with images class. We are running separate classes, but will use material generated by each other’s students to form the basis of specific assignments. It’s going to be an interesting and exciting semester.

I also want to start a series of conversations about how to make these kinds of collaborations work, and extend them to more institutions. Part of our vision is that this could be a way of providing CS expertise to disciplines that are becoming computing dependent, such as journalism, while helping CS students understand the nuances of working with content from different knowledge domains. Also, we hope that this can become a model for augmenting the resources of financially strapped institutions, such as small liberal arts colleges and HBCUs.

I plan to do some blogging in this space about our experience, as well as the general question of how these kinds of collaborations can work. I really look forward to comments and feedback.