December 27, 2016
It’s been almost a year since my last blog post, so I thought it might be time to provide an update for anyone interested in what’s been happening with the Structured Stories project.
During the first half of 2016 I focused on research work, much of it associated with my fellowship at the Reynolds Journalism Institute at the University of Missouri. Early in the year I used the Wordsmith tool from Automated Insights Inc. to demonstrate use of the Structured Stories database to drive natural language generation – so-called ‘automated journalism’. Later in the spring I worked with a University of Missouri team to conduct a broad survey of user comprehension of structured stories, using a greatly simplified user interface and a small set of stories tied event-by-event to text articles. Publication of academic work has continued past the end of the fellowship, including delivery of a joint paper at AEJMC in Minneapolis in August, and delivery of an overview paper at the Computing News Storylines workshop in Austin in November, titled “Computable News Ecosystems: Roles for Humans and Machines”. Other papers about aspects of the project, by myself and others, are in progress, and by the time the dust settles there should be another 3 or 4 published papers about the project, touching on reporting efficiency, story consumption and automated journalism. More on that to come.
But despite this academic success, and despite the substantial interest in Structured Stories from the structured journalism community, it has become clear to me that the journalism world is not well-suited to long-term research and development (R&D) projects. Journalism has no tradition of reimagining its fundamental components and assumptions, no institutions charged with exploring technology-heavy alternatives and, critically, no appetite for investment in work that might pay off years in the future. Fortunately, however, there is another domain that shares most of the characteristics and challenges of journalism but which does have a long tradition of supporting long-term R&D – the intelligence domain.
The intelligence community’s essential task is to gather information from a wide variety of sources and to refine and contextualize that information into products that help decision-makers understand a complex world. Despite recent controversies about access to communications metadata, the overwhelming majority of information processed by intelligence agencies these days is actually open source intelligence – ‘OSINT’ in the acronym-heavy government vernacular. The intelligence community employs tens of thousands of analysts to find and organize all this information, and relies largely on text documents to synthesise and communicate its work. Its workflows and processes parallel journalism in many ways, and the too-much-text problem is becoming as critical in intelligence circles as it is in news. Unlike the journalism ecosystem, however, the intelligence world has the ability and willingness to fund long-term R&D of technology-based solutions.
Beginning in mid-2016 I have therefore been adapting Structured Stories for use in intelligence applications, funded by a small research grant and aided by a research company with experience in government R&D funding. Although it is still early in this transition, results have been promising and the prospects for further funding for 2017 and beyond appear encouraging. Much of the functionality addressed by this new development work is directly applicable to journalism, and it is my intention to incorporate it into a journalism-facing product sometime in the future. I don’t expect to publish anything about the intelligence applications of Structured Stories until at least late 2017, but the ‘Computable News Ecosystems’ paper mentioned earlier describes the general system pretty well.
Developing an alternative to text articles as units of news is an audacious goal that will likely take years, and a small but necessary detour may be the quickest way to reach it. I am still deeply committed to offering a data-centric alternative to our rapidly collapsing text-centric news ecosystem. I intend to keep the existing Structured Stories site alive, and hopefully maintained, and I intend to remain active in structured journalism forums and conversations. I invite anyone interested in the project, or interested in helping out, to get in touch at any time.
July 7, 2015
It has been an intense time for the Structured Stories project – mostly due to the ongoing Structured Stories NYC experiment. And while I haven’t been very active on this blog, much has been written about our experiment on the RJI blog, in a Nieman Lab article, and on the Reporter’s Lab blog, which is becoming a really interesting record of the experiment.
A deluge of excellent reporting has been pouring in from the impressive and productive team in New York (Ishan Thakore, Natalie Ritchie and Rachel Chason), along with a deluge of revelations, clarifications, consternations and realizations – all valuable learning that could only have resulted from real reporting. The quantity of rich information being generated about the reality of structuring news events and narratives far exceeds what was available before this experiment, and will take months to fully digest. This is the first time that Structured Stories has been used in a production setting and, while many bugs and inadequacies have been revealed, the team has nonetheless been able to successfully use the software to continually capture events and stories, enabling us to explore the editorial aspects of the approach.
Some of the things we’ve learned are substantial. The range of situations described in the FrameNet semantic database seems to be sufficient to cover the majority of news events that we are seeking to report, which is very encouraging. A significant proportion of reporting seems to involve speech acts and other forms of communication by characters – probably to an extent that will require special handling of those kinds of events. There seems to be a previously unappreciated challenge in distinguishing between the structuring of events from language and the structuring of events from ‘models’ of stories. There are several built-in trade-offs in the nature of the event frames that we are creating – for example general vs specific, or across multiple FrameNet frames – which will probably require a shallow taxonomy of event frames (as FrameNet itself already has). Reporting and editing tools for handling characters and entities that have no external knowledge graph references will need to be substantially improved.
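To make the idea of a shallow taxonomy of event frames more concrete, here is a minimal sketch in Python. It is not the Structured Stories data model: the class, the field names and the two example frames are all hypothetical, and the FrameNet labels are given as illustrative strings rather than checked against the FrameNet database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventFrame:
    """A hypothetical event frame: a reusable template for one kind of news event."""
    name: str                               # illustrative frame name
    framenet_frame: str                     # FrameNet frame it draws on (label assumed)
    roles: tuple                            # role slots to be filled by characters and entities
    parent: Optional["EventFrame"] = None   # more general frame, if any

    def lineage(self):
        """Return this frame's name followed by its ancestors, most specific first."""
        frame, chain = self, []
        while frame is not None:
            chain.append(frame.name)
            frame = frame.parent
        return chain


# A general speech-act frame and a more specific frame beneath it.
communicates = EventFrame("communicates", "Statement", ("speaker", "message"))
criticizes = EventFrame(
    "criticizes",
    "Judgment_communication",
    ("speaker", "target", "reason"),
    parent=communicates,
)

print(criticizes.lineage())
```

In a sketch like this the general vs specific trade-off becomes an explicit choice of how deep in the lineage to report an event, and speech acts can be recognized by their shared ancestor.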
We have also learned much that will enable the development of nascent editorial guidelines to aid future structured reporting – how to define and choose event frames, how to choose between importance values and sub-narratives to represent detail, how to name characters and entities, how to systematically select external references for characters and entities, how to approach the specificity required for capturing structure. The list of software issues to be fixed and improved is also long and somewhat daunting. We have not yet come across any specific issue that suggests an insurmountable editorial barrier to the concept, although there are still lots of puzzles, questions, weirdness, vagueness and things-to-explore that may yet prove to be major challenges.
This isn’t easy. We are attempting to record general news events and news stories as structured data, which is a radical and unexplored notion. Success of any kind is not guaranteed and the events and stories that we are reporting and recording may be somewhat simplistic, coarse and clunky. All of this is, obviously, much harder than just writing more text. But we are actually reporting and recording general news as structured data. That is actually happening. For real.
June 8, 2015
In the month since my last blog post I’ve made three week-long trips across the country, engaging with the two communities most closely associated with Structured Stories – Journalism and Computational Narrative.
My first trip was to the Reynolds Journalism Institute at the University of Missouri in Columbia, where I am a fellow this year, conducting a formal evaluation of Structured Stories. I met most of the RJI leadership team, including Executive Director Randy Picht and Research Director Esther Thorson, and spent several days with my research partner at Mizzou, Frank Russell. I was impressed by the intellect and seriousness of everyone I met and I’m convinced that RJI is the perfect environment for a careful, thoughtful and credible evaluation of the concept. I will publish more details on our research program as Frank and I develop the particulars.
Trip two was to Atlanta, Georgia, where the 6th workshop on Computational Models of Narrative was taking place. I was there to deliver my paper on “Structured Narratives as a Framework for Journalism”, and this was the first time I had presented the Structured Stories concept to the computational narrative community. I was very pleased by the interest and response, and I came away feeling confident about the conceptual basis of the approach and about the place of Structured Stories within the field. I also met many fascinating people with long experience in representing narrative as data, made new friends at the workshop and over dinner each night, and was introduced to several other people doing interesting and related work in the Atlanta area.
The third trip was to New York, New York – specifically the exciting and fast-paced neighbourhoods of Soho and TriBeCa. This trip was for the training program for the team participating in the Structured Stories NYC reporting project. The team members are Ishan Thakore, Natalie Ritchie and Rachel Chason – all students from Duke recruited and guided by Bill Adair. We were also joined by several guests, and went from an introductory overview to structuring real events and stories in three days. It was an intense experience filled with interesting discussions and examples, and the high calibre of our reporters made me very pleased with how the project has kicked off. Structured events from the NYC project are already pouring in and stories should be up on the website within a few days.
I am now back in L.A. for at least the next 2 months, focused primarily on supporting the reporting team in NYC. The NYC project is critical because it will determine whether the Structured Stories concept is editorially feasible – i.e. can it work on real stories in a real reporting workflow. With this project we are exploring ‘structured editorial’ issues that are new for journalism, and we may uncover many unanticipated challenges and opportunities. These are still very early days for Structured Stories, but they are increasingly busy and filled with interesting engagement!
May 5, 2015
More good news!
I am very happy to announce that I have been selected as a 2015/16 Fellow at the Reynolds Journalism Institute, focused on the evaluation of structured narratives as a new framework for journalism using the Structured Stories platform. The fellowship will give me access to the deep academic and practical expertise of RJI and the wider Missouri School of Journalism community, and will help ensure that structured narrative gets wider exposure and a thorough evaluation as a possible new medium for journalism.
The timing for the fellowship could not be better. Reporting into the Structured Stories platform will begin soon, and evaluating news consumption using the accumulated reporting will be greatly aided by the involvement of experienced journalism researchers.
The RJI fellowship is non-residential. I will remain in Los Angeles for the duration and will travel frequently to RJI in Columbia, Missouri.
April 24, 2015
New York City, here we come!
I am extremely pleased to announce that a major structured journalism reporting project, using the Structured Stories platform and titled ‘Structured Stories NYC’, will take place in New York City this summer. The project is a partnership between the Reporter’s Lab at Duke University, the WNYC newsroom in New York and Structured Stories. Reporting will be done by a team of student reporters from Duke organized by Bill Adair – the new Director of the DeWitt Wallace Center for Media and Democracy at Duke University, creator of PolitiFact and Pulitzer Prize winner – and the reporting team will be led by Ishan Thakore from Duke.
The Structured Stories NYC project has been made possible by a generous grant from the Online News Association’s Challenge Fund, which was announced earlier today at the Journalism Interactive 2015 conference.
This project marks a significant milestone in the progress of Structured Stories, because it will provide an opportunity to develop real-world editorial guidelines and processes for structured journalism, and will result in a substantial body of reporting in structured form. The Structured Stories approach to local government journalism may offer a new way for citizens to quickly understand large, long-term, sprawling local government stories – the kind of stories that would otherwise need to be followed closely over a long time to be deeply understood.
A detailed description of the Structured Stories NYC project can be found on the ONA Challenge Fund website here.
Also, on a more technical note, my paper titled “Structured Narratives as a Framework for Journalism: A Work in Progress”, has been accepted for the 6th Computational Models of Narrative workshop (CMN’15), which will be meeting at Georgia Tech in Atlanta on May 26-28. The paper is a technical description of the Structured Stories technology and data structures, and of the journalistic basis of the approach. I will be presenting the paper at CMN’15 and will link to it here after it has been published.
These are very good developments for the Structured Stories project, and offer opportunities to truly explore and evaluate structured narrative as an alternative approach to journalism. If you are interested or intrigued then please get in touch!
You can reach me on email at david[at]structuredstories.com and on Twitter at @StructStories.
December 31, 2014
With just a few hours left in 2014 I figured it was time for a quick recap of the year that was and a quick preview of the year to come. I also just pushed a major update to the ‘production beta’ code, including the ability to use Facebook reference IDs to define characters.
A year ago Structured Stories was just a goal, a pile of research notes, some nascent ideas and a primitive prototype. The data model and architecture stabilized in Q1 and design and coding of the application began in Q2. The early beta application launched in October and has been continuously improved since then. Today it is an increasingly stable tool with an accessible user experience and a robust API, and users can use it to consume and create structured events and structured stories. Judge for yourself here.
2015 is about transitioning the Structured Stories project from coding to journalism. The beta application will be function-complete within a few weeks, and by February I hope to have editing tools in place sufficient to enable user-entered stories to become permanent. At that point the project changes from a technical focus to a journalism focus – coding will scale back to mostly bug fixes, and the creation and growing of stories will become the primary activity. Local government news in Los Angeles remains the domain.
Focusing on journalism requires discovering, understanding and addressing a complex set of editorial challenges that will probably be at least as daunting as the technical challenges of the past year, including:
- Creating and applying editorial guidelines for the creation of event frames.
- Creating and applying editorial guidelines for the entry of events and the creation of stories.
- Building the event frame library from just over 100 frames now to several thousand frames.
- Developing an editorial process that can accommodate many contributors to stories and that can support coherent editing of events and stories.
- Understanding how story and event editing and maintenance actually work and building tools to support those activities.
- Observing and reacting to how real users use Structured Stories to create and consume stories.
- Discovering how to educate early users about Structured Stories, its functionality and utility.
The primary usefulness of the beta application in 2015 is to enable these editorial challenges and others to be clearly identified, defined and addressed. There is much to do and much to learn, but there has also been some progress. The Structured Stories concept works technically, and if it can also work editorially then it may be useful.
September 22, 2014
The first releasable version of Structured Stories is almost ready and will launch in October in five languages (English, Spanish, Swedish, Irish Gaelic and Indonesian), focused on local government news in Los Angeles. Yes, you read that correctly.
I have been demonstrating the application in Los Angeles since August and will demonstrate it to interested people at the Online News Association conference (ONA14) in Chicago in late September. I invite anyone who would like to see it to either meet me at ONA for a live demo or to get in touch for a remote demo (my email address is david[AT]structuredstories.com). In the meantime I am including a short series of screenshots below to convey the gist of the application.
After many months of immersion in the design and creation of the public-facing Structured Stories application I believe that there are at least six major novel functions that this approach can provide to news consumers:
- It makes news permanent. By enabling news to accumulate over time in a way that can be consumed intuitively and naturally, Structured Stories turns news from flow into stock – building a permanent history from news streams.
- It enables highly efficient consumption of news stories. By organizing news as narrative structures instead of as written text, Structured Stories makes it simple to navigate and understand vast and sprawling news stories that would otherwise be inaccessible without significant research. Navigating single stories at the scale of libraries becomes possible.
- It makes news universal. Structured Stories does not use language as its primary representation mechanism for news, and therefore any news can be easily consumed in any language. The prospect of a single, global news platform that is equally accessible to anyone in almost any language becomes a very real possibility.
- It enables queries of news and reasoning on news. Because Structured Stories are, well, structured, they are accessible to explicit search queries (think SQL, not Google) and are also available to computational reasoners. The application of computational tools like machine learning to all/any news events is realistic.
- It separates the method of storing news from the method of consuming news. There are many, many ways to tell the same story, and Structured Stories enables all of them, using built-in features, custom ‘discourse elements’ and through a Structured Stories API that enables unique story readers, story viewers or other story display concepts. Even video is possible.
- It publishes journalistic news events as Linked Open Data. By providing each individual news event, no matter how small, with its own unique address on the Internet, Structured Stories can open up news sharing and news mashups in lots of new and exciting ways – facilitating entirely new forms of discussion, verification and validation.
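As a rough illustration of what ‘think SQL, not Google’ could look like, here is a minimal sketch using Python’s built-in sqlite3 module. The table layout, the example.org URIs and the events themselves are invented for the example and are not the Structured Stories schema or API; the point is only that events stored as records with their own addresses can be answered with an explicit query rather than a keyword search.

```python
import sqlite3

# In-memory database holding a few invented structured events.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        uri         TEXT PRIMARY KEY,  -- each event has its own address
        story       TEXT,
        frame       TEXT,              -- kind of event
        actor       TEXT,
        occurred_on TEXT               -- ISO date, so string order is date order
    )
    """
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
    [
        ("https://example.org/event/1", "street-repair", "proposes", "Council Member A", "2014-08-04"),
        ("https://example.org/event/2", "street-repair", "votes_on", "City Council", "2014-08-19"),
        ("https://example.org/event/3", "street-repair", "criticizes", "Council Member A", "2014-09-02"),
        ("https://example.org/event/4", "water-rates", "proposes", "Council Member B", "2014-09-10"),
    ],
)


def events_by_actor(conn, actor, since):
    """Return (uri, frame, date) for every event by `actor` on or after `since`."""
    cursor = conn.execute(
        "SELECT uri, frame, occurred_on FROM events "
        "WHERE actor = ? AND occurred_on >= ? ORDER BY occurred_on",
        (actor, since),
    )
    return cursor.fetchall()


for row in events_by_actor(conn, "Council Member A", "2014-08-15"):
    print(row)
```

The question ‘what has this council member done since mid-August?’ has an exact answer here, which is not something a keyword search over text articles can offer.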
The Structured Stories approach to news also has characteristics that enable multiple novel functions for news producers. By separating reporting from writing it becomes practical as an adjunct to existing newsroom processes – a typical local government newspaper story of 3-4 core events can be entered into Structured Stories in under a minute. At the same time it also enables the possibility of ‘re-bundling’ news as value-accumulating networks and has the potential to reconcile civic journalism with professional journalism and editorial oversight in an economically sustainable way.
A demonstration is highly recommended.
Screenshots of the Structured Stories BETA application:
A detailed description of the Structured Stories technology in PDF form is available here.
May 21, 2014
The New York Times internal report on innovation, leaked last week in the aftermath of Jill Abramson’s exit, has set news innovation circles abuzz and is filled with many observations and recommendations that argue for more structure-centric journalism – including structure based on story. The business and product case for news producers to extend the ‘value half-life’ of their journalism by adding structure certainly seems to be building rapidly, and for a very comprehensive treatment of that phenomenon I highly recommend Reg Chua’s Structured Journalism blog.
Structured Stories fits well with this increasing attention to structure in news production. The technology can be interpreted as a generalized mechanism for structuring news, enabling the practical application of structure beyond one-off domains like homicides, recipes, etc. to include news of any kind, in any domain. Such a generalized mechanism for fully structuring news, if productized and if successful, could be economically positive for news producers by somewhat restoring the old ‘news bundle’ in the form of interconnected and proprietary data.
It is important to keep in mind, however, that news production is just one side of the news ecosystem and that massive challenges also exist in the consumption of news. These challenges have been well described by Tony Haile, the C.E.O. of Chartbeat, in a recent article informed by Chartbeat’s unique view into the click-by-click realities of digital media consumption – realities that include 55% of clicks resulting in less than 15 seconds of user attention, and that show no discernible relationship between the sharing of content and user attention to that content. This describes a media ecosystem in which news consumers are expected to act as their own ‘do-it-yourself’ content editors; to pick through, assess and accept or reject content, one article at a time, every day. This is, as software developers say, ‘suboptimal’.
Structured Stories can help here too. By converting news into a ‘permanent record’, accessible on the consumer’s own terms and timeframe, and by providing a naturally intuitive framework by which news can be accessed, navigated and queried, Structured Stories can provide consumers with a new editorial structure for news. Replacing the text article with the structured narrative as the primary ‘unit of news’ would ground every news event as a URI on the semantic web, and therefore would provide a permanence and a degree of interconnection that is inconceivable within the current article-centric news ecosystem. Structure adds editorial value to information, and structured stories can add editorial value to news.
Update: I am currently heads-down on API design and architecture, and I hope to post a detailed technical description of that in the next few weeks. Also, I received feedback suggesting that an overview presentation describing the Structured Stories concept would be useful in interpreting the demo, and I should have that available soon.
December 23, 2013
As 2013 wraps up I have been reviewing progress since beginning the Structured Stories project some 5 months ago. The key question at this stage is whether the Structured Stories technology is currently delivering an experience of news that is genuinely different from existing digital news channels. I believe that it is, although whether others will agree with me, or whether that experience is different in a useful way, or whether that difference can feasibly be delivered at scale are all questions that can only be answered by releasing a public newsreader and then gathering and assessing analytics from actual use.
Nonetheless I think it might be useful to describe this different experience of news as perceived by a few early testers using the development UI. I cannot publish full details about the technology or about the experience that it delivers until I have some intellectual property protection in place, but I can give an overview of the general texture of the experience.
Keep in mind that the events and causal relationships that are currently stored in the prototype narrative structures are solely within the domain of Los Angeles city government during November and December 2013. Quantitatively this represents approximately 25 stories, most of which contain between 5 and 15 events – an event density that is very roughly equivalent to about 80 ‘traditional’ news articles about L.A. local government. The capture/reporting of these events is also very elementary and is done from mainstream press reports, from a daily summary document provided by the L.A. City Clerk’s office and from blog posts and press releases from key characters in the stories (primarily L.A. city council members).
Given all these caveats, my major observations are as follows:
1) Efficiency of news consumption. Readers seem to be able to consume a lot of ‘real news’ in a short time. In terms of ‘number-of-events-per-minute’ the improvement seems to be substantial – possibly 5x or more. Furthermore, this efficiency seems to be achievable with improved comprehension, as measured by an informal assessment of ability to answer key questions about the narrative. More systematic measurement of the efficiency of news consumption in a production environment is possible via formal user testing and will provide definitive metrics on this.
2) Integrated path to detail. The effect of integrated access to ancillary information or content relevant to either individual events or to narratives is quite striking. Exploring within the Structured Stories environment (e.g. quotes or other discourse elements) is useful, but it is the effortless direct access to raw sources, to knowledge graph or Wikipedia entries, to a variety of discourses (articles, etc.) about particular events/narratives and to other related linked information that is most powerful. Furthermore, accessing external resources from the Structured Stories UI does not feel like being ‘interrupted’ while consuming a narrative via a text article, but instead feels naturally integrated into the experience of consuming the narrative. This ability to access ancillary information is not merely interesting; it also enables one to feel in full control of the interpretation of events and narratives. Some measurement of this phenomenon is probably achievable with relatively standard analytics.
3) A sense of coherence. Subjectively, consuming news via the Structured Stories environment just feels right. It feels complete, authentic and efficient – one evaluator described it as combining the ease of bullet points with the depth of long-form articles. Part of this probably comes from readers’ natural affinity for clean narrative structure, but the experience of coherence is also likely delivered through the clear visibility of events, of their relationship to other events and of their place within the overall narrative – all of which make it easy to continually ‘know where you stand’ while consuming a narrative. I am seeking a quantifiable metric for this sense of coherence, and I suspect that it may be the key enabler of the seemingly improved comprehension of narratives mentioned earlier.
4) A sense of control. The experience of consuming news within the Structured Stories prototype environment is primarily an experience of control. You are constantly making decisions about what to pursue, how much detail you require, how much ‘color’ you require, whether you need to or want to read external discourses/articles, etc. There is no sense of having to invest in consuming part of a story before knowing whether it is interesting, or of being passively fed a stream of content morsels, or of a mismatch between your required level of detail and the level of detail in the presentation, or of suspicion about the value or credibility or relevance of either content summaries or the content itself, or of regret after investing in consuming a story. It is not a search experience, because the entire experience is guided by the narrative structure, but it is ‘search-like’ in its level of control and engagement – for better or for worse.
5) Permanence. It is probably a little early to be commenting on the ‘sense of permanence’ that is engendered by the Structured Stories environment, but I think that it is important. Although completely focused on news, the Structured Stories environment is engaged with as ‘stock’ rather than as ‘flow’. Some events in the Structured Stories narratives date back years, because they are the causes of current events. Other events arise and are added to existing narratives, even though those narratives may have already seemed ‘complete’. There are far fewer stories than articles and these stories and their constituent events are permanent artifacts – much more like Wikipedia than like digital media news streams, even for same-day breaking news. This changes how they are perceived, especially when engaging with the same story as it develops over days or weeks or, eventually, years.
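The ‘stock’ quality described in the last point can be sketched with a small causal graph. This is only an illustration under stated assumptions: the events, their dates and the `caused_by` mapping are invented, and the prototype’s actual storage is not described here. The sketch shows how a current event can pull in causes from years earlier, and how a later event joins an existing story without anything being rewritten.

```python
from datetime import date

# Invented events: id -> (date, short description).
events = {
    "e1": (date(2011, 3, 1), "Council defers street repair budget"),
    "e2": (date(2013, 6, 12), "Audit reports a repair backlog"),
    "e3": (date(2013, 11, 20), "Council member proposes a repair bond"),
}

# Causal links: effect id -> ids of its direct causes.
caused_by = {"e2": ["e1"], "e3": ["e2"]}


def causal_history(event_id, caused_by, events):
    """Return ids of all direct and indirect causes of an event, oldest first."""
    found, stack = set(), list(caused_by.get(event_id, []))
    while stack:
        cause = stack.pop()
        if cause not in found:
            found.add(cause)
            stack.extend(caused_by.get(cause, []))
    return sorted(found, key=lambda eid: events[eid][0])


# A later event is added to the existing story; earlier events are untouched.
events["e4"] = (date(2013, 12, 5), "Committee schedules a hearing on the bond")
caused_by["e4"] = ["e3"]

for eid in causal_history("e4", caused_by, events):
    print(events[eid][0], events[eid][1])
```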
There are other potentially valuable characteristics of the Structured Stories approach that are not directly related to user experience, such as its compatibility with mobile and cross-platform consumption, its generation of rich analytics and the natural suitability of narratives for sharing on social media, etc. All of that, however, is irrelevant unless the deep experience of using this approach is genuinely different – and these very early, subjective and informal observations suggest that it might be. I hope to open the demonstration site to the public by April 2014 so that anyone who is interested can judge for themselves.
November 6, 2013
It’s not news that the news industry is experiencing a period of remarkably intense change, driven by the rise of digital technology and the resulting demise of barriers-to-entry and bundled media products. There is no shortage of commentary, analysis and discussion about how this change is affecting the production and distribution of journalism – for example in the excellent report by C.W. Anderson, Emily Bell and Clay Shirky titled “Post Industrial Journalism”, published recently by the Tow Center for Digital Journalism at Columbia University. These are indeed interesting times for the production and distribution of news.
We see, however, relatively little analysis or discussion of how these changes are affecting the consumption of journalism by individuals. Questions that seem central to the very purpose of journalism remain either intuitive or unasked: Why do people even consume news? What are the fundamental elements of news, as experienced by the individual? What is the best way to organize news in terms of the individual needs that it addresses? How do the reasons for consuming news vary by individual, by context, by subject matter, etc? How well does the news environment satisfy those reasons? Are those reasons changing? The lack of obvious interest in these questions is particularly surprising given the vigorous search for innovative ways to offer value in the digital media environment, and suggests that it has been difficult to identify or articulate a ‘theory of news’ that could serve as a starting point for exploring the consumption side of the new marketplace for news.
It should come as no surprise that the hypothesis that I propose as the basis of a possible ‘theory of news’ is centered on the critical role of narrative, or story, in people’s perception of their world. While it is obvious that news is about ‘stories’ in the intuitive or colloquial sense, it is perhaps less obvious that the growing body of cross-disciplinary work on narrative can be directly and formally applied to understanding the consumption side of digital media or that such an understanding of news consumption might be used to produce useful media products.
Take, for example, an experience that is probably familiar to anyone who consumes news from many different online sources (news sites, Twitter, Facebook, personalized streams, news apps, blogs, etc.). This behavior can be interpreted as being about seeking new stories that are interesting to that consumer and seeking new developments in interesting stories that the consumer is already aware of. As the consumer develops an interest in a particular story they must therefore personally ‘de-duplicate’ events across multiple documents in order to construct the full story in their mind, and their personal experience in doing this becomes something like ‘knew that’ – ‘knew that’ – ‘knew that’ – ‘that’s new!’ – ‘knew that’ – ‘knew that’, etc. They are forced, in effect, to act as their own editors – a new role for consumers that was much less necessary in the pre-digital media environment. This frustrating experience is exacerbated by the very tools provided by digital media sites in an effort to simplify the discovery of interesting content, such as document-oriented topic clustering, entity-based personalization of news streams or search tools. From the consumer’s perspective, navigation and consumption based on stories and their constituent events rather than on documents would probably be more satisfying.
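The ‘knew that’ – ‘that’s new!’ experience can be written down as a short sketch. It rests on a large assumption: that the events in each document have already been extracted and normalized into identical tuples, which is the hard part and is not attempted here. The function name, the sources and the events are invented for illustration.

```python
def unseen_events(articles):
    """Yield (source, new_events) per article, keeping only events not seen before."""
    seen = set()
    for source, events in articles:
        fresh = []
        for event in events:
            if event not in seen:  # 'that's new!'
                seen.add(event)
                fresh.append(event)
            # otherwise: 'knew that'
        yield source, fresh


# Invented events as (frame, actor, object, date) tuples.
bond = ("proposes", "Council Member A", "repair bond", "2013-11-20")
hearing = ("schedules", "Budget Committee", "bond hearing", "2013-12-05")

articles = [
    ("news site", [bond]),
    ("blog", [bond, hearing]),
    ("Twitter", [bond]),
]

for source, fresh in unseen_events(articles):
    print(source, "->", len(fresh), "new event(s)")
```

Once events rather than documents are the unit, the de-duplication that each reader now performs in their head becomes a single set lookup performed once for everyone.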