Guardian News & Media
GNM RCS
Content processing
Functional specification
Prepared by O3 Team Limited
Authors Nigel Robson
Creation date 03/10/2013
Document Ref. GNM_RCS_Content_Processing_FS.docx
Version draft for review
.Introduction
Purpose
GNM publishes content in print and digital media, and reuses, redistributes and sells some of that content. RCS records the existence of content published on the website and in print, manages payments where required, and tracks the rights GNM holds in this content, which determine what may be done with it.
This document describes the various aspects of content management in RCS.
Scope
This is a high-level document outlining the main processing involved. It is not a detailed functional specification from which the system could originally have been developed.
Separate technical specifications document the implementation of these functions.
.Published content
GNM publishes content in print and on the website and other digital media, with the same content often published in multiple places.
Web content
GNM continuously publishes content on www.theguardian.com: this includes content that appears daily in print as well as web-only content.
XML feed
RCS receives an XML feed of content from the website that includes rich information about the content but does not include the content itself. The information about the content includes its size (word count, area, or duration), its headline or caption, which section it was published in, and details of who produced it or supplied it to GNM. All of this information is needed to identify who owns the copyright in the content, whether a fee is owed, and whether GNM has the right to repurpose the content or resell it.
Currently RCS is advised about the publication of these types of content:
Article
Audio
Cartoon
Competition
Gallery
Interactive
Live Blog
Picture
Poll
Quiz
Table
Trail block
Video
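As an illustration only, the kind of information carried by the feed could be read as in the minimal Python sketch below. The element and attribute names (content, type, headline, section, size, unit, contributor) are assumptions made for this example and are not taken from the actual feed definition; only the list of content types comes from this document.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

# Content types listed in this document.
CONTENT_TYPES = {
    "Article", "Audio", "Cartoon", "Competition", "Gallery", "Interactive",
    "Live Blog", "Picture", "Poll", "Quiz", "Table", "Trail block", "Video",
}


@dataclass
class FeedItem:
    content_type: str
    headline: str          # headline or caption
    section: str
    size: int              # word count, area or duration
    size_unit: str         # hypothetical values, e.g. "words" or "seconds"
    contributors: List[str]


def parse_feed_item(xml_text: str) -> FeedItem:
    """Parse one feed entry. All element and attribute names are assumed."""
    root = ET.fromstring(xml_text)
    content_type = root.get("type", "")
    if content_type not in CONTENT_TYPES:
        raise ValueError(f"Unknown content type: {content_type!r}")
    size = root.find("size")
    if size is None or size.text is None:
        raise ValueError("Feed item has no size")
    return FeedItem(
        content_type=content_type,
        headline=root.findtext("headline", default=""),
        section=root.findtext("section", default=""),
        size=int(size.text),
        size_unit=size.get("unit", ""),
        contributors=[c.text or "" for c in root.findall("contributor")],
    )
```

Note that the sketch holds only metadata about the content, never the content itself, in line with the description of the feed above.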
Website content to attribute
Web content recorded in RCS without any identifiable contributor tags is put in a queue for review by the RCS administrator. The RCS administrator can attribute the content to an existing contributor, which allows a better assessment of the rights position.
If the content is processed elsewhere in the system it is removed from the queue.
The RCS administrator has a count on the welcome screen identifying how much content is queued for review.
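The queue logic described above might be sketched as follows. This is an illustrative Python sketch, not the RCS implementation: the class and field names are hypothetical, and it assumes that content leaves the queue once it has been attributed as well as when it is processed elsewhere.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WebContent:
    content_id: str
    contributor_tags: List[str] = field(default_factory=list)
    attributed_to: Optional[str] = None   # set by the RCS administrator
    processed: bool = False               # staffed, disregarded or matched


def review_queue(items: List[WebContent]) -> List[WebContent]:
    """Content with no contributor tags that still awaits review."""
    return [
        item for item in items
        if not item.contributor_tags
        and item.attributed_to is None
        and not item.processed
    ]


def queue_count(items: List[WebContent]) -> int:
    """The figure that would be shown on the welcome screen."""
    return len(review_queue(items))
```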
Print content
Most print content is published on the website, and in most cases it is published there first. Print content can therefore be thought of as a subset of the website content.
Clearly some digital formats cannot be replicated in print, such as audio, video, interactives and live blogs.
RCS extracts key data about print content from the Text Library, which resides in the same physical database. As with web content, the most important data is the size of the content (words or area), where it came from, where it was published, and its headline or caption.
RCS needs to know about every different item of published print content, irrespective of whether it appeared in all editions, so that the rights and fees for that content can be processed.
Standard content
The Guardian newspaper is printed Monday to Saturday and The Observer on Sundays, usually in multiple editions, with each edition on each day comprising standard sections/departments e.g. Leaders, Foreign, Sport etc. Each section is linked to one or more cost centres, and each user logon is allocated one or more cost centres, and thus it is possible to determine which users process which content.
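The link between sections, cost centres and users can be illustrated with a short Python sketch. The cost centre codes and user logons below are invented for the example; only the principle (a user processes content from any section that shares a cost centre with their logon) is taken from the text above.

```python
from typing import Dict, Set

# Hypothetical reference data: section -> cost centres, user logon -> cost centres.
SECTION_COST_CENTRES: Dict[str, Set[str]] = {
    "Leaders": {"CC100"},
    "Foreign": {"CC200"},
    "Sport": {"CC300", "CC310"},
}
USER_COST_CENTRES: Dict[str, Set[str]] = {
    "asmith": {"CC200", "CC300"},
    "bjones": {"CC100"},
}


def users_for_section(section: str) -> Set[str]:
    """Users whose cost centres overlap with those of the given section."""
    centres = SECTION_COST_CENTRES.get(section, set())
    return {
        user for user, user_centres in USER_COST_CENTRES.items()
        if user_centres & centres
    }
```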
Specials
Specials exist in print, and are slightly different to content published in the standard sections. Typically a special comprises a separate book of content, all on a related topic, that is published as an extra to the paper on a one-off basis.
Specials can theoretically be produced by any department, but within the production systems they are identified by codes that simply mark them as specials. Nothing identifies who produced them, and therefore who should process them in RCS.
To ensure that not every user sees every special in the matching screens, a separate screen is used to allocate each special to a specific cost centre. This can be controlled at the content format level, so that, for example, the pictures from a special could be allocated to the picture desk whilst the text could be allocated to any other editorial desk.
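A format-level allocation of this kind could be modelled as a lookup with a fallback. The Python sketch below is an assumption about how such a table might behave, not a description of the RCS screen; the special code and cost centre names are hypothetical.

```python
from typing import Dict, Optional, Tuple

# (special code, content format) -> cost centre.
# A format of None is the default allocation for the whole special.
ALLOCATIONS: Dict[Tuple[str, Optional[str]], str] = {
    ("SPEC01", None): "Features",
    ("SPEC01", "Pictures"): "Picture desk",
}


def cost_centre_for_special(code: str, content_format: str) -> Optional[str]:
    """Prefer a format-specific allocation, then the special's default."""
    specific = ALLOCATIONS.get((code, content_format))
    if specific is not None:
        return specific
    return ALLOCATIONS.get((code, None))
```

With this data, the pictures from SPEC01 go to the picture desk and all other formats go to Features; a special with no allocation returns None and would remain unallocated.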
.Content processing
Content is processed (staffed, disregarded or matched) automatically, where possible, based on rules defined within the system. In other cases the content must be processed manually. The system tries to learn from repeated user actions in order to devise new rules that can be applied automatically.
Automated processing
Internal system processes attempt to process new content as soon as it arrives in the RCS database. These processes use rules that identify content that can be ignored (i.e. disregarded), content that can be staffed, and content that can be matched to subscription contracts.
Additionally, when an item of content is recorded in RCS, a process checks whether another instance of that item has already been processed and, if it has, replicates the processing.
The main advantages of automated processing are:
Faster processing – almost as soon as the content appears in RCS;
More accurate processing – rule based processing should limit mistakes; and
Less administrator time spent manually processing content.
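The automated processing described above can be pictured as an ordered list of rules plus a check for earlier instances of the same item. The Python sketch below is illustrative only: the rule conditions, the item_id field and the order in which the checks run are assumptions, not the rules actually defined in RCS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Content:
    item_id: str    # hypothetical identifier shared by all instances of an item
    section: str
    byline: str


# A rule pairs a condition with the outcome recorded when it applies.
Rule = Tuple[Callable[[Content], bool], str]

# Illustrative rules only; real rules are defined within the system.
RULES: List[Rule] = [
    (lambda c: c.section == "Crosswords", "disregarded"),
    (lambda c: c.byline == "Staff reporter", "staffed"),
    (lambda c: c.byline == "Agency Wire", "matched to subscription contract"),
]


def process(content: Content, history: Dict[str, str]) -> Optional[str]:
    """Return the outcome, or None if the item needs manual processing."""
    # Replicate the processing of an earlier instance of the same item.
    if content.item_id in history:
        return history[content.item_id]
    # Otherwise apply the first rule whose condition holds.
    for applies, outcome in RULES:
        if applies(content):
            history[content.item_id] = outcome
            return outcome
    return None
```

Items for which process returns None are the ones left for the manual screens.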
Manual processing
RCS also contains two screens that are designed to facilitate the manual processing of content that the automated processes cannot deal with.
By department
The Content matching screen is the original screen used to process content in RCS. The screen has a hierarchical structure: Publications → Departments, and then below each department on the left is a chronological list of unprocessed Content, and on the right lists of the relevant Commissions, Contracts, Lineage & Space Rate agreements, and also the Disregard reasons.
The user only sees the departments and publications they have access to. Within each department they can scroll through the content on the left, scroll to the appropriate commission, contract or disregard reason on the right, and then Match the content. Alternatively they can staff the content or mark an image as a by-line picture. They can also Find a contract or commission in another department, or enter a new commission using the details of the published content.
Content can be multi-matched if it is co-authored, with a proportion of the total word count allocated to each match.
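As a worked example of multi-matching, the Python sketch below splits a word count between matches by percentage. The function name and the rounding method (largest remainder, so that the allocated words always add up to the total) are assumptions for illustration; the source does not state how RCS rounds.

```python
from typing import Dict


def allocate_words(total_words: int, percentages: Dict[str, int]) -> Dict[str, int]:
    """Split a word count between matches by whole-number percentages."""
    if sum(percentages.values()) != 100:
        raise ValueError("Percentages must sum to 100")
    # Whole words each match receives before the remainder is distributed.
    shares = {key: total_words * pct // 100 for key, pct in percentages.items()}
    leftover = total_words - sum(shares.values())
    # Give the leftover words to the matches with the largest fractional parts.
    by_remainder = sorted(
        percentages, key=lambda key: -(total_words * percentages[key] % 100)
    )
    for key in by_remainder[:leftover]:
        shares[key] += 1
    return shares
```

For a 1,200-word article matched 75% to one commission and 25% to another, this gives 900 and 300 words respectively.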
To simplify the processing, and prevent errors, the user chooses the content format they are concentrating on e.g. Text or Pictures etc., and the lists of content, commissions and contracts are restricted accordingly.
By cost centre
The Content matching (by cost centre) screen works in much the same way as the Content matching screen described above, but is structured around the hierarchy Publications → Cost centres. This view makes it possible to combine departments e.g. content published in Guardian Sport, Observer Sport and the Sport section of the website can all appear together under the Sport cost centre.
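The effect of combining departments under a cost centre can be shown with a small grouping sketch in Python. The mapping from publication and department to cost centre is hypothetical and is used only to illustrate the Sport example above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical mapping of (publication, department) to cost centre.
COST_CENTRE: Dict[Tuple[str, str], str] = {
    ("Guardian", "Sport"): "Sport",
    ("Observer", "Sport"): "Sport",
    ("Website", "Sport"): "Sport",
    ("Guardian", "Foreign"): "Foreign",
}


def group_by_cost_centre(items: List[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Group (publication, department, headline) tuples by cost centre."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for publication, department, headline in items:
        grouped[COST_CENTRE[(publication, department)]].append(headline)
    return dict(grouped)
```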
.History
Content history
RCS maintains a view of every freelance contributor’s content. The user just identifies the contributor they are interested in and all of their published content is listed, provided it has been matched.
Because all staff data is processed as one, there is no facility to list all the content produced by an individual staffer, except for content commissioned outside of their staff contract.
Reports have been devised, now mainly in Business Objects, that can be used to produce lists of content for individual staff members. These lists are susceptible to inaccuracies because the reports rely entirely on the accuracy of by-lines.
Matching history
Content published during a specific period of time can also be examined to see how it was processed: whether it was matched to a contract or commission, staffed, disregarded, or not processed at all.
The Matching history screen shows content that is grouped by publication and editorial department. Content can be searched for using the following criteria:
Publication
Publication date range (noting web content has a time element)
Format
Headline/Caption
Contributor
URL
PicDar URN
How processed – matched, disregarded or not processed
For any item of content the user can view the content itself, and any matches.
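The search criteria above combine as optional filters. The Python sketch below shows one way this could work, on the assumption that every supplied criterion must hold and that unset criteria are ignored; the class and field names are hypothetical, and the URL and PicDar URN criteria are omitted for brevity.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class ContentRecord:
    publication: str
    published: datetime      # web content carries a time element
    content_format: str
    headline: str
    contributor: str
    how_processed: str       # "matched", "disregarded" or "not processed"


@dataclass
class Criteria:
    publication: Optional[str] = None
    published_from: Optional[datetime] = None
    published_to: Optional[datetime] = None
    content_format: Optional[str] = None
    headline_contains: Optional[str] = None
    contributor: Optional[str] = None
    how_processed: Optional[str] = None


def search(records: List[ContentRecord], c: Criteria) -> List[ContentRecord]:
    """Keep records that satisfy every criterion that has been set."""
    def keep(r: ContentRecord) -> bool:
        return (
            (c.publication is None or r.publication == c.publication)
            and (c.published_from is None or r.published >= c.published_from)
            and (c.published_to is None or r.published <= c.published_to)
            and (c.content_format is None or r.content_format == c.content_format)
            and (c.headline_contains is None
                 or c.headline_contains.lower() in r.headline.lower())
            and (c.contributor is None or r.contributor == c.contributor)
            and (c.how_processed is None or r.how_processed == c.how_processed)
        )
    return [r for r in records if keep(r)]
```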
Search for published content
A facility also exists to search for content (across departments) published during a particular period of time. The search criteria include:
Publication date range
Publication
Department
Format
Headline/Caption
Contributor
URL
PicDar URN
For any item of content the user can view the content itself, and any matches. For web content the contributor tags can be shown, and for images the IPTC header information.
.Audio visual product tracking forms
Audio-visual (AV) content is treated differently from other content, as it is invariably produced as a collaboration of several different contributors, each of whom may have different contracts or commissions with GNM.
The audio visual product tracking form gathers all relevant arrangements together in a single place for a single AV product.
Each tracking form documents the title, a working title, a synopsis, episode information (if applicable), funding party, production details, and details of the resulting GNM output. In addition to this the procurement details are also recorded – including the contributor, the role they performed, and the arrangement they are under (commission or contract), as well as the duration of their contribution in the overall product. Notes may also be added against each contribution.
A separate queue of incomplete AV product tracking forms is maintained in RCS.
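The information held on a tracking form could be represented as in the Python sketch below. It is a simplified, hypothetical data structure: the field names are assumptions, several of the fields listed above are left out for brevity, and the complete flag stands in for whatever test RCS actually uses to decide that a form is incomplete.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Contribution:
    contributor: str
    role: str
    arrangement: str          # "commission" or "contract"
    duration_seconds: int     # length of the contribution in the product
    notes: str = ""


@dataclass
class TrackingForm:
    title: str
    working_title: str = ""
    synopsis: str = ""
    funding_party: str = ""
    contributions: List[Contribution] = field(default_factory=list)
    complete: bool = False    # hypothetical completeness flag


def incomplete_forms(forms: List[TrackingForm]) -> List[TrackingForm]:
    """Forms that would appear in the queue of incomplete tracking forms."""
    return [form for form in forms if not form.complete]
```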
.Reports
RCS has three content-related reports:
Content history report – a list of content for a selected supplier over a selected period;
Unmatched content report; and
Staffed content report.
The first report is used to assess how much content has been delivered by a freelance in a given period. It shows the same data as the Content history screen. It can be used to help assess performance under a contract, or to determine whether a commissioned contributor should be put under contract. It can be requested with a detail or summary output.
The other two reports are used to check the processing of content for errors.
Additional reports are also available through Business Objects, which is beyond the scope of this document set.
End of Document