Guardian News & Media
GNM RCS
Web rights interface
Technical specification
Prepared by O3 Team Limited
Authors Nigel Robson
Creation date 11/12/2013
Document Ref. GNM_RCS_Web_Rights_Interface_TS.docx
Version draft for review
Introduction
Purpose
The document GNM_RCS_System_Interfaces_FS.docx is the functional specification that describes what business functions RCS supports in relation to its interfaces with other GNM systems as well as any external integration.
This document is one of a set of technical specifications that provide details of how those functions are implemented in RCS.
Scope
This document focusses on the interface RCS exposes to make rights data available to the website with respect to content published on the website. Separate documents deal with all other inter-system interfaces that pass data between RCS/SLM and other systems.
This document is intended as a high-level technical document outlining how the relevant business functions are implemented in terms of software modules.
Importantly, this document does not aim to provide the level of detail that would be required in a programming specification in areas such as program structure, detailed business rules, data integrity, validation, locking considerations, data security, calls to/from other software modules, performance considerations, and so forth.
For details of program logic and coding, the reader should refer to the program files themselves.
Web rights interface
Business requirement
The website needs to be informed whether it can publish content, whether it can publish but only for a limited time, or whether it needs to take content down. Other rights data is also required for the API, including information about aggregate syndication rights and subscription databases.
[This set of rights may be extended at any time simply by configuring the system.]
Data model
The data model that supports the R2 feed is quite simple. For each item of website content there can be multiple tags, multiple keywords, and multiple contributors.
Additionally, there is a queue of content that needs to be evaluated, and a log of data that has been requested by the website.
The above diagram shows these three sections of the data model, which are described further below:
Tag queue
A queue of content that needs to be processed or reprocessed is held in the table WEBSITE_CONTENT_TAG_QUEUE. Content is added to this table when it first arrives in the TLIB schema, and again whenever it is matched/unmatched, staffed/unstaffed, or disregarded/reinstated. Content is also reprocessed if the rights are changed in any associated commission or contract.
All of the processing that puts content in the queue is controlled by database triggers.
An item will not appear in the queue twice, as this makes no business sense: if it is there it does not need to be added again.
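The "no duplicates in the queue" rule can be illustrated with a minimal sketch. This is not the RCS implementation, which uses Oracle database triggers; it uses Python's built-in sqlite3 module purely for illustration. The column names content_id and queued_at and the helper function names are assumptions, since this document does not list the queue table's columns.

```python
import sqlite3


def create_queue(conn):
    # content_id and queued_at are hypothetical column names.
    # The PRIMARY KEY guarantees at most one queue row per content item.
    conn.execute(
        "CREATE TABLE website_content_tag_queue ("
        " content_id INTEGER PRIMARY KEY,"
        " queued_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )


def enqueue(conn, content_id):
    # INSERT OR IGNORE is SQLite syntax: queuing an item that is
    # already waiting is silently skipped rather than duplicated.
    conn.execute(
        "INSERT OR IGNORE INTO website_content_tag_queue (content_id) VALUES (?)",
        (content_id,),
    )


def queued_ids(conn):
    rows = conn.execute(
        "SELECT content_id FROM website_content_tag_queue ORDER BY content_id"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_queue(conn)
for content_id in (101, 102, 101):  # 101 is queued twice
    enqueue(conn, content_id)
print(queued_ids(conn))  # [101, 102]
```

In an Oracle setting the same effect might be achieved with a unique constraint plus a MERGE statement or a guarded insert inside the trigger; which of these RCS uses is not stated here.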
Generating rights tags
When each item is processed, a set of data is generated in the WEBSITE_TAGS, WEBSITE_TAG_RIGHTS and WEBSITE_TAG_RIGHTS_DETAIL tables. The rights data is based on the properties that are flagged in the rights model (see separate documents on the Rights model). This data forms the basis for the rights XML that the website will request and consume.
To identify the rights in an item of content the process will check the following:
If the content is disregarded the rights profile of the disregard reason is used;
If the content is matched, the rights profiles of the linked Commission(s) and/or Contract(s) are examined. In the case of more than one match, the least restrictive rights are assumed;
If the content has not been processed, the software tries to identify the contributor from the various fields that are available (contributor, by-line, IPTC header, etc.), examines the rights associated with that contributor’s last 10 arrangements with GNM, and uses the most restrictive set of rights found; and
If none of the above have derived a set of rights tags then the default rights profile is assumed.
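The order of precedence above can be sketched as follows. This is an illustrative outline only, not the RCS code. It assumes each rights profile can be ranked by a single numeric restrictiveness value; the RightsProfile and Content classes, their field names, and the DEFAULT profile's rank are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class RightsProfile:
    name: str
    restrictiveness: int  # hypothetical rank: higher means more restrictive


@dataclass
class Content:
    # Profile attached to the disregard reason, if the content is disregarded.
    disregard_profile: Optional[RightsProfile] = None
    # Profiles of the commissions/contracts the content is matched to.
    matched_profiles: List[RightsProfile] = field(default_factory=list)
    # Profiles of the identified contributor's arrangements, most recent first.
    contributor_arrangements: List[RightsProfile] = field(default_factory=list)


DEFAULT_PROFILE = RightsProfile("DEFAULT", 50)  # rank chosen arbitrarily


def resolve_profile(content, default=DEFAULT_PROFILE):
    # 1. Disregarded content takes the profile of its disregard reason.
    if content.disregard_profile is not None:
        return content.disregard_profile
    # 2. Matched content: least restrictive of the linked profiles.
    if content.matched_profiles:
        return min(content.matched_profiles, key=lambda p: p.restrictiveness)
    # 3. Otherwise: most restrictive of the contributor's last 10 arrangements.
    recent = content.contributor_arrangements[:10]
    if recent:
        return max(recent, key=lambda p: p.restrictiveness)
    # 4. Nothing found: fall back to the default profile.
    return default


open_profile = RightsProfile("OPEN", 10)
limited_profile = RightsProfile("LIMITED", 80)

matched = Content(matched_profiles=[limited_profile, open_profile])
print(resolve_profile(matched).name)  # OPEN

unmatched = Content(contributor_arrangements=[open_profile, limited_profile])
print(resolve_profile(unmatched).name)  # LIMITED
```

Note the asymmetry: a confirmed match is given the benefit of the least restrictive profile, whereas a guess based on the contributor's history errs towards the most restrictive one.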
The above process can be time-consuming in some circumstances. The design uses a queue to take that delay off-line so that it does not slow down response times for users.
Request for latest rights
The third part of the data model records all the requests for rights tags that the website makes. This is an audit of when rights tags were sent to the website, should it ever be necessary to check.
XML feed
The production web systems make frequent requests for the latest rights tags from RCS so that the website can be updated accordingly. Each request they make is recorded in the WEBSITE_TAG_REQUESTS table.
The software in RCS limits the number of items for which tags are returned to 1000 to avoid ever overloading either system.
The software that generates this feed has a very complicated query which embeds various standard Oracle XML functions. It is not discussed in any more detail here but the technique used is described in the RCS Software Techniques technical reference document.
The XML is generated by invoking one of the following URLs:
The API uses:
http://gnmapps:7777/plrcs/rights_feeds.website_changes_since?lastid=
Multi-media uses:
http://gnmapps:7777/plrcs/rights_feeds.multimedia_changes_since?lastid=
In each case the ID of the last successfully processed item is appended, to ensure that the XML output is contiguous with the previous XML output.
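As a minimal sketch of how a consumer might build these request URLs, the following uses only the host, path and procedure names quoted above. The feed_url function and the "api" and "multimedia" consumer keys are hypothetical names introduced for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "http://gnmapps:7777/plrcs/"

# Procedure names as given in this document; the dictionary keys are
# hypothetical labels for the two consumers.
PROCEDURES = {
    "api": "rights_feeds.website_changes_since",
    "multimedia": "rights_feeds.multimedia_changes_since",
}


def feed_url(consumer, last_id):
    # last_id is the ID of the last item the consumer processed
    # successfully; RCS returns changes made after that point.
    query = urlencode({"lastid": last_id})
    return f"{BASE_URL}{PROCEDURES[consumer]}?{query}"


print(feed_url("api", 1122566))
# http://gnmapps:7777/plrcs/rights_feeds.website_changes_since?lastid=1122566
```

Because RCS returns tags for at most 1000 items per request, a consumer that has fallen behind would presumably need to repeat the request with an updated lastid until it has caught up.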
The XML that gets generated is in this format:
<?xml version="1.0" ?>
<rcsRightsFeed>
  <tagSet tagSetId="1122566">
    <contentType>Article</contentType>
    <url>http://www.theguardian.com/money/2009/aug/07/debit-card-consumer-protection</url>
    <cmsId>351295201</cmsId>
    <storyBundleId>2312228</storyBundleId>
    <right>
      <rightCode>COREBROWSING</rightCode>
      <acquired>Y</acquired>
      <property>
        <propertyCode>DISPLAY</propertyCode>
        <value>In perpetuity</value>
      </property>
    </right>
    <right>
      <rightCode>SYNDICATIONAGGREGATE</rightCode>
      <acquired>Y</acquired>
    </right>
    <right>
      <rightCode>SUBSCRIPTIONDATABASES</rightCode>
      <acquired>Y</acquired>
    </right>
  </tagSet>
</rcsRightsFeed>
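A consumer of the feed might read it along the following lines. This is a hedged sketch using Python's standard xml.etree.ElementTree module, not the website's actual code. The element names are taken from the sample; the sample is reproduced in shortened form with the root element closed so that it parses. The parse_feed function and the shape of the dictionaries it returns are assumptions, as is the use of the highest tagSetId as the next lastid value.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0" ?>
<rcsRightsFeed>
  <tagSet tagSetId="1122566">
    <contentType>Article</contentType>
    <cmsId>351295201</cmsId>
    <right>
      <rightCode>COREBROWSING</rightCode>
      <acquired>Y</acquired>
      <property>
        <propertyCode>DISPLAY</propertyCode>
        <value>In perpetuity</value>
      </property>
    </right>
    <right>
      <rightCode>SYNDICATIONAGGREGATE</rightCode>
      <acquired>Y</acquired>
    </right>
  </tagSet>
</rcsRightsFeed>"""


def parse_feed(xml_text):
    """Return one dictionary per tagSet element in the feed."""
    root = ET.fromstring(xml_text)
    tag_sets = []
    for tag_set in root.findall("tagSet"):
        rights = {}
        for right in tag_set.findall("right"):
            # Collect any propertyCode/value pairs attached to this right.
            properties = {
                prop.findtext("propertyCode"): prop.findtext("value")
                for prop in right.findall("property")
            }
            rights[right.findtext("rightCode")] = {
                "acquired": right.findtext("acquired") == "Y",
                "properties": properties,
            }
        tag_sets.append(
            {
                "tagSetId": int(tag_set.get("tagSetId")),
                "cmsId": tag_set.findtext("cmsId"),
                "rights": rights,
            }
        )
    return tag_sets


tag_sets = parse_feed(SAMPLE)
print(tag_sets[0]["rights"]["COREBROWSING"]["properties"])
# {'DISPLAY': 'In perpetuity'}

# Assumption: the highest tagSetId seen becomes the next lastid value.
next_last_id = max(ts["tagSetId"] for ts in tag_sets)
print(next_last_id)  # 1122566
```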
End of Document