Why XML is Good for Publishing

Unisys Media Solutions | Insights (Published on: 11/16/2001) 

By Massimo Barsotti, Product Manager, Unisys Global Media

With the newspaper publishing industry in flux, and with technological advancements such as XML arriving just as the Internet grows more popular with readers, traditional publishers are looking seriously at the Internet as a new publishing medium. In fact, they’re eyeing it as a potential new revenue source.

And while the push is on for print-only newspaper publishers to add Web-based news properties, doing business on the Net flies in the face of keeping a lid on costs and holding staffing to a healthy minimum. To use the Internet successfully, publishers should implement XML-based solutions in "unified newsrooms" where editorial staffs manage both print and online writing assignments.

XML, or Extensible Markup Language, is touted as the can-do technology, the gateway to content repurposing. Many believe it’s the way for media companies to transition into integrated multimedia content providers.

Here’s why it works well. Say one of your editors writes an article. You want to publish it in your newspaper and your online publication. XML is the technology that can make it happen.

What is XML?

Simply put, XML is an enabling technology, not a programming language. It encodes content so that it’s independent of how it is displayed, which is what allows the same content to be converted for different outputs. It’s platform-independent and uses specific rules for designing text formats that let you structure data. That makes it easy for a computer to generate and read data, and it ensures the data structure is clear.

XML expresses information in either a data-centric or a document-centric model. The latter approach is applicable to publishing tasks and functions. In this model, irregular, or semi-structured, data (a document) is expressed in an XML-based language. Applied to publishing functions and tasks, XML labels page elements such as headlines, images, body text and captions, and records where each one sits within the document’s structure.
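
As a rough illustration of the document-centric model, the sketch below marks up a short article and lists its labeled elements using Python's standard xml.etree.ElementTree module. The element names (article, headline, byline, body, image, caption) and the sample text are invented for this example; they are not taken from any particular publishing standard or product.

```python
import xml.etree.ElementTree as ET

# Hypothetical article markup; the tag names are illustrative only.
ARTICLE_XML = """\
<article>
  <headline>City Council Approves New Budget</headline>
  <byline>Jane Doe</byline>
  <body>
    <p>The council voted on Tuesday to approve next year's budget.</p>
    <p>Spending on road repairs will rise.</p>
  </body>
  <image src="council.jpg">
    <caption>Council members during the vote.</caption>
  </image>
</article>
"""


def list_elements(xml_text):
    """Return (tag, text) pairs for every element that carries text."""
    root = ET.fromstring(xml_text)
    pairs = []
    for element in root.iter():  # walks the tree in document order
        text = (element.text or "").strip()
        if text:
            pairs.append((element.tag, text))
    return pairs


labeled = list_elements(ARTICLE_XML)
for tag, text in labeled:
    print(f"{tag}: {text}")
```

Each piece of content carries a label saying what it is (a headline, a caption), and nothing in the markup says how it should look on a page or a screen.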

Getting the Job Done

For print content to be reused on the Web and in other media, a publishing system must be able to import and export documents in XML. In fact, the main difference between HTML, the language currently used for the Web, and XML is that the former merges content, structure and visual layout, while the latter represents only content and structure. As such, within XML documents the content is separated from its visual rendering.
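
To make the separation of content from rendering concrete, here is a minimal sketch that takes one XML article and produces two renderings from it: an HTML fragment for the Web and a plain-text version of the kind a small-screen device might use. The markup, the function names to_html and to_plain_text, and the output layouts are all assumptions made for this example, not a description of any real publishing system.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical article markup; the tag names are illustrative only.
ARTICLE_XML = """\
<article>
  <headline>City Council Approves New Budget</headline>
  <body>
    <p>The council voted on Tuesday to approve next year's budget.</p>
    <p>Spending on road repairs will rise.</p>
  </body>
</article>
"""


def to_html(root):
    """Render the article as a simple HTML fragment for the Web."""
    parts = [f"<h1>{escape(root.findtext('headline', ''))}</h1>"]
    for p in root.findall("body/p"):
        parts.append(f"<p>{escape(p.text or '')}</p>")
    return "\n".join(parts)


def to_plain_text(root):
    """Render the same article as plain text with an upper-case headline."""
    lines = [root.findtext("headline", "").upper(), ""]
    lines.extend(p.text or "" for p in root.findall("body/p"))
    return "\n".join(lines)


root = ET.fromstring(ARTICLE_XML)
html_version = to_html(root)
text_version = to_plain_text(root)
print(html_version)
print(text_version)
```

The article is written once. Only the rendering functions differ, so adding another output channel means adding another function rather than rewriting the story.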

Effective repurposing often requires descriptive information (metadata) to be present along with the content. Metadata is information that describes a page; it can include keywords and descriptions, as well as the structure of an article. XML doesn’t automatically produce metadata, so in terms of repurposing content, both the timely insertion of relevant metadata and the format of that metadata are important. Once the metadata is in place, XML-based content is easier to convert into different formats.
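
One way metadata might travel with an article is sketched below. The metadata block, its child elements (keyword, description, section) and the read_metadata helper are hypothetical; a real newsroom would follow whatever schema its systems define. The point is only that the metadata has to be entered by someone or something, and that a program can pull it out reliably once it is there.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the <metadata> block and its children are invented here.
ARTICLE_XML = """\
<article>
  <metadata>
    <keyword>budget</keyword>
    <keyword>city council</keyword>
    <description>Council approves next year's budget.</description>
    <section>Local News</section>
  </metadata>
  <headline>City Council Approves New Budget</headline>
  <body><p>The council voted on Tuesday to approve the budget.</p></body>
</article>
"""


def read_metadata(xml_text):
    """Collect keywords, description and section from an article, if present."""
    root = ET.fromstring(xml_text)
    meta = root.find("metadata")
    if meta is None:
        # XML does not create metadata on its own; an article may have none.
        return {"keywords": [], "description": "", "section": ""}
    return {
        "keywords": [k.text for k in meta.findall("keyword") if k.text],
        "description": meta.findtext("description", ""),
        "section": meta.findtext("section", ""),
    }


info = read_metadata(ARTICLE_XML)
print(info)
```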

The beauty of XML lies in its simplicity – and the fact that it works behind the scenes. Although what it does is necessary for content reuse, it can operate in the background. While most reporters and editors won’t work with XML directly, they should have a fundamental understanding of what it does. Editors involved in editorial planning and production, however, should have a thorough knowledge of what XML lets them do with content.

Reasons to Consider XML

Why implement XML-based editorial solutions? They are attractive because XML bridges the gap between what publishers can do today and where they need to go tomorrow and beyond.

XML offers these features that make it a publishing winner:

•    Data coded in XML is simple to read and understand and can be processed easily by computers. Standardization of the technology offers universal compatibility.
•    XML is self-describing and allows an unlimited number of tags and attributes, which makes it more flexible to work with than HTML and its fixed set of tags.
•    Because XML tags describe what the content is, machines can interpret XML text more reliably than HTML or plain text, which promotes efficient searching and data mining.
•    XML supports multilingual documents, which can facilitate repurposing for international publishing companies.
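
The searching and multilingual points above can be sketched together. In the example below, a small collection of articles is filtered by section and by language. The news and article elements, the section attribute and the find_headlines function are invented for illustration; xml:lang is the standard XML attribute for marking the language of content.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the reserved xml:lang attribute under this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical collection; element and attribute names are illustrative only.
NEWS_XML = """\
<news>
  <article xml:lang="en" section="sports">
    <headline>Home Team Wins Final</headline>
  </article>
  <article xml:lang="it" section="sports">
    <headline>La squadra di casa vince la finale</headline>
  </article>
  <article xml:lang="en" section="business">
    <headline>Markets Close Higher</headline>
  </article>
</news>
"""


def find_headlines(xml_text, section=None, lang=None):
    """Return headlines of articles matching the given section and language."""
    root = ET.fromstring(xml_text)
    results = []
    for article in root.findall("article"):
        if section is not None and article.get("section") != section:
            continue
        if lang is not None and article.get(XML_LANG) != lang:
            continue
        results.append(article.findtext("headline", ""))
    return results


sports = find_headlines(NEWS_XML, section="sports")
italian = find_headlines(NEWS_XML, lang="it")
print(sports)
print(italian)
```

Because the section and language are stated explicitly in the markup, the search does not have to guess them from the wording of the text.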

It’s also likely that publishers will come to see XML’s main benefit as the awareness it raises of the advantages of standardized information interchange. And it fits with where readers are heading: getting their news on other devices such as PDAs and WAP (Wireless Application Protocol) phones.
