Resource

A STEMMA Resource is any supporting data or object in your collection. The term is more often used in the context of electronic documents, pictures, scans, and recordings. However, in STEMMA it may also be a physical object, or artefact (i.e. an object of human creation). This could include actual letters, original documents, medals, portraits, original photographs, personal possessions and family heirlooms.

 

 

RESOURCE=

 

<Resource Key='key'>
    [ <Title> resource-title </Title> ]
    <URL [ContentType='content-type']> url </URL>
    [ <Type [Artefact='boolean']> resource-type </Type> ]
    [ <DataControl>
        { <Copyright> copyright-notice </Copyright>
        | <Permission> permission-notice </Permission>
        | <Prohibition> prohibition-notice </Prohibition> }
        [ NARRATIVE_TEXT ] ...
    </DataControl> ]
    [ <Sensitivity> sensitivity </Sensitivity> ]
    [ <Params>
        { <Param Name='name' [Type='type'] [Key='key']
            [DCType='dc-type'] [ItemList='boolean']
            [Optional='boolean']> default-value </Param> } ...
    </Params> ]
    [ <BaseResourceLnk Key='key'>
        [ NARRATIVE_TEXT ] ...
    </BaseResourceLnk> ]
    [ NARRATIVE_TEXT ] ...
</Resource>

 

Digital resources are located by a URL (Uniform Resource Locator). The 'scheme' prefix of the URL allows it to be applied to different file stores, e.g. file:// for local files[1] and http:// for Internet or intranet files accessed via HTTP. If the digital data-type cannot be determined from the file type (e.g. from its file extension) then the ContentType attribute must be employed to specify a corresponding Internet Media Type, e.g. ‘text/plain’ or ‘image/jpeg’.
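As a sketch of the ContentType case, the following hypothetical Resource points at a URL whose path carries no file extension, so the media type is stated explicitly. The key, title, and example.com address are illustrative assumptions and not part of STEMMA itself.

```xml
<!-- Hypothetical resource: the URL has no file extension, so the
     media type cannot be inferred and ContentType is supplied. -->
<Resource Key='rScan1234'>
    <Title>Scanned baptism certificate</Title>
    <URL ContentType='image/jpeg'>http://www.example.com/scans/1234</URL>
</Resource>
```

Where the URL already ends in a recognisable extension such as .jpg, the attribute could simply be omitted.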

 

Irrespective of whether the Resource is digital, non-digital, or both, the nature of the resource is described by the <Type> element. This includes: Award, Clothing, Document, Furniture, Letter, Map, Music, Painting, Photograph, Recording, and Video. See Extended Vocabularies for an example of defining custom resource-types.

 

Non-digital resources are indicated by the presence of the Artefact attribute. A value of ‘1’ (true) indicates you have a non-digital resource, such as an original letter, portrait, or item of clothing. The default is ‘0’ (false). If a URL is also specified then it indicates that a digitised copy is also held. This will usually be a digital image but it could also be a digital sound recording or video. Original non-digital recordings and video should be considered artefacts. When copying a collection for transmission to someone else, or for long-term storage, consideration should be given as to whether the Artefact attribute needs to be dropped if the physical items are not included.
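For illustration, a physical letter that has also been scanned might be recorded along the following lines. The key, title, and file path are hypothetical; only the element and attribute names come from the syntax given earlier.

```xml
<!-- Hypothetical artefact: an original letter held in the collection,
     with a digitised copy located by the URL. -->
<Resource Key='rGrandadLetter'>
    <Title>Letter from my grandfather</Title>
    <URL>file:letters/GrandadLetter.jpg</URL>
    <Type Artefact='1'>Letter</Type>
</Resource>
```

If only the scan were sent to another researcher, the Artefact attribute would be a candidate for removal in the transmitted copy, since the recipient would not hold the physical letter.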

 

This simple example provides a definition of a photographic resource and the URL through which it may be accessed, plus a sample reference to it:

 

<Resource Key='MyPhoto'>
    <Title>Photograph of myself</Title>
    <URL>file:mydir/MyPhoto.jpg</URL>
</Resource>

 

<ResourceLnk Key='MyPhoto'/>

 

The possible Sensitivity levels are exactly as defined for the Sensitivity data attribute at DATA_ATTRIBUTE.

 

The DataControl element provides any notices that must be displayed to the end-user if the associated resource is copied or transmitted to another user. Software components are not expected to act on those notices themselves; the notices should merely be displayed in order to prevent an accidental breach of trust or copyright. Permission and Prohibition are designed for informal control, such as when a family member asks that their photographs not be passed outside of the immediate family. Copyright is a formalised type of prohibition, usually applied to works of artistic, academic, or commercial value. These concepts are discussed under Worldwide Family History Data. These settings should be honoured when bundling a collection, or part of one, for transmission to another researcher.
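The informal case described above might be written as follows. This is only a sketch: the key, title, file path, and wording of the notice are hypothetical.

```xml
<!-- Hypothetical resource carrying an informal prohibition notice,
     to be displayed if the photograph is copied or transmitted. -->
<Resource Key='rAuntPhoto'>
    <Title>Photograph from my aunt's album</Title>
    <URL>file:myphotos/family/AuntAlbum01.jpg</URL>
    <DataControl>
        <Prohibition>Not to be shared outside the immediate family.</Prohibition>
    </DataControl>
</Resource>
```

A formally protected work would carry a Copyright element in place of the Prohibition, following the same pattern.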

 

A URL-based resource definition may be parameterised so that multiple resource references can reuse common information. There are two types of parameterisation available in Resource and Citation entities. The primary one (which is also valid in the resource-title and parameter values) is simple substitution using a named ${param-name} marker. For instance, the following definition and reference selects a specific family photograph from a common collection:

 

<Resource Key='rPhotos'>
    <Title>Family photograph: ${PhotoName}</Title>
    <URL>file:myphotos/family/${PhotoName}.jpg</URL>
    <Params>
        <Param Name='PhotoName'/>
    </Params>
</Resource>

 

<ResourceLnk Key='rPhotos'>
    <Param Name='PhotoName'>Me</Param>
</ResourceLnk>

 

This next example goes to a hypothetical web site to retrieve census images for England and Wales. It makes use of the second type of parameterisation which employs the URL parameter mechanism.

 

<Resource Key='rCensusImage'>
    <Title>1851-1901 Census Images of England and Wales</Title>
    <URL>http://www.census.com/image?series=?&amp;piece=?&amp;folio=?&amp;page=?</URL>
    <Params>
        <Param Name='Series'/>
        <Param Name='Piece' Type='Integer'/>
        <Param Name='Folio' Type='Integer'/>
        <Param Name='Page' Type='Integer'/>
    </Params>
</Resource>

 

The parameter values, which form an ordered set, are applied to the ‘=?’ placeholders in the URL in the exact order in which the <Param> elements are defined. The number of <Param> elements must therefore match the number of placeholders. The named substitution mechanism is more general as it also applies to the resource-title. Either mechanism could be employed to achieve the same expanded URL, but the difference will become evident later for the Citation entity. NB: The names of the <Param> elements are not directly related to the names of any query parameters in the URL string.
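A reference to the census resource might then look like the sketch below. The piece, folio, and page numbers are hypothetical values chosen purely for illustration.

```xml
<!-- Hypothetical reference: values are listed in the same order as the
     Param definitions, and fill the '=?' placeholders in that order. -->
<ResourceLnk Key='rCensusImage'>
    <Param Name='Series'>HO107</Param>
    <Param Name='Piece'>2126</Param>
    <Param Name='Folio'>48</Param>
    <Param Name='Page'>12</Param>
</ResourceLnk>
```

Under the ordered mechanism described above, these values would expand the URL to http://www.census.com/image?series=HO107&piece=2126&folio=48&page=12.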

 

The parameter data-type values expressed by the Type attribute are currently similar to those allowed in Extended Properties, except that Measure, Enum, & EnumList are not supported, and Date only accepts ISO dates (no non-Gregorian calendars). The same ItemList approach to lists is taken as for Property values. The semantic type is indicated by the DCType attribute which uses the Dublin Core vocabulary, e.g. DCType=’DC.Title’ or DCType=’DC.Publisher.CorporateName.Address’.

 

Any <Param> element may specify a default value if necessary. The default value for the Optional attribute is ‘0’ (i.e. False) which means a non-blank value must be provided. When an ItemList parameter is substituted then the result is a comma-separated list of the component Items.
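A default value might be declared as in the following sketch. The resource key, parameter names, and paths are hypothetical, and it assumes that a default satisfies the requirement for a non-blank value when a reference omits that parameter.

```xml
<!-- Hypothetical definition with one defaulted and one required parameter. -->
<Resource Key='rFolderPhotos'>
    <Title>Photograph: ${PhotoName}</Title>
    <URL>file:myphotos/${Folder}/${PhotoName}.jpg</URL>
    <Params>
        <!-- Assumed behaviour: 'family' applies if a reference omits Folder -->
        <Param Name='Folder'>family</Param>
        <!-- No default, and Optional defaults to '0': a value must be given -->
        <Param Name='PhotoName'/>
    </Params>
</Resource>
```

On that reading, a reference supplying only PhotoName as ‘Me’ would resolve to file:myphotos/family/Me.jpg.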

 

The BaseResourceLnk element may nominate a generic Resource from which data may be inherited by the current Resource, in much the same vein as base classes and derived classes in software programming. The previous example using a ResourceLnk with parameters could have been replaced with two distinct Resource entities: one being the generic representation of a photograph from a particular folder, and the other being a specific photograph from that folder. For instance:

 

<Resource Key='rMyPhoto'>
    <BaseResourceLnk Key='rPhotos'/>
    <Params>
        <Param Name='PhotoName'>Me</Param>
    </Params>
</Resource>

 

Application of any parameter substitution must therefore occur after the inheritance process has completed. If an implementation of this mechanism creates a temporary conglomerate entity in memory by doing a physical merge then it must not be persisted back to the data file, otherwise it constitutes a data corruption.

 

Electronic resources attached to a data collection present a specific issue during data exchange, i.e. import/export. Subject to privacy controls, the relevant resources should be bundled with an exported STEMMA Document and transmitted along with it to the recipients. They cannot be included in the body of the Document in any practical way. Although this area still needs work, there are several existing document container file mechanisms available. For instance:

 

  • Office Open XML. This is a zipped, XML-based file format developed by Microsoft. It was initially standardised as ECMA-376 and later as ISO/IEC 29500.

 

  • MHTML, or MIME HTML. This is a Web page archive format used to combine resources that are typically represented by external links (e.g. images) together with HTML code into a single file. It is used extensively for rich-text email messages. MHTML is a proposed standard, circulated in a revised edition in 1999 as RFC 2557.

 

  • Java archives, i.e. jar files.

 

  • ISO/IEC NP 21320-1. This standard was still under development at the time of writing. It is expected to be a standardised version of the zip compressed file encoding.

 

Some basic functional requirements include: compression, optional encryption, preservation of relative directory structure, custom name-value properties per item, ability to keep STEMMA <Resource> references valid after unpacking, and ability to address each item separately from outside the container. Some interesting discussion on this topic may be found on the BetterGEDCOM wiki at: BG Container Formats and Packaging Data.

 



[1] RFC 1738 requires an explicit host name in a URI employing the file scheme, e.g. file://host/g:/folder/file.jpg, even if the value is left blank. RFC 3986 allows this to be optional, e.g. file:/g:/folder/file.jpg. This means a relative file specification does not require a leading ‘/’ separator at all, e.g. file:folder/file.jpg.
