Case Studies
By Sandra Schloen, August 2017
Background
OCHRE offline was inspired by the team at Ashkelon, who were determined to do real-time data capture from the field. Situated in southern Israel, a country known for its tech industry, they had Internet options for the use of an online database and research environment such as OCHRE at a time when many other projects did not. Early attempts to use OCHRE directly online from the field were less than ideal. But their persistence led to a compromise solution, that of running OCHRE offline while out in the field, then syncing up later once they were back within reach of good Internet access.
What started there as a compromise, a concession to the lack of reliable Internet service, has developed into a sophisticated, full-featured, and robust strategy for running OCHRE under a variety of configurations, offline and online to varying degrees. A brand-new project at Tell Keisan (northern Israel) during the summer of 2016 proved to be an ideal test bed for both offline OCHRE and the newly integrated Geospatially-Enabled OCHRE (GEO) features, which included spatial data integration and the use of barcode labels.
What follows is a discussion of how strategies for using OCHRE for real-time data capture during an active excavation season have been implemented at several sites as real-world examples. The experience of the Chicago-Tübingen Expedition to Zincirli during the summer of 2017 will be used as a primary case study, and this will be contrasted with variations used by other projects.
Chicago-Tübingen Expedition to Zincirli -- Summer 2017
After twelve seasons at Zincirli, the team arrived with the expectation that Internet service would be adequate for working online in OCHRE much of the time and that only the morning's work in the field would be offline, by necessity. But it seemed the community's demand for Internet service had outpaced the town's infrastructure, and performance from the basic land-line service was poor (typically 2 Mbps or worse). Two cellular modems, with data plans, were purchased from the local phone company. These provided reliable service of 15-20 Mbps and were reserved for work purposes (by not distributing the password for casual use). But in the end, OCHRE was used offline most of the time, with the rhythm of work emerging as follows:
5 am Field supervisors head out the door with offline sessions and fully-charged tablet computers.
Each field supervisor has his or her own Locations & Objects hierarchy, specially crafted to represent only his/her Square of excavation for the current season. Unfinished loci of excavation from previous seasons' work in the Square have been copied in and assigned a new Observation for the current supervisor in the current year.
New loci of excavation are assigned numbers (manually) by the area supervisor in the field. These are entered within the Square hierarchy and described by the square-supervisor.
New Pottery Pails (created within each locus of excavation) are assigned a number by a unique-to-hierarchy Serial number property, applied and auto-labeled by an appropriate Predefinition. A barcode label is printed and affixed to the pottery pail tag.
New small finds (Registered items) are assigned a number by a unique-to-hierarchy Serial number property, applied and auto-labeled by an appropriate Predefinition. In addition, the Predefinition applies the Event "To register" to the item. A barcode label is printed and either affixed to the item's container, or left on its backing and inserted within the container (if such container is unlikely to be used to ultimately house and store the item).
Faunal remains are collected in "bone bags" tagged as "Faunal remains collection" and labeled as "Bone" by a Derived variable applied by an appropriate Predefinition. A barcode label is printed and affixed to the bone bag.
Soil samples are taken, inserted as child items within the loci, and auto-labeled and described using an appropriate Predefinition. A barcode label is printed and affixed to the soil sample, just like for pottery pails.
All of the above-used Predefinitions are set to Default to the current user and to Default to today's date on the item's Observation.
Field office, on site
5 am - 2 pm The Registrar, working in online OCHRE over the cellular modem connection, scans the barcode of an already-registered-in-the-field item from a previous day into the Find-by-code field of the Linked Items pane; the associated database item pops up. The Registrar completes the registration of the item, adding a Description or Notes, taking appropriate measurements, and applying events representing further necessary workflow (To draw, To photograph, etc.). In addition, the Event "Processed by Registrar" is added to each item as its registration is completed, thereby fulfilling the original "To register" event. Other events ("Discarded", "Sent to Pottery pail", etc.) might also be added, as needed.
2 pm Computers are returned to the data manager as field staff head to lunch. Offline sessions are sync'd right away, thus getting the day's work on the server where it will be backed up by nightfall.
3 pm A Query is run to find all items having the "To register" event. This typically returns 50-80 results representing the small finds from the morning's work in the field. The Event tools are then used to apply to these items the Event "Pending registration." This special event ensures that supervisors whose offline sessions contain small finds they have already submitted do not inadvertently change those items offline in ways that might jeopardize the data being entered by the Registrar online.
3-4 pm This is a brief window of opportunity when all the project data is online. It serves as a time to make corrections or adjustments, tweaking offline sessions or fixing problems that emerged from the previous day's work.
4 pm Ten new offline sessions are created, one on each field computer, with care being taken to make sure the correct User is matched with the correct computer.
4:30 pm Field computers are picked up by the field supervisors for the afternoon work session, their morning's work already on the server and their new offline sessions already loaded. These offline sessions would go into the field with them the following morning, and the cycle would repeat.
4:30 - 6 pm Staff assistants use the cellular hotspots to work on OCHRE online to do pottery counting and sorting per the Zincirli conventions. This involved adding information about the pottery within the pottery pail items but lagged, procedurally, a day or two (or more) behind the current data already offline on the field computers, so there was no need to worry about data contention.
8 pm The Data Manager ensures that all field computers are in their charging stations and that the barcode label printers are ready for another day's work in the field (battery, labels, ink).
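The event-based hand-off between the field supervisors and the Registrar described in the schedule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the FieldItem class, the helper functions, and the item names are hypothetical and are not part of OCHRE; only the event names come from the workflow described here.

```python
from dataclasses import dataclass, field


@dataclass
class FieldItem:
    """Hypothetical stand-in for a small find; not an OCHRE class."""
    name: str
    events: list = field(default_factory=list)


def find_with_event(items, event):
    """Mimic the 3 pm Query: return the items carrying the given event."""
    return [item for item in items if event in item.events]


def mark_pending(items):
    """Apply 'Pending registration' to every item that carries 'To register'."""
    pending = find_with_event(items, "To register")
    for item in pending:
        if "Pending registration" not in item.events:
            item.events.append("Pending registration")
    return pending


def complete_registration(item):
    """The Registrar finishes an item, fulfilling the original 'To register' event."""
    item.events.append("Processed by Registrar")


def awaiting_registrar(items):
    """Items registered in the field but not yet processed by the Registrar."""
    return [
        item for item in find_with_event(items, "To register")
        if "Processed by Registrar" not in item.events
    ]


# Illustrative data only; the item names are invented.
finds = [
    FieldItem("Find 101", ["To register"]),
    FieldItem("Find 102", ["To register"]),
    FieldItem("Pottery pail 7"),
]
mark_pending(finds)
complete_registration(finds[0])
print([item.name for item in awaiting_registrar(finds)])
```

Keeping the workflow state as events on the item itself, rather than in a separate list, is what lets a single query answer "what is still waiting for the Registrar?" at any point in the day.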
Field computers, ready for action
As each day fell into the general rhythm above, so too a weekly rhythm fell into place as the season progressed.
On Sunday morning the geospatial data manager posted new geodatabases for each Area packaging the up-to-date collection of shapefiles representing the Area's loci, as traced from the orthorectified drone photos. These would be automatically picked up by the offline sessions in the afternoon as they polled the server for any changes to the geo-files needed by the Offline specification.
On Wednesday, the aerial photographer added to OCHRE the new orthorectified photos taken by the drone and processed by Photoscan. There were typically just a few of these per Area each week, and so they were individually attached to the relevant Offline specifications. Since these are very large raster images (typically 100-500 MB), they were sourced from a local drive; that is, the Access/Paths of the OCHRE Resource hierarchy managing them was set to a folder on an external hard disk referenced as the D: drive. As that afternoon's offline sessions were prepared, the sessions would be created one at a time, and the external drive would need to be plugged into each computer in turn while its offline session was "Processing images...". The offlining process would copy the (large) files locally to OCHRE's resource cache as a one-time operation. Once the files were available in the local cache, the hard drive would not be needed again for subsequent offlining, since OCHRE would find the needed files locally and trust that they hadn't been changed.
Typically once a week, or as available, the field photographer posted updates to the field photos, having loaded them into OCHRE by using the Import Utility on the Resource hierarchy that manages the season's field photographs. These were also added to Sets, by excavation Square, one Set of which was included in each Offline specification. Once the images in the Set were updated, those images would be picked up as that afternoon's offline session was created.
On Thursday, the Top Plan raster images from the week, which had been returned by the field supervisors after being marked up in the field, would have been scanned, georeferenced, imported, and posted on the server to be made available through OCHRE. These were managed in Resource hierarchies by Area so that each offline session only had to download and offline the top plans relevant to its own session.
Additionally on Thursday, back home the OCHRE Data Service was on stand-by to receive an Excel file of typically 1,000+ Total station points to be imported during the 3-4 pm window of opportunity. This file had been prepared by the geospatial data manager from the daily files extracted from multiple Total stations used across the site. Each point had been coded to represent an excavation item present in OCHRE -- a unit of excavation, a pottery pail, or a small find. The codes were expanded using Excel formulas to the conventions used by OCHRE items so that each would match the Name of its corresponding item in OCHRE. What would have taken hours over slow Internet took mere minutes in Chicago. A little bit of coordination ensured that all data was online before the import began, and that offlining didn't begin until the import process was complete.
By Friday, the full range of data, including field/ad-hoc photographs available for hotspotting, and geo-spatial data (top plans, drone photos, total-station points) available for Map View integration, were available for the morning's extended work session in-house.
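The one-time copy of the large drone images into the local resource cache, described under Wednesday above, follows a simple copy-if-missing pattern. The sketch below illustrates that pattern only; the function name, folder names, and file name are hypothetical, and OCHRE's actual cache layout is not documented here. As in the text, a file already in the cache is trusted to be unchanged, so no checksum or timestamp comparison is made.

```python
import shutil
import tempfile
from pathlib import Path


def ensure_cached(source, cache_dir):
    """Return (cached_path, copied); copy only if the file is not cached yet.

    Assumption: a file already present in the cache is trusted as-is.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / source.name
    if cached.exists():
        return cached, False      # cache hit: the external drive is not needed
    shutil.copy2(source, cached)  # one-time copy of the large raster image
    return cached, True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        drive = Path(tmp) / "external_drive"   # stands in for the D: drive
        cache = Path(tmp) / "resource_cache"   # stands in for the local cache
        drive.mkdir()
        ortho = drive / "area_ortho.tif"       # hypothetical file name
        ortho.write_bytes(b"raster bytes")

        _, copied_first = ensure_cached(ortho, cache)
        ortho.unlink()                         # simulate unplugging the drive
        _, copied_second = ensure_cached(ortho, cache)
        print(copied_first, copied_second)
```

The trade-off of trusting the cache is that a drone image replaced under the same file name would not be picked up; that matches the behaviour described above, where the files are assumed not to change once posted.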
Sample Offline Profiles
A typical Offline profile for a field supervisor is shown below where the Offline items represent the following:
The Locations & Objects hierarchy created specifically for the current excavation season, and containing still-open loci from previous seasons that will come into play in the current season. This hierarchy has linked in to its Map Options both a local basemap that prescribes the extent of the excavation square, and a geodatabase that packages any shapes of loci excavated in this square in previous seasons (for reference). The offline session will automatically pick up the local basemap and the geodatabase from the Map Options and take them into the offline session since we are requesting to Include GEO content offline.
The "Square" item in OCHRE that pertains to this square-supervisor, defining the spatial extent of the excavation area. Note that this is a single item belonging to a different Locations & Objects hierarchy ("Grid system"). The offline session will contextualize this item in a skeletal hierarchy offline, and will automatically detect and take along with it the basemap and/or geodatabase pertaining to that hierarchy.
The Predefinitions needed offline for field excavators. Note that this gives us the opportunity to create a minimal list of Predefinitions to simplify the Apply-a-Predefinition pick-list for offline use and guide the excavators as to what they are capturing and describing.
The Resources hierarchy of georeferenced top plans (just those relevant for this Square to minimize the amount of data to download/offline) to be made available for offline GEO use.
The Set of field/ad-hoc photographs being made available for hotspotting. Note that we do not check on the "Include linked images..." option, as this is a primary list of images, rather than incidentally linked images. Only thumbnails will be offlined, as these are sufficient for hotspotting, and it minimizes the amount of data to be downloaded/offlined.
Individually specified drone images relevant to this Square of excavation. These are large images, selectively added to the session as needed for offline GEO use.
Typical Offline Profile
The faunal specialist, working gamely from the storage room in the way-back, opted to work offline so as not to have to deal with potentially problematic Internet access. In this situation we queried for all "Faunal remains collection" items (that is, bone bags) from a given Area of excavation and check-listed the resulting items directly into the Offline profile list. The only additional requirement was the hierarchy of Predefinitions used for faunal analysis. In this case, OCHRE created a skeletal Locations & Objects framework, just enough to contextualize the items included in the Offline list. This gave the specialist an offline session containing only what she needed and nothing more -- only the items she needed, in context, and only the predefinitions she needed in order to describe them. Since she was adding and describing new subitems within the bone bags, her work would not conflict with any other work being done in OCHRE online.
Offline Profile, Faunal Specialist
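The idea of a skeletal framework, just enough hierarchy to contextualize the selected bone bags, can be illustrated with a small tree-pruning sketch. This is an assumption-laden illustration rather than OCHRE's implementation: the hierarchy is modeled as a hypothetical parent lookup table, and all item names are invented.

```python
def skeletal_hierarchy(parent_of, selected):
    """Build a nested dict holding only the selected items and their ancestors.

    parent_of maps each item name to its parent's name (None for the root).
    """
    tree = {}
    for item in selected:
        # Walk up from the item to the root to collect its context path.
        path = []
        node = item
        while node is not None:
            path.append(node)
            node = parent_of[node]
        # Insert the path root-first, reusing branches that already exist.
        branch = tree
        for name in reversed(path):
            branch = branch.setdefault(name, {})
    return tree


# Illustrative hierarchy only; names are invented.
parent_of = {
    "Area 5": None,
    "Square A": "Area 5",
    "Square B": "Area 5",
    "Locus 12": "Square A",
    "Locus 13": "Square A",
    "Bone bag 1": "Locus 12",
    "Pottery pail 7": "Locus 12",
    "Bone bag 2": "Locus 13",
}
print(skeletal_hierarchy(parent_of, ["Bone bag 1", "Bone bag 2"]))
```

In this example Square B and Pottery pail 7 are left out of the result because nothing selected depends on them, which is the sense in which the session contains only what the specialist needs, in context.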
Variations on the Theme -- Tell Keisan, Israel, Summer 2016
Tell Keisan was a brand-new project with no existing database content. The excavation team opened two areas, represented by two Locations & Objects hierarchies, which together included a total of five 10 m x 10 m Squares -- four in Area E and one in Area F. As such, four of the offline sessions, one for each of the four squares in Area E, offloaded the same hierarchy, that of Area E, every day. Care needed to be taken to upload these four sessions serially, not concurrently, so that they would not potentially contend with each other for the same database items. In particular, the Area E hierarchy itself was needed by each session to insert new items such as Loci and Pottery Pails.
Stamping the Observer and Date on each database item (e.g. each locus and pottery pail) protected it against inadvertently being changed by another database user. That is, when working offline, OCHRE enforces the rule that if a User account is not identified as an observer of an item, then that User is not allowed to make changes to it. Or to state it the other way, only a user who is identified as the observer of an item can edit it offline.
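The observer rule can be expressed as a one-line permission check. The sketch below is a minimal illustration of that rule; the Observation and ObservedItem classes and the user names are hypothetical and do not reflect OCHRE's internal data model.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Observation:
    observer: str
    observed_on: date


@dataclass
class ObservedItem:
    """Hypothetical database item carrying its stamped observations."""
    name: str
    observations: list = field(default_factory=list)


def stamp(item, user, when):
    """Record the user and date on the item, as a Predefinition default would."""
    item.observations.append(Observation(user, when))


def can_edit_offline(user, item):
    """Offline rule: only a user identified as an observer may edit the item."""
    return any(obs.observer == user for obs in item.observations)


locus = ObservedItem("Locus 12")
stamp(locus, "supervisor_a", date(2016, 7, 4))  # user names are invented
print(can_edit_offline("supervisor_a", locus))  # True
print(can_edit_offline("supervisor_b", locus))  # False
```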
The GEO features were utilized fully at Tell Keisan, with georeferenced aerial images taken by the drone being used as the backdrop for Map View, shapefiles being traced from those same images packaged as a geodatabase and made available to Map View, and customized top plans being printed from Map View as needed. Files were shared among offline sessions by syncing a local "Box" account.
The Tell Keisan team was fortunate to have high-speed Internet (typically 60-80 Mbps) available at the dig house and office compound at the kibbutz where they worked. The Registrar, Pottery specialists, and other in-house staff worked online throughout the morning. Since this work lagged a day or two, procedurally, relative to the data being captured by the field sessions, there were no issues of contention. Offline sessions were sync'd up at lunch time, and the team remained online for the afternoon's work session. After dinner the assistant data manager would prepare offline sessions for the 5 field computers which would be distributed to the respective supervisors on their way out the door the next morning at 5 am.
Assistant Data Manager, Annie Schloen, preparing 5 field computers at Tell Keisan
Variations on the Theme -- Tell Shimron, Israel, Summer 2017
The Tell Shimron team piloted the use of many new features of the OCHRE offline system during their 2017 summer season (June - July). New units of excavation were named using Serial number variables, unique-to-hierarchy, applied by an appropriate Predefinition. Items within these units of excavation -- pottery pails, small finds, etc., -- were assigned codes by scanning into their Code fields a pre-printed barcode label. Derived variables applied by appropriate Predefinitions auto-labeled the items using their assigned Code. Collections of items -- faunal remains, chipped stone -- were itemized and described using Tabular Views. As the season wrapped up, all items were tracked using OCHRE's Inventory features.
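The two naming mechanisms mentioned above, a serial number that is unique within its hierarchy and a label derived from a scanned Code, can be sketched as follows. This is an illustration only: the SerialNumbers class, the derived_label function, the label formats, and the sample code value are all assumptions and do not describe how OCHRE implements Serial number or Derived variables.

```python
from collections import defaultdict


class SerialNumbers:
    """Hands out the next number per hierarchy (hypothetical helper)."""

    def __init__(self):
        self._last = defaultdict(int)

    def next(self, hierarchy):
        # Each hierarchy keeps its own counter, so numbers are unique
        # within a hierarchy but may repeat across hierarchies.
        self._last[hierarchy] += 1
        return self._last[hierarchy]


def derived_label(prefix, code):
    """Build an item label from its scanned barcode Code (format assumed)."""
    return f"{prefix} {code}"


serials = SerialNumbers()
print(f"Unit {serials.next('Area A')}")  # Unit 1
print(f"Unit {serials.next('Area A')}")  # Unit 2
print(f"Unit {serials.next('Area B')}")  # Unit 1
print(derived_label("Pottery pail", "B000123"))  # Pottery pail B000123
```

Pre-printed barcodes sidestep the need to coordinate counters between computers, since each label is already unique before it reaches the field.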
The Tell Shimron team used cellular modem hotspots in the field to sync up offline sessions at lunch time. Although Internet service was mixed, and at times problematic, it was generally adequate for most purposes. The field computers were returned to the field supervisors so they could continue their work using OCHRE online. The field supervisors, appropriately trained, were then responsible for creating their own offline sessions later in the day.
Onsite OCHRE Data Manager, Nick Schulte, uploading 11 offline sessions over lunch
The Tell Shimron geospatial team managed their own geospatial resources in ArcMap, and created daily top plans according to their own conventions, but they would catalog their raster images in OCHRE so that they were available to offline GEO sessions. They would also link in their shapefiles by Area (using the Map Options of the appropriate hierarchies), so that the shapefile data was available for OCHRE to display in Map View. This included both polygon shapefiles that displayed the extent and shape of units of excavation, appropriately identified in a Name field of the shapefile's attribute table so that OCHRE could recognize them, as well as point shapefiles (also appropriately tagged with an OCHRE Name in the attribute table) that represented elevation points. Note that in this case the points collected from the Total-station were imported to ArcMap and treated as point-shapefiles (in contrast to the Zincirli approach of importing the coded Total-station points directly into the appropriate OCHRE items as coordinate-style property values). Once the geospatial team had prepared and posted their data to the server where it was available to all, the OCHRE Data Manager would give the go-ahead, and the field supervisors were free to create their new offline sessions in preparation for the next day's work.
Variations on the Theme -- Tayinat Archaeology Project, Turkey, Summer 2017
With other more pressing matters on their plate, like dealing with this special find early in their highly focused expedition of 2017, and with less-than-favourable Internet conditions, the small but dedicated group of Tayinat team members opted to do their primary data capture using their tried-and-true paper-based methods. But while still onsite during their final days of wrap-up, and with the artifacts still at hand, they did an intensive batch entry of the data from the season. Working with targeted offline sessions to counteract the slow Internet they were able to digitally capture their observations from the active season along with the details of the artifacts amassed by their excavation. They returned home with their data satisfyingly integrated and available online, having been resourceful with their use of OCHRE offline and having made the best of their onsite technical resources.
Special find at Tell Tayinat, Summer 2017
Variations on the Theme -- Stress Conditions
We have seen offline sessions created under conditions of extremely slow Internet, clocking in at less than 0.5 Mbps. While this is obviously less than ideal, it is in fact possible. OCHRE is designed to retry if the Internet connection lapses, and is transaction-oriented at a very granular level, and so it handles sub-optimal connections reasonably well. Similarly on the upload, if the syncing up fails due to unreliable Internet the posting can simply be restarted; OCHRE will pick up and continue where it left off. Under such conditions, the best strategy is to set it running then walk away -- go have lunch, take a nap, or leave it overnight. A watched pot never boils!
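The retry-and-resume behaviour described above can be illustrated with a generic sketch of a transaction-by-transaction upload. It is not OCHRE's code: the sync_up function, its parameters, and the post callback are hypothetical, and the sketch assumes that each transaction either posts completely or fails with a ConnectionError.

```python
import time


def sync_up(transactions, post, start=0, max_retries=5, delay=0.0):
    """Post transactions one at a time, starting at index `start`.

    Returns the index of the next unposted transaction, so a later call
    can resume from there instead of starting over.
    """
    position = start
    retries = 0
    while position < len(transactions):
        try:
            post(transactions[position])
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                return position   # give up for now; the caller can restart here
            time.sleep(delay)     # wait before retrying the same transaction
            continue
        position += 1             # posted; move on to the next transaction
        retries = 0
    return position


# Simulated flaky link: every third call fails (illustrative only).
posted = []
calls = {"count": 0}


def flaky_post(transaction):
    calls["count"] += 1
    if calls["count"] % 3 == 0:
        raise ConnectionError("link dropped")
    posted.append(transaction)


# With no retries allowed, the first run stops at the failed transaction.
resume_at = sync_up(["a", "b", "c", "d"], flaky_post, max_retries=0)
# Restarting from the returned position picks up where the first run left off.
done_at = sync_up(["a", "b", "c", "d"], flaky_post, start=resume_at, max_retries=0)
print(resume_at, done_at, posted)
```

Because progress is tracked per transaction, an interrupted upload loses at most the one transaction in flight, which is why a restart does not need to repeat work already posted.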
We have seen specialists create an offline session then disconnect to work independently for weeks at a time before reconnecting with the online database. As long as the online content is not reorganized, and as long as there are some procedural considerations in place to avoid contention (that is, two faunal specialists are not working on the same faunal remains offline at the same time), there is not much scope for trouble. Even weeks later, with thousands of offline transactions to reconcile, the syncing-up process will tick-tick-tick along and post the offline work in batch. While this is not necessarily recommended -- mostly because it leaves work offline, where it should be protected by some backup procedure -- it is certainly within the range of options available.