Choosing and Evaluating Data Well Software

Members:
  • Timo Tuominen
  • Timo Aalto
  • Matti Lassila
  • Theodor Tolstoy

Who will (initially) use the data well? What about later on?
What are the main concerns of the company/organization implementing and hosting the data well?
Should we dive into the code of the 3rd party software used?

==========================================
Draft:
==========================================

Choosing and evaluating data wells

Misc. stuff

Versioning of the data

Advantages of data wells / repositories
  • possibility to save data inside/outside of database
  • how to deal with data integrity
  • data curation and persistence
  • possibility to have separate indices for different purposes
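
One common way to approach the data integrity point is to store a checksum next to each stored file and recompute it later (a "fixity check"). The sketch below only illustrates that idea with Python's standard hashlib; the function names `sha256_of` and `verify` are made up for this example and do not come from any of the products discussed here.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" (end of file)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file still matches the checksum recorded at ingest (hypothetical helper)."""
    return sha256_of(path) == expected_hex
```

In use, the digest would be recorded when a file is ingested and `verify` run periodically; a mismatch signals corruption or an unrecorded change.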

Methods of evaluating software
  • Gather your requirements
  • Create, analyze your use cases
  • Compare your use cases with documentation, ask actual users
  • Evaluate project & development process

Criteria
  • Social
      • Project quality, maturity: number of active contributors
      • Availability of commercial support for the product
  • Technical
      • Platform independence
      • Scalability
      • Modularity and extensibility
      • Variety and quality of interfaces
      • Code quality
      • Existence of out-of-box interfaces (both graphical (administrative) & APIs)
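
One possible way to make the criteria comparable across candidates is a simple weighted scoring matrix. The sketch below is only an illustration: the weights, the 0 to 5 scale, the criterion keys, and the candidate names are all assumptions for the example, not results of any actual evaluation.

```python
# Hypothetical weights per criterion (higher = more important to your organisation).
CRITERIA_WEIGHTS = {
    "project_maturity": 3,
    "commercial_support": 1,
    "platform_independence": 2,
    "scalability": 3,
    "modularity": 2,
    "interfaces": 2,
    "code_quality": 1,
}


def weighted_score(scores, weights=CRITERIA_WEIGHTS):
    """Weighted average of per-criterion scores (0-5); unscored criteria count as 0."""
    total_weight = sum(weights.values())
    return sum(w * scores.get(name, 0) for name, w in weights.items()) / total_weight


def rank(candidates, weights=CRITERIA_WEIGHTS):
    """Return candidate names ordered from highest to lowest weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )


# Placeholder candidates with made-up scores, purely to show the call shape.
example = {
    "candidate_a": {"project_maturity": 4, "scalability": 3, "interfaces": 2},
    "candidate_b": {"project_maturity": 2, "scalability": 5, "interfaces": 4},
}
print(rank(example))
```

The weights are where the organisation's own priorities enter, so they should come out of the requirements and use cases gathered first rather than being fixed in advance.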

Examples of different uses / requirements
  • Is the data well an operational system, a long-term storage (LTS) system, or both?


Possible software

### DSpace ###

Data model is quite rigid (communities, subcommunities, collections, items, bundles, bitstreams)
Not all levels of the data model have useful metadata
Access control is a bit of a hack (DSpace was originally built for complete openness)
Until very recently there have not been good APIs for systems integration (e.g. separating UI and backend)
DSpace has tools for maintenance work, built specially for digital object storage
- No versioning support



### Fedora Commons ###
Fedora has tools for maintenance work, built specially for digital object storage
Very flexible data model
- Supports versioning of objects


### General JSR-170 content repository (e.g. Apache Jackrabbit) ###

## Document-oriented databases? ##
e.g. CouchDB, MongoDB, etc.
Building on top of these :) A DIY approach
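
To give a feel for what the DIY approach might involve, here is a minimal sketch of append-only versioning for documents. It uses a plain in-memory dictionary as a stand-in for the database; `VersionedStore` and its methods are hypothetical names for this example and are not the CouchDB or MongoDB API.

```python
import copy


class VersionedStore:
    """In-memory stand-in for a document database (hypothetical; no real DB API is used)."""

    def __init__(self):
        self._versions = {}  # doc_id -> list of document dicts, oldest first

    def save(self, doc_id, document):
        """Append a new version instead of overwriting; return its 1-based version number."""
        history = self._versions.setdefault(doc_id, [])
        history.append(copy.deepcopy(document))  # copy so later caller edits cannot alter history
        return len(history)

    def get(self, doc_id, version=None):
        """Return the latest version, or a specific 1-based version if given."""
        history = self._versions[doc_id]
        if version is None:
            return copy.deepcopy(history[-1])
        if not 1 <= version <= len(history):
            raise KeyError(f"{doc_id} has no version {version}")
        return copy.deepcopy(history[version - 1])
```

Even this toy version shows the kind of work a DIY approach takes on: versioning, integrity checks, and access control would all have to be built and maintained by the implementing organisation.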

Comments

Magali Mermet - Jun 9, 2010 5:02 AM

I don't belong to this group, so I'll just add a comment:
Choosing a data well
On the question: « third-party software: hard and laborious process to adjust. We'd like to have the possibility to export data »
Proposal for how to evaluate: find use cases (companies, institutions...) that seem close to your requirements, get in touch with the people who use the software, see whether it could fit your organisation, and then run a proof test. Use other users' experience so you don't reinvent the wheel. It will also give you the opportunity to meet a wide range of users who are directly or indirectly affected by the software (working for, getting data from...), and to identify where the bottlenecks and the strengths are, and whether a bottleneck would hit a department that is very important to you...