DRMC

Comparison, Federated vs. Built-in DSpace Search

| Feature | Federated (PKP Harvester) | Built-in |
| --- | --- | --- |
| Implied Boolean operator between words | AND | OR (default, configurable) |
| Full-text search | No (unless full text is included in descriptive metadata) | First 10000 terms in document (configurable) |
| Search multiple sites simultaneously | yes | no |
| Indexed fields | Believed to be all Dublin Core fields in the current implementation... verifying. | dc.contributor.*, dc.creator.*, dc.title.*, dc.subject.*, dc.description.abstract, dc.description.statementofresponsibility, dc.relation.ispartofseries, dc.description.tableofcontents, dc.format.mimetype, dc.description.sponsorship, dc.identifier.*, dc.language.iso (default, configurable; changed on the main DRC site so that all dc.language variants are indexed, to give results for a search such as 'French'; John is in the process of adding dc.description on the main DRC site) |
| Thumbnails in results | yes (customization, not currently functional on all repositories) | yes |
| Able to search repositories other than DSpace | yes (prototyped with CONTENTdm and Fedora repositories; may require significant maintenance work) | no |
| Results display | title, creator, thumbnail or logo, approximately the first 200 characters of the description, repository name (configurable). Theme is currently OhioLINK generic from all sites. | title, contributors, insertion date (default, configurable). Themable to host repository. |
| Default results ordering | researching... (appears to count the total number of occurrences of search terms in a record, which biases search results in favor of records with long abstracts, ETDs for example) | researching... |
| Result sort options | none enabled | relevance/title/submit date/issue date (ascending/descending) |
| Maximum size of result set | Limited to 4000 results per search term; result sets are then merged (configurable) | Not limited |
| Time required for addition of new items to the index | Configurable; current suggestion is to harvest new metadata every 24 hours | immediate |
| Language support | researching... | researching... |
| Metadata schema | Dublin Core, MODS, MARC built in. Can be extended and crosswalked. | Metadata schema that can be added to DSpace, so any flat (non-hierarchical) schema; qualifiers are allowed |
| Automatic spelling corrections | no | no |
| Highlight search terms in results | Not currently, but this should be simple to implement | no |
| Suggest controlled vocabulary term to searcher | no | no |
| Maintenance costs | | |
| Performance | | |
| Scope limits | Limit to a single archive | Limit to a single collection |
| Results per page | 10 (configurable) | 10 |
| Post-search faceting | no | no |
| Wildcards | * to match sequence of characters | * to match sequence of characters |
| Advanced search | specify any Dublin Core field, archive. "Language" field populated with current choices, but choices are not controlled. | specify one of 8 indices or keyword, conjunction, sort order |
| Allow user to revise search | Advanced search only | no |
| Grouping of words | Use parentheses | no (or broken) |
| Exact phrase search | Use double quotation marks | Use double quotation marks |
| Availability of statistics | | |
| Word stemming (sail v. sails v. sailing) | no | yes |
| Stop words | yes - researching list: "the" | yes - researching list: "the", "and", "or" |
| Case-sensitive | no | no |
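
The practical effect of two rows above, the implied Boolean operator and the occurrence-count ordering suspected for the federated search, can be seen in a small sketch. This is a toy in-memory model, not code from PKP Harvester or DSpace. The record texts and the function names `tokenize`, `search`, and `rank_by_occurrences` are hypothetical, and the stop-word list is the built-in list as reported in the table. Stemming is not modeled.

```python
import re

# Assumption: the built-in stop-word list as reported in the table above.
STOP_WORDS = {"the", "and", "or"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def search(records, query, operator="AND"):
    """Return ids of records matching the query under an implied operator."""
    terms = tokenize(query)
    if not terms:
        return []
    combine = all if operator == "AND" else any
    hits = []
    for rec_id, text in records.items():
        words = set(tokenize(text))
        if combine(term in words for term in terms):
            hits.append(rec_id)
    return hits


def rank_by_occurrences(records, query, hits):
    """Order hits by the raw count of query terms, with no length normalization."""
    terms = set(tokenize(query))

    def score(rec_id):
        return sum(1 for t in tokenize(records[rec_id]) if t in terms)

    return sorted(hits, key=score, reverse=True)


# Hypothetical records: one short title and one longer, ETD-like abstract.
records = {
    "short": "Sailing ships of Ohio",
    "etd": "Ships and sailing: this thesis studies ships, more ships, and sailing routes",
    "other": "History of the Ohio canal",
}

print(search(records, "Ohio ships", operator="AND"))  # ['short']
print(search(records, "Ohio ships", operator="OR"))   # ['short', 'etd', 'other']
hits = search(records, "ships sailing", operator="AND")
print(rank_by_occurrences(records, "ships sailing", hits))  # ['etd', 'short']
```

With an implied AND the two-word query returns one record, while an implied OR returns all three. Under raw occurrence counting, the longer abstract outranks the short title even though both match every query term, which is the bias noted in the "Default results ordering" row.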

List of problems and questions