June 2017
SAFBuilder: a tool for importing metadata and full text into DSpace through the Batch Import (ZIP) feature available in the DSpace WebUI.
SAFBuilder takes a CSV file of metadata, with the full-text files of all items kept in the same folder, and the following commands create a ZIP file that can be used directly to import bulk data into a DSpace collection. To use this tool one needs JDK, Maven and Git.
https://github.com/DSpace-Labs/SAFBuilder
git clone https://github.com/DSpace-Labs/SAFBuilder.git
cd SAFBuilder
./safbuilder.sh -c src/sample_data/AAA_batch-metadata.csv -z
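The input folder for such a run can be sketched as below. This is only a minimal sketch, assuming the column layout of the sample CSV shipped with SAFBuilder (a filename column plus dc.* metadata columns); the folder name my_batch, the file names and the metadata values are made up for illustration.

```shell
#!/bin/sh
# Hypothetical batch folder: the metadata CSV and the full-text files side by side.
BATCH_DIR="${BATCH_DIR:-./my_batch}"
mkdir -p "$BATCH_DIR"

# Placeholder full-text files (real PDFs would go here).
: > "$BATCH_DIR/report1.pdf"
: > "$BATCH_DIR/report2.pdf"

# Assumed column layout: "filename" names a file in the same folder,
# the remaining columns are Dublin Core fields.
cat > "$BATCH_DIR/metadata.csv" <<'EOF'
filename,dc.title,dc.contributor.author,dc.date.issued
report1.pdf,First sample report,"Doe, John",2016
report2.pdf,Second sample report,"Roe, Jane",2017
EOF

# Check that every file named in the first CSV column exists before building the ZIP.
MISSING=0
tail -n +2 "$BATCH_DIR/metadata.csv" | cut -d, -f1 > "$BATCH_DIR/.files.txt"
while read -r f; do
  [ -f "$BATCH_DIR/$f" ] || { echo "missing: $f"; MISSING=$((MISSING + 1)); }
done < "$BATCH_DIR/.files.txt"
rm -f "$BATCH_DIR/.files.txt"
echo "missing files: $MISSING"

# From inside the SAFBuilder checkout, the ZIP would then be built with:
#   ./safbuilder.sh -c /path/to/my_batch/metadata.csv -z
```

Checking the file names first is a design choice of this sketch, so that a typo in the CSV shows up before the ZIP is built rather than during the import.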
19th October 2016
DSpace xalan error - too many files open
To fix the above error, I added the following lines to /etc/security/limits.conf:
# End of file
* hard nofile 65535
* soft nofile 65535
root hard nofile 65535
root soft nofile 65535
I am not sure whether this is correct; it still needs to be checked.
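One way to check whether the new limits took effect is to look at the limits of a fresh login shell. A minimal sketch; it only reads values and changes nothing. limits.conf is applied by PAM at login, so the values normally change only after logging in again.

```shell
#!/bin/sh
# Soft and hard limits on open files for the current shell.
SOFT=$(ulimit -S -n)
HARD=$(ulimit -H -n)
echo "soft nofile: $SOFT"
echo "hard nofile: $HARD"

# Non-comment lines in limits.conf that mention nofile (the file may not exist on every system).
if [ -r /etc/security/limits.conf ]; then
  grep -v '^[[:space:]]*#' /etc/security/limits.conf | grep nofile || echo "no nofile entries found"
fi
```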
DSpace migration of GIPE, Pune - 10th - 20th October 2015
GIPE had installed DSpace and later upgraded it, but because of low disk space an external disk was added and the assetstore was mounted as assetstore0 and assetstore1. They now wanted to shift from the old server to a new server with 16 GB RAM and a 4 TB hard disk. I first asked them to install the new liblivecd with DSpace 5.2.
1. Then replaced the assetstore of the new DSpace by copying the data of assetstore0 and assetstore1 into the new /home/dspace/assetstore, using the following commands taken from this site:
https://library.osu.edu/blogs/it/merge-two-assetstores-dspace/
rsync -a --progress --stats dspace@remoteserver:/home/dspace/assetstore0/ /home/dspace/assetstore
rsync -a --progress --stats dspace@remoteserver:/home/dspace/assetstore1/ /home/dspace/assetstore
2. Then deleted the postgres database dspace and the user dspace on the new server, created a blank dspace user and database again, and restored the postgres backup from the old server into it.
3. Also copied the log, solr and webapps directories from the old dspace to the new dspace home folder.
4. Changed the ownership of the whole /home/dspace folder to dspace:dspace.
5. web.xml and solrconfig.xml were pointing to /home/dspace51, so changed that to /home/dspace and restarted tomcat. It then showed all the data properly, including all metadata with full text. What remains is to reindex the data, check solr, configure the handle server and the IP address, and redo the small customizations that had been made earlier; after that the new server can go live.
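Steps 2 and 4 above can be sketched roughly as follows. This is a sketch under assumptions, not the exact commands that were run: the dump file name dspace_backup.sql is hypothetical, a plain SQL dump is assumed, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Prints each command; executes it only when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Hypothetical plain SQL dump copied over from the old server.
DUMP="${DUMP:-/home/dspace/dspace_backup.sql}"

# Step 2: drop the fresh database and user, recreate them empty, restore the old dump.
run sudo -u postgres dropdb dspace
run sudo -u postgres dropuser dspace
run sudo -u postgres createuser --no-superuser --pwprompt dspace
run sudo -u postgres createdb --owner=dspace --encoding=UNICODE dspace
run sudo -u postgres psql -d dspace -f "$DUMP"

# Step 4: give the whole installation back to the dspace user.
run sudo chown -R dspace:dspace /home/dspace
```

Running it once with the default DRY_RUN=1 shows the full command list for review before anything is dropped.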
DSpace metadata import through gui as well as through command line.
DSpace supports importing metadata in CSV format through the webui. To import metadata through the webui, the following changes are required.
Before using this feature through the webui, edit the file bulkedit.cfg in /home/dspace/config/modules/. Uncomment the following lines and change gui-item-limit from 20 to about 500, so that your CSV data can be imported 500 records at a time. It can take 2000 as well (please also have a look at this link: https://digitalriceprojects.pbworks.com/w/page/47771029/Metadata%20batch%20process ).
### Bulk metadata editor settings ###
# The delimiter used to separate values within a single field (defaults to a double pipe ||)
valueseparator = ||
# The delimiter used to separate fields (defaults to a comma for CSV)
fieldseparator = ,
# The delimiter used to separate authority data (defaults to a double colon ::)
authorityseparator = ::
# A hard limit of the number of items allowed to be edited in one go in the UI
# (does not apply to the command line version)
gui-item-limit = 500
# Metadata elements to exclude when exporting via the user interfaces, or when using the
# command line version and not using the -a (all) option.
# ignore-on-export = dc.date.accessioned, dc.date.available, \
# dc.date.updated, dc.description.provenance
# Should the 'action' column allow the 'expunge' method. By default this is set to false
allowexpunge = true
Now add two columns to your CSV file: one named id and a second named collection, which holds the collection id (its handle). In the id column put only a "+" sign, so that the DSpace import command treats the row as a new item to be added to the collection.
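A CSV prepared this way might look like the sketch below. The handle 123456789/2, the file name new_items.csv and the metadata values are made-up examples; the collection column is assumed to carry the handle of the target collection, and multiple values in one field are joined with the valueseparator || from bulkedit.cfg.

```shell
#!/bin/sh
# Hypothetical output file for the import.
CSV="${CSV:-./new_items.csv}"

# "+" in the id column marks each row as a new item.
cat > "$CSV" <<'EOF'
id,collection,dc.title,dc.contributor.author,dc.date.issued
+,123456789/2,Sample title one,"Doe, John||Roe, Jane",2015
+,123456789/2,Sample title two,"Smith, Ann",2016
EOF

# Count the rows that will be created as new items.
NEW_ITEMS=$(grep -c '^+,' "$CSV")
echo "rows marked as new items: $NEW_ITEMS"
```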
Once the CSV file is ready, "Import Metadata" can be run through the webui. All your metadata will then be imported into DSpace with the fields you selected, and a new URL is added automatically for each item.
Metadata import can also be run from the terminal with the following command:
./dspace metadata-import -f /home/library/Downloads/sample_data_1.csv -e library@localhost -w -n -t
If DSpace throws an error about Java heap space: after asking on the DSpace IRC channel, the following commands helped to reduce the heap errors. Both tomcat and the Discovery index should be run with the following options.
dspace@dspace# CATALINA_OPTS="-Xmx2048m -Xms1024m -Dfile.encoding=UTF-8" /home/dspace/apache-tomcat-7.0.55/bin/startup.sh
(an Xmx of 2 GB can be used if the machine has 4 GB of RAM)
dspace@dspace# CATALINA_OPTS="-Xmx2048m -Dfile.encoding=UTF-8" /home/dspace/bin/dspace index-discovery -cb
index-discovery takes time to index all records if data is large.
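The note above uses a 2 GB heap on a machine with 4 GB of RAM. The small sketch below generalizes this to half of the installed RAM; the half-of-RAM rule is only an assumption extrapolated from that one data point, and the function name suggest_xmx_mb is made up. It reads /proc/meminfo, so it is Linux-only, and falls back to 4096 MB elsewhere.

```shell
#!/bin/sh
# Suggest a heap size (MB) as half of the given RAM (MB). Assumed rule of thumb.
suggest_xmx_mb() {
  echo $(( $1 / 2 ))
}

# Total RAM in MB from /proc/meminfo (MemTotal is reported in kB).
TOTAL_MB=""
if [ -r /proc/meminfo ]; then
  TOTAL_MB=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)
fi
# Fall back to 4096 MB when the value could not be read.
TOTAL_MB="${TOTAL_MB:-4096}"

XMX_MB=$(suggest_xmx_mb "$TOTAL_MB")
echo "CATALINA_OPTS=\"-Xmx${XMX_MB}m -Dfile.encoding=UTF-8\""
```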