DSpace is probably one of the most widely used digital repository software packages. It has evolved into a complex software stack over the past couple of years; as of this writing (April 2011), the latest version is 1.7.1, with new features that include DSpace Discovery, the Mirage XMLUI theme, a curation administration UI and, most importantly, the AIP Backup & Restore process.
I believe I mentioned in one of my previous posts that my research is focused on arguing that the complex software stacks currently employed in most digital repository tools can in fact be abstracted; in other words, the overall architecture can be made simple. I'm not quite sure how to interpret the statement below (it’s on the DSpace about page):
It is free and easy to install “out of the box” and completely customizable to fit the needs of any organization.
I beg to differ; it’s not “easy to install” and it sure as hell is not “out of the box”. I had to install a motherlode of components, and the installation manual makes reference to specific software versions of those same components. And whoever came up with the idea of integrating it with Oracle? We all know the software is not free; well, the free Express Edition has CPU (2-processor limit), RAM (no more than 2GB) and, above all, space limitations (10g XE has a 4GB cap; the newer 11gR2 XE version has an 11GB cap, but as of this writing it’s still a beta version).
Installing the prerequisite software tools was a nightmare on Ubuntu; luckily enough, the local Ubuntu repository had most of the software. For those interested, I’ve tried to document every high-level step I took during the installation process.
STEP 1: Download DSpace software
Download the software from here.
STEP 2: Prerequisite Software Installation
Oracle Java JDK
Apparently, only Oracle’s JDK has been tested with each release and is the only JDK known to work correctly with DSpace; but I decided to be stubborn and just used the Ubuntu OpenJDK.
phiri@PHRLIG001:~$ java -version
java version "1.6.0_20"
OpenJDK Runtime Environment (IcedTea6 1.9.7) (6b20-1.9.7-0ubuntu1)
OpenJDK Server VM (build 19.0-b09, mixed mode)
phiri@PHRLIG001:~$
Apache Maven 2.2.x
I didn’t have Maven, so I had to install it.
phiri@PHRLIG001:~$ sudo apt-get install maven maven2 maven-debian-helper maven-ant-helper maven-repo-helper
phiri@PHRLIG001:~$ sudo apt-get install maven2
[sudo] password for phiri:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  antlr bsh bsh-gcj fop groovy ivy java-wrappers libantlr-java libavalon-framework-java libbackport-util-concurrent-java libbatik-java libbsf-java libclassworlds-java libcommons-cli-java libcommons-collections-java libcommons-configuration-java libcommons-io-java libcommons-jxpath-java libcommons-lang-java libcommons-net2-java libcommons-validator-java libdoxia-java libdoxia-sitetools-java
:
:
Done......
phiri@PHRLIG001:~$ whereis maven2
maven2: /etc/maven2 /usr/share/maven2
phiri@PHRLIG001:~$ /usr/share/maven2/bin/mvn --version
Apache Maven 2.2.1 (rdebian-4)
Java version: 1.6.0_20
Java home: /usr/lib/jvm/java-6-openjdk/jre
Default locale: en_ZA, platform encoding: UTF-8
OS name: "linux" version: "2.6.35-28-generic" arch: "i386" Family: "unix"
phiri@PHRLIG001:~$
Configuring a Proxy
I can’t run away from this; I sit behind a bloody proxy. But I’ve configured CNTLM (a transparent proxy), so the entire process is slightly less painful than it normally would be; and believe me when I say it is painful. What I basically did was copy the settings.xml file from the maven2 configuration directory into my home directory.
phiri@PHRLIG001:~$ sudo find / -name settings.xml
/etc/maven2/settings.xml
phiri@PHRLIG001:~$
<settings>
  :
  :
  <proxies>
    <proxy>
      <active>true</active>
      <protocol>http</protocol>
      <host>localhost</host>
      <port>1955</port>
      <!--username>proxyuser</username-->
      <!--password>somepassword</password-->
      <nonProxyHosts>www.google.com|*.somewhere.com</nonProxyHosts>
    </proxy>
  </proxies>
  :
  :
</settings>
Notice that I’ve commented out the username and password cause CNTLM handles all that crap for me.
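For anyone repeating the copy step, here is a minimal sketch of it as a shell function. It assumes the copy should land in .m2/settings.xml under the home directory, which is where Maven looks for per-user settings; the function name install_maven_settings is made up for illustration, and the source path is the one the find command reported above.

```shell
# Hypothetical helper: copy a system-wide Maven settings file into the
# per-user location (<home>/.m2/settings.xml) without clobbering an
# existing per-user file.
install_maven_settings() {
    src="$1"        # e.g. /etc/maven2/settings.xml
    home_dir="$2"   # e.g. "$HOME"

    mkdir -p "$home_dir/.m2" || return 1

    # Refuse to overwrite settings the user may already have edited.
    if [ -e "$home_dir/.m2/settings.xml" ]; then
        echo "settings.xml already exists in $home_dir/.m2" >&2
        return 1
    fi

    cp "$src" "$home_dir/.m2/settings.xml"
}
```

Something like install_maven_settings /etc/maven2/settings.xml "$HOME" would then leave a private copy in which the proxy block can be edited without touching the system-wide file.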
Apache Ant 1.7 or later
phiri@PHRLIG001:~$ whereis ant
ant: /usr/bin/ant /usr/share/ant /usr/share/man/man1/ant.1.gz
phiri@PHRLIG001:~$ /usr/bin/ant -version
Apache Ant version 1.8.0 compiled on May 9 2010
phiri@PHRLIG001:~$
Relational Database
I had no choice but to go with PostgreSQL; no known Oracle version has been certified with Ubuntu 10.10. Besides, I just wanted to use a different RDBMS. PostgreSQL doesn’t come out of the box with Ubuntu 10.10, so I had to install it.
phiri@PHRLIG001:~$ sudo apt-get install postgresql
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  libpq5 postgresql-8.4 postgresql-client-8.4 postgresql-client-common postgresql-common
Suggested packages:
  oidentd ident-server postgresql-doc-8.4
The following NEW packages will be installed:
  libpq5 postgresql postgresql-8.4 postgresql-client-8.4 postgresql-client-common postgresql-common
0 upgraded, 6 newly installed, 0 to remove and 0 not upgraded.
Need to get 4,911kB of archives
After this operation, 13.1MB of additional disk space will be used.
:
:
Servlet Engine
I settled for tomcat6; I had it installed and pre-configured prior to installing DSpace. In the event that you don’t have it installed, though, the installation process is pretty trivial.
phiri@PHRLIG001:~$ sudo /etc/init.d/tomcat6 status
 * Tomcat servlet engine is running with pid 1379
phiri@PHRLIG001:~$ sudo apt-get install tomcat6
Reading package lists... Done
Building dependency tree
Reading state information... Done
tomcat6 is already the newest version.
:
:
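Before moving on, it can save some pain to confirm that the prerequisite commands actually resolve from the shell. The sketch below is one way to do that; the function name missing_tools is made up for illustration, and the command names (java, mvn, ant, psql) are my assumption of what each package above puts on the PATH.

```shell
# Hypothetical helper: print the name of every given command that the
# shell cannot find on the PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v is the POSIX way to ask whether a command resolves.
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Command names are assumptions based on the prerequisite list above.
missing=$(missing_tools java mvn ant psql)
if [ -n "$missing" ]; then
    echo "Not on PATH: $missing"
else
    echo "All prerequisite commands found"
fi
```

Note that a tool can be installed and still be reported here; in the transcript above Maven was invoked through its full path (/usr/share/maven2/bin/mvn), which suggests it may not have been on the PATH.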
STEP 3: Installation Instructions
Username Creation
So I naturally settled for the Default Release (dspace-<version>-release.zip)
phiri@PHRLIG001:~$ sudo useradd -m dspace
phiri@PHRLIG001:~$ cp Downloads/dspace-1.7.1-release.tar.gz /tmp/
phiri@PHRLIG001:~$ cd /tmp/
phiri@PHRLIG001:/tmp$ gunzip dspace-1.7.1-release.tar.gz
phiri@PHRLIG001:/tmp$ tar -xvf dspace-1.7.1-release.tar
Database configuration
root@PHRLIG001:/tmp# su postgres
postgres@PHRLIG001:/tmp$ createuser -U postgres -d -A -P dspace
Enter password for new role:
Enter it again:
Shall the new role be allowed to create more new roles? (y/n) y
postgres@PHRLIG001:/tmp$
postgres@PHRLIG001:/tmp$ createdb -h localhost -U dspace -E UNICODE dspace
Password:
postgres@PHRLIG001:/tmp$
Installation Package
This seems like the tricky part of the whole process.
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Failed to resolve artifact.
Missing:
----------
1) org.dspace:dspace-services-api:jar:2.0.3
  Try downloading the file manually from the project website.
  Then, install it using the command:
      mvn install:install-file -DgroupId=org.dspace -DartifactId=dspace-services-api -Dversion=2.0.3 -Dpackaging=jar -Dfile=/path/to/file
  Alternatively, if you host your own repository you can deploy the file there:
      mvn deploy:deploy-file -DgroupId=org.dspace -DartifactId=dspace-services-api -Dversion=2.0.3 -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
  Path to dependency:
      1) org.dspace.modules:xmlui:war:1.7.1
      2) org.dspace:dspace-xmlui-api:jar:1.7.1
      3) org.dspace:dspace-services-api:jar:2.0.3
2) org.apache.cocoon:cocoon-flowscript-impl:jar:1.0.0
  Try downloading the file manually from the project website.
  Then, install it using the command:
      mvn install:install-file -DgroupId=org.apache.cocoon -DartifactId=cocoon-flowscript-impl -Dversion=1.0.0 -Dpackaging=jar -Dfile=/path/to/file
  Alternatively, if you host your own repository you can deploy the file there:
      mvn deploy:deploy-file -DgroupId=org.apache.cocoon -DartifactId=cocoon-flowscript-impl -Dversion=1.0.0 -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
  Path to dependency:
      1) org.dspace.modules:xmlui:war:1.7.1
      2) org.dspace:dspace-xmlui-api:jar:1.7.1
      3) org.dspace:dspace-xmlui-wing:jar:1.7.1
      4) org.apache.cocoon:cocoon-flowscript-impl:jar:1.0.0
----------
2 required artifacts are missing.
for artifact:
  org.dspace.modules:xmlui:war:1.7.1
from the specified remote repositories:
  central (http://repo1.maven.org/maven2),
  sonatype-nexus-snapshots (https://oss.sonatype.org/content/repositories/snapshots)
[INFO] ------------------------------------------------------------------------
[INFO] For more information, run Maven with the -e switch
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 13 minutes 58 seconds
[INFO] Finished at: Sat Apr 16 14:35:32 SAST 2011
[INFO] Final Memory: 35M/81M
[INFO] ------------------------------------------------------------------------
dspace@PHRLIG001:/tmp/dspace-1.7.1-release/dspace$
dspace@PHRLIG001:/tmp/dspace-1.7.1-release/dspace$ ant help
Buildfile: build.xml does not exist!
Build failed
This part was a nightmare; I even lost track of the number of times I had to run maven with the clean argument. But then eventually, things worked out.
:
:
59K downloaded (commons-logging-1.1.1.jar)
109K downloaded (junit-4.1.jar)
[WARNING] The following patterns were never triggered in this artifact exclusion filter:
o '*:war:*'
[INFO] Copying 1394 files to /tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir
[INFO]
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] ------------------------------------------------------------------------
[INFO] DSpace Addon Modules .................................. SUCCESS [1.816s]
[INFO] DSpace XML-UI (Manakin) :: Web Application ............ SUCCESS [3:27.807s]
[INFO] DSpace LNI :: Web Application ......................... SUCCESS [28.444s]
[INFO] DSpace OAI :: Web Application ......................... SUCCESS [12.379s]
[INFO] DSpace JSP-UI :: Web Application ...................... SUCCESS [23.363s]
[INFO] DSpace SWORD :: Web Application ....................... SUCCESS [13.958s]
[INFO] DSpace SOLR :: Web Application ........................ SUCCESS [1:37.196s]
[INFO] DSpace Assembly and Configuration ..................... SUCCESS [1:36.467s]
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8 minutes 2 seconds
[INFO] Finished at: Sat Apr 16 15:40:50 SAST 2011
[INFO] Final Memory: 45M/141M
[INFO] ------------------------------------------------------------------------
dspace@PHRLIG001:/tmp/dspace-1.7.1-release/dspace$
Build DSpace & Initialize Database
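Rerunning Maven by hand over and over gets old quickly. A small retry wrapper is one way to automate it; this is only a sketch, and it assumes the failures are transient download problems (for example, timeouts through the proxy) rather than artifacts that are genuinely absent from the repositories. The function name retry is my own.

```shell
# Hypothetical helper: run a command up to <max> times, stopping at the
# first success. Returns 0 on success, 1 if every attempt failed.
retry() {
    max="$1"; shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed: $*" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Assumed usage, run from the dspace source directory:
#   retry 5 mvn clean package
```

The exact Maven invocation in the comment is an assumption; substitute whatever command you were rerunning by hand.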
dspace@PHRLIG001:/tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir$ ant fresh_install
Buildfile: /tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir/build.xml
init_installation:
[mkdir] Created dir: /usr/local/dspace/bin
[mkdir] Created dir: /usr/local/dspace/config
[mkdir] Created dir: /usr/local/dspace/lib
[mkdir] Created dir: /usr/local/dspace/etc
[mkdir] Created dir: /usr/local/dspace/webapps
[mkdir] Created dir: /usr/local/dspace/exports
[mkdir] Created dir: /usr/local/dspace/exports/download
[mkdir] Created dir: /usr/local/dspace/assetstore
[mkdir] Created dir: /usr/local/dspace/handle-server
[mkdir] Created dir: /usr/local/dspace/search
[mkdir] Created dir: /usr/local/dspace/log
[mkdir] Created dir: /usr/local/dspace/upload
[mkdir] Created dir: /usr/local/dspace/reports
So my happiness was short-lived; there’s a particular file I was unable to download using the build script. See the error below.
:
:
copy_webapps:
[copy] Copying 968 files to /usr/local/dspace/webapps
[copy] Copied 130 empty directories to 6 empty directories under /usr/local/dspace/webapps
[copy] Copying 6 files to /usr/local/dspace/webapps
build_webapps_wars:
check_geolite:
init_geolite:
update_geolite:
[echo] Downloading: http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
[get] Getting: http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
[get] To: /usr/local/dspace/config/GeoLiteCity.dat.gz
[get] Error getting http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz to /usr/local/dspace/config/GeoLiteCity.dat.gz
BUILD FAILED
/tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir/build.xml:882: The following error occurred while executing this line:
/tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir/build.xml:945: The following error occurred while executing this line:
/tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir/build.xml:931: java.net.ConnectException: Connection timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
:
:
What I did was manually download the file and then comment out the portion of the build.xml file that downloads it, since I already had a copy on my hard drive. The result is below 🙂
dspace@PHRLIG001:/tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir$ ant update_geolite
Buildfile: /tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir/build.xml
update_geolite:
[echo] Downloading: http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
[gunzip] Expanding /usr/local/dspace/config/GeoLiteCity.dat.gz to /usr/local/dspace/config/GeoLiteCity.dat
[delete] Deleting: /usr/local/dspace/config/GeoLiteCity.dat.gz
BUILD SUCCESSFUL
Total time: 1 second
dspace@PHRLIG001:/tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir$
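Going by the log above, all the update_geolite target does after the download is expand the archive in the config directory and delete the .gz. The sketch below does the same two things by hand for a manually downloaded copy, as a possible alternative to editing build.xml. The function name stage_geolite is made up, and I have not checked whether the build consults anything beyond the expanded file.

```shell
# Hypothetical helper: put a manually downloaded GeoLiteCity.dat.gz where
# the build log says the expanded database ends up.
stage_geolite() {
    archive="$1"      # path to the manually downloaded .gz file
    config_dir="$2"   # e.g. /usr/local/dspace/config

    if [ ! -f "$archive" ]; then
        echo "no such file: $archive" >&2
        return 1
    fi

    cp "$archive" "$config_dir/GeoLiteCity.dat.gz" || return 1

    # gunzip writes GeoLiteCity.dat and removes the .gz, matching the
    # [gunzip] and [delete] steps in the log; -f overwrites an old copy.
    gunzip -f "$config_dir/GeoLiteCity.dat.gz"
}
```

For example, stage_geolite ~/Downloads/GeoLiteCity.dat.gz /usr/local/dspace/config, where the download location is just a guess at where a browser would put the file.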
Deploy Web Applications
I settled for Technique A; I basically copied all the webapps to the tomcat6 webapps folder.
root@PHRLIG001:/tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir# cp -R /usr/local/dspace/webapps/* /var/lib/tomcat6/webapps/
root@PHRLIG001:/tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir#
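Copying everything with a wildcard works, but if you only want some of the web applications in Tomcat, a sketch like the following copies just the ones you name. The function name deploy_webapps is made up, and the webapp directory names in the usage note are my guess from the module names in the build output, not something I verified on disk.

```shell
# Hypothetical helper: copy only the named webapp directories from the
# DSpace webapps directory into the servlet container's webapps directory.
deploy_webapps() {
    src="$1"; dst="$2"; shift 2
    for app in "$@"; do
        # Fail loudly on a mistyped name instead of deploying a partial set.
        if [ ! -d "$src/$app" ]; then
            echo "no such webapp: $src/$app" >&2
            return 1
        fi
        cp -R "$src/$app" "$dst/" || return 1
    done
}
```

Assumed usage: deploy_webapps /usr/local/dspace/webapps /var/lib/tomcat6/webapps xmlui solr, run as a user with write access to the Tomcat directory.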
Finally!!!!
The installation process is clearly not so straightforward after all. Happy installation.