DSpace Installation on Ubuntu 10.10

DSpace is probably one of the most widely used digital repository software packages. It has evolved into a complex software stack over the past couple of years; as of this writing (April 2011), the latest version is 1.7.1, with new features that include DSpace Discovery, the Mirage XMLUI theme, a curation administration UI and, most importantly, the AIP Backup & Restore process.

I believe I mentioned in one of my previous posts that my research is focused on arguing that the complex software stacks currently employed in most digital repository tools can in actual fact be abstracted; in other words, that the overall architecture can be made simple. I’m not quite sure how to interpret the statement below (it’s on the DSpace about page):

It is free and easy to install “out of the box” and completely customizable to fit the needs of any organization.

I beg to differ; it’s not “easy to install” and it sure as hell is not “out of the box”. I had to install a motherlode of components, and the installation manual makes reference to specific software versions of those components. And whoever came up with the idea of integrating it with Oracle? We all know the software is not free; well, the free Express Edition has CPU (2-processor limit), RAM (no more than 2GB) and, above all, space limitations (10g XE has a 4GB cap; the newer 11gR2 XE version has an 11GB cap but, as of this writing, it’s still a beta version).

Installing the prerequisite software tools was a nightmare on Ubuntu; luckily enough, the local Ubuntu repository had most of the software. For those interested, I’ve tried to document every high-level step I took during the installation process.

STEP 1: Download DSpace software

Download the software from here.

STEP 2: Prerequisite Software Installation

Oracle Java JDK

Apparently, only Oracle’s JDK has been tested with each release and it is the only JDK known to work correctly with DSpace; but I decided to be stubborn and just used the Ubuntu OpenJDK.

phiri@PHRLIG001:~$ java -version
java version "1.6.0_20"
OpenJDK Runtime Environment (IcedTea6 1.9.7) (6b20-1.9.7-0ubuntu1)
OpenJDK Server VM (build 19.0-b09, mixed mode)
phiri@PHRLIG001:~$
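
Since the JDK vendor matters here, it can help to check which one is actually on the PATH before going any further. Below is a minimal sketch; the function names java_version_of and is_openjdk are made up for illustration and are not part of any DSpace or Java tooling.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative only).

# Reads a `java -version` banner on stdin and prints the quoted version string.
java_version_of() {
    sed -n 's/^.* version "\([^"]*\)".*$/\1/p' | head -n 1
}

# Succeeds if the banner on stdin mentions OpenJDK.
is_openjdk() {
    grep -q 'OpenJDK'
}

# Usage (java -version writes its banner to stderr, hence the 2>&1):
#   java -version 2>&1 | java_version_of
#   java -version 2>&1 | is_openjdk && echo "running on OpenJDK"
```

Fed the banner shown above, the first helper prints the version string and the second one succeeds, which is a reminder that this install is running on an untested JDK.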

Apache Maven 2.2.x

I didn’t have Maven, so I had to install it.

phiri@PHRLIG001:~$ sudo apt-get install maven
maven2               maven-debian-helper
maven-ant-helper     maven-repo-helper
phiri@PHRLIG001:~$ sudo apt-get install maven2
[sudo] password for phiri:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
antlr bsh bsh-gcj fop groovy ivy java-wrappers libantlr-java
libavalon-framework-java libbackport-util-concurrent-java libbatik-java
libbsf-java libclassworlds-java libcommons-cli-java
libcommons-collections-java libcommons-configuration-java libcommons-io-java
libcommons-jxpath-java libcommons-lang-java libcommons-net2-java
libcommons-validator-java libdoxia-java libdoxia-sitetools-java

:

:

Done......

phiri@PHRLIG001:~$ whereis maven2
maven2: /etc/maven2 /usr/share/maven2
phiri@PHRLIG001:~$ /usr/share/maven2/bin/mvn --version
Apache Maven 2.2.1 (rdebian-4)
Java version: 1.6.0_20
Java home: /usr/lib/jvm/java-6-openjdk/jre
Default locale: en_ZA, platform encoding: UTF-8
OS name: "linux" version: "2.6.35-28-generic" arch: "i386" Family: "unix"
phiri@PHRLIG001:~$

Configuring a Proxy

I can’t run away from this; I sit behind a bloody proxy. I’ve configured CNTLM (a transparent proxy), though, so the entire process is slightly less painful than it normally would be; and believe me when I say it is painful. What I basically did was copy the settings.xml file from the maven2 configuration directory into my home directory (Maven reads per-user settings from ~/.m2/settings.xml) and then edit the proxies section.

phiri@PHRLIG001:~$ sudo find / -name settings.xml
/etc/maven2/settings.xml
phiri@PHRLIG001:~$
<settings>
:
:
<proxies>
<proxy>
<active>true</active>
<protocol>http</protocol>
<host>localhost</host>
<port>1955</port>
<!--username>proxyuser</username-->
<!--password>somepassword</password-->
<nonProxyHosts>www.google.com|*.somewhere.com</nonProxyHosts>
</proxy>
</proxies>
:
:
</settings>

Notice that I’ve commented out the username and password because CNTLM handles all that crap for me.
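
The copy step itself can be sketched as a small shell function. This is only an illustration under a few assumptions: install_user_settings is a made-up name, and the destination ~/.m2/settings.xml is Maven’s standard per-user settings location rather than something shown in the transcript above.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative only).
# Copies a system-wide Maven settings file into the per-user location,
# without clobbering a settings.xml that is already there.
install_user_settings() {
    src=$1        # e.g. /etc/maven2/settings.xml
    home_dir=$2   # e.g. "$HOME"
    dest="$home_dir/.m2/settings.xml"
    if [ -e "$dest" ]; then
        echo "keeping existing $dest" >&2
        return 0
    fi
    mkdir -p "$home_dir/.m2" && cp "$src" "$dest"
}

# Usage:
#   install_user_settings /etc/maven2/settings.xml "$HOME"
```

After the copy, the proxies section is edited in the per-user file, which leaves the system-wide /etc/maven2/settings.xml untouched.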

Apache Ant 1.7 or later

phiri@PHRLIG001:~$ whereis ant
ant: /usr/bin/ant /usr/share/ant /usr/share/man/man1/ant.1.gz
phiri@PHRLIG001:~$ /usr/bin/ant -version
Apache Ant version 1.8.0 compiled on May 9 2010
phiri@PHRLIG001:~$

Relational Database

I had no choice but to go with PostgreSQL; no known Oracle version has been certified with Ubuntu 10.10. Besides, I just wanted to use a different RDBMS. PostgreSQL doesn’t come out of the box with Ubuntu 10.10, so I had to install it.

phiri@PHRLIG001:~$ sudo apt-get install postgresql
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
libpq5 postgresql-8.4 postgresql-client-8.4 postgresql-client-common postgresql-common
Suggested packages:
oidentd ident-server postgresql-doc-8.4
The following NEW packages will be installed:
libpq5 postgresql postgresql-8.4 postgresql-client-8.4 postgresql-client-common postgresql-common
0 upgraded, 6 newly installed, 0 to remove and 0 not upgraded.
Need to get 4,911kB of archives
After this operation, 13.1MB of additional disk space will be used.
:
:

Servlet Engine

I settled for tomcat6; I had it installed and pre-configured prior to installing DSpace. In the event that you don’t have it installed, though, the installation process is pretty trivial.

phiri@PHRLIG001:~$ sudo /etc/init.d/tomcat6 status
* Tomcat servlet engine is running with pid 1379
phiri@PHRLIG001:~$ sudo apt-get install tomcat6
Reading package lists... Done
Building dependency tree
Reading state information... Done
tomcat6 is already the newest version.
:
:

STEP 3: Installation Instructions

Username Creation

So I naturally settled for the Default Release (dspace-<version>-release.zip; I grabbed the tar.gz flavour), created a dspace user and unpacked the archive in /tmp.

phiri@PHRLIG001:~$ sudo useradd -m dspace
phiri@PHRLIG001:~$ cp Downloads/dspace-1.7.1-release.tar.gz /tmp/
phiri@PHRLIG001:~$ cd /tmp/
phiri@PHRLIG001:/tmp$ gunzip dspace-1.7.1-release.tar.gz
phiri@PHRLIG001:/tmp$ tar -xvf dspace-1.7.1-release.tar

Database configuration

root@PHRLIG001:/tmp# su postgres
postgres@PHRLIG001:/tmp$ createuser -U postgres -d -A -P dspace
Enter password for new role:
Enter it again:
Shall the new role be allowed to create more new roles? (y/n) y
postgres@PHRLIG001:/tmp$
postgres@PHRLIG001:/tmp$ createdb -h localhost -U dspace -E UNICODE dspace
Password:
postgres@PHRLIG001:/tmp$

Installation Package

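This turned out to be the tricky part of the whole process. My first Maven builds failed because two required artifacts could not be resolved from the remote repositories.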

[INFO] ------------------------------------------------------------------------
 [ERROR] BUILD ERROR
 [INFO] ------------------------------------------------------------------------
 [INFO] Failed to resolve artifact.

Missing:
 ----------
 1) org.dspace:dspace-services-api:jar:2.0.3
Try downloading the file manually from the project website.
Then, install it using the command:
mvn install:install-file -DgroupId=org.dspace -DartifactId=dspace-services-api -Dversion=2.0.3 -Dpackaging=jar -Dfile=/path/to/file
Alternatively, if you host your own repository you can deploy the file there:
mvn deploy:deploy-file -DgroupId=org.dspace -DartifactId=dspace-services-api -Dversion=2.0.3 -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
Path to dependency:
1) org.dspace.modules:xmlui:war:1.7.1
2) org.dspace:dspace-xmlui-api:jar:1.7.1
3) org.dspace:dspace-services-api:jar:2.0.3
2) org.apache.cocoon:cocoon-flowscript-impl:jar:1.0.0
Try downloading the file manually from the project website.
Then, install it using the command:
mvn install:install-file -DgroupId=org.apache.cocoon -DartifactId=cocoon-flowscript-impl -Dversion=1.0.0 -Dpackaging=jar -Dfile=/path/to/file
Alternatively, if you host your own repository you can deploy the file there:
mvn deploy:deploy-file -DgroupId=org.apache.cocoon -DartifactId=cocoon-flowscript-impl -Dversion=1.0.0 -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]

Path to dependency:
 1) org.dspace.modules:xmlui:war:1.7.1
 2) org.dspace:dspace-xmlui-api:jar:1.7.1
 3) org.dspace:dspace-xmlui-wing:jar:1.7.1
 4) org.apache.cocoon:cocoon-flowscript-impl:jar:1.0.0

----------
 2 required artifacts are missing.

for artifact:
 org.dspace.modules:xmlui:war:1.7.1

from the specified remote repositories:
 central (http://repo1.maven.org/maven2),
 sonatype-nexus-snapshots (https://oss.sonatype.org/content/repositories/snapshots)

[INFO] ------------------------------------------------------------------------
 [INFO] For more information, run Maven with the -e switch
 [INFO] ------------------------------------------------------------------------
 [INFO] Total time: 13 minutes 58 seconds
 [INFO] Finished at: Sat Apr 16 14:35:32 SAST 2011
 [INFO] Final Memory: 35M/81M
 [INFO] ------------------------------------------------------------------------
 dspace@PHRLIG001:/tmp/dspace-1.7.1-release/dspace$
 dspace@PHRLIG001:/tmp/dspace-1.7.1-release/dspace$ ant help
 Buildfile: build.xml does not exist!
 Build failed

This part was a nightmare; I even lost track of the number of times I had to run Maven with the clean argument. Eventually, though, things worked out.

:
 :
 59K downloaded (commons-logging-1.1.1.jar)
 109K downloaded (junit-4.1.jar)
 [WARNING] The following patterns were never triggered in this artifact exclusion filter:
 o '*:war:*'

[INFO] Copying 1394 files to /tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir
 [INFO]
 [INFO]
 [INFO] ------------------------------------------------------------------------
 [INFO] Reactor Summary:
 [INFO] ------------------------------------------------------------------------
 [INFO] DSpace Addon Modules .................................. SUCCESS [1.816s]
 [INFO] DSpace XML-UI (Manakin) :: Web Application ............ SUCCESS [3:27.807s]
 [INFO] DSpace LNI :: Web Application ......................... SUCCESS [28.444s]
 [INFO] DSpace OAI :: Web Application ......................... SUCCESS [12.379s]
 [INFO] DSpace JSP-UI :: Web Application ...................... SUCCESS [23.363s]
 [INFO] DSpace SWORD :: Web Application ....................... SUCCESS [13.958s]
 [INFO] DSpace SOLR :: Web Application ........................ SUCCESS [1:37.196s]
 [INFO] DSpace Assembly and Configuration ..................... SUCCESS [1:36.467s]
 [INFO] ------------------------------------------------------------------------
 [INFO] ------------------------------------------------------------------------
 [INFO] BUILD SUCCESSFUL
 [INFO] ------------------------------------------------------------------------
 [INFO] Total time: 8 minutes 2 seconds
 [INFO] Finished at: Sat Apr 16 15:40:50 SAST 2011
 [INFO] Final Memory: 45M/141M
 [INFO] ------------------------------------------------------------------------
 dspace@PHRLIG001:/tmp/dspace-1.7.1-release/dspace$
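
For anyone stuck in the same loop of flaky downloads, re-running the build by hand can be wrapped in a small retry helper. This is a minimal sketch, not what I actually ran: the retry function is made up for illustration, and `mvn clean package` is an assumption based on the usual DSpace build invocation, since the exact goals are not shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative only).
# retry MAX CMD...: run CMD until it succeeds or MAX attempts are used up.
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage, from the dspace directory of the unpacked release
# (the Maven goals here are an assumption):
#   cd /tmp/dspace-1.7.1-release/dspace
#   retry 5 mvn clean package
```

A retry only helps when the failure is transient, such as a download timing out behind the proxy; an artifact that is genuinely missing from the repositories will fail every time.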

Build DSpace & Initialize Database

dspace@PHRLIG001:/tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir$ ant fresh_install
 Buildfile: /tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir/build.xml

init_installation:
 [mkdir] Created dir: /usr/local/dspace/bin
 [mkdir] Created dir: /usr/local/dspace/config
 [mkdir] Created dir: /usr/local/dspace/lib
 [mkdir] Created dir: /usr/local/dspace/etc
 [mkdir] Created dir: /usr/local/dspace/webapps
 [mkdir] Created dir: /usr/local/dspace/exports
 [mkdir] Created dir: /usr/local/dspace/exports/download
 [mkdir] Created dir: /usr/local/dspace/assetstore
 [mkdir] Created dir: /usr/local/dspace/handle-server
 [mkdir] Created dir: /usr/local/dspace/search
 [mkdir] Created dir: /usr/local/dspace/log
 [mkdir] Created dir: /usr/local/dspace/upload
 [mkdir] Created dir: /usr/local/dspace/reports

My happiness was short-lived, though; there was one particular file that the build script was unable to download. See the error below.

:
 :
 copy_webapps:
 [copy] Copying 968 files to /usr/local/dspace/webapps
 [copy] Copied 130 empty directories to 6 empty directories under /usr/local/dspace/webapps
 [copy] Copying 6 files to /usr/local/dspace/webapps

build_webapps_wars:

check_geolite:

init_geolite:

update_geolite:
 [echo] Downloading: http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
 [get] Getting: http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
 [get] To: /usr/local/dspace/config/GeoLiteCity.dat.gz
 [get] Error getting http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz to /usr/local/dspace/config/GeoLiteCity.dat.gz

BUILD FAILED
 /tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir/build.xml:882: The following error occurred while executing this line:
 /tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir/build.xml:945: The following error occurred while executing this line:
 /tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir/build.xml:931: java.net.ConnectException: Connection timed out
 at java.net.PlainSocketImpl.socketConnect(Native Method)
 at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
 :
 :

What I did was manually download the file and then comment out the portion of the build.xml file that downloads it, since I already had a copy on my hard drive. The result is below 🙂

dspace@PHRLIG001:/tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir$ ant update_geolite
 Buildfile: /tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir/build.xml

update_geolite:
 [echo] Downloading: http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
 [gunzip] Expanding /usr/local/dspace/config/GeoLiteCity.dat.gz to /usr/local/dspace/config/GeoLiteCity.dat
 [delete] Deleting: /usr/local/dspace/config/GeoLiteCity.dat.gz

BUILD SUCCESSFUL
 Total time: 1 second
 dspace@PHRLIG001:/tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir$
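
The manual download can be sketched as follows. The URL and target path are the ones from the build output above; the function name fetch_if_missing is made up for illustration, and using wget is an assumption (any downloader that works through your proxy will do).

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative only).
# Downloads URL to DEST unless a non-empty DEST already exists.
fetch_if_missing() {
    url=$1
    dest=$2
    if [ -s "$dest" ]; then
        echo "already present: $dest"
        return 0
    fi
    wget -O "$dest" "$url"
}

# Usage (run as a user that can write to the DSpace config directory):
#   fetch_if_missing \
#       http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz \
#       /usr/local/dspace/config/GeoLiteCity.dat.gz
```

The timeout was presumably caused by Ant not going through the proxy. An alternative I did not try is to pass the standard Java proxy properties to Ant, for example ANT_OPTS="-Dhttp.proxyHost=localhost -Dhttp.proxyPort=1955", and leave build.xml alone.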

Deploy Web Applications
I settled for Technique A; I basically copied all the webapps to the tomcat6 webapps folder.

root@PHRLIG001:/tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir# cp -R /usr/local/dspace/webapps/* /var/lib/tomcat6/webapps/
 root@PHRLIG001:/tmp/dspace-1.7.1-release/dspace/target/dspace-1.7.1-build.dir#
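
If you only want some of the webapps, or want to see what gets deployed, the copy can be wrapped in a loop. This is a minimal sketch; deploy_webapps is a made-up name, and the two directories are simply the ones used in the command above.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative only).
# Copies every webapp directory under SRC into DEST, one at a time.
deploy_webapps() {
    src=$1    # e.g. /usr/local/dspace/webapps
    dest=$2   # e.g. /var/lib/tomcat6/webapps
    for app in "$src"/*; do
        [ -d "$app" ] || continue
        cp -R "$app" "$dest"/ || return 1
        echo "deployed $(basename "$app")"
    done
}

# Usage (as root, like the cp above):
#   deploy_webapps /usr/local/dspace/webapps /var/lib/tomcat6/webapps
```

Depending on how Tomcat is configured, a restart (sudo /etc/init.d/tomcat6 restart) may be needed before the new webapps show up.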

Finally!!!!

The installation process is clearly not so straightforward after all. Happy installing.