Greenstone Digital Library Software

So I’ve spent the past couple of weeks trying to familiarize myself with the most common open source digital repository software tools, and I’ve been using the Parasoft services to test them out. The thing is, I’m going to spend the next 2 years of my life researching existing digital library software architectures… and eventually (hopefully) proving that a much simpler architecture can work just as well as the current complex software stacks.

Greenstone is a legend! That’s probably because it was one of the first digital repository software packages to be implemented. The other reason it is popular is that it is developed and distributed in cooperation with UNESCO.

The Linux distribution basically comes with statically linked binaries (see output below). The installation package also comes bundled with all the tools (ImageMagick with JPEG 2000 support and the Apache web server) required for Greenstone to function properly.

The Fourth Paradigm

So my supervisor gave me my first reading last week: “The Fourth Paradigm: Data-Intensive Scientific Discovery”. I just finished going through the first chapter (an edited version of Jim Gray’s last talk, the eScience talk he gave at the NRC-CSTB meeting in 2007 before he was lost at sea) and I must say, it is a rather interesting book.

The foreword is especially interesting, as it starts off with a classic example of how useful curated data can be: Johannes Kepler discovered the laws of planetary motion using Tycho Brahe’s catalog of systematic astronomical observations. Gordon Bell then describes data-intensive science as consisting of three basic activities: capture, curation, and analysis.