Books Repository for Eclipse

This is the project I am currently working on.

The Books Repository will be implemented in Java as an Eclipse plugin. Its goal is to store and browse large collections of books.

A book repository is a tree-like structure of resource objects. The resources managed by a book repository are:
  • Library - a book collection. A Library contains a list of Books
  • Book - a book. A Book can contain Sections and Documents
  • Section - a book section. A Section is a logical part of a book and may contain Documents
  • Document - a searchable text document. A Document contains textual data and may contain other external documents (not necessarily text documents) known as attachments
  • Attachment - a document attachment
  • Note - a user comment that may be attached to a document
  • Bookmark - a reference to a node in a Library tree structure
Only Note and Bookmark resources are editable by the user; the rest are read-only.
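
The containment rules above can be sketched as a small Java model. This is only an illustration: the class names, the containment table and the choice to hang Notes under Documents are my reading of the list, not a settled design. Bookmarks are left out of the containment table because they are described as references to nodes rather than as children.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Resource kinds from the list above (names are illustrative). */
enum ResourceType {
    LIBRARY, BOOK, SECTION, DOCUMENT, ATTACHMENT, NOTE, BOOKMARK;

    /** Child types this type may contain; one possible reading of the list. */
    Set<ResourceType> allowedChildren() {
        switch (this) {
            case LIBRARY:  return EnumSet.of(BOOK);
            case BOOK:     return EnumSet.of(SECTION, DOCUMENT);
            case SECTION:  return EnumSet.of(DOCUMENT);
            case DOCUMENT: return EnumSet.of(ATTACHMENT, NOTE);
            default:       return EnumSet.noneOf(ResourceType.class);
        }
    }

    /** Only Note and Bookmark resources are editable by the user. */
    boolean isEditable() {
        return this == NOTE || this == BOOKMARK;
    }
}

/** A node in the repository tree. */
class Resource {
    private final ResourceType type;
    private final String id;
    private final List<Resource> children = new ArrayList<>();

    Resource(ResourceType type, String id) {
        this.type = type;
        this.id = id;
    }

    /** Adds a child, rejecting combinations the containment table does not allow. */
    Resource add(Resource child) {
        if (!type.allowedChildren().contains(child.type)) {
            throw new IllegalArgumentException(type + " cannot contain " + child.type);
        }
        children.add(child);
        return this;
    }

    ResourceType type() { return type; }
    String id() { return id; }
    List<Resource> children() { return Collections.unmodifiableList(children); }
}
```

With this model, adding a Document directly to a Library fails with an IllegalArgumentException, while adding a Book to a Library and a Document to that Book succeeds.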

The resources are described by a set of properties known as the resource METADATA.
All resources may define the following METADATA fields:
  • id (string) - The local resource identifier
  • name (string) - The resource name
  • creator (string) - The creator
  • date (string: YYYY-MM-DD) - The creation date
  • description (string) - The resource description
  • subject (string) - A list of searchable keywords that describe the resource
The only required fields are id and name.
Each resource type may add custom fields to its METADATA, but these fields are not yet defined. Also, the format of the id field is not yet defined.
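
As a rough sketch of how the required fields and the date format could be enforced, here is a minimal metadata holder. The class name `Metadata` and its methods are hypothetical. Since the id format is not yet defined, the sketch only checks that id and name are non-empty.

```java
import java.time.LocalDate;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for the METADATA fields of one resource. */
class Metadata {
    private final Map<String, String> fields = new LinkedHashMap<>();

    /** id and name are the only required fields, so the constructor demands them. */
    Metadata(String id, String name) {
        set("id", id);
        set("name", name);
    }

    /** Sets a field; custom fields are accepted as-is because none are defined yet. */
    Metadata set(String field, String value) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("empty value for field: " + field);
        }
        if ("date".equals(field)) {
            // Throws DateTimeParseException unless the value is a valid YYYY-MM-DD date.
            LocalDate.parse(value);
        }
        fields.put(field, value);
        return this;
    }

    String get(String field) {
        return fields.get(field);
    }

    Map<String, String> asMap() {
        return Collections.unmodifiableMap(fields);
    }
}
```

For example, `new Metadata("b1", "Some Book").set("date", "2005-02-02")` is accepted, while an empty id or a date such as `02/02/2005` is rejected.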

Each resource instance is globally identified by a URI.
Local identifiers (the id field) are unique only relative to a given context.

The Library structure (i.e. the resource tree) and the METADATA are described using the RDF format.
At this moment the metadata element set is not yet defined.
One possibility is to use the Dublin Core Metadata Element Set to describe our resources' METADATA.
(See http://dublincore.org/documents/dces/)
The RDF description will be stored in an RDF database.
As the RDF engine I chose Sesame (see http://openrdf.org) because it is the only RDF engine I have found that embeds a Java database, which increases the engine's portability. It also supports external databases such as MySQL, PostgreSQL, Oracle etc.
For a survey of RDF engines you can check http://www.w3.org/2001/05/rdf-ds/DataStore.
Another interesting RDF engine is Redland (see http://librdf.org/).
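
To give an idea of what such an RDF description could look like, here is a small sketch that writes triples in the N-Triples syntax using Dublin Core property URIs. It deliberately does not use the Sesame API. The class `NTriplesWriter`, the example.org URIs, the mapping of name to dc:title and the use of dcterms:hasPart for the tree structure are all assumptions made for illustration, since the element set is not yet chosen.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that collects RDF statements and prints them as N-Triples. */
class NTriplesWriter {
    /** Dublin Core Metadata Element Set 1.1 namespace. */
    static final String DC = "http://purl.org/dc/elements/1.1/";
    /** DCMI Metadata Terms namespace (used here for hasPart). */
    static final String DCTERMS = "http://purl.org/dc/terms/";

    private final List<String> lines = new ArrayList<>();

    /** Adds a statement whose object is a plain string literal. */
    NTriplesWriter literal(String subjectUri, String predicateUri, String value) {
        lines.add("<" + subjectUri + "> <" + predicateUri + "> \"" + escape(value) + "\" .");
        return this;
    }

    /** Adds a statement whose object is another resource. */
    NTriplesWriter link(String subjectUri, String predicateUri, String objectUri) {
        lines.add("<" + subjectUri + "> <" + predicateUri + "> <" + objectUri + "> .");
        return this;
    }

    /** Escapes backslashes, quotes and line breaks inside a literal. */
    static String escape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r");
    }

    /** Returns one statement per line. */
    String write() {
        StringBuilder out = new StringBuilder();
        for (String line : lines) {
            out.append(line).append('\n');
        }
        return out.toString();
    }
}
```

A Library that contains a Book could then be described with one `link` call (library, `DCTERMS + "hasPart"`, book) plus one `literal` call per METADATA field of the Book.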

Document, Attachment and Note resources may contain text or binary data.
This data is not stored in the RDF description of the library. Instead, we store it in a data repository on the hard disk.
The data repository may be implemented using regular files/directories or using a database. For now, I will use the file system approach. Thus data will be stored in files inside a directory tree structure that will be defined later.
Document data is stored as DocBook files and Note data is stored as plain text.
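
A minimal sketch of the file system approach, limited to Note data. The directory layout (`notes/<id>.txt`) and the class name `FileDataStore` are placeholders, because the real directory tree structure will be defined later.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical data repository that keeps resource data in regular files. */
class FileDataStore {
    private final Path root;

    FileDataStore(Path root) {
        this.root = root.toAbsolutePath().normalize();
    }

    /** Maps a resource to a file under the root; the kind/id.extension layout is a placeholder. */
    Path pathFor(String kind, String id, String extension) {
        Path p = root.resolve(kind).resolve(id + "." + extension).normalize();
        if (!p.startsWith(root)) {
            throw new IllegalArgumentException("id escapes the repository root: " + id);
        }
        return p;
    }

    /** Stores Note data as plain text. */
    void writeNote(String id, String text) throws IOException {
        Path p = pathFor("notes", id, "txt");
        Files.createDirectories(p.getParent());
        Files.write(p, text.getBytes(StandardCharsets.UTF_8));
    }

    /** Reads back the plain text of a Note. */
    String readNote(String id) throws IOException {
        return new String(Files.readAllBytes(pathFor("notes", id, "txt")), StandardCharsets.UTF_8);
    }
}
```

The check in `pathFor` is there because the id format is still undefined: whatever it ends up being, an id should never resolve to a file outside the repository directory.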

There are two kinds of search we can perform on resources:
  1. Search based on resource properties (using METADATA fields)
  2. Search on Document text data (using a text search engine)
As the text search engine I chose Lucene, which is already included with Eclipse
(see http://jakarta.apache.org/lucene/docs/index.html).
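
To make the second kind of search more concrete, here is a toy inverted index over Document text. It is a stand-in that only illustrates the idea behind a text search engine. It is not Lucene and does not use Lucene's API, and the class name `TinyTextIndex` is made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index: maps each term to the ids of the documents that contain it. */
class TinyTextIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Indexes the text of one document under its id. */
    void add(String documentId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(documentId);
        }
    }

    /** Returns the ids of the documents that contain every term of the query. */
    Set<String> search(String query) {
        Set<String> result = null;
        for (String term : tokenize(query)) {
            Set<String> hits = postings.getOrDefault(term, Collections.<String>emptySet());
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits);
            }
        }
        return result == null ? Collections.<String>emptySet() : result;
    }

    /** Lower-cases the text and splits it on anything that is not a letter or digit. */
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }
}
```

A real engine adds ranking, stemming, phrase queries and an on-disk index, which is the reason for relying on an existing library instead of code like this.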

The DocBook files will not be displayed directly by the Eclipse plugin because
that would require extra work to create a DocBook editor using the Eclipse SWT library.
Instead, we will convert the DocBook files into HTML files that can be viewed in Eclipse using the embedded web browser.
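
One way to do this conversion is an XSLT transformation through the standard javax.xml.transform API. The sketch below uses a toy inline stylesheet that only handles `article`, `title` and `para` elements without a namespace. A real converter would plug in a complete DocBook stylesheet instead. The class name `DocBookToHtml` is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical DocBook to HTML converter built on the JDK XSLT support. */
class DocBookToHtml {
    // Toy stylesheet covering only <article>, <title> and <para>; an assumption for illustration.
    private static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='html'/>"
        + "<xsl:template match='/article'>"
        + "<html><body><xsl:apply-templates/></body></html>"
        + "</xsl:template>"
        + "<xsl:template match='title'><h1><xsl:value-of select='.'/></h1></xsl:template>"
        + "<xsl:template match='para'><p><xsl:apply-templates/></p></xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms a DocBook XML string into an HTML string. */
    static String convert(String docBookXml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter html = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(docBookXml)),
                              new StreamResult(html));
        return html.toString();
    }
}
```

The resulting HTML string (or a file written from it) is what the embedded browser would display.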

Basically, to browse a book repository, we need to create a Tree Navigator View and an HTML Browser.
The Tree View is used to display the repository structure, and the HTML Browser is used to display a Document's data when a Document node in the Tree View is double-clicked.
Other views also need to be created, such as: Bookmarks list, Metadata view, Notes list, Context Navigation, Search results etc.

Also, a Welcome page (Home page) needs to be displayed the first time the user opens a book repository.
The Welcome page will be implemented as an HTML page (displayable in an HTML Browser instance).
The content of the Welcome page is not yet defined.

Another requirement of the application is to implement a method to synchronize (i.e. update) repositories with a central server.
This will be implemented using a diff/patch system (based on the Unix diff tool or on an XML diff tool).
Update procedure:
  1. Download the update packages (a collection of diff files and attachments)
  2. Apply the diff files to the local versions
  3. Rebuild the RDF database and the text search indexes
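
The three steps above could be wired together roughly as follows. The interfaces `PackageSource`, `Patcher` and `Indexer` are hypothetical placeholders for the download, diff/patch and indexing code, none of which is designed yet. Skipping the rebuild when nothing was downloaded is my own assumption.

```java
import java.util.List;

/** Hypothetical driver for the update procedure. */
class UpdateRunner {
    /** Step 1: downloads the update package and returns the names of its diff files. */
    interface PackageSource { List<String> download(); }

    /** Step 2: applies one diff file to the local version. */
    interface Patcher { void apply(String diffFile); }

    /** Step 3: rebuilds one index (RDF database or text search index). */
    interface Indexer { void rebuild(); }

    /**
     * Runs the three steps in order and returns the number of diff files applied.
     * If a patch throws, the exception propagates and nothing is rebuilt.
     */
    static int run(PackageSource source, Patcher patcher, Indexer rdfIndex, Indexer textIndex) {
        List<String> diffFiles = source.download();
        for (String diffFile : diffFiles) {
            patcher.apply(diffFile);
        }
        if (!diffFiles.isEmpty()) {
            rdfIndex.rebuild();
            textIndex.rebuild();
        }
        return diffFiles.size();
    }
}
```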

Posted by Bogdan Stefanescu @ 02/02/2005 06:35 PM. - Categories: eclipse, nuxeo -  0 comments
