Turning Japanese

I really think so … the SWORD AtomPub Profile [SWORD] Version 1.3 has been translated into Japanese by Sugita Shigeki and colleagues.

http://www.swordapp.org/sword/specifications


OAI-ORE and ForeSite

Following on from the latest OAI-ORE specs being released, Richard Jones and Rob Sanderson have just announced the outputs of their Foresite project:

“The Foresite [1] project is pleased to announce the initial code of two software libraries for constructing, parsing, manipulating and serialising OAI-ORE [2] Resource Maps. These libraries are being written in Java and Python, and can be used generically to provide advanced functionality to OAI-ORE aware applications, and are compliant with the latest release (0.9) of the specification. The software is open source, released under a BSD licence, and is available from a Google Code repository:

http://code.google.com/p/foresite-toolkit/

You will find that the implementations are not absolutely complete yet, and are lacking good documentation for this early release, but we will be continuing to develop this software throughout the project and hope that it will be of use to the community immediately and beyond the end of the project.

Both libraries support parsing and serialising in: ATOM, RDF/XML, N3, N-Triples, Turtle and RDFa

Foresite is a JISC [3] funded project which aims to produce a demonstrator and test of the OAI-ORE standard by creating Resource Maps of journals and their contents held in JSTOR [4], and delivering them as ATOM documents via the SWORD [5] interface to DSpace [6]. DSpace will ingest these resource maps, and convert them into repository items which reference content which continues to reside in JSTOR. The Python library is being used to generate the resource maps from JSTOR and the Java library is being used to provide all the ingest, transformation and dissemination support required in DSpace.

Please feel free to download and play with the source code, and let us have your feedback via the Google group:

foresite@googlegroups.com

All the best,

Richard Jones & Rob Sanderson

[1] Foresite project page: http://foresite.cheshire3.org/
[2] OAI-ORE specification: http://www.openarchives.org/ore/0.9/toc
[3] Joint Information Systems Committee (JISC): http://www.jisc.ac.uk/
[4] JSTOR: http://www.jstor.org/
[5] Simple Web Service Offering Repository Deposit (SWORD): http://www.ukoln.ac.uk/repositories/digirep/index/SWORD
[6] DSpace: http://www.dspace.org/


Knighted? SWORD at Open Repositories 2008

I’ve been at the Open Repositories 2008 conference in Southampton this week, where I gave a presentation on the SWORD project on the first day. The day before the conference, SWORD had already been discussed very positively at a Microsoft-sponsored meeting – Stuart Lewis, one of the developers on the project, was at that meeting and gave a presentation on SWORD. Throughout the conference, there seems to have been something of a buzz around SWORD and many people are interested in implementing it in their own repositories to support a wide range of use cases. This is very positive, especially as SWORD is about to receive a small amount of additional funding which might allow us to do some of the things people are pressing for.

What seems to have captured people is the lightweight nature, simplicity and web focus of SWORD. This is pleasing, since this was our aim from the start – not to create a walled-garden standard for repositories, but something that could be used anywhere and by any system – this is a powerful thing. Sandy Payette described this as “low-barrier deposit”, which I think sums this up very well.

Among the ideas for what SWORD could usefully do next are:

  • improvements to the SWORD profile and code
  • extend SWORD profile to full APP support, in particular to support update/delete
  • additional code libraries
  • extend, develop deposit tools
  • testing with ORE Resource Maps
  • … ideas?

At the very least the SWORD development team, led by UKOLN, has an opportunity to get some talented innovators together and to come up with some recommendations for the future of SWORD and deposit interoperability more generally, and of how this might be supported by JISC and the repositories community.


Research Data Management Forum

Last week I attended the first meeting of the Research Data Management Forum, jointly organised by the Digital Curation Centre (DCC) and the Research Information Network (RIN). The general aim of the forum is to “improve the quality, reliability, processing, management and accessibility of data of importance to science, technology and society”. The emphasis was on practical collaboration and the assembled group came from a wide variety of places including institutions, data centres and funders. The event kicked off with an opening keynote from Michael Jubb (RIN) which analysed some of the key issues and questions – What are the different types of collection? (research data, community-focussed collections, reference data for larger audiences) Who manages them? (institutions, funders, national services … ) Who uses them? (researchers, data curators, the public … ) What do we mean by data? (computational, experimental, observational … ). He also considered the importance of records management practices in relation to data and asked whether the cost of managing and disposing of data actually outweighs the cost of keeping all of it, and what that means for future usability. Issues of availability, accessibility and usability were seen as paramount. The presentation closed with an examination of notions of citation, credit and reward – Michael asked whether there is any concrete evidence of the value of storing data for re-use and sharing. There need to be real demonstrations of these benefits.

The second day was a mixture of presentation and breakout discussion. Martin Lewis (Sheffield) talked about forthcoming work on an analysis of the research data management community, and Mark Thorley (Natural Environment Research Council) focussed on providing appropriate skills and effort for data curation activities. I attended a breakout group on the latter, where the five recommendations coming out of our discussions were broadly:
- understanding the data curation skills gap
- providing education at grass roots for researchers to improve data management practice from the ground up
- raising awareness
- changing the mindset and culture of researchers by answering the question – what’s in it for me?
- considering industry drivers and government initiatives

Unfortunately I missed the recommendations from the other breakout and the final summary, but overall my feeling is that the meeting was successful in bringing together interested parties and, from this, generating further focus on strategies for solving the various specific issues raised.

Neil Jacobs, from JISC, has also written a very useful summary: http://infteam.jiscinvolve.org/2008/03/21/the-research-data-management-forum/


SWORD article in Ariadne

Stuart Lewis, Sebastien François and I have an article in this month’s Ariadne.

SWORD: Simple Web-service Offering Repository Deposit

It’s a cracking read! Well, hopefully it will offer people a pretty accessible overview of the SWORD project.


SWORD, getting to the point

After 8 months of work by a fantastic bunch of developers, I am able to announce the launch of the main technical outputs from the SWORD project, which I will do after a bit of introductory preamble … skip to the end if you’ve heard it all before.

SWORD is a six-month JISC-funded project to define and develop a standard mechanism for depositing into repositories and other systems. Why? Because currently there is no standard way of doing this. A standard deposit interface to repositories will allow more services to be built which can offer functionality such as deposit from multiple locations, e.g. disparate repositories, desktop drag-and-drop tools or from within standard office applications. SWORD can also facilitate deposit to multiple repositories, increasingly important for depositors who wish to deposit to funder, institutional or subject repositories. Other possibilities include migration of content between repositories, transfer to preservation services and many more.

Rather than develop a new standard from scratch, SWORD chose to leverage the existing Atom Publishing Protocol (APP), “an application-level protocol for publishing and editing Web resources”. APP is based on the HTTP transfer of Atom-formatted representations, yet SWORD has focussed on two key aspects of the protocol – the deposit of files, rather than Atom documents, and the extension mechanism for specifying additional deposit parameters. Also worth noting is that SWORD does not require implementation of all of the functionality of APP; rather, it supports deposit only, but that shouldn’t constrain implementers who want to support the whole of APP.
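To make the "deposit of files plus extra parameters" idea a little more concrete, here is a minimal sketch in Python of what building such a request might look like. It is only an illustration, not the project's client code: the collection URL, file name and user name are made up, the helper function is hypothetical, and the `X-On-Behalf-Of` and `X-No-Op` headers are the extension headers as I recall them from the SWORD profile, so do check the profile itself before relying on them. The request is built but deliberately not sent.

```python
import urllib.request


def build_deposit_request(collection_url, payload, filename,
                          on_behalf_of=None, dry_run=False):
    """Build (but do not send) an HTTP POST that deposits a file.

    Hypothetical helper for illustration only. The X-* header names
    are assumptions based on the SWORD profile; verify them there.
    """
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
    }
    if on_behalf_of:
        # Assumed extension header: deposit on behalf of another user.
        headers["X-On-Behalf-Of"] = on_behalf_of
    if dry_run:
        # Assumed extension header: ask the server to check the deposit
        # without actually storing it.
        headers["X-No-Op"] = "true"
    return urllib.request.Request(
        collection_url, data=payload, headers=headers, method="POST"
    )


# Example values are invented; nothing is sent over the network here.
request = build_deposit_request(
    "http://repository.example.org/sword/collection",
    b"PK-demo-bytes",
    "article.zip",
    on_behalf_of="depositor",
    dry_run=True,
)
print(request.get_method(), request.full_url)
```

The point of the sketch is that the body of the POST is the file itself rather than an Atom document, and that anything extra the repository needs to know travels alongside it as additional parameters.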

So, to the outputs:

1) a profile of APP which implementers can use to create SWORD deposit clients or SWORD interfaces into repositories, where the client will ‘do’ the deposit and the interface will accept it: SWORD Profile.

2) test implementations of the SWORD interface in DSpace, EPrints, IntraLibrary and Fedora to demonstrate the efficacy of the SWORD approach: SWORD Implementations

3) two demonstration clients which can be used to deposit into the implementations at 2) or into any other SWORD-compliant implementations: SWORD clients

4) code for use with DSpace, Fedora, EPrints and the demonstration client: SWORD downloads

There are still some things to follow from SWORD – case studies from repositories and clients intending to implement SWORD will be produced over the next few months, as well as a final report, but for now, I think that’s quite enough to be getting on with.

  |
Ooo:<sword/>
  |


SWORD Profile – finally out there

From my corner of the world I’ve actually gone and achieved something this afternoon. I’ve released the official version 1.0 of the SWORD profile.

http://www.ukoln.ac.uk/repositories/digirep/index/SWORD_APP_Profile_1.0

and I’ve done it with only one spelling mistake in my announcement!

The document is a profile of the Atom Publishing Protocol (APP). APP is an application-level protocol for publishing and editing Web resources which is based on HTTP transfer of ATOM-formatted representations. It’s currently an Internet Draft but nearing standard status and has generated quite a bit of interest, for example from Google, who have defined their own GData profile of it.

The SWORD Profile specifies a subset of elements from the APP for use in depositing content into information systems, such as repositories. The Profile also specifies a number of element extensions to APP, defined to adhere to the extensions mechanism outlined in APP. This profile also makes use of the Atom Syndication Format (ATOM) as used in APP, with extensions.
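As a rough illustration of what "ATOM with extensions" means in practice, here is a small Python sketch that reads an Atom entry carrying one extension element. The sample entry, its URL and the `summarise_entry` helper are all invented for the example; the Atom namespace is the standard one, but the SWORD namespace URI and the `sword:treatment` element are written from memory and should be treated as assumptions to check against the profile.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Assumption: namespace URI for the SWORD extensions; check the profile.
SWORD_NS = "http://purl.org/net/sword/"

# Invented sample of the kind of entry a repository might return.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:sword="http://purl.org/net/sword/">
  <id>http://repository.example.org/items/123</id>
  <title>My deposited article</title>
  <sword:treatment>Unpacked and queued for review</sword:treatment>
</entry>"""


def summarise_entry(xml_text):
    """Pull the standard Atom fields and one extension field from an entry."""
    ns = {"atom": ATOM_NS, "sword": SWORD_NS}
    entry = ET.fromstring(xml_text)
    return {
        "id": entry.findtext("atom:id", namespaces=ns),
        "title": entry.findtext("atom:title", namespaces=ns),
        # Extension element (assumed name) living in its own namespace.
        "treatment": entry.findtext("sword:treatment", namespaces=ns),
    }


summary = summarise_entry(SAMPLE_ENTRY)
print(summary)
```

Because the extension sits in its own namespace, a plain Atom client can simply ignore it, while a client that knows about the extensions can pick it up.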

The SWORD project is testing implementation of this profile within EPrints, DSpace, Fedora and IntraLibrary and developing a reference client to demonstrate remote deposit. This work is nearing completion and outputs will be released soon …

Phew. Much credit to the team of developers I’ve been working with on SWORD.


accelerating ORE

For three days in August I was at Cornell University in Ithaca, New York attending a meeting of the Object Reuse and Exchange acceleration project (ORE). This meeting followed on from the two-day technical committee meeting in New York City in May and means that I have now spent 5 full days discussing this ORE stuff face-to-face alongside a bunch of hours reading, discussing by phone and trying to explain the project to other people. The main ORE project has been running since October last year and I don’t think it would be unkind to say that the technical committee has been struggling with defining the scope of the project and answering the question “just what is it that you are trying to do here?”. The acceleration project, funded by Microsoft and coming at the project’s mid-way point, is probably best described as an injection of steroids to kick start the real technical work that will occupy the next 12 months. Its aim is to produce an alpha set of technical documentation by the end of September which will then be used. The focussed effort of these three days, following on from the two technical meetings beforehand, has been successful in coming up with some concrete use cases, requirements, agreements about the ORE abstract model and ideas about how to serialise this model. Over the next few weeks, a small group will be working very hard to put the results of this meeting into a set of documents that will be ratified by the technical committee.

My take on what ORE is trying to do is that it is creating a bridge between the document-centric scholarly digital library world and the data-centric semantic web community. Its primary aim from the start has been to provide a mechanism for describing compound information objects, in particular to provide a mechanism for describing their boundaries and identifying the types and relationships of their components on the Web, in a way that can facilitate re-use and exchange in a machine-to-machine way. After much discussion ORE is coming towards consensus on a layered approach that will allow a range of implementers with different requirements and different levels of technical skill to describe and share their resources in more meaningful ways. These range from creating a simple manifest listing of information components, to a rich description that expresses the types of components, relationships between them and other resources, alongside essential metadata to aid discovery and re-use. Examples of use cases include a digitized book with a hierarchy of chapters, sections and individual pages made available in different document formats; a scholarly paper with a range of different versions available; and a YouTube page containing an embedded video clip, user comments and references to related resources. What was gratifying about the meeting was that the discussions and disagreements ultimately helped all of us reach a point of tentative consensus, understanding and agreement. The main agreements, I think, can be usefully summarised as follows: as a minimum a resource map should list the components of a compound object in a simple manifest; HTTP URIs should be used to identify the resource map and the compound object; where ‘good’ URIs already exist these can and should be used, but where new URIs are needed these should be created; the resource map may provide richer information about resource types, relationships and other metadata.
One thing I was particularly pleased about was the movement away from re-invention, towards using existing standards and formats as much as possible, such as ATOM, RDF/XML, XHTML and Dublin Core.
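
To show how small the "simple manifest" end of the range could be, here is a sketch in Python that writes a resource map as a handful of N-Triples statements. This is my own illustration, not anything agreed at the meeting: the helper function and all of the example URIs are made up, and the `describes` and `aggregates` terms are borrowed from the ORE vocabulary as it was later published, so they are an assumption as far as this discussion goes.

```python
# Assumption: the ORE terms namespace from the later published vocabulary.
ORE = "http://www.openarchives.org/ore/terms/"


def resource_map_ntriples(rem_uri, aggregation_uri, component_uris):
    """Serialise a minimal manifest: one HTTP URI for the resource map,
    one for the compound object, and one statement per component."""
    lines = [f"<{rem_uri}> <{ORE}describes> <{aggregation_uri}> ."]
    lines += [
        f"<{aggregation_uri}> <{ORE}aggregates> <{uri}> ."
        for uri in component_uris
    ]
    return "\n".join(lines)


# Invented example: a paper available in two formats.
manifest = resource_map_ntriples(
    "http://repository.example.org/rem/paper-1",
    "http://repository.example.org/aggregation/paper-1",
    [
        "http://repository.example.org/files/paper-1.pdf",
        "http://repository.example.org/files/paper-1.html",
    ],
)
print(manifest)
```

Richer descriptions (types, relationships, metadata) would then just be further statements about the same URIs, which is what makes the layered approach attractive.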

After so much time spent involved in the detail of this project I find that I’m more excited about what it can offer than ever. I’ve heard various questions – what is it going to do for scholarly communication, really? is it offering anything new? isn’t this just RDF? but I think what is key for ORE is more about the focus it brings to the ‘compound objects’ (or whatever the chosen terminology is in the end!) issue than what it will ‘produce’. Of particular importance, for me, is the need for making resources available via the Web using pre-existing standards and formats, for encouraging adoption of standard mechanisms for sharing, describing and re-using objects, and doing this in a way that allows traditional ‘documents’ and semantic ‘data’ to co-exist and benefit each other. This fits well with the work we’re doing with SWORD and with the wide use of RSS/ATOM. Once the lightweight bootstrap ORE specification is available, it will be up to repositories and other scholarly communications tools and systems to enrich and profile ORE for their own communities.

