Tuesday 31 July 2012

Open Repositories 2012 - highlights and takeaways

One of the things I always find when attending repository community events is the repository envy that usually accompanies them.  OR2012 was no exception.  My lovely, still pretty new, DSpace repository is as close to out-of-the-box as it's possible to be, with only a few minor tweaks.  This was partially intentional: with IT colleagues stretched, and the big systems they manage needing time and resource, an open source repository (and its slightly unflattering interface in particular) is more than they can handle at the moment.  This is something I try to accept, knowing that most people find their way to content in the repository via Google or some other search engine, or via a harvester, so they're unlikely ever to see the homepage or care what the stylesheet looks like.

What I did find this time though was my content envy kicking in (too many examples to cover in this rather long post, but more to follow in due course).  This was not just about the quantity of content, but the types and diversity of content that some repositories (or digital libraries/commons) contained.  There is scope here for QM: with rare collections and Archives still only partially digitised, we have lots of potential to expand the type and quantity of our digital collections if we can find the time and resource for digitisation... but before I get lost down the 'if-only' rabbit hole, here is a quick look at some of the technological things that really stuck with me and some musings on how they might be useful.

Page turner plug-in

Embedded players for audio and video content have always been high on my agenda for our repository (should we ever be presented with audio or video content I want to be prepared), but the page turner plug-in which made a fleeting appearance in Eric Robert James' (Yale University) presentation at RF6 - Lessons Learned was really interesting, and poses an opportunity for those with technical brains.  Is there potential not just to allow page turning within the repository interface, but also to embed this as a viewer within other webpages (such as publications lists), so that viewers can go straight into reading the paper rather than having to follow the link to the repository and then download the file?  What impact would this have on viewing/download stats?  Does this also fit into the idea of the invisible repository that I heard mentioned during the course of the conference and attributed to William Nixon?

Metrics

One thing that many repository managers tell me is that they find researchers will be more engaged if they can get hold of usage metrics, mentions, etc. of their work once it has been made open access.  We've been making lots of efforts at QM to get our usage statistics in order and get them out there for researchers to see.  I have also been following the discussion around Altmetric with interest, and have some ideas for implementing it not just in the repository but elsewhere that the content is exposed within the institution.

At QM there was a positive and far-reaching response to Melissa Terras' recent blogging and tweeting, with researchers wanting both to find out 'how she did it' and to get in on the act.  If there's one thing that researchers will respond to, it's another researcher demonstrating how successful a new innovation can be.  One of the many issues researchers identify as barriers to open access uptake is the loss of control over where their research is posted and downloaded from.  This isn't purely about control of access to content and ideas (all tied up with concerns about copyright and plagiarism), but also about concerns over dilution of impact if the content is dispersed over a wide variety of sources.  As repository managers we have a responsibility to help researchers to find out not only where their content has been downloaded from, but also where it is being mentioned and how often.  Metrics have been on the agenda for a long time.  PIRUS and PIRUS2 looked to create article level metrics for institutional repository content but failed to get publisher buy-in (really, you do surprise me), and the work is therefore going forward as IRUS - now part of the UK Repository Net+ initiative.  So there is still an enthusiasm for being able to deliver this information, and there are opportunities for repository managers and researchers to tell these initiatives exactly what they would like them to be able to do.


eThesis submission with SWORD 2.0

Managing submission of eTheses to institutional repositories was the focus of the project presented by Kristian Roberto Salcedo and Richard Jones: a SWORDv2 solution for a Norwegian master's thesis submission portal.  At QM we rely heavily on our colleagues in the Research Degrees Office to supply us with the finalised copy of each eThesis, and we then manually upload it and create the metadata for it.  Allowing the file to be uploaded with minimal metadata by the student would make this process much simpler, but there is resistance here due to concerns over security and the need to manage embargoes and precautions for sensitive data.  Combined with better curation task management (see below) and bitstream embargo management (again below), this could be improved considerably.  The particular element of the Norwegian initiative that interested me was connecting the repository to the national student portal, allowing students to submit their thesis in its varying iterations both prior to and post examination, before finally making the thesis available when appropriate.  A fascinating piece of work both technically and in terms of future policy making.

Curation tasks and bitstream management

Lots of us out there would like to be able to do more curation of the content that we have: deciding what is accessible, what metadata can be harvested, batch control and tidying of records...  It's a pretty lengthy list.  DSpace certainly doesn't have the full range of curation functionality that I personally would like, and implementing simple functions like management of embargoes, and of individual bitstream embargoes in particular, is difficult and bitty; in some cases it is impossible.  Batch management and tidying of metadata is already available for technical brains, but the expertise in quality control rarely lies with technical heads and more likely rests with us repository managers (yes, many of us are Librarians and that's what we do best).  I was therefore encouraged to see a while back a tool developed by @Mire providing batch metadata management as an add-on to the user interface.  What still seemed to be lacking was the concept of managing different embargoes for the file bitstreams associated with the same record, or the idea that those bitstream statuses might need to change over time.

All hail therefore two pieces of work presented at OR2012 that together could resolve these curation task issues.  The first was a presentation by Yanan Zhao, Kim Shepherd, Yin Yin Latt and S. Leonie Hayes from Auckland University, facilitated by Elin Stangeland (University of Cambridge): 'Curation Tasks for Repository Managers : Staying in the Light and have a Dark Side'.  It demonstrated curation tools for managing the visibility and harvesting of individual bitstreams from within the admin interface.  The tools allow three statuses (Accessible, Accessible on campus only, Not accessible), and these statuses can not only be set for specific bitstreams but also be changed over time.  Some really impressive work and something I'd definitely like to look into further.
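For those with technical brains, here is a very rough sketch of how I picture the idea of per-bitstream statuses that change over time.  It is written in plain Python and is purely my own illustration: it is not the Auckland team's code and it does not use any DSpace API.  The class and method names (`Access`, `Bitstream`, `set_status`, `status_on`) are hypothetical, as is my assumption that a file with no status scheduled yet is treated as not accessible.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class Access(Enum):
    """The three statuses mentioned in the presentation."""
    ACCESSIBLE = "Accessible"
    CAMPUS_ONLY = "Accessible on campus only"
    NOT_ACCESSIBLE = "Not accessible"


@dataclass
class Bitstream:
    """One file attached to a record (hypothetical model, not DSpace's)."""
    name: str
    # List of (effective_from, status) pairs, kept sorted by date.
    schedule: list = field(default_factory=list)

    def set_status(self, effective_from: date, status: Access) -> None:
        """Schedule a status that applies from the given date onwards."""
        self.schedule.append((effective_from, status))
        self.schedule.sort(key=lambda entry: entry[0])

    def status_on(self, day: date) -> Access:
        """Return the status in force on the given day."""
        # Assumption: nothing scheduled yet means the file is hidden.
        current = Access.NOT_ACCESSIBLE
        for effective_from, status in self.schedule:
            if effective_from <= day:
                current = status
        return current


# Two files on the same record, each with its own embargo timeline.
thesis = Bitstream("thesis.pdf")
thesis.set_status(date(2012, 8, 1), Access.CAMPUS_ONLY)
thesis.set_status(date(2013, 8, 1), Access.ACCESSIBLE)

appendix = Bitstream("appendix-data.zip")
appendix.set_status(date(2012, 8, 1), Access.NOT_ACCESSIBLE)

print(thesis.status_on(date(2013, 1, 1)).value)
print(appendix.status_on(date(2013, 1, 1)).value)
```

The point of keeping a dated schedule against each file, rather than a single flag on the record, is that an embargo can lapse on its own and two files on the same record can follow different timelines.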

The second piece of work actually formed part of a much larger project that was very impressive and has many features that would be of real interest down the line, but it was this specific part that interested me.  Marc Goovaerts presented on the 'AgriOcean DSpace', a 'Customized version of DSpace for agricultural and aquatic networks in parallel with developments at Hasselt University'.  This customised version allowed bitstream level setting of embargoes, something that would seriously improve the curation and management of content.  Unfortunately, in order to develop many of the functions now available within their AgriOcean DSpace (AOD) they had to hard code a lot of these changes, meaning that they are not easily made shareable (drat!).

What was really reassuring with all of these technical enhancements was that DSpace is truly alive and kicking out there and people are doing some really exciting things with it.  We just need to find ways for those things to become usable by others in the community (a topic for another post I think).
