Tuesday 31 July 2012

Open Repositories 2012 - highlights and take aways

One of the many things I always find when attending repository community events is the repository envy that usually accompanies them.  OR2012 was no exception.  My lovely, still pretty new, DSpace repository is as close to out-of-the-box as it's possible to be, with only a few minor tweaks.  This was partially intentional: with IT colleagues stretched and managing big systems that need time and resource, an open source repository (and its slightly unflattering interface in particular) is more than they can handle at the moment.  This is something I try to accept, knowing that most people find their way to content in the repository via Google or some other search engine, or via a harvester, so they're unlikely ever to see the homepage or care what the stylesheet looks like.

What I did find this time though was my content envy kicking in (too many examples to cover in this rather long post, but more to follow in due course).  Not just in terms of the quantity of content, but the types and diversity of content that some repositories (or digital libraries/commons) contained.  There is scope here for QM: with rare collections and Archives still only partially digitised, we have lots of potential to expand the type and quantity of our digital collections if we can find the time and resource for digitisation...  But before I get lost down the 'if-only' rabbit hole, a quick look at some of the technological things that really stuck with me and some musings on how they might be useful.

Page turner plug-in

Embedded players for audio and video content have always been high on my agenda for our repository (should we ever be presented with audio or video content I want to be prepared), but the page turner plug-in which made a fleeting appearance in Eric Robert James' (Yale University) presentation at RF6 - Lessons Learned, was really interesting and posed an opportunity for those with technical brains.  Is there potential not just to allow page turning within the repository interface, but also to embed this as a viewer within other webpages (such as publications lists), so that viewers can go straight into reading the paper rather than having to follow the link to the repository and then download the file?  What impact would this have on viewing/download stats?  Does this also fit into the idea of the invisible repository that I heard mentioned during the course of the conference and attributed to William Nixon?

Metrics

One thing that many repository managers tell me is that they find researchers will be more engaged if they can get hold of usage metrics, mentions, etc. of their work, once it has been made open access.  We've been making lots of efforts at QM to get our usage statistics in order and get them out there for researchers to see.  I have also been following the discussion around Altmetric with interest, and have some ideas for implementing it not just in the repository but elsewhere that the content is exposed within the institution.

There was a positive and far-reaching response to Melissa Terras' recent blogging and tweeting at QM, with researchers wanting to both find out 'how she did it' and also wanting to get in on the act.  If there's one thing that researchers will respond to it's another researcher demonstrating how successful a new innovation can be.  One of the many issues researchers identify as barriers to open access uptake is the loss of control over where their research is posted and downloaded from.  This isn't purely control of access to content and ideas (all tied up with concerns about copyright and plagiarism), but also about concerns over dilution of impact if the content is dispersed over a wide variety of sources.  As repository managers we have a responsibility to help researchers to find out not only where their content has been downloaded from, but also where it is being mentioned and how often.  Metrics have been on the agenda for a long time: PIRUS and PIRUS2 looked to create article level metrics for institutional repository content, failed to get publisher buy-in (really, you do surprise me) and therefore went forward with IRUS - now part of the UK Repository Net+ initiative.  So there is still an enthusiasm for being able to deliver this information, and there are opportunities for repository managers and researchers to tell these initiatives exactly what they would like them to be able to do.


eThesis submission with SWORD 2.0

Managing submission of eTheses to institutional repositories was the focus of the project presented by Kristian Roberto Salcedo and Richard Jones in 'SWORDv2 solution for Norwegian master's thesis submission portal'.  At QM we rely heavily on our colleagues in the Research Degrees Office to supply us with the finalised copy of each eThesis, which we then manually upload and create metadata for.  Allowing the file to be uploaded with minimal metadata by the student would make this process much simpler, but there is resistance here due to concerns over security and the need to manage embargo and sensitive data precautions.  Combined with better curation task management (see below) and bitstream embargo management (again below), this could be improved considerably.  The particular element of the Norwegian initiative that interested me was connecting the repository to the national student portal, allowing students to submit their thesis in its varying iterations both prior to and post examination before finally making the thesis available when appropriate.  A fascinating piece of work both technically, and in terms of future policy making.

Curation tasks and bitstream management

Lots of us out there would like to be able to do more curation of the content that we have, deciding what is accessible, what metadata can be harvested, batch control and tidying of records...  It's a pretty lengthy list.  DSpace certainly doesn't have the full range of curation functionality that I personally would like, and implementing simple functions like management of embargoes and managing individual bitstream embargoes is difficult and bitty; in some cases it is impossible.  Batch management and tidying of metadata is already available for technical brains, but the expertise in quality control rarely lies with technical heads and more likely rests with us repository managers (yes, many of us are Librarians and that's what we do best).  I was therefore encouraged to see a while back a tool developed by @Mire providing batch metadata management as an add-on to the user interface.  But what still seemed to be lacking was the concept of managing multiple variations in embargo for file bitstreams associated with the same record, or the idea that those multiple bitstream statuses might need to change over time.

All hail therefore two pieces of work presented at OR2012 that together could resolve these curation task issues.  The first was a presentation from Auckland University by Yanan Zhao, Kim Shepherd, Yin Yin Latt and S. Leonie Hayes, facilitated by Elin Stangeland (University of Cambridge): 'Curation Tasks for Repository Managers : Staying in the Light and have a Dark Side'.  It demonstrated curation tools for managing the visibility and harvesting of individual bitstreams from within the admin interface, allowing three statuses (Accessible, Accessible on campus only, Not accessible), and allowing these statuses not only to relate to specific bitstreams, but to be changed over time.  Some really impressive work and something I'd definitely like to look into further.

The second piece of work actually formed part of a much larger piece that was very impressive and has many features that would be of real interest down the line, but it was this specific part that interested me.  Marc Goovaerts presented on 'AgriOcean DSpace', a 'Customized version of DSpace for agricultural and aquatic networks in parallel with developments at Hasselt University'.  This customised version allowed bitstream level setting of embargoes, something that would seriously improve the curation and management of content.  Unfortunately, in order to develop many of the functions now available within their AgriOcean DSpace (AOD) they had to hard code a lot of these changes, meaning that they are not easily made shareable (drat!).
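For the technical brains, here is a minimal Python sketch of the idea that interested me in both presentations: each bitstream carrying its own access status, with that status able to change over time.  It is only an illustration of the concept, not how DSpace, the Auckland curation tools or AgriOcean DSpace actually do it; the names `Access`, `Bitstream`, `set_status` and `status_on`, and the example dates, are all made up for this post.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class Access(Enum):
    # The three statuses mentioned in the Auckland presentation
    ACCESSIBLE = "Accessible"
    CAMPUS_ONLY = "Accessible on campus only"
    NOT_ACCESSIBLE = "Not accessible"


@dataclass
class Bitstream:
    """A single file attached to a record (hypothetical model)."""
    name: str
    # List of (effective_from, status) pairs, kept sorted by date
    schedule: list = field(default_factory=list)

    def set_status(self, effective_from: date, status: Access) -> None:
        """Schedule a status that applies from the given date onwards."""
        self.schedule.append((effective_from, status))
        self.schedule.sort(key=lambda entry: entry[0])

    def status_on(self, day: date) -> Access:
        """Return the status in force on a given day."""
        # Assumption: a file is hidden until a rule says otherwise
        current = Access.NOT_ACCESSIBLE
        for effective_from, status in self.schedule:
            if effective_from <= day:
                current = status
        return current


# One thesis file that opens up in stages (dates are invented)
thesis = Bitstream("thesis.pdf")
thesis.set_status(date(2012, 8, 1), Access.NOT_ACCESSIBLE)
thesis.set_status(date(2013, 8, 1), Access.CAMPUS_ONLY)
thesis.set_status(date(2014, 8, 1), Access.ACCESSIBLE)

print(thesis.status_on(date(2013, 9, 1)).value)
```

Because each bitstream keeps its own schedule, two files on the same record (say, the thesis and an appendix of sensitive data) could follow completely different embargo timelines.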

What was really reassuring with all of these technical enhancements was that DSpace is truly alive and kicking out there and people are doing some really exciting things with it.  We just need to find ways for those things to become usable by others in the community (a topic for another post I think).

Monday 30 July 2012

Open Repositories 2012 - Crowdvine, serial tweeting and the app deluge

So, still wrangling the notes from the various sessions I attended, and some of the ideas and projects I saw demonstrated, I thought I'd start with the technology.

OR2012 introduced Crowdvine and live blogging into the mix for this OR Conference, both of which are new to me.  With no time beforehand to investigate the technology and familiarise myself with the interface, I didn't really come to these until late in the conference - Thursday I think.  Luckily, Crowdvine turned out to be really easy to use, really easy to navigate, perhaps too easy... since I immediately posted a new discussion thread - 'Highlights and things you're going to take home'.

Throughout the event, social media has played a significant role in allowing delegates, both physical and virtual, to keep track of the interesting things going on in parallel sessions, point to innovations of interest or things to highlight (oh and poke fun at each other but that's another post altogether).  A little competition is really healthy, and never more so than when some bright spark (@WilliamJNixon I believe) starts a Tweetpository and starts tracking the output of conference tweeters!  Finding out who had the most tweets for the event was no great surprise (@MrNick) but it was fun to find out how many serial tweeters are out there, and interesting to discover just how ubiquitous social media has become, oh and that I am not so serial a tweeter as I first thought, having not made it onto the OR2012 Wordle.

The liveblogging was perhaps one innovation too far for me on this occasion, but it is something I'm open to and it was very entertaining to read the live blogs by colleagues in parallel sessions and find out what I'd been missing.  Managing my tweeting and note-taking whilst still remembering to listen to the sessions was hard enough without the added stress of trying to live blog my random ramblings.  This is something to consider for next time though, as it was a dynamic, if somewhat stream-of-consciousness, style of note-taking.  Well done to those that did it, great effort.

This all reminds me that I am seriously behind in learning about new web-based technologies.  With so many developments and so much to learn it's hard to stay an 'early adopter', something I have always been proud to consider myself.  Unfortunately, with so many technologies and the plethora of apps and services out there now, we're not so much at risk of a data deluge as an app deluge...

But, the things I did successfully take up?  Taking my notes using Evernote.  This was actually a really good experience, with my notes syncing between laptop and iPad to make keeping everything together really easy; the ability to group notes together into a notebook, already organised by date, is making the process of pulling together notes, links and other 'stuff' from my week in Edinburgh really simple - though nothing has made it easy to digest all the things I heard about and thought about into things to talk about with my colleagues!  Formatting, bulletpoint-making and numbering your notes on the fly might not get many people excited, but I was one very happy lady on Monday morning when everything was ready and waiting to be exported.

I think I only scribbled down 1 page of handwritten notes the whole week - on the first day when coordinating my technology with my suitcase and other baggage was one step too far!!!  I'd had a 4am start - that's my excuse.




Monday 9 July 2012

Open Repositories 2012

So, it's Open Repositories conference time again, and this year I actually managed to get here!  In previous years I have either been too late to apply, on maternity leave, or just unable to fit it into the mad schedule.

This year's conference is being held at University of Edinburgh, a favourite city for me, and with a packed programme it looks to be a really good event.  I'll be tweeting from the various workshops I attend; follow me @moragm23

First up today is an afternoon of Research Data Management with the DCC.