Tuesday, September 8, 2015

On Institutional Repository Success: Discovery, Search, Metadata

Over the summer I was asked to talk about institutional repositories and how to define what makes them successful as part of a job interview in an academic library. The text of what I said, along with some of the accompanying images, is below.


I've been asked to present my thoughts on what it means for a digital repository, an institutional repository, to be successful, and how to measure that success.

Very few people I know go into an institutional repository (IR) to look for something. It's not the way that search and discovery work. What I propose we do is to link the IR to our current search and discovery workflows, that is, link the IR to things that people already use.

It's not about making the repository more visible; it's about making the stuff in the repository more visible.

The IR is nothing without the things inside it; we need to have things that people want, and people need to know that they want those things, those items. Those items need to be where people can find them.

Don't have an IR just for the sake of having one. I turn here to one of my favorite library and information science theorists, Frank Zappa.

Thanks, Zappa estate.
Zappa once said that if a country wanted to be taken seriously, it needed two things: a beer and an airline. For Zappa, these were symbols of modernity. I want to make sure that an institutional repository isn't just a symbol of modernity, that we don't have one just because everyone else does, or because it's what academic libraries "should" have, but because it will be used. And for sure, having one is nice. On its own, an IR sends a positive signal concerning open access initiatives to faculty, to an academic community, and that's good, but it shouldn't be the main reason for having one.

Furthermore, we shouldn't have an IR because it's seen as a solution to non-existent or undefined problems. In organization theory, this is known as the "garbage can model" of decision making.

Not sure why PBS hosts this smushed image.
If we're going to have an IR, it should solve existing problems. It should help, not hinder, and it shouldn't exist for its own sake.

So with that in mind, we have an IR here, and an open access initiative and policy. We can improve the IR, and more importantly the stuff in it, in two ways: discovery and search.

For discovery, there are a few options. At my former place of work, we used widgets as well as a tab in our discovery search box.

Note the widgets, circled. (And yes, this is called burying the lede.) 
If possible, add a facet in the discovery layer search results. We already teach the use of these facets, so we may as well make the IR, and thus the stuff inside it, more visible.

Note: no IR facet here. 
Results can also be expressed such that the IR is more visible. In "bento box" results, there could be an IR section of results, for example.

And of course, if we don't have strong metadata for items in an IR, none of this will matter. Application Programming Interfaces (APIs), Omeka has one, for example, are a good way to bring robust metadata into discovery. Digital Commons supports OAI-PMH, the Open Archives Initiative Protocol for Metadata Harvesting, which is also workable. There's certainly room for collaboration with vendors here.
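To make this concrete, here's a minimal sketch of what harvesting Dublin Core metadata over OAI-PMH looks like, using only the Python standard library. The endpoint URL and the sample response are hypothetical; the `verb=ListRecords` request and the `oai_dc` prefix are part of the actual protocol.

```python
# Sketch: harvesting Dublin Core records from an IR's OAI-PMH endpoint.
# The base URL below is made up; real IR platforms expose a similar path.
import urllib.parse
import xml.etree.ElementTree as ET

DC_NS = "{http://purl.org/dc/elements/1.1/}"

def listrecords_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    query = urllib.parse.urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    )
    return f"{base_url}?{query}"

def titles_from_response(xml_text):
    """Pull dc:title values out of a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(f"{DC_NS}title")]

# A trimmed sample response, standing in for a live harvest:
SAMPLE = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <ListRecords>
    <record><metadata><dc:title>Open Access and You</dc:title></metadata></record>
  </ListRecords>
</OAI-PMH>"""

print(listrecords_url("https://ir.example.edu/oai/request"))
print(titles_from_response(SAMPLE))
```

A discovery layer (or a vendor working with one) can run this kind of harvest on a schedule, which is exactly the collaboration opportunity mentioned above.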

Metadata is also important in searching outside the library. Plenty of us, and faculty, use Google Scholar. With a link resolver we can bring faculty back to the library site, to the IR.

What success can look like. 
The library isn't a gateway, isn't always a starting point, so we need to bring what we have to where our users are. The library may not function as publisher, but it can certainly act as distributor.

Why is metadata so important here? Because Google Scholar works better with some schemas, some formats, than others. It doesn't play nicely with Dublin Core, for example; its inclusion guidelines favor Highwire Press tags. Without that robust metadata, we might come across our friend the paywall.
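For illustration, here's a small sketch of the kind of Highwire Press `citation_*` meta tags Google Scholar indexes, rendered from a hypothetical IR item record. The tag names come from Google Scholar's inclusion guidelines; the item record itself is made up.

```python
# Sketch: emitting Highwire Press meta tags for an IR item page.
# The citation_* names are real Google Scholar tags; the record is invented.
from html import escape

def scholar_meta_tags(item):
    """Render the <meta> tags Google Scholar looks for on an item page."""
    tags = [("citation_title", item["title"])]
    tags += [("citation_author", author) for author in item["authors"]]
    tags.append(("citation_publication_date", item["date"]))
    tags.append(("citation_pdf_url", item["pdf_url"]))
    return "\n".join(
        f'<meta name="{name}" content="{escape(value)}">' for name, value in tags
    )

item = {
    "title": "Open Access and You",
    "authors": ["Doe, Jane"],
    "date": "2015/09/08",
    "pdf_url": "https://ir.example.edu/articles/42.pdf",
}
print(scholar_meta_tags(item))
```

If the IR emits tags like these, Scholar can point searchers at the open copy rather than the paywalled one.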

We've all seen one of these before, right? 
Ahhhh, the paywall: simultaneously too expensive ("you want how much for that paper?") and insultingly inexpensive given all the work that goes into research and publishing. Poor metadata will send people to a paywall instead of an IR for the same paper.

So discovery and search are two ways to build on IRs, to expand their capabilities. But if these methods work, how will we know? How can we track the output and measure the impact of an IR?

Traditionally, we use bibliometrics: citation tracking, pageviews, downloads, and the like. Our good friend COUNTER fits the bill. As the number of digital-only items grows, altmetrics become more important. Are articles being shared on LinkedIn or Twitter? I know that one organization has tried to measure the effects of "#icanhazpdf," article sharing on social media, with mixed results. And increasingly, the line between biblio- and altmetrics is blurring.

Return on investment is also an opportunity to measure IR success, albeit crudely. Back to that paywalled article: we know that Elsevier thinks it's worth $36. Could we then write, in an annual report, that we added x number of articles to our IR in 2015, with a fair market value of x times whatever the median article price is? That might be effective in terms of telling a story to academic administration.
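The arithmetic for that annual-report line is back-of-the-envelope; both figures below are illustrative (the $36 is the Elsevier price mentioned above, the deposit count is invented):

```python
# Back-of-the-envelope "fair market value" figure for an annual report.
articles_added = 250          # illustrative number of deposits this year
median_article_price = 36     # the per-article price from the example above
fair_market_value = articles_added * median_article_price
print(f"${fair_market_value:,}")  # $9,000
```

Crude, yes, but it turns deposit counts into a dollar figure administrators recognize.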

Qualitative methods could also prove useful. Interview faculty, either individually or in focus groups, and ask how IRs work, or don't, for them.

Speaking of faculty, this doesn't work without buy-in from them. It's why open access policies and initiatives are so important. Open access papers tend to get cited, get read, and get used more than those that are paywalled. Academic publishing looks like a moral hazard at times; faculty publish stuff and then we in the library have to buy it back from publishers.

Want one? Buy one!
We're asking a lot from faculty here, with the open access policy and the repository. We're asking them to trust us with their research, their work, and we librarians need to continually earn that trust. And that trust is part of success.

So to recap, institutional repository success is, to me, when you find the stuff, whether you notice the repository or not. When the repository is
  • Easy to use. 
  • Useful.
  • Interoperable, in that it works with what we have in terms of discovery platforms and search.  
  • Smooth and seamless, reducing friction so we don’t have to search in multiple places. That is, the IR can be unseen and still work! 
  • Consistently branded and marketed.
Thank you very much for the opportunity to present, and I look forward to your questions and comments. 

Take this with a grain of salt because I did not get the job.
