[DPLAsteering] Repox Discussion notes with the North Carolina Digital Heritage Project
Reese, Terry P.
reese.2179 at osu.edu
Wed Feb 3 10:35:26 EST 2016
Oregon is joining the Mountain West project as an avenue for contributing content (at least, UO and Oregon State are). As a state, I think that is likely going to be the path forward. And if I remember correctly, the Mountain West uses (or used to use) CONTENTdm’s multi-site to aggregate the metadata.
--tr
From: Stephen Hedges [mailto:stephen at oplin.ohio.gov]
Sent: Wednesday, February 3, 2016 10:29 AM
To: Reese, Terry P. <reese.2179 at osu.edu>
Cc: dplasteering at lists.oplin.org
Subject: Re: [DPLAsteering] Repox Discussion notes with the North Carolina Digital Heritage Project
Very, very interesting, Terry. Also a little disturbing. I wonder how many of the state hubs are held together with duct tape like this?
We know New York is struggling with Repox. We know Pennsylvania is trying to put something together with Hydra. DPLA itself seems to be working on Hydra in a Box. What is Oregon doing?
--
Stephen
614-728-5250
On Wed, Feb 3, 2016 at 9:39 AM, Reese, Terry P. <reese.2179 at osu.edu> wrote:
Let me know if you have questions.
****************************************************************************************
Repox Discussion with Lisa Gregory and Stephanie Williams; North Carolina Digital Heritage Center
Lisa Gregory; Interim Director
Stephanie Williams; Programmer
I had the opportunity to speak to members of the North Carolina Digital Heritage Center on 2/2/2016 – specifically around their implementation of Repox, the time it took initially for them to set up the project, and their long-term support of the project. Here are the highlights:
• As an organization, the North Carolina Digital Heritage Center’s DPLA operations are run on a skeleton budget. When the NCDH joined the DPLA, they did it as an extension of their already existing program. They have a mandate to support digitization and collection hosting, and are funded via the State Library of North Carolina and LSTA funds. When Jenn Riley began working with the DPLA as part of the pilot, the organization took the project on without additional funding (and it still runs in this capacity to this day). This has led the NCDH to make some very specific program decisions:
1. They don’t recruit content into their portal. They work with the people that are highly motivated and interested in the project. They also work with folks that are technically capable of working with them. If an institution doesn’t want to share their metadata, they won’t try to convince them. If an institution doesn’t have the technical capacity to share their collections, they may see if that institution would like the NCDH to take on hosting responsibilities – but generally, they accept content only from organizations that can provide them with a valid, easy to understand, OAI-PMH feed.
2. As an organization, they do zero (well, almost zero…they strip some data and add a field to identify the organization an item came from) metadata remediation, and require no remediation from their partners. There are a few main lines of thought behind this:
• DPLA isn’t providing any funding to support metadata remediation at the Hub level – and given that this is something being done essentially by part-time staff, there just aren’t the resources.
• Repox does what it does well (creating a data aggregation), and doesn’t do much else. Organizations that have struggled with Repox have struggled because they have tried to shoehorn functionality like metadata remediation into their process. Repox doesn’t do that…easily.
• As a group, the NCDH felt that asking partners to change their metadata from past collections wasn’t sustainable. They provide their partners with best practices, and let organizations see the results of not having specific metadata when rendered in DPLA (e.g., if geographic headings are not standard, your content doesn’t show up in a geographic search) – but as an organization, they are a hands-off metadata shop.
• They have no formal agreements with data partners. The NCDH works with members that ask to have their metadata aggregated. They do no education around what that means (CC0), and they assume that their partners understand the terms DPLA requires when making metadata available. As such, they don’t ask data providers to sign formal agreements.
• Finally, I was interested in DPLA’s response to the lack of metadata remediation. Apparently, DPLA has spoken with them a number of times, but their attitude is that as long as they are maintaining the aggregation, DPLA will just have to live with what they get.
3. Technically, the NCDH utilizes Repox – specifically version 2.2.7 (this is the old development branch).
• They did note some trepidation around using Repox. Apparently, there were a couple of years when no one was supporting it, and the project website disappeared. The GitHub page is new, and the current code is based on a different architecture. This concerns them a little, but not enough to consider a technology switch. The minimal support work is what keeps them on the version they are using.
• Information about the Repox Versions:
o Version 2.2.7 – Java client/application with a LAMP backend.
o Version 3.x – Java web application utilizing a LAMP backend with Jersey as the interface framework
• Initial setup:
• NCDH noted that the initial up and running time for the project was approximately 4 weeks. This included developer time to work on the initial set of 6 XSLTs for the initial metadata harvests, and a system admin to get Repox running within their environment.
• They run Repox on Windows. They did this because they had significant trouble getting Repox secured on their Linux infrastructure. They didn’t elaborate – but the issue threatened to derail them, so they run Repox on a standalone Windows server whose access is restricted by IP address to NCDH staff and the single DPLA harvester.
• Long-term:
• Repox is managed as part of their normal infrastructure. The software is managed by their programmer, who said she spends ~1-2 hours a month with the program. A system admin just maintains the Windows Server.
• New members take ~1 week of programmer time to get ingested into the aggregation. This time is spent creating the XSLT that generates the MODS feed DPLA requests.
• New members take ~2 weeks for the program manager, as she does the initial metadata discussions and profiling with the interested member.
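
To make the "almost zero remediation" idea in item 2 concrete, here is a minimal sketch in Python. It is not what NCDH runs: they do this work with XSLT inside Repox and produce MODS, while this sketch uses the standard library's ElementTree and simple Dublin Core as a stand-in. The sample record, the function name light_touch, the choice of dc:relation as the field to strip, and dc:source as the field that carries the contributing organization are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH and Dublin Core specifications.
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical ListRecords response from a partner's OAI-PMH feed.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Main Street, 1920</dc:title>
          <dc:relation>internal-batch-42</dc:relation>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def light_touch(response_xml, provider, drop_tags=("relation",)):
    """Strip a few unwanted fields and tag each record with its provider.

    Everything else in the record is passed through unchanged.
    """
    root = ET.fromstring(response_xml)
    records = []
    # Materialize the list first so the tree is not edited while iterating.
    for dc_root in list(root.iter(f"{{{OAI_DC}}}dc")):
        for tag in drop_tags:
            for element in dc_root.findall(f"{{{DC}}}{tag}"):
                dc_root.remove(element)
        # Assumption: the contributing organization goes in dc:source.
        source = ET.SubElement(dc_root, f"{{{DC}}}source")
        source.text = provider
        records.append(dc_root)
    return records


records = light_touch(SAMPLE_RESPONSE, "Example County Library")
titles = [record.findtext(f"{{{DC}}}title") for record in records]
```

The point of the sketch is how little it does: nothing is normalized or corrected, which is roughly the scope of work NCDH described taking on per partner.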
Things that they had for us to think about:
1. If DPLAOH wants to do anything beyond simply creating an aggregation, Repox may not be the best fit. It certainly has the lowest barrier to entry – but it does one thing very well, and that is about it. If we need to do more than that, we’ll find very quickly that we will be fighting with the toolset.
2. Repox long-term support is still up in the air. It’s definitely supported by Europeana and others, but for roughly two-thirds of the time they have used it, the project website was simply gone. Their biggest worry is that support may not be available long-term, especially for the development branch that they are most comfortable with, as the 3.x branch would require more hands-on treatment. What they like about the 2.2.7 branch is that it’s a completely self-contained application.
Terry Reese
Head of Digital Initiatives
University Libraries
320F 18th Avenue Library, 175 West 18th Avenue, Columbus, OH 43210
614-292-8263 Office / 614-407-4998 Mobile
reese.2179 at osu.edu / http://library.osu.edu / http://reeset.net
_______________________________________________
DPLAsteering mailing list
DPLAsteering at lists.oplin.org
http://lists.oplin.org/mailman/listinfo/dplasteering