[DPLAsteering] Repox Discussion notes with the North Carolina Digital Heritage Project
Stephen Hedges
stephen at oplin.ohio.gov
Wed Feb 3 10:29:04 EST 2016
Very, very interesting, Terry. Also a little disturbing. I wonder how many
of the state hubs are held together with duct tape like this?
We know New York is struggling with Repox. We know Pennsylvania is trying
to put something together with Hydra. DPLA itself seems to be working on
Hydra in a Box. What is Oregon doing?
--
Stephen
614-728-5250
On Wed, Feb 3, 2016 at 9:39 AM, Reese, Terry P. <reese.2179 at osu.edu> wrote:
> Let me know if you have questions.
>
> ****************************************************************************************
>
>
>
> *Repox Discussion with Lisa Gregory and Stephanie Williams; North Carolina
> Digital Heritage Center*
>
> Lisa Gregory; Interim Director
>
> Stephanie Williams; Programmer
>
>
>
> I had the opportunity to speak to members of the North Carolina Digital
> Heritage Center on 2/2/2016, specifically about their implementation of
> Repox, the time it initially took them to set up the project, and their
> long-term support of the project. Here are the highlights:
>
>
>
> ·         As an organization, the North Carolina Digital Heritage
> Center’s DPLA operations are run on a skeleton budget. When the NCDH
> joined the DPLA, they did it as an extension of their already existing
> program. They have a mandate to support digitization and collection
> hosting, and are funded via the State Library of North Carolina and LSTA
> funds. When Jenn Riley began working with the DPLA as part of the pilot,
> the organization took the project on without additional funding (and
> still, to this day, runs in this capacity). This meant that the NCDH has
> made some very specific program decisions:
>
> 1. They don’t recruit content into their portal. They work with
> the people that are highly motivated and interested in the project. They
> also work with folks that are technically capable of working with them. If
> an institution doesn’t want to share their metadata, they won’t try to
> convince them. If an institution doesn’t have the technical capacity to
> share their collections, they may see if that institution would like the
> NCDH to take on hosting responsibilities – but generally, they accept
> content only from organizations that can provide them with a valid,
> easy-to-understand OAI-PMH feed.
>
> 2. As an organization, they do zero (well, almost zero…they strip
> some data and add a field to identify the organization an item came from)
> metadata remediation, and require no remediation from their partners.
> Their thinking behind this, along with some related points:
>
> § DPLA isn’t providing any funding to support metadata remediation at
> the Hub level – and given that this is something being done essentially by
> part-time staff, there just aren’t the resources.
>
> § Repox does what it does well (creating a data aggregation), and
> doesn’t do much else. Organizations that have struggled with Repox have
> struggled because they have tried to shoehorn functionality like metadata
> remediation into their process. Repox doesn’t do that…easily.
>
> §  As a group, the NCDH felt that asking partners to change their
> metadata from past collections wasn’t sustainable. They provide their
> partners with best practices and let organizations see the results of not
> having specific metadata when rendered in DPLA (e.g., if geographic
> headings are not standard, your content doesn’t show up in a geographic
> search) – but as an organization, they are a hands-off metadata shop.
>
> § They have no formal agreements with data partners. The NCDH works
> with members that ask to have their metadata aggregated. They do no
> education around what that means (CC0), and they assume that their partners
> understand the terms DPLA requires when making metadata available. As
> such, they don’t ask data providers to sign formal agreements.
>
> §  Finally, I was interested in DPLA’s response to the lack of metadata
> remediation. Apparently, DPLA has spoken with them a number of times, but
> NCDH’s attitude is that as long as they are maintaining the aggregation,
> DPLA will just have to live with what it gets.
>
> 3. Technically, the NCDH utilizes Repox – specifically version
> 2.2.7 (this is the old development branch).
>
> §  They did note some trepidation around using Repox. Apparently, there
> were a couple of years when no one was supporting it, and the project
> website disappeared. The GitHub page is new, and the version there is
> based on a different architecture. This concerns them a little, but not
> enough to consider a technology switch. The minimal support work required
> is what keeps them on the version they are using.
>
> · Information about the Repox Versions:
>
> o Version 2.2.7 – Java client/application with a LAMP backend.
>
> o Version 3.x – Java web application utilizing a LAMP backend with
> Jersey as the interface framework
>
> § Initial setup:
>
> · NCDH noted that the initial up and running time for the project
> was approximately 4 weeks. This included developer time to work on the
> initial set of 6 XSLTs for the initial metadata harvests, and a system
> admin to get Repox running within their environment.
>
> ·         They run Repox on Windows. They did this because they had
> significant trouble getting Repox secured on their Linux infrastructure.
> They didn’t elaborate – but the issue threatened to derail them, so they
> run Repox on a standalone Windows server, accessible by IP address only
> to NCDH staff and the single DPLA harvester.
>
> § Long-term:
>
> ·         Repox is managed as part of their normal infrastructure. The
> software is managed by their programmer, who said she spends ~1-2 hours a
> month with the program. A system admin just maintains the Windows Server.
>
> ·         New members take ~1 week of programmer time to get ingested
> into the aggregation. This time is spent creating the XSLT that generates
> the MODS feed DPLA requests.
>
> o New members take ~2 weeks for the program manager, as she does the
> initial metadata discussions and profiling with the interested member.
>
> Things that they had for us to think about:
>
> 1. If DPLAOH wants to do anything beyond simply creating an
> aggregation, Repox may not be the best fit. It certainly has the lowest
> barrier to entry – but it does one thing very well, and that is about it.
> If we need to do more than that, we’ll find very quickly that we will be
> fighting with the toolset.
>
> 2.       Repox long-term support is still up in the air. It’s definitely
> supported by Europeana and others – but for about 2/3 of the time they
> have used it, the project website was simply gone. Their biggest worry is
> that support may not be available long-term, especially for the
> development branch that they are most comfortable with, as the 3.x branch
> would require more hands-on treatment. What they like about the 2.2.7
> branch is that it’s a completely self-contained application.
>
> *Terry Reese*
> Head of Digital Initiatives
> University Libraries
> 320F 18th Avenue Library, 175 West 18th Avenue, Columbus, OH 43210
> 614-292-8263 Office / 614-407-4998 Mobile
> reese.2179 at osu.edu / http://library.osu.edu / http://reeset.net
>
>
>
> _______________________________________________
> DPLAsteering mailing list
> DPLAsteering at lists.oplin.org
> http://lists.oplin.org/mailman/listinfo/dplasteering
>