Handling Custom Modifications to Curate Engine
We needed to make custom modifications to the Curate Engine. So we:
- forked Curate to the Hesburgh Libraries of Notre Dame organization.
- created a branch for localized changes
I made some local git configuration changes:
git remote set-url --push origin "Thou shalt not push"
git remote add ndlib https://github.com/ndlib/curate.git
Below is the resulting remote configuration:
[remote "origin"]
    url = email@example.com:projecthydra/curate.git
    fetch = +refs/heads/*:refs/remotes/origin/*
    pushurl = Thou shalt not push
[remote "ndlib"]
    url = https://github.com/ndlib/curate.git
    fetch = +refs/heads/*:refs/remotes/ndlib/*
Then I pushed my new branch to ndlib (git push ndlib branch-name).
Sidebar: By setting the push URL for origin to “Thou shalt not push”, I can no longer push changes to the projecthydra-labs/curate repository.
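To see the effect of that push URL without touching a real repository, here is a minimal sketch that rebuilds the same remote layout in a throwaway directory. The two bare repositories are local stand-ins (assumptions) for the upstream project and the ndlib fork; only the remote names and the push URL string come from the steps above.

```shell
set -eu

# Throwaway workspace so no real repository is touched.
work=$(mktemp -d)
git init -q --bare "$work/upstream.git"   # stand-in for the upstream repository
git init -q --bare "$work/fork.git"       # stand-in for the organization's fork
git init -q "$work/clone"
cd "$work/clone"

# Same remote layout as described above, pointed at the local stand-ins.
git remote add origin "$work/upstream.git"
git remote add ndlib "$work/fork.git"
git remote set-url --push origin "Thou shalt not push"

# One commit so there is something to push.
git -c user.name="Example" -c user.email="example@example.invalid" \
  commit -q --allow-empty -m "Localized change"
branch=$(git symbolic-ref --short HEAD)

# Pushing to the fork works as usual.
git push -q ndlib "$branch"

# Pushing to origin fails, because the push URL is not a repository.
if git push -q origin "$branch" 2>/dev/null; then
  origin_push=allowed
else
  origin_push=blocked
fi

push_url=$(git config --get remote.origin.pushurl)
echo "push URL for origin: $push_url"
echo "push to origin: $origin_push"
```

Fetching from origin is unaffected, since only the push URL was changed.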
Making Modifications to CurateND
The next step was to update CurateND’s reference to Curate.
- I made a branch in CurateND (eg
- I updated the CurateND Gemfile to reference the new Curate version.
gem 'curate', github: 'ndlib/curate', branch: 'update-article-metadata'
# gem 'curate', path: '../curate' # Sometimes I want the local reference for
#                                 # but we deploy the remote branch
On this branch we did what we needed to get the application working.
Using Curate Reference Implementation Data
For this project, Notre Dame is hosting a reference implementation. For the QA process, we wanted to work with a small set of data, because we were planning to regenerate the QA data as we iterated on the solution.
So we opted to grab all of the data from the reference implementation and copy that to a QA application.
This required copying:
- Fedora Commons data
- Solr index
- Application’s MySQL database schema and data
And here I encountered an annoying set of gotchas. Our reference implementation differed from CurateND in three ways:
- The users table in MySQL was different. CurateND uses CAS whereas the reference implementation uses a local database-backed registration model. We didn’t want to drag our partners through the process of getting institutional credentials.
- The expected work types were different. CurateND has no concept of a GenericWork work type.
- The Fedora namespaces of the objects were different (e.g. ‘und’ vs. ‘sufia’).
Handling the Difference
To resolve this, I created another branch off of
This new branch’s purpose was to contain the necessary changes for addressing the three gotchas from above. I then squashed those changes into one commit.
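The post does not say how the squash was done, so the following is only one possible approach: a soft reset back to the branch point followed by a single commit. It runs in a throwaway repository, and the branch name, file names, and commit messages are all hypothetical.

```shell
set -eu

# Throwaway repository with a local identity so commits work anywhere.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

# Stand-in for the branch the work was based on.
echo "app" > app.txt
git add app.txt
git commit -q -m "Base application work"
base=$(git rev-parse HEAD)

# Hypothetical branch holding one commit per gotcha.
git checkout -q -b reference-data-fixes
for n in 1 2 3; do
  echo "change $n" > "fix-$n.txt"
  git add "fix-$n.txt"
  git commit -q -m "Address gotcha $n"
done

# Squash: move the branch back to the base while keeping every change staged,
# then record the staged changes as a single commit.
git reset -q --soft "$base"
git commit -q -m "Allow reference implementation data in QA"

commits_after_base=$(git rev-list --count "$base"..HEAD)
echo "commits on the branch after squashing: $commits_after_base"
```

An interactive rebase would give the same result. The soft reset is used here only because it needs no editor.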
I created the various forks and branches to separate concerns.
We needed to isolate:
- localized changes to Curate.
- changes to CurateND to get Curate working.
- modifications for using reference implementation data with a production-oriented code base.
By keeping these concerns as separate branches, I’ve been able to make sure that I can:
- grab Curate changes from upstream
- modify production-oriented code without fear of mixing in the reference implementation
- iterate through what is needed to allow access to the reference data via production code
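As an illustration of the first point, grabbing upstream changes into a branch of localized changes could look roughly like this. It is a sketch under assumptions: the upstream repository is simulated locally, the branch name localized-changes is hypothetical, and a merge is used where a rebase would also work.

```shell
set -eu

work=$(mktemp -d)

# Simulated upstream repository with one commit.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Example"
git config user.email "example@example.invalid"
echo "v1" > engine.txt
git add engine.txt
git commit -q -m "Upstream: initial"
upstream_branch=$(git symbolic-ref --short HEAD)

# Local clone with a hypothetical branch for localized changes.
git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"
git config user.name "Example"
git config user.email "example@example.invalid"
git checkout -q -b localized-changes
echo "local" > local.txt
git add local.txt
git commit -q -m "Localized change"

# Meanwhile, upstream moves forward.
cd "$work/upstream"
echo "v2" > engine.txt
git commit -q -am "Upstream: improvement"

# Grab the upstream changes and merge them into the localized branch.
cd "$work/clone"
git fetch -q origin
git merge -q --no-edit "origin/$upstream_branch"

merged_engine=$(cat engine.txt)
echo "engine.txt after merge: $merged_engine"
```

Because the localized change lives in its own file, the merge applies cleanly. Real customizations that touch upstream files may need conflict resolution.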
I’ve also kept an eye on isolating commits so that I can push Curate changes upstream via pull requests.
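One way an isolated commit can be lifted out for a pull request is to cherry-pick it onto a branch that starts from upstream. The sketch below assumes that approach (the post does not describe the exact steps) and uses hypothetical branch and file names in a throwaway repository.

```shell
set -eu

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example"
git config user.email "example@example.invalid"

# Stand-in for the state of the upstream branch.
echo "base" > README
git add README
git commit -q -m "Upstream base"
upstream=$(git rev-parse HEAD)

# Localized branch: one local-only commit, one commit worth sharing.
git checkout -q -b localized-changes
echo "nd" > local_only.txt
git add local_only.txt
git commit -q -m "Local-only customization"
echo "fix" > general_fix.txt
git add general_fix.txt
git commit -q -m "General fix worth sharing"
shareable=$(git rev-parse HEAD)

# Start a pull-request branch from upstream and bring over only the shareable commit.
git checkout -q -b upstream-pull-request "$upstream"
git cherry-pick "$shareable" >/dev/null

echo "files on the pull-request branch:"
git ls-tree --name-only HEAD
```

This only works cleanly when the shareable commit does not depend on the local-only ones, which is the reason for keeping commits isolated in the first place.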