Week 48: A SCAPE Developer Short Story

It’s been two weeks since the internal SCAPE developer workshop in Brno, Czech Republic. It was a great workshop. We had a lot of presentations and demos, and were brought up to date on what’s going on in the other corners of the SCAPE project. We also had some (loud) discussions, but I think we came to some good agreements on where we as developers are going next. And we started a number of development and productisation activities. I came home with a long list of things to do next week (which ended up being not at all what I actually did last week, but I still have the list, so next week, fingers crossed). Tasks for week 48:

  • xcorrSound
    • make versioning stable and meaningful (this I looked at together with my colleague in week 48)
    • release new version (this one we actually did)
    • finish writing a nice microsite
    • ask my colleague to finish the small website where you can test the xcorrSound tools without installing them yourself
    • write unit tests
    • introduce automatic rpm packaging?
    • finish xcorrSound Hadoop job
    • do the xcorrSound Hadoop Testbed Experiment
      • Update the corresponding user story on the wiki
      • Write the new evaluation on the wiki
    • finish the full Audio Migration + QA Hadoop job
    • do the full Audio Migration + QA Hadoop Testbed Experiment
      • Update the corresponding user story on the wiki
      • Write the new evaluation on the wiki
    • write a number of new blog posts about xcorrSound and SCAPE testbed experiments
    • new demo of xcorrSound for the SCAPE all-staff meeting in February
  • SCAPE testbed demonstrations
    • define the demos that we at SB are going to do as part of testbed (this one we also did in week 48; the actual demos we’ll make next year)
  • FITS experiment (hopefully not me, but a colleague)
  • JPylyzer experiment (hopefully me)
  • Mark FFprobe experiment as not active
  • … there are some more points for the next months, but I’ll spare you…
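The xcorrSound tools that dominate the list above are, as the name suggests, based on cross-correlating audio waveforms, which is what lets the Audio Migration + QA job check that a migrated file still lines up with the original. As a toy illustration of the underlying idea only (not the tools' actual implementation, and the function names here are my own), a naive cross-correlation that recovers the offset between a signal and a delayed copy might look like:

```python
def cross_correlate(a, b):
    """Naive cross-correlation of two equal-length signals over non-negative lags."""
    n = len(a)
    return [sum(a[i] * b[i + lag] for i in range(n - lag)) for lag in range(n)]

def best_offset(a, b):
    """Lag at which b best matches a (highest correlation)."""
    corr = cross_correlate(a, b)
    return max(range(len(corr)), key=corr.__getitem__)

# b is a copy of a delayed by 2 samples; the peak correlation is at lag 2.
a = [1, 2, 3, 0, 0, 0, 0, 0]
b = [0, 0, 1, 2, 3, 0, 0, 0]
print(best_offset(a, b))  # → 2
```

A real QA check would of course work on whole WAV files and use an FFT-based correlation for speed; the naive O(n²) loop above is just to show the principle.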

So what did I do in week 48? Well, I sort of worked on the JPylyzer experiment, which is on the list above. In the Digital Preservation Technology Development department at SB we are currently working on a large-scale digitized-newspapers ingest workflow, including QA. As part of this work we run JPylyzer from Hadoop on all the ingested files, and then validate a number of properties using Schematron. These properties currently come from the requirements given to the digitization company, but in the SCAPE context they should come from policies, so there is still some work to do for the experiment. But running JPylyzer from Hadoop and validating properties from the JPylyzer output using Schematron now seems to work in the SB large-scale digitized-newspapers ingest project :-)
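The property-validation step can be sketched as follows. This is a minimal stand-in using the Python standard library's ElementTree rather than Schematron, and the XML snippet below only imitates the shape of jpylyzer's output — the element names and the height rule are assumptions for illustration, not the real schema or the real SB policy:

```python
import xml.etree.ElementTree as ET

# Assumed, simplified imitation of jpylyzer's XML output for one JP2 file.
SAMPLE = """<jpylyzer>
  <isValidJP2>True</isValidJP2>
  <properties>
    <jp2HeaderBox>
      <imageHeaderBox>
        <height>2048</height>
        <width>1536</width>
      </imageHeaderBox>
    </jp2HeaderBox>
  </properties>
</jpylyzer>"""

def check_properties(xml_text, min_height=1000):
    """Return a list of failed checks; an empty list means the file passes."""
    root = ET.fromstring(xml_text)
    failures = []
    # Rule 1: jpylyzer must report the file as a valid JP2.
    if root.findtext("isValidJP2") != "True":
        failures.append("file is not a valid JP2")
    # Rule 2 (hypothetical policy): image height must meet a minimum.
    height = int(root.findtext(".//imageHeaderBox/height") or 0)
    if height < min_height:
        failures.append(f"height {height} below required {min_height}")
    return failures

print(check_properties(SAMPLE))  # → []
```

In the real workflow the rules are Schematron assertions encoding the requirements (eventually, the policies), and the check runs inside the Hadoop job over all ingested files rather than on a single in-memory string.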

And for now I’ll put week 50 on the above list, and when I have finished a sufficient number of bullet points I’ll blog again! This post is missing links, so I hope you can read it without them.

Name: BoletteJurik

URL: link to the original post


Language: English

Format: text/html