Radio Haiti Project Plan


Project Summary

Objective

Primary:  Launch Radio Haiti digital audio, video and tri-lingual metadata in DDR-Public.

Secondary:  Allow easy access to audio/video from smartphones (for the Haitian audience)
Dependencies

/wiki/spaces/DDR/pages/11599922, OHMS-like interface, hi-fi / lo-fi derivatives

Out of Scope

Captioning

Timeline

Requirements freeze by February 1, 2017

Launch in June 2017

Size of Collection

Total project size: ~5360 audio and video files (master and derivative), 8.5 TB total

Total to be launched in June: TBD

Working Description spreadsheet: https://docs.google.com/spreadsheets/d/1FDdviJHoLIR2ju3I-Akdoe23JLQSSty-EPLIIsln064/edit?usp=sharing

Project Team

Craig Breaden and Laura Wagner, Co-Champions

Laura Wagner:  Processing collection and creating metadata

Molly Bragg, Project Manager

Ginny Boyer, Software Development Manager

Maggie Dickson, Metadata Architect

Content Ingest Specialists and Will Sexton:  Ingest files into DDR

Enterprise Services Development Team (Cory Lown, Sean Aery, Jim Coble, David Chandek-Stark, Ayse Durmaz, Jack Hill), Developers

Alex Marsh, Digitization Specialist, Video

Zeke Graves, Digitization Specialist, Audio

Cutting Corp (Aaron Coe), Audio digitization vendor

National Endowment for the Humanities:  Granting Agency and stakeholder


Detailed Project Information


Content Analysis

  • Material to be digitized:

    • Collection(s):  Radio Haiti Recordings, 1957-2003

    • Format: audio reels, audio cassettes, VHS, betacam

    • Number of files:

      • ~5300 master audio files

      • 32 mov, 32 mp4

  • What is an item?

    • A file is an item (1 cassette may have more than 1 item)

  • Rights issues (see rights details below for more info):

    • Permission granted by Radio Haiti to digitize and publish Radio Haiti materials.

    • Recordings include a fair amount of third-party-owned content, and the project team has been exercising due diligence to investigate and clear the rights. Often the content is intermixed.

    • The video content is raw footage from The Agronomist (film by Jonathan Demme).

      • Will need to confer with Dave Hansen on rights and appropriate rights statements. 

  • Scope:

    • All unique A/V items have been or will be digitized.

    • Some items will not be available for public access.

  • Notes:  


Digitization and File Details

Audio

  • Vendor: Cutting Corporation

  • Data produced:  5 TB

  • Digitization began in 2015?

  • File naming convention:  [collection number][rr or cs][item 4 digits]_[side 2 digits]

    • RL10059CS0001_01

    • Files from the vendor include hyphens (RL10059-CS-0001_01); these can be converted.

      • Do we have a preference as to which format we use?

        • Naming should be consistent within the collection: use hyphens throughout or not at all, matching whichever form is chosen as the standard.

  • File Formats, specs and numbers:

    • Archival masters:  5300 WAV files at 24 bit, 96 kHz, stereo

    • Mezzanine:  MP3s at 24 bit, 48 kHz, stereo

      • These are working copies only - will not be preserved or served up.

      • Not all files have a mezzanine copy.

      • Edited version with dead air removed.

      • Timecodes are all generated from the mezzanine.

  • Derivatives:  5300 MP3s at 64 kbps, mono

    • Some of the derivatives are edited versions of the masters.

    • Timecodes are generated from the derivatives.

  • Location pre-ingest:  Cifs 14 (masters there as of 11/1 - derivatives will be soon)
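
The audio file naming convention above, including the conversion between the vendor's hyphenated names and the unhyphenated form, can be sketched as follows. This is a minimal illustration, not the tool used for the project: the function name `normalize_name` is hypothetical, and the "RL plus digits" shape of the collection number is an assumption inferred from the single example RL10059CS0001_01.

```python
import re

# Assumed pattern: collection number (RL + digits), format code (RR or CS),
# 4-digit item number, underscore, 2-digit side. Hyphens are optional so that
# both the vendor style and the unhyphenated style are accepted.
NAME_PATTERN = re.compile(
    r"^(?P<collection>RL\d+)-?(?P<fmt>RR|CS)-?(?P<item>\d{4})_(?P<side>\d{2})$",
    re.IGNORECASE,
)


def normalize_name(stem: str, hyphens: bool = False) -> str:
    """Return a file stem in one consistent style, with or without hyphens."""
    match = NAME_PATTERN.match(stem)
    if match is None:
        raise ValueError(f"unexpected file name: {stem!r}")
    separator = "-" if hyphens else ""
    parts = [match["collection"].upper(), match["fmt"].upper(), match["item"]]
    return separator.join(parts) + "_" + match["side"]


print(normalize_name("RL10059-CS-0001_01"))              # RL10059CS0001_01
print(normalize_name("RL10059CS0001_01", hyphens=True))  # RL10059-CS-0001_01
```

Rejecting names that do not match the pattern, rather than passing them through, makes stray or misnamed files visible before ingest.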


Video

  • Digitized in the DUL DPC, more details available on video project plan.

  • Data produced:  3.5 TB

  • Duration:  September 2016

  • File naming convention:  DPC file naming

  • File Formats, specs and numbers:

    • Archival masters: 32 uncompressed .mov files

    • Derivatives 1: 32 mp4 at 720 x 480, 2300 kbps (following DPC standard)

    • Derivatives 2: 32 mp4 at 320 x 240, 8 bit 1000 kbps

      • Question: provide access only to derivatives 2?

    • Location pre-ingest:  DPC-Work\derivatives\rhv


Total Data footprint: ~8.5 TB


Relationships between items

  • Programs that span files/cassettes:

    • These items are identified as Is_Part_Of English/Creole/French in the google spreadsheet.

    • In the interface, related items will be clickable from the bottom of an item’s page and labeled as “related items”.  

      • Per meeting 12/7/16, LES will implement mock-up v. 1 (with the blocks - see below).  Expected behavior:  on mouse-over display additional metadata for item.


  • The trial which spans multiple cassettes and dates:

    • These items are identified as Is_Part_Of English/Creole/French in the google spreadsheet

    • In the interface, display 3 related items (per v. 1 mockup), and show a link to the rest of the items as a search result.  The results will display in order assuming the following.

      • Laura will title the trial items consistently and numerically, ex: Title (1), Title (2)

      • Cory will fix numeric title sorting.

        (will be changed to show only 3 results then a link)


  • Items that are identical to other items:

    • These items are identified as “is format of” in the google spreadsheet.

    • Copies of items will display just as related items do except they will be labeled as “other versions”.  See copies.png v. 1 below.
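
The numeric title sorting mentioned for the trial items (Title (1), Title (2), ...) can be illustrated with a natural-sort key. This is a sketch only: `title_sort_key` is a hypothetical helper, and it does not describe how sorting is actually implemented in DDR-Public.

```python
import re


def title_sort_key(title: str) -> list:
    """Split a title into text and number chunks so that (10) sorts after (2)."""
    # re.split with a capturing group alternates text, digits, text, ...
    # so chunks at odd positions are always the digit runs.
    chunks = re.split(r"(\d+)", title)
    return [int(c) if i % 2 else c.casefold() for i, c in enumerate(chunks)]


titles = ["Title (10)", "Title (2)", "Title (1)"]
print(sorted(titles))                      # plain string sort puts (10) before (2)
print(sorted(titles, key=title_sort_key))  # ['Title (1)', 'Title (2)', 'Title (10)']
```

A plain string sort compares character by character, which is why consistent numbering alone is not enough once a series passes nine items.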



Ingest

  • Ingest Lead(s):  Susan Ivey and Moira Downey
  • Timeline:  
    • 1st Batch - 349 items, ingest May 2017
    • Subsequent batches - TBD

Accessibility


Low Bandwidth Accessibility:

  • Ideally, the collection will be accessible via smartphone over low-bandwidth connections for use by the Haitian public. The Haitian public uses smartphones and accesses content in 15-minute chunks, predominantly through apps, especially YouTube.

    • LES team to implement new tools for A/V in 2017.  Once the new system is in place, Radio Haiti team will need to assess if accessibility to DDR-Public remains an issue.  

    • If accessibility remains an issue, Radio Haiti team will discuss uploading the low-resolution derivatives to YouTube with Haitian metadata.
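
As a rough guide to what low-bandwidth access means here, the download size of a 15-minute chunk can be estimated from the derivative bit rates listed in this plan. The sketch below assumes the stated rates are total stream bit rates and ignores container overhead; `download_megabytes` is a hypothetical helper.

```python
def download_megabytes(bitrate_kbps: float, minutes: float) -> float:
    """Approximate size in megabytes (10^6 bytes) of a stream at a constant bit rate."""
    bits = bitrate_kbps * 1000 * minutes * 60
    return bits / 8 / 1_000_000


# Bit rates taken from the derivative specs in this plan; 15 minutes is the
# typical listening session noted above.
for label, kbps in [
    ("audio derivative (64 kbps)", 64),
    ("video derivative 2 (1000 kbps)", 1000),
    ("video derivative 1 (2300 kbps)", 2300),
]:
    print(f"{label}: {download_megabytes(kbps, 15):.2f} MB per 15 minutes")
```

Under these assumptions a 15-minute audio chunk is about 7.2 MB, while the same duration of the smaller video derivative is about 112.5 MB.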


Closed Captions / transcripts

  • Collection will not be transcribed or have closed captions in the immediate future owing to the cost of transcribing/translating Haitian language materials.


Metadata

Timeline


Metadata Profile and Functionality 

  • Title

    • Titles will be translated into English, French and Creole

    • Craig and Laura will indicate which version of any given item’s title is the main title.  Main title will display as title, the other 2 titles in alternate title fields.

      • See mock-up: alt-title-item and result-alt-title below



  • Alternative Title

    • Will include alternate translations of the main title

  • Description

    • Will include Program_Description_Creole, Program_Description_French, and Program_Description_English from the spreadsheet.

    • When all three languages are present, they will be displayed as 3 different blocks of text. Some items include timecodes for use later in an OHMS-like interface.


  • See alt-title-item for a mock up of the 3 descriptive paragraphs above.


  • Subject

    • Subject_Topic, Subject_Name, and Subject_Place from the spreadsheet will be mapped to subject.

    • Topics contain a mix of terms in various languages.

    • Topics and subjects are separate in the Radio Haiti Google sheet, but will be combined into one field in DDR.

  • Speaker

    • Speakers from the spreadsheet.

    • Stored in dc:creator field; displays as ‘Speaker’

  • Language

    • We will store the language codes but display the human-readable language terms.

  • Rights

    • Rights information will be available in English, French and Creole.  

    • Recording rights fall into the following categories:

      • Radio Haiti exclusive rights

      • Radio Haiti not exclusive - rights have been cleared

      • Radio Haiti not exclusive - rights not cleared

      • Small amount of copyrighted material (just a few seconds)

      • Not yet determined

    • To do: determine whether CC licenses can be used for some items, and which rights statement applies to the other items.
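
The Language entry in the profile above (store codes, display human-readable terms) amounts to a lookup at display time. The plan does not say which code scheme is stored; the sketch below assumes ISO 639-2 three-letter codes, and `display_language` is a hypothetical helper.

```python
# Assumption: ISO 639-2 codes. French has two valid codes in that standard,
# "fre" (bibliographic) and "fra" (terminologic), so both are mapped.
LANGUAGE_NAMES = {
    "eng": "English",
    "fre": "French",
    "fra": "French",
    "hat": "Haitian Creole",
}


def display_language(code: str) -> str:
    """Return the display term for a stored code, or the code itself if unknown."""
    return LANGUAGE_NAMES.get(code.strip().lower(), code)


print([display_language(c) for c in ["hat", "fre", "eng"]])
```

Falling back to the raw code for unknown values keeps an unexpected code visible on the page instead of hiding it.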


Metadata wireframe

                        

Search and Display functionality in DDR-Public

  • French and Creole language metadata should be searchable

    • Question: can the interface match words both with and without accents? For example, the metadata is correctly accented, but users may not type the accents. Side note: Laura has found that Google searches in French work with and without accents.

    • Creole orthography (how it is spelled) has changed, so words that sound the same can be spelled in different ways and with different apostrophes:

      • M p ap ka ale -- now the correct version

      • M’ p’ap ka ale -- older version

      • M pap ka ale -- not correct, but this is how people spell things

      • Laura has experienced issues with Google where she enters a song name, but the name is actually spelled differently from what she typed.

  • DDR-Public can store and display multi-lingual metadata

  • Search functionality in DDR-Public enables searching with and without accents.
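
The accent and orthography questions above can be illustrated with a small text-folding sketch: strip diacritics, drop apostrophes, and ignore case and spacing before comparing. This is only an illustration of the idea, not a description of how DDR-Public search works; `fold` and `fold_compact` are hypothetical helpers, and comparing whole phrases with the spaces removed is a deliberately crude assumption.

```python
import unicodedata

# Straight, typographic, and backtick apostrophes seen in older Creole spellings.
APOSTROPHES = "'\u2019\u2018`"


def fold(text: str) -> str:
    """Lowercase, strip accents and apostrophes, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    kept = [
        ch for ch in decomposed
        if not unicodedata.combining(ch) and ch not in APOSTROPHES
    ]
    return " ".join("".join(kept).casefold().split())


def fold_compact(text: str) -> str:
    """Like fold(), but also ignore word spacing (e.g. 'p ap' vs 'pap')."""
    return fold(text).replace(" ", "")


print(fold("Haïti") == fold("haiti"))  # accents ignored
variants = ["M p ap ka ale", "M\u2019 p\u2019ap ka ale", "M pap ka ale"]
print({fold_compact(v) for v in variants})  # all three collapse to one key
```

Folding both the indexed metadata and the user's query the same way lets the stored metadata keep its correct accents while still matching unaccented input.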

Time-coded Metadata

  • Some items will have sub-item descriptions that correspond to timecodes in the item.

  • Sub-item metadata is in the description field of the metadata interface.  It is coded for easy parsing later.

    • Previously this data has been crosswalked into XML for OHMS.  OHMS doesn’t work with SRT captions.
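
To show what "coded for easy parsing" could look like in practice, here is a sketch that splits a description into timecoded segments. The plan does not document the actual coding used in the spreadsheet, so the `[HH:MM:SS]` marker format, the `Segment` type, and `parse_segments` are all assumptions made for illustration.

```python
import re
from typing import List, NamedTuple


class Segment(NamedTuple):
    start_seconds: int
    text: str


# Hypothetical marker format: [HH:MM:SS] followed by the segment description.
TIMECODE = re.compile(r"\[(\d{1,2}):(\d{2}):(\d{2})\]\s*")


def parse_segments(description: str) -> List[Segment]:
    """Split a description into (start time, text) pairs at each timecode marker."""
    matches = list(TIMECODE.finditer(description))
    segments = []
    for i, match in enumerate(matches):
        # Each segment's text runs until the next marker or the end of the field.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(description)
        hours, minutes, seconds = (int(g) for g in match.groups())
        start = hours * 3600 + minutes * 60 + seconds
        segments.append(Segment(start, description[match.end():end].strip()))
    return segments


sample = "[00:00:00] Opening remarks [00:12:30] Interview"
for segment in parse_segments(sample):
    print(segment.start_seconds, segment.text)
```

Start times in seconds are convenient because a player can seek to them directly, whatever interface is eventually chosen.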

Displaying Time-coded metadata

  • Ultimately, this metadata should be displayed in an “OHMS-like interface”

  • NOTE:  Project stakeholders would rather have the right “OHMS-like interface” even if it comes after June 2017.  They do not want to repeat the work.


Portal Configuration

  • Configured - details to be added following 5/9/17 email to champions


Facets:

  • Program Name (category);

  • Speaker (creator);

  • Date (date);

  • Program Type (this will still be format - Maggie is not splitting them out);

  • Subject (subject);

  • Location (spatial)

  • Language (language - should be configured in time for a pre-RH release so we will display language name and not just code)


List of fields that should be displayed on items:

  • See metadata profile above


Customization of metadata for display:

  • See metadata profile above


Collection Configuration

Non-public items:

  • Some items in the collection should not be launched for public access.  

  • Non-public items are indicated by [what] in the google spreadsheet.


Collection highlights (supplied by champion)

  • [list 2 or more image file names to display in the collection banner]

  • [list 4 or more image file names to highlight in a grid -- optional]

Collection Level Metadata (supplied by champion)

  • Summary Capsule: [should describe the part of the collection which is digitized - will be supplied by champion]

  • “About the collection”: will come from finding aid abstract

Collection Preview with Champion


Preview Date:

  • Attendees:


Changes required:

  • [note any last minute changes requested during collection preview]