Section A (with 2023 edits)
Project Documentation Links:
- Github documentation (for starter guides and more): https://github.com/noahgh221/sectionA_project
- 2023 Metadata and Launch Process (Section A)
- Section A Google Sheet: https://docs.google.com/spreadsheets/d/1JcgnNaBJVyzR6No8TGOQxXs9zaNO57DP-v8XDb3IvI4/edit?usp=sharing
- Project Plan (see below)
- RLRS Sec. A Review Documentation.docx
- Section A digitzation guide workflow.docx
- /wiki/spaces/DDRKB/pages/202375245
Project Plan
Project Summary
Objective | Digitize, preserve, and launch individual collections within Section A (including vault Section A). |
---|---|
Dependencies | |
Out of Scope | Boxes 182+ (for now); Oversize Section A; itemizing contents of collections; configured portals; fragile and/or overly complicated formats; new original cataloging (collections w/ no description are not being digitized). |
Timeline |
|
Project Team | Kate Collins and Brooke Guthrie, Co-Champions; Giao Luong Baker, Digital Production Services Manager; Molly Bragg, Project Manager; Maggie Dickson, Metadata Architect; Svea Elisha, Digitization Specialist; Madina Grace, DDR Ingest; Erin Hammeke, Conservator; Jen Jordan, DDR Ingest; Alex Marsh, Digitization Specialist; Mary Mellon, Metadata Archivist |
Operating Principles |
|
Detailed Project Information
Content Analysis
Material to be digitized:
182 boxes, approx 85 linear feet
Estimated 194,000 pages minimum
Most collections are 1 folder in size.
Collection(s): Approx 3880+ small, discrete collections.
Format: Misc. bound and loose manuscript material
What is an item?
An item is a folder.
Each of the ~3880 collections will have its own collection in DDR
Most of the collections will have a single item, which is presented as a folder.
- What is a Collection in DDR?
- Each Section A collection will be ingested as its own collection.
- There will be no artificial "section A" or other collection grouping within DDR.
Rights
Rights issues:
Project team anticipates that all collections contain unpublished items; therefore, the copyright term is the life of the creator + 70 years. If the author is unknown, the term is the creation date + 120 years.
Per consultations with Dave Hansen, materials dated up to 1896 do not pose rights issues. Items dated between 1897 and 1920 require vetting by RL Research Services (see below).
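The term arithmetic above can be sketched as a small helper. This is only an illustration of the two rules stated here (creator's death + 70 years, or creation date + 120 years when the author is unknown). The function names and parameters are hypothetical, and treating a term as ended once the current year is past the end year is an assumption, not a legal determination.

```python
from typing import Optional


def copyright_term_end(creation_year: int, death_year: Optional[int] = None) -> int:
    """Return the last year of the term for an unpublished item.

    Known creator: death year + 70. Unknown creator: creation year + 120.
    """
    if death_year is not None:
        return death_year + 70
    return creation_year + 120


def term_has_ended(creation_year: int, current_year: int,
                   death_year: Optional[int] = None) -> bool:
    # Assumption: the term counts as ended once current_year is past the end year.
    return current_year > copyright_term_end(creation_year, death_year)
```

For example, an anonymous item created in 1890 has a term ending in 2010 under these rules, while one created in 1919 runs to 2039.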
- Rights Statements:
- Collections dated pre-1850 will be assigned Creative Commons public domain mark.
- Collections dated 1850-1897 will be assigned the No Copyright US rights statement.
- Published collections dated pre-1923 can be assigned No Copyright US.
- Unpublished Collections post 1897 will require further vetting by RL Research Services.
- Follow Rights Management Metadata guidelines for details about which rights notes and URIs to use.
- Managing Rights info in metadata:
- Use the "rights statement" column in SecA Working google sheet to record appropriate rights URI.
- The URIs and notes when appropriate will be ingested along with descriptive metadata.
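As a rough sketch, the date rules in the Rights Statements list could be applied when filling in the "rights statement" column. The function name, the use of a single representative year per collection, and the returned labels are assumptions for illustration; the actual URIs and notes come from the Rights Management Metadata guidelines and are deliberately not reproduced here.

```python
def rights_label(year: int, published: bool = False) -> str:
    """Map a collection date to the rights label named in this plan.

    Assumption: `year` is one representative (latest) year for the collection.
    """
    if year < 1850:
        return "Creative Commons public domain mark"
    if year <= 1897:
        return "No Copyright US"
    if published and year < 1923:
        return "No Copyright US"
    # Everything else (including cases this plan does not cover) goes to review.
    return "Needs vetting by RL Research Services"
```

Anything the list does not explicitly cover falls through to vetting rather than being assigned a statement automatically.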
Digitization Scope
- Collections created before 1920 will be digitized pending a rights review.
- Collections created after 1920 will not be digitized.
- Photocopies:
- RLRS will remove photocopies from collections for the most part.
- Photocopies within a folder (mixed w/ regular stuff): photocopies can be intermixed for a variety of reasons and should not be digitized. RLRS will clip all of the photocopies together and attach a "do not digitize" note.
- Split Collections:
- Collections that have manuscript material located in Section A as well as other boxes will be re-united and therefore not digitized as part of this project.
- If a collection contains a section A folder and bound items located elsewhere (example: oversized ledger), then the 2 sets of material will not be re-united and therefore the Section A portion WILL be digitized.
- Oversize Section A: will not be digitized.
- Vault section A: will be digitized.
- If RL Research Services determines that a collection does not possess high research value, it will not be digitized.
- If RL Tech services determines that a collection should not be digitized due to arrangement or description, it will not be digitized.
- If Conservation Services determines that a collection is too fragile for digitization, it will not be digitized.
- Folded items: If they don't cover each other - unfold them. If they cover each other, shoot the page as is
- Scan blank pages
- Scan transcripts
- Bound Volumes
- By default, bound items will be scanned as part of the folder item.
- If specific collections and/or bound items warrant further itemization, RLRS and/or RLTS will label these "do not digitize" and they will become out of scope for the current Sec A project.
Workflow
Overview and Tracking Progress
Boxes will be handled in small batches (1-5 boxes), and will pass through the following departments in the following general order*:
- RL Research Services
- Conservation Services
- Digitization
- Ingest
- RL Technical Services
- Metadata
- Launch
*The workflow may not always be linear.
Throughout the workflow, folders should not be separated from boxes. For example, if one folder requires extra conservation measures, then the full box will stay in conservation until it is ready.
Production Rate:
- Production Rate will fluctuate throughout the project
Tracking Progress
- Project team will use a google spreadsheet for box and status tracking: https://docs.google.com/spreadsheets/d/1JcgnNaBJVyzR6No8TGOQxXs9zaNO57DP-v8XDb3IvI4/edit?usp=sharing
Status codes will be used to note where a collection is in the process.
Each project participant will update the status column as they finish their part of the process.
Status | Definition |
---|---|
RLRS Review | RLRS Review is IN Progress |
RLRS Done-Zo | RLRS is DONE reviewing boxes |
Needs Copyright Review | Waiting for Dave Hansen to review for copyright |
RLTS Remediation | RLTS is actively working on remediation |
MARC Review Done-Zo | RLTS is done w/ reviewing MARC records |
Conservation in Progress | Conservation is actively reviewing / mending collection |
Conservation Done-Zo | Ready for digitization assuming digitization guides are ready |
Digitization in Progress | DPC is engaged in digitization, QC, and/or finalization |
Ready for Ingest | Digitization complete, checksums generated |
Ingested | Collection has been ingested into DDR, and is ready for metadata |
Metadata Ingested | Metadata has been ingested into DDR |
Launched | Collection has been launched in DDR - Public (yay!) |
X-Out of Scope - Copyright | Collection should not be digitized due to Copyright |
X-Out of Scope - Split Collection | Collection should not be digitized in the Section A workflow and will be reunited with the rest of the collection |
X-Out of Scope - Fragility | Collection is too fragile to digitize |
X-Out of Scope - Photocopies | Collection consists of photocopies and should not be digitized |
X-Out of Scope - TS says so | Collection requires original cataloging or other significant descriptive work and should not be digitized |
X-Out of Scope - Deaccessioning | Collection does not possess outstanding research value and should not be digitized. |
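If the status column is ever exported for reporting, the status values listed above could be checked with a short script. This is a minimal sketch: the constant and function names are hypothetical, and the grouping into "workflow" versus "out of scope" is simply read off the X- prefix used in the codes.

```python
from collections import Counter

WORKFLOW_STATUSES = [
    "RLRS Review", "RLRS Done-Zo", "Needs Copyright Review",
    "RLTS Remediation", "MARC Review Done-Zo",
    "Conservation in Progress", "Conservation Done-Zo",
    "Digitization in Progress", "Ready for Ingest",
    "Ingested", "Metadata Ingested", "Launched",
]

OUT_OF_SCOPE_STATUSES = [
    "X-Out of Scope - Copyright", "X-Out of Scope - Split Collection",
    "X-Out of Scope - Fragility", "X-Out of Scope - Photocopies",
    "X-Out of Scope - TS says so", "X-Out of Scope - Deaccessioning",
]


def classify_status(value: str) -> str:
    """Return 'workflow', 'out of scope', or 'unknown' for a status cell."""
    cleaned = value.strip()
    if cleaned in WORKFLOW_STATUSES:
        return "workflow"
    if cleaned in OUT_OF_SCOPE_STATUSES:
        return "out of scope"
    return "unknown"


def summarize(statuses):
    """Count how many cells fall into each group (useful for spotting typos)."""
    return dict(Counter(classify_status(s) for s in statuses))
```

Cells that come back as "unknown" are most likely typos or statuses added after this list was written.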
RL Research Services
Collection Review Workflow:
- Using links in the spreadsheet, review collection catalog records for split collections and other anomalies.
- If there are items that will never be re-united because they are different formats (rare book and 1 manuscript folder or 1 folder and 1 A/V) then move forward to digitization.
- If there is an extraordinary research value regardless of the split then move forward to digitization.
- If the Section A portion is part of a large manuscript collection that needs to be re-united then mark out of scope split collection on the spreadsheet. Any folders that will not be digitized should be noted with a do not digitize flag.
- If RLRS finds issues w/ the catalog record, note the issue in the “notes” section of the google sheet.
- Check creation date of collection
- If created between 1896 - 1919, evaluate for copyright risk (see documentation link below).
- If some of the material was created post 1920, then mark whole collection out of scope - copyright.
- Any folders that will not be digitized should be noted with a do not digitize flag.
- Check for photocopies and non-collection material
- Non-collection material will be removed (to CCF or to discard).
- Photocopies will either be removed or consolidated and clipped together with a do not digitize flag.
- Research Value: If collections are not split and are not in copyright, but do not possess any research value and/or should be deaccessioned for other reasons, they will not be digitized and will be given an out of scope - deaccession status on the spreadsheet. Any folders that will not be digitized should be noted with a do not digitize flag.
- Ensure that the order of collections in the box matches the order of collections listed on the spreadsheet.
- RLRS will note when they finish their work on a given box in the tracking spreadsheet.
Following the RLRS review, status will be updated on the google sheet and the box can be transferred to conservation, and RLTS will begin their assessment.
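The review checks above amount to a short decision sequence, sketched below. The class, field names, and the order of the checks are assumptions made for illustration. The plan also does not say how material dated exactly 1920 is handled, so this sketch treats 1920 and later as out of scope. It is not a substitute for reviewer judgment.

```python
from dataclasses import dataclass


@dataclass
class ReviewFindings:
    latest_year: int                    # latest creation date found in the collection
    needs_reuniting: bool = False       # Section A portion belongs to a larger manuscript collection
    extraordinary_value: bool = False   # digitize regardless of the split
    only_photocopies: bool = False
    lacks_research_value: bool = False


def rlrs_status(f: ReviewFindings) -> str:
    """Return the tracking status suggested by the RLRS review findings."""
    if f.needs_reuniting and not f.extraordinary_value:
        return "X-Out of Scope - Split Collection"
    if f.latest_year >= 1920:  # assumption: 1920 itself is treated as out of scope
        return "X-Out of Scope - Copyright"
    if f.only_photocopies:
        return "X-Out of Scope - Photocopies"
    if f.lacks_research_value:
        return "X-Out of Scope - Deaccessioning"
    if 1896 <= f.latest_year <= 1919:
        return "Needs Copyright Review"
    return "RLRS Done-Zo"
```

For instance, `rlrs_status(ReviewFindings(latest_year=1905))` suggests "Needs Copyright Review", while a collection dated 1870 with no other findings comes back as "RLRS Done-Zo".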
RL Technical Services
Scope of TS work:
- RLTS will re-unite split collections and make any necessary changes to catalog records.
- RLTS will remediate MARC records and make them RDA-compliant (fix titles, etc.)
- When RLTS changes the title of a collection, they also change it on the digitization guide and the google sheet.
- RLTS will crosswalk remediated MARC to EAD and import into ASpace to create new Resource Records.
- RLTS will create starter digguides based on ASpace record - 1 box per spreadsheet.
- Link to Section A Github project: https://github.com/noahgh221/sectionA_project
- Project contains all files needed to perform metadata crosswalk
- Alex, Mike, Noah have commit access to project
- Noah uses project to crosswalk MARC to EAD and to produce starter guides
- DPC picks up starter guides here, scans, and places completed digguide in /completed_digguide directory. This digguide should be used as the basis for DDR metadata
- RLTS will note when they finish their work as needed on the tracking spreadsheet
- Set to "MARC Review Complete"
- Notify DPC of new starter digguides
RLTS Detailed Documentation:
- MARC Remediation documentation: https://docs.google.com/document/d/1HkPQVHVkSo1Fq1k7zmu9xyrS2Ic7smPt3FYEKBI_5-E/edit
- Metadata Crosswalk Documentation (detailed): https://docs.google.com/document/d/1h-CaFHsyN3rciROCE5WiADGm8f9ZP6dwnsuKsauBL0U/edit
Following RLTS review, box should be ready for conservation services review, though these two phases may overlap.
Conservation
- Erin (and/or others from Conservation Services) will review materials and re-house and make repairs as necessary.
- If items within a folder are too fragile, Erin will flag the folder and make a note on the spreadsheet.
- If a collection contains items that should be imaged on the camera (not the Zeutschel), Erin will flag the folder as camera only.
- Work between RLTS and conservation may overlap, so Erin and Meghan will maintain communication via email as necessary in addition to updating google sheet.
Following RLTS and Conservation review, collection will be ready for digitization.
Digitization
Digitization Workflow:
- Retrieve Starter digitization guide:
- Noah will export 1 starter digitization guide per box (1 box = multiple collections).
- Download starter digitization guide from GitHub (here)
- Import .txt file to Excel and convert to CSV.
- Add ‘physical_order’ (column A) and ‘barcode’ (column U, or the first empty column) to the digitization guide.
- Add standard DPC fields after ‘barcode’ (columns V-AI)
- Record the physical order of the material in the spreadsheet (the folders must stay in their original order) and sort it so the spreadsheet order matches the physical order.
- Folders designated as ‘Out of Scope’ will not be included in the Github spreadsheet
- Add rows for folders that are ‘out of scope’ and highlight them red. Inserting these rows and highlighting them red enables a scanner operator to return the ‘Out of Scope’ folders to the correct location should they get misfiled.
- Update tab name to reflect the box number (for example, ‘sec_a_box_1’)
- When digitization of a box has been completed update the file name from ‘sec_a_digguide_box_1’ to ‘sec_a_digguide_box_1_c’ and post to Github.
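For anyone who prefers scripting the conversion instead of using Excel, the preparation steps above could look roughly like this. It is a sketch under several assumptions: the starter .txt file is tab-delimited, its first row is a header, and the file order is only a starting point for ‘physical_order’ that the operator still verifies against the box. The function names are hypothetical, and the ‘barcode’ column is left blank to be filled in by hand.

```python
import csv
import io


def prepare_digguide(starter_text: str) -> str:
    """Convert tab-delimited starter guide text to CSV text.

    Adds 'physical_order' as the first column and an empty 'barcode'
    column after the last existing column.
    """
    rows = list(csv.reader(io.StringIO(starter_text), delimiter="\t"))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["physical_order"] + header + ["barcode"])
    for order, row in enumerate(body, start=1):
        # Pad short rows so 'barcode' always lands in the first empty column.
        padded = row + [""] * (len(header) - len(row))
        writer.writerow([str(order)] + padded + [""])
    return out.getvalue()


def completed_name(file_name: str) -> str:
    """Append '_c' before the extension, e.g. for a finished box."""
    stem, dot, ext = file_name.rpartition(".")
    if not dot:
        return file_name + "_c"
    return f"{stem}_c.{ext}"
```

Rows for ‘Out of Scope’ folders and the red highlighting still need to be added manually, since those folders are not in the starter guide.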
- Digitize per the following guidelines:
- Check to see if folders in the box are in the same order as spreadsheet
- Digitization specialist will change collection titles on the physical folder if need be per changes in RLTS.
- Use the same capture device for a full collection/folder (if one item in the collection needs to be shot on camera, shoot entire collection on camera).
- In the case of scrapbook-like items and fold outs, only digitize what is easy to see; do not worry about shooting each flap or fold out separately. If a fold out doesn’t cover other items, unfold and shoot. If a fold out covers other things, shoot the page folded.
- All papers in a folder count as 1 item, even bound items.
- Following digitization of each folder, remove flags and re-box.
- Use the DPC naming convention:
The DPC normally follows a standard file naming convention. Due to the number of collections in this project (3,000+), the file naming convention has been slightly modified.
All collections begin with ‘secst’. The collections are numbered sequentially starting at ‘0001’ so that secst0001 identifies the first collection within Section A. The majority of collections in Section A consist of one folder but there are a few that have two folders. It is not necessary to represent a folder in the file name so that part of the file name has been removed.
- Structure: sec/st/0001/001 (section a/type/collection/page)
- Collection ID is 9 characters long and can accommodate 9,999 collections
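The naming structure above can be expressed as a pair of helpers. This is a sketch, not the DPC's tooling: it assumes the parts are concatenated with no separators, that the page component is always three digits (as in ‘001’), and it says nothing about file extensions. The function names are hypothetical.

```python
def collection_id(number: int) -> str:
    """Build the 9-character collection ID, e.g. 1 -> 'secst0001'."""
    if not 1 <= number <= 9999:
        raise ValueError("collection number must be between 1 and 9999")
    return f"secst{number:04d}"


def file_base_name(number: int, page: int) -> str:
    """Build collection ID + page, e.g. (1, 1) -> 'secst0001001'."""
    # Assumption: three-digit page numbers; the plan does not cover 1,000+ pages.
    if not 1 <= page <= 999:
        raise ValueError("page number must be between 1 and 999")
    return f"{collection_id(number)}{page:03d}"
```

The four-digit collection number is what limits the scheme to 9,999 collections.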
- Perform QC:
- QC should be completed 1 box at a time, as soon as the box has been digitized. Box should be returned to RL following QC.
- Each collection within Section A should be stored separately and have its own checksum file.
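One way to produce a checksum file per collection is sketched below. The plan does not name a checksum algorithm or manifest format, so SHA-256, the manifest file name, and the "digest, two spaces, relative path" line layout are all assumptions; the function name is hypothetical as well.

```python
import hashlib
from pathlib import Path


def write_checksum_file(collection_dir: Path,
                        manifest_name: str = "checksums.sha256") -> Path:
    """Write one manifest covering every file under a collection directory."""
    manifest = collection_dir / manifest_name
    lines = []
    for path in sorted(collection_dir.rglob("*")):
        # Skip directories and the manifest itself so reruns are stable.
        if path.is_file() and path != manifest:
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large image files are not loaded whole.
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            relative = path.relative_to(collection_dir).as_posix()
            lines.append(f"{digest.hexdigest()}  {relative}")
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest
```

Calling this once per collection directory keeps each collection's files and checksums together, which matches the requirement that collections be stored separately.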
Detailed Digitization Guidelines and Instructions:
Ingest
- Each collection will be ingested as its own collection in DDR
- Ingest 1 collection at a time
- Eventually bulk ingest feature could facilitate ingesting multiple collections in one ingest (no timeline as of 6.14.17)
- Repository Services Analysts will ingest Section A collections to DDR.
Metadata
- Metadata fields and ingest sequence are described here: Ingest, Metadata and Launch Process (Section A)
- Rights statements
- Collections will utilize rights statements - see Rights section above.
Collection should be ready for launch configuration following metadata ingest.
Launch
- Portal Configuration:
- All collections will be implemented with the unconfigured portals.
Use finding aid slug as EAD ID (finding aid slug should be on the collection's digitization guide)
Review collection and launch by adjusting roles
Following launch:
Mary to add ARK links to catalog records