See discussion in
Jira Legacy | ||||||
---|---|---|---|---|---|---|
|
TL;DR – Some “derivatives” in the DDR are created automatically by the system, and some are manually uploaded. Currently, the file information recorded in the repository does not reflect these different practices, and that could lead to accidental loss of data/work.
Our file model tracks four attributes: digest (SHA1), original_filename, media_type (a.k.a. MIME type), and file_identifier, which is a reference to the storage location of the file.
My proposal is to add new attributes to supplement this information and help track files better.
source - string (optional) - The resource/file from which this file is derived. Although the source file is often the original “content” file, this is not always the case, so my thought is to encode the source explicitly.
For files derived (either manually or automatically) from other repository files, the value of source could be a full reference to the source repository file as a URN, for example:

    urn:uuid:30ba2ca2-0ab1-4617-9c34-d5f8a9103c6f#content

(We already use the URN format to refer to resources in structural metadata; this usage just adds the file reference as a sub-location, using the URN fragment component [1].)
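To make the shape of a source value concrete, here is a minimal Python sketch of how such a reference might be split into the resource UUID and the file sub-location. The names FileRef, parse_source, and format_source are hypothetical and do not exist in the repository code, and the pattern assumes file names use only letters, digits, underscores, dots, and hyphens.

```python
import re
import uuid
from typing import NamedTuple


class FileRef(NamedTuple):
    """Hypothetical parsed form of a `source` value."""
    resource_id: uuid.UUID  # the repository resource
    file_name: str          # the file within that resource, e.g. "content"


# Assumption: the reference is always "urn:uuid:<UUID>#<file name>".
_SOURCE_RE = re.compile(
    r"^urn:uuid:(?P<id>[0-9a-fA-F-]{36})#(?P<file>[A-Za-z0-9_.-]+)$"
)


def parse_source(value: str) -> FileRef:
    """Split a source URN into resource UUID and file sub-location."""
    match = _SOURCE_RE.match(value)
    if match is None:
        raise ValueError(f"not a file-level URN: {value!r}")
    # uuid.UUID raises ValueError itself if the 36 characters are not a UUID.
    return FileRef(uuid.UUID(match.group("id")), match.group("file"))


def format_source(ref: FileRef) -> str:
    """Build the URN form from a parsed reference."""
    return f"urn:uuid:{ref.resource_id}#{ref.file_name}"
```

A URN without the fragment part is rejected here on purpose, since a bare resource reference would not say which file the derivative came from.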
creator - string (optional) - The entity responsible for the creation of the file
This would be the key attribute with respect to automated derivative generation, so it’s important that there be a reserved value to represent the repository itself. In the context of preservation events, we have used the term SYSTEM.
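As an illustration of how the two new attributes could guard against accidental loss, the following Python sketch models the extended file record and a check that automated regeneration might run before replacing a derivative. RepoFile and can_overwrite_automatically are hypothetical names, and the policy itself is an assumption rather than part of the proposal: only files whose creator is SYSTEM are treated as replaceable, and files with no creator recorded are left alone.

```python
from dataclasses import dataclass
from typing import Optional

# Reserved creator value for the repository itself (term borrowed from
# preservation events, as described above).
SYSTEM = "SYSTEM"


@dataclass(frozen=True)
class RepoFile:
    """Hypothetical record: the four existing attributes plus the two proposed."""
    digest: str                    # SHA1 of the file
    original_filename: str
    media_type: str                # MIME type
    file_identifier: str           # reference to the storage location
    source: Optional[str] = None   # proposed: URN of the file this was derived from
    creator: Optional[str] = None  # proposed: entity that created the file


def can_overwrite_automatically(existing: Optional[RepoFile]) -> bool:
    """Assumed policy: replace a derivative only if the system created it.

    A missing file can always be generated. A file with no creator
    recorded is treated conservatively as possibly manual work.
    """
    if existing is None:
        return True
    return existing.creator == SYSTEM
```

Treating an unset creator as "do not overwrite" is the cautious choice for existing files, which would have no value until they are backfilled.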
In cases where the file is created manually outside the repository, it would be the curator’s responsibility to assign values to these attributes, if desired. Various batch ingest and update processes, as well as certain admin UI elements, may need to be updated to support this functionality.
...
[1] https://datatracker.ietf.org/doc/html/rfc8141#section-2.3.3