Use case: published data - include suggested procedure #24

Merged
merged 1 commit into from
May 16, 2022
10 changes: 9 additions & 1 deletion Governance/retire/retire-published.md
@@ -5,7 +5,7 @@ However, there are times when we need to delete the old data because it was seri
If data is to be unpublished, we need to consider whether a copy of the old version can and should be archived on a tape store (e.g. NCI's MDSS) or if it is safe to delete the data. **In general, a copy of all published data should always be kept by the publisher, even if it contains errors**.


It is not permissible to simply remove a DOI, though it may be necessary to redirect to a "tombstone page" which resolves to a page that explicitly states that the dataset is no longer here.
It is not permissible to simply remove a DOI, though it may be necessary to redirect to a ["tombstone page"](https://support.datacite.org/docs/tombstone-pages) which resolves to a page that explicitly states that the dataset is no longer here.

Every new version of a dataset (e.g. due to changes in the data) requires a new DOI.

@@ -15,3 +15,11 @@ When practicable to do so, retired datasets should be archived to a deep store,

Most data facilities (including NCI and CSIRO) have tape archives where data can be backed up or stored when it is no longer required. However, there is still some cost associated with this storage, and it is not infinite in scale, so it is not always an appropriate solution, but it should be used when viable to do so.

## Suggested procedure

1. Create errata statement or description of new dataset version explaining reason for retirement of this dataset
2. Circulate notice of retirement to registered users (e.g. project members)
3. Update DOI landing page with errata statement and notice of removal/new version
4. Quarantine data, removing read access but leaving a README document including the errata statement
5. Create a deep archive copy of the dataset
6. After some pre-defined period, data can be removed from the filesystem, though as we are the original publisher, if at all possible the deep archive copy must be retained.
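
As an illustration of step 4 only, the following is a minimal Python sketch of one way to quarantine a dataset directory on a POSIX filesystem. It is not part of this PR: the function name `quarantine_dataset`, the `README` file name, and the choice to implement "removing read access" by clearing group and other permission bits are all assumptions. Sites that manage access through ACLs or project membership would need a different mechanism.

```python
import os
import stat
from pathlib import Path

README_NAME = "README"  # hypothetical name for the errata notice file


def quarantine_dataset(dataset_dir, errata_text):
    """Write an errata README, then remove group/other access to all other content.

    Assumes plain POSIX permission bits control read access, and that the
    top-level directory itself stays traversable so the README can be read.
    An existing README in that directory is overwritten.
    """
    root = Path(dataset_dir)
    readme = root / README_NAME
    readme.write_text(errata_text + "\n", encoding="utf-8")

    group_other = stat.S_IRWXG | stat.S_IRWXO
    for path in root.rglob("*"):
        # Leave the README alone and do not follow symlinks out of the dataset.
        if path == readme or path.is_symlink():
            continue
        mode = stat.S_IMODE(path.stat().st_mode)
        os.chmod(path, mode & ~group_other)  # owner bits are kept

    os.chmod(readme, 0o644)  # README stays world-readable
    return readme
```

The owner keeps access so that the deep archive copy in step 5 can still be created from the quarantined files.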