added markdown version of the SIP spec
jmaferreira committed Sep 14, 2018
1 parent 721a7ef commit d42af69
Showing 51 changed files with 710 additions and 6 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
.DS_Store
41 changes: 35 additions & 6 deletions README.md
@@ -1,14 +1,43 @@
# E-ARK-SIP
E-ARK SIP specification
# E-ARK General SIP specification

This page describes the SIP package structure and minimum set of required metadata for SIP delivery to the archive. It is fully compliant with the Common Specification for Information Packages.
This Git repository describes the E-ARK SIP package structure and the minimum set of required metadata for SIP delivery to the archive. It is fully compliant with the Common Specification for Information Packages.

## Target audience

The target group for this document are records creators, archival institutions and software providers creating or updating their SIP format specifications.



About history:
## The specification

### Final versions

Final versions of the specification are published in PDF format on the [DILCIS Board web site](http://dilcis.eu/specifications/sip).


### Draft versions

The most up-to-date version of the SIP specification is being managed in markdown format in this GitHub repository.

This is a draft version of the specification that is being collaboratively edited by multiple experts.

An HTML version of the E-ARK Submission Information Package Specification is available in the
[specification folder](./specification/) of this repository.

See the [Markdown documentation](https://guides.github.com/features/mastering-markdown/) for a deeper understanding of how to edit Markdown documents.



## Previous versions of the specification

Previous versions of the specification are available in the [archive](./archive/) folder.


## History

In 2014, the E-ARK project conducted a survey and published a report on available best practices. The report provided, among other outcomes, an overview of SIP formats used in memory institutions and supported by tools.

In 2014, the E-ARK project conducted a survey and published a report on available best practices. The report provided, among other outcomes, an overview of SIP formats used in memory institutions and supported by tools. The E-ARK project analysed the formats and then delivered a first version of a harmonised SIP format based on that - Deliverable 3.2 E-ARK SIP Draft Specification. That deliverable gave an overview of the structure and main metadata elements for the SIP and provided initial input for the technical implementations of pre-ingest and ingest tools. It was followed by Deliverable 3.3 which extended the previous one by providing a revised version of the D3.2 content, adding more details relevant for tool development and implementation, and describing specific profiles for the transfer of relational databases, electronic records management systems (ERMS) and simple file system based records (SFSB). The version 0.14 is based on the deliverable 3.3 and the feedback received from pilot projects.
The E-ARK project analysed the formats and then delivered a first version of a harmonised SIP format based on that - Deliverable 3.2 E-ARK SIP Draft Specification.

That deliverable gave an overview of the structure and main metadata elements for the SIP and provided initial input for the technical implementations of pre-ingest and ingest tools. It was followed by Deliverable 3.3 which extended the previous one by providing a revised version of the D3.2 content, adding more details relevant for tool development and implementation, and describing specific profiles for the transfer of relational databases, electronic records management systems (ERMS) and simple file system based records (SFSB).

Version 1.4 is based on Deliverable 3.3 and on the feedback received from pilot projects.
3 changes: 3 additions & 0 deletions archive/README.md
@@ -0,0 +1,3 @@
# General SIP Specification

In this folder you will find the published versions of the General SIP specification.
Binary file added archive/v1.4/General_SIP Specification_v1.4.docx
Binary file not shown.
Binary file added archive/v1.4/General_SIP Specification_v1.4.pdf
Binary file not shown.
Binary file added archive/v1.4/images/image1.png
Binary file added archive/v1.4/images/image10.png
Binary file added archive/v1.4/images/image11.png
Binary file added archive/v1.4/images/image12.png
Binary file added archive/v1.4/images/image13.png
Binary file added archive/v1.4/images/image14.png
Binary file added archive/v1.4/images/image15.png
Binary file added archive/v1.4/images/image2.png
Binary file added archive/v1.4/images/image3.png
Binary file added archive/v1.4/images/image4.png
Binary file added archive/v1.4/images/image5.emf
Binary file not shown.
Binary file added archive/v1.4/images/image5.png
Binary file added archive/v1.4/images/image6.emf
Binary file not shown.
Binary file added archive/v1.4/images/image6.png
Binary file added archive/v1.4/images/image7.emf
Binary file not shown.
Binary file added archive/v1.4/images/image7.png
Binary file added archive/v1.4/images/image8.emf
Binary file not shown.
Binary file added archive/v1.4/images/image8.png
Binary file added archive/v1.4/images/image9.png
Binary file added archive/v1.4/~$neral_SIP Specification_v1.4.txt
Binary file not shown.
15 changes: 15 additions & 0 deletions specification/00.01-authors/index.md
@@ -0,0 +1,15 @@
Authors
-------

| Name | Organisation |
| -------------------------------- | -------------------------------------------------- |
| Tarvo Kärberg | National Archives of Estonia |
| Anders Bo Nielsen | Danish National Archives |
| Björn Skog | ES Solutions |
| Gregor Zavrsnik | Slovenian National Archives |
| Hélder Silva | KEEP SOLUTIONS |
| Karin Bredenberg | National Archives of Sweden |
| Kathrine Hougaard Edsen Johansen | Danish National Archives |
| Levente Szilágyi | National Archives of Hungary |
| Phillip Mike Tømmerholt | Danish National Archives |
| Miguel Ferreira | KEEP SOLUTIONS |
38 changes: 38 additions & 0 deletions specification/00.02-history/index.md
@@ -0,0 +1,38 @@
Revision History
----------------

| Revision No. | Date | Author(s) | Organisation | Description |
|--------------|------------|----------------------------------|------------------------|-----------------------------------------------------------------------|
| 0.1 | 20.10.2014 | Tarvo Kärberg | NAE | First draft. |
| 0.2 | 13.11.2014 | Tarvo Kärberg | NAE | Updating content. |
| 0.3 | 02.12.2014 | Tarvo Kärberg | NAE | Updating content. |
| 0.4 | 17.01.2015 | Tarvo Kärberg | NAE | Updating content. |
| 0.5 | 21.01.2015 | Karin Bredenberg | ESS | Updating content. |
| 0.6 | 23.01.2015 | Anders Bo Nielsen | DNA | Updating content. |
| 0.7 | 23.01.2015 | Kathrine Hougaard Edsen | DNA | Updating content. |
| 0.71 | 26.01.2015 | Björn Skog | ESS | Updating content. |
| 0.72 | 27.01.2015 | Hélder Silva | KEEPS | Updating content. |
| 0.8 | 27.01.2015 | Angela Dappert | DLM/UPHEC | Quality assurance and proof-reading. |
| 0.9 | 29.01.2017 | Kuldar Aas | NAE | Quality assurance and proof-reading. |
| 0.91 | 30.01.2015 | David Anderson | UPHEC | Quality assurance and proof-reading. |
| 1.0 | 30.01.2015 | Tarvo Kärberg | NAE | Final version (D3.2). |
| 0.1 | 11.05.2015 | Karin Bredenberg | ESS/NAS | Updating content. |
| 0.2 | 30.06.2015 | Tarvo Kärberg | NAE | Updating content. |
| 0.3 | 27.07.2015 | Tarvo Kärberg | NAE | Updating content. |
| 0.4 | 23.10.2015 | Tarvo Kärberg | NAE | Updating content, synchronising with the SMURF profile. |
| 0.41 | 17.11.2015 | Tarvo Kärberg | NAE | Integrating the feedback. |
| 0.42 | 07.12.2015 | Tarvo Kärberg | NAE | Updating content. |
| 0.5 | 12.01.2016 | Tarvo Kärberg | NAE | Updating content, synchronising with the Common Specification. |
| 0.6 | 15.01.2016 | Anders Bo Nielsen | DNA | Updating content. |
| 0.61 | 15.01.2016 | Gregor Zavrsnik | SNA | Updating content. |
| 0.62 | 18.01.2016 | Tarvo Kärberg | NAE | Updating content. |
| 0.63 | 20.01.2016 | Phillip Mike Tømmerholt | DNA | Updating content. |
| 0.64 | 25.01.2016 | Phillip Mike Tømmerholt | DNA | Updating content. |
| 0.7 | 26.01.2016 | Sven Schlarb | AIT | Quality assurance and proof-reading. |
| 0.8 | 27.01.2016 | Kuldar Aas | NAE | Quality assurance and proof-reading. |
| 0.9 | 29.01.2016 | Andrew Wilson and David Anderson | University of Brighton | Quality assurance and proof-reading. |
| 1.0 | 29.01.2016 | Tarvo Kärberg | NAE | Final version (general part of D3.3) |
| 1.1 | 14.07.2016 | Tarvo Kärberg | NAE | Incorporating agreements made in the Common Specification work group. |
| 1.2 | 12.12.2016 | Tarvo Kärberg | NAE | Incorporating agreements made in the Common Specification work group. |
| 1.3 | 13.01.2017 | Tarvo Kärberg | NAE | Small updates. |
| 1.4 | 31.01.2017 | Tarvo Kärberg | NAE | Finalising the specification. |
16 changes: 16 additions & 0 deletions specification/00.03-summary/index.md
@@ -0,0 +1,16 @@
# Executive summary

According to the Open Archival Information System Reference Model (OAIS) every submission of information to an archive by a producer occurs as one or more discrete transmissions of submission information packages. Unfortunately there is currently no central SIP format which would cover all national and business needs as identified in the E-ARK Report on Available Best Practices. The E-ARK project acknowledged this problem and developed a solution in the form of the SIP format which is described in this document.

The first outcome of this work was Deliverable 3.2: E-ARK SIP Draft Specification. This gives an overview of the structure and main metadata elements for the SIP and provides initial input for the technical implementations of pre-ingest and ingest tools. It was followed by Deliverable 3.3 which extends the previous one by providing a revised version of the D3.2 content, adding more details relevant for tool development and implementation, and describing specific profiles for the transfer of relational databases, electronic records management systems (ERMS) and simple file system based records (SFSB).

The target group for this document are records creators, archival institutions and software providers creating or updating their SIP format specifications. This document is also important for electronic records management systems (ERMS) providers as it presents a standardised profile for exporting records and metadata out of their systems.

This document provides an overview of:

- **The general structure for Submission Information Packages.**
This chapter explains how records creators should structure their SIPs in order to meet the requirements of the SIP specification and achieve interoperability by following the common rules for all information packages (SIPs, AIPs, DIPs) as described in the Common Specification for Information Packages.
- **General SIP metadata.** This chapter provides a detailed overview of metadata sections and the metadata elements in these sections. The tables listing all metadata elements are mainly of interest to technical stakeholders who wish to implement the SIP.
- **Content Information Type Specifications.** This section introduces profiles for SMURF (Semantically Marked Up Records Format) and relational databases. The profiles themselves are separate documents.
- **The submission agreement.** This chapter provides an overview of submission agreement usages and recommended metadata elements.

Binary file added specification/01-introduction/image1.png
39 changes: 39 additions & 0 deletions specification/01-introduction/index.md
@@ -0,0 +1,39 @@
# 1. Introduction


## 1.1. Scope and purpose

This document is a core / general SIP specification which is guided by the following hierarchical model (see Figure 1):

![Relations between specifications](image1.png)


- Common Specification for Information Packages (CSIP) identifies and standardises the common aspects of information packages (SIP/AIP/DIP) which are equally relevant to, and implemented by, any of the functional entities of the overall digital preservation process (i.e. pre-ingest, ingest, long-term preservation and access). CSIP is a separate document. Therefore, the current specification does not aim to repeat the information presented there at length; only the information that is absolutely necessary to understand the SIP specification is mentioned here.
- General SIP Specification. This is the current document which describes the SIP package structure and minimum set of required metadata for SIP delivery to the archive.
- Content Information Type Specifications are content-dependent specifications which include detailed information on how content, metadata, and documentation for specific content types (for example ERMS or relational databases) can be handled within the SIP. At the moment, there are three such specifications:
  - SIARD 2.0 for relational databases (the SIARD 2.0 specification for relational databases can be found at http://eark-project.com/resources/specificationdocs/32-specification-for-siard-format-v20)
  - SMURF ERMS for electronic records management systems (the SMURF profile for ERMS can be found at https://github.com/DLMArchivalStandardsBoard/SMURF/tree/master/spec)
  - SMURF SFSB for simple file system based records (the SMURF profile for SFSB can be found at https://github.com/DLMArchivalStandardsBoard/SMURF/tree/master/spec)


## 1.2. Related work

This document is based on or influenced by the following documents and best practices:

- **Deliverable D3.1** - E-ARK Report on Available Best Practices, 2014, http://eark-project.com/resources/project-deliverables/6-d31-e-ark-report-on-available-best-practices
D3.1 was one of the inputs to deliverable D3.2, which in turn was an input to D3.3.
- **Deliverable D2.1** - General pilot model and use case definition, 2014, http://eark-project.com/resources/project-deliverables/5-d21-e-ark-general-pilot-model-and-use-case-definition.
We have developed the SIP specification to support the workflows defined in the general model.
- **FGS package structure**, 2013, https://riksarkivet.se/Media/pdf-filer/Projekt/FGS_Earkiv_Paket.pdf
This specification was one of the main inputs for the first draft SIP specification. The newest version (https://riksarkivet.se/Media/pdf-filer/doi-t/FGS_Paketstruktur_RAFGS1V1.pdf) was also investigated in the SIP definition process.
- **Reference Model for an Open Archival Information System** (OAIS), 2012, public.ccsds.org/publications/archive/650x0m2.pdf
We have used the same terminology as introduced in the OAIS model and also the same division of information package types: Submission Information Package (SIP), Archival Information Package (AIP), Dissemination Information Package (DIP).
- **Producer-Archive Interface Methodology Abstract Standard** (PAIMAS), 2004, public.ccsds.org/publications/archive/651x0m1.pdf
We have looked at the four phases (Preliminary, Formal Definition, Transfer, Validation) of PAIMAS, their aims and expected results and decided to support the phases as far as possible with the current specification. Furthermore, the requirements for the submission agreement were influenced by the PAIMAS standard.
- **Producer-Archive Interface Specification (PAIS)** – CCSDS, 2014, public.ccsds.org/publications/archive/651x1b1.pdf
We have investigated the structure of a SIP presented in PAIS, but as the implementation of this specification is far from comprehensive (only a few prototypes exist), we decided to rely more on the practices identified in the best practice report.
- **e-SENS** (Electronic Simple European Networked Services) project, http://www.esens.eu/
We have investigated the e-Delivery and e-Documents related work in e-SENS and made sure that our work is neither duplicating the work done there nor producing any conflicts between deliverables.
- **Deliverables D3.2** - E-ARK SIP Draft Specification, 2015, http://eark-project.com/resources/project-deliverables/17-d32-e-ark-sip-draft-specification and D3.3 E-ARK SIP Pilot Specification, 2016, http://eark-project.com/resources/project-deliverables/51-d33pilotspec


Binary file added specification/02-general_structure/image2.png
Binary file added specification/02-general_structure/image3.png