PDF for archival preservation
Abbreviation: PDF/A
Identifiers:
Type | Id
SIS ID | fPDFA
LOC (Library of Congress) | fdd000318
Media type(s):
File extension(s): .pdf
Format family: PDF
Functional domains:
  • Contextual Data
  • Documentation
  • Image Source Language Data
  • Other
  • Textual Source Language Data
Recommendations:
Centre | Domain | Level | Comments
DANS | Documentation | recommended | See more info from DANS
DANS | Other | acceptable | See more info from DANS
DANS | Textual Source Language Data | recommended | See more info from DANS
EKUT | Image Source Language Data | recommended |
EKUT | Textual Source Language Data | recommended |
FIN-CLARIN | Documentation | recommended |
IDS | Documentation | recommended |
IDS | Image Source Language Data | recommended |
LAC | Contextual Data | recommended |
MI | Image Source Language Data | recommended |
MI | Textual Source Language Data | recommended |
SAW | Documentation | recommended |
Sprakbanken | Image Source Language Data | recommended |
ZIM | Image Source Language Data | recommended |
ZIM | Textual Source Language Data | recommended |
Domain definitions:
  • Contextual Data: Images (photos or drawings) or documents relevant to the communicative event or text but not part of the source language data.
  • Documentation: Unstructured documentation of the resource and its parts such as corpus or annotation guidelines.
  • Image Source Language Data: Digitized images of analogue sources of written language data for research purposes (e.g. facsimiles, scans of handwriting, photos of inscriptions).
  • Other: Any other function that cannot be included in an existing domain. The content of this domain will be periodically examined for potential patterns that may give rise to new domains.
  • Textual Source Language Data: Written unstructured/plain text or originally structured text (e.g. HTML) without linguistic or other mark-up added for research purposes.
Description:

PDF/A differs from PDF by prohibiting features unsuitable for long-term archiving, such as font linking (as opposed to font embedding) and encryption. Note that "PDF/A" is actually a collection of formats:

  • PDF/A-1: "Part 1: Use of PDF 1.4" (2005-09-28)
  • PDF/A-2: "Part 2: Use of ISO 32000-1" (2011-06-20)
  • PDF/A-3: "Part 3: Use of ISO 32000-1 with support for embedded files" (2012-10-15)
  • PDF/A-4: "Part 4: Use of ISO 32000-2" (2020-11)

Centres should note that Part 1 is based on PDF 1.4, an obsolete version of PDF, while Parts 2 and 3 are based on the fully open PDF 1.7 (standardized as ISO 32000-1).
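
Because "PDF/A" covers several parts, it can help to check which part a given file claims before deciding how to treat it. PDF/A files declare this in their XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties. The following is a minimal sketch, not a validator: it assumes the XMP packet is stored uncompressed and in a single-byte-compatible encoding such as UTF-8, and the names `declared_pdfa` and `BASE_SPEC` are illustrative only. A file can declare a part without actually conforming to it.

```python
import re
from typing import Optional, Tuple

# Base specification behind each PDF/A part, taken from the list above.
BASE_SPEC = {
    1: "PDF 1.4",
    2: "ISO 32000-1 (PDF 1.7)",
    3: "ISO 32000-1 (PDF 1.7), with support for embedded files",
    4: "ISO 32000-2",
}

# XMP can serialize a property as an element (<pdfaid:part>1</pdfaid:part>)
# or as an attribute (pdfaid:part="1"); both forms are accepted here.
_PART_RE = re.compile(rb'pdfaid:part\s*(?:=\s*["\']|>)\s*(\d)')
_CONF_RE = re.compile(rb'pdfaid:conformance\s*(?:=\s*["\']|>)\s*([A-Za-z])')


def declared_pdfa(data: bytes) -> Optional[Tuple[int, Optional[str]]]:
    """Return (part, conformance level) as declared in the XMP, or None.

    Assumption: the XMP packet is readable as plain bytes in `data`.
    This reads the declaration only; it does not check conformance.
    """
    part = _PART_RE.search(data)
    if part is None:
        return None
    conf = _CONF_RE.search(data)
    level = conf.group(1).decode("ascii").upper() if conf else None
    return int(part.group(1)), level


if __name__ == "__main__":
    sample = (
        b"<rdf:Description>"
        b"<pdfaid:part>2</pdfaid:part>"
        b"<pdfaid:conformance>B</pdfaid:conformance>"
        b"</rdf:Description>"
    )
    found = declared_pdfa(sample)
    if found is not None:
        part, level = found
        print(f"Declares PDF/A-{part}{(level or '').lower()}, based on {BASE_SPEC[part]}")
```

In practice `data` would come from reading the file in binary mode, for example `open(path, "rb").read()`. A result of `None` means no declaration was found in readable form, which does not by itself prove the file is not PDF/A.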

veraPDF is an open-source validator for the PDF/A-1, PDF/A-2 and PDF/A-3 formats.
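
A centre that wants to run such a validator as part of an ingest step might wrap its command-line tool in a small script. The sketch below assumes that a `verapdf` executable is on the PATH and that it accepts a `--flavour` option naming the part and conformance level (for example `1b` or `2u`); both assumptions should be checked against `verapdf --help` for the installed version. The function names are hypothetical.

```python
import shutil
import subprocess
from typing import List

# Conformance levels defined for PDF/A-1 to PDF/A-3.
KNOWN_FLAVOURS = {"1a", "1b", "2a", "2b", "2u", "3a", "3b", "3u"}


def verapdf_command(pdf_path: str, flavour: str = "1b") -> List[str]:
    """Build the argument list for one validation run.

    Assumption: the CLI is called `verapdf` and takes `--flavour <id>`.
    """
    if flavour not in KNOWN_FLAVOURS:
        raise ValueError(f"unknown PDF/A flavour: {flavour!r}")
    return ["verapdf", "--flavour", flavour, pdf_path]


def run_verapdf(pdf_path: str, flavour: str = "1b") -> subprocess.CompletedProcess:
    """Run the validator and return the completed process.

    The report is left in `stdout` for the caller to inspect; this sketch
    does not interpret the report or the exit code.
    """
    if shutil.which("verapdf") is None:
        raise FileNotFoundError("verapdf executable not found on PATH")
    return subprocess.run(
        verapdf_command(pdf_path, flavour),
        capture_output=True,
        text=True,
        check=False,
    )
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.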

Keywords: document format, binarized TextualData
Related Standard(s):