PDF for archival preservation

Abbreviation: PDF/A

Identifiers:

Type	Id
SIS ID	fPDFA
LOC (Library of Congress)	fdd000318

Media type(s):

application/pdf

File extension(s): .pdf

Format family: PDF

Functional domains extracted from the recommendations:

Contextual Data, Documentation, Image Source Language Data, Other, Textual Source Language Data

Recommendations:

Centre	Domain	Level	Comments
CLARIN:EL	Image Source Language Data	recommended	scanned images
CLARIN:EL	Textual Source Language Data	recommended	Formatted/Encoded
CLARINO_Bergen	Documentation	recommended
DANS	Documentation	recommended	See more info from DANS
DANS	Other	acceptable	See more info from DANS
DANS	Textual Source Language Data	recommended	See more info from DANS
EKUT	Image Source Language Data	recommended
EKUT	Textual Source Language Data	recommended
FIN-CLARIN	Documentation	recommended
IDS	Documentation	recommended
IDS	Image Source Language Data	recommended
LAC	Contextual Data	recommended
MI	Image Source Language Data	recommended
MI	Textual Source Language Data	recommended
OTA	Image Source Language Data	recommended
PORTULAN-CLARIN	Documentation	acceptable
SAW	Documentation	recommended
Sprakbanken	Image Source Language Data	recommended
UdS	Documentation	recommended
UdS	Textual Source Language Data	acceptable
ZIM	Image Source Language Data	recommended
ZIM	Textual Source Language Data	recommended

Domain definitions:

Domain	Definition
Contextual Data	Images (photos or drawings) or documents relevant to the communicative event or text, but not part of the source language data.
Documentation	Unstructured documentation of the resource and its parts such as corpus or annotation guidelines.
Image Source Language Data	Digitized images of analogue sources of written language data for research purposes (e.g. facsimiles, scans of handwriting, photos of inscriptions).
Other	Any other function that cannot be included in an existing domain. The content of this domain will be periodically examined for potential patterns that may give rise to new domains.
Textual Source Language Data	Written unstructured/plain text or originally structured text (e.g. HTML), without linguistic or other mark-up added for research purposes.

Description:

PDF/A differs from PDF by prohibiting features unsuitable for long-term archiving, such as font linking (as opposed to font embedding) and encryption. Note that "PDF/A" is not a single format but a family of formats, one for each part of ISO 19005:

PDF/A-1: "Part 1: Use of PDF 1.4" (2005-09-28)
PDF/A-2: "Part 2: Use of ISO 32000-1" (2011-06-20)
PDF/A-3: "Part 3: Use of ISO 32000-1 with support for embedded files" (2012-10-15)
PDF/A-4: "Part 4: Use of ISO 32000-2" (2020-11)

Centres should note that Part 1 references an obsolete version of PDF (1.4), while Parts 2 and 3 reference ISO 32000-1, the fully open PDF 1.7. Part 4 references ISO 32000-2, which specifies PDF 2.0.

veraPDF is an open-source validator that checks files against the PDF/A formats.
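
Before running a full validator, a centre may want a quick look at which PDF/A part a file claims to follow. The sketch below is only a heuristic under stated assumptions: it assumes the claim is recorded in the file's XMP metadata through the PDF/A identification properties pdfaid:part and pdfaid:conformance, and that the metadata packet is stored uncompressed so that a plain byte search can find it. The names declared_pdfa and BASE_SPEC are illustrative, not part of any tool. The function reads a declaration and validates nothing, so a file that passes this check may still fail real validation.

```python
import re

# Base specification referenced by each PDF/A part (see the list above).
BASE_SPEC = {
    "1": "PDF 1.4",
    "2": "ISO 32000-1 (PDF 1.7)",
    "3": "ISO 32000-1 (PDF 1.7)",
    "4": "ISO 32000-2 (PDF 2.0)",
}

# The properties may be serialised as XML elements (<pdfaid:part>2</pdfaid:part>)
# or as attributes (pdfaid:part="2"); both forms are accepted here.
_PART = re.compile(rb"pdfaid:part\s*(?:>|=\s*[\"'])\s*(\d)")
_CONF = re.compile(rb"pdfaid:conformance\s*(?:>|=\s*[\"'])\s*([A-Za-z])")


def declared_pdfa(data: bytes):
    """Return (part, conformance, base_spec) as declared in XMP, or None.

    Assumption: `data` holds the raw file bytes and the XMP packet is
    not compressed. Only the declaration is read; nothing is validated.
    """
    part = _PART.search(data)
    if part is None:
        return None
    part_no = part.group(1).decode("ascii")
    conf = _CONF.search(data)
    level = conf.group(1).decode("ascii").upper() if conf else None
    return part_no, level, BASE_SPEC.get(part_no)


# Hypothetical XMP fragment used purely for illustration.
sample = (
    b'<rdf:Description xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">'
    b"<pdfaid:part>2</pdfaid:part>"
    b"<pdfaid:conformance>B</pdfaid:conformance>"
    b"</rdf:Description>"
)
print(declared_pdfa(sample))
```

For a file on disk, the same function can be applied to the bytes returned by open(path, "rb").read(). A result of None means only that no declaration was found by this search, not that the file is unsuitable.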

Keywords: document format, binarized TextualData

Related Standard(s):

SpecPDFA	PDF/A

Relations

[Relation graph not reproduced. Legend: isDefinedBy]
