Plain text
Abbreviation: plainText
Identifiers:
Type Id
SIS ID fTextPlain
Media type(s):
File extension(s): .txt
Format family: Plain.Running
Functional domains:
  • Audiovisual Annotation: Annotations of audiovisual sources, usually including a basic rendering of the spoken content (transcription) and sometimes further annotation.
  • Contextual Data: Images (photos or drawings) or documents relevant to the communicative event or text but not part of the source language data.
  • Documentation: Unstructured documentation of the resource and its parts such as corpus or annotation guidelines.
  • Text Annotation: Annotations of textual sources/written text, with the original text included or as stand-off.
  • Textual Source Language Data: Written unstructured/plain text or originally structured text (e.g. HTML) without linguistic or other mark-up added for research purposes.
Recommendations:
Centre          Domain                        Level        Comments
ACDH-ARCHE      Documentation                 recommended
ACDH-ARCHE      Textual Source Language Data  acceptable
BBAW            Documentation                 recommended
BBAW            Textual Source Language Data  acceptable
CLARIN-DK-UCPH  Documentation                 acceptable
CLARIN-DK-UCPH  Textual Source Language Data  recommended
CLARIN.SI       Textual Source Language Data  recommended
CLST            Textual Source Language Data  recommended
DANS            Documentation                 recommended  Encoded as UTF-8/16/32, see more info from DANS
DANS            Textual Source Language Data  recommended  Encoded as UTF-8/16/32, see more info from DANS
EKUT            Textual Source Language Data  recommended
FIN-CLARIN      Audiovisual Annotation        discouraged
FIN-CLARIN      Textual Source Language Data  recommended  UTF-8 encoded
FIN-CLARIN      Documentation                 recommended  e.g. as README.txt
IDS             Audiovisual Annotation        discouraged
IDS             Documentation                 recommended
IDS             Text Annotation               discouraged
IDS             Textual Source Language Data  acceptable   without mark-up
ILC4CLARIN      Textual Source Language Data  recommended
LAC             Contextual Data               recommended  UTF-8 encoding
MI              Documentation                 recommended
MI              Textual Source Language Data  acceptable
MPI-PL          Textual Source Language Data  recommended
ORTOLANG        Textual Source Language Data  recommended
SAW             Audiovisual Annotation        discouraged
SAW             Text Annotation               discouraged
SAW             Documentation                 recommended
SAW             Textual Source Language Data  recommended
Sprakbanken     Audiovisual Annotation        discouraged
Sprakbanken     Documentation                 recommended
Sprakbanken     Text Annotation               discouraged
Sprakbanken     Textual Source Language Data  acceptable   without markup
UdS             Textual Source Language Data  recommended
ZIM             Textual Source Language Data  recommended
Description:

Plain text is a pure sequence of character codes. (...) Plain text represents character content only, not its appearance. (...) Plain text must contain enough information to permit the text to be rendered legibly, and nothing more. (Unicode 6.1, section 2.2)

See Unicode 6.1 and the Wikipedia article on plain text for a broader context.

Parameters important to plain text include its character encoding and the platform-dependent end-of-line convention. Higher-level parameters include text directionality and, for example, the natural language that the character sequences are meant to represent.
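
The two low-level parameters named above can be inspected programmatically. The following is a minimal Python sketch, not a robust detector: it only recognises an encoding when the file starts with a byte order mark (BOM) and otherwise falls back to an assumed default of UTF-8. The function names `guess_encoding`, `count_line_endings` and `inspect_plain_text` are hypothetical and chosen for illustration only.

```python
import codecs
from collections import Counter

# BOMs are checked with UTF-32 first, because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return a codec name based on a leading BOM, else `default` (an assumption)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return default


def count_line_endings(text: str) -> Counter:
    """Count CRLF (Windows), LF (Unix) and bare CR (classic Mac OS) line ends."""
    crlf = text.count("\r\n")
    return Counter({
        "CRLF": crlf,
        "LF": text.count("\n") - crlf,
        "CR": text.count("\r") - crlf,
    })


def inspect_plain_text(data: bytes) -> tuple[str, Counter]:
    """Decode `data` and report the codec used and the line-ending counts."""
    encoding = guess_encoding(data)
    text = data.decode(encoding)
    return encoding, count_line_endings(text)
```

A file with mixed line-ending counts, or one that fails to decode with the assumed default, is a hint that these parameters should be documented or normalised before deposit.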

Whitespace characters in plain text are often used to provide rough structural markup (e.g. a double space after sentence-final punctuation; one or more end-of-line sequences to signal division into paragraphs). When treated as a data format, plain text can use whitespace to signal division into columns; this is how it relates to TSV and column-based formats in general (CSV, where commas take the place of tabs, is a close relative).
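
Both uses of whitespace described above can be sketched in Python. This is an illustrative example under stated assumptions: paragraphs are taken to be separated by at least one blank line, and columns by a single delimiter character (tab by default). The helper names `split_paragraphs` and `read_columns` are hypothetical; the column reader relies on the standard `csv` module with its default quoting rules, which may not suit every TSV variant.

```python
import csv
import io
import re


def split_paragraphs(text: str) -> list[str]:
    """Split text into paragraphs at runs of one or more blank lines."""
    # Normalise CRLF and bare CR to LF so one pattern covers all platforms.
    normalised = text.replace("\r\n", "\n").replace("\r", "\n")
    # A paragraph break is a newline followed by one or more lines
    # that contain only spaces or tabs.
    blocks = re.split(r"\n(?:[ \t]*\n)+", normalised.strip())
    return [block for block in blocks if block.strip()]


def read_columns(text: str, delimiter: str = "\t") -> list[list[str]]:
    """Parse delimiter-separated text into rows of column values."""
    # newline="" leaves line endings untouched so csv can handle them itself.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    return [row for row in reader if row]
```

Calling `read_columns(text, delimiter=",")` reads the same structure from CSV, which illustrates how closely the two formats are related.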

Keywords: plain text format
Related Standard(s):