FoLiA: Format for Linguistic Annotation
Abbreviation: FoLiA
Identifiers:
Type    Id
SIS ID  fFoLiA
Media type(s):
File extension(s): .xml, .folia.xml
Format family: XML
Functional domains:
  • Textual Source Language Data
Recommendations:
Centre     Domain                        Level        Comments
CLARIN.SI  Textual Source Language Data  recommended
CLST       Textual Source Language Data  recommended
Domain definition (Textual Source Language Data): written unstructured/plain text or originally structured text (e.g. HTML) without linguistic or other mark-up added for research purposes.
Description:

FoLiA (Format for Linguistic Annotation) is a data model and file format for representing digitised language resources enriched with linguistic annotation, e.g. linguistically enriched textual documents or transcriptions of speech. The format is intended to provide a standard for the storage and exchange of such language resources, including corpora, and to promote interoperability between Natural Language Processing tools that use the format.

See https://folia.readthedocs.io/en/latest/index.html for details.
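Because FoLiA is XML-based and the `.xml` extension alone does not identify it, a tool can inspect the root element before reading annotations. The following is a minimal sketch using only the Python standard library, not the official FoLiA tooling. The sample document is hypothetical and heavily simplified (it omits the annotation declarations a real file carries, so it is not claimed to validate), and the namespace and element names used here (`FoLiA`, `w`, `t`, `pos`, `lemma`, the `class` attribute) are assumptions to be checked against the documentation linked above.

```python
import xml.etree.ElementTree as ET

# Assumed FoLiA namespace; verify against the FoLiA documentation.
FOLIA_NS = "http://ilk.uvt.nl/folia"
NS = {"f": FOLIA_NS}
# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical, simplified sample; real documents also declare their
# annotation types in a <metadata> block, which is omitted here.
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" xml:id="example" version="2.0">
  <text xml:id="example.text">
    <s xml:id="example.s.1">
      <w xml:id="example.s.1.w.1"><t>Cats</t><pos class="N"/><lemma class="cat"/></w>
      <w xml:id="example.s.1.w.2"><t>sleep</t><pos class="V"/><lemma class="sleep"/></w>
    </s>
  </text>
</FoLiA>"""


def is_folia(root):
    """Return True if the root element is a FoLiA element in the assumed namespace."""
    return root.tag == f"{{{FOLIA_NS}}}FoLiA"


def read_tokens(xml_text):
    """Collect the id, text, part-of-speech class and lemma class of each word element."""
    root = ET.fromstring(xml_text)
    if not is_folia(root):
        raise ValueError("root element is not a FoLiA element")
    tokens = []
    for w in root.iter(f"{{{FOLIA_NS}}}w"):
        pos = w.find("f:pos", NS)
        lemma = w.find("f:lemma", NS)
        tokens.append({
            "id": w.get(XML_ID),
            "text": w.findtext("f:t", default="", namespaces=NS),
            # Annotations are optional, so a missing element yields None.
            "pos": pos.get("class") if pos is not None else None,
            "lemma": lemma.get("class") if lemma is not None else None,
        })
    return tokens


if __name__ == "__main__":
    for token in read_tokens(SAMPLE):
        print(token)
```

Checking the root element rather than the file name is a deliberate choice here, since the record lists both `.xml` and `.folia.xml` as extensions. For real corpora, a dedicated FoLiA library is likely a better fit than hand-written XML traversal.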

Keywords: annotation format, corpus encoding