ISO/TEI Transcriptions of Spoken Language
Abbreviation: TEISpoken
SIS ID: fTEISpoken
Media type(s):
File extension(s): .tei
Format family: TEI
Functional domains:
  • Audiovisual Annotation
Centre | Domain | Level | Comments
CLARIN.SI | Audiovisual Annotation | recommended |
FIN-CLARIN | Audiovisual Annotation | recommended | See format description.
HZSK | Audiovisual Annotation | recommended |
IDS | Audiovisual Annotation | recommended | See format description.
LAC | Audiovisual Annotation | recommended |
SAW | Audiovisual Annotation | recommended |
Sprakbanken | Audiovisual Annotation | recommended | See format description.
ZIM | Audiovisual Annotation | recommended |

Audiovisual Annotation: Annotations of audiovisual sources, usually including a basic rendering of the spoken content (transcription) and sometimes further annotation.

This format is a serialization of the ISO/TEI Transcriptions of Spoken Language.

ISO/TEI transcriptions of spoken language will be identified by the MIME type application/tei+xml;format-variant=tei-iso-spoken. A parameter tokenized=0/1 can be added to indicate whether (=1) or not (=0) the respective TEI file is tokenized (i.e. has <w> markup).
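
The media type string described above can be built and inspected with a few lines of Python. This is a minimal sketch, not a reference implementation: the function names (`build_media_type`, `parse_media_type`, `is_tokenized`, `has_w_markup`) are hypothetical, the parameter parsing is deliberately simple (no quoted-string edge cases), and `has_w_markup` assumes that `<w>` elements live in the standard TEI namespace `http://www.tei-c.org/ns/1.0`.

```python
import xml.etree.ElementTree as ET

# Base media type and format variant as given in the format description.
BASE = "application/tei+xml"
VARIANT = "tei-iso-spoken"
# Assumption: <w> elements are in the standard TEI namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def build_media_type(tokenized=None):
    """Return the media type string; add tokenized=0/1 only if a value is given."""
    value = f"{BASE};format-variant={VARIANT}"
    if tokenized is not None:
        value += f";tokenized={1 if tokenized else 0}"
    return value


def parse_media_type(value):
    """Split a media type string into (type, {parameter: value})."""
    parts = [p.strip() for p in value.split(";")]
    params = {}
    for part in parts[1:]:
        if not part:
            continue
        name, _, val = part.partition("=")
        params[name.strip().lower()] = val.strip().strip('"')
    return parts[0].lower(), params


def is_tokenized(value):
    """True/False if the tokenized parameter is 1/0, None if absent or unrecognised."""
    _, params = parse_media_type(value)
    return {"1": True, "0": False}.get(params.get("tokenized"))


def has_w_markup(xml_text):
    """Check whether a TEI document contains at least one <w> element."""
    root = ET.fromstring(xml_text)
    return any(el.tag == f"{{{TEI_NS}}}w" for el in root.iter())
```

A producer could, for example, call `build_media_type(has_w_markup(xml_text))` so that the `tokenized` parameter reflects the actual content of the file rather than being set by hand.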

For more information, see Thomas Schmidt, “A TEI-based Approach to Standardising Spoken Language Transcription”, Journal of the Text Encoding Initiative [Online], Issue 1 | June 2011, online since 08 June 2011, accessed on 21 September 2021.

Contributions to the description of this format are welcome via GitHub: either as an issue report, or as a pull request after forking or browsing the code under the 'formats' branch.

Keywords: annotation format, corpus encoding
Related Standard(s):
  • TSL
