CLARIN.SI Language Technology Centre
Abbreviation: CLARIN.SI
Research infrastructure:
- CLARIN (B-centre)
Warning: The recommendations have not been curated yet.
Functional domains:
- Audiovisual Annotation: Annotations of audiovisual sources, usually including a basic rendering of the spoken content (transcription) and sometimes further annotation.
- Audiovisual Source Language Data: Audio or video recordings providing spoken/multimodal or signed language data for research purposes.
- Catalogue Metadata: Basic structured information for discoverability and general description, to be openly provided for harvesting.
- Documentation: Unstructured documentation of the resource and its parts such as corpus or annotation guidelines.
- Geodata: Information on geographic locations.
- Image Source Language Data: Digitized images of analogue sources of written language data for research purposes (e.g. facsimiles, scans of handwriting, photos of inscriptions).
- Language Description: Structured or unstructured descriptions of linguistic varieties or phenomena, typological databases etc.
- Lexical Resource: Structured (item-based) resources for lexical and/or conceptual information on units of language (e.g. wordlists, lexicons, WordNets etc.).
- Statistical Data: Data from surveys and tests in numeric formats.
- Text Annotation: Annotations of textual sources/written text, with the original text included or as stand-off.
- Textual Source Language Data: Written unstructured/plain text or originally structured text (e.g. HTML) without linguistic or other mark-up added for research purposes.
- Tool Support: Tool-related formats required for specific functionality of the tool or reliable reuse of resources (e.g. tagsets, annotation schemes, vocabularies, language models, parameter files, and other specifications or settings).
Format recommendations:
| Format | Domain | Level | Comments |
|---|---|---|---|
| AIFF | Audiovisual Source Language Data | recommended | |
| AVI | Audiovisual Source Language Data | recommended | |
| BPF | Textual Source Language Data | recommended | |
| CMDI | Catalogue Metadata | recommended | |
| CoNLL-U | Text Annotation | recommended | |
| CSS | Tool Support | recommended | |
| CSV | Lexical Resource | recommended | |
| DTD | Tool Support | recommended | |
| EAF | Textual Source Language Data | recommended | |
| EXB | Audiovisual Annotation | recommended | |
| EXS | Audiovisual Annotation | recommended | |
| FLAC | Audiovisual Source Language Data | recommended | |
| FLN | Audiovisual Annotation | recommended | |
| FoLiA | Textual Source Language Data | recommended | |
| GIF | Image Source Language Data | recommended | |
| GZIP | Tool Support | recommended | |
| HTML | Documentation | recommended | |
| JPEG | Image Source Language Data | recommended | |
| JS | Tool Support | recommended | |
| KML | Geodata | recommended | |
| Lisp | Language Description | recommended | |
| LMF | Lexical Resource | recommended | |
| M2J | Audiovisual Source Language Data | recommended | |
| M4A | Audiovisual Source Language Data | recommended | |
| MP3 | Audiovisual Source Language Data | recommended | |
| MP4 | Audiovisual Source Language Data | recommended | |
| MPEG-1 | Audiovisual Source Language Data | recommended | |
| MPEG-2 | Audiovisual Source Language Data | recommended | |
| | Documentation | recommended | |
| | Image Source Language Data | recommended | |
| | Text Annotation | discouraged | |
| | Textual Source Language Data | discouraged | |
| Perl | Language Description | recommended | |
| plainText | Textual Source Language Data | recommended | |
| PNG | Image Source Language Data | recommended | |
| Praat | Textual Source Language Data | recommended | |
| R | Statistical Data | recommended | |
| RAW | Audiovisual Source Language Data | recommended | |
| RDFXML | Text Annotation | recommended | |
| SVG | Image Source Language Data | recommended | |
| TAR | Tool Support | recommended | |
| TCF | Textual Source Language Data | recommended | |
| TEI | Documentation | recommended | |
| TEI | Text Annotation | recommended | |
| TEISpoken | Audiovisual Annotation | recommended | |
| TIFF | Image Source Language Data | recommended | |
| Tiger | Text Annotation | recommended | |
| TRS | Audiovisual Annotation | recommended | |
| TSV | Lexical Resource | recommended | |
| VRT | Language Description | recommended | |
| WAVE | Audiovisual Source Language Data | recommended | |
| XML | Documentation | recommended | |
| XML | Text Annotation | recommended | |
| XSD | Tool Support | recommended | |
| XSLT | Tool Support | recommended | |
| ZIP | Tool Support | recommended | |
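
A listing like the one above can be turned into a per-domain lookup, for example to check which formats a depositor may use for a given kind of data. The following is a minimal sketch, not part of the centre's tooling: it assumes each row has been reduced to pipe-delimited format name, domain name, and level, and the names `ROWS`, `parse_rows`, and `formats_by_domain` are hypothetical. Only a handful of sample rows are included.

```python
from collections import defaultdict

# A few sample rows in pipe-delimited form (assumed layout: format, domain, level).
ROWS = """\
| Format | Domain | Level | Comments |
|---|---|---|---|
| AIFF | Audiovisual Source Language Data | recommended | |
| CoNLL-U | Text Annotation | recommended | |
| TEI | Documentation | recommended | |
| TEI | Text Annotation | recommended | |
| ZIP | Tool Support | recommended | |
"""


def parse_rows(text):
    """Yield (format, domain, level) tuples, skipping header and separator rows."""
    for line in text.splitlines():
        # Drop optional leading/trailing pipes, then split into trimmed cells.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) < 3:
            continue
        fmt, domain, level = cells[:3]
        # Skip the header row, the dashed separator, and rows without a format name.
        if not fmt or fmt == "Format" or set(fmt) == {"-"}:
            continue
        yield fmt, domain, level


def formats_by_domain(text, level="recommended"):
    """Group format names by domain, keeping only rows with the given level."""
    grouped = defaultdict(list)
    for fmt, domain, row_level in parse_rows(text):
        if row_level == level:
            grouped[domain].append(fmt)
    return dict(grouped)


if __name__ == "__main__":
    for domain, formats in sorted(formats_by_domain(ROWS).items()):
        print(f"{domain}: {', '.join(formats)}")
```

Passing `level="discouraged"` to `formats_by_domain` would list the discouraged formats instead. Rows whose format cell is empty are skipped here on purpose, since they cannot be attributed to a named format.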
Last update commit-id: 00fb8bd3