Format Recommendations

This page presents the data deposition formats that various CLARIN centres are ready to accept. Each format, for each centre, can be "recommended", "acceptable" or "discouraged" in the context of several domains, which represent the functions that the deposited data can serve. The level of recommendation should always be viewed as relative to the profile of the given centre.

  • "recommended" should be interpreted as meaning that the centre in question will in most cases be able to process the data without much manipulation and that it is likely that the data will be preserved long-term in that format (the specifics are up to that centre);
  • "acceptable" should be interpreted as meaning that the centre may need to spend some time and resources on the up-conversion of the data, and that the data may be preserved in one of the recommended formats instead;
  • "discouraged" should be understood as indicating that the centre may find it problematic to up-convert the data.

Use the drop-down lists to select a particular domain, centre, and/or level of recommendation. Columns can be sorted, and the results can be downloaded as XML.
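The selection and sorting described above can be sketched offline as follows. This is only an illustration, not the page's actual implementation: the `Recommendation` class and the `select` function are hypothetical names, and the sample rows are copied from the table on this page.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Recommendation:
    """One table row (hypothetical model, not the SIS data model)."""
    format: str
    centre: str
    domain: str
    level: str  # "recommended", "acceptable" or "discouraged"


# A few rows copied from the table on this page.
ROWS = [
    Recommendation("CoNLL-U", "CLARIN.SI", "Text Annotation", "recommended"),
    Recommendation("PDF", "CLARIN.SI", "Text Annotation", "discouraged"),
    Recommendation("CoNLL-X", "PORTULAN-CLARIN", "Text Annotation", "acceptable"),
    Recommendation("TEI", "IDS", "Text Annotation", "recommended"),
    Recommendation("plainText", "IDS", "Text Annotation", "discouraged"),
]


def select(
    rows: Iterable[Recommendation],
    domain: Optional[str] = None,
    centre: Optional[str] = None,
    level: Optional[str] = None,
) -> List[Recommendation]:
    """Keep rows matching every filter that is set, sorted by format name."""
    kept = [
        r for r in rows
        if (domain is None or r.domain == domain)
        and (centre is None or r.centre == centre)
        and (level is None or r.level == level)
    ]
    # Case-insensitive sort by format, then by centre for stable ties.
    return sorted(kept, key=lambda r: (r.format.lower(), r.centre))


if __name__ == "__main__":
    for row in select(ROWS, level="discouraged"):
        print(row.format, row.centre, row.level)
```

Leaving a filter as `None` corresponds to not choosing a value in the matching drop-down list.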

An authorised person can use the XML file exported for a given centre to extend or modify the recommendations for that centre. To aid in this process, please consult the separate lists of all available file formats and of the functional groupings of formats (functional domains).
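As a rough sketch of what editing an exported file might involve, the example below updates or appends entries in a small XML document using Python's standard `xml.etree.ElementTree`. The export schema is not described on this page, so every element and attribute name here (`recommendation`, `formats`, `format`, `id`, `domain`, `level`, `comment`) is an assumption made for illustration only; check a real export before adapting it. The values used are taken from the IDS rows of the table on this page.

```python
import xml.etree.ElementTree as ET

# HYPOTHETICAL export layout: the real SIS schema may differ entirely.
EXPORT = """<recommendation>
  <centre>IDS</centre>
  <formats>
    <format id="TEI"><domain>Text Annotation</domain><level>recommended</level></format>
    <format id="plainText"><domain>Text Annotation</domain><level>discouraged</level></format>
  </formats>
</recommendation>"""


def set_level(root, format_id, domain, level, comment=None):
    """Update the matching entry, or append a new one if none exists."""
    formats = root.find("formats")
    for entry in formats.findall("format"):
        if entry.get("id") == format_id and entry.findtext("domain") == domain:
            entry.find("level").text = level
            break
    else:
        # No existing entry for this format/domain pair: create one.
        entry = ET.SubElement(formats, "format", id=format_id)
        ET.SubElement(entry, "domain").text = domain
        ET.SubElement(entry, "level").text = level
    if comment is not None:
        node = entry.find("comment")
        if node is None:
            node = ET.SubElement(entry, "comment")
        node.text = comment
    return entry


root = ET.fromstring(EXPORT)
# Extend: add a format that is missing from the (assumed) export.
set_level(root, "CoNLL-U", "Text Annotation", "recommended")
# Modify: attach a comment to an existing entry.
set_level(root, "TEI", "Text Annotation", "recommended",
          comment="with ODD or other schema")

if __name__ == "__main__":
    print(ET.tostring(root, encoding="unicode"))
```

Matching on both the format identifier and the domain reflects the fact that a format can carry different levels of recommendation in different domains.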

As of mid-2022, not every centre with depositing services has submitted its information to the SIS. In some cases, the information had to be mapped, not always reliably, from lists provided on centre homepages onto the feature matrix offered by the SIS (created on the basis of the SIS functional domains and levels of recommendation). If you think you see an error, please help us get it right.

Domain shown: Text Annotation (annotations of textual sources/written text, with the original text included or as stand-off).

Format | Centre | Domain | Recommendation | Comment
CoNLL-U | CLARIN.SI | Text Annotation | recommended |
PDF | CLARIN.SI | Text Annotation | discouraged |
RDFXML | CLARIN.SI | Text Annotation | recommended |
TEI | CLARIN.SI | Text Annotation | recommended |
Tiger | CLARIN.SI | Text Annotation | recommended |
XML | CLARIN.SI | Text Annotation | recommended |
CoNLL | PORTULAN-CLARIN | Text Annotation | discouraged | should use CoNLL-X or CoNLL-U
CoNLL-U | PORTULAN-CLARIN | Text Annotation | recommended |
CoNLL-X | PORTULAN-CLARIN | Text Annotation | acceptable |
DOCX | PORTULAN-CLARIN | Text Annotation | discouraged |
JSON | PORTULAN-CLARIN | Text Annotation | acceptable |
PDF | PORTULAN-CLARIN | Text Annotation | discouraged |
TEI | PORTULAN-CLARIN | Text Annotation | acceptable |
plainText | PORTULAN-CLARIN | Text Annotation | acceptable |
TEI | COCOON | Text Annotation | recommended |
KorAPXML | IDS | Text Annotation | recommended |
JSON | IDS | Text Annotation | acceptable | regular and structured; consider using JSONLD with a schema
CoNLL-U | IDS | Text Annotation | recommended |
ALTO | IDS | Text Annotation | acceptable | Conversion to a suitable TEI-based format is expected, per Empfehlungen des DFG-Fachkollegiums 104 “Sprachwissenschaften” (Oct. 2019)
DTABf | IDS | Text Annotation | acceptable |
I5 | IDS | Text Annotation | recommended | See the format description.
PAULA | IDS | Text Annotation | acceptable |
TEI | IDS | Text Annotation | recommended | with ODD or other schema
plainText | IDS | Text Annotation | discouraged |
Tiger | CLARINO_Bergen | Text Annotation | acceptable | Used internally for our tools and services.
TEI | CLARINO_Bergen | Text Annotation | acceptable | More specific dialects/customizations using ODD-documents to specify/extend. Consider reusing existing dialects (e.g. Menota) over creating your own.
CoNLL-U | CLARINO_Bergen | Text Annotation | recommended |
XML | CLARINO_Bergen | Text Annotation | acceptable | Well known and defined standards of XML-formats are preferred. When depositing non-standard, less known formats consider depositing also schema documents (ODD, XSD, DTD or RelaxNG), guidelines and documentation to improve usability.
CoNLL-X | CLARINO_Bergen | Text Annotation | acceptable | consider using CoNLL-U instead
Menota | CLARINO_Bergen | Text Annotation | recommended | TEI extensions for Medieval Nordic texts
JSON | Sprakbanken | Text Annotation | acceptable | regular and structured; consider using JSONLD with a schema
CoNLL-U | Sprakbanken | Text Annotation | recommended |
ALTO | Sprakbanken | Text Annotation | acceptable | Conversion to a suitable TEI-based format is expected.
TEI | Sprakbanken | Text Annotation | recommended | with ODD or other schema
plainText | Sprakbanken | Text Annotation | discouraged |
CoNLL-U | SAW | Text Annotation | recommended |
CoNLL-X | SAW | Text Annotation | acceptable | consider using CoNLL-U instead
JSON | SAW | Text Annotation | acceptable |
JSON-LD | SAW | Text Annotation | recommended |
plainText | SAW | Text Annotation | discouraged |
PDF | SAW | Text Annotation | discouraged |
TEI | SAW | Text Annotation | recommended |
XML | SAW | Text Annotation | recommended |
RDFXML | SAW | Text Annotation | recommended |
TBT | MPI-PL | Text Annotation | recommended |
XML | MPI-PL | Text Annotation | recommended |
TEI | ACDH-ARCHE | Text Annotation | recommended |
XML | ACDH-ARCHE | Text Annotation | recommended |
JSON-LD | MI | Text Annotation | recommended |
NTriples | MI | Text Annotation | recommended |