SAW Leipzig
Abbreviation: SAW
Link: https://centres.clarin.eu/centre/4
Research infrastructure:
  • CLARIN (B-centre)
  • Text+ (Lexical Resources, Operations)
Curation:
Description:

The repository of the Sächsische Akademie der Wissenschaften zu Leipzig (Saxon Academy of Sciences and Humanities in Leipzig) offers long-term preservation of digital resources and their metadata. The repository's mission is to ensure the availability and long-term preservation of research data, to safeguard research results, to facilitate knowledge transfer into new disciplines, and to integrate novel methods and resources into university curricula. A particular thematic focus is on lexical resources and language resources for so-called "underrepresented" languages.

If no recommended, standardised and documented format is used, comprehensive documentation of the syntax and semantics of the data must be provided (e.g. for database dumps: names of tables and columns; specifications and examples of the content of each column; examples of how to retrieve different kinds of data). This documentation (in English, as PDF) is stored in the repository together with the data and metadata and is made available to everyone who wishes to download or access the resource.
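As an illustration of the database-dump case, the following minimal sketch collects table names, column names and example rows from an SQLite database into a plain-text skeleton. It is only a starting point under stated assumptions: the dump is assumed to be loadable into SQLite, the function name `describe_database` and the example table `lemma` are hypothetical, and the column semantics still have to be written by hand before the result is converted to an English PDF.

```python
import sqlite3


def describe_database(conn: sqlite3.Connection, sample_rows: int = 2) -> str:
    """Return a plain-text documentation skeleton for every user table."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Quote the identifier so unusual table names do not break the SQL.
        quoted = '"' + table.replace('"', '""') + '"'
        lines.append(f"Table: {table}")
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        for _cid, name, col_type, notnull, _default, pk in columns:
            details = [col_type or "untyped"]
            if pk:
                details.append("primary key")
            if notnull:
                details.append("not null")
            lines.append(f"  Column: {name} ({', '.join(details)})")
            lines.append("    Meaning: TODO (describe the semantics by hand)")
        lines.append(f"  Example query: SELECT * FROM {quoted} LIMIT {sample_rows};")
        for row in conn.execute(f"SELECT * FROM {quoted} LIMIT ?", (sample_rows,)):
            lines.append(f"    Example row: {row!r}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical lexical table, used only to demonstrate the output.
    demo = sqlite3.connect(":memory:")
    demo.execute(
        "CREATE TABLE lemma (id INTEGER PRIMARY KEY, form TEXT NOT NULL, pos TEXT)"
    )
    demo.executemany(
        "INSERT INTO lemma (form, pos) VALUES (?, ?)",
        [("Haus", "NOUN"), ("gehen", "VERB")],
    )
    print(describe_database(demo))
```

The generated skeleton covers the syntactic part (tables, columns, types, sample queries). The semantic part, i.e. what each column means, is deliberately left as a TODO for the depositor.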

Further information on data hosting and metadata requirements can be found on the repository's web portal.

Functional domains:
  • Audiovisual Annotation: Annotations of audiovisual sources, usually including a basic rendering of the spoken content (transcription) and sometimes further annotation.
  • Image Annotation: Annotations of image sources.
  • Text Annotation: Annotations of textual sources/written text, with the original text included or as stand-off.
  • Catalogue Metadata: Basic structured information for discoverability and general description, to be openly provided for harvesting.
  • Contextual Information: Structured information on the communicative event or text and its creators (i.e. participants or authors) relevant for analysis.
  • Documentation: Unstructured documentation of the resource and its parts such as corpus or annotation guidelines.
  • Metadata: Comprehensive structured information including descriptive, structural and administrative metadata.
  • Language Description: Structured or unstructured descriptions of linguistic varieties or phenomena, typological databases etc.
  • Lexical Resource: Structured (item-based) resources for lexical and/or conceptual information on units of language (e.g. wordlists, lexicons, WordNets etc.).
  • Textual Source Language Data: Written unstructured/plain text or originally structured text (e.g. HTML) without linguistic or other mark-up added for research purposes.
  • Tool Support: Tool-related formats required for specific functionality of the tool or reliable reuse of resources (e.g. tagsets, annotation schemes, vocabularies, language models, parameter files, and other specifications or settings).
Format recommendations:
Format    | Domain                       | Level       | Comments
----------|------------------------------|-------------|---------
CMDI      | Catalogue Metadata           | recommended |
CMDI      | Contextual Information       | recommended |
CMDI      | Metadata                     | recommended |
CoNLL-U   | Text Annotation              | recommended |
CoNLL-U   | Textual Source Language Data | recommended |
CoNLL-X   | Text Annotation              | acceptable  | consider using CoNLL-U instead
CoNLL-X   | Textual Source Language Data | acceptable  | consider using CoNLL-U instead
CSV       | Metadata                     | acceptable  |
CSV       | Lexical Resource             | recommended |
DC XML    | Metadata                     | recommended |
DOCX      | Audiovisual Annotation       | discouraged |
DOCX      | Documentation                | discouraged |
DOCX      | Metadata                     | discouraged |
DOCX      | Textual Source Language Data | acceptable  |
HTML      | Documentation                | recommended |
HTML      | Textual Source Language Data | acceptable  |
JSON      | Text Annotation              | acceptable  |
JSON      | Metadata                     | acceptable  | regular and structured; consider using JSON-LD with a schema
JSON      | Textual Source Language Data | acceptable  |
JSON-LD   | Text Annotation              | recommended |
JSON-LD   | Metadata                     | recommended |
LMF       | Lexical Resource             | recommended |
Markdown  | Documentation                | recommended |
Markdown  | Textual Source Language Data | acceptable  |
PDF       | Text Annotation              | discouraged |
PDF       | Documentation                | acceptable  | consider using PDF/A instead
PDF       | Textual Source Language Data | acceptable  |
PDF/A     | Documentation                | recommended |
plainText | Audiovisual Annotation       | discouraged |
plainText | Text Annotation              | discouraged |
plainText | Documentation                | recommended |
plainText | Textual Source Language Data | recommended |
RDFXML    | Text Annotation              | recommended |
TEI       | Audiovisual Annotation       | recommended |
TEI       | Text Annotation              | recommended |
TEIHeader | Metadata                     | recommended |
TEISpoken | Audiovisual Annotation       | recommended |
TSV       | Lexical Resource             | acceptable  |
XML       | Audiovisual Annotation       | recommended |
XML       | Image Annotation             | recommended |
XML       | Text Annotation              | recommended |
XML       | Catalogue Metadata           | acceptable  |
XML       | Contextual Information       | acceptable  |
XML       | Metadata                     | acceptable  |
XML       | Language Description         | recommended |
ZIP       | Tool Support                 | recommended |
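
For depositors who want to check a planned format against the recommendations above, the listing can be treated as data. The sketch below is a hypothetical helper, not part of any CLARIN or Text+ tooling: the names `Recommendation`, `RECOMMENDATIONS` and `formats_for` are illustrative, and only the rows for two domains (Lexical Resource and Documentation) are copied in to keep the example short.

```python
from typing import List, NamedTuple


class Recommendation(NamedTuple):
    format: str
    domain: str
    level: str
    comment: str = ""


# Rank the three levels used in the listing, best first.
LEVEL_ORDER = {"recommended": 0, "acceptable": 1, "discouraged": 2}

# Subset of the rows above (two domains only).
RECOMMENDATIONS = [
    Recommendation("CSV", "Lexical Resource", "recommended"),
    Recommendation("LMF", "Lexical Resource", "recommended"),
    Recommendation("TSV", "Lexical Resource", "acceptable"),
    Recommendation("DOCX", "Documentation", "discouraged"),
    Recommendation("HTML", "Documentation", "recommended"),
    Recommendation("Markdown", "Documentation", "recommended"),
    Recommendation("PDF", "Documentation", "acceptable", "consider using PDF/A instead"),
    Recommendation("PDF/A", "Documentation", "recommended"),
    Recommendation("plainText", "Documentation", "recommended"),
]


def formats_for(domain: str) -> List[Recommendation]:
    """Return the entries for one domain, best level first, then by format name."""
    matching = [rec for rec in RECOMMENDATIONS if rec.domain == domain]
    return sorted(matching, key=lambda rec: (LEVEL_ORDER[rec.level], rec.format.lower()))


if __name__ == "__main__":
    for rec in formats_for("Documentation"):
        note = f" ({rec.comment})" if rec.comment else ""
        print(f"{rec.level:<12} {rec.format}{note}")
```

Sorting by level first puts recommended formats at the top, so a discouraged choice such as DOCX for documentation shows up at the end of the list.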
Last update commit-id: 8468186f