The SIS does not restrict the content of recommendations to the formats described in the system: centres can mention any format that they are prepared to support by creating and consistently using a format ID, which generally consists of the character "f" followed by a (potentially mnemonic) name. As a result, two major classes of formats, broadly understood, must be distinguished:
- formats that are part of the SIS inventory, equipped with descriptions, keywords, potentially references to standards that define or use them, etc.
- formats that are referenced inside centre recommendations, by means of an ID.
These two classes overlap, resulting in a tripartite division:
- formats that are mentioned in recommendations and are at the same time described in the SIS (111 'described formats'); these are listed at the bottom of this page;
- formats that are mentioned in recommendations and are not (yet) described in the SIS (58 'missing formats'); they are the ones that have a "+" symbol in recommendation lists and that link to predefined GitHub issues;
- formats that are described in the SIS but are not mentioned by any recommendation (9 'orphaned formats'); these are mostly either "hub" format categories, or formats that centres once supported but that are, at least temporarily, outside their scope of interest.
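The three-way division above amounts to plain set operations on format IDs. The following is a minimal sketch, not the SIS implementation: the IDs and the `partition` helper are hypothetical and serve only to illustrate how the two classes overlap.

```python
# Hypothetical format IDs, for illustration only; the real SIS data is much larger.
described = {"fTEI", "fCMDI", "fXML", "fHub"}          # formats with a description in the SIS
recommended = {"fTEI", "fCMDI", "fXML", "fNewFormat"}  # IDs referenced in centre recommendations


def partition(described, recommended):
    """Split format IDs into the three categories named in the text."""
    return {
        "described formats": described & recommended,  # described and recommended
        "missing formats": recommended - described,    # recommended, not (yet) described
        "orphaned formats": described - recommended,   # described, never recommended
    }


result = partition(described, recommended)
for category, ids in result.items():
    print(category, sorted(ids))
```

Under these assumptions, `fNewFormat` would be the kind of entry that gets a "+" symbol in recommendation lists, while `fHub` stands in for a "hub" category that no recommendation mentions.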
The present page lists the first category of formats, together with some of the properties that are identified in their descriptions. The other two categories have been delegated to the sanity checker page.
Formats described in the SIS (111)
The name of each format links to its description, which is sometimes rather brief. You are welcome to help us extend the list or the descriptions, either by submitting a GitHub issue containing suggested text or corrections, or by editing or adding the relevant format file and submitting a pull request.
By clicking on the icon next to the format name, you can copy the format ID, which may be useful for editing or adding centre recommendations.
Format | MIME types | File Extensions |
---|---|---|
AG XML (Annotation Graphs XML Format) | text/xml | .xml |
AIFF (Audio Interchange File Format) | audio/aiff, audio/x-aiff | .aif, .aiff |
ALTO (Analyzed Layout and Text Object) | application/xml | .xml |
ANVIL (Anvil annotation file) | text/xml | .anvil |
ArcGIS.gdb (Esri File Geodatabase) | application/x-filegdb | .gdb |
ArcGIS.mxd (ArcGIS project file format) | application/octet-stream | .mxd |
ASCII Grid (ArcGIS ASCII grid format) | text/plain | .asc, .txt |
AutoCAD DXF-R12 (AutoCAD Drawing Interchange Format, v. R12 (ASCII)) | image/vnd.dxf | .dxf |
AVI (Audio Video Interleaved) | video/avi | .avi |
BMP (Device-independent bitmap) | image/bmp | .bmp, .dib |
BPF (BAS Partitur Format) | text/plain-bas | .par |
CHAT (Codes for the Human Analysis of Transcripts) | text/plain;format-variant=clan-cha, text/x-chat | .cha |
CHAT-XML (XML serialization of CHAT) | application/xml;format-variant=x-chat, text/xml | .xml |
CMDI (Component Metadata) | application/x-cmdi+xml | .cmdi, .xml |
Coma (EXMARaLDA Corpus Manager) | text/xml | .coma |
CoNLL (CoNLL unqualified) | | |
CoNLL-U (CoNLL-U (Universal Dependencies)) | | |
CoNLL-U Plus (CoNLL-U Plus (extended Universal Dependencies)) | | |
CoNLL-X (CoNLL-X) | | |
CSV (Comma-separated values) | text/csv | .csv |
CWB-VRT (Corpus Workbench Verticalized Text) | | .vrt |
DC XML (Dublin Core XML Metadata) | application/xml | .xml |
DGD-XML (DGD XML Metadata) | text/xml | .xml |
DICOM (Digital Imaging and Communications in Medicine) | application/dicom | .dcm, .dic, .dicom |
DOCX (Microsoft Word/Office Open XML) | application/vnd.openxmlformats-officedocument.wordprocessingml.document | .docx |
DTABf (Deutsches Textarchiv Basisformat) | application/tei+xml;format-variant=dta, application/tei+xml;format-variant=dta;tokenized=[0,1] | .xml |
DXF (AutoCAD Drawing Interchange Format) | image/vnd.dxf | .dxf |
EAF (ELAN EAF) | text/x-eaf+xml, text/xml | .eaf |
EMU (Emu Speech Database) | | |
Erdas.img (ERDAS IMAGINE File Format) | application/octet-stream | .ige, .img |
EXB (EXMARaLDA Basic Transcription Format) | text/xml | .exb |
EXS (EXMARaLDA Segmented Transcription Format) | text/xml | .exs |
F4 (f4transkript file format) | text/plain | .txt |
FLAC (Free Lossless Audio Codec) | audio/flac | .flac |
FLEx (SIL FieldWorks Language Explorer (FLEx)) | text/xml | .xml |
FLExText (SIL FieldWorks Language Explorer Interlinear Text) | text/xml | .flextext |
FLN (FOLKER) | text/xml | .fln |
FoLiA (FoLiA: Format for Linguistic Annotation) | application/xml | .folia.xml, .xml |
GeoJSON (Geographic JSON) | application/geo+json | .geojson, .json |
GeoTIFF (Geographic Tagged Image File Format) | image/tiff | .gtif, .tif, .tiff |
GIF (Graphics Interchange Format) | image/gif | .gif |
GML (Geography Markup Language) | application/gml+xml, application/x-gmz | .gml, .xml |
GrAF (Graph Annotation Format) | application/xml | .xml |
GZIP (GZIP File Format) | application/gzip | .gz |
HDF5 (Hierarchical Data Format, Version 5) | application/x-hdf5 | .h5 |
HTML (Hypertext Markup Language) | text/html | .htm, .html |
I5 (DeReKo archiving format) | application/tei+xml | .i5, .xml |
JP2 (Joint Photographic Experts Group 2000) | image/jp2, image/jpx | .jp2, .jpx |
JPEG (Joint Photographic Experts Group) | image/jpeg | .jpeg, .jpg |
JS (JavaScript) | application/ecmascript, application/javascript | .cjs, .es, .js, .mjs |
JSON (JavaScript Object Notation) | application/json | .json |
JSON-LD (JavaScript Object Notation for Linked Data) | application/ld+json | .jsonld |
KML (Keyhole Markup Language) | application/vnd.google-earth.kml+xml, application/vnd.google-earth.kmz | .kml, .kmz |
LMF (LMF Lexical Markup Framework) | text/x-lmf+xml | .lmf |
MapInfo.mif (MapInfo interchange format) | | .mid, .mif |
MapInfo.tab (MapInfo native format) | | .tab |
MapInfo.wor (MapInfo workspace file) | | .wor |
Markdown (Markdown) | text/markdown | .markdown, .md, .mdown, .mkd |
MP3 (MPEG Audio Layer III) | audio/mpeg | .mp3 |
MP4 (MPEG 4 video) | video/mp4 | .mp4 |
MPEG-1 (MPEG-1 Video Coding (H.261)) | video/mpeg | .mpeg, .mpg |
MPEG-2 (MPEG-2 Video Encoding (H.262)) | video/mpeg | .mpeg, .mpg |
MPEG-4 AVC (MPEG-4, Advanced Video Coding (Part 10) (H.264)) | video/mp4 | .mp4 |
NIST SPHERE (NIST SPHERE) | audio/x-nist | .nist |
OCFL (Oxford Common File Layout) | | |
PAULA (Potsdamer AUstauschformat Linguistischer Annotationen) | application/xml | .xml |
PDF (Portable Document Format) | application/pdf | |
PDF/A (PDF for archival preservation) | application/pdf | |
PDF/A-1 (PDF for archival preservation, 2005) | application/pdf | |
PDF/A-2 (PDF for archival preservation, 2011) | application/pdf | |
PDF/A-3 (PDF for archival preservation with support for embedded files, 2012) | application/pdf | |
PDF/A-4 (PDF for archival preservation, 2020) | application/pdf | |
PhonDat1 (PhonDat Data Format #1) | | |
PhonDat2 (PhonDat Data Format #2) | | |
plainText (Plain text) | text/plain | .txt |
PNG (Portable Network Graphics) | image/png | .png |
Praat (Praat TextGrid) | text/plain, text/praat-textgrid | .TextGrid |
QGIS.qgs (QGIS project file format) | | .qgd, .qgs, .qgz, .qlr, .qml |
QuickTime (QuickTime File Format) | video/quicktime, video/x-quicktime | .mov, .qt |
RAW (Raw Audio Format) | audio/raw | .raw |
SAM (SAM Format) | text/plain | .txt |
SAS.sas (Statistical Analysis System (SAS) standard save file) | application/x-sas | .sas |
SAS.sd2 (Statistical Analysis System (SAS) Dataset File Format) | | .sas7bdat, .sd2 |
SAS.xpt (Statistical Analysis System (SAS) Transport File Format) | application/x-sas-xport | .xport, .xpt |
Shapefile (ESRI Arc/View ShapeFile) | application/vnd.shp, x-gis/x-shapefile | .shp |
SPSS (Statistical Product and Service Solutions - uncategorized formats) | | .dat, .por, .sav, .sps, .spv |
SPSS.data+setup (Statistical Product and Service Solutions (data and setup)) | text/plain | .dat, .sps |
SPSS.por (Statistical Product and Service Solutions (portable format)) | application/x-spss-por | .por |
SPSS.sav (Statistical Product and Service Solutions (standard dataset save format)) | application/x-spss-sav | .sav, .spv, .zsav |
SPSS.spv (Statistical Product and Service Solutions (statistics output files)) | | .spo, .spv |
STATA (STATA - uncategorized formats) | | .DO, .dat, .dta |
STATA.data+setup (STATA (data and setup)) | text/plain | .DO, .dat |
STATA.dta (STATA (standard save format)) | | .dta |
SVG (Scalable Vector Graphics) | image/svg+xml | .svg, .svgz |
TAR (Tape Archive File Format) | application/x-tar | .tar |
TEI (Text Encoding Initiative) | application/tei+xml | .tei, .xml |
TEIHeader (TEI Header elements) | application/tei+xml | .tei, .xml |
TEISpoken (ISO/TEI Transcriptions of Spoken Language) | application/tei+xml;format-variant=tei-iso-spoken, application/tei+xml;format-variant=tei-iso-spoken;tokenized=[0,1] | .tei |
TIFF (Tagged Image File Format) | image/tiff | .tif, .tiff |
Toolbox (SIL Toolbox) | text/plain | .tbt |
Transana (Transana XML format) | text/xml | .xml |
TRS (Transcriber) | text/xml | .trs |
TSV (Tab Separated Values) | text/tab-separated-values | .tsv |
WAVE (Waveform Audio File Format) | audio/vnd.wave, audio/wav, audio/wave, audio/x-wav | .wav, .wave |
Worldfile (Esri World File) | text/plain | .wld |
Worldfile.jpgw (JPEG World File) | text/plain | .jgw, .jpgw |
Worldfile.tifw (TIFF World File) | text/plain | .tfw, .tifw |
XHTML (EXtensible HyperText Markup Language) | application/xhtml+xml | .html |
XLSX (Microsoft Excel/Office Open XML) | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | .xlsx |
XML (eXtensible Markup Language) | application/xml | .xml |
ZIP (ZIP File Format) | application/zip | .zip |
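Many of the formats listed share a file extension (most obviously `.xml`), so an extension alone rarely identifies a format. As a rough illustration, the sketch below inverts a handful of rows copied from the table into an extension index; the `rows` dictionary and the `formats_by_extension` helper are hypothetical and not part of the SIS.

```python
from collections import defaultdict

# A few rows copied from the table: format name -> listed file extensions.
rows = {
    "CMDI": [".cmdi", ".xml"],
    "EAF": [".eaf"],
    "TEI": [".tei", ".xml"],
    "XML": [".xml"],
}


def formats_by_extension(rows):
    """Invert the mapping: file extension -> names of formats that list it."""
    index = defaultdict(list)
    for name, extensions in rows.items():
        for ext in extensions:
            index[ext.lower()].append(name)
    return dict(index)


index = formats_by_extension(rows)
print(index[".xml"])  # ['CMDI', 'TEI', 'XML']
print(index[".eaf"])  # ['EAF']
```

An ambiguous result such as the one for `.xml` is one reason why recommendations refer to formats by ID rather than by extension.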