International Organization for Standardization (ISO)

The International Organization for Standardization (ISO) was founded in February 1947 and is a non-governmental federation of national standards bodies. Over 150 countries are now represented in ISO. ISO develops and publishes worldwide proprietary, industrial and commercial standards. In other words, the standards describe requirements and guidelines accepted by industry members.

Germany's interests in ISO are represented by the German Institute for Standardization (DIN), which has been a national member of ISO since 1951.

Since the use of automatically generated and processed linguistic resources has increased dramatically, standardization of these resources is needed. Standardization is also needed in other fields of linguistic research, such as syntax, semantics and pragmatics. It would allow linguistic resources to be applied to any purpose, reused, and exchanged between applications.

The subcommittee ISO/TC 37/SC 4 (“Language resource management”) was founded within the technical committee ISO/TC 37 (“Terminology and other language and content resources”) for the purposes of natural language resource management.

The subcommittee ISO/TC 37/SC 4 develops international standards and guidelines for effective language resource management in monolingual and multilingual applications. It also develops principles and methods for the representation and annotation of data, and for the creation of categories for thesauri, ontologies, morphological and syntactic analysis, and more.

ISO/TC 37/SC 4 is divided into the following Working Groups:

  • WG 1: Basic descriptors and mechanisms for language resources (Convenor: Nancy Ide)
  • WG 2: Semantic Annotation (Convenor: Kiyong Lee)
  • WG 3: Multilingual information representation (Convenor: currently being sought)
  • WG 4: Lexical resources (Convenor: Nicoletta Calzolari)
  • WG 5: Workflow of language resource management (Convenor: Key-Sun Choi)
  • WG 6: Linguistic Annotation (Convenor: Andreas Witt)

The other subcommittee that plays an important role for linguistic resources is ISO/IEC JTC 1/SC 34 (“Document description and processing languages”), part of the joint technical committee ISO/IEC JTC 1, which is a collaborative effort of ISO and the IEC (International Electrotechnical Commission). ISO/IEC JTC 1/SC 34 develops standards and guidelines for the description of document structures, languages and data types, and for the processing and handling of compound and multimedia/hypermedia documents, among other things.

ISO/IEC JTC 1/SC 34 is divided into the following Working Groups:

  • WG 1: Markup Languages (Convenor: Alex Brown)
  • WG 2: Information Presentation (Convenor: Yushi Komachi)
  • WG 3: Information Association (Convenor: Patrick Durusau)
  • WG 4: Office Open XML (Convenor: Makoto Murata)
  • WG 5: Document Interoperability (Convenor: Jaeho Lee)
  • WG 6: OpenDocument Format (Convenor: Francis Cave)

Specifications standardized by this body:

  1. Language resource management — Feature structures
  2. Language resource management — Morpho-syntactic annotation framework
  3. HyperText Markup Language
  4. Language resource management — Linguistic annotation framework
  5. Information Processing — Text and Office Information Systems – Standard Generalized Markup Language
  6. Semantic role markup language
  7. Document management — Electronic document file format for long-term preservation
  8. Language resource management — Semantic annotation framework
  9. Data Category Registry
  10. REgular LAnguage for XML Next Generation
  11. Dublin Core Metadata Element Set
  12. Language resource management — Word segmentation of written texts
  13. Language resource management — Lexical markup framework
  14. Language resource management — Persistent identification and sustainable access
  15. Component Metadata Infrastructure
  16. Segmentation Rules eXchange
  17. Distributed Ontology Language
  18. Country Codes
  19. Computer applications in terminology — Terminological markup framework
  20. Systems to manage terminology, knowledge and content — Design, implementation and maintenance of terminology management systems
  21. Language Resources Management — Multilingual Information Framework
  22. Presentation/representation of entries in dictionaries — Requirements, recommendations and information
  23. Transcription of Spoken Language
  24. Information and documentation — Thesauri and interoperability with other vocabularies
  25. Language resource management — Syntactic annotation framework
  26. TermBase eXchange
  27. Document Style Semantics and Specification Language
  28. Language resource management — Simplified natural language — Part 1: Basic concepts and general principles
  29. Extensible HyperText Markup Language
  30. Dialogue Act Markup Language
  31. Corpus Query Lingua Franca
  32. Universal Coded Character Set
  33. Ontology Integration and Interoperability
  34. Markup Language for events and temporal expressions in natural language
  35. Information Technology — Topic Maps
  36. Codes for the representation of names of languages
  37. Portable Document Format
  38. Information technology — Hypermedia/Time-based Structuring Language