Data Category Registry
Abbreviation: DCR
Scope: Data Categories for Language Resources
Topic: Data Categorization
Standard body: ISO
Keywords: ISOcat, linguistic categories, tag sets, linguistic annotation, linguistic terminology

The Data Category Registry (DCR) was developed within ISO TC 37. The motivation for developing the DCR was to provide a list of linguistic concepts and data categories covering a wide range of linguistic domains, so that it can be used in various applications, for instance linguistic annotation, dictionary design, metadata description, and text markup. By means of shared data categories, the DCR enables interoperability between DCR-based tools and resources.

The DCR makes it possible to reuse existing data categories. It also specifies principles for extending existing data categories and for creating new ones. The specification is composed of three main parts: an administrative, a descriptive, and a linguistic part. These parts are intended to ensure that the data categories are interoperable and suitable for developing new applications or improving existing ones.
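As an illustration of the three-part structure described above, the following is a minimal sketch of how a single data category might be modelled in code. It assumes that the three parts apply to each data category specification. All class names, field names, and example values are hypothetical and chosen for illustration only; they are not taken from ISO 12620 or from any DCR implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AdministrativePart:
    """Hypothetical administrative information: who registered what, and its status."""
    identifier: str
    status: str = "candidate"


@dataclass
class DescriptivePart:
    """Hypothetical descriptive information: a definition plus names per language."""
    definition: str
    names: dict[str, str] = field(default_factory=dict)  # language code -> name


@dataclass
class LinguisticPart:
    """Hypothetical linguistic information: values permitted for this category."""
    permitted_values: list[str] = field(default_factory=list)


@dataclass
class DataCategory:
    administrative: AdministrativePart
    descriptive: DescriptivePart
    linguistic: LinguisticPart

    def allows(self, value: str) -> bool:
        """True if the value is listed, or if no values are listed at all."""
        values = self.linguistic.permitted_values
        return not values or value in values


# Illustrative entry; the identifier and values are made up for this sketch.
part_of_speech = DataCategory(
    administrative=AdministrativePart(identifier="example-pos"),
    descriptive=DescriptivePart(
        definition="Category assigned to a word based on its grammatical properties.",
        names={"en": "part of speech"},
    ),
    linguistic=LinguisticPart(permitted_values=["noun", "verb", "adjective"]),
)
```

In this sketch, a tool that shares the same data category could call `part_of_speech.allows("noun")` to check an annotation value before accepting it, which is one simple way to picture the interoperability the registry aims for.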

Since 2009, the Max Planck Institute for Psycholinguistics in Nijmegen has been developing ISOcat (“Data Category Registry for ISO TC 37”), a web-based, open-source reference implementation of the ISO 12620 standard. ISOcat describes the data model and procedures for the DCR. It is mainly useful for creating and managing DCR data category specifications.

Related Standard(s):
Other standards in the same topic(s):

Version Title: Terminology and other language and content resources — Specification of data categories and management of a Data Category Registry for language resources
Abbreviation: DCR-2009 [not official, only for reference in this website]
Version Number: ISO 12620:2009
Status: final
Release Date: 2009-12-10
  1. ISO/TC 37/SC 3/WG 1
Related Standard(s):
  • MAF-2012

    The morpho-syntactic tag sets of the Morpho-syntactic Annotation Framework (ISO/DIS 24611) are based on data categories from the DCR.

  • TBX-2012
  • TMF-2009

    Some of the data categories from ISO 16642:2003 are used in this standard version.

  • WordSeg-1-2010
Used in CLARIN centre(s):