Segmentation Rules eXchange
Abbreviation: SRX
Scope: Standard for text segmentation for translation and other language-related processes
Topic: Segmentation
Standard body: LISA
Keywords: segmentation, text segmentation, computer-aided translation, translation memory

The Segmentation Rules eXchange (SRX) is an XML-based standard that defines an XML vocabulary for describing how text is segmented into meaningful units, such as sentences or paragraphs, for translation and other language-related processes.

The standard was originally developed by the Open Standards for Container/Content Allowing Re-use (OSCAR) committee at the Localization Industry Standards Association (LISA). It was built as an addition to TMX (Translation Memory eXchange) and enables interoperability among translation memory management systems.

SRX first received the status of an official recommendation in 2004, when it was published by LISA. Following LISA's insolvency in 2011, SRX and the other OSCAR standards were placed under a Creative Commons license, and the specifications moved to new work items within other standardisation organisations such as ISO and GALA.

SRX is currently still being developed as a Working Draft by ISO/TC 37/SC 4. The ISO standard provides an XML vocabulary for defining text segmentation rules. The rules use regular expressions to specify the patterns that must occur before and after a break or non-break position, which makes them both powerful and flexible.
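The break/non-break mechanism described above can be illustrated with a minimal Python sketch. It is not taken from the specification: the rule list, the function name `segment`, and the "first matching rule wins" evaluation order are assumptions made for illustration, and Python's `re` syntax stands in for whatever regular expression flavour a given SRX version prescribes.

```python
import re

# Hypothetical rules: (is_break, pattern before the position, pattern after it).
# Rules are tried in order; the first one that matches decides the outcome.
RULES = [
    (False, r"\b(?:Mr|Dr|etc)\.", r"\s"),  # non-break: abbreviation + whitespace
    (True, r"[.?!]", r"\s"),               # break: sentence-final punctuation
]


def segment(text, rules=RULES):
    """Split text at every position where a break rule fires first."""
    compiled = [
        (is_break, re.compile("(?:" + before + ")$"), re.compile(after))
        for is_break, before, after in rules
    ]
    segments = []
    start = 0
    for pos in range(1, len(text)):
        for is_break, before, after in compiled:
            # "before" must match text ending at pos, "after" text starting at pos.
            if before.search(text, 0, pos) and after.match(text, pos):
                if is_break:
                    segments.append(text[start:pos])
                    start = pos
                break  # first matching rule decides; skip the remaining rules
    segments.append(text[start:])
    return segments


if __name__ == "__main__":
    for part in segment("Dr. Smith arrived. He was late."):
        print(repr(part))
```

In this sketch the non-break rule is listed first so that the full stop in "Dr." is protected before the general break rule is tried. Whitespace following a break stays with the next segment, because the break position lies directly after the punctuation.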

The standard describes the general structure of an SRX document. Each SRX file specifies the segmentation rules and the languages they apply to. Standardising SRX enables interoperability between linguistic tools that use pattern-based segmentation: a tool can declare its segmentation method in SRX, so that other tools can access information about how the text has been segmented.
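As an illustration of how a tool might read such a file, the following Python sketch parses a small SRX-like document with the standard library. The sample document, the namespace URI, and the element and attribute names (`languagerule`, `rule`, `beforebreak`, `afterbreak`, `languagemap`) reflect our understanding of SRX 2.0 and should be checked against the specification. The function name `load_rules` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed SRX 2.0 namespace; verify against the specification.
NS = {"srx": "http://www.lisa.org/srx20"}

# Illustrative sample document, not copied from the specification.
SRX_SAMPLE = r"""
<srx xmlns="http://www.lisa.org/srx20" version="2.0">
  <header segmentsubflows="yes" cascade="no"/>
  <body>
    <languagerules>
      <languagerule languagerulename="Default">
        <rule break="no">
          <beforebreak>\b(?:Mr|Dr)\.</beforebreak>
          <afterbreak>\s</afterbreak>
        </rule>
        <rule break="yes">
          <beforebreak>[.?!]</beforebreak>
          <afterbreak>\s</afterbreak>
        </rule>
      </languagerule>
    </languagerules>
    <maprules>
      <languagemap languagepattern=".*" languagerulename="Default"/>
    </maprules>
  </body>
</srx>
"""


def load_rules(srx_text):
    """Return ({rule set name: [(is_break, before, after)]}, [(pattern, name)])."""
    root = ET.fromstring(srx_text)
    rules = {}
    for lang_rule in root.iterfind(".//srx:languagerule", NS):
        entries = []
        for rule in lang_rule.iterfind("srx:rule", NS):
            # A missing "break" attribute is treated as a break rule here.
            is_break = rule.get("break", "yes") == "yes"
            before = rule.findtext("srx:beforebreak", default="", namespaces=NS)
            after = rule.findtext("srx:afterbreak", default="", namespaces=NS)
            entries.append((is_break, before, after))
        rules[lang_rule.get("languagerulename")] = entries
    # Language maps link a language-code pattern to a named rule set.
    maps = [
        (m.get("languagepattern"), m.get("languagerulename"))
        for m in root.iterfind(".//srx:languagemap", NS)
    ]
    return rules, maps


if __name__ == "__main__":
    rules, maps = load_rules(SRX_SAMPLE)
    print(rules)
    print(maps)
```

Keeping the rule sets and the language maps as separate structures mirrors the split in the sample between the rules themselves and the description of the languages they belong to.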

Related Standard(s):
Other standards in the same topic(s):

Version Title: SRX Specification
Abbreviation: SRX-2008 [not official, only for reference on this website]
Version Number: 2.0
Release Date: 2008-04-07
  1. LISA
Related Standard(s):
Used in CLARIN centre(s):
Version Title: Language resources management — Segmentation Rules eXchange
Abbreviation: SRX-2011 [not official, only for reference on this website]
Version Number: ISO/CD 24621
Status: Committee Draft
Release Date: 2011-06-21
  1. ISO/TC 37/SC 4
Related Standard(s):
  • XML

    SRX is an XML-based format.