DIN 91379
The DIN standard DIN 91379: "Characters and defined character sequences in Unicode for the electronic processing of names and data exchange in Europe, with CD-ROM"[1] defines a normative subset of Unicode Latin characters, sequences of base characters and diacritic signs, and special characters for use in names of persons, legal entities, products, addresses etc. The standard defines a normative mapping of Latin letters to base letters A-Z according to the recommendations of ICAO.[2]
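The idea of mapping Latin letters to the base letters A-Z can be sketched in a few lines of Python. This is only an illustration, not the normative table: the function name `to_base_letters` and the small `SPECIAL` table are assumptions made for this example, and its few entries are meant to follow common ICAO Doc 9303 transliterations. The authoritative mapping is the one published in the standard itself.

```python
import unicodedata

# Illustrative entries only (assumption of this sketch); the normative
# mapping table is defined in the standard, not here.
SPECIAL = {
    "ß": "SS",
    "Æ": "AE", "æ": "AE",
    "Ø": "OE", "ø": "OE",
    "Þ": "TH", "þ": "TH",
    "Ł": "L", "ł": "L",
}


def to_base_letters(name: str) -> str:
    """Reduce a name to upper-case base letters where this sketch knows how."""
    out = []
    for ch in name:
        if ch in SPECIAL:
            out.append(SPECIAL[ch])
            continue
        # Decompose the character and drop the combining diacritic signs.
        decomposed = unicodedata.normalize("NFD", ch)
        stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
        out.append(stripped.upper())
    return "".join(out)


if __name__ == "__main__":
    for name in ("Čapek", "Strauß", "Þór", "Wałęsa"):
        print(name, "->", to_base_letters(name))
```

Letters that have no Unicode decomposition and no entry in the table pass through unchanged, so a complete implementation would need the full table from the standard.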
Languages and scripts supported
The subset supports all official languages of European Union countries as well as the official languages of Iceland, Liechtenstein, Norway, and Switzerland, and also the German minority languages. It also provides all diacritic signs needed to transliterate names from other writing systems into the Latin script according to the relevant ISO standards.
In addition to the normative characters, the standard defines subsets of extended characters that contain modern Greek letters for Greece and Cyprus, Cyrillic letters for Bulgaria, and special characters for names of products and legal entities.
Conforming applications may support additional characters. For interface agreements or registers, however, it may be appropriate to support only a fixed subset of characters and sequences based on this standard.[3]
The text of the predecessor, DIN SPEC 91379,[4] together with explanations and lists of characters and sequences as Excel and XML files, is available from the Koordinierungsstelle für IT-Standards (KoSIT).[5] That package also contains an XML schema file with patterns for checking whether text conforms to the subsets defined in the standard. Lists of the characters and sequences of DIN SPEC 91379 and DIN 91379 are available as plain text files on GitHub in the repository DIN 91379 Characters and Sequences.[6] DIN 91379 contains a few characters and sequences that DIN SPEC 91379 does not.[6][1]
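A conformance check against such a list of characters and sequences might look like the following Python sketch. The `ALLOWED` set is a made-up excerpt for illustration and not the real list, and the function names are hypothetical. A real check would load the complete list from the published files or use the XML schema patterns instead.

```python
import unicodedata

# Hypothetical excerpt for illustration; NOT the real list from the standard.
ALLOWED = set("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz -'") | {
    "é", "ü", "Č",
    "M\u0302",  # a base character followed by a combining diacritic sign
}


def split_sequences(text: str) -> list[str]:
    """Split NFC-normalized text into base characters with their combining marks."""
    units: list[str] = []
    for ch in unicodedata.normalize("NFC", text):
        if units and unicodedata.combining(ch):
            units[-1] += ch  # attach the diacritic sign to the preceding base
        else:
            units.append(ch)
    return units


def nonconforming(text: str) -> list[str]:
    """Return the characters and sequences of text that are not in ALLOWED."""
    return [unit for unit in split_sequences(text) if unit not in ALLOWED]


if __name__ == "__main__":
    for sample in ("Müller", "M\u0302e", "Zoë", "Иван"):
        print(repr(sample), "->", nonconforming(sample))
```

Checking whole sequences rather than single code points matters because some permitted letters exist in Unicode only as a base character plus a combining sign.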
Compliance
To comply with this standard, an application must
- support all normative letters and sequences at all processing stages,
- use the encoding UTF-8 at interfaces, and
- normalize the characters according to Unicode normalization form C (NFC).[1]
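The normalization and encoding requirements above can be sketched with Python's standard library. The function name `prepare_for_interface` is an assumption made for this example. The sketch covers only NFC and UTF-8, not the requirement to support all normative letters and sequences.

```python
import unicodedata


def prepare_for_interface(text: str) -> bytes:
    """Normalize text to NFC and encode it as UTF-8 for an interface."""
    nfc = unicodedata.normalize("NFC", text)
    return nfc.encode("utf-8")


if __name__ == "__main__":
    decomposed = "Mu\u0308ller"   # 'u' followed by COMBINING DIAERESIS
    precomposed = "M\u00fcller"   # precomposed 'ü'
    # Both spellings yield the same bytes once normalized to NFC.
    print(prepare_for_interface(decomposed) == prepare_for_interface(precomposed))

    # A sequence with no precomposed code point stays a sequence under NFC.
    print(unicodedata.normalize("NFC", "M\u0302") == "M\u0302")
```

Normalizing before comparison or storage means that visually identical names do not differ merely because one source sent precomposed letters and another sent base letters with combining signs.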
Continuous text is outside the scope of this standard.[1]
Compliance with this standard is mandatory for German authorities and organisations when exchanging data between authorities or with citizens and businesses from 1 November 2024.[7]
The architecture guideline for German federal IT requires the use of the standard DIN SPEC 91379.[8]
The standardization process has so far produced the specification DIN SPEC 91379 (March 2019) and the final DIN standard (August 2022). Efforts are under way to develop it further into a European CEN standard.[5]
Software supporting DIN 91379
References
- "DIN 91379:2022-08: Characters and defined character sequences in Unicode for the electronic processing of names and data exchange in Europe, with CD-ROM" (in German). Beuth Verlag. August 2022.
- "Doc 9303, Machine Readable Travel Documents, Part 3 — Specifications Common to all MRTDs" (PDF). ICAO. Retrieved 2022-03-22.
- "PROJECT Characters and defined character sequences in Unicode for the electronic processing of names and data exchange in Europe, with CD-ROM". DIN. Retrieved 2022-03-19.
- "DIN SPEC 91379:2019-03: Characters in Unicode for the electronic processing of names and data exchange in Europe; with digital attachment" (in German). Beuth Verlag. March 2019. Retrieved 2022-03-19.
- Koordinierungsstelle für IT-Standards (KoSIT). "String.Latin+ 1.2: eine kommentierte und erweiterte Fassung der DIN SPEC 91379. Inklusive einer umfangreichen Liste häufig gestellter Fragen. Herausgegeben von der Fachgruppe String.Latin. (zip, 1.7 MB)" [String.Latin+ 1.2: Commented and extended version of DIN SPEC 91379.] (in German). Retrieved 2022-03-19.
- "DIN 91379 Characters and Sequences". 19 August 2022. Retrieved 2022-08-19 – via GitHub.
- IT-Planungsrat (2022-11-10). "Beschluss 2022/51 – String.Latin" [Decision 2022/51 – String.Latin] (in German). Retrieved 2022-12-22.
- Der Beauftragte der Bundesregierung für Informationstechnik. "Architekturrichtlinie für die IT des Bundes – Technische Spezifikationen zur Architekturrichtlinie –" [Architecture guideline for federal IT – Technical specifications for the architecture guideline –] (PDF) (in German). Retrieved 2022-10-08.
- "OpenPDF is an open source Java library for PDF files". March 19, 2022 – via GitHub.
- "Accents, DIN 91379, non Latin scripts". May 10, 2022 – via GitHub.
- "The Apache™ FOP Project". Feb 9, 2023.
- "Mirror of Apache FOP". Feb 9, 2023 – via GitHub.
- "Noto Latin, Greek, Cyrillic". Feb 9, 2023 – via GitHub.
- "Combining comma above right at wrong position · Issue #33 · notofonts/latin-greek-cyrillic". GitHub.
External links
- Tim Braatz. "Der neue Zeichensatz DIN SPEC 91379" [The new character set DIN SPEC 91379] (in German). public magazin. Retrieved 2022-03-20.
- Dr. Stefan Döring. "Projekt Unicode: Endlich korrekte Namen!" [Project Unicode: Finally correct names] (in German). Landeshauptstadt München. Retrieved 2022-03-20.
- "In 80 Tagen um die Welt: Unicode in der Verwaltung" [In 80 days around the world: Unicode in the administration] (in German). cit GmbH. 19 November 2020. Retrieved 2022-03-20.