The Unicode Standard : 5.0

  • Binding: Hardcover / Pages: 1417 p.
  • Language: ENG
  • Product code: 9780321480910
  • DDC classification: 005.722

Full Description

"Hard copy versions of the Unicode Standard have been among the most crucial and most heavily used reference books in my personal library for years."
--Donald E. Knuth, The Art of Computer Programming

"For more than a decade, Unicode has been a foundation for many Microsoft products and technologies; Unicode Standard Version 5.0 will help us deliver important new benefits to users."
--Bill Gates, chairman, Microsoft Corporation

"The path W3C follows to making text on the Web truly global is Unicode."
--Sir Tim Berners-Lee, KBE, Web inventor and director of the World Wide Web Consortium (W3C)

"Without Unicode, Java wouldn't be Java, and the Internet would have a harder time connecting the people of the world."
--James Gosling, Inventor of Java, Sun Microsystems, Inc.

These and other software luminaries recognize that Unicode has become an indispensable tool for supporting an increasingly global marketplace (see inside for more acclaim). A comprehensive system of standards for representing alphabets throughout the world, Unicode is the basis for modern programming--Windows, XML, Python, Perl, Mac OS, Linux--and every major search engine and browser in operation today.

New to Unicode Version 5.0

  • A stable foundation for Unicode Security Mechanisms
  • Property data for the Unicode Collation Algorithm and Common Locale Data Repository
  • Improvements to the Unicode Encoding Model for UTF-8
  • Rigorous stability of case folding and identifiers for improved interoperability and backward compatibility--enabling additional new ways to optimize code
  • A systematic framework for improved text processing for greater reliability--covering combining characters, Unicode strings, line breaking, and segmentation

This new edition of Unicode's official reference manual has been substantially updated to document the latest revisions to the Unicode Standard, with hundreds of pages of new information. It includes major revisions to text, figures, tables, definitions, and conformance clauses, and provides clear and practical answers to common questions. For the first time, the book contains the Unicode Standard Annexes, which specify vital processes such as text normalization and identifier parsing.

These improvements are so important that Version 5.0 is the basis for Microsoft's Vista generation of operating systems, and is included in upgrade plans for Google, Yahoo!, and ICU, to name but a few. This is the one book all developers using Unicode must have.

Contents

List of Figures
List of Tables
Foreword by Mark Davis
Preface
Acknowledgments
Chapter 1 Introduction
  1.1 Coverage
  1.2 Design Goals
  1.3 Text Handling
Chapter 2 General Structure
  2.1 Architectural Context
  2.2 Unicode Design Principles
  2.3 Compatibility Characters
  2.4 Code Points and Characters
  2.5 Encoding Forms
  2.6 Encoding Schemes
  2.7 Unicode Strings
  2.8 Unicode Allocation
  2.9 Details of Allocation
  2.10 Writing Direction
  2.11 Combining Characters
  2.12 Equivalent Sequences and Normalization
  2.13 Special Characters and Noncharacters
  2.14 Conforming to the Unicode Standard
Chapter 3 Conformance
  3.1 Versions of the Unicode Standard
  3.2 Conformance Requirements
  3.3 Semantics
  3.4 Characters and Encoding
  3.5 Properties
  3.6 Combination
  3.7 Decomposition
  3.8 Surrogates
  3.9 Unicode Encoding Forms
  3.10 Unicode Encoding Schemes
  3.11 Canonical Ordering Behavior
  3.12 Conjoining Jamo Behavior
  3.13 Default Case Algorithms
Chapter 4 Character Properties
  4.1 Unicode Character Database
  4.2 Case--Normative
  4.3 Combining Classes--Normative
  4.4 Directionality--Normative
  4.5 General Category--Normative
  4.6 Numeric Value--Normative
  4.7 Bidi Mirrored--Normative
  4.8 Name--Normative
  4.9 Unicode 1.0 Names
  4.10 Letters, Alphabetic, and Ideographic
  4.11 Properties Related to Text Boundaries
  4.12 Characters with Unusual Properties
Chapter 5 Implementation Guidelines
  5.1 Transcoding to Other Standards
  5.2 Programming Languages and Data Types
  5.3 Unknown and Missing Characters
  5.4 Handling Surrogate Pairs in UTF-16
  5.5 Handling Numbers
  5.6 Normalization
  5.7 Compression
  5.8 Newline Guidelines
  5.9 Regular Expressions
  5.10 Language Information in Plain Text
  5.11 Editing and Selection
  5.12 Strategies for Handling Nonspacing Marks
  5.13 Rendering Nonspacing Marks
  5.14 Locating Text Element Boundaries
  5.15 Identifiers
  5.16 Sorting and Searching
  5.17 Binary Order
  5.18 Case Mappings
  5.19 Unicode Security
  5.20 Default Ignorable Code Points
Chapter 6 Writing Systems and Punctuation
  6.1 Writing Systems
  6.2 General Punctuation
Chapter 7 European Alphabetic Scripts
  7.1 Latin
  7.2 Greek
  7.3 Coptic
  7.4 Cyrillic
  7.5 Glagolitic
  7.6 Armenian
  7.7 Georgian
  7.8 Modifier Letters
  7.9 Combining Marks
Chapter 8 Middle Eastern Scripts
  8.1 Hebrew
  8.2 Arabic
  8.3 Syriac
  8.4 Thaana
Chapter 9 South Asian Scripts-I
  9.1 Devanagari
  9.2 Bengali
  9.3 Gurmukhi
  9.4 Gujarati
  9.5 Oriya
  9.6 Tamil
  9.7 Telugu
  9.8 Kannada
  9.9 Malayalam
Chapter 10 South Asian Scripts-II
  10.1 Sinhala
  10.2 Tibetan
  10.3 Phags-pa
  10.4 Limbu
  10.5 Syloti Nagri
  10.6 Kharoshthi
Chapter 11 Southeast Asian Scripts
  11.1 Thai
  11.2 Lao
  11.3 Myanmar
  11.4 Khmer
  11.5 Tai Le
  11.6 New Tai Lue
  11.7 Philippine Scripts
  11.8 Buginese
  11.9 Balinese
Chapter 12 East Asian Scripts
  12.1 Han
  12.2 Ideographic Description Characters
  12.3 Bopomofo
  12.4 Hiragana and Katakana
  12.5 Halfwidth and Fullwidth Forms
  12.6 Hangul
  12.7 Yi
Chapter 13 Additional Modern Scripts
  13.1 Ethiopic
  13.2 Mongolian
  13.3 Osmanya
  13.4 Tifinagh
  13.5 N'Ko
  13.6 Cherokee
  13.7 Canadian Aboriginal Syllabics
  13.8 Deseret
  13.9 Shavian
Chapter 14 Archaic Scripts
  14.1 Ogham
  14.2 Old Italic
  14.3 Runic
  14.4 Gothic
  14.5 Linear B
  14.6 Cypriot Syllabary
  14.7 Phoenician
  14.8 Ugaritic
  14.9 Old Persian
  14.10 Sumero-Akkadian
Chapter 15 Symbols
  15.1 Currency Symbols
  15.2 Letterlike Symbols
  15.3 Number Forms
  15.4 Mathematical Symbols
  15.5 Invisible Mathematical Operators
  15.6 Technical Symbols
  15.7 Geometrical Symbols
  15.8 Miscellaneous Symbols and Dingbats
  15.9 Enclosed and Square
  15.10 Braille
  15.11 Western Musical Symbols
  15.12 Byzantine Musical Symbols
  15.13 Ancient Greek Musical Notation
Chapter 16 Special Areas and Format Characters
  16.1 Control Codes
  16.2 Layout Controls
  16.3 Deprecated Format Characters
  16.4 Variation Selectors
  16.5 Private-Use Characters
  16.6 Surrogates Area
  16.7 Noncharacters
  16.8 Specials
  16.9 Tag Characters
Chapter 17 Code Charts
  17.1 Character Names List
  17.2 CJK Unified Ideographs
  17.3 Hangul Syllables
Chapter 18 Han Radical-Stroke Index
Appendix A Notational Conventions
Appendix B Unicode Publications and Resources
  B.1 The Unicode Consortium
  B.2 Unicode Publications
  B.3 Unicode Technical Standards
  B.4 Unicode Technical Reports
  B.5 Unicode Technical Notes
  B.6 Other Unicode Online Resources
Appendix C Relationship to ISO/IEC 10646
  C.1 History
  C.2 Encoding Forms in ISO/IEC 10646
  C.3 UCS Transformation Formats
  C.4 Synchronization of the Standards
  C.5 Identification of Features for the Unicode Standard
  C.6 Character Names
  C.7 Character Functional Specifications
Appendix D Changes from Previous Versions
  D.1 Improvements to the Standard
  D.2 Versions of the Unicode Standard
  D.3 Clause and Definition Numbering Changes
  D.4 Changes from Version 4.1 to Version 5.0
  D.5 Changes from Version 4.0 to Version 4.1
  D.6 Changes from Unicode Version 3.2 to Version 4.0
  D.7 Changes from Unicode Version 3.1 to Version 3.2
  D.8 Changes from Unicode Version 3.0 to Version 3.1
Appendix E Han Unification History
  E.1 Development of the URO
  E.2 Ideographic Rapporteur Group
Appendix F Unicode Encoding Stability Policies
  F.1 Encoding Stability Policies for the Unicode Standard
Glossary
References
  R.1 Source Standards and Specifications
  R.2 Source Dictionaries for Han Unification
  R.3 Other Sources for the Unicode Standard
  R.4 Selected Resources: Technical
  R.5 Selected Resources: Scripts and Languages
Indices
  I.1 Unicode Names Index
  I.2 General Index
Annexes
  UAX 9: The Bidirectional Algorithm
  UAX 11: East Asian Width
  UAX 14: Line Breaking Properties
  UAX 15: Unicode Normalization Forms
  UAX 24: Script Names
  UAX 29: Text Boundaries
  UAX 31: Identifier and Pattern Syntax
  UAX 34: Unicode Named Character Sequences
  UAX 41: Common References for Unicode Standard