Global, i18n, l10n, Unicode

Page history last edited by Brev Patterson 8 mos ago
  • Character Encoding = Unicode
  • Locale identifiers = RFC 4646, ISO 3166
  • External Strings = Resource Bundles
  • Core libs = ICU, PHP intl
  • Language Negotiation = mod_intl, intl

 

  • PHP intl and mbstring libraries: use mb_* functions (mb_strtoupper, mb_strtolower, mb_strlen) instead of strtoupper, strtolower, strlen, etc.
  • display native language options in native language: Deutsch (not German)
  • RFC 4646 + RFC 4647
    • lang-script-country (ISO 639 language, ISO 15924 script, ISO 3166-1 country)
    • zh-Hant-HK
    • de-AT
    • en-US
    • (hyphen separator, not underscore)
  • Time in ISO 8601, UTC: 2009-01-15T13:30:21.456Z
  • Olson Time Zones
    • America/Los_Angeles
    • Europe/London
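
The tag and time conventions above can be sketched in a few lines. This is only an illustration in Python, not the PHP intl API the notes refer to; the function names are hypothetical, the tag handling covers only the language-script-country shape, and nothing is validated against the IANA subtag registry.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Olson/IANA zone names; needs tz data installed


def normalize_language_tag(tag: str) -> str:
    """Normalize e.g. 'zh_hant_hk' to 'zh-Hant-HK' (hypothetical helper)."""
    parts = tag.replace("_", "-").split("-")   # hyphen, not underscore
    out = [parts[0].lower()]                   # language subtag: lowercase
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            out.append(part.title())           # script subtag: Title case
        elif len(part) == 2 and part.isalpha():
            out.append(part.upper())           # country subtag: uppercase
        else:
            out.append(part.lower())           # other subtags: lowercase
    return "-".join(out)


def to_iso8601_utc(dt: datetime) -> str:
    """Format an aware datetime as ISO 8601 in UTC, millisecond precision."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: attach a time zone first")
    utc = dt.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def to_local(dt: datetime, zone_name: str) -> datetime:
    """Convert a stored UTC time to an Olson zone for display only."""
    return dt.astimezone(ZoneInfo(zone_name))
```

A typical pattern is to store and exchange times with to_iso8601_utc and call to_local(stored, "America/Los_Angeles") only at display time.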

 

  • glyph=single visual unit of text
  • character=single logical unit of text
  • code point = integer assigned to character (U+00C0)
  • character set = collection of code points
  • character encoding = mapping from code points to bytes (e.g. utf-8)
  • all text has char encoding (ascii = 0x00-0x7F)
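
A small Python sketch may make the distinction between these terms concrete, using the U+00C0 example from the list. The helper name code_points is hypothetical; the rest is the standard library.

```python
import unicodedata

precomposed = "\u00C0"   # one code point, U+00C0
decomposed = "A\u0300"   # two code points: A + COMBINING GRAVE ACCENT


def code_points(text: str) -> list[str]:
    """List the code points in a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Both strings normally render as the same single glyph, but they differ
# in the number of code points they contain.
one = code_points(precomposed)    # ['U+00C0']
two = code_points(decomposed)     # ['U+0041', 'U+0300']

# NFC normalization composes the sequence back into the single code point.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed

# The same code point becomes different bytes under different encodings.
utf8_bytes = precomposed.encode("utf-8")       # b'\xc3\x80'
utf16_bytes = precomposed.encode("utf-16-be")  # b'\x00\xc0'

# ASCII text uses only byte values 0x00-0x7F, and is identical in UTF-8.
ascii_bytes = "abc".encode("ascii")
```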

 

  • input
    • convert to utf-8
    • verify utf-8
  • output
    • make sure utf-8
  • Unicode = ISO 10646
  • HTTP Content-Type header and HTML meta tag both declare charset=utf-8
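
The input and output steps above might look roughly like this in Python. All three function names are hypothetical, and the sketch assumes the caller knows the declared encoding of incoming bytes; it does no encoding detection.

```python
import html


def is_valid_utf8(data: bytes) -> bool:
    """Verify step: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(data: bytes, declared_encoding: str = "utf-8") -> bytes:
    """Convert step: decode with the declared encoding, re-encode as UTF-8.

    Raises UnicodeDecodeError if the bytes do not match the declaration.
    """
    return data.decode(declared_encoding).encode("utf-8")


def html_response(body_text: str) -> tuple[dict[str, str], bytes]:
    """Output step: declare utf-8 in both the HTTP header and the meta tag."""
    headers = {"Content-Type": "text/html; charset=utf-8"}
    page = (
        '<!DOCTYPE html><html><head><meta charset="utf-8"></head>'
        "<body>" + html.escape(body_text) + "</body></html>"
    )
    return headers, page.encode("utf-8")
```

For example, to_utf8(b"\xc0", "latin-1") yields the two-byte UTF-8 form of U+00C0, while is_valid_utf8(b"\xc0") is False because the lone Latin-1 byte is not valid UTF-8.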
