Chapter Six. XHTML: Restructuring the Web > Unicode and Other Character Sets

Unicode and Other Character Sets

The default character set for XML, XHTML, and HTML 4.0 documents is Unicode (http://www.w3.org/International/O-unicode.html), a standard defined, oddly enough, by the Unicode Consortium (www.unicode.org). Unicode is a comprehensive character set that provides a unique number for every character, “no matter what the platform, no matter what the program, no matter what the language.” Unicode is thus the closest thing we have to a universal alphabet, although it is not an alphabet but a numeric mapping scheme.
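To make the "unique number for every character" idea concrete, here is a minimal sketch in Python (our choice of language for illustration, not something the chapter uses). The helper name `code_point` is hypothetical; `ord` is Python's built-in for looking up a character's Unicode number.

```python
def code_point(ch: str) -> str:
    """Return the Unicode number of a single character in U+XXXX notation."""
    # ord() gives the character's Unicode code point as an integer.
    return f"U+{ord(ch):04X}"


# Characters from four different scripts, each with its own unique number.
for ch in ["A", "\u00e9", "\u0416", "\u65e5"]:
    print(ch, code_point(ch))
```

The same character yields the same number on any platform and in any program, which is the whole point of the mapping.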

Even though Unicode is the default character set for web documents, developers are free to choose other character sets that might be better suited to their needs. For instance, American and Western European websites often use ISO-8859-1 (Latin-1) encoding. You might be asking yourself what Latin-1 encoding means, or where it comes from. Okay, to be honest, you're not asking yourself any such thing, but we needed a transition, and that was the best we could do on short notice.
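As a rough illustration of why the choice matters, the sketch below (Python again, with a sample string and a hypothetical helper `fits_latin1` that are ours, not the book's) stores the same text as ISO-8859-1 and as UTF-8, one common way of storing Unicode text, and checks whether a string can be written in Latin-1 at all.

```python
text = "caf\u00e9"  # "café": four characters, the last one accented

# ISO-8859-1 stores each of these characters in a single byte.
latin1_bytes = text.encode("iso-8859-1")

# UTF-8 needs two bytes for the accented character.
utf8_bytes = text.encode("utf-8")

print(len(latin1_bytes), len(utf8_bytes))


def fits_latin1(s: str) -> bool:
    """Return True if every character in s can be written in ISO-8859-1."""
    try:
        s.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        # At least one character falls outside the Latin-1 repertoire.
        return False


print(fits_latin1(text))
print(fits_latin1("\u65e5\u672c\u8a9e"))  # Japanese text
```

Western European text fits comfortably in Latin-1, which is one reason such sites have favored it; text in most other scripts does not fit.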

