4.10. Special Character Encoding

For the most part, characters within documents that are not part of a tag are rendered as is by the browser. However, some characters have special meaning and are not directly rendered, while other characters can't be typed into the source document from a conventional keyboard. Special characters need either a special name or a numeric character encoding for inclusion in a document.

4.10.1. Special Characters

As has become obvious in the discussion and examples leading up to this section, three characters in source documents have very special meaning: the less-than sign (<), the greater-than sign (>), and the ampersand (&). These characters delimit tags and special character references. They'll confuse a browser if left dangling alone or with improper tag syntax. So you've got to go out of your way to include their actual, literal characters in your documents.[7]

[7] The only exception is that these characters may appear literally within the <listing> and <xmp> tags, but this is a moot point, since the tags are obsolete.

Similarly, you've got to use a special encoding to include a double quotation mark within a quoted string, or to include a character that doesn't appear on your keyboard but is part of the ISO Latin-1 character set implemented and supported by most browsers.
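As a minimal sketch of both cases, the fragment below uses the standard entity names lt, gt, amp, and quot. The sentence text and the image filename cover.gif are hypothetical, chosen only for illustration:

```html
<!-- Literal <, >, and & in running text -->
<p>Use &lt;br&gt; to break a line, and write AT&amp;T with an encoded ampersand.</p>

<!-- Double quotation marks inside a quoted attribute value -->
<img src="cover.gif" alt="The &quot;definitive&quot; guide">
```

A browser should display the paragraph as "Use <br> to break a line, and write AT&T with an encoded ampersand." and treat the alt text as: The "definitive" guide.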

4.10.2. Inserting Special Characters

To include a special character in your document, write a leading ampersand, then either the character's standard entity name or a pound sign (#) followed by its numeric position in the Latin-1 standard character set,[8] then an ending semicolon, without any spaces in between.

[8] The popular ASCII character set is a subset of the more comprehensive Latin-1 character set. Composed by the well-respected International Organization for Standardization (ISO), the Latin-1 set is a list of all letters, numbers, punctuation marks, and so on, commonly used by Western language writers, organized by number and encoded with special names. Appendix F contains the complete Latin-1 character set and encoding.

Whew. That's a long explanation for what is really a simple thing to do, as the following example illustrates. The example shows how to include a greater-than sign in a snippet of code by using the character's entity name. It also demonstrates how to include a greater-than sign in your text by referencing its Latin-1 numeric value:

if a &gt; b, then t = 0
if a &#62; b, then t = 0

Both examples cause the text to be rendered as:

if a > b, then t = 0
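The same two forms work for characters that don't appear on a conventional keyboard. The following sketch assumes the Latin-1 entity names copy and eacute, whose numeric positions are 169 and 233; the sample text itself is made up for illustration:

```html
<!-- Entity names -->
<p>Caf&eacute; menu, &copy; the author</p>

<!-- Numeric positions in Latin-1 -->
<p>Caf&#233; menu, &#169; the author</p>
```

Both paragraphs should be rendered identically, with an accented e in the first word and a copyright symbol before "the author".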

The complete set of character entity values and names is in Appendix F. You could write an entire document using character encoding, but that would be silly.
