Cleaning Up HTML

For the most part, Dreamweaver writes passable, clean code. If you modify the code, Dreamweaver usually avoids changing it back. On the other hand, some applications (most notably Microsoft products) write hideous code that begs intervention from the UN.

Dreamweaver offers several handy shortcuts for cleaning up gnarly code. You may have handwritten the code half-smashed on No-Doz and Jolt cola, or an intern may have demonstrated his or her lack of brilliance all over your site, or you may have produced pages in a lackluster editor.

Dreamweaver even makes some common errors that are easily fixed. There are three ways to clean up your code: opening a file, using the Clean Up HTML command, and using the Clean Up Word HTML command.

What Dreamweaver does on opening a file

Dreamweaver makes certain revisions to a page when it's first opened. To get a prompt when these changes occur (Figure 4.55), or to turn off some of the automatic corrections, you can modify the preferences. If you're a beginning coder, it's best to leave most of these options as is.

Figure 4.55. When you open a file with errors in it, you can get a prompt like this one that tells you what's being fixed. This is the file that I trashed in Figure 4.10.

To modify the auto-cleanup prefs:

From the Document window menu bar, select Edit > Preferences. The Preferences dialog box will appear.

In the Category list at the left, select Code Rewriting. That panel of the dialog box will appear (Figure 4.56).

Figure 4.56. The Code Rewriting panel of the Preferences dialog box. The Warn When Fixing or Removing Tags checkbox is turned off by default; check it if you want to see the prompt in Figure 4.55.

To see a prompt when Dreamweaver modifies your code, check the Warn When Fixing or Removing Tags checkbox.


For assistance in modifying the other attributes, see Appendix D on the Web site for this book.

HTML Code Format Details

Good code is nitpicky, right? This sidebar describes some of the nitpickier details and rationales for indenting, wrapping, line breaks, and tag case. Use this sidebar in conjunction with the steps in the preceding sections.

  • Indenting: By default, Dreamweaver indents certain elements of HTML—the rows and cells in a table, for example. Not indenting may save some download time on very large pages.

    To set an indent size (the default is two spaces or two tabs), type a number in the Indent text box. To set how many spaces a tab represents (tabs in HTML are treated as spaces), type a number in the Tab text box.

    Some production teams indent in tables or on frameset pages even if they don't do so anywhere else. (It makes working with nested tables and framesets easier.) To turn on indenting specifically for Table Rows and Columns or Frames and Framesets, check the appropriate box.

  • Wrapping: To wrap within the Code inspector window automatically, check the Automatic Wrapping checkbox. To turn off autowrapping, uncheck it. (You can wrap individual pages differently by using the Options menu in the Code inspector or Code View.)

    The default width for text-based programs like vi and Telnet is usually 76 or 80 columns (a column in this context is the width of one monospace character, so the column count is the number of characters that fit across a window). To set a different width, type it in the After Column text box.

  • Line Breaks: Line breaks are encoded differently on different platforms. Because line breaks are actually characters, a Mac or Windows line break may show up as a stray visible character when the file is viewed on Unix, for example. If you work with pages that will be checked in to a document management system like CVS, be sure to check with your house style guide or an engineer to verify your choices here.

  • Tag Case: Some folks are especially picky about whether tags and attributes are written in UPPERCASE or lowercase.

    To set the case for attributes (the case can be the same or different from tag case), select lowercase or UPPERCASE from the Case for Attributes drop-down menu.

    (Attribute values are always lowercase, as in <TD ALIGN="center">.)

    You can have Dreamweaver override the tag and attribute case for documents that were produced in other applications or before you edited preferences.

    To change the HTML case of older documents opened in Dreamweaver, check the Tags and/or Attributes checkbox in the Override Case Of line. Handily enough, the Clean Up HTML command (described in the next section) will set the proper tag case for your documents.
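
If you'd like to see what the line-break and tag-case settings amount to, here is a minimal Python sketch. It is not how Dreamweaver does it; the function names are made up for illustration, "mac" means the classic Mac OS carriage return, and the tag-case function uses a simple regular expression rather than a real HTML parser.

```python
import re

# Line-break characters per platform. "mac" here means classic Mac OS (CR only).
LINE_BREAKS = {"unix": "\n", "windows": "\r\n", "mac": "\r"}


def normalize_line_breaks(text: str, platform: str = "unix") -> str:
    """Rewrite every line break in text to the chosen platform's style."""
    # Collapse CRLF first so it is not counted as two separate breaks.
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", LINE_BREAKS[platform])


def set_tag_case(html: str, upper: bool = False) -> str:
    """Change the case of tag names only; attributes and values are untouched."""

    def _recase(match):
        name = match.group(2)
        return "<" + match.group(1) + (name.upper() if upper else name.lower())

    # Simplification (an assumption of this sketch): a regex, not an HTML
    # parser, so text that merely looks like a tag is recased as well.
    return re.sub(r"<(/?)([A-Za-z][A-Za-z0-9]*)", _recase, html)
```

Calling set_tag_case on a line like <TD ALIGN="center"> lowercases only the TD; the attribute name and its value are left alone, which mirrors the separate tag and attribute settings described above.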

Performing additional clean-up

Aside from Dreamweaver's automatic cleanup functions, you can have it perform more specific code-massaging at any point.

To clean up HTML code:

From the Document window menu bar, select Commands > Clean Up HTML. The Clean Up HTML dialog box will appear (Figure 4.57).

Figure 4.57. Choose which elements to clean up in the Clean Up HTML dialog box.

Dreamweaver lets you remove the following boo-boos (Figure 4.58):

Figure 4.58. This "page" is really just a catalog of errors to be fixed.

  • Empty Tags (Lines 8 and 9)

  • Redundant Nested Tags (Line 11)

  • Non-Dreamweaver HTML Comments (regular comments not inserted by the program; Line 13)

  • Dreamweaver HTML Comments (comments Dreamweaver inserts with scripts and the like)

  • Specific Tags (any specified tag; Line 15). You must type the tag in the text box. Type tags without brackets, and separate multiple tags with commas (for example: blink, u, tt).

Check the box beside each kind of garbage you want removed (Figure 4.57).
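
To get a feel for what two of these options do, here is a rough Python sketch of removing empty tags and removing specific tags. It is an illustration under stated assumptions, not Dreamweaver's own logic: the function names and the list of tags treated as safe to delete when empty are hypothetical, and the sketch keeps the text inside a removed tag.

```python
import re

# Assumption: only these inline formatting tags are removed when empty.
EMPTY_CANDIDATES = ("b", "i", "u", "em", "strong", "font", "span", "tt")


def remove_empty_tags(html: str) -> str:
    """Delete candidate tag pairs that enclose nothing, repeating until stable."""
    pattern = re.compile(
        r"<(" + "|".join(EMPTY_CANDIDATES) + r")\b[^>]*></\1\s*>", re.IGNORECASE
    )
    previous = None
    while previous != html:  # removing <i></i> may leave <b></b> empty in turn
        previous = html
        html = pattern.sub("", html)
    return html


def remove_specific_tags(html: str, tag_list: str) -> str:
    """Strip the listed tags (typed like "blink, u, tt") but keep their contents."""
    names = [name.strip() for name in tag_list.split(",") if name.strip()]
    if not names:
        return html
    # \b keeps "u" from matching longer tag names such as <ul>.
    pattern = re.compile(
        r"</?(?:" + "|".join(map(re.escape, names)) + r")\b[^>]*>", re.IGNORECASE
    )
    return pattern.sub("", html)
```

The tag list is passed the same way you would type it in the dialog box: no brackets, commas between names.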

Even Dreamweaver is guilty of redundancy when coding <font> tags (Figure 4.59). To combine all redundant font tags, check the Combine Nested <font> Tags When Possible checkbox.

Figure 4.59. The three <font> tags on line 14 can easily be combined into a single <font> tag using the Clean Up HTML command.
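
Combining nested <font> tags boils down to folding the attributes of an inner tag into the tag that directly wraps it. The Python sketch below shows the idea; the function name is hypothetical, it assumes the nested tags wrap exactly the same content, and it does not try to resolve conflicts when two tags set the same attribute.

```python
import re

# An outer <font> that wraps exactly one inner <font> and nothing else.
# The inner content may not contain further <font> or </font> tags, so the
# innermost pair is merged first and the loop works its way outward.
_NESTED_FONT = re.compile(
    r"<font\b([^>]*)>\s*<font\b([^>]*)>((?:(?!</?font\b).)*)</font>\s*</font>",
    re.IGNORECASE | re.DOTALL,
)


def combine_nested_font_tags(html: str) -> str:
    """Merge directly nested <font> tags into one tag carrying all attributes."""
    previous = None
    while previous != html:  # each pass merges one level of nesting
        previous = html
        html = _NESTED_FONT.sub(r"<font\1\2>\3</font>", html)
    return html
```

Two sibling <font> tags inside a shared outer tag are left alone, because merging them would change which text each attribute applies to.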

To see for yourself the errors Dreamweaver catches, check the Show Log on Completion checkbox.

Ready? Click on OK. Dreamweaver will scan the page for the selected errors, and if you chose to display a log, it will return a list of what it fixed (Figure 4.60).

Figure 4.60. After cleaning up the stuff in Figure 4.58, this dialog box shows what was done.

Cleaning Up Word HTML

Many text documents, for better or worse, are prepared in Microsoft Word at one stage or another in the production process. Word (95, 97, 98, and 2000) offers a timesaving Save As HTML feature that puts in paragraphs, line breaks, links, and most text formatting. But it does it so badly!

Fortunately, the errors Word makes when converting pages to HTML are consistently bad. The Dreamweaver team figured out the error patterns and wrote a widget to fix most of them.

To clean up Word HTML:

In the Document window, open the page you saved as HTML using Word.

From the Document window menu bar, select Commands > Clean Up Word HTML.

Dreamweaver will read the document info to determine which version of Word was responsible for the damage. If it can't detect this information, a warning will appear (Figure 4.61). Your document may not have been prepared in Word; you might want to run it through twice.

Figure 4.61. This dialog box will appear if you use Clean Up Word HTML to fix a file that wasn't created in Word, or that was created with an ancient version.
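
This section doesn't say exactly which document info Dreamweaver reads. One plausible signal, and it is only an assumption here, is the Generator meta tag that Word writes into the head of the pages it saves. A minimal Python sketch of checking for it (the function name is hypothetical):

```python
import re
from typing import Optional

# Matches <meta name=Generator content="..."> with or without quotes.
_GENERATOR = re.compile(
    r"<meta\s+name\s*=\s*[\"']?generator[\"']?\s+content\s*=\s*[\"']?([^\"'>]*)",
    re.IGNORECASE,
)


def detect_word_generator(html: str) -> Optional[str]:
    """Return the Generator string if it names Microsoft Word, else None."""
    match = _GENERATOR.search(html)
    if match and "microsoft word" in match.group(1).lower():
        return match.group(1).strip()
    return None
```

A result of None corresponds to the situation in Figure 4.61: the page either wasn't made in Word or doesn't say so.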

In any case, the Clean Up Word HTML dialog box will appear, perhaps after you click on OK to dismiss the dialog (Figure 4.62, Figure 4.63 and Figure 4.64).

Figure 4.62. The Clean Up Word HTML dialog box for Word 97/98.

Nesting Instincts

Valid, by-the-spec HTML asks that <font> tags be nested inside <p> tags. This means that each paragraph contains its own font formatting. This can take up quite a bit of room and add significant download time to large pages.

If you want to cheat on this, which the browsers allow, then turn off the Fix Invalidly Nested and Unclosed Tags option. Then, you can use a single <font> tag to modify as many blocks of text as you desire.
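
To put a number on the room this takes up, here is a small Python sketch that builds the same paragraphs both ways and compares the size of the markup. The font attributes and function names are made up for illustration.

```python
FONT_OPEN = '<font face="Arial" size="2">'  # hypothetical formatting
FONT_CLOSE = "</font>"


def font_inside_each_paragraph(paragraphs):
    """By-the-spec nesting: every <p> carries its own <font> tag."""
    return "\n".join(f"<p>{FONT_OPEN}{text}{FONT_CLOSE}</p>" for text in paragraphs)


def single_font_around_paragraphs(paragraphs):
    """The cheat: one <font> tag wrapped around all the paragraphs."""
    body = "\n".join(f"<p>{text}</p>" for text in paragraphs)
    return f"{FONT_OPEN}{body}{FONT_CLOSE}"


def characters_saved(paragraphs):
    """How many characters the single-tag version saves."""
    return len(font_inside_each_paragraph(paragraphs)) - len(
        single_font_around_paragraphs(paragraphs)
    )
```

The saving grows by one full pair of <font> tags for every paragraph after the first, which is why it only matters on large pages.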

Figure 4.63. The Clean Up Word HTML dialog box for Word 2000.

Figure 4.64. The Detailed panel of the Clean Up Word HTML dialog box for Word 97/98.

If Dreamweaver detects the version of Word used to save the HTML, it will appear in the Clean Up HTML From drop-down menu. If not, select your version. (For Word 95, select Word 97/98.) You may get a warning that the version is different from what Dreamweaver detected.

The following options are available for fixing. For more details about Word-specific markup, see the sidebar, Detailed Word Markup.

  • Remove Word specific markup (tags that aren't standard HTML tags)

  • Clean Up CSS (fixes modifications made using Cascading Style Sheets)

  • Clean Up <font> tags (consolidates redundant text formatting)

  • Fix Invalidly Nested Tags (rearranges tags nested in nonstandard order)

  • Set Background Color. (Type the hex code in the text box. #ffffff is white. If you don't know the hex code, skip this one and apply the background color later.)

  • Apply Source Formatting. (Makes modifications to the indenting, line breaks, and case selections. See Setting HTML Preferences, earlier in this chapter.)

To see a dialog box describing the fixes Dreamweaver made, make sure the Show Log on Completion checkbox is marked.

Ready? Click on OK. Dreamweaver will make the selected revisions and display a log if you asked it to do so (Figure 4.65).

Figure 4.65. This dialog box is a log of the changes that were made using the Clean Up Word HTML command.

Detailed Word Markup

Word makes some singular, usually unnecessary additions to standard HTML code when you save a Word file as HTML. If any of this proprietary code is something you want to address on your own, you can ask Dreamweaver not to remove it.

In the Clean Up Word HTML dialog box, click on the Detailed tab. That panel will come to the front (Figure 4.64 and Figure 4.66).

Figure 4.66. The Detailed panel of the Clean Up Word HTML dialog box for Word 2000.

In all versions of Word, the program applies its own <meta> and <link> tags in the head of the document. If these are useless to you, check the Word Meta and Link Tags from <head> checkbox (Figure 4.67).

Figure 4.67. Word inserted these META and XML tags; the two META NAME tags will be removed, as will all the extraneous XML markup. These tags may be useful for importing documents; if so, uncheck the XML checkbox.

  • Word 97/98: Word 97 and 98 make peculiar choices when it comes to font sizes. To convert Word's font size choices to your own, click the checkbox for the font size, and then select a heading size or font size from the associated drop-down menu. For example, a wise choice would be to assign size 3 text to the default size in Dreamweaver. If you want to keep Word's size assignment, select Don't Change.

  • Word 2000: Word 2000 gets ahead of itself in using XML; in other words, it includes proprietary code for perfectly vanilla HTML functions. It also makes a few more boo-boos.

To remove XML from the opening <html> document tag, check that box.

To remove other Word HTML markup (in the form of proprietary tags), check the Word XML Markup checkbox.

To remove pseudo-code, check the <![if …]><![endif]> Conditional Tags and Their Contents checkbox.

To remove both empty paragraphs and extra margins, check that box.

These details can be modified at any point during your cleanup.
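
For the curious, two of the Word 2000 fixes can be approximated in a few lines of Python. This is a sketch under stated assumptions, not Dreamweaver's code: the function names are hypothetical, the patterns are plain regular expressions, and conditional blocks nested inside one another are not handled.

```python
import re

# <![if ...]> ... <![endif]>, with or without the surrounding comment dashes.
_CONDITIONAL = re.compile(
    r"<!(?:--)?\[if\b[^\]]*\]>.*?<!\[endif\](?:--)?>",
    re.IGNORECASE | re.DOTALL,
)

# xmlns="..." or xmlns:prefix="..." attributes.
_XMLNS_ATTR = re.compile(r"\s+xmlns(?::\w+)?\s*=\s*\"[^\"]*\"", re.IGNORECASE)


def strip_conditional_blocks(html: str) -> str:
    """Remove conditional tags together with everything between them."""
    return _CONDITIONAL.sub("", html)


def strip_xmlns_from_html_tag(html: str) -> str:
    """Drop XML namespace attributes from the opening <html> tag only."""

    def _clean(match):
        return _XMLNS_ATTR.sub("", match.group(0))

    return re.sub(r"<html\b[^>]*>", _clean, html, count=1, flags=re.IGNORECASE)
```

Other attributes on the <html> tag, such as lang, are left in place; only the namespace declarations go.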
