Building Web Pages With HTML 5

Depending on who you ask, HTML 5 is either the next important step toward creating a more semantic web or a disaster that’s going to trap the web in yet another set of incomplete tags and markup soup.

The problem with both sides of the argument is that very few sites are using HTML 5 in the wild, so the theoretical solutions to its perceived problems remain largely untested.

That said, it isn’t hard to see both the benefits and potential hang-ups with the next generation of web markup tools.

What’s different about HTML 5?

First off, what do we mean by HTML 5? Ideally, we mean the whole thing: new semantic structural tags, API specs like canvas or offline storage and even some new inline semantic tags. However, for practical reasons (read: browser support issues) we’re going to limit this intro to just the structural tags. As cool as canvas, offline storage, native video or the geolocation APIs are, they aren’t supported consistently across all the browsers yet.

“But wait,” you say, “most browsers don’t support the new structural elements either!” This is true, but the vast majority of them will happily accept any tag you want to make up. Even IE 6 can deal with the new elements, though if you want to apply styles using CSS, you’ll need a little JavaScript help.
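One commonly used form of that JavaScript help is sketched below. Older versions of IE won’t apply CSS to element names they don’t recognize, but calling document.createElement() once for each new tag name, before the body is parsed, is enough to change that. The list of tag names and the page title here are our own choices for illustration.

```html
<!DOCTYPE html>
<html>
    <head>
        <title>New elements in older IE</title>
        <script>
            // Create one throwaway instance of each new element so that
            // older versions of IE will recognize the tag name and apply
            // CSS rules to it. The elements are never added to the page.
            var newTags = ['header', 'nav', 'section', 'article', 'aside', 'footer'];
            for (var i = 0; i < newTags.length; i++) {
                document.createElement(newTags[i]);
            }
        </script>
    </head>
    <body>
        <header>
            <h1>My Site</h1>
        </header>
    </body>
</html>
```

The script has to run in the head, before the browser reaches any of the new tags in the body.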

The one thing to keep in mind when you’re applying styles to the new tags is that unknown tags have no default style in most browsers. They’re also treated as inline elements. However, because most of the new HTML 5 tags are structural, we’ll want them to behave like block elements. The solution is to make sure that you include display:block; in your CSS styles.
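As a minimal sketch, a single rule covering the structural tags discussed in this article would look something like this (which tags you list is up to you):

```html
<style>
    /* Browsers that don't know these tags treat them as inline,
       so tell them explicitly to render as blocks. */
    header, nav, section, article, aside, footer {
        display: block;
    }
</style>
```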

To help make some sense of what’s new in HTML 5 today, we’re going to dive right in and start using some of the new structural elements.

Finally, a doctype anyone can remember

The first thing we need to do to create an HTML 5 document is use the new doctype. Now, if you’ve actually memorized the HTML 4 or XHTML 1.x doctypes, you’re better monkeys than us. Whenever we start a new page we have to bring up an old one and cut and paste the doctype definition over.

It’s a pain, which is why we love the new HTML 5 doctype. Are you ready? Here it is:

<!DOCTYPE html>

Shouldn’t be too hard to commit that to memory. Simple and obvious. Case insensitive.

The idea is to stop versioning HTML so that backwards compatibility is easier. Whether or not that pans out in the long run is a whole other story, but at least it saves you some typing in the meantime.
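To see the doctype in context, here’s a bare-bones page that uses it. The title and body text are placeholders of our own, and the short charset declaration is another bit of HTML 5 shorthand.

```html
<!DOCTYPE html>
<html>
    <head>
        <!-- The short form of the character encoding declaration -->
        <meta charset="utf-8">
        <title>My Site</title>
    </head>
    <body>
        <p>Hello, HTML 5.</p>
    </body>
</html>
```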

Semantic structure at last

OK, we have our page defined as an HTML 5 document. So far so good. Now what are these new tags we’ve been hearing about?

Before we dive into the new tags, consider the structure of your average web page, which (generally) looks something like this:

<html>
    <head>
    ...stuff...
    </head>
    <body>
        <div id="header">
            <h1>My Site</h1>
        </div>
        <div id="nav">
            <ul>
                <li>Home</li>
                <li>About</li>
                <li>Contact</li>
            </ul>
        </div>
        <div id="content">
            <h1>My Article</h1>
            <p>...</p>
        </div>
        <div id="footer">
            <p>...</p>
        </div>
    </body>
</html>

That’s fine for display purposes, but what if we want to know something about what the page elements contain?

In the above example, we’ve added IDs to all our structural divs. This is a fairly common practice among savvy designers. The purpose is two-fold — first, the IDs provide hooks which can be used to apply styles to specific sections of the page and, second, the IDs serve as a primitive, pseudo-semantic structure. Smart parsers will look at the ID attributes on a tag and try to guess what they mean, but it’s hard when ID names are different on every site.

And that’s where the new structural tags come in.

Recognizing that these IDs were common practice, the authors of HTML 5 have gone a step further and made some of these elements into their own tags. Here’s a quick overview of the new structural tags available in HTML 5:

<header>

The header tag is intended as a container for introductory information about a section or an entire webpage. The <header> tag can include anything from your typical logo/slogan that sits atop most pages, to a headline and lede that introduces a section. If you’ve been using <div id="header"> in your pages, that would be the tag to replace with <header>.

<nav>

The nav element is pretty self-explanatory: your navigation elements go here. Of course, what constitutes navigation is somewhat debatable. There’s primary site navigation, but in some cases there may also be page navigation elements. The WHATWG, creators of HTML 5, recently amended the explanation of <nav> to show how it could be used twice on the same page.

The short story is that if you’ve been using a <div id="nav"> tag to hold your page navigation, you can replace it with a simple <nav> tag.
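Here’s a rough sketch of what two <nav> tags on one page might look like: one for the site navigation and one for links within the page. The link targets and the ID names are made up for the example.

```html
<body>
    <!-- Primary site navigation -->
    <nav>
        <ul>
            <li><a href="/">Home</a></li>
            <li><a href="/about/">About</a></li>
            <li><a href="/contact/">Contact</a></li>
        </ul>
    </nav>

    <!-- Page navigation: links to parts of this page -->
    <nav>
        <ul>
            <li><a href="#intro">Introduction</a></li>
            <li><a href="#details">Details</a></li>
        </ul>
    </nav>

    <h1 id="intro">Introduction</h1>
    <p>...</p>
    <h2 id="details">Details</h2>
    <p>...</p>
</body>
```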

<section>

Section is probably the most nebulous of the new tags. According to the HTML 5 spec, a section is a thematic grouping of content, typically with a header and possibly a footer. Sections can also be nested inside each other, if needed.

In our example above, the div labeled “content” would be a good candidate to become a section. Then within that section, depending on the content, we might have additional sections.
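As a sketch, the “content” div from our example might turn into something like this, with a couple of nested sections. The subheadings are invented for the example.

```html
<section>
    <header>
        <h1>My Article</h1>
    </header>

    <!-- Nested sections, each a thematic chunk of the content -->
    <section>
        <h2>Background</h2>
        <p>...</p>
    </section>
    <section>
        <h2>Details</h2>
        <p>...</p>
    </section>

    <footer>
        <p>...</p>
    </footer>
</section>
```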

<article>

According to the WHATWG notes, the article element should wrap “a section of content that forms an independent part of a document or site; for example, a magazine or newspaper article, or a blog entry.”

Keep in mind that you can have more than one article tag on the page; for example, a blog homepage might have the last ten articles, each wrapped in an article tag. Articles can also be broken into sections using the section tag, though you’ll want to plan your structure carefully or you’re liable to end up with some ugly tag soup.
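A blog homepage along those lines might be sketched like this. We’ve only included two posts, and the headlines are placeholders.

```html
<body>
    <header>
        <h1>My Blog</h1>
    </header>

    <!-- Each post stands on its own, so each gets its own article tag -->
    <article>
        <h2>First Post</h2>
        <p>...</p>
    </article>

    <article>
        <h2>Second Post</h2>
        <!-- A longer post broken into sections -->
        <section>
            <h3>Part One</h3>
            <p>...</p>
        </section>
        <section>
            <h3>Part Two</h3>
            <p>...</p>
        </section>
    </article>
</body>
```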

<aside>

Another fairly nebulous tag, the aside element is for content that is “tangentially related to the content that forms the main textual flow of a document.” That means a parenthetical remark, inline footnotes, pull quotes, annotations or the more typical sidebar content like you see to the right of this article.
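To make that concrete, here’s a sketch with two quite different uses of the tag: a pull quote inside an article and a sidebar next to it. The content is invented for the example.

```html
<article>
    <h1>My Article</h1>
    <p>...</p>

    <!-- A pull quote: tangential to the main flow of the article -->
    <aside>
        <blockquote>
            <p>Hey, no one said HTML 5 was perfect!</p>
        </blockquote>
    </aside>

    <p>...</p>
</article>

<!-- Typical sidebar content, here a small tag cloud -->
<aside>
    <h2>Tags</h2>
    <ul>
        <li>html5</li>
        <li>css</li>
        <li>javascript</li>
    </ul>
</aside>
```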

According to the WHATWG’s notes it seems like <aside> would work in all those cases, despite the fact that there’s considerable difference between a pull quote and a tag cloud in your sidebar.

Hey, no one said HTML 5 was perfect!

<footer>

Footer should also be self-explanatory, except perhaps that you can have more than one. In other words, sections can have footers in addition to the main footer generally found at the bottom of most pages.
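Putting the overview together, the div-based page from earlier might be rewritten along these lines. Treat it as a sketch: the title, the choice of <section> for the content div and the extra footer inside it are our own illustration, not the only correct structure.

```html
<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
        <title>My Site</title>
        <style>
            /* Make the new tags behave like block elements */
            header, nav, section, footer { display: block; }
        </style>
    </head>
    <body>
        <header>
            <h1>My Site</h1>
        </header>
        <nav>
            <ul>
                <li>Home</li>
                <li>About</li>
                <li>Contact</li>
            </ul>
        </nav>
        <section>
            <h1>My Article</h1>
            <p>...</p>
            <!-- A footer that belongs to this section only -->
            <footer>
                <p>...</p>
            </footer>
        </section>
        <!-- The main footer for the whole page -->
        <footer>
            <p>...</p>
        </footer>
    </body>
</html>
```

Notice that the IDs are gone. The tag names now carry the meaning, and they can be targeted directly in CSS.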

In detail…