The Beauty of Semantic Markup (in English).

Sources:

http://ablognotlimited.com/articles/the-beauty-of-semantic-markup-introduction

http://ablognotlimited.com/index.php/articles/the-beauty-of-semantic-markup-part-1-quotes-citations

http://ablognotlimited.com/index.php/articles/the-beauty-of-semantic-markup-part-2-strong-b-em-i

http://ablognotlimited.com/index.php/articles/the-beauty-of-semantic-markup-part-3-headings

The Beauty of Semantic Markup, Introduction

JUL 15 2010

Ever since I started writing Microformats Made Simple, I’ve been distracted … from this blog, from my career goals, from my personal life. And even though the book is long since finished, I’m still distracted.

It is getting better, though. I’m gaining a bit more focus each day.

I quit my job to pursue freelance work in hopes of reconnecting with what I love about my career: making great web sites.

I’m letting go of some of my responsibilities with the user group I co-manage, Webuquerque, so that I can better experience the reason I co-founded it: the community.

I started writing for other publications to encourage my passion for sharing knowledge and information to audiences beyond my meager reach.

And I even decided to break an almost seven-year stretch of happily living alone to *fingers-crossed* happily living in sin with my boyfriend.

A New Blog Series!

The next thing I’m returning focus to is this blog. And there aren’t words to describe how right and grounding this feels.

A Blog Not Limited is my baby. My first online presence where I can do and say anything I want in my own voice, however wrong, inappropriate or profane it may be.

It was the vehicle that inspired me to start writing about microformats and, by extension, gave me the opportunity to write a book.

It is the place where I started to get comfortable “putting myself out there” … where I could shamelessly self promote, while simultaneously sharing information.

And in return for all this, I’ve been a bad parent. I haven’t done a single bit of development I planned to do after its first anniversary. I even forgot the anniversary this year.

As for content, I haven’t written too much beyond the shameless self promotion, which sorta takes away the “shameless” part because I like to have a balance (not to mention, I’m beyond sick of promoting me and the book).

I can overlook my neglect of the design and development needed here (for now), but I can’t overlook neglect of content. Which brings me to the reason for this post: I’m starting a new blog series focusing on semantic markup.

Why Semantic Markup?

The most obvious answer is I dig it. I mean I really love it.

Semantic markup is more exciting, challenging and satisfying for me than all the cool shit you can do with CSS 3 transitions or HTML <canvas> or any of the latest–and–greatest emerging web trends.

There is a purity to semantic markup that appeals to me. It is simply deciding what markup elements best describe the content. If you have content for a primary heading on a page, you simply use <h1>. If you have a series of paragraphs, <p> is there for you. Want to provide a sequential list of instructions? <ol> won’t let you down.
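As a rough sketch of those three cases (the recipe content is made up purely for illustration):

```html
<!-- Primary heading for the page -->
<h1>Green Chile Stew</h1>

<!-- A series of paragraphs -->
<p>This is the first paragraph of the introduction.</p>
<p>This is the second paragraph.</p>

<!-- Sequential instructions, where order matters -->
<ol>
  <li>Brown the pork.</li>
  <li>Add the chile, potatoes and broth.</li>
  <li>Simmer until the potatoes are tender.</li>
</ol>
```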

But those are the most simplistic scenarios. I also believe writing truly semantic markup is much like putting pieces of a puzzle together.

It’s beyond only using <table>s for tabular data. It is beyond not using <blockquote>s to simply indent text.

You have to be able to see the big picture.

That means a solid understanding of the content itself, even how the content may change over time. For example, if I’m working with contact information I want to know what, specifically, is included. Name, job title, birthday, web site? Will additional content be added in the future? All of these little details affect the final markup.

An understanding of the design and CSS needed to translate that design is essential. Is there a “sticky footer”? What, if any, image replacement techniques will be used? What about font embedding? Again, these design and CSS details will absolutely require certain markup considerations.

You even need an appreciation for your fellow designers and developers who may, someday, have to work with your markup. I’ve too often inherited sites from designers who believed 10 nested <div>s were necessary to contain a single blog post. Or from developers who felt id="static23" was a descriptive and useful naming convention.
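To make the contrast concrete, here is a hypothetical blog post wrapped both ways. Apart from static23, the ids and class names are invented for illustration and not taken from any real site:

```html
<!-- Hard to inherit: wrappers and names that describe nothing -->
<div id="static23">
  <div class="box1">
    <div class="inner">
      <div class="inner2">
        <h2>Post title</h2>
        <p>Post content.</p>
      </div>
    </div>
  </div>
</div>

<!-- Easier to inherit: one container, named for what it holds -->
<div id="post-semantic-markup" class="post">
  <h2>Post title</h2>
  <p>Post content.</p>
</div>
```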

It’s About Craftsmanship

HTML is, by comparison, one of the easiest web languages to learn and start using right away. And nothing is going to stop you from using dozens of <table>s for layout, or structuring navigation with <a>s and <br />s, or any of the thousands of examples of shoddy markup that exist on the web today.
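The navigation case, for example, might look like this. The page names and URLs are placeholders:

```html
<!-- Shoddy: links separated by line breaks -->
<a href="/">Home</a><br />
<a href="/articles/">Articles</a><br />
<a href="/about/">About</a><br />

<!-- POSH: navigation is a list of links, so mark it up as a list -->
<ul id="nav">
  <li><a href="/">Home</a></li>
  <li><a href="/articles/">Articles</a></li>
  <li><a href="/about/">About</a></li>
</ul>
```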

Anyone can write crap markup. It takes a craftsman to write that “big picture” markup I described.

It takes knowledge to understand the semantics of HTML elements and properly apply them to content. It takes commitment to spend the extra time needed to follow semantic naming conventions for ids and classes. It takes experience to know that sometimes you do need a few extra <div>s to support a design or invalid role attributes to support accessibility.
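A minimal sketch of that last point, assuming WAI-ARIA landmark roles are the attributes in question. The role attribute does not validate against HTML 4 or XHTML 1.0 doctypes, and the ids and content here are hypothetical:

```html
<!-- role values are WAI-ARIA landmark roles; ids are illustrative only -->
<div id="nav" role="navigation">
  <ul>
    <li><a href="/">Home</a></li>
  </ul>
</div>

<div id="content" role="main">
  <h1>Page title</h1>
</div>

<div id="footer" role="contentinfo">
  <p>Copyright notice</p>
</div>
```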

POSH Foundation

After I wrote Meaningful Markup: POSH and Beyond, I realized I had much more to share beyond the basics covered in that article. And so, I decided to start this Beauty of Semantic Markup series.

The focus will be Plain Old Semantic Markup (POSH, for you acronym-loving geeks) examples for real-world content. Not a bunch of theory about what benefits semantic markup offers. No admonitions that you must write semantic markup to support web standards and accessibility. You can see Meaningful Markup if you want that.

Instead, I want to focus on foundation because that’s where craftsmanship begins.

I’ll take different types of content and mark them up, using POSH and semantic naming conventions. Some posts will focus on a specific element, such as <table>, and show best practices for structure and attributes. Some posts will focus on specific content, such as quotes and citations, and cover which elements are most semantically appropriate to use.

Of course, I’ll introduce microformats when the content examples dictate. I’ll also get into CSS when it seems relevant. And you better believe I’ll address accessibility. As for HTML5, I would never neglect it, but I do want to focus on more “foundational” elements first.

Who Is This Series For?

First and foremost, this series is for me. As I explained, refocusing on this blog and writing for myself is something I need and want to do. Plus, I learn so much when I spend the time researching and writing about a topic.

But also, with the abundance of news and posts focusing on emerging trends in this industry, I feel compelled to re-engage in the discussion about fundamentals.

Far too often, I hear from developers who know they aren’t producing the best markup, but due to a range of circumstances — limited resources at their jobs, employers who think “web people” can do everything, lack of experience and knowledge — they feel hamstrung. And far too often, I’ve inherited work from other developers who didn’t appreciate good markup, which led to me spending unnecessary time fixing their work (and that means more time and money for employers and clients).

If you are one of those developers/designers, then this series is for you. It’s not going to be brain science. It probably won’t be anything that hasn’t already been written about before. But I do hope it will provide simple, easy–to–understand examples that you can (and will) use to take your markup to a higher level. Hell, I even plan to use the markup examples as my own personal reference to save time when I’m developing.

If you are a markup master already, this series may not be for you. Then again, it might. I’ve been writing HTML for over 10 years and I regularly discover new ways of approaching my markup. Perhaps this series may offer a golden nugget of goodness for even the most experienced front-end developers.

I also hope this series inspires discussion. I’m not perfect and I don’t know everything there is to know about markup. I hope as I present my suggestions for markup, others will chime in with their own ideas and conventions. And then we all learn something new. Yay!

The Plan

I don’t have a formal editorial calendar for this series, but I plan to tackle a range of semantic markup topics, including (and not limited to):

  • Accessible <table>s for tabular data
  • Semantic naming conventions for ids and classes
  • List elements (<ul>, <ol> and <dl>)
  • Images with captions
  • Accessible <form>s
  • Document structure with the new HTML5 semantic elements
  • A survey of sites that are “doing it right” and those that aren’t
  • Headings (<h1> through <h6>)
  • <acronym> and <abbr>

I also plan to explore different semantic markup approaches for different types of content, such as blog posts, site maps, image galleries and more. It might also be nice to take a look at one of those sites that is “doing it wrong” and show what I would do instead.

This series may span several months, a year or even more. As long as I feel interested and engaged in the topic, I plan to write about it. I hope to publish a new article every week, but it could be every 10 days … if I’m really busy, it could be longer.

I suggest you subscribe to A Blog Not Limited to get updates via RSS or follow me on Twitter, where I’ll post links to new articles.

In the meantime, I already have part one, covering quotes and citations, ready for your inspection. Enjoy!

The Beauty of Semantic Markup, Part 1: Quotes & Citations

JUL 15 2010

As I mentioned in my introduction, this series is going to take a close look at the fundamentals of semantic markup. In this first installment, I’m focusing on quotes and citations.

Before we get started, if you’d like to know more about semantic markup — what it is, why you should develop your sites with it — check out my article, Meaningful Markup: POSH and Beyond.

Now, let’s get to it!

Content First

My approach to writing markup always starts with the content, because knowing the nuances of content definitely affects the final markup. Let’s consider quotes:

  • Will the quote include a citation to the source? Is the source online?
  • Will the source citation require a link users can select?
  • Is the quote to appear on its own or within a body of text?
  • Does the quote contain block-level elements (such as several paragraphs or a list)?

Once you know the answer to these questions, you can get started on the markup.

Semantic Elements

Before you begin marking up those quotes, you should know which elements are at your disposal for this type of content.

<blockquote>

The <blockquote> is a block-level element intended for (comparatively) large quotations that contain other block-level elements. A <blockquote> could be a single paragraph, a series of paragraphs, a paragraph and a list, paragraphs and headings … you get the picture.

In real-world context, a <blockquote> may contain an excerpt from a book or resource. It can also be a relatively lengthy testimonial, as you might see on some companies’ web sites. And, of course, <blockquote> could also be an actual quote; something a person said in conversation, in a speech, in a presentation.

Since I’m in the mood for a bit of ego stroking, let’s apply <blockquote> to my LinkedIn recommendation from my former boss:

    <blockquote>
      <p>Emily is hands-down the best XHTML/CSS designer I've known. She consistently cranks out the most semantic, valid XHTML/CSS web designs in amazingly short order. Most importantly, she keeps the user experience and semantics in mind... setting the bar for code quality that has not yet been met even by companies netting $400-800k for web design projects.</p>
      <p>I always compare XHTML mark up and CSS code quality from other companies to Emily's and every time I'm disappointed. If I want the best, lightest, most semantic XHTML designs, I go to Emily. Other companies and individuals might come up with XHTML that validates, but only Emily's is as efficient and as semantic as possible.</p>
    </blockquote>

THE cite ATTRIBUTE

If your <blockquote> content is from an online source, you can indicate that source via the cite attribute with a valid URL. The above recommendation example is, indeed, online:

    <blockquote cite="http://www.linkedin.com/in/emilyplewis">
      <p>Emily is hands-down the best XHTML/CSS designer I've known. She consistently cranks out the most semantic, valid XHTML/CSS web designs in amazingly short order. Most importantly, she keeps the user experience and semantics in mind... setting the bar for code quality that has not yet been met even by companies netting $400-800k for web design projects.</p>
      <p>I always compare XHTML mark up and CSS code quality from other companies to Emily's and every time I'm disappointed. If I want the best, lightest, most semantic XHTML designs, I go to Emily. Other companies and individuals might come up with XHTML that validates, but only Emily's is as efficient and as semantic as possible.</p>
    </blockquote>

RENDERING & INTERPRETATION

In terms of how browsers render <blockquote>s, they are typically displayed with a left indent. It is this visual presentation that led to much abuse of <blockquote>. Some less–than–savvy developers use(d) it to give content an indent. And that’s a big ole no-no.
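If all you need is an indent, CSS handles it without implying a quotation. A minimal sketch, with a hypothetical class name and an arbitrary margin value:

```html
<style type="text/css">
  /* Presentation lives in CSS; "note" is an assumed class name */
  p.note { margin-left: 40px; }
</style>

<!-- Indented, but not marked up as a quotation -->
<p class="note">This paragraph is set off from the rest of the text.</p>
```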

Screen readers, meanwhile, will often announce the beginning and end of a <blockquote> to give users context of the content.

<q>

The <q> is an inline element used for, you guessed it, quotes that appear inline within other text (like a sentence) and do not require any block-level elements, such as paragraph breaks.

In the real world, <q> is most often appropriate for comparatively shorter quotes, such as a simple phrase or statement within a paragraph or sentence:

    <p>When I was younger, my mom used to do my hair before school. And in her morning rush, I often got forehead burns from the curling iron as she did my bangs. Her response? <q>Beauty is pain.</q> And thus began my indoctrination into society's pursuit of beauty.</p>

THE cite ATTRIBUTE

Just like <blockquote>, you can also use the cite attribute with <q> to indicate the source, if it’s online, of the quote:

    <p>Wikipedia defines citations as <q cite="http://en.wikipedia.org/wiki/Citations">a reference to a published or unpublished source</q>.</p>

RENDERING & INTERPRETATION

Browsers are supposed to render <q> elements with quotation marks before and after the content. As such, the W3C advises authors against including quotation marks in the content itself.

Furthermore, if you nest <q>s, browsers are supposed to render both the inner and outer quotes with the proper punctuation. In American English, for example, this would mean the inner quote would be delimited with single quotation marks, while the outer quote would begin and end with double quotation marks.
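A small sketch of nested quotes. The sentence is invented for illustration, and note that the markup itself contains no quotation marks:

```html
<!-- The outer and inner quotes are both left for the browser to punctuate -->
<p>My sister told me, <q>Mom always said <q>beauty is pain</q> and she meant it.</q></p>
```

A browser that follows the spec would be expected to wrap the outer quote in double quotation marks and the inner one in single quotation marks for American English.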

However, browser makers often go their own ways, which is what Internet Explorer did and, as such, all versions of IE prior to IE8 do not add those delimiting quotation marks. The other major browsers, however, do insert quotation marks.

It is worth mentioning that quotation marks vary according to language. As such, best practices recommend specifying the lang attribute for <q> to ensure the proper punctuation is applied:

    <p>Wikipedia defines citations as <q lang="en-us" cite="http://en.wikipedia.org/wiki/Citations">a reference to a published or unpublished source</q>.</p>

As far as screen readers go, they don’t seem to treat content within <q>s any differently than other content. They don’t announce the beginning or end of the quote, as with <blockquote>.

<cite>

While I’ve mentioned the cite attribute, there is also a <cite> element. It is an inline element used for references to a source. In other words, a citation.

And, as a sidenote, I’m not embarrassed to admit that for years, I was incorrectly using this element for inline quote content. I didn’t even realize there was a <q> element, much less that <cite> is intended for references to other work, not the quoted content itself.

In real-world practice, <cite> can be used in conjunction with a quote, to indicate the source of the quote … a person, book, whatever. This is particularly useful if the source is not online and, therefore, you can’t use the cite attribute with <blockquote> or <q>:

    <blockquote>
      <p>Mr. L. Prosser was, as they say, only human. In other words he was a carbon-based bipedal life form descended from an ape. More specifically he was forty, fat and shabby and worked for the local council.</p>
      <p><cite>The Hitchhiker's Guide to the Galaxy</cite></p>
    </blockquote>

Update

Thanks to a comment from Chris Pederick, I’ve already learned something new from this series (yay!).

It seems the HTML5 working draft says using <cite> when referencing a person is a no-no. Instead, we can use <b> or <span>. What. The. Fuck!?

Yeah, that just seems stupid to me. And a step away from semantics. But, oh well, lots of stuff in specifications seems stupid to me. And who knows, the final HTML5 spec is a ways off, so it could change.

Want to try to help make it change? Contribute to the WHATWG wiki page documenting uses of <cite> in reference to people.

I now return you to regularly scheduled programming …

Note that if you use <cite> within <blockquote>, you must first contain it with another block-level element, because <cite> is inline and <blockquote> can only contain block-level elements:

    <blockquote>
      <p><cite>The Hitchhiker's Guide to the Galaxy</cite></p>
    </blockquote>

And even if you are referencing an online source, you can still use <cite> along with the cite attribute:

    <blockquote cite="http://en.wikipedia.org/wiki/Citations">
      <p>A prime purpose of a citation is intellectual honesty; to attribute to other authors the ideas they have previously expressed, rather than give the appearance to the work's readers that the work's authors are the original wellsprings of those ideas.</p>
      <p><cite>Wikipedia</cite></p>
    </blockquote>

When I am dealing with an online source, as in the above example, I often drop in a link as a containing element for <cite>:

    <blockquote cite="http://en.wikipedia.org/wiki/Citations">
      <p>A prime purpose of a citation is intellectual honesty; to attribute to other authors the ideas they have previously expressed, rather than give the appearance to the work's readers that the work's authors are the original wellsprings of those ideas.</p>
      <p><a href="http://en.wikipedia.org/wiki/Citations"><cite>Wikipedia</cite></a></p>
    </blockquote>

You don’t, however, have to make the <a> href value the same URL as that of cite.

Consider the earlier LinkedIn recommendation example. The recommendation exists on my profile page, so that makes sense as the URL for the cite attribute. But if I wanted to include my former boss’s name as the <cite> reference, I would typically provide a link to his personal site:

    <blockquote cite="http://www.linkedin.com/in/emilyplewis">
      <p>Emily is hands-down the best XHTML/CSS designer I've known. She consistently cranks out the most semantic, valid XHTML/CSS web designs in amazingly short order. Most importantly, she keeps the user experience and semantics in mind... setting the bar for code quality that has not yet been met even by companies netting $400-800k for web design projects.</p>
      <p>I always compare XHTML mark up and CSS code quality from other companies to Emily's and every time I'm disappointed. If I want the best, lightest, most semantic XHTML designs, I go to Emily. Other companies and individuals might come up with XHTML that validates, but only Emily's is as efficient and as semantic as possible.</p>
      <p><a href="http://www.ianpitts.com/"><cite>Ian Pitts</cite></a></p>
    </blockquote>

And just for purposes of demonstration, since all of the above examples use <blockquote>, <cite> can absolutely be used to reference the source of <q> content:

    <p><cite>Wikipedia</cite> defines citations as <q cite="http://en.wikipedia.org/wiki/Citations">a reference to a published or unpublished source</q>.</p>

<cite> can also be used as a direct reference to a source without any quote content, such as in my previous paragraph where I mentioned the W3C and provided a link to the W3C’s specification:

    <p>Browsers are supposed to render q elements with quotation marks before and after the content. As such, the <cite>W3C</cite> advises authors <a href="http://www.w3.org/TR/html401/struct/text.html#h-9.2.2">against including quotation marks in the content</a> itself.</p>

RENDERING & INTERPRETATION

The default browser rendering of <cite> is often in italics. As for screen readers, they don’t treat content contained by <cite> in any special way.

Enter Microformats

Whenever I see content that includes a name (person, place or organization), I immediately think of the hCard microformat for marking up contact information. Since my LinkedIn recommendation example includes the name and web site of a person, it is a good fit for hCard:

    <blockquote cite="http://www.linkedin.com/in/emilyplewis">
      <p>Emily is hands-down the best XHTML/CSS designer I've known. She consistently cranks out the most semantic, valid XHTML/CSS web designs in amazingly short order. Most importantly, she keeps the user experience and semantics in mind... setting the bar for code quality that has not yet been met even by companies netting $400-800k for web design projects.</p>
      <p>I always compare XHTML mark up and CSS code quality from other companies to Emily's and every time I'm disappointed. If I want the best, lightest, most semantic XHTML designs, I go to Emily. Other companies and individuals might come up with XHTML that validates, but only Emily's is as efficient and as semantic as possible.</p>
      <p class="vcard"><a href="http://www.ianpitts.com/" class="url"><cite class="fn">Ian Pitts</cite></a></p>
    </blockquote>

Don’t know what hCard is? No problem, check out part 3 of my Getting Semantic With Microformats series.

Real-World Applications

Now that you know the fundamental structure and semantic uses of these elements, where might you use them? I’ve already mentioned testimonials and in-text quotes, but how about:

  • <blockquote> for blog comments
  • <blockquote> for excerpts of reviews
  • <blockquote> or even <q> for status updates on social networks
  • <q> for colloquialisms
  • <cite> any time you mention a resource

And I’m sure there are many more. It is really about considering your content and deciding what is semantically appropriate for it.
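Two of those ideas, sketched with the same patterns used earlier in this article. The commenter name, the class name and the text are all invented for illustration:

```html
<!-- A blog comment: the comment body is the quote, the commenter is the source -->
<div class="comment">
  <blockquote>
    <p>Great article! I had no idea the q element existed.</p>
    <p><cite>Jane Doe</cite></p>
  </blockquote>
</div>

<!-- A short status update quoted inline -->
<p><cite>Jane Doe</cite> posted: <q>Just finished part one of the series.</q></p>
```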

Debbie Downer Wants to Know Why

I’m not going to get too much into the specifics of why you should be using these elements for your quotation and referenced content. Again, I refer you to Meaningful Markup for that.

But I’m not clueless about those folks out there who poo-poo these approaches (I did work for a huge corporate monster for five years). Maybe it is the .NET developer on your team who only knows how to (horrifically) use <div>s. Maybe it is the department VP who thinks since her nephew can make web sites, it is simple, straightforward and markup doesn’t matter.

For those folks, I offer the following:

  • Semantic markup supports today’s web standards. And aside from being standards that all web professionals should aim for, they do have ROI.
  • Semantic markup provides the foundation for accessible sites.
  • Semantic markup can provide design patterns to help streamline team development. If everyone knows that <blockquote> with <cite> is to be used instead of a series of <div>s, <span>s, <br />s and the like, less time is needed to not only develop markup, but also to style that markup.
  • Semantic markup can contribute to SEO to help improve your content’s findability.

For Your Consideration

It just wouldn’t be the web we all know and love if there weren’t debates and issues. <q> and <cite> aren’t immune.

No Love for <q>

Some designers, fed up with IE’s lack of delimiting quotation marks for <q>, drop the element entirely from their markup. Some turn to JavaScript or CSS to add the quotation marks.
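One CSS approach, sketched here as an assumption about what such designers do rather than a recommendation: turn off the browser-generated quotation marks everywhere, then type the marks into the content so every browser shows the same thing. Note that this runs against the W3C advice mentioned earlier.

```html
<style type="text/css">
  /* Suppress generated quotation marks in browsers that add them */
  q { quotes: none; }
  q:before, q:after { content: ""; }
</style>

<!-- Quotation marks are now typed by hand, outside the q element -->
<p>Her response? &ldquo;<q>Beauty is pain.</q>&rdquo;</p>
```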

As for me, I don’t sweat it too much. Depending on the client and project, I may serve conditional CSS to IE (versions prior to 8) so that the <q> content appears in italics. That’s about as far as I go. I encourage you to decide for yourself, but first maybe a bit more information to help your decision-making:
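The IE-only italics mentioned above could be served with a conditional comment. This is a minimal sketch, assuming an inline style block rather than a separate stylesheet:

```html
<!--[if lt IE 8]>
<style type="text/css">
  /* IE 7 and earlier add no quotation marks, so set q content off visually */
  q { font-style: italic; }
</style>
<![endif]-->
```

Other browsers treat the whole block as an ordinary comment and ignore it.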

<cite> Is Better?

Also, there are some folks who feel that <cite> can be used similarly to <q> for inline quotes or citations. And since <q> has inconsistent browser rendering, <cite> is the better way to go.

True, the W3C spec states <cite> contains a citation or a reference to other sources, and I can understand that some people think a citation can be an actual quote or excerpt.

But I disagree here. My understanding of citation aligns with Wikipedia’s definition of “a reference to a published or unpublished source.” So, semantically, I see <q> as being specific to inline quotes, referenced content, etc., while <cite> is the source of the referenced content.

Stop Picking Sides and Just Develop

Ultimately, though, I try to avoid getting too wedded to any single approach to anything. The web is constantly evolving and, as such, I try to be the kind of professional that evolves with it.

Plus, I don’t know everything and I’m okay with that. I like to see what other people are doing, think about what makes the most semantic (and usable and accessible) sense for me and my sites, and then just do it. When I encounter a new idea, I consider it and, if necessary, make changes to my approach. Which is one of the primary reasons I started this series.

So, let me know what you think about my approaches to <blockquote>, <q> and <cite>. If you have other ideas, please share them.

Coming Next

I’m still deciding on the topic for part 2. Right now, it is a toss-up between images with captions and accessible <table>s. You’ll just have to stay tuned to find out!

See what I did there? A cliffhanger! How exciting!

The Beauty of Semantic Markup, Part 2: <strong>, <b>, <em>, <i>

AUG 06 2010

So, I had planned to focus the second installment of this series on markup for images with captions. The topic was a request from my friend Ian and his birthday was coming up. However, his birthday has passed, and I’m just now writing. Plus, I’ve been thinking a lot lately about something more fundamental: bold and italicized text.

This may seem a trivial thing to be consuming my “markup mind,” but after Tantek Çelik’s HTML5 presentation, it’s been bugging me. And what specifically has been bugging me is the recommendation of <b> and <i> in HTML5.

Shut the Front Door!

Yep, it is true. <b> and <i> are back and, apparently, more “useful.” And when I first learned this, I was instantly put-off. I come from the “separate content from presentation” school that dropped these two elements in favor of the “more semantic” <strong> and <em>.

At the time when folks were thinking about the structural/semantic markup approach, <b> and <i> were strictly presentational. The HTML 4 spec declared these two as style elements that simply rendered text in bold and italics, respectively. Further, screen readers didn’t differentiate them in any special manner, adding to the logic that they were only useful for visual differentiation.

Conversely, in HTML 4, <strong> and <em> offered meaning, as well as the default visual rendering. Content marked up with <em>, for example, semantically indicated emphasis (and defaulted to italics in visual browsers), while the use of <strong> indicated strong emphasis (and defaulted to bold).
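In markup, the HTML 4 distinction looked something like this (the sentence is made up for illustration):

```html
<!-- Presentational in HTML 4: bold and italics, nothing more -->
<p>Install the <b>latest</b> version, <i>not</i> the beta.</p>

<!-- Semantic in HTML 4: strong emphasis and emphasis -->
<p>Install the <strong>latest</strong> version, <em>not</em> the beta.</p>
```

Both paragraphs typically render the same way in a visual browser; only the second carries meaning.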

Re-Definitions in HTML5

The W3C’s HTML5 recommendation, however, brings us some redefinitions of these elements:

  • The <b> element now represents a span of text to be stylistically offset from the normal prose without conveying any extra importance, such as keywords in a document abstract, product names in a review, or other spans of text whose typical typographic presentation is emboldened.
  • The <i> element now represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, a ship name, or some other prose whose typical typographic presentation is italicized. Usage varies widely by language.
  • The <strong> element now represents importance rather than strong emphasis.

<em>, meanwhile, isn’t featured on the list of changed elements in HTML5. Although, the working draft does seem to define it slightly differently than the previous spec: as emphatic stress rather than just emphasis.
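Read against those definitions, the four elements might be applied along these lines. This is my reading of the draft rather than an example taken from it, and the product name and sentences are invented:

```html
<!-- b: a product name, offset without extra importance -->
<p>The <b>FooPhone 3</b> arrived in the mail on Monday.</p>

<!-- i: an idiomatic phrase from another language -->
<p>Her motto has always been <i lang="la">carpe diem</i>.</p>

<!-- strong: importance -->
<p><strong>Warning:</strong> back up your files before upgrading.</p>

<!-- em: emphatic stress that changes how the sentence is read -->
<p>I said back up your files <em>before</em> upgrading, not after.</p>
```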

Building an Argument

Upon first consideration, I was totally cool with the modified definitions of <strong> and <em> (although, admittedly, slightly confused as to what “emphatic stress” meant), but I still felt <b> and <i> were presentational. I mean, the W3C even uses “stylistically” and “typographic presentation” in its definitions for those elements.

But, thanks to this new series, I got to doing some research. And when I started, I was aiming to build an argument against <b> and <i>. First, I wanted to find out how screen readers treated these guys.

Screen Readers

As it turns out (and as I expected), the two most popular screen readers don’t, by default, read content contained by these elements any differently than other content.

What I didn’t expect to discover, though, is that they also don’t treat <strong> and <em> in any special way.

There goes the main “they aren’t accessible” argument I was hoping for. None of the tags seems to offer any special accessibility to screen reader users.

Search Engine Optimization

So then I began a hunt for an SEO argument. Somewhere in the dusty annals of my mind, I recalled reading that Google paid particular attention to content contained by <strong>.

Turns out, I was wrong yet again. In fact, at one point, Google gave greater weight to content marked up with <b>, not <strong>.

As of today, though, the search engine gives equal weight to <strong> and <b>, as well as <em> and <i>.

Crap. I thought I had all this ammunition against <b> and <i>, when what I really had were outdated and incorrect notions.

Forget the Argument, Focus on Semantics

I’ve said it before, but apparently I need to listen to my own advice about being too wedded to a particular semantic point of view … especially when operating with wrong assumptions. Time to focus on the entire point of this series: semantics.

So, let’s take a closer look at <b> and <i> in HTML5.

Presentation via CSS

In addition to the definitions I shared above, the HTML5 draft also specifies that CSS should control the presentation of <b> and <i>; that neither will, necessarily, appear in bold or italics by default.

Of course, this ultimately comes down to what the browser makers do, but this is a good clarification that these elements are no longer exclusively presentational in nature.

Further, the draft encourages authors to use the class attribute to define why a <b> or <i> element is used in order to allow for unique styling of different implementations.

Consider the new definition of assigning <b> to keywords and product names to offset those terms without adding importance. By extending <b> with class="keyword" or class="product" (or some other equally semantic values), you have your CSS hooks to give each a unique presentation and you are also adding meaning to your markup (kinda like how microformats work).
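A minimal sketch of those two class values in use. The style rules, colors and product name are assumptions chosen only to show that each class can be styled differently:

```html
<style type="text/css">
  /* Same element, different reasons for using it, different presentation */
  b.keyword { font-weight: normal; background: #ffc; }
  b.product { font-weight: bold; }
</style>

<p>Keywords: <b class="keyword">semantics</b>, <b class="keyword">markup</b></p>
<p>I tested the <b class="product">FooPhone 3</b> for a week.</p>
```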

Same is true for applying <i> to taxonomy terms, idioms, phrases in another language and the like. Specifying the “why I’m using <i>” via class offers potential for both styling and semantics.
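The same idea for <i>, again as a sketch. The class names other than taxonomy, the style rules and the sentences are all hypothetical:

```html
<style type="text/css">
  /* Each class records why i was used and can be styled on its own */
  i.taxonomy { font-style: italic; }
  i.idiom    { font-style: italic; }
  i.thought  { font-style: normal; color: #666; }
</style>

<p>The domestic cat is <i class="taxonomy">Felis catus</i>.</p>
<p>The dessert had a certain <i class="idiom" lang="fr">je ne sais quoi</i>.</p>
<p><i class="thought">Did I leave the stove on?</i> she wondered.</p>
```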

Common Typographic Conventions

Even with these caveats, though, I can’t help but still think about the presentational nature of the definitions in HTML5. As I mentioned, stylistically offset just screams presentation to me.

But then I started thinking about how bold and italicized text is commonly used in print. Yes, they do offer visual indicators, but more often (at least in my experience), text offset with italics or bold does convey meaning, especially when considered in context.

Latin words, inner dialog or thoughts, titles of songs … I often see this type of content italicized in print. And, in context, I recognize the additional meaning the italics provide the content.

Media Independence

In HTML5, <b> and <i> are explicitly media independent. Essentially, because each element is no longer tied to bold or italics (visual presentation), the new semantic meaning they offer is available to non-visual browsers.

Again, it is up to those browser and screen reader makers to take advantage of that meaning, but media independence further supports the new semantic direction of these elements.

Warming Up

With all this additional information, I’m warming up a bit to using <b> and <i> again. But I’d be lying if I said I was completely comfortable.

<b> and <i> have historic ties to the notions of bold and italic. I mean, that’s what “b” and “i” represent.

Why a new element wasn’t introduced that is independent of this presentational history bugs me a bit. But, then again, using what people are already familiar with isn’t always a bad thing.

Still, I worry that people will use these elements for presentational purposes. Or that folks won’t apply the recommended class values to differentiate instances of these elements.

I can’t help but think that this is just a big can of worms that will get messy if markup authors don’t understand and apply the spec properly. And let’s not even talk about the “challenges” that could result from what browser makers will end up doing or not doing.

Practical Usage

Aside from my concerns, I do want to give some thought to how I would actually use <b> and <i>, now that they are semantic. And, of course, what roles <strong> and <em> will play in my markup.

<strong>

Even with the realigned definition of <strong> in HTML5, I plan to use it as I always have, because I never really thought of it as strong emphasis. I always used it as it is now defined: indicating importance.

For my projects, the types of content I commonly mark up with <strong> include:

  • Alerts
  • Warnings
  • Reminders
  • Important content (duh)

For example:

    <p><strong>Registration is required</strong> for this event.</p>

or

    <p>The presentation begins at <strong>6:30 pm</strong>, so be sure to show up a few minutes early to avoid interrupting our speaker.</p>

or

    <p><strong>Password provided for this username is incorrect.</strong> Please try again or you may request your password be emailed to you.</p>

I don’t think there is a hard-and-fast rule about applying <strong>. To me, it is more about content. What is important? Is it the time a presentation starts, or is it the reminder to arrive early?

And this is what I dig about semantic markup. Focusing on content.

<em>

Like <strong>, I pretty much plan to use <em> the same as I always have. Even with the new (slightly unclear) definition of emphatic stress, <em> still means, to me, stressed content. As in content that I would verbalize in a stressed tone to indicate emphasis.

And because I write the way I talk (with lots of stressed words), I use this element often in my content:

    <p>Talking about microformats in less than 30 minutes (plus leaving time for questions) was <em>quite</em> a challenge.</p>

or

    <p>You can use the cite attribute with &lt;q&gt; to indicate the source of a quote, <em>if</em> it's online.</p>

It is really a matter of knowing the content well enough to know what terms and/or phrases should be emphasized in this fashion.

<b>

To be honest, based on the HTML5 definition of <b>, I’m not sure how often I’ll actually use it. The draft suggests it can be used with product names and keywords, but I, personally, don’t see a need to differentiate this type of content in any way.

Of course, a client might feel differently. Perhaps a client might want all of the product names on their site to appear stylistically offset. So, in that situation, I would use it and take advantage of the recommended application of class to indicate the purpose of the element:

    <p>For data management, we offer two flagship products: <b class="product">Moxie</b> and <b class="product">Mojo</b>.</p>

And if that same client also wanted to highlight keywords associated with their products, I might:

    <p><b class="product">Moxie</b> offers users the ability to <b class="keyword">cleanse</b>, <b class="keyword">extract</b> and <b class="keyword">transform</b> data.</p>

Meanwhile, in my CSS, I would style .product and .keyword in some fashion, likely both unique.
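
As a rough sketch of what that CSS might look like: the .product and .keyword selectors come from the text above, but the declarations themselves are only my illustration, not anything the HTML5 draft prescribes.

```html
<style>
  /* Product names: offset with weight and small caps */
  b.product {
    font-weight: bold;
    font-variant: small-caps;
  }

  /* Keywords: drop the default bold and offset with a background instead */
  b.keyword {
    font-weight: normal;
    background-color: #ffffcc;
  }
</style>

<p><b class="product">Moxie</b> offers users the ability to <b class="keyword">cleanse</b>, <b class="keyword">extract</b> and <b class="keyword">transform</b> data.</p>
```

Because each use of <b> carries its own class, either presentation can change later without touching the markup.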

Also, HTML5 does specify that <b> can be used simply to indicate text that needs unique styling, such as those typographic conventions of drop caps and paragraph leads:

    <p><b>I</b>t was a cold and rainy night.</p>

Although, I’m not sure I would favor this approach over using :first-letter in my CSS (like I already do on this blog). But I guess I could see it for styling a paragraph lead uniquely:

    <p><b>It was a cold and rainy night,</b> despite what the weatherman had announced on the evening news. Bob was annoyed his stargazing plans were in danger from the looming storm.</p>
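
For comparison, here is a minimal sketch of both options side by side: a drop cap handled entirely in CSS with :first-letter, and a paragraph lead marked up with <b>. The class names (opening, lead) and the specific declarations are hypothetical, chosen only for illustration.

```html
<style>
  /* Drop cap via CSS alone: no extra element around the first letter */
  p.opening:first-letter {
    float: left;
    font-size: 3em;
    line-height: 1;
    padding-right: 0.1em;
  }

  /* Paragraph lead marked up with <b>; the class records why the element is used */
  b.lead {
    font-weight: normal;
    font-variant: small-caps;
    letter-spacing: 0.05em;
  }
</style>

<p class="opening">It was a cold and rainy night.</p>

<p><b class="lead">It was a cold and rainy night,</b> despite what the weatherman had announced on the evening news.</p>
```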

Still, even after considering those scenarios, I’m frankly not convinced <b> is going to be a regular element in my arsenal.

<i>

As for <i>, I can see using it much more often than <b>. Particularly for technical, legal or medical terms, as well as foreign language phrases:

    <p>A <i>patent foramen ovale</i> is a congenital defect between the two upper chambers of the heart.</p>

or

    <p>I try to live my life according to the axiom, <i>illegitimi non carborundum</i>.</p>

FOREIGN LANGUAGES

Since I used a Latin phrase in the last example, now might be a good time to address use of the lang attribute. HTML5 Doctor provides an excellent article on the same topics I’m covering here.

In their examples of using <i> for foreign language phrases, they apply the lang attribute to indicate which foreign language is being referenced. For example:

    <p>Mix baking soda and vinegar together, and <i lang="fr">voilà</i>, you get a cool chemical reaction.</p>

However, another article on the topic, Using <b> and <i> elements, warns against this approach:

… the language attribute only describes the language of the text, not the meaning. It is possible that you will want to style text in a different language differently according to the context in which it is used, either now or in the future.

As for me, I think that if I do use <i> for foreign phrases, I’ll likely skip the lang attribute and rely on class for any special styling.
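
To make the difference concrete, here is a small sketch of the class-based approach, followed by a variant that keeps lang alongside it, since nothing in the markup prevents using both. The foreign-phrase class name is my own assumption, not a value recommended by either article.

```html
<style>
  /* Style by purpose (a hypothetical class name), not by language */
  i.foreign-phrase {
    font-style: italic;
  }
</style>

<!-- Class only: the styling hook describes why <i> is used -->
<p>Mix baking soda and vinegar together, and <i class="foreign-phrase">voilà</i>, you get a cool chemical reaction.</p>

<!-- Class plus lang: the same hook, with the language of the text also declared -->
<p>Mix baking soda and vinegar together, and <i class="foreign-phrase" lang="fr">voilà</i>, you get a cool chemical reaction.</p>
```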

Exercise Discretion

While I’m admittedly still a bit on the fence about actually using <b> and <i> regularly, you may feel differently and want to start marking up right away. If that is the case, please use these elements intelligently and correctly.

Don’t just apply <b> because you need a bold effect and you are feeling lazy. Don’t use <i> for a publication title, when <cite> may be the appropriate element (see part 1 of this series for more on <cite>).

Even the HTML5 draft recommends discretion:

… authors are encouraged to consider whether other elements might be more applicable than the i element, for instance the em element for marking up stress emphasis, or the dfn element to mark up the defining instance of a term.
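
A quick sketch of how that discretion might play out in markup. The sentences are illustrative only, and the medical-term class is a hypothetical name.

```html
<!-- Stress emphasis: <em>, not <i> -->
<p>You <em>must</em> register before the event.</p>

<!-- Defining instance of a term: <dfn>, not <i> -->
<p>A <dfn>drop cap</dfn> is an enlarged first letter at the start of a paragraph.</p>

<!-- Title of a work: <cite>, not <i> -->
<p>My book, <cite>Microformats Made Simple</cite>, is long since finished.</p>

<!-- Technical term offset from the surrounding prose: <i> is a reasonable fit -->
<p>A <i class="medical-term">patent foramen ovale</i> is a congenital defect between the two upper chambers of the heart.</p>
```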

Go Forth & Experiment

After gathering all this information, I had hoped to have a firm conclusion about <b> and <i>. Alas, I don’t. So, what I shall do is try different approaches and see how they work for me, my sites and my clients.

I have some clients who I know won’t take the time to add the extra markup for something like <b>, while some clients may embrace that extra level of control. And I have some CMS implementations that currently aren’t configured in a way that will easily allow the addition of <i>.

And then there’s still that little voice in my head that can’t seem to fully accept <b> and <i> as semantic elements.

Only time and practice will tell how big a role <b> and <i> will play for me. Until then, I’m eager to hear your thoughts!

The Beauty of Semantic Markup, Part 3: Headings

NOV072010

I always find myself drawn to fundamental concepts, because they can be deceptively simple. Headings are like that. You know, <h1> through <h6>.

They seem simple until you take time to think … think about structure, semantics, accessibility, search engines and, now, HTML5’s sectioning model.

And I have, indeed, been thinking about headings lately, especially as I dive into HTML5 and (re?)consider the approaches I’ve taken in the past.

So this series now shifts focus to <h1>, <h2>, <h3>, <h4>, <h5> and <h6>.

Headings for Outlines

The semantic purpose of headings is to indicate a content outline; a structure:

A heading element briefly describes the topic of the section it introduces. Heading information may be used by user agents, for example, to construct a table of contents for a document automatically.

— W3C

You can even see this heading-based outline using the W3C’s validator service, if you have “Show Outline” selected (note: does not work with HTML5 doctype):

[Image: W3C Validator outline option]

For example, here’s the heading outline for one of my recent blog posts:

[Image: Heading outline for A Blog Not Limited post]

Looking at this three-year-old markup now, I wouldn’t take the exact same approach today, but the gist is there. My blog name is the first heading, with all the other headings “nested” hierarchically after.

Of course, not all sites are going to have a heading hierarchy, such as one with a columnar layout, where the most important heading (<h1>) appears after, for example, an <h2>:

    <div>
      <h2>Quick Links</h2>
    </div>
    <div>
      <h1>Site Name</h1>
    </div>
    <div>
      <h2>Search</h2>
    </div>

But even in this example, the headings are still used to convey a content structure.

Best Practices & Debates

As far as I can glean, the “best practice” for indicating content structure is simply to use <h1> for the most important information, <h2> for less important information and so on. Also, it is probably best not to skip any heading levels, such as going from <h1> to <h3>. But that’s it.

For quite a long time, many folks believed there should be only one <h1> on a page, despite the fact that this is not part of the specification. I happened to be one of those people and, if I recall correctly, my reasoning for this “logic” was based on an assumption about search engine penalties.

Google now refutes this misperception, but does advise the judicious use of <h1>. Which leaves the argument that the reason for only one <h1> is that there can only be one “most important” heading on a page.

Traditionally, I’ve also agreed with this thinking. But, as you’ll see later in this article, HTML5 has me thinking very differently about <h1>s. HTML5 aside, though, I’m still inclined towards the one <h1> approach … which brings up yet another debate (don’t you love our little industry?).

This debate assumes a single <h1>, but questions what content should be inside that <h1>. Site name? Company name or logo? Page title?

As you can see from this blog (as well as pretty much every other project I’ve marked up) I’ve been on the side of the site name, which is often the company name. I’ve never used <h1> for a logo, and I can’t say I even understand that approach. <h1> is for text. A logo is not text. There’s no argument there for me.

But I’m now appreciating the logic that <title> is, semantically, the appropriate element for the site name, while <h1> may be more useful for the page heading.
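
Here is a minimal sketch of that approach. The order of page title and site name inside <title>, and the site-name class, are my own assumptions for illustration; the point is only that the site name lives in <title> while the page heading gets the <h1>.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- Site name carried by <title>, alongside the page title -->
  <title>The Beauty of Semantic Markup, Part 3: Headings | A Blog Not Limited</title>
</head>
<body>
  <!-- Site name shown as plain text rather than as a heading -->
  <p class="site-name">A Blog Not Limited</p>

  <!-- The page heading gets the <h1>, with no skipped levels below it -->
  <h1>The Beauty of Semantic Markup, Part 3: Headings</h1>
  <h2>Headings for Outlines</h2>
  <h2>Headings for Navigation</h2>
</body>
</html>
```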

Headings for Navigation

A wonderful result of using headings to indicate content structure is that it aids navigation. Users can scan headings on a web browser to more quickly find the information most important to them. Even non-browser users can take advantage of headings for this purpose, as many assistive technologies leverage the outline to navigate.

The JAWS screen reader, for example, lets users navigate the page by jumping from heading to heading:

[Video: screen reader demonstration of heading navigation]

This demonstration alone confirms for me that using <h1> for page headings is probably the best way to go. I imagine it gets old fast hearing the site name repeated because it is contained by an <h1>. But that’s just my own personal decision (though a good one, I suspect).

In terms of “best practices” to support accessible navigation, as long as you are focusing on content structure, you are probably good. Regarding multiple <h1>s, there is no definitive answer about how it affects accessibility. Anecdotally, it could cause some screen reader users to miss key content.

Regarding heading hierarchy, WCAG 2.0 accepts both nested (where <h3> follows <h2>, which follows <h1>) and non-hierarchical headings. (And, in case you were wondering, Google doesn’t mind non-hierarchical headings either.)

Headings for SEO?

Speaking of Google (how’d you like that segue?) … headings have historically been heralded as helping SEO. In fact, in the above image of my blog outline, you’ll see that I strayed from the semantic, outline-focused approach with the use of <h2> for my blog’s “tagline.” This is because at some point in time (years ago) I heard that search engines favored headings with keywords, so I felt the semantic “bending” was worth it.

What now seems more accurate is that search engines use headings the same way that people do: to discern important content and understand content hierarchy. Both Google and Yahoo! advise authors to write headings with this approach.

The question that matters to me, though, is do search engines give greater weight to heading content? No idea. There are thousands (perhaps millions) of articles that say headings carry greater weight, but I could find nothing definitive from the major search engines.

So, what’s my verdict? Today, I don’t think I would use a heading for a site tagline just to achieve SEO. I suspect that the search engines have such sophisticated algorithms that a single heading to expose a few keywords isn’t going to help me in the rankings. And if it hinders accessible navigation by “confusing” the content outline, then it just isn’t worth it to me.

Um, Isn’t This Old News?

Maybe. This might be old news to you, and awesome if it is. That means you already take a thoughtful approach to markup, and we would probably be best of friends.

But it wasn’t all old news to me. I never took time to consider the appropriate use of <h1>. I had outdated assumptions about SEO and headings. And, while I knew about screen reader navigation, I never took the time to actually watch someone use a screen reader on a site without headings (you really must watch that video above).

Then I started messing around with HTML5, and an entirely new world of possibility opened, forcing me to make sure I understood how and why to use headings. Hence, this post.

HTML5 Sections & Outlines

So back to this new world. HTML5 is pretty cool, especially if you are a POSH lover like me. It gives markup authors a broader semantic arsenal to work with (if you haven’t yet, pick up a copy of Jeremy Keith’s HTML5 for Web Designers to get up-to-speed).

New Semantic Elements

One of the many things HTML5 brings to the table is a new outline algorithm. This is based on the new semantic, structural elements:

  • <section> is used for content that can be grouped thematically. A <section> can have a <header>, as well as a <footer>. The point is that all content contained by <section> is related.
  • <header> typically contains the headline or grouping of headlines for a page and/or <section>s, although it can also contain other supplemental information like logos and navigational aids.
  • <footer> is used for content about a page and/or <section>s, such as who wrote it, links to related information and copyrights.
  • <nav> is used to contain major navigation links for a page. While it isn’t a requirement, <nav> will often be contained by <header>, which, by definition, contains navigational information.
  • <article> is used for content that is self-contained and could be consumed independent of the page as a whole, such as a blog entry. <article> is similar to <section> in that both contain related content. The best rule of thumb for deciding which element is appropriate for your content is to consider whether the content could be syndicated. If you could provide an Atom or RSS feed for the content, <article> is most likely the way to go.
  • <aside> indicates the portion of a page that is tangentially related to the content around it, but also separate from that content, such as a sidebar or pull-quotes. A good method for deciding whether <aside> is appropriate is to determine if your content is essential to understanding the main content of the page. If you can remove it without affecting understanding, then <aside> is the element to use.

More In-Depth Outlines

These new elements provide authors a way to explicitly group content, and each has its own self-contained outline, so you don’t have to follow a page-focused heading hierarchy. Instead, you can start with <h1> within each element, and the algorithm uses the hierarchy and nesting of the sectioning elements to determine the outline level of each <h1>.

Wait! What?

Yeah, that’s what I said when I first learned this. The best way to grok this, I think, is with an example:

    <header>
      <h1>Blog Archive</h1>
    </header>
    <section>
      <h1>Posts by Month</h1>
      <article>
        <h1>Blog Post Title</h1>
      </article>
      <article>
        <h1>Another Blog Post Title</h1>
      </article>
    </section>
    <aside>
      <h1>Popular Posts</h1>
    </aside>

The HTML5 outline algorithm, then, gives us:

  • Blog Archive
    • Posts by Month
      • Blog Post Title
      • Another Blog Post Title
    • Popular Posts

If this multiple <h1> approach was used with previous versions of HTML, the outline would be inaccurate:

  • Blog Archive
  • Posts by Month
  • Blog Post Title
  • Another Blog Post Title
  • Popular Posts

<hgroup>

HTML5 also introduces a new element, <hgroup>, which can be used to suppress headings from the content outline. A specific situation in which this would be useful is on this very blog, where I’m using an <h2> for my tagline. As I mentioned, I now think this idea wasn’t the best because it could adversely affect accessible navigation.

But, by using <hgroup>, all headings after the first child are ignored by the content outline:

    <header>
      <hgroup>
        <h1>A Blog Not Limited</h1>
        <h2>to web design, standards &amp; semantics</h2>
      </hgroup>
    </header>

BENEFITS …

While it remains to be seen whether this will benefit my clients and projects, there is some sound reasoning behind this new approach to sectioning content and outlines.

First, with self-contained outlines, you can have an infinite number of heading levels. You are no longer limited to 6 (<h1> through <h6>). For a deep site, I could see this being useful. Not so much with shallower levels of information.

Second, self-contained content with independent heading hierarchies enables portable content. Consider a blog post that often appears on the home page, as well as its own page, and can even be syndicated or shared with other sites. Before, you would often have to modify the heading markup for the blog post depending on where it appeared. Now with the HTML5 outline algorithm, you can independently define the markup for the blog post without regard to where it may appear on a site (or another site).

… BUT BE AWARE

Perhaps, over time, these admittedly practical benefits may be worth taking full advantage of headings in HTML5. For now, though, there are a few issues. As of now, browsers don’t support this new outline algorithm. If you want to see the outline of an HTML5 page, you have to use an external tool.

It’s not surprising, then, to know that assistive technologies aren’t supporting this algorithm either. Which, for me, is my biggest concern. If I start with <h1>s in each container of related content, that is going to cause major problems with heading-based navigation in text browsers and screen readers.

Middle Ground?

Fortunately, HTML5 is backwards-compatible and flexible, so it isn’t an either/or proposition. You can still approach headings from a page-level hierarchy and use the new HTML5 elements to group content. The spec even says authors can start sections with either <h1> elements or headings reflective of the section’s nesting level. So that’s what I plan on doing, at least until assistive technologies catch up.
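
As a sketch of that middle ground, here is the earlier Blog Archive example reworked so each heading reflects its section’s nesting level. The heading ranks are my own mapping of that example; the new sectioning elements still group the content, while the headings keep a page-level hierarchy.

```html
<header>
  <h1>Blog Archive</h1>
</header>
<section>
  <!-- One level down from the page heading -->
  <h2>Posts by Month</h2>
  <article>
    <!-- Nested inside the section, so one level deeper again -->
    <h3>Blog Post Title</h3>
  </article>
  <article>
    <h3>Another Blog Post Title</h3>
  </article>
</section>
<aside>
  <h2>Popular Posts</h2>
</aside>
```

The heading ranks here match the nesting shown in the HTML5 outline above, so tools that only read <h1> through <h6> should see the same structure.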
