Meta tags are an afterthought in WordPress. Even most meta tag plugins are limited in their application or focused on search engine optimization (SEO). My goal is to work through the problem from core principles and see where that takes me. This requires answering three questions:
The traditional — and wrong — answer to the question of why use meta tags at all is for search engine optimization (SEO). The more interesting answer is for content management.
Search engine optimization
At one point the meta keywords tag was considered crucial for improving search engine rankings. Abuse of the tag quickly pushed it out of Google’s algorithm. On its Webmaster Central Blog Google states this plainly (September 2009):
Our web search (the well-known search at Google.com that hundreds of millions of people use each day) disregards keyword meta tags completely. They simply don’t have any effect in our search ranking at present.
In SEOmoz.org’s Search Engine Ranking Factors 2009, use of the meta keywords tag is listed as of very minimal importance. Use of the meta description tag is regarded slightly more favorably, but only as a way to tailor an entry on a search engine results page (SERP).
One meta tag that does matter to search engines is the meta robots tag. Used in conjunction with the site-level robots.txt file, this tag controls which pages and images are indexed by a search engine robot. Vanessa Fox’s comprehensive article, Managing Robot’s Access to Your Website (June 2008), provides details on the entire topic. Google has a simple overview of how Googlebot responds to the tag (March 2007), starting with this:
By default, Googlebot will index a page and follow links to it.
One oft-stated reason to exclude Googlebot from following certain links in a blog is to ensure that posts are only located via a single URL. If a post has more than one URL, its PageRank may be diluted, especially if different URLs are used by incoming links. Google, however, now advises against blocking duplicate URLs:
We now recommend not blocking access to duplicate content on your website, whether with a robots.txt file or other methods. Instead, use the rel="canonical" link element, the URL parameter handling tool, or 301 redirects. If access to duplicate content is entirely blocked, search engines effectively have to treat those URLs as separate, unique pages since they cannot know that they’re actually just different URLs for the same content. A better solution is to allow them to be crawled, but clearly mark them as duplicate using one of our recommended methods.
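As a rough illustration of the two elements discussed above, here is a minimal PHP sketch that builds a meta robots tag and a canonical link element as strings. The function names and the example URL are hypothetical, made up for this sketch; inside WordPress you would likely escape with esc_attr() rather than htmlspecialchars().

```php
<?php
// Hypothetical helpers; these names are illustrative, not WordPress APIs.

// Build a robots meta tag from a list of directives such as "noindex" or "follow".
function sketch_robots_meta(array $directives): string
{
    $content = htmlspecialchars(implode(',', $directives), ENT_QUOTES, 'UTF-8');
    return '<meta name="robots" content="' . $content . '" />';
}

// Build a canonical link element that points duplicate URLs at the preferred one.
function sketch_canonical_link(string $url): string
{
    $href = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    return '<link rel="canonical" href="' . $href . '" />';
}

// Example values are placeholders.
echo sketch_robots_meta(['noindex', 'follow']), "\n";
echo sketch_canonical_link('http://example.com/2009/09/sample-post/'), "\n";
```

Both helpers only return markup, so the same strings can be echoed from a header template or from a plugin.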
Beyond meta tags per se, three metadata properties of a web page relate to SEO: the page title (as set by the title tag), the URL, and the canonical URL element mentioned above. SEOmoz ranks the use of search keywords in the title tag as the most important “on-page” ranking factor and the use of keywords in the domain name as third. The use of keywords in subdomain, URL folder, and file names is of minimal importance.
Warning: SEO information ages very quickly, which is why I’ve dated my references.
A second reason to use meta tags is to attach bibliographic information to a page. Because XHTML standards do not restrict the values of the meta tag’s name and content attributes, individuals and institutions can adopt any number of alternate tagging schemes to classify content.
One widely known tagging scheme comes from the Dublin Core Metadata Initiative (DCMI), which provides a specialized set of metadata terms for RDF/XML or HTML/XHTML content. Dublin Core elements cover such specifics as title, subject, creator, publisher, and so on. While Dublin Core meta tags can be adopted by anyone, I think it is safe to say that the DCMI is targeted toward research librarians and editors of scholarly publications who need to share bibliographic data on a professional basis.
If you do want to try Dublin Core tagging, the documents Using Dublin Core (specifically Section 4. Dublin Core Elements), and Expressing Dublin Core metadata using HTML/XHTML meta and link elements provide a good starting point.
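To give a feel for what Dublin Core tagging looks like in a page header, here is a small PHP sketch that renders a few elements as meta tags. It assumes the convention of declaring the "DC." prefix with a schema link element before the tags; the function name and the sample values are hypothetical.

```php
<?php
// Render Dublin Core elements as meta tags, preceded by the schema link
// that declares what the "DC." prefix refers to.
function sketch_dublin_core_tags(array $elements): string
{
    $lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />'];
    foreach ($elements as $name => $value) {
        $lines[] = sprintf(
            '<meta name="DC.%s" content="%s" />',
            htmlspecialchars((string) $name, ENT_QUOTES, 'UTF-8'),
            htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8')
        );
    }
    return implode("\n", $lines);
}

// Sample values are placeholders, not taken from a real page.
echo sketch_dublin_core_tags([
    'title'   => 'Sample Post',
    'creator' => 'Jane Author',
]), "\n";
```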
You can also create and apply your own tagging schemes. Custom name:value pairs may help an internal search engine find pages through controlled terms, help web administrators track the output of a CMS or multiple-author blog, or allow the creative repackaging of site content.
The New York Times web site provides a perfect example. Here is a portion of the meta tag code from a sample article, “Chamberlain, Still a Riddle, Is Battered by the Mariners”:
<meta content="With the playoffs looming, the Yankees are concerned about enigmatic starter Joba Chamberlain, and find themselves wondering just how much progress he has made." name="description"/>
<meta content="Baseball,Seattle Mariners,New York Yankees,Chamberlain Joba,Girardi Joe" name="keywords"/>
<meta content="" name="misspelling"/>
<meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type"/>
<meta content="NOARCHIVE" name="ROBOTS"/>
<meta content="September 21, 2009" name="DISPLAYDATE"/>
<meta content="Chamberlain, Still a Riddle, Is Battered by the Mariners" name="hdl"/>
<meta content="Wrong Club Left Baffled by Chamberlain" name="hdl_p"/>
<meta content="By TYLER KEPNER" name="byl"/>
<meta content="With the playoffs looming, the Yankees are concerned about enigmatic starter Joba Chamberlain, and find themselves wondering just how much progress he has made." name="lp"/>
<meta content="The New York Times" name="cre"/>
<meta content="NewYork" name="edt"/>
<meta content="20090921" name="pdate"/>
<meta content="" name="ttl"/>
<meta content="" name="virtloc"/>
<meta content="Baseball" name="des"/>
<meta content="Chamberlain, Joba;Girardi, Joe" name="per"/>
<meta content="Seattle Mariners;New York Yankees" name="org"/>
A quick scan reveals the use of the meta description and meta keywords tags in standard form, plus many custom tags, including:
- hdl for headline
- hdl_p for print headline
- byl for byline
- cre for creator
- pdate for print date
- per for persons
- org for organizations
The value of these tags becomes apparent when you look at the New York Times developer’s tools (see my post The Times Goes Google on Us on Information Design Watch). Independent developers that use the publisher’s various APIs retrieve this metadata along with article text. For example, artist Jer Thorp’s visualization NYTimes: Sex & Scandal since 1981 (one of his New York Times Visualizations) utilizes the meta org tag identified above.
Thorp explains (in the comments):
The branching segments are “org facets” – organizations which were associated with the stories that were found in the keyword search. This is one of the nicest things about the NYTimes API – you can ask for and process all kinds of interesting information past the standard “how many articles?” queries.
Other developers have used the NYTimes API to build faceted search interfaces and pull content into topic-specific knowledge bases. Internally, New York Times staff can use the tags to look for patterns in article access and popularity.
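To show how custom tags like per and org could be consumed, here is a rough PHP sketch that reads name and content pairs from a header fragment and splits a semicolon-delimited value into a list. This is not how the NYTimes API works; it simply parses the markup shown earlier. The function names are hypothetical, and the regular expressions assume double-quoted attributes, so a real HTML parser would be sturdier.

```php
<?php
// Collect name => content pairs from the meta tags in an HTML fragment.
// Tags without a name attribute (such as http-equiv tags) are skipped.
function sketch_read_meta_tags(string $html): array
{
    $tags = [];
    preg_match_all('/<meta\b[^>]*>/i', $html, $matches);
    foreach ($matches[0] as $tag) {
        // Gather every attr="value" pair in this tag, whatever the order.
        preg_match_all('/([\w-]+)="([^"]*)"/', $tag, $attrs, PREG_SET_ORDER);
        $pairs = [];
        foreach ($attrs as $attr) {
            $pairs[strtolower($attr[1])] = html_entity_decode($attr[2], ENT_QUOTES, 'UTF-8');
        }
        if (isset($pairs['name'], $pairs['content'])) {
            $tags[$pairs['name']] = $pairs['content'];
        }
    }
    return $tags;
}

// Split a semicolon-delimited value into a list, dropping empty entries.
function sketch_split_facet(string $value): array
{
    return array_values(array_filter(array_map('trim', explode(';', $value)), 'strlen'));
}

$html = '<meta content="Chamberlain, Joba;Girardi, Joe" name="per"/>'
      . '<meta content="Seattle Mariners;New York Yankees" name="org"/>';
$tags = sketch_read_meta_tags($html);
print_r(sketch_split_facet($tags['org']));
```

Splitting on semicolons rather than commas matters here because the per values themselves contain commas.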
In the most reductive terms, a web site is composed of two types of pages: unique content pages and the index pages that link or aggregate them. In this crude bifurcation, unique content pages include forms. A search form is content. Search results are an index.
Index pages may, of course, have unique content of their own while content pages may incorporate static or dynamic lists of links.
In a blog context, these two page types are expressed as follows (with the most common default WordPress template name or names in parentheses):
Unique Content Pages
- Individual post (single.php)
- Individual page (page.php)
- Individual image or media page (image.php, video.php, audio.php, application.php)
- Individual comments popup page (comments-popup.php)
- HTTP 404: Not Found page (404.php)
- Search page (requires custom page template)
Index Pages
- Home page (index.php)
- Category archive (category.php)
- Date archive (archive.php)
- Tag archive (tag.php)
- Author archive (author.php)
- Search results (search.php)
- Site map (requires custom page template)
For either SEO or bibliographic concerns, a meta tagging scheme must first focus on correctly categorizing unique content pages. Any number of meta tags might apply to posts and pages, starting with description and including author, copyright, publication date, and so forth.
For index pages, meta tags should describe the page’s function and scope. The blog home page may use the blog name and description; a category page may use the category name and description; and so forth. Other meta tags may come into play depending on the source and desired use of the index.
Both types of pages should have a title and URL that identifies their content or purpose. In the case of title tags this is simply good practice. In the case of page URLs, getting a domain name that incorporates prime search keywords for a new site may be difficult or impossible, but subdomains and permalinks are generally configurable.
First things first. If you want to incorporate post and page titles into WordPress permalinks, go to the Settings page of the Admin interface and set Permalinks to use the Day and Name or Month and Name setting. Done.
After that, adding metadata to WordPress pages requires template and PHP work.
You can directly edit the templates of a WordPress theme. Or, you can use a plugin to insert information into pages as they are rendered. A good plugin can do everything a template approach does and more. However, the template approach has the advantage of a certain immediacy and provides a good introduction to the type of functions that might also be used by a plugin.
Tagging WordPress templates
WordPress is designed to present different page types by predefined default templates. In most cases, WordPress will step from specific to general templates until it finds one that applies. For example, if an archive of posts by author is requested, WordPress will look first for author.php, then archive.php, then index.php. This is described by the Template Hierarchy page of the WordPress Codex. As a result, you can code meta tags into any default template that contains its own header.
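The specific-to-general lookup can be pictured with a short PHP sketch. This is a simplified illustration rather than the actual WordPress loader: it takes an ordered list of candidate templates and returns the first one the theme provides. The function name and the example theme file list are hypothetical.

```php
<?php
// Return the first candidate template that exists, mimicking the
// specific-to-general lookup described above.
function sketch_resolve_template(array $candidates, array $available): ?string
{
    foreach ($candidates as $template) {
        if (in_array($template, $available, true)) {
            return $template;
        }
    }
    return null; // no candidate found
}

// Author archive lookup order as given in the text.
$candidates = ['author.php', 'archive.php', 'index.php'];
// Hypothetical theme that ships no author.php.
$theme_files = ['archive.php', 'index.php', 'single.php'];

echo sketch_resolve_template($candidates, $theme_files), "\n"; // archive.php
```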
In many WordPress themes, different page templates share a common header template, header.php. To customize the header for different types of pages, WordPress provides some 30 conditional tags as described by the Conditional Tags page of the WordPress Codex. Some of these are useful to meta tagging and some are not.
Furthermore, a number of WordPress functions can call blog data into a header. By combining templates, tests, and data functions, quite a lot of metadata can be incorporated into a theme – but with certain limitations as we shall see.
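Here is one way the combination of tests and data functions might look, as a hedged PHP sketch. The helper sketch_meta_description(), its page-type labels, and its $data keys are made up for illustration. The WordPress calls (add_action(), is_home(), is_category(), get_bloginfo(), category_description()) only run when the file is loaded inside WordPress; otherwise a standalone demonstration runs instead.

```php
<?php
// Choose a meta description for a page type. The page-type labels and the
// $data keys are invented for this sketch.
function sketch_meta_description(string $page_type, array $data): string
{
    switch ($page_type) {
        case 'home':
            $text = $data['blog_tagline'] ?? '';
            break;
        case 'category':
            $text = $data['category_description'] ?? '';
            break;
        default:
            $text = '';
    }
    // Descriptions may arrive wrapped in markup; keep the text only.
    $text = trim(strip_tags($text));
    if ($text === '') {
        return ''; // nothing to say, so emit no tag
    }
    return '<meta name="description" content="'
        . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '" />';
}

if (function_exists('add_action')) {
    // Inside WordPress: map conditional tags to the labels used above.
    add_action('wp_head', function () {
        if (is_home()) {
            echo sketch_meta_description('home', ['blog_tagline' => get_bloginfo('description')]);
        } elseif (is_category()) {
            echo sketch_meta_description('category', ['category_description' => category_description()]);
        }
    });
} else {
    // Standalone demonstration with a placeholder tagline.
    echo sketch_meta_description('home', ['blog_tagline' => 'Notes on information design']), "\n";
}
```

Keeping the decision logic in a plain function makes it easy to try outside WordPress, while the hook stays a thin wrapper around the conditional tags.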
WordPress templates and tests
The table below identifies default templates for different page types and their corresponding conditional tags. This list only includes conditional tags that will work outside the WordPress loop, so comments_open(), pings_open(), and has_tag() are not listed. A number of more specialized conditional tags, including has_excerpt() and is_sticky(), are also not pursued for now.
Italicized text identifies placeholders for text (slug, pagetemplate, mimetype, nickname, title, tag), a number (id), or an array of arguments (array).
|Home (list of posts)||1. home.php|
|Author Archive||1. author.php|
|Category Archive||1. category-slug.php|
|Date Archive||1. date.php|
|Tag Archive||1. tag-slug.php|
|Search Results||1. search.php|
|Individual page||1. pagetemplate.php|
|Individual post||1. single.php|
|Individual attachment||1. mimetype.php (image.php, video.php, audio.php, etc.)|
|Comments popup||1. comments-popup.php 2. single.php (?)*** 3. index.php (?)***|
|Custom content pages|
|Front page||1. pagetemplate.php|
|Search page||1. pagetemplate.php|
|Site map||1. pagetemplate.php|
|* is_paged() will return true on the second or subsequent pages of a paginated archive.|
|** is_subpage() and is_tree() are not default WordPress functions. They can be created in a theme’s functions.php file based on guidelines in Conditional Tags page of the WordPress Codex.|
|*** The WordPress Codex does not actually identify a template hierarchy for a popup comments page, so this is my best guess. In any case, popup comments create a crappy user experience and should be avoided.|
WordPress data functions
Just as WordPress provides a hierarchy and set of tests to isolate specific page types, it provides functions that write data into a page from the blog database.
The following table lists data types, the functions that call them, and the context to which they apply:
|Functions available outside the loop|
|Blog Title||bloginfo('name')||Entire blog|
|Blog Tagline||bloginfo('description')||Entire blog|
|Blog Authors||wp_list_authors()||Entire blog*|
|Post or page title||single_post_title()||Individual post or page|
|Post or page field name:value pairs||get_post_custom()||Individual post or page*|
|Category Title||single_cat_title()||Category archive page|
|Category Description||category_description()||Category archive page|
|Tag title||single_tag_title()||Tag archive page|
|Month title||single_month_title()||Date-based archive page|
|Functions available in the loop|
|Post or page categories||the_category()||Individual post or page*|
|Post or page tags||the_tags()||Individual post or page*|
|Post or page field name:value pairs||the_meta()||Individual post or page*|
|Post or page author(s)||the_author()||Individual post or page*|
|* Requires custom programming to be used for meta purposes|
The limitations of these functions are threefold. First, the data is minimal. Second, the data may need to be used for purposes other than meta tags. Third, not all contexts are covered.
These limitations especially impact index pages. A Category Archive page, for example, could use Category Title in a Title tag and Category Description in a meta description tag. But using Category Description in a meta tag may conflict with the phrasing you would prefer if the description also appears on the page. Nor can you add custom meta tags by category unless you atomize every category by its own template or the is_category() conditional.
For individual pages and posts the options are more extensive, but still problematic. By running a WordPress loop in the header of the single post template, author, tag, and category lists can be utilized as metadata values. However, as in the examples above, categories and tags have their own user-oriented purpose and may not align to a site’s metadata scheme. Meanwhile, WordPress pages do not call categories or tags.
The two workhorse functions for handling individual page and post metadata are the_meta() and get_post_custom(). With some simple custom PHP code in a theme’s functions.php file these can easily write custom field names and values as meta tags. The question then becomes how best to utilize this capability.
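As a minimal sketch of that functions.php code, the PHP below turns an array of custom fields into meta tags. It assumes the array shape returned by get_post_custom(), where each key maps to an array of values, and it skips keys that begin with an underscore on the assumption that those are WordPress internals. The function name sketch_fields_to_meta_tags() and the sample fields are hypothetical.

```php
<?php
// Turn custom fields into meta tags. $fields uses the shape returned by
// get_post_custom(): each key maps to an array of one or more values.
function sketch_fields_to_meta_tags(array $fields): string
{
    $lines = [];
    foreach ($fields as $name => $values) {
        // Assumption: keys starting with an underscore are internal; skip them.
        if (strpos((string) $name, '_') === 0) {
            continue;
        }
        foreach ((array) $values as $value) {
            if (!is_scalar($value)) {
                continue; // ignore array or object values in this sketch
            }
            $lines[] = sprintf(
                '<meta name="%s" content="%s" />',
                htmlspecialchars((string) $name, ENT_QUOTES, 'UTF-8'),
                htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8')
            );
        }
    }
    return implode("\n", $lines);
}

if (function_exists('add_action')) {
    // Inside WordPress: print the tags in the header of single posts and pages.
    add_action('wp_head', function () {
        if (is_singular()) {
            $fields = get_post_custom(get_queried_object_id());
            if (is_array($fields)) {
                echo sketch_fields_to_meta_tags($fields), "\n";
            }
        }
    });
} else {
    // Standalone demonstration with placeholder fields.
    echo sketch_fields_to_meta_tags([
        '_edit_lock' => ['1253500000'],
        'keywords'   => ['baseball, yankees'],
    ]), "\n";
}
```

Written this way, every custom field on a post becomes a meta tag, which is exactly why some way of marking which fields are meant as metadata becomes necessary.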
Scoping a meta tag plugin
For portability and flexibility, the best way to apply meta tags in WordPress is with a plugin. I define the purpose of this (as yet) hypothetical plugin as follows:
Allow the manual creation and assignment of metadata values to posts and pages from the WordPress Admin interface.
For this goal, and based on the analysis above, the plugin needs to do the following:
- Parse custom fields assigned to individual posts and pages.
- Extend field functionality to index page types.
- Allow fields to be applied site wide.
- Provide options to use (or not) tags, categories, descriptions, authors, and other existing WordPress data as metadata if desired.
Parse custom fields assigned to individual posts or pages
To make the plugin portable, it must define a unique namespace for its field names. An options page will provide an input field for a namespace prefix along with an option to strip the prefix when posts and pages are published.
Custom Field Prefix
Prefix to identify fields used by this plugin
Strip prefix when published
In this example, with the prefix set to “meta-” and the strip option enabled, a custom field named meta-keywords would be written:
<meta name="keywords" content="value" />
Alternatively, if you wanted to apply the Dublin Core metadata scheme, you could use “DC.” as a prefix and not strip it out. A custom field named “DC.title” would be written:
<meta name="DC.title" content="value" />
With this functionality in place, any number of custom meta tags can be created and applied on a post by post and page by page basis.
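The prefix rule can be sketched in a few lines of PHP. The function name and signature are hypothetical, and the sketch assumes a non-empty prefix: fields without the prefix are ignored, and the prefix is removed only when the strip option is on.

```php
<?php
// Map a custom field name to a meta tag name, or null when the field does
// not carry the plugin's prefix and should be ignored.
function sketch_meta_name(string $field, string $prefix, bool $strip): ?string
{
    if ($prefix === '' || strpos($field, $prefix) !== 0) {
        return null;
    }
    $name = $strip ? substr($field, strlen($prefix)) : $field;
    // A field that is nothing but the prefix has no usable name.
    return ($name === '' || $name === false) ? null : $name;
}

var_dump(sketch_meta_name('meta-keywords', 'meta-', true)); // "keywords"
var_dump(sketch_meta_name('DC.title', 'DC.', false));       // "DC.title"
var_dump(sketch_meta_name('mood', 'meta-', true));          // NULL
```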
Extend field functionality to index page types
Index pages in WordPress do not reference custom fields. The plugin must extend custom field functionality to each category, tag, and author archive page, various date archive pages, and the home and search results pages.
Each category, tag, and user currently has its own Edit page which the plugin can leverage, though a custom author metadata page may be more efficient than adding more fields to the unwieldy Edit User page.
The other index pages each require a custom metadata form, including the blog home page and the search results page. Date archives are more complicated. The plugin must allow manual definition of meta tags by date archive type — year, month, day, time — or even, possibly, by specific date, either fixed or recurring.
Allow fields to be applied site wide
While applying metadata to posts and pages would be my highest priority, a fully realized plugin should allow custom fields to be applied to all content and index pages on the site. Doing this with a plugin rather than in a header template means that the fields can be overridden at the individual post or page level as necessary.
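The override behavior described above amounts to a keyed merge in which post-level values win. Here is a small PHP sketch under that assumption; the function name and the sample values are hypothetical.

```php
<?php
// Combine site-wide meta values with per-post values; the post wins on conflict.
function sketch_effective_meta(array $site_wide, array $post_level): array
{
    return array_replace($site_wide, $post_level);
}

// Placeholder values for illustration.
$site_wide  = ['robots' => 'index,follow', 'copyright' => 'Example Blog'];
$post_level = ['robots' => 'noindex,follow'];

print_r(sketch_effective_meta($site_wide, $post_level));
```

In this example the post keeps the site-wide copyright value but replaces the robots value with its own.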
Such fields may include the various meta http-equiv tags, the meta copyright tag, the meta robots tag, and others.
Provide options to use (or not) tags, categories, descriptions, authors, and other existing WordPress data as metadata, if desired
Categories, tags, and other WordPress data types can work as metadata. For example, either categories or tags could qualify as meta keywords. Or they could be listed in meta fields named “categories” and “tags.”
The plugin must provide a mechanism to use standard blog data as metadata. An options page should offer selections to apply blog data to individual pages or across the site. These data elements include:
- Blog Title
- Blog Tagline
For example, Blog Title might be assigned the following options:
Add to blog home page
Add to all: posts, pages, both
Meta name to use
Other data types could be atomized into more specific choices. A post or page might be assigned a publication date, a last modified date, or both. Expiry date would have to be a custom field.
What I’ve scoped is a fairly broad plugin, but one that can be built incrementally, starting with the ability to write custom field names and values as meta tags based on a unique namespace. This would allow blog administrators to apply a custom metadata scheme to their unique content pages while continuing to use templates to handle the less important index page meta tags. Eventually, all the metadata functionality outlined here could be incorporated into the WordPress Admin interface.
Official Google Webmaster Central Blog
Using the robots meta tag (March 2007)
Specify your canonical by Joachim Kupke and Maile Ohye (February 2009)
Google does not use the keywords meta tag in web ranking by Matt Cutts (September 2009)
Reunifying duplicate content on your website by John Mueller (October 2009)