Content inventories are a valuable tool for making data-based decisions about how content on your website or portal is organized and classified. This article introduces the benefits of content inventories, how they are conducted, and common data metrics that are useful in content analysis.

What is a Content Inventory?

A content inventory is an accounting of the location, type and quantity of all content on a website or portal. It's often one of the most valuable tools of the UX Architect. While the content inventory process can be time-consuming, the results are often the foundation of good Information Architecture.

Benefits include:

  • Reveals the types and quantities of content
  • Supports current-to-future-state mapping for content migration
  • Helps estimate the time and resource requirements for a migration effort
  • Provides a starting point for analyzing content and setting up open or closed card sorts
  • Serves as input into content type definitions for large Content Management Systems
  • Supports website governance

Conducting a Content Inventory

The starting point of any content inventory is an understanding of the client's needs and how the inventory will be used. Will the content be migrated to a new location? Is a new navigation or labeling scheme required? Understanding the goals of your investigation will help you determine which content attributes to capture.

Common Elements to Capture

  • ID (Normally generated to track content items)
  • Page / Item Title
  • Location (URL)
  • Logical Path
  • Content Type (HTML, PDF)
  • Keywords (or other metadata)
  • Owner (who's responsible for the content)
  • New Location (for migration)
  • Notes
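
The attributes above map naturally onto one row per content item in a spreadsheet. As a minimal sketch, assuming you want to keep the inventory as a CSV file, the record could look like the following. The class name, field names and sample values are hypothetical and only illustrate the list above.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class InventoryItem:
    """One row of the inventory; fields mirror the attribute list above."""
    item_id: int            # generated ID used to track the item
    title: str              # page / item title
    url: str                # location
    logical_path: str       # e.g. "Home > About > Team"
    content_type: str       # e.g. "HTML", "PDF"
    keywords: str = ""      # keywords or other metadata
    owner: str = ""         # who is responsible for the content
    new_location: str = ""  # target location, if migrating
    notes: str = ""


def to_csv(items):
    """Return the inventory as CSV text with a header row."""
    buffer = io.StringIO()
    column_names = [f.name for f in fields(InventoryItem)]
    writer = csv.DictWriter(buffer, fieldnames=column_names)
    writer.writeheader()
    for item in items:
        writer.writerow(asdict(item))
    return buffer.getvalue()


items = [
    InventoryItem(1, "Home", "http://example.com/", "Home", "HTML", owner="Web team"),
    InventoryItem(2, "Annual Report", "http://example.com/docs/report.pdf",
                  "Home > Documents", "PDF"),
]
print(to_csv(items))
```

Optional columns default to empty strings, so attributes such as the new location can be filled in later once the migration plan is known.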

Automated Content Inventory

There's a good chance you can automate at least some of your content inventory. My current recommendation is to start with the free utility Xenu's Link Sleuth (http://home.snafu.de/tilman/xenulink.html). The tool provides the address (URL), type, size and title of any page it can crawl. The results of an automated crawl will vary depending on the platform you're analyzing.
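
To give a feel for what an automated crawl records for each page, here is a rough sketch using only the Python standard library. It is not how Xenu's Link Sleuth works internally; the class, function and sample HTML are hypothetical, and the page is parsed from a string rather than fetched over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageScanner(HTMLParser):
    """Collects the <title> text and all <a href> targets of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def scan_page(url, html):
    """Return the inventory-relevant facts for a single page."""
    scanner = PageScanner(url)
    scanner.feed(html)
    return {"url": url, "title": scanner.title.strip(), "links": scanner.links}


sample_html = (
    "<html><head><title>About Us</title></head><body>"
    "<a href='/team.html'>Team</a> <a href='docs/report.pdf'>Report</a>"
    "</body></html>"
)
record = scan_page("http://example.com/about/", sample_html)
print(record["title"])
print(record["links"])
```

A real crawler would also fetch each discovered link, record its type and size, and repeat until no new pages are found, which is the part a dedicated tool handles for you.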

Manual Content Inventory

A manual content inventory consists of viewing each page and recording the required data attributes as you go. While the process is time-consuming, I recommend conducting manual content inventories for at least the primary facets of the website you are analyzing, and to fill any gaps left by an automated content inventory. Manual content inventories will give you a deeper understanding of the site and help validate any automated inventories.

Cleaning the Content Inventory

After conducting an automated and/or manual content inventory, you should take time to clean the data. Look for any data that is irrelevant to, or misrepresented in, your study, and remove any duplicate records, broken links or auto-generated pages.
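
If the inventory lives in a spreadsheet export, part of this cleanup can be scripted. The sketch below is one possible approach under stated assumptions: each record is a dictionary with a "url" and an HTTP "status", URLs that differ only by host capitalization or a trailing slash count as duplicates, and the auto-generated page markers are made up for illustration. Adjust all of these to the site you are studying.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical URL fragments that identify auto-generated pages on a site.
AUTO_GENERATED_MARKERS = ("/search?", "/tag/", "?print=")


def normalize(url):
    """Lowercase scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       path, parts.query, ""))


def clean(records):
    """Drop broken links, auto-generated pages and duplicate URLs."""
    seen = set()
    cleaned = []
    for record in records:
        url = normalize(record["url"])
        if record.get("status", 200) >= 400:
            continue  # broken link
        if any(marker in url for marker in AUTO_GENERATED_MARKERS):
            continue  # auto-generated page
        if url in seen:
            continue  # duplicate of a record already kept
        seen.add(url)
        cleaned.append({**record, "url": url})
    return cleaned


raw = [
    {"url": "http://Example.com/about/", "status": 200},
    {"url": "http://example.com/about", "status": 200},
    {"url": "http://example.com/old-page", "status": 404},
    {"url": "http://example.com/search?q=news", "status": 200},
    {"url": "http://example.com/", "status": 200},
]
for record in clean(raw):
    print(record["url"])
```

It is worth keeping the removed records in a separate sheet, since the list of broken links is itself a useful finding.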

Analyzing the Results

The analysis of content inventory data will vary depending on the client and the goals of the project.

Common Data Metrics

  • Quantity of content types (Word, HTML, PDF)
  • Depth of content
  • Broken links
  • Correlation between inventory and usage statistics
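
Most of these metrics are simple counts over the cleaned inventory. The following sketch shows one way to compute them, assuming each record carries a "url", a "content_type" and an HTTP "status", and that usage statistics are available as a mapping from URL to page views. The record layout, the sample data and the definition of depth as the number of URL path segments are all assumptions. The usage step is only a basic join that lists pages with no recorded views, not a statistical correlation.

```python
from collections import Counter
from urllib.parse import urlsplit


def depth(url):
    """Number of non-empty path segments, e.g. /about/team.html -> 2."""
    return len([segment for segment in urlsplit(url).path.split("/") if segment])


def summarize(records, page_views):
    """Compute basic inventory metrics from records and usage statistics."""
    return {
        "types": dict(Counter(r["content_type"] for r in records)),
        "depths": dict(Counter(depth(r["url"]) for r in records)),
        "broken": sum(1 for r in records if r["status"] >= 400),
        "unvisited": [r["url"] for r in records
                      if page_views.get(r["url"], 0) == 0],
    }


records = [
    {"url": "http://example.com/", "content_type": "HTML", "status": 200},
    {"url": "http://example.com/about/team.html", "content_type": "HTML", "status": 200},
    {"url": "http://example.com/docs/report.pdf", "content_type": "PDF", "status": 404},
]
page_views = {
    "http://example.com/": 120,
    "http://example.com/about/team.html": 8,
}
summary = summarize(records, page_views)
for name, value in summary.items():
    print(name, value)
```

Pages that appear in the inventory but never in the usage statistics are good candidates for review before any migration.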