INTRODUCTION

Very few project rollouts happen without some last-minute data cleanup. The need usually surfaces in a discussion with the stakeholders. The symptoms differ by project, but most often some part of the system is behaving poorly. The pressure to resolve these data anomalies quickly can drive how we decide to fix them, and haste often leads to less than desirable results. Avoiding the "Dark Side" in these situations is actually quite simple. In short, use the API.

If you read the posts on StackOverflow and other troubleshooting sites related to Sitecore integration, you'll come across a number asking how to modify Sitecore data through the back end using SQL DML. Almost every response says to avoid it. There are maintainability reasons, such as changes to the data structure over the lifetime of Sitecore. A much larger reason has to do with the way Sitecore keeps track of data changes. The Sitecore application caches the databases and their content, so changes made to Sitecore data through the back end are not picked up until the application restarts. While this may not be a big deal in development, once the application is in production nothing short of a restart (or custom code to evict the caches) will make that back-end updated data visible.

INTEGRATING WITH SITECORE

There are two recommended "out of the box" approaches, both of which require some custom coding.

Sitecore Web Services

The first approach uses the native web service supplied by Sitecore. Of the two, this is the less favored because of the relative complexity of working with raw XML and the challenges innate to the service interface. For example, it is possible to get an item from the service, but if you have versioning enabled you'll get all versions of it.

Repository Pattern Approach

The second approach borrows from a well-known design pattern whose goal is to centralize API interaction. The general idea is to create POCOs (plain old CLR objects) that mimic the content templates, then create repository classes that work with them and insulate the rest of the implementation from the Sitecore API. More importantly, this prevents "spread", where the definition of something like News Article ends up scattered across multiple layers and classes. For example, let's say that we have a content template for news articles.

News Article

Field Name      Type
Title           Single line text
Abstract        Multi-line text
Article         Multi-line text
Article Date    Date

Our repository definition and associated entity might look like the following:
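A minimal sketch of what such an entity and repository could look like, assuming the Sitecore.Kernel assembly is referenced. The names NewsArticle, INewsArticleRepository and NewsArticleRepository, along with the folder and template paths, are hypothetical placeholders chosen for illustration. Error handling and security are kept to the bare minimum.

```csharp
using System;
using System.Collections.Generic;
using Sitecore;
using Sitecore.Data;
using Sitecore.Data.Items;

// Hypothetical POCO mirroring the News Article template.
public class NewsArticle
{
    public string Title { get; set; }
    public string Abstract { get; set; }
    public string Article { get; set; }
    public DateTime ArticleDate { get; set; }
}

// Hypothetical contract the rest of the solution codes against.
public interface INewsArticleRepository
{
    IList<NewsArticle> GetAll();
    void Add(NewsArticle article);
}

public class NewsArticleRepository : INewsArticleRepository
{
    // Assumed locations; adjust to the actual content tree.
    private const string FolderPath = "/sitecore/content/Home/News";
    private const string TemplatePath = "/sitecore/templates/User Defined/News Article";

    private readonly Database _context;

    // The caller decides which database (master, web, ...) to work against.
    public NewsArticleRepository(Database context)
    {
        if (context == null) throw new ArgumentNullException("context");
        _context = context;
    }

    public IList<NewsArticle> GetAll()
    {
        var articles = new List<NewsArticle>();
        Item folder = _context.GetItem(FolderPath);
        if (folder == null) return articles;

        // Map each child item onto the POCO so callers never touch Item.
        foreach (Item child in folder.Children)
        {
            articles.Add(new NewsArticle
            {
                Title = child["Title"],
                Abstract = child["Abstract"],
                Article = child["Article"],
                ArticleDate = DateUtil.IsoDateToDateTime(child["Article Date"])
            });
        }
        return articles;
    }

    public void Add(NewsArticle article)
    {
        Item folder = _context.GetItem(FolderPath);
        TemplateItem template = _context.GetItem(TemplatePath);
        if (folder == null || template == null)
            throw new InvalidOperationException("News folder or template not found.");

        Item item = folder.Add(ItemUtil.ProposeValidItemName(article.Title), template);

        item.Editing.BeginEdit();
        try
        {
            item["Title"] = article.Title;
            item["Abstract"] = article.Abstract;
            item["Article"] = article.Article;
            item["Article Date"] = DateUtil.ToIsoDate(article.ArticleDate);
            item.Editing.EndEdit();
        }
        catch
        {
            // Discard the half-written changes rather than leave the item in edit mode.
            item.Editing.CancelEdit();
            throw;
        }
    }
}
```

A presentation component might construct the repository with new NewsArticleRepository(Sitecore.Context.Database), while an import job running outside a Sitecore request could pass Sitecore.Configuration.Factory.GetDatabase("master") instead.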

Observe that we are passing the context into the concrete repository. The context can differ depending on where the caller is. A page outside of the Sitecore context (i.e., not a presentation component) may point to Core for its context. In Page Edit mode the context will most likely be Master, and when viewing the site it will be Web. We want the calling code to tell us what the context is, since it is more likely to know what the context should be. This has the added advantage of allowing the repository to be reused across editing and viewing, and it supports pages not directly served by Sitecore.

There are many additional points to consider when taking this route for integration or import.

Content Tree Location

First and foremost is where in the content tree the items need to be added. For simplicity's sake, let's say that all news articles are added beneath a single news articles folder. The folder can either be passed into the repository constructor as a path or Item, or its path or ID can be stored as a constant in the repository class. If you are building a module, you'll probably want to make this a configurable setting stored in the content tree, in which case the repository could store the path to that configuration item. The right choice depends heavily on your reason for implementing a repository in the first place.

Branch Templates

The API also supports creating items from branch templates. The steps are as follows:

1) Get the Branch Item from the context

2) Use the BranchItem class to create your news article

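// Context is assumed to be the Database that was passed into the repository
var newsFolder = Context.GetItem("/sitecore/content/Home/News-Section/BreakingNews");
var branchItem = Context.GetItem("/sitecore/templates/branches/NewsArticle");

// Wrapping the item in BranchItem selects the branch-aware overload of Add
var newNewsArticle = newsFolder.Add(article.Title, new BranchItem(branchItem));

newNewsArticle.Editing.BeginEdit();
// set field values here, e.g. newNewsArticle["Title"] = article.Title;
newNewsArticle.Editing.EndEdit();

Link Fields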

Fields that store links to other items in the content tree can be tricky to work with. These include Droplink and Multilist fields. Internally, the IDs of the linked items are stored. When adding an item's ID to a link field, avoid stringifying the underlying GUID (for example with Guid.ToString()). That stores the ID incorrectly as a lower-case string without the curly braces, whereas Sitecore stores IDs in upper case with the curly braces. Instead, use Item.ID.ToString().
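To make the difference concrete, the sketch below uses only System.Guid to show the two string forms. The class and method names are hypothetical. Inside Sitecore, Item.ID.ToString() already produces the braced upper-case form, so a helper like this would only matter when all you have is a raw Guid. The pipe-delimited layout for fields holding several IDs is an assumption about the raw value of list-style fields.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinkFieldValues
{
    // Formats a Guid the way the text describes Sitecore storing IDs:
    // upper case, wrapped in curly braces.
    public static string ToSitecoreId(Guid id)
    {
        return id.ToString("B").ToUpperInvariant();
    }

    // Assumption: fields that hold several IDs join them with '|'.
    public static string ToMultiValue(IEnumerable<Guid> ids)
    {
        return string.Join("|", ids.Select(id => ToSitecoreId(id)));
    }
}
```

For the GUID 0de95ae4-41ab-4d01-9eb0-67441b7c2450, Guid.ToString() yields "0de95ae4-41ab-4d01-9eb0-67441b7c2450", while LinkFieldValues.ToSitecoreId yields "{0DE95AE4-41AB-4D01-9EB0-67441B7C2450}".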

General Links

General links are stored as an XML element, and the "linktype" attribute in particular determines how Sitecore renders the link. If the link is external, linktype must be set to "external"; otherwise Sitecore renders it as http://yourdomain/remainderofUrl regardless of the URL, because by default it assumes the link is internal. General links can store internal, external and media links. The second part of this series will discuss managing these in more detail.
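As a rough illustration of the raw value, the sketch below builds an external link element with System.Xml.Linq. The class and method names are hypothetical, and the attribute set (text, linktype, url) is an assumption based on raw values commonly seen in general link fields, so compare it against a link saved through the Content Editor before relying on it. Building the element with XElement rather than string concatenation means characters such as & in the URL are escaped properly.

```csharp
using System;
using System.Xml.Linq;

public static class GeneralLinkValues
{
    // Builds a raw general link value that marks the link as external.
    // Assumption: attribute names follow the lower-case raw XML form.
    public static string External(string url, string text)
    {
        var link = new XElement("link",
            new XAttribute("text", text),
            new XAttribute("linktype", "external"),
            new XAttribute("url", url));

        // Single-line output, since the value is stored in one field.
        return link.ToString(SaveOptions.DisableFormatting);
    }
}
```

The resulting string could then be assigned to the field while the item is in editing mode, in the same way as any other field value.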

Date Fields

Dates are stored in a compact ISO format. For our example above, assume the article we want to store was posted on 12/24/2012 at 2pm. The correct format string to use when storing this date in the item is dateVariable.ToString("yyyyMMddTHHmmss"). The field should then contain "20121224T140000".
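The conversion can be sketched in plain .NET as follows. The class and method names are hypothetical. Passing the invariant culture is an addition to the format call in the text, made so that the output does not depend on the server's regional settings. Sitecore also ships its own helpers for this, DateUtil.ToIsoDate and DateUtil.IsoDateToDateTime, which may be preferable when the Sitecore assemblies are available.

```csharp
using System;
using System.Globalization;

public static class SitecoreDates
{
    // Same format string as in the text; 'T' is copied to the output as a literal.
    private const string IsoFormat = "yyyyMMddTHHmmss";

    // DateTime -> raw field value, e.g. 20121224T140000.
    public static string ToFieldValue(DateTime value)
    {
        return value.ToString(IsoFormat, CultureInfo.InvariantCulture);
    }

    // Raw field value -> DateTime; throws FormatException on malformed input.
    public static DateTime FromFieldValue(string raw)
    {
        return DateTime.ParseExact(raw, IsoFormat, CultureInfo.InvariantCulture);
    }
}
```

Calling SitecoreDates.ToFieldValue(new DateTime(2012, 12, 24, 14, 0, 0)) returns "20121224T140000", and FromFieldValue turns that string back into the same DateTime.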

The next posting in this series will cover common extension methods and pitfalls you will encounter with them.

Summary

Whether for integration, data import or reasons of your own that require custom coding, interaction with Sitecore data should always go through the Sitecore API, either via Sitecore's web services or via custom classes. Even in the heat of a rollout, when the desire to correct mistakes quickly can drive decision making, take a step back and make sure that your approach embraces and takes advantage of Sitecore's API. The payoff can be resolving the problem(s) faster and with few, if any, side effects.

Please stay tuned for the next post in this series, which will build on the News Article content type and elaborate on how we can make our NewsArticleRepository reusable across Content Management, Content Delivery and integration with third parties.