Microsoft Content Management Server : The ASP.NET Stager Application (part 2) - Staging Channels and Postings

11/30/2012 6:21:23 PM

Logging in as the ‘Stage As’ User

The staging process will run using the credentials of the ‘Stage As’ user. The user ID of the ‘Stage As’ user is built by prefixing the user’s domain and ID with WinNT://, giving a value of the form WinNT://domain/user. Since we are running the code as a console application, we won’t have a current CmsHttpContext. Therefore, we will use the CmsApplicationContext object to carry out the process and call the CmsApplicationContext.AuthenticateAsUser() method to log in. The default publishing mode is Published, so only live channels and postings will be staged.

You may be wondering why we didn’t set the publishing mode to Staging instead of Published. This is because URLs are generated differently when running the application in Staging mode. In Staging mode, the MCMS PAPI generates URLs for channels and postings in the following format:

<NCOMPASSSTAGINGSERVER>/NR/exeres/9556B302-5A45-47A2-897B-F3F8FFBED5F6.htm?NRMODE=Staging&CMSAUTHTOKEN=77px32pfk5zz2nofimsv5tphe3wuxhqp6s7vx3avlhegvgutvpymze672w6wwwi5vxje2bfntpzgp</NCOMPASSSTAGINGSERVER>

This unique URL was used by Site Stager to identify which links it needs to make a static copy of and to follow when staging the site. Site Stager would also modify these links in the generated page with the names of the new host server and destination directories.

If you plan to run the application in Staging mode, you will have to handle these links in the same way. However, to keep this example short, we won’t attempt to do so.
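As a starting point for anyone who does want to handle Staging-mode links, the sketch below only locates them. It assumes the wrapper tags appear in the page exactly as in the sample URL above; the StagingLinks class and its Extract() method are hypothetical names that are not part of the stager, and no link rewriting is attempted.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class StagingLinks
{
    // Captures whatever sits between the wrapper tags shown in the sample
    // URL. Assumption: the tags appear literally in the generated markup.
    static readonly Regex Wrapper = new Regex(
        "<NCOMPASSSTAGINGSERVER>(.*?)</NCOMPASSSTAGINGSERVER>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns every wrapped URL found in the page, in document order.
    public static List<string> Extract(string html)
    {
        List<string> urls = new List<string>();
        foreach (Match m in Wrapper.Matches(html))
        {
            urls.Add(m.Groups[1].Value);
        }
        return urls;
    }
}
```

Each extracted URL would still need to be downloaded and then replaced in the page with its static equivalent, which is the part this example leaves out.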

We will also impersonate the user that runs the staging process to create a set of network credentials for downloading each channel item and attachment later.

The choice of the ‘Stage As’ user is important because the application only stages pages the user has access to (recall that as part of the identity management feature of MCMS, a user can only view pages that he or she has been granted rights to). You can limit the scope of the stager to stage only selected sections of the site by limiting the user’s rights. For example, to stage the PlantCatalog channel, give the ‘Stage As’ user rights to that channel. This gives you flexibility in planning the size of each deployment. The more pages staged, the longer it takes to complete each task.

If you have left the m_StageAsUser variable as an empty string, an attempt will be made to log in as the guest user.
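The user ID format and the guest fallback described above can be tried in isolation with a small helper. This is only a sketch: the StageAsUserId class and Build() method are hypothetical names, and returning null to signal "log in as guest" is an assumption made for the example.

```csharp
using System;

static class StageAsUserId
{
    // Builds the WinNT://domain/user form expected by AuthenticateAsUser().
    // Returns null for an empty user name so that the caller can fall back
    // to guest authentication instead.
    public static string Build(string domain, string user)
    {
        if (string.IsNullOrEmpty(user))
        {
            return null;
        }
        return "WinNT://" + domain + "/" + user;
    }
}
```

For a made-up domain TROPICALGREEN and user stager, Build() returns WinNT://TROPICALGREEN/stager.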

Add the following highlighted code to the Main() method:

[STAThread]
static void Main(string[] args)
{
  . . . code continues . . .
  try
  {
    // Log in to MCMS as the 'Stage As' user
    cmsContext = new CmsApplicationContext();
    if (m_StageAsUser != "")
    {
      cmsContext.AuthenticateAsUser("WinNT://" + m_StageAsDomain + "/"
                                    + m_StageAsUser, m_StageAsPwd);
      m_credentials = new NetworkCredential(m_StageAsUser,
                                            m_StageAsPwd, m_StageAsDomain);
    }
    else
    {
      cmsContext.AuthenticateAsGuest();
      m_credentials = null;
    }
  }
  catch
  {
    WriteToLog("Error: Unable to authenticate with MCMS server.");
  }
}

Revealing Hidden Postings

Next, the application picks up the setting stored in the m_DoNotExportHiddenItems variable and uses it to set the CmsApplicationContext.SessionSettings.AutoFilterHidden property. When this property is set to false, the CmsApplicationContext reveals items that are set to be hidden when published (that is, channel items whose ChannelItem.IsHiddenModePublished property is true). As a result, hidden channel items will be staged as well.

[STAThread]
static void Main(string[] args)
{
  . . . code continues . . .
  // Reveal all hidden postings?
  cmsContext.SessionSettings.AutoFilterHidden = m_DoNotExportHiddenItems;
}

Staging Channels and Postings

The next step of the process is to generate static pages for each channel and posting that the ‘Stage As’ user can see. Here’s a quick synopsis of what we will be doing:

1. Get an instance of the start channel defined in the m_StartChannel variable.
2. Iterate through all channels and postings, beginning from the start channel.
3. For each channel and posting, issue an HTTP request.
4. Get the response and save it to a file.

Getting the Start Channel

Before walking through the list of channels and postings, we first need to get the start channel specified in the m_StartChannel variable. The iteration process begins from this channel.

Once we have an instance of the start channel, we are ready to begin the iteration process, triggered by the CollectChannel() method defined next.

Append the code highlighted below to the Main() routine:

[STAThread]
static void Main(string[] args)
{
  . . . code continues . . .
  // Start the iteration process
  Channel rootChannel = cmsContext.Searches.GetByPath(m_StartChannel)
                        as Channel;
  if (rootChannel != null)
  {
    CollectChannel(rootChannel);
  }
  else
  {
    WriteToLog("Error: Start Channel '" + m_StartChannel + "' cannot be "
               + "found or user does not have sufficient rights!");
  }
}

Iterating Through the Channel Tree

From the start channel, the CollectChannel() method walks through the rest of the tree, looking for channels and postings.

When it sees a channel, it will create a static version of the channel rendering script or default posting and all postings in its collection. The method calls itself recursively until all channels in the tree have been staged.

The Download() helper method will perform the task of downloading the content and generating static pages as we will see later. Add the CollectChannel() method to the code:

static void CollectChannel(Channel channel)
{
  // Download the channel rendering script or the default posting
  Download(GetUrlWithHost(channel.Url), channel.Path.Replace(
           m_StartChannel,"/"), m_DefaultFileName, EnumBinary.ContentPage);
  // Download all the postings within the channel
  foreach (Posting p in channel.Postings)
  {
    WriteToLog("Info: Downloading Posting: " + p.Path);
    Download(GetUrlWithHost(p.Url), channel.Path.Replace(m_StartChannel,"/"),
             p.Name, EnumBinary.ContentPage);
  }
  // Recurse into each sub-channel
  foreach (Channel c in channel.Channels)
  {
    CollectChannel(c);
  }
}

The CollectChannel() method uses a helper function, GetUrlWithHost(), which replaces any host name found within the URL with the value stored in the m_SourceHost variable. We do this to handle links that do not have the host name as part of the URL, particularly for sites that do not have host header mapping turned on. For such sites, Channel.Url returns a value without the host name, like this: /plantcatalog/ficus.htm. To successfully issue the HTTP request that gets the page, the host value (which in our case will be http://www.tropicalgreen.net) has to be prepended to the URL.

private static string GetUrlWithHost(string url)
{
  // Remove the host name e.g. http://www.tropicalgreen.net
  // from the URL
  if (url.StartsWith("http://"))
  {
    // Remove http://
    url = url.Remove(0,7);
    // Remove the host name e.g. www.tropicalgreen.net
    url = url.Remove(0,(url + "/").IndexOf("/"));
  }
  // add the source host to the URL
  url = m_SourceHost + url;
  return url;
}
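The listing above only recognizes URLs that begin with http://. If the source site may also return https:// URLs or URLs with a port number, one possible variant relies on System.Uri for the parsing. This is a sketch rather than the article's code: the UrlHelper class and WithHost() method are hypothetical, and the source host is passed in as a parameter instead of being read from m_SourceHost.

```csharp
using System;

static class UrlHelper
{
    // Hypothetical variant of GetUrlWithHost(). Absolute http/https URLs
    // are reduced to path and query; anything else is treated as already
    // being host-relative.
    public static string WithHost(string url, string sourceHost)
    {
        Uri parsed;
        if (Uri.TryCreate(url, UriKind.Absolute, out parsed)
            && (parsed.Scheme == Uri.UriSchemeHttp
                || parsed.Scheme == Uri.UriSchemeHttps))
        {
            // Drop the scheme, host name and port
            url = parsed.PathAndQuery;
        }
        // Add the source host to the URL
        return sourceHost + url;
    }
}
```

With a source host of http://www.tropicalgreen.net, both /plantcatalog/ficus.htm and http://tropicalgreen/plantcatalog/ficus.htm come back as http://www.tropicalgreen.net/plantcatalog/ficus.htm. The explicit scheme check matters because some platforms parse a leading slash as an absolute file path.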

Issuing an HTTP Request

The helper function that issues HTTP requests for all URLs to be staged is the Download() method. It accepts four input parameters:

url: The URL of the channel, posting and, later on, attachment.
path: The path of the directory the static file will be created in.
fileName: The name of the static file that will be created.
flag: The type of content that is being downloaded. We recognize two content types: content pages (channels and postings) and binaries.

To issue an HTTP request, we will utilize the HttpWebRequest object from the System.Net namespace. We will first create a request based on the URL of the channel or posting, then we will set the request to use the credentials of the ‘Stage As’ user (if one has been defined). It is here that we will make use of the user agent and header information defined earlier.

Notice that we have not allowed the request to be redirected automatically. That’s because in this example, we won’t attempt to handle pages that perform redirections. For example, if a page has Response.Redirect() statements coded within it, they will be ignored. You can, of course, enhance the tool to handle server-side page redirects (for an example of how to do so, see http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=153B8D20-EE51-4105-AAEF-519A7B841FCC).

static void Download(string url, string path, string fileName,
                     EnumBinary flag)
{
  try
  {
    string filePath = m_DestinationDirectory + path.Replace("/","\\");
    // Download the file only if it has not been downloaded before
    if (!File.Exists(filePath))
    {
      // Create a new request based on the URL
      HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

      // Make a request based on the specified 'Stage As' user
      if (m_credentials != null)
      {
        request.Credentials = m_credentials;
      }

      request.UserAgent = m_UserAgent;
      request.Headers.Add(m_HttpHeader, "true");
      request.AllowAutoRedirect = false;
    }
  }
  catch(Exception ex)
  {
    StringBuilder sb =  new StringBuilder();
    string newLine = System.Environment.NewLine;
    sb.Append("----------" + newLine);
    sb.Append("Exception of type [" + ex.GetType() + "] has been thrown!"
             + newLine);
    sb.Append("Error Processing URL:" + url + newLine);
    sb.Append("Message: " + ex.Message + newLine);
    sb.Append("Source: " + ex.Source + newLine);
    sb.Append("Stack Trace: " + ex.StackTrace + newLine);
    sb.Append("----------" + newLine);
    WriteToLog(sb.ToString());
  }
}
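To make the path handling concrete, the following sketch combines the two string replacements (the one in CollectChannel() and the one at the top of Download()) with the default-extension rule that is applied when the file is written. The StagePaths class and TargetFile() method are hypothetical, and the values that the stager keeps in m_StartChannel, m_DestinationDirectory and m_DefaultFileExtension are passed in as parameters.

```csharp
using System;

static class StagePaths
{
    // Works out the full name of the static file for a channel item.
    public static string TargetFile(string channelPath, string startChannel,
        string destinationDirectory, string fileName, string defaultExtension)
    {
        // Make the channel path relative to the start channel
        string relative = channelPath.Replace(startChannel, "/");
        // Turn URL separators into Windows directory separators
        string directory = destinationDirectory + relative.Replace("/", "\\");
        // Names without an extension get the default one
        if (fileName.IndexOf(".") < 0)
        {
            fileName = fileName + "." + defaultExtension;
        }
        return directory + fileName;
    }
}
```

Assuming a start channel of /Channels/tropicalgreen/, a destination directory of C:\Stage and a default extension of htm (all made-up values), a posting named ficus in /Channels/tropicalgreen/PlantCatalog/ maps to C:\Stage\PlantCatalog\ficus.htm.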

Use our custom DotNetSiteStager HTTP header to program your pages to behave differently when they are requested by our staging tool. For example, to include selected content on a generated static page, you would do something like this:

if (Request.Headers["DotNetSiteStager"] != null)
{
  // The following line of code will only execute when the page
  // is requested by the stager
  Response.Write("Page generated by DotNetSiteStager.");
}

Getting Responses and Creating Files

After issuing an HTTP request, we call the HttpWebRequest.GetResponse() method to get the response. If all goes well, we will get an HTTP 200 (OK) status code, which means that the page was downloaded successfully.

The resulting byte-array is then converted to a static file using a FileStream object.

static void Download(string url, string path, string fileName,
                     EnumBinary flag)
{
  // Stage only postings
  if (flag == EnumBinary.ContentPage)
  {
    try
    {
      . . . code continues . . .
      // Get the response
      HttpWebResponse response = request.GetResponse() as HttpWebResponse;
      // If all goes well, we will retrieve a status code of 200 (OK)
      if (response.StatusCode == HttpStatusCode.OK)
      {
        BinaryReader br = new BinaryReader(response.GetResponseStream());
        int contentLength = (int)response.ContentLength;
        byte[] buffer = br.ReadBytes(contentLength);
        // Process page and download attachments. The charset, when present,
        // is carried in the Content-Type header, so pass ContentType here.
        ProcessPageAndGetAttachments(response.ContentType, ref buffer);
        Directory.CreateDirectory(filePath);
        if (fileName.IndexOf(".") < 0)
        {
          fileName = fileName + "." + m_DefaultFileExtension;
        }
        FileStream fs = new FileStream(filePath + fileName,
                                       FileMode.Create);
        fs.Write(buffer, 0, buffer.Length);
        fs.Close();
      }
      // Release the connection so that later requests are not blocked
      response.Close();
    }
    . . . code continues . . .
  }
  . . . code continues . . .
}
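One caveat with the listing above: it sizes the read from HttpWebResponse.ContentLength, which is -1 when the server does not send a Content-Length header, and BinaryReader.ReadBytes() rejects a negative count. A possible way around this is to read the response stream until it ends. The sketch below uses a hypothetical StreamHelper class and is not part of the stager.

```csharp
using System;
using System.IO;

static class StreamHelper
{
    // Reads a stream to its end without knowing the length up front.
    public static byte[] ReadToEnd(Stream input)
    {
        using (MemoryStream collected = new MemoryStream())
        {
            byte[] chunk = new byte[8192];
            int read;
            // Read() returns 0 once the end of the stream is reached
            while ((read = input.Read(chunk, 0, chunk.Length)) > 0)
            {
                collected.Write(chunk, 0, read);
            }
            return collected.ToArray();
        }
    }
}
```

In Download(), the buffer could then be filled with StreamHelper.ReadToEnd(response.GetResponseStream()) instead of going through the BinaryReader.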

Notice that before generating the file, we make a call to a helper function, namely ProcessPageAndGetAttachments(), which attempts to do two things:

Correct the <base> tag to point it to the URL of the destination site. For example, when generating files from http://tropicalgreen to http://www.tropicalgreen.net, the host name stored in the href attribute of the <base> tag is changed accordingly.
Set the code page based on the encoding type identified in the response from the MCMS server. Should the encoding type be undetermined, it uses the default value stored in the m_CodePage variable.

Add the ProcessPageAndGetAttachments() method directly below the Download() method.

private static void ProcessPageAndGetAttachments(string encodingName,
                                                 ref byte[] buffer)
{
  // Check to see if encoding information is available.
  // Search the lower-cased value so that "Charset=" is found as well.
  int charsetPos = encodingName.ToLower().IndexOf("charset=");
  if (charsetPos >= 0)
  {
    // Encoding information is available. Use it!
    encodingName = encodingName.Substring(charsetPos + 8);
  }
  else
  {
    // No encoding information found. Use the default setting.
    encodingName = m_CodePage;
  }
  // Convert the buffer to a string so that we can work with it.
  Encoding enc = Encoding.GetEncoding(encodingName);
  string content = enc.GetString(buffer);
  // Process the page
  // Correct the <base> tag
  if (m_BaseUrl != "")
  {
    content = content.Replace("<base href=\"" + m_SourceHost, "<base href=\""
            + m_BaseUrl);
  }
  // Set the code page
  if (m_CodePage != "")
  {
    content = content.Replace("<HEAD>","<HEAD>\n<meta http-equiv=\""
            + "Content-Type\" content=\"text/html; charset=" + enc.WebName
            + "\">");
  }
  // Update the buffer with the modified content
  buffer = enc.GetBytes(content);
}

You may be wondering why we named the method ProcessPageAndGetAttachments() when it doesn’t deal with attachments. That’s because later on, we will enhance this method to handle attachments as well.
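The encoding detection in ProcessPageAndGetAttachments() can also be tried on its own. The sketch below assumes that the value being inspected is a Content-Type header such as text/html; charset=utf-8, and that an unknown charset name should fall back to the default. CharsetHelper and FromContentType() are hypothetical names, and defaultName plays the role of m_CodePage.

```csharp
using System;
using System.Text;

static class CharsetHelper
{
    // Picks the encoding named in a Content-Type value, or the default
    // when no usable charset is present.
    public static Encoding FromContentType(string contentType,
                                           string defaultName)
    {
        string name = defaultName;
        if (contentType != null)
        {
            int pos = contentType.IndexOf("charset=",
                StringComparison.OrdinalIgnoreCase);
            if (pos >= 0)
            {
                name = contentType.Substring(pos + 8);
                // Drop any parameters that follow the charset
                int semi = name.IndexOf(';');
                if (semi >= 0)
                {
                    name = name.Substring(0, semi);
                }
                // Remove surrounding spaces and quotes
                name = name.Trim().Trim('"');
            }
        }
        try
        {
            return Encoding.GetEncoding(name);
        }
        catch (ArgumentException)
        {
            // Unknown charset name: fall back to the default
            return Encoding.GetEncoding(defaultName);
        }
    }
}
```

The default name itself must be one that Encoding.GetEncoding() accepts; otherwise the exception still reaches the caller.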
