BizTalk 2009 : Dealing with Extremely Large Messages (part 1) - Large Message Decoding Component

6/11/2012 5:45:03 PM
A major problem that many have discovered is that accommodating extremely large (200MB+) files can be a major performance bottleneck. The shame is that in many cases the documents being retrieved are simply going to be routed to another outbound destination. This is typical of the Enterprise Service Bus (ESB) type of architecture scenario. In short, an ESB is software that links internal and partner systems to each other, which is basically what BizTalk is designed to do out of the box. In these architectures, large files are generally routed through the ESB from an external party to an internal party, or between internal systems. Most of the time, the only logic that needs to be performed is routing logic. In many cases, this logic can be expressed as simple filter criteria based on the default message context data, or by examining data elements within the message, promoting them, and then implementing content-based routing (CBR). Also in many cases, the content of the message body is irrelevant beyond extracting the properties to promote.

The performance bottleneck comes into play when the entire file is received, parsed by the XMLReceive pipeline, and then stored in the Messagebox. If you have ever had to do this with a 200MB file, you know that even though it works, there is a nasty impact on CPU utilization on your BizTalk and SQL Server machines: CPU usage often goes to 100%, and system throughput essentially goes down the drain.

Now imagine having to process 10 or 20 of these per minute. The next problem is going to be sending the file. The system will essentially take this entire performance hit all over again when the large file needs to be read from SQL Server out of BizTalk and sent to the EPM. You can quickly see how this type of scenario, as common as it is, most often requires either significant hardware to implement or a queuing mechanism whereby only a small number of files can be processed at a time.

You'll find a simple solution in BizTalk Server's capability to natively understand and use streams. The following examples show a decoding component that will receive the incoming message, store the file to disk in a uniquely named file, and store the path to the file in the IBaseMessagePart.Data property. The end result will be a message that only contains the path to the text file in its data, but will have a fully well-formed message context so that it can be routed. The component will also promote a property that stores the fact that this is a "large encoded message." This property will allow you to route all messages encoded using this pipeline component to a particular send port/pipeline that has the corresponding encoding component. The encoding component will read the data element for the path to the file, open up a file stream object that is streaming the file stored to disk, set the stream to the 0 byte position, and set the IBaseMessagePart.Data property to the FileStream. The end result will be that the file is streamed by the BizTalk runtime from the file stored on the disk and is not required to pass through the Messagebox. Also, performance is greatly improved, and the CPU overhead on both the BizTalk Server host instance that is sending the file and the SQL Server hosting the BizTalk Messagebox is essentially nil.

The partner to this is the sending component. In many scenarios, BizTalk is implemented as a routing engine or an Enterprise Service Bus. This is a fancy way of saying that BizTalk is responsible for moving data from one location within an organization to another. In many cases, what does need to be moved is large amounts of data, either in binary format or in text files. This is often the case with payment or EDI-based systems in which BizTalk is responsible for moving the files to the legacy system where it can process them. In this scenario, the same performance problem (or lack of performance) will occur on the send side as on the receive side. To account for this, the examples also include a send-side pipeline component that is used to actually send the large file to the outbound destination adapter.
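The send-side step described above (read the path from the message data, open a file stream over the stored file, rewind it, and hand it to the message part) can be sketched without any BizTalk references. This is a minimal illustration under stated assumptions, not the encoding component itself: the module name and the OpenStoredMessage helper are hypothetical, the path is assumed to be UTF-8 text, and pathStream stands in for the stream you would read from IBaseMessagePart.Data.

```vb
Imports System
Imports System.IO
Imports System.Text

Module LargeFileEncodingSketch
    ' Reads the file path that the decoder left in the message body and
    ' returns a read-only stream over the file stored on disk.
    ' pathStream stands in for IBaseMessagePart.Data (assumption).
    Public Function OpenStoredMessage(ByVal pathStream As Stream) As FileStream
        Dim reader As New StreamReader(pathStream, Encoding.UTF8)
        Dim fullFileName As String = reader.ReadToEnd().Trim()

        Dim dataFile As New FileStream(fullFileName, FileMode.Open, _
                                       FileAccess.Read, FileShare.Read, 4096)
        ' Make sure the consumer starts reading at the first byte.
        dataFile.Position = 0
        Return dataFile
    End Function

    Sub Main()
        ' Simulate the decoder's output: a data file on disk and a
        ' message body that contains nothing but the path to that file.
        Dim tempFile As String = Path.GetTempFileName()
        File.WriteAllText(tempFile, "large payload")
        Dim body As New MemoryStream(Encoding.UTF8.GetBytes(tempFile))

        Using stored As FileStream = OpenStoredMessage(body)
            ' In a pipeline component the stream would be assigned to
            ' the body part's Data property instead of being read here.
            Console.WriteLine("Stored message length: {0}", stored.Length)
        End Using
        File.Delete(tempFile)
    End Sub
End Module
```

In a real component you would not dispose the returned stream yourself, because the runtime still has to read from it after the component returns.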

1. Caveats and Gotchas

The solution outlined previously works very well so long as the issues described in the following sections are taken into account. Do not simply copy and paste the code into your project and leave it at that. The solution provided in this section fundamentally alters some of the design principles of the BizTalk Server product. The most important one of these is that the data for the message is no longer stored in the Messagebox. A quick list of the pros and cons of the proposed solution is provided here:

Pros:

  • Provides extremely fast access for moving large messages

  • Simple to add new features

  • Reusable across multiple receive locations

  • Message containing context can be routed to orchestration, and data can be accessed from the disk

Cons:

  • No ability to apply BizTalk Map

  • No failover via Messagebox

  • Custom solution requiring support by developer

  • Need a scheduled task to clean up old data

1.1. Redundancy, Failover, and High Availability

As was stated earlier, the data for the large message will no longer be stored in SQL Server. This is fundamentally different from how Microsoft designed the product. If the data within the message is important and the system is a mission-critical one that must properly deal with failovers and errors, you need to make sure that the storage location for the external file is also as robust as your SQL Server environment. Most architects in this situation will simply create a share on the clustered SQL Server shared disk array. This share is available to all BizTalk machines in the BizTalk Server Group, and since it is stored on the shared array or the storage area network (SAN), it should be as reliable as the data files for SQL Server.

1.2. Dealing with Message Content and Metadata

A good rule of thumb for this type of solution is to avoid looking at the message data at all costs once the file has been received. Consider the following: assume that you have received your large file into BizTalk and you need to process it through an orchestration for some additional logic. What happens? You will need to write .NET components to read the file and manually parse it to get the data you need. The worst-case scenario is that you need to load the data into an XMLDom or something similar. This will have performance implications and can negate the entire reason for the special large-file handling you are implementing.

If you know you are going to need data either within an orchestration or for CBR, make sure you write the code to gather this data within either the receiving or sending pipeline components. Only open the large data file at the time when it is being processed within the pipeline if you can. The best approach is to promote properties or create custom distinguished fields using code from within the component itself, which you can access from within BizTalk with little performance overhead.

1.3. Cleaning Up Old Data

If you read through the code in the section "Large Message Encoding Component (Send Side)," you will notice that there is no code that actually deletes the message from the server. There is a good reason for this. Normally you would think that once the message has flowed through the send pipeline it would be okay to delete it, but this is not true. What about a send-side adapter error? Imagine if you were sending the file to an FTP server and it was down; BizTalk will attempt to resend the message after the retry period has been reached. Because of this, you can't simply delete the file at random. You must employ a managed approach.

The only real solution to this would be to have a scheduled task that executes every few minutes that is responsible for cleaning up the data directory. You will notice that the name of the file is actually the InterchangeID GUID for the message flow. The InterchangeID provides you with a common key that you can use to query each of the messages that have been created throughout the execution path. The script that executes needs to read the name of the file and use WMI to query the Messagebox and determine whether there are any suspended or active messages for that Interchange. If there are, it doesn't delete the file; otherwise, it will delete the data file. 
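The cleanup pass described above might look something like the following sketch. It is only an outline under several assumptions: the module name, the CleanUp function, and the hasLiveInstances callback are hypothetical, and the callback stands in for the WMI query against the Messagebox, which is not shown. The minimum-age check is an extra safety margin added for illustration so that files still being written are left alone.

```vb
Imports System
Imports System.IO

Module LargeFileCleanupSketch
    ' Deletes stored message files that are old enough and whose message
    ' flow has no active or suspended instances left.
    ' hasLiveInstances is a hypothetical callback standing in for the
    ' WMI query against the Messagebox.
    Public Function CleanUp(ByVal folder As String, _
                            ByVal minimumAge As TimeSpan, _
                            ByVal hasLiveInstances As Func(Of Guid, Boolean)) _
                            As Integer
        Dim deleted As Integer = 0
        For Each fileName As String In Directory.GetFiles(folder, "*.msg")
            Dim flowId As Guid
            Try
                ' The decoder names each data file after a GUID.
                flowId = New Guid(Path.GetFileNameWithoutExtension(fileName))
            Catch ex As FormatException
                Continue For ' Not written by the decoder; leave it alone.
            End Try

            ' Skip recent files (assumed safety margin, not in the original text).
            Dim age As TimeSpan = DateTime.UtcNow - File.GetLastWriteTimeUtc(fileName)
            If age < minimumAge Then Continue For

            If Not hasLiveInstances(flowId) Then
                File.Delete(fileName)
                deleted += 1
            End If
        Next
        Return deleted
    End Function

    Sub Main()
        ' Build a throwaway folder holding one two-hour-old data file.
        Dim folder As String = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString())
        Directory.CreateDirectory(folder)
        Dim dataFile As String = Path.Combine(folder, Guid.NewGuid().ToString() & ".msg")
        File.WriteAllText(dataFile, "payload")
        File.SetLastWriteTimeUtc(dataFile, DateTime.UtcNow.AddHours(-2))

        ' Pretend that no message flow has live instances.
        Dim removed As Integer = CleanUp(folder, TimeSpan.FromHours(1), _
                                         Function(id As Guid) False)
        Console.WriteLine("Deleted {0} file(s)", removed)
        Directory.Delete(folder, True)
    End Sub
End Module
```

Keeping the Messagebox lookup behind a callback lets you test the file handling on its own and swap in the real query later.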

1.4. Looping Through the Message

As stated previously, if you do know you will need the data within the message at runtime, and this data is of an aggregate nature (sums, averages, counts, etc.), only loop through the file once. This seems like a commonsense thing, but it is often overlooked. If you need to loop through the file, try to get all the data you need in one pass rather than several. This can have dramatic effects on how your component will perform.
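As a rough illustration of the single-pass idea, the sketch below gathers a count, a sum, and an average in one read of a stream. The file layout (one decimal amount per line), the module name, and the FileTotals and Summarize names are all assumptions made for the example; a real component would parse whatever format the large file actually uses and promote the results to the message context.

```vb
Imports System
Imports System.Globalization
Imports System.IO
Imports System.Text

Module SinglePassAggregateSketch
    ' Holds every aggregate gathered during the single pass.
    Public Structure FileTotals
        Public Count As Long
        Public Sum As Decimal
        Public ReadOnly Property Average() As Decimal
            Get
                If Count = 0 Then Return 0D
                Return Sum / Count
            End Get
        End Property
    End Structure

    ' Reads the stream once, line by line, and updates all totals as it goes.
    ' Assumes one decimal amount per line; other lines are ignored.
    Public Function Summarize(ByVal data As Stream) As FileTotals
        Dim totals As New FileTotals()
        Dim reader As New StreamReader(data)
        Dim line As String = reader.ReadLine()
        While line IsNot Nothing
            Dim amount As Decimal
            If Decimal.TryParse(line, NumberStyles.Number, _
                                CultureInfo.InvariantCulture, amount) Then
                totals.Count += 1
                totals.Sum += amount
            End If
            line = reader.ReadLine()
        End While
        Return totals
    End Function

    Sub Main()
        Dim text As String = "10.5" & Environment.NewLine & "20" & _
                             Environment.NewLine & "29.5"
        Dim sample As New MemoryStream(Encoding.UTF8.GetBytes(text))
        Dim totals As FileTotals = Summarize(sample)
        Console.WriteLine("Count={0} Sum={1} Average={2}", _
                          totals.Count, totals.Sum, totals.Average)
    End Sub
End Module
```

Because the reader never buffers more than one line, memory use stays flat no matter how large the file is.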

2. Large Message Decoding Component (Receive Side)

This component is to be used on the receive side when the large message is first processed by BizTalk. You will need to create a custom receive pipeline and add this pipeline component to the Decode stage. From there, use the SchemaWithNone property to select the desired inbound schema type if needed. If the file is a flat file or a binary file, then this step is not necessary, because the message will not contain any namespace or type information. This component relies on a property schema being deployed that will be used to store the location to the file within the message context. This schema can also be used to define any custom information such as counts, sums, and averages that is needed to route the document or may be required later on at runtime.

Imports System
Imports System.IO
Imports System.Text
Imports System.Drawing
Imports System.Resources
Imports System.Reflection
Imports System.Diagnostics
Imports System.Collections
Imports System.ComponentModel
Imports Microsoft.BizTalk.Message.Interop
Imports Microsoft.BizTalk.Component.Interop
Imports Microsoft.BizTalk.Component
Imports Microsoft.BizTalk.Messaging
Imports Microsoft.BizTalk.Component.Utilities

Namespace Probiztalk.Samples.PipelinesComponents
    <ComponentCategory(CategoryTypes.CATID_PipelineComponent), _
    System.Runtime.InteropServices.Guid("89dedce4-0525-472f-899c-64dc66f60727"), _
    ComponentCategory(CategoryTypes.CATID_Decoder)> _
    Public Class LargeFileDecodingComponent
        Implements IBaseComponent, IPersistPropertyBag, IComponentUI, _
        Global.Microsoft.BizTalk.Component.Interop.IComponent, IProbeMessage

Private _OutBoundFileDocumentSpecification As SchemaWithNone = _
    New Global.Microsoft.BizTalk.Component.Utilities.SchemaWithNone("")
Private _InboundFileDocumentSpecification As SchemaWithNone = _
    New Global.Microsoft.BizTalk.Component.Utilities.SchemaWithNone("")
Private _ThresholdSize As Integer = 4096

Private resourceManager As System.Resources.ResourceManager = _
New System.Resources.ResourceManager( _
    "Probiztalk.Samples.PipelineComponents.LargeFileDecodingComponent", _
    [Assembly].GetExecutingAssembly)
Private Const PROPERTY_SCHEMA_NAMESPACE = _
    "http://LargeFileHandler.Schemas.LargeFilePropertySchema"
Private _FileLocation As String

'<summary>
'this property will contain a single schema
'</summary>
<Description("The inbound request document specification. " & _
"Only messages of this type will be accepted by the component.")> _
<DisplayName("Inbound Specification")> _
Public Property InboundFileDocumentSpecification() As _
    Global.Microsoft.BizTalk.Component.Utilities.SchemaWithNone
    Get

        Return _InboundFileDocumentSpecification
    End Get
    Set(ByVal Value As _
        Global.Microsoft.BizTalk.Component.Utilities.SchemaWithNone)
        _InboundFileDocumentSpecification = Value
    End Set
End Property

'<summary>
'this property will contain a single schema
'</summary>
<Description("The Large File Message specification. " & _
                "The component will create messages of this type.")> _
<DisplayName("Outbound Specification")> _
Public Property OutBoundFileDocumentSpecification() As _
    Global.Microsoft.BizTalk.Component.Utilities.SchemaWithNone
    Get
        Return _OutBoundFileDocumentSpecification
    End Get
    Set(ByVal Value As _
        Global.Microsoft.BizTalk.Component.Utilities.SchemaWithNone)
        _OutBoundFileDocumentSpecification = Value
    End Set
End Property

        <Description("Threshold value in bytes for incoming file to determine " & _
        "whether or not to treat the message as large. Default is 4096 bytes.")> _
        <DisplayName("Threshold file size")> <DefaultValue(4096)> _
        Public Property ThresholdSize() As Integer
            Get
                Return Me._ThresholdSize
            End Get
            Set(ByVal value As Integer)
                Me._ThresholdSize = value
            End Set
        End Property

        <Description("Directory for storing decoded large messages. " & _
                        "Defaults to C:\Temp.")> _
        <DisplayName("Large File Folder Location")> _
        Public Property LargeFileFolder() As String
            Get
                Return Me._FileLocation
            End Get
            Set(ByVal value As String)
                Me._FileLocation = value
            End Set
        End Property
    '<summary>
    'Name of the component
    '</summary>
    <Browsable(False)> _
    Public ReadOnly Property Name() As String Implements _
        Global.Microsoft.BizTalk.Component.Interop.IBaseComponent.Name
        Get
            Return resourceManager.GetString("COMPONENTNAME", _
                        System.Globalization.CultureInfo.InvariantCulture)
        End Get
    End Property

'<summary>
'Version of the component
'</summary>
<Browsable(False)> _
Public ReadOnly Property Version() As String Implements _
    Global.Microsoft.BizTalk.Component.Interop.IBaseComponent.Version
    Get
        Return resourceManager.GetString("COMPONENTVERSION", _
               System.Globalization.CultureInfo.InvariantCulture)
    End Get
End Property

'<summary>
'Description of the component
'</summary>
<Browsable(False)> _
Public ReadOnly Property Description() As String Implements _
    Global.Microsoft.BizTalk.Component.Interop.IBaseComponent.Description
    Get
        Return resourceManager.GetString("COMPONENTDESCRIPTION", _
                    System.Globalization.CultureInfo.InvariantCulture)
    End Get
End Property

'<summary>
'Component icon to use in BizTalk Editor
'</summary>
<Browsable(False)> _
Public ReadOnly Property Icon() As IntPtr Implements _
    Global.Microsoft.BizTalk.Component.Interop.IComponentUI.Icon
    Get
        Return CType(Me.resourceManager.GetObject("COMPONENTICON", _
            System.Globalization.CultureInfo.InvariantCulture), _
            System.Drawing.Bitmap).GetHicon
    End Get
End Property

'<summary>
'Gets class ID of component for usage from unmanaged code.
'</summary>
'<param name="classid">
'Class ID of the component
'</param>
Public Sub GetClassID(ByRef classid As System.Guid) _
Implements _
Global.Microsoft.BizTalk.Component.Interop.IPersistPropertyBag.GetClassID
    classid = New System.Guid("89dedce4-0525-472f-899c-64dc66f60727")
End Sub

'<summary>
'not implemented
'</summary>
Public Sub InitNew() _
Implements _
Global.Microsoft.BizTalk.Component.Interop.IPersistPropertyBag.InitNew
End Sub

'<summary>
'Loads configuration properties for the component
'</summary>
'<param name="pb">Configuration property bag</param>
'<param name="errlog">Error status</param>
Public Overridable Sub Load( _
    ByVal pb As Global.Microsoft.BizTalk.Component.Interop.IPropertyBag, _
    ByVal errlog As Integer) _
    Implements _
    Global.Microsoft.BizTalk.Component.Interop.IPersistPropertyBag.Load

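    Try
        'ReadPropertyBag returns Nothing for a property that has not been
        'configured, so apply the default explicitly in that case.
        Dim threshold As Object = ReadPropertyBag(pb, "ThresholdSize")
        Me._ThresholdSize = If(threshold Is Nothing, 4096, CInt(threshold))
    Catch
        Me._ThresholdSize = 4096
    End Try

    Try
        Dim location As Object = ReadPropertyBag(pb, "FileLocation")
        Me._FileLocation = If(location Is Nothing, "C:\Temp", CStr(location))
    Catch
        Me._FileLocation = "C:\Temp"
    End Try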
    Try
        Me.InboundFileDocumentSpecification = New _
            SchemaWithNone( _
                ReadPropertyBag(pb, "InboundFileDocumentSpecification"))
    Catch
        Me.InboundFileDocumentSpecification = New SchemaWithNone("")
    End Try
    Try
        Me.OutBoundFileDocumentSpecification = New _
            SchemaWithNone( _
                ReadPropertyBag(pb, "OutboundFileDocumentSpecification"))
    Catch
        Me.OutBoundFileDocumentSpecification = New SchemaWithNone("")
    End Try
End Sub

'<summary>
'Saves the current component configuration into the property bag
'</summary>
'<param name="pb">Configuration property bag</param>
'<param name="fClearDirty">not used</param>
'<param name="fSaveAllProperties">not used</param>
Public Overridable Sub Save( _
    ByVal pb As Global.Microsoft.BizTalk.Component.Interop.IPropertyBag, _
    ByVal fClearDirty As Boolean, ByVal fSaveAllProperties As Boolean) _
    Implements Global.Microsoft.BizTalk.Component.Interop. _
    IPersistPropertyBag.Save

    WritePropertyBag(pb, "ThresholdSize", Me._ThresholdSize)
    WritePropertyBag(pb, "FileLocation", Me._FileLocation)
    WritePropertyBag(pb, "InboundFileDocumentSpecification", _
                     _InboundFileDocumentSpecification.SchemaName)
    WritePropertyBag(pb, "OutboundFileDocumentSpecification", _
                     _OutBoundFileDocumentSpecification.SchemaName)

End Sub

'<summary>
'Reads property value from property bag
'</summary>
'<param name="pb">Property bag</param>
'<param name="propName">Name of property</param>
'<returns>Value of the property</returns>
Private Function ReadPropertyBag( _
    ByVal pb As Global.Microsoft.BizTalk.Component.Interop.IPropertyBag, _
    ByVal propName As String) As Object

    Dim val As Object = Nothing
    Try
        pb.Read(propName, val, 0)
    Catch e As System.ArgumentException
        Return val
    Catch e As System.Exception
        Throw New System.ApplicationException(e.Message)
    End Try
    Return val
End Function

'<summary>
'Writes property values into a property bag.
'</summary>
'<param name="pb">Property bag.</param>
'<param name="propName">Name of property.</param>
'<param name="val">Value of property.</param>
Private Sub WritePropertyBag( _
ByVal pb As Global.Microsoft.BizTalk.Component.Interop.IPropertyBag, _
ByVal propName As String, ByVal val As Object)
    Try
        pb.Write(propName, val)
    Catch e As System.Exception
        Throw New System.ApplicationException(e.Message)
    End Try
End Sub

'<summary>
'The Validate method is called by the BizTalk Editor during the build
'of a BizTalk project.
'</summary>
'<param name="obj">An Object containing the
'configuration properties.</param>
'<returns>The IEnumerator enables the caller to enumerate through a
'collection of strings containing error messages. These error messages
'appear as compiler error messages. To report successful property
'validation, the method should return an empty enumerator.</returns>
Public Function Validate(ByVal obj As Object) As _
System.Collections.IEnumerator Implements _
Global.Microsoft.BizTalk.Component.Interop.IComponentUI.Validate
    'example implementation:
    'ArrayList errorList = new ArrayList();
    'errorList.Add("This is a compiler error");
    'return errorList.GetEnumerator();
    Return Nothing
End Function
'<summary>
'Called by the messaging engine when a new message arrives.
'Checks whether the incoming message is in a recognizable format.
'If it is, only this component within this stage will be
'executed (FirstMatch equals true).
'</summary>
'<param name="pc">the pipeline context</param>
'<param name="inmsg">the actual message</param>
Public Function Probe(ByVal pc As _
    Global.Microsoft.BizTalk.Component.Interop.IPipelineContext, _
    ByVal inmsg As Global.Microsoft.BizTalk.Message.Interop.IBaseMessage) _
    As Boolean Implements Global.Microsoft.BizTalk.Component. _
    Interop.IProbeMessage.Probe

    Dim xmlreader As New Xml.XmlTextReader(inmsg.BodyPart.Data)
    xmlreader.MoveToContent()

    If (_InboundFileDocumentSpecification.DocSpecName = _
            xmlreader.NamespaceURI.Replace("http://", "")) Then
        Return True
    Else
        Return False
    End If

End Function
'<summary>
'Implements IComponent.Execute method.
'</summary>
'<param name="pContext">Pipeline context</param>
'<param name="inmsg">Input message</param>
'<returns>Original input message</returns>
'<remarks>
'IComponent.Execute method is used to initiate
'the processing of the message in this pipeline component.
'</remarks>
Public Function Execute(ByVal pContext As IPipelineContext, _
ByVal inmsg As IBaseMessage) _
As Global.Microsoft.BizTalk.Message.Interop.IBaseMessage _
Implements Global.Microsoft.BizTalk.Component.Interop.IComponent.Execute
    'Build the message that is to be sent out but only if it is greater
    'than the threshold
    If inmsg.BodyPart.GetOriginalDataStream.Length > Me._ThresholdSize Then
        StoreMessageData(pContext, inmsg)
    End If
    Return inmsg
End Function

'<summary>
'Method used to write the message data to a file and promote the
'location to the MessageContext.
'</summary>
'<param name="pContext">Pipeline context</param>
'<param name="inMsg">Input message to be assigned</param>
'<returns>Original input message by reference</returns>
'<remarks>
'Receives the input message ByRef, then replaces the body part's Data
'property with a stream that contains the path to the stored file
'</remarks>
Private Sub StoreMessageData(ByVal pContext As IPipelineContext, _
                             ByRef inMsg As IBaseMessage)
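    'Path.Combine supplies the directory separator, so the default
    'folder of C:\Temp does not need a trailing backslash.
    Dim FullFileName As String = Path.Combine(_FileLocation, _
                        inMsg.MessageID.ToString + ".msg")
    Dim dataFile As New FileStream(FullFileName, FileMode.CreateNew, _
                        FileAccess.ReadWrite, FileShare.ReadWrite, 4096)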
    Dim myMemoryStream As Stream = inMsg.BodyPart.GetOriginalDataStream

    Dim Buffer(4095) As Byte
    Dim byteCount As Integer

    'Not really needed, just want to initialize the data within
    'the message part to something.
    'Proper way to do this would be to create a separate XML
    'schema for messages which have been encoded using the
    'encoder, create a new empty document which has an element
    'named "FilePath" and set the value of the element
    'to FullFileName. But at least this way we can see the value in
    'the document should we need to write it out
    Dim myStream As New MemoryStream( _
        Encoding.UTF8.GetBytes(FullFileName))

    If myMemoryStream.CanSeek Then
        myMemoryStream.Position = 0
    Else
        'Impossible to occur, but added it anyway
        Throw New Exception("The stream is not seekable")
    End If

    'Copy the message to disk in 4KB chunks, writing only the bytes
    'that were actually read on each pass.
    byteCount = myMemoryStream.Read(Buffer, 0, Buffer.Length)
    While byteCount > 0
        dataFile.Write(Buffer, 0, byteCount)
        byteCount = myMemoryStream.Read(Buffer, 0, Buffer.Length)
    End While
    dataFile.Flush()
    dataFile.Close()
    inMsg.BodyPart.Data = myStream
    inMsg.Context.Promote("LargeFileLocation", _
                          PROPERTY_SCHEMA_NAMESPACE, FullFileName)

    'Useful for CBR operations - i.e. route all messages that are
    'large to a specific send port.
    inMsg.Context.Promote("IsEncoded", PROPERTY_SCHEMA_NAMESPACE, True)

End Sub

    End Class
End Namespace				  