BizTalk 2010 Recipes : Document Schemas - Creating Flat File Schemas

3/26/2011 6:37:39 PM

1. Problem

You are consuming an inbound message in a flat file structure and must represent the data in an XML schema. The inbound flat file contains records that are both positional and delimited.

NOTE

Delimited files contain characters (such as commas) that separate the data. Files that are positional in nature contain data items that are a predefined length within the file. The physical position of the data defines what the data represents.

2. Solution

The solution outlined in this recipe consumes an inbound flat file schema message that may have a structure similar to the flat file shown in Listing 1 . Additionally, this recipe outlines the steps required to manually create a schema for a flat file. BizTalk 2010 also includes a Flat File Wizard for creating flat file schemas

NOTE

The number bar at the top of Listing 1 is included for reference only and is not part of the file content. The number bar is for counting the position of the characters.

Example 1. CustomerSalesOrder.txt

123456789012345678901234567890123456789012345678912345678901234567890
ORDER2004-10-24
SoldTo Shauna Marie    1223 Buttercup Lane Seattle       WA 98155
ShipTo Jen Schwinn     3030 Moby Road      Kent          WA 98110
ITEMS,ITEM111-AA|Grape|1|2.00|Bottle of Pop,ITEM111-AB|Cola|
1|2.00|Bottle of Pop

Additionally, the outbound BizTalk schema may have a structure similar to the XML file shown in Listing 2.

Example 2. CustomerSalesOrder.xml

<ns0:order xmlns:ns0="http://Schema_Chapter_Project.CustomerSalesOrder">
  <date>10/24/2004</date>
  <customerHeader>
    <customerType>SoldTo</customerType>
    <fullName>Shauna Marie</fullName>
    <street>1223 Buttercup Lane</street>
    <city>Seattle</city>
    <state>WA</state>
    <postal>98155</postal>
  </customerHeader>
  <customerHeader>
    <customerType>ShipTo</customerType>
    <fullName>Jen Schwinn</fullName>
    <street>3030 Moby Road</street>
    <city>Kent</city>
    <state>WA</state>
    <postal>98110</postal>
  </customerHeader>
  <items>
    <item>
      <productId>ITEM111-AA</productId>
      <productName>Grape</productName>
      <quantity>1</quantity>
      <unityPrice>2.00</unityPrice>
      <description>Bottle of Pop</description>
    </item>
    <item>
      <productId>ITEM111-AB</productId>
      <productName>Cola</productName>
      <quantity>1</quantity>
      <unityPrice>2.00</unityPrice>
      <description>Bottle of Pop</description>
    </item>
  </items>
</ns0:order>

Follow these steps to create the flat file schema:

Create a new BizTalk schema, and select the Flat File Schema template.
Determine the structure and layout of your message schema. The structure and layout of the message schema will largely determine how the inbound document is parsed. In the XML sample in Listing 2 , all data fields were defined as string elements. The customerHeader, items, and item nodes are defined as records.
Select the root node (the orders node in this example), and specify the child delimiter to be a carriage return and a line feed (CRLF). The most straightforward way to set the delimiter to a CRLF is by setting the child delimiter type to be Hexadecimal.
Specify the child delimiter property to be 0x0D 0x0A.
Set child order to be Infix.
Set tag identifier to read ORDER. The tag identifier property tells the schema where the data begins for the message. The children data for the root node of order are delimited by commas that appear in the middle of the data.
Based on the fact that there are two instances of customerHeader information, the max cardinality property for customerHeader record must be set to 2.
Set the structure for customerHeader to positional, since all of the child elements that represent customer information are related in a positional format within the flat file. Each child node that exists under customerHeader must have the position defined for the length of the data and the offset for where that value begins in the file. The way the value is represented starts from the left of the data element. For the length and offset of each element, see Table 1.
Set the next node tag value to ITEMS since Items is the next heading in the flat file.
Identify the delimiter as a comma, and set the child-order to prefix, since each item will be prefixed with a comma to indicate the beginning of that item.
Make sure that child delimiter type is set to character. Select item, and make sure the child delimiter is set to the pipe character (|), since the attributes for the items are delimited by the pipe character in the flat file.
Set the pipe character to infix, since each line has pipe characters set in between the delimited characteristics.

Table 1. Customer Header Child Elements
Element	Positional Length	Offset
customerType	7	0
fullName	16	0
street	20	0
city	14	0
state	3	0
postal	5	0

To test the flat file output of the schema, follow these steps:

Verify that Generate Instance Output Type on your schema reads Native. The Native property allows the schema to generate the native file format, which is a flat file, for that schema.
Right-click the schema, and select Generate Instance. You should see the default generated flat file.

To test the flat file schema to see the XML generated based on the provided flat file, follow these steps:

Verify that Generate Instance Output Type on your schema reads XML. The XML property allows the schema to process the inbound flat file and translate that flat file to an XML representation.
Right-click the schema, and select Validate Instance. You should see the XML version of the processed flat file.

3. How It Works

BizTalk is capable of processing both positional and delimited data, either in individual files or in a single file. The child delimiter is the key concept to keep in mind when creating a flat file schema. Any parent-level record that contains child elements or attributes must define whether the data in the flat file for those child records is delimited or positional and how the data is delimited.

Based on the layout of the destination message schema, you should consider the following when dealing with records versus dealing with child elements and attributes:

Records: If you use records to group child elements or attributes, consider how the child records will be demarcated. Will the child data be delimited, or is the child data positional? In the example in Listing 2-1 , each line of data is delimited by a CRLF. Knowing that each line of data is delimited by a CRLF aids in determining whether the output schema must support that specific delimiter. The basic line delimiter information points to the need of specifying a delimiter of a CRLF for the parent record of the output schema.

Tag identifiers: Records may contain tag identifiers to distinguish one type of record from another record. A tag value also allows you to identify where data begins in the file.

Positional elements/attributes: In the XML example in Listing 2-2, the customerHeader data is stored in positional format. For each child node, you must provide the offset (where to start reading the data) and the length for that data item. Additionally, the parent record must specify that the child data structure is Positional.

Delimited elements/attributes: The flat file example in Listing 2-1 shows multiple items occurring on the same line delimited by the pipe (|) character. The attributes related to a single item are then further delimited by the comma character. The item's parent record must specify that the child data structure is Delimited. Additionally, the child delimiter for the item's parent record must specify that each item is delimited by a pipe character.

Cardinality for records: By default, BizTalk sets the cardinality field for records and elements/attributes to a default value of 1. The value of 1 means that you expect to receive a maximum and minimum of one instance of that record and associated child values. If you expect an inbound flat file to contain more than a single record instance, you must change the max occurs value to a number equal to unbounded or the number of instances you expect to receive.

Wrap characters: If the incoming data contains characters that have been identified as delimiting characters (for example, commas), those characters can be ignored through the use of wrap characters. For example, if the record contained the name Shauna, Marie and you wanted to have the comma included as part of the name, you could define a wrap character of " (double quote) and enclose the name within the wrap character: "Shauna, Marie". BizTalk will treat any special characters defined within a set of wrap characters as field-level data.

Escape characters: The purpose of escape characters is very similar to that of wrap characters. Escape characters specify the character to be used to escape reserved characters, and then the reserved characters will be treated as literal characters in a message.

If the records are delimited, you must determine how the records are delimited. For managing CRLF type transactions, child delimiter type is set to hexadecimal, and the delimiter is set to 0x0D 0x0A. If the delimiter is a character value like a comma, set child delimiter type to character. The other key consideration for using delimiters is defining the child-order of the delimiter:

If the delimiter appears after the set of data, the child-order of the delimiter is postfix.
If the delimiter appears before the set of data, the delimiter is prefix.
If the delimiter appears in the middle of the set of data, the delimiter is infix.

The default child-order for a record is conditional default. The conditional default value means that if a tag value has been set for that record, then the child-order will be prefix. Otherwise, the child-order will be infix.

If the records are positional, you must determine the spacing of the delimited data. In the example, it was assumed that there were no spaces between the data (offsets set to 0) and the beginning of the data fields started at the left of each data value.

Another key consideration to keep in mind is the cardinality of the data groupings. When a BizTalk schema is created, by default, the cardinality is set to 1 (even if no cardinality value is explicitly set for a field). In the example, keeping the cardinality of the items set to the default would have caused some data to be lost when parsing both the customerHeader values and the item values. In the example, the cardinality of max value was changed to 2 to account for both the shipTo and soldTo values.

Other -----------------

- SharePoint 2010 : Testing the Three-State Workflow

- SharePoint 2010 : Reviewing the Workflow-Related Settings in Central Administration and Site Settings

- SharePoint 2010 : Defining Workflows in the Business Environment

- Exchange Server 2010 : Setting Up Public Folders (part 5) - Create and Configure a Public Folder

- Exchange Server 2010 : Setting Up Public Folders (part 4) - Create and Configure a Dynamic Distribution Group

- Exchange Server 2010 : Setting Up Public Folders (part 3)

- Exchange Server 2010 : Setting Up Public Folders (part 2) - Mail-Enable Public Folder & Configuring Public Folder Limits

- Exchange Server 2010 : Setting Up Public Folders (part 1) - Creating Public Folders & Configuring Public Folder Permissions

- Windows Server 2008 R2 : Managing Computers with Domain Policies (part 7)