Now that you understand the three phases, we want to
give you a ready-to-bake performance plan that has been proven to work
over countless performance-tuning engagements. We do that in the
following subsections. The expected deliverables or outputs of each
phase are described so that you can have a good appreciation for the
amount of work that should be invested into each phase.
1. Phase 1: Plan Your Tests
The first thing you need to do
is write down how you will test BizTalk, the types and frequency of
transactions you will model, and how you will measure and grade the
performance of the system against the performance metric requirements.
This plan and its associated deliverables will essentially be the
performance acceptance criteria and sign-off for the solution owner.
Luckily, this plan is spelled out in detail in this section for you.
These are the expected deliverables from this phase:
Performance test plan:
This is a detailed plan for each type of performance test that will be executed.
A transaction set:
This is a complete description of what transactions will be modeled for each type of performance test.
Required test data:
This is an analysis of the type of data that will be required for each system.
Downstream system impact:
This is the impact of any downstream systems that will be updated as a result of the tests.
Transaction volumes:
This will be a quantifiable number of transactions per second that represents the frequency of new messages being submitted to BizTalk.
One of the first steps that
you'll want to perform in phase 1 is to model your transactions. The
results from that step feed into several of the deliverables in the
preceding list.
1.1. Modeling Your Transactions
Modeling transactions is by far the most important step in this phase. The goal of this step is to ensure that what you model accurately represents the real-world usage of the system. Up until now, the majority of documents that you have created have been developer test data, or dummy data.
This also may be the first time
that a system has been run end-to-end in an integrated manner under
load. Often there is a series of integration tests that are done prior
to this phase, but you can also use this phase to kick off the
integrated system phase of your project since the deliverables and
outputs will be very similar.
What you need to do is get a
representation of the business transactions that will be performed once
the system is put into production. Often there are "read-only" transactions and "update" transactions. Read-only transactions would be either product inventory data requests or product catalog data requests; in the banking sector, these would be account balance requests. Update transactions are exactly that: they update some other system based on the data sent in the document. These are typically purchase orders or account withdrawal types of transactions.
The goal here is to work with
your business analysts and system owners to figure out what the most
common types of transactions will be and what transactions have the
biggest effect on the business. Most solutions have dozens of different
transaction types, so you will have to pick an arbitrary number (say, the top five transactions) and focus on them. Most projects do not have the time to performance test all potential transaction types accurately, so you will need to figure out which ones give you the biggest bang for your buck.
1.1.1. Determine Transaction Frequency
As part of modeling transactions, you need to determine the rate and frequency at which they occur. The terms rate and frequency may sound synonymous, but we really are referring to two different things. Rate refers to quantity, and frequency refers to how often.
For example,
assume we are modeling two potential transaction types that can occur
within our system within 1 hour of production. Our duration for the
performance test will be 1 hour, and we need to figure out exactly how
many times each of those transactions will show up within that 1-hour
period. Let's assume that we have purchase order transactions and stock
verification transactions; the first is an update, and the second is a
read. When we receive purchase orders, we actually receive them in
batches since the customers will wait until their "purchase order
bucket" is full before submitting them. On average, each customer will
send us 30 purchase orders at a time. Also on average, we receive orders
from a customer every 2 minutes. In this case, the rate will be 30
documents, and the frequency will be 2 minutes. The stock verification
requests come in as they are needed, and on average we receive three
every second. In this case, we will be receiving three documents per
second. It is very important to think about your transaction volumes
like this because it will allow you to have a common language with the
business owners, who really have no idea about the intricacies of performance testing but do understand how many orders/stock requests they receive each day.
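To make those volume figures concrete, here is a minimal sketch that converts the rate/frequency numbers from the example above into messages per second and total messages for a 1-hour test window. The transaction names and numbers are purely illustrative.

```python
# Convert "rate every frequency" figures into messages per second for a
# 1-hour test window, using the illustrative purchase order and stock
# verification numbers from the example above.

TEST_DURATION_SECONDS = 60 * 60  # 1-hour performance test

transactions = [
    # (name, documents per submission, seconds between submissions)
    ("PurchaseOrder", 30, 120),       # 30 orders per batch, one batch every 2 minutes
    ("StockVerification", 1, 1 / 3),  # 3 single requests per second
]

for name, rate, frequency_seconds in transactions:
    per_second = rate / frequency_seconds
    total = per_second * TEST_DURATION_SECONDS
    print(f"{name}: {per_second:.2f} msg/sec, {total:.0f} messages over the test")

# PurchaseOrder: 0.25 msg/sec, 900 messages over the test
# StockVerification: 3.00 msg/sec, 10800 messages over the test
```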
You also need to model the
"bell curve" of your application. Almost every solution in the world has
something called the 8 to 5 effect, in that it receives 95% of its usage during regular business hours. Knowing this is important because
if you are unable to determine the frequency of your document submission
and have only "orders per day" numbers, then you need to take that
effect into account when extrapolating what your usage will look like.
Often the usage bands are much smaller—more like 10 a.m. to 2 p.m. This
type of load is very "bursty" and often lends itself to a system that is
always in an overdrive state during regular hours.
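If all you have is an "orders per day" figure, a quick back-of-the-envelope calculation like the sketch below can translate it into a peak-hour rate. The daily volume, the 95% concentration, and the window lengths are assumptions you would state in your test plan and confirm with the business owners.

```python
# Extrapolate a daily volume into a peak messages-per-second figure,
# assuming most of the traffic lands inside a narrow business-hours band.
# All numbers here are assumptions to confirm with the business owners.

orders_per_day = 50_000       # daily figure supplied by the business
business_hours_share = 0.95   # "8 to 5 effect": 95% of traffic in the band

def peak_rate(daily_volume, share, band_hours):
    """Messages per second if `share` of the daily volume arrives within `band_hours`."""
    return (daily_volume * share) / (band_hours * 3600)

print(f"8 a.m. to 5 p.m. band: {peak_rate(orders_per_day, business_hours_share, 9):.2f} msg/sec")
print(f"10 a.m. to 2 p.m. band: {peak_rate(orders_per_day, business_hours_share, 4):.2f} msg/sec")
```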
You also need to account
for the 4:30 effect. This is a pattern that sometimes occurs at the end
of the day when users of the system attempt to jam in all the remaining
data at the end of the day before they go home. This last-minute effort
often leads to a floodgate scenario in which the load unexpectedly jumps
to a point where the system is above its steady state. We will talk
about this effect along with the types of performance tests you should
plan to execute to account for this and several other "normal" usage
spikes.
The other thing to be aware of
is the "not so often but really big" transaction that occurs very
infrequently but will have a major impact on the solution. These are
often things like product updates or batch updates that can potentially
have thousands of nested transactions within them. If your solution has
this type of transaction, you need to determine whether it will be
processed during the day and whether it will have a dramatic impact on
performance. If it is possible, consider offloading these transactions
to a receive location that is active only during nonpeak hours.
1.1.2. Understand the Background Noise
In most solutions, there will
be a certain amount of operational work that will be happening in
parallel to BizTalk. We refer to this work as "background noise." It's
important to understand the types of noise that you might encounter and
to account for the effects of background noise in your testing. Here are
some things to watch for:
Traffic being serviced from a public web site:
Often a web site that is
taking customer orders, for example, will have its database physically
located on the same database cluster as BizTalk. You will need to
account for this load by having a web site load test running at the same
time as your BizTalk performance test if this situation applies to you.
BizTalk background jobs:
There are many BizTalk jobs that execute on a scheduled basis and can affect the overall throughput of the system. It is important to make sure that the SQL Server Agent is started for the duration of your performance tests so that these jobs are running while the tests take place (a simple pre-test check is sketched after this list).
System scheduled tasks:
Are there any
operational scripts or scheduled tasks that run during the day that can
affect performance such as backup jobs, log shipping jobs, cleanup
scripts, and so on? In a perfect world, these should be running as well
while the performance test is taking place.
Other BizTalk applications:
Is the hardware to be used dedicated exclusively to this system, or are there other BizTalk applications running on it? If it is a shared
environment, then performance characteristics, uptime requirements, and
transaction volumes will need to be considered for all applications
running in the environment.
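Because it is easy to forget checks such as the SQL Server Agent item above, a small pre-test script can act as a guard. The sketch below is a minimal example that shells out to the Windows service controller; the service name SQLSERVERAGENT assumes a default SQL Server instance and would differ for a named instance.

```python
# Pre-test sanity check (run on the SQL Server machine): verify that the
# SQL Server Agent service is running so the BizTalk background jobs fire
# during the test. "SQLSERVERAGENT" is the default-instance service name;
# a named instance typically uses SQLAgent$<InstanceName>.
import subprocess
import sys

result = subprocess.run(["sc", "query", "SQLSERVERAGENT"], capture_output=True, text=True)

if "RUNNING" in result.stdout:
    print("SQL Server Agent is running; background jobs will execute during the test.")
else:
    print("SQL Server Agent is NOT running; start it before beginning the performance test.")
    sys.exit(1)
```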
1.1.3. Create Test Data
To properly load test a system,
you need to have data that accurately represents what you will be
expecting in production. This is where any old data from a solution that
you are replacing becomes invaluable. If you don't have such data, then
what teams often do is start creating spreadsheets of test data. Each
sheet will represent one document type, and each column becomes a field
in that message. This way, the data can easily be loaded into a database
(you will see why that is important later). Be wary of repeating IDs
such as OrderId or CustomerId, which can cause errors should you get a
duplicate entry (don't worry, we will give you a solution to that in a
bit). Try to give yourself enough data variety so that you are not
repeating the same test data over and over again. You would be surprised
to see how many performance problems are masked because every customer
is named "Jon Doe" or every company is named" "Contoso" and they are all
ordering the same products.
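If you do have to create test data from scratch, a small generator along the lines of the following sketch can produce the per-document-type sheets described above while guaranteeing unique IDs and some variety. The field names, value pools, and output file are illustrative placeholders, not a real message schema.

```python
# Generate a CSV "sheet" of purchase order test data with unique OrderIds
# and varied customers/products, so performance problems are not masked by
# every document carrying identical values. Fields and values are
# illustrative placeholders only.
import csv
import random
import uuid

customers = ["Contoso", "Fabrikam", "Northwind Traders", "Adventure Works"]
products = ["WIDGET-100", "WIDGET-200", "GADGET-050", "SPROCKET-900"]

with open("purchase_orders.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["OrderId", "CustomerName", "ProductCode", "Quantity"])
    for _ in range(10_000):
        writer.writerow([
            uuid.uuid4(),              # unique OrderId avoids duplicate-entry errors
            random.choice(customers),  # vary the customer
            random.choice(products),   # vary the product
            random.randint(1, 50),
        ])
```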
1.2. Planning Your Test Types
There are several
different types of performance tests that you will need to implement.
The goals of each are different but equally important. The types of
tests are as follows:
Steady state:
This is what you define
as a "typical" usage scenario. This needs to model what you believe will
be the real-world usage of your solution given the assumptions you made
in phase 1. Based on this scenario, you will determine whether your
solution is sustainable, meaning that in its regular operating state it can adequately process the rate at which messages are published and subscribed with no backlog. This is usually where you figure out
whether the hardware for your solution is adequate. Additionally, should
you determine that your solution is not sustainable, you need to start
identifying bottlenecks or areas to tune in your solution to make it
sustainable before proceeding with the other tests. In reality, if your
solution is not sustainable under a steady-state condition, you have a
serious problem that you need to fix by purchasing more
hardware/software, tuning the application, or doing a combination of
both.
Floodgate:
Assuming your solution
is in a steady state that is sustainable, this scenario accounts for
spikes in message processing that can occur normally throughout the
day." Part of the deliverables of performance testing is to make
assumptions about what these spikes will look like based on past usage
data, or SWAG (scientific wild-ass guess). The important factor here is
to state your assumptions in your performance deliverable and have your
business/system owners agree or disagree with your assumptions. These
assumptions are what drives the modeling and are really the important
tool to gain acceptance of the performance test approach.
Overdrive:
This is a test that
measures what the solution performance will look like given a constant
message delivery rate that exceeds the anticipated load and creates a
backlog of work. Overdrive load is load that is clearly not sustainable.
The important take-away from this test is, can my solution continue to
function and eventually clear the backlog assuming a constant and
overwhelming flood of messages? You need to understand whether BizTalk
will continue to process messages if it gets overloaded or whether the
entire solution comes to a grinding halt. Assuming the overdrive condition stops, how long does it take to clear the backlog, and is that time linear and predictable? When the system is in overdrive, is the message processing rate constant, or does it degrade because of backups in correlating messages or related processes that also are not completing? (A simple way to express all three load profiles as submission schedules is sketched after this list.)
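One way to keep the three test types comparable is to express each of them as a submission schedule derived from the same agreed steady-state rate. The following sketch shows the idea; the steady-state rate, multipliers, and spike window are assumptions you would replace with your own agreed figures.

```python
# Express the three test types as per-minute submission schedules relative
# to the agreed steady-state rate. The rate, multipliers, and spike window
# are illustrative assumptions, not prescribed values.

STEADY_STATE_MSG_PER_SEC = 3.25  # e.g., purchase orders plus stock verifications

def schedule(test_type, duration_minutes=60):
    """Return (minute, messages to submit during that minute) pairs for a run."""
    plan = []
    for minute in range(duration_minutes):
        rate = STEADY_STATE_MSG_PER_SEC
        if test_type == "floodgate" and 30 <= minute < 40:
            rate *= 3   # a 10-minute spike on top of normal load
        elif test_type == "overdrive":
            rate *= 2   # constant, deliberately unsustainable load
        plan.append((minute, int(rate * 60)))
    return plan

for test in ("steady-state", "floodgate", "overdrive"):
    total = sum(messages for _, messages in schedule(test))
    print(f"{test}: {total} messages over a 1-hour run")
```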
1.3. Determining Your Exit Criteria
Knowing when to stop is often
overlooked but is very important. How do you know when you are done?
Each performance test/tuning exercise needs to document what the exit
criteria are for that test. The following are a few different options for
exit criteria:
Pass—Performance criteria met immediately:
The system was able to meet the performance criteria without any tuning required by the development team.
Pass—Performance criteria met after tuning:
The team was able to tune the solution enough to meet the performance criteria.
Fail—Performance was not met, significant redevelopment needed:
This isn't necessarily a
bad thing. What you are stating here is that the current code does not
meet the requirements. You need to qualify that with what the current
performance level is, along with an estimate to rework the solution to
have it meet the requirements. If the estimate to redesign, develop, and
test that piece of the application is quite big, the project
management/business owner may decide that the current performance is
acceptable and not worth the additional cost. This way, the decision is
based on a dollar value vs. the benefit of the additional work and not
on an emotional "but I need this" basis.
Fail—Performance was not met, additional hardware required:
Again, this isn't
necessarily a bad thing. What you are stating here is that the code
cannot be optimized or changed in any way that will meet the current
requirements, and all tuning options have been exhausted. The only
solution is to go buy more hardware. This is often an easy decision to
make because the costs associated with hardware are generally well
known. Also, there is the potential to repurpose hardware that was destined for another use, retooling a server to serve a dual purpose. In
any case, this is a decision for the solution owner to make based on the
options available at the time.