Now that you understand the three phases, we want to
give you a ready-to-bake performance plan that has been proven to work
over countless performance-tuning engagements. We do that in the
following subsections. The expected deliverables or outputs of each
phase are described so that you can have a good appreciation for the
amount of work that should be invested into each phase.
1. Phase 1: Plan Your Tests
The first thing you need to do
is write down how you will test BizTalk, the types and frequency of
transactions you will model, and how you will measure and grade the
performance of the system against the performance metric requirements.
This plan and its associated deliverables will essentially be the
performance acceptance criteria and sign-off for the solution owner.
Luckily, this plan is spelled out in detail in this section for you.
These are the expected deliverables from this phase:
Performance test plan:
This is a detailed plan for each type of performance test that will be executed.
A transaction set:
This is a complete description of what transactions will be modeled for each type of performance test.
Required test data:
This is an analysis of the type of data that will be required for each system.
Downstream system impact:
This is the impact of any downstream systems that will be updated as a result of the tests.
Transaction volumes:
This will be a quantifiable number of transactions per second that represents the frequency of new messages being submitted to BizTalk.
One of the first steps that
you'll want to perform in phase 1 is to model your transactions. The
results from that step feed into several of the deliverables in the
preceding list.
1.1. Modeling Your Transactions
Modeling transactions is by far the most important step in this phase. The goal of this step is to ensure that what you model accurately represents the real-world usage of the system. Up until now, the majority of documents that you have created have been developer test data, or dummy data.
This also may be the first time
that a system has been run end-to-end in an integrated manner under
load. Often there is a series of integration tests that are done prior
to this phase, but you can also use this phase to kick off the
integrated system phase of your project since the deliverables and
outputs will be very similar.
What you need to do is get a
representation of the business transactions that will be performed once
the system is put into production. Often there are "read-only" transactions and "update" transactions. Read-only transactions would be either product inventory data requests or product catalog data requests; in the banking sector, these would be account balance requests. Update transactions are exactly that: they update some other system based on the data sent in the document. These are typically purchase orders or account withdrawal types of transactions.
The goal here is to work with
your business analysts and system owners to figure out what the most
common types of transactions will be and what transactions have the
biggest effect on the business. Most solutions have dozens of different
transaction types, so you will have to pick an arbitrary number (say, the top five transactions) and focus on them. Most projects do not have the time to performance test all potential transaction types accurately, so you will need to figure out which ones give you the biggest bang for your buck.
1.1.1. Determine Transaction Frequency
As part of modeling transactions, you need to determine the rate and frequency at which they occur. The terms rate and frequency may sound synonymous, but we really are referring to two different things. Rate refers to quantity, and frequency refers to how often.
For example,
assume we are modeling two potential transaction types that can occur
within our system within 1 hour of production. Our duration for the
performance test will be 1 hour, and we need to figure out exactly how
many times each of those transactions will show up within that 1-hour
period. Let's assume that we have purchase order transactions and stock
verification transactions; the first is an update, and the second is a
read. When we receive purchase orders, we actually receive them in
batches since the customers will wait until their "purchase order
bucket" is full before submitting them. On average, each customer will
send us 30 purchase orders at a time. Also on average, we receive orders
from a customer every 2 minutes. In this case, the rate will be 30
documents, and the frequency will be 2 minutes. The stock verification
requests come in as they are needed, and on average we receive three
every second. In this case, we will be receiving three documents per
second. It is very important to think about your transaction volumes
like this because it will allow you to have a common language with the
business owners, who really have no idea about the intricacies of performance testing but do understand how many orders/stock requests they receive each day.
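To make those volume figures concrete, here is a minimal sketch that converts the rate/frequency numbers from the example above into messages per second and total messages for a 1-hour test window. The transaction names and numbers are purely illustrative.

```python
# Convert "rate every frequency" figures into messages per second for a
# 1-hour test window, using the illustrative purchase order and stock
# verification numbers from the example above.

TEST_DURATION_SECONDS = 60 * 60  # 1-hour performance test

transactions = [
    # (name, documents per submission, seconds between submissions)
    ("PurchaseOrder", 30, 120),       # 30 orders per batch, one batch every 2 minutes
    ("StockVerification", 1, 1 / 3),  # 3 single requests per second
]

for name, rate, frequency_seconds in transactions:
    per_second = rate / frequency_seconds
    total = per_second * TEST_DURATION_SECONDS
    print(f"{name}: {per_second:.2f} msg/sec, {total:.0f} messages over the test")

# PurchaseOrder: 0.25 msg/sec, 900 messages over the test
# StockVerification: 3.00 msg/sec, 10800 messages over the test
```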
You also need to model the
"bell curve" of your application. Almost every solution in the world has
something called the 8 to 5 effect, in that it receives 95% of its usage during regular business hours. Knowing this is important because
if you are unable to determine the frequency of your document submission
and have only "orders per day" numbers, then you need to take that
effect into account when extrapolating what your usage will look like.
Often the usage bands are much smaller—more like 10 a.m. to 2 p.m. This
type of load is very "bursty" and often lends itself to a system that is
always in an overdrive state during regular hours.
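If all you have is an "orders per day" figure, a quick back-of-the-envelope calculation like the sketch below can translate it into a peak-hour rate. The daily volume, the 95% concentration, and the window lengths are assumptions you would state in your test plan and confirm with the business owners.

```python
# Extrapolate a daily volume into a peak messages-per-second figure,
# assuming most of the traffic lands inside a narrow business-hours band.
# All numbers here are assumptions to confirm with the business owners.

orders_per_day = 50_000       # daily figure supplied by the business
business_hours_share = 0.95   # "8 to 5 effect": 95% of traffic in the band

def peak_rate(daily_volume, share, band_hours):
    """Messages per second if `share` of the daily volume arrives within `band_hours`."""
    return (daily_volume * share) / (band_hours * 3600)

print(f"8 a.m. to 5 p.m. band: {peak_rate(orders_per_day, business_hours_share, 9):.2f} msg/sec")
print(f"10 a.m. to 2 p.m. band: {peak_rate(orders_per_day, business_hours_share, 4):.2f} msg/sec")
```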
You also need to account
for the 4:30 effect. This is a pattern that sometimes occurs at the end
of the day when users of the system attempt to jam in all the remaining
data at the end of the day before they go home. This last-minute effort
often leads to a floodgate scenario in which the load unexpectedly jumps
to a point where the system is above its steady state. We will talk
about this effect along with the types of performance tests you should
plan to execute to account for this and several other "normal" usage
spikes.
The other thing to be aware of
is the "not so often but really big" transaction that occurs very
infrequently but will have a major impact on the solution. These are
often things like product updates or batch updates that can potentially
have thousands of nested transactions within them. If your solution has
this type of transaction, you need to determine whether it will be
processed during the day and whether it will have a dramatic impact on
performance. If it is possible, consider offloading these transactions
to a receive location that is active only during nonpeak hours.
1.1.2. Understand the Background Noise
In most solutions, there will
be a certain amount of operational work that will be happening in
parallel to BizTalk. We refer to this work as "background noise." It's
important to understand the types of noise that you might encounter and
to account for the effects of background noise in your testing. Here are
some things to watch for:
Traffic being serviced from a public web site:
Often a web site that is
taking customer orders, for example, will have its database physically
located on the same database cluster as BizTalk. You will need to
account for this load by having a web site load test running at the same
time as your BizTalk performance test if this situation applies to you.
BizTalk background jobs:
There are many BizTalk jobs that execute on a scheduled basis and can affect the overall throughput of the system. It is important to make sure that the SQL Server Agent is started for the duration of your performance tests so that these jobs are running while the tests take place (a simple pre-test check is sketched after this list).
System scheduled tasks:
Are there any
operational scripts or scheduled tasks that run during the day that can
affect performance such as backup jobs, log shipping jobs, cleanup
scripts, and so on? In a perfect world, these should be running as well
while the performance test is taking place.
Other BizTalk applications:
Is the hardware to be used dedicated exclusively to this system, or are there other BizTalk applications running on it? If it is a shared
environment, then performance characteristics, uptime requirements, and
transaction volumes will need to be considered for all applications
running in the environment.
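Because it is easy to forget checks such as the SQL Server Agent item above, a small pre-test script can act as a guard. The sketch below is a minimal example that shells out to the Windows service controller; the service name SQLSERVERAGENT assumes a default SQL Server instance and would differ for a named instance.

```python
# Pre-test sanity check (run on the SQL Server machine): verify that the
# SQL Server Agent service is running so the BizTalk background jobs fire
# during the test. "SQLSERVERAGENT" is the default-instance service name;
# a named instance typically uses SQLAgent$<InstanceName>.
import subprocess
import sys

result = subprocess.run(["sc", "query", "SQLSERVERAGENT"], capture_output=True, text=True)

if "RUNNING" in result.stdout:
    print("SQL Server Agent is running; background jobs will execute during the test.")
else:
    print("SQL Server Agent is NOT running; start it before beginning the performance test.")
    sys.exit(1)
```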
1.1.3. Create Test Data
To properly load test a system,
you need to have data that accurately represents what you will be
expecting in production. This is where any old data from a solution that
you are replacing becomes invaluable. If you don't have such data, then
what teams often do is start creating spreadsheets of test data. Each
sheet will represent one document type, and each column becomes a field
in that message. This way, the data can easily be loaded into a database
(you will see why that is important later). Be wary of repeating IDs
such as OrderId or CustomerId, which can cause errors should you get a
duplicate entry (don't worry, we will give you a solution to that in a
bit). Try to give yourself enough data variety so that you are not
repeating the same test data over and over again. You would be surprised
to see how many performance problems are masked because every customer
is named "Jon Doe" or every company is named" "Contoso" and they are all
ordering the same products.
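If you do have to create test data from scratch, a small generator along the lines of the following sketch can produce the per-document-type sheets described above while guaranteeing unique IDs and some variety. The field names, value pools, and output file are illustrative placeholders, not a real message schema.

```python
# Generate a CSV "sheet" of purchase order test data with unique OrderIds
# and varied customers/products, so performance problems are not masked by
# every document carrying identical values. Fields and values are
# illustrative placeholders only.
import csv
import random
import uuid

customers = ["Contoso", "Fabrikam", "Northwind Traders", "Adventure Works"]
products = ["WIDGET-100", "WIDGET-200", "GADGET-050", "SPROCKET-900"]

with open("purchase_orders.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["OrderId", "CustomerName", "ProductCode", "Quantity"])
    for _ in range(10_000):
        writer.writerow([
            uuid.uuid4(),              # unique OrderId avoids duplicate-entry errors
            random.choice(customers),  # vary the customer
            random.choice(products),   # vary the product
            random.randint(1, 50),
        ])
```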
1.2. Planning Your Test Types
There are several
different types of performance tests that you will need to implement.
The goals of each are different but equally important. The types of
tests are as follows:
Steady state:
This is what you define
as a "typical" usage scenario. This needs to model what you believe will
be the real-world usage of your solution given the assumptions you made
in phase 1. Based on this scenario, you will determine whether your
solution is sustainable, meaning that in its regular operating state it can adequately process the rate at which messages are published and subscribed with no backlog. This is usually where you figure out
whether the hardware for your solution is adequate. Additionally, should
you determine that your solution is not sustainable, you need to start
identifying bottlenecks or areas to tune in your solution to make it
sustainable before proceeding with the other tests. In reality, if your
solution is not sustainable under a steady-state condition, you have a
serious problem that you need to fix by purchasing more
hardware/software, tuning the application, or doing a combination of
both.
Floodgate:
Assuming your solution
is in a steady state that is sustainable, this scenario accounts for
spikes in message processing that can occur normally throughout the
day." Part of the deliverables of performance testing is to make
assumptions about what these spikes will look like based on past usage
data, or SWAG (scientific wild-ass guess). The important factor here is
to state your assumptions in your performance deliverable and have your
business/system owners agree or disagree with your assumptions. These
assumptions are what drives the modeling and are really the important
tool to gain acceptance of the performance test approach.
Overdrive:
This is a test that
measures what the solution performance will look like given a constant
message delivery rate that exceeds the anticipated load and creates a
backlog of work. Overdrive load is load that is clearly not sustainable.
The important take-away from this test is, can my solution continue to
function and eventually clear the backlog assuming a constant and
overwhelming flood of messages? You need to understand whether BizTalk
will continue to process messages if it gets overloaded or whether the
entire solution comes to a grinding halt. Assuming the overdrive condition stops, how long does it take to clear the backlog, and is that time linear and predictable? When the system is in overdrive, is the message processing rate constant, or does it degrade because of backups in correlating messages or related processes that also are not completing? (A simple way to express all three load profiles as submission schedules is sketched after this list.)
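One way to keep the three test types comparable is to express each of them as a submission schedule derived from the same agreed steady-state rate. The following sketch shows the idea; the steady-state rate, multipliers, and spike window are assumptions you would replace with your own agreed figures.

```python
# Express the three test types as per-minute submission schedules relative
# to the agreed steady-state rate. The rate, multipliers, and spike window
# are illustrative assumptions, not prescribed values.

STEADY_STATE_MSG_PER_SEC = 3.25  # e.g., purchase orders plus stock verifications

def schedule(test_type, duration_minutes=60):
    """Return (minute, messages to submit during that minute) pairs for a run."""
    plan = []
    for minute in range(duration_minutes):
        rate = STEADY_STATE_MSG_PER_SEC
        if test_type == "floodgate" and 30 <= minute < 40:
            rate *= 3   # a 10-minute spike on top of normal load
        elif test_type == "overdrive":
            rate *= 2   # constant, deliberately unsustainable load
        plan.append((minute, int(rate * 60)))
    return plan

for test in ("steady-state", "floodgate", "overdrive"):
    total = sum(messages for _, messages in schedule(test))
    print(f"{test}: {total} messages over a 1-hour run")
```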
1.3. Determining Your Exit Criteria
Knowing when to stop is often
overlooked but is very important. How do you know when you are done?
Each performance test/tuning exercise needs to document what the exit
criteria are for that test. The following are a few different options for
exit criteria:
Pass—Performance criteria met immediately:
The system was able to meet the performance criteria without any tuning required by the development team.
Pass—Performance criteria met after tuning:
The team was able to tune the solution enough to meet the performance criteria.
Fail—Performance was not met, significant redevelopment needed:
This isn't necessarily a
bad thing. What you are stating here is that the current code does not
meet the requirements. You need to qualify that with what the current
performance level is, along with an estimate to rework the solution to
have it meet the requirements. If the estimate to redesign, develop, and
test that piece of the application is quite big, the project
management/business owner may decide that the current performance is
acceptable and not worth the additional cost. This way, the decision is
based on a dollar value vs. the benefit of the additional work and not
on an emotional "but I need this" basis.
Fail—Performance was not met, additional hardware required:
Again, this isn't
necessarily a bad thing. What you are stating here is that the code
cannot be optimized or changed in any way that will meet the current
requirements, and all tuning options have been exhausted. The only
solution is to go buy more hardware. This is often an easy decision to
make because the costs associated with hardware are generally well
known. Also, there is the potential to repurpose hardware that was destined for another use, retooling a server to serve a dual purpose. In
any case, this is a decision for the solution owner to make based on the
options available at the time.