Microsoft SQL Server 2008 Analysis Services : Building Basic Dimensions and Cubes - Setting up a new Analysis Services project

7/20/2011 6:05:07 PM

Choosing an edition of Analysis Services

Before we start developing with Analysis Services, we need a clear idea of which edition of Analysis Services we're going to be developing for. There are two choices: Standard Edition, which is cheaper but missing some features, and Enterprise Edition, which is more expensive but feature complete. Licensing cost is likely to be the major factor in this decision. If money is no object, then you should use Enterprise Edition. If it is an issue, then you'll just have to live with the limitations of Standard Edition. Of course, if we install Analysis Services on a server that already has SQL Server installed, then there are no extra license costs involved, but we have to be careful that the two services don't compete for resources. This document on the Microsoft website gives a detailed breakdown of which features are available in each edition: http://tinyurl.com/sqlstdvsent.

Don't worry about having to use the Standard Edition though. Some of the features it lacks can be recreated with a little bit of extra work. The key features in Enterprise Edition are in the area of performance for very large or complex cubes, and you can go a long way with Standard Edition before you really need to use Enterprise Edition. The Deployment Server Edition project property, which is described below, will help you make sure you only use the features available in the edition of your choice.

Setting up a new Analysis Services project

The first step towards creating a new cube is to create a new Analysis Services project in BIDS. Immediately after doing this, we strongly recommend putting your new project into source control. It's easy to forget to do this, or not to bother, because building a cube doesn't seem like a traditional development project, but you'll be glad that you did it when you receive your first request to roll back a change to a complex MDX calculation.

As you're probably aware, there are two ways of working with Analysis Services projects in BIDS:

- Project mode: where you work with a local Visual Studio project and deploy to your Analysis Services server only when you're happy with all the changes you've made
- Online mode: where you edit your Analysis Services database live on the server and commit changes every time you click on the Save button

You'll only be able to use source control software effectively if you work in the project mode. Therefore, it's a good idea to resist the temptation to work in online mode unless you're only making temporary changes to your cube, even though online mode often seems to be the most convenient way of working.

With all new Analysis Services projects, there are a few useful project properties that can be set. You can set project properties by right-clicking on the Project node in the Solution Explorer pane and selecting Properties.

Here is a list of properties you may want to change:

Build
- Deployment Server Edition: If you plan to deploy Standard Edition in production, but you're using Developer Edition in development, you will want to set this property to Standard. This will make BIDS raise errors when you build a project if you accidentally use features that aren't available in the Standard Edition.
Deployment
- Processing Option: This property allows you to process your database automatically whenever you deploy a project. The 'Default' option will perform a Process Default, but in many cases, when you deploy a change that doesn't need any cube processing at all, a Process Default can still waste 10 or 20 seconds, and if you've made more substantial changes to an object you will still want to control when processing takes place. Setting this property to Do Not Process instead will stop all automatic processing. This means that you have to remember to manually process any objects yourself if you make changes to them but it will save you time in the long run by preventing a lot of unintentional processing.
- Server: This contains the name of the server you're deploying to and defaults to localhost. If you're not developing on a local Analysis Services instance, then you'll need to change this anyway. Even if you are, it's a good idea to enter the name of the target server, rather than use localhost, in case anyone wants to work on the project on another machine.
- Database: This contains the name of the database that you're deploying to. It defaults to the name of the Visual Studio project. Of course, you can change it if you want your project and database to have different names.
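
The recommendations above can be read as a short checklist. The following Python sketch is purely illustrative: the dictionary keys mirror the labels shown in the project Properties dialog, while the `review_project_settings` function, the sample values and the server name are hypothetical and are not part of any BIDS or Analysis Services API.

```python
def review_project_settings(settings, production_edition="Standard"):
    """Return warnings for settings that go against the advice in the text.

    `settings` is a hypothetical dict keyed by the property labels shown in
    the project Properties dialog; it is not read from a real project file.
    """
    warnings = []

    # Build > Deployment Server Edition should match the production edition.
    if settings.get("Deployment Server Edition") != production_edition:
        warnings.append("Deployment Server Edition does not match production")

    # Deployment > Processing Option: avoid automatic processing on deploy.
    if settings.get("Processing Option") != "Do Not Process":
        warnings.append("Processing Option is not 'Do Not Process'")

    # Deployment > Server: prefer an explicit server name over localhost.
    if settings.get("Server", "localhost").lower() == "localhost":
        warnings.append("Server is localhost; enter the target server name")

    return warnings


# Example values only; check the actual values in your own project.
example = {
    "Deployment Server Edition": "Developer",
    "Processing Option": "Default",
    "Server": "localhost",
    "Database": "MyProject",
}
for warning in review_project_settings(example):
    print(warning)
```

A check like this is only a reminder of the advice in the list; the properties themselves still have to be changed in the Properties dialog.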

It is also a good idea to install BIDS Helper, an award-winning, free, community-developed tool that adds a lot of useful functionality to BIDS. You can download it from http://www.codeplex.com/bidshelper .

Creating data sources

Once we've created a new project and configured it appropriately, the next step is to create a data source object. Even though you can create multiple data sources in a project, you probably shouldn't.

You are then faced with the choice of which OLE DB provider to use, since there are often several different options for any given relational database. For SQL Server data sources, you have the option of using the SQLClient .NET data provider, the Microsoft OLE DB provider for SQL Server and the SQL Server Native Client (often referred to as SNAC). You should always choose the SQL Server Native Client since it offers the best performance. For Oracle data sources, the choice is more complicated since, even though Oracle is a supported data source for Analysis Services, there is a long list of bugs and issues. Some are addressed in the white paper at http://tinyurl.com/asdatasources , but if you do run into problems, the best approach is to try using Microsoft's Oracle OLE DB Provider, Oracle's own OLE DB Provider, the .NET Provider for Oracle or any of the third-party OLE DB Providers on the market to see which one works. Access, DB2, Teradata and Sybase are the other officially supported relational data sources, and if you need to load data from another source, you can always use SQL Server Integration Services to push data into the cube by using the Dimension Processing and Partition Processing destinations in a Data Flow.
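
To make the provider choice more concrete, here is a minimal Python sketch that assembles the kind of OLE DB connection string a data source might end up with. The provider names `SQLNCLI10` and `SQLOLEDB` are the ones commonly seen for the SQL Server 2008 Native Client and the older Microsoft OLE DB provider, but treat them as assumptions and check what is installed on your own machines. The helper function, server name and database name are hypothetical.

```python
# Provider names as commonly seen with SQL Server 2008; these are
# assumptions, so verify them against the providers on your machine.
PROVIDERS = {
    "native_client": "SQLNCLI10",  # SQL Server Native Client (SNAC)
    "ms_oledb": "SQLOLEDB",        # Microsoft OLE DB Provider for SQL Server
}


def build_connection_string(server, database, provider="native_client"):
    """Build an OLE DB connection string that uses Windows authentication."""
    parts = [
        ("Provider", PROVIDERS[provider]),
        ("Data Source", server),
        ("Initial Catalog", database),
        ("Integrated Security", "SSPI"),
    ]
    return ";".join(f"{key}={value}" for key, value in parts)


# Hypothetical server and database names, for illustration only.
print(build_connection_string("DWSERVER01", "SalesDW"))
```

In practice the Data Source Designer builds this string for you; the sketch only shows which part of it changes when you pick a different provider.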

Remember to install the same version of any OLE DB provider you're using on all of your development, test and production machines. Also, while BIDS is a 32-bit application and needs a 32-bit version of the driver to connect to a relational database, if your Analysis Services instance is 64-bit, it will need the 64-bit version of the same driver to process cubes successfully.

Analysis Services must also be given permission to access the data source, and how it does so depends on the type of data source you're using and how its security is set up. If you're using Windows authentication to connect to SQL Server, as Microsoft recommends, then you should set up a new Windows domain account specifically for Analysis Services, and then use the SQL Server Configuration Manager tool to set the Analysis Services service to run under that account. You should then give that account any permissions it needs in SQL Server on the tables and views you'll be using. Most of the time 'Read' permissions will be sufficient. However, some tasks, such as creating Writeback fact tables, will need more. You'll notice that on the Impersonation Information tab in the Data Source Designer dialog in BI Development Studio there are some other options for use with Windows authentication, such as the ability to enter the username and password of a specific user. However, we recommend that you use the Use Service Account option so that Analysis Services tries to connect to the relational database under the account you've created.

If you need to connect to your data source using a username and a password (for example, when you're using SQL Server authentication or Oracle), then Analysis Services will keep all sensitive information, such as passwords, in an encrypted format on the server after deployment. If you try to script the data source object out you'll find that the password is not returned, and since opening an Analysis Services project in online mode essentially involves scripting out the entire database, you'll find yourself continually re-entering the password in your data source whenever you want to reprocess anything when working this way. This is another good reason to use project mode rather than online mode for development and to use Windows authentication where possible.

Creating Data Source Views

In an ideal world, if you've followed all of our recommendations so far, then you should need to do very little work in your project's Data Source View—nothing more than selecting the views representing the dimension and fact tables and setting up any joins between the tables that weren't detected automatically. Of course, in the real world, you have to compromise your design sometimes and that's where a lot of the functionality available in Data Source Views comes in useful.

When you first create a new Data Source View (DSV), the easiest thing to do is to go through all of the steps of the wizard, but not to select any tables yet. You can then set some useful properties on the DSV, which will make the process of adding new tables and relationships much easier. In order to find them, right-click on some blank space in the diagram pane and click on Properties. They are:

- RetrieveRelationships: By default, this is set to True, which means that BIDS will add relationships between tables based on various criteria. It will always look for foreign key relationships between tables and add those. Depending on the value of the NameMatchingCriteria property, it may also use other criteria.
- SchemaRestriction: This property allows you to enter a comma-delimited list of schema names to restrict the list of tables that appear in the Add/Remove Tables dialog. This is very useful if your data warehouse contains a large number of tables and you have used schemas to separate them into logical groups.
- NameMatchingCriteria: If the RetrieveRelationships property is set to True, then BIDS will try to guess relationships between tables by looking at column names. There are three different ways it can do this:

1. by looking for identical column names in the source and destination tables (for example, FactTable.CustomerID to Customer.CustomerID)
2. by matching column names to table names (for example, FactTable.Customer to Customer.CustomerID)
3. by matching column names to a combination of column and table names (for example, FactTable.CustomerID to Customer.ID).
This is extremely useful if the tables you're using don't actually contain foreign key relationships. You'll also see an extra step in the New Data Source View wizard allowing you to set these options if no foreign keys are found in the Data Source you're using.
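
The three name-matching strategies listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not the algorithm BIDS actually uses: the function, the criteria labels (`same_name`, `table_name`, `table_plus_key`) and the case-insensitive comparison are all assumptions made for the example.

```python
def guess_relationship(fact_column, dim_table, dim_key, criteria):
    """Return True if fact_column plausibly references dim_table.dim_key.

    The criteria labels are hypothetical names for the three strategies;
    comparing names case-insensitively is also an assumption.
    """
    fact_column = fact_column.lower()
    dim_table = dim_table.lower()
    dim_key = dim_key.lower()

    if criteria == "same_name":
        # 1. Identical column names in both tables.
        return fact_column == dim_key
    if criteria == "table_name":
        # 2. The fact column is named after the dimension table.
        return fact_column == dim_table
    if criteria == "table_plus_key":
        # 3. The fact column is the table name followed by the key name.
        return fact_column == dim_table + dim_key
    raise ValueError(f"unknown criteria: {criteria}")


# The three examples from the list above.
print(guess_relationship("CustomerID", "Customer", "CustomerID", "same_name"))
print(guess_relationship("Customer", "Customer", "CustomerID", "table_name"))
print(guess_relationship("CustomerID", "Customer", "ID", "table_plus_key"))
```

Each call returns True for the example given in the list, which shows why a consistent naming convention in the data warehouse makes automatic relationship detection far more reliable.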

Now, you can go ahead and right-click on the DSV design area and select the Add/Remove Tables option and select any tables or views you need to use. It might be a good idea not to select everything you need initially, but to select just one fact table and a few dimension tables so you can check the relationships and arrange the tables clearly, then add more. It's all too easy to end up with a DSV that looks like a plate of spaghetti and is completely unreadable. Even though you don't actually need to add every single relationship at this stage in order to build a cube, we recommend that you do so, as the effort will pay off later when BIDS uses these relationships to automatically populate properties such as dimension-to-measure group relationships.

Creating multiple diagrams within the DSV, maybe one for every fact table, will also help you organize your tables more effectively. The Arrange Tables right-click menu option is also invaluable.

Named Queries and Named Calculations allow you to add the equivalent of views and derived columns to your DSV. This functionality was added to help cube developers who needed to manipulate data in the relational database, but didn't have the appropriate permissions to do so. However, if you have the choice between, say, altering a table and a SQL Server Integration Services (SSIS) package to fix a modeling problem or creating a Named Query, then we recommend that you always choose the former; only do work in the DSV if you have no other choice. As we've already said several times, it makes much more sense to keep all of your ETL work in your ETL tool, and your relational modeling work in the relational database where it can be shared, managed and tuned more effectively. Resist the temptation to be lazy and don't just hack something in the DSV! One of the reasons why we advocate the use of views on top of dimension and fact tables is that they are as easy to alter as named queries and much easier to tune. The SQL that Analysis Services generates during processing is influenced heavily by what goes on in the DSV, and many processing performance problems are the result of cube designers taking the easy option early on in development.

If you make changes in your relational data source, those changes won't be reflected in your DSV until you click the Refresh Data Source View button or choose Refresh on the right-click menu.

Problems with TinyInt

Unfortunately, there's a bug in Analysis Services 2008 that causes a problem in the DSV when you use key columns of type TinyInt. Since Analysis Services doesn't support this type natively, the DSV attempts to convert it to something else: a System.Byte for foreign keys to dimensions on the fact table, and a System.Int32 for primary keys on dimension tables which have Identity set to true. Because the two types no longer match, you can no longer create joins between your fact table and dimension table. To work around this, you need to create a named query on top of your dimension table containing an expression that explicitly casts your TinyInt column to a TinyInt (for example, using an expression like cast(mytinyintcol as tinyint)), which will make the DSV show the column as a System.Byte. It sounds crazy, but for some reason it works.
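
As a rough illustration of the workaround, the following Python sketch generates the SELECT text you might paste into a named query. The helper function and the table and column names are hypothetical, and the sketch does not quote or validate identifiers, so treat the output as a starting point rather than a finished query.

```python
def tinyint_named_query(table, key_column, other_columns):
    """Build SELECT text for a named query that re-casts a TinyInt key.

    Identifiers are inserted as given, without quoting or validation.
    """
    # Casting the key to tinyint again is the workaround from the text.
    select_list = [f"CAST({key_column} AS tinyint) AS {key_column}"]
    select_list.extend(other_columns)
    return "SELECT " + ", ".join(select_list) + f" FROM {table}"


# Hypothetical dimension table with a TinyInt identity key.
print(tinyint_named_query("dbo.DimChannel", "ChannelKey", ["ChannelName"]))
```

The cast does not change the data at all; its only purpose is to give the DSV an expression rather than a raw column, so that the key is reported as a System.Byte on the dimension side too.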
