Modifying an entity to work with the Table service

3/11/2011 9:20:50 AM

Before we look at how we can start coding against the Table service, you need to understand how your data is stored in the Table service and how that differs from the SQL-based solution. In the next couple of sections, we’ll look at the following:

How can we modify an entity so it can be stored in the Table service?
How is an entity stored in the Table service?

As these points suggest, before you can store the shirt data with the Table service, you need to do a little bit of jiggery pokery with the entity definition. Let’s look at what you need to do.

1. Modifying an entity definition

To be able to store the C# entity in the Table service, each entity must have the following properties:

Timestamp
PartitionKey
RowKey

Therefore, to store the Product entity in the Azure Table service, you’d have to modify the previous definition of the Product entity to look something like this:

[DataServiceKey("PartitionKey", "RowKey")]
public class Product
{
   public DateTime Timestamp { get; set; }
   public string PartitionKey { get; set; }
   public string RowKey { get; set; }
   public string Name { get; set; }
   public string Description { get; set; }
}

In the preceding code the original Product entity is modified to include those properties required for Table storage (Timestamp, PartitionKey, and RowKey). Don’t worry if you don’t recognize these properties—we’ll explain what they mean shortly.

To generate a hardcoded list of shirts using the new version of the Product entity, you’d need to change the hardcoded product list to something like this:

var products =
   new List<Product>
   {
     new Product
     {
        PartitionKey = "Shirts",
        RowKey = "1",
        Name = "Red Shirt",
        Description = "Red"
     },
     new Product
     {
        PartitionKey = "Shirts",
        RowKey = "2",
        Name = "Blue Shirt",
        Description = "A Blue Shirt"
     },
     new Product
     {
        PartitionKey = "Shirts",
        RowKey = "3",
        Name = "Frilly Blue Shirt",
        Description = "A Frilly Blue Shirt"
     }
};

As you can see from the preceding code, the only difference is that you’re now setting a couple of extra properties (PartitionKey and RowKey).

Look, no Timestamp

Notice that the revised object-creation code doesn’t set the Timestamp property. That’s because it’s generated on the server side and is only available to us as a read-only property. The Timestamp property holds the date and time that the entity was last modified in the table (for a new entity, the time it was inserted), and if you did set this property, the Table service would just ignore the value.

The Timestamp property is typically used to handle concurrency. Prior to updating an entity in the table, you could check that the timestamp for your local version of the entity was the same as the server version. If the timestamps were different, you’d know that another process had modified the data since you last retrieved your local version of the entity.
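The timestamp comparison described above can be sketched as follows. This is only an illustration, not part of the Table service API: ConcurrencyCheck and HasChangedOnServer are hypothetical names, and the sketch assumes you’ve already re-read the server’s copy of the entity and hold both timestamps as DateTime values.

```csharp
using System;

// Hypothetical helper for illustration; not part of any Azure library.
public static class ConcurrencyCheck
{
    // Returns true when the server copy carries a different timestamp
    // from the local copy, meaning another process has modified the
    // entity since the local copy was retrieved.
    public static bool HasChangedOnServer(DateTime localTimestamp, DateTime serverTimestamp)
    {
        return localTimestamp != serverTimestamp;
    }
}
```

Before an update, you’d call HasChangedOnServer and only send your changes when it returns false; otherwise you’d reload the entity and reapply or discard your edits.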

Now that you’ve seen how to modify your entities so that you can store them in the Table service, let’s take a look at how these entities would be stored in a Table service table.

2. Table service representation of products

Earlier you saw how we’d normally store our list of Hawaiian shirt product entities in SQL Server; table 1 shows how those same entities would logically be stored in the Windows Azure Table service.

Table 1. Logical representation of the Products table in Windows Azure
Timestamp	PartitionKey	RowKey	PropertyBag
2009-07-01T16:20:32	Shirts	1	Name: Red Shirt
			Description: Red
2009-07-01T16:20:33	Shirts	2	Name: Blue Shirt
			Description: A Blue Shirt
2009-07-01T16:20:33	Shirts	3	Name: Frilly Blue Shirt
			Description: A Frilly Blue Shirt

As you can see in table 1, entities are represented in the Table service differently from how they’d be stored in SQL Server. In the SQL Server version of the Products table, we maintained a fixed schema where each property of the entity was represented by a column in the table. In table 1, by contrast, the Table service maintains a fairly minimal schema; it doesn’t rigidly fix the rest. The only properties that the Table service requires, and that are therefore logically represented by their own columns, are Timestamp, PartitionKey, and RowKey. All other properties are lumped together in a property bag.
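One way to picture the logical layout in table 1 is as three fixed fields plus a dictionary. The sketch below is purely illustrative: LogicalRow and LogicalRowExamples are hypothetical names rather than types from the Table service or its client library, and the service doesn’t actually hand you an object shaped like this.

```csharp
using System;
using System.Collections.Generic;

// Illustrative model of one logical row; not a real Table service type.
public class LogicalRow
{
    // The three properties every entity must have.
    public DateTime Timestamp { get; set; }
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }

    // Everything else lives in a schema-free bag of name/value pairs.
    public Dictionary<string, object> PropertyBag { get; } =
        new Dictionary<string, object>();
}

public static class LogicalRowExamples
{
    // Builds the first row of table 1 (the Red Shirt entity).
    public static LogicalRow RedShirt()
    {
        var row = new LogicalRow
        {
            Timestamp = new DateTime(2009, 7, 1, 16, 20, 32),
            PartitionKey = "Shirts",
            RowKey = "1"
        };
        row.PropertyBag["Name"] = "Red Shirt";
        row.PropertyBag["Description"] = "Red";
        return row;
    }
}
```

Because the bag is keyed by property name, two rows in the same table are free to carry different sets of properties.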

Extending an Entity Definition

Because all tables created in the Table service have the same minimal fixed schema (Timestamp, PartitionKey, RowKey, and PropertyBag), you don’t need to define the entity structure to the Table service in advance.

This flexibility means that you can also change the entity class definition at any time. If you wanted to show a picture of a Hawaiian shirt on the website, you could change the Product entity to include a thumbnail URI property as follows:

[DataServiceKey("PartitionKey", "RowKey")]
public class Product
{
   public DateTime Timestamp { get; set; }
   public string PartitionKey { get; set; }
   public string RowKey { get; set; }
   public string Name { get; set; }
   public string Description { get; set; }
   public string ThumbnailUri { get; set; }
}

Once you’ve modified the entity to include a thumbnail URI, you can store that entity directly in the existing Products table without modifying either the table structure or the existing data. Table 2 shows a list of shirts that include the new property.

Table 2. The modified entity with a new property can happily coexist with older entities that don’t have the new property.
Timestamp	PartitionKey	RowKey	PropertyBag
2009-07-01T16:20:32	Shirts	1	Name: Red Shirt
			Description: Red
2009-07-01T16:20:33	Shirts	2	Name: Blue Shirt
			Description: A Blue Shirt
2009-07-01T16:20:33	Shirts	3	Name: Frilly Blue Shirt
			Description: A Frilly Blue Shirt
2009-07-05T10:30:21	Shirts	4	Name: Frilly Pink Shirt
			Description: A Frilly Pink Shirt
			ThumbnailUri: frillypinkshirt.png

In the list of shirts in table 2, you can see that existing shirts (Red Shirt, Blue Shirt, and Frilly Blue Shirt) have the same data that was stored in table 1; they don’t contain the new ThumbnailUri property. But the data for the new shirt (Frilly Pink Shirt) does have the new ThumbnailUri property.
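Code that reads these rows has to cope with both shapes. The following sketch assumes that, for a shirt stored before the property existed, ThumbnailUri simply comes back as null; ThumbnailHelper, ResolveThumbnail, and the placeholder.png file name are hypothetical and only there for illustration.

```csharp
// Hypothetical helper for illustration; not part of any Azure library.
public static class ThumbnailHelper
{
    // Assumed fallback image for older entities that have no thumbnail.
    public const string DefaultThumbnail = "placeholder.png";

    // Returns the stored thumbnail URI, or the fallback when the entity
    // predates the ThumbnailUri property (null or empty value).
    public static string ResolveThumbnail(string thumbnailUri)
    {
        return string.IsNullOrEmpty(thumbnailUri) ? DefaultThumbnail : thumbnailUri;
    }
}
```

Passing product.ThumbnailUri through a helper like this lets the older shirts in table 2 render without any data migration.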

3. Storing completely different entities

Due to the flexible nature of the Table service, you could even store entities of different types in the same table. For example, you could store the Product entity in the same table as a completely different entity, such as this Customer entity:

[DataServiceKey("PartitionKey", "RowKey")]
public class Customer
{
   public DateTime Timestamp { get; set; }
   public string PartitionKey { get; set; }
   public string RowKey { get; set; }
   public string Firstname { get; set; }
   public string Surname { get; set; }
}

As you can see from the Customer entity, although the entity must contain the standard properties (Timestamp, PartitionKey, and RowKey), no other properties are shared between the Customer and Product entities; they even have different class names.

Even though these entities have very different definitions, they could be stored in the same table, as shown in table 3. The Table service allows for different entities to have different schemas.

Table 3. Storing completely different entities in the same table
Timestamp	PartitionKey	RowKey	PropertyBag
2009-07-01T16:20:32	Shirts	1	Name: Red Shirt
			Description: Red
2009-07-01T16:20:33	Shirts	2	Name: Blue Shirt
			Description: A Blue Shirt
2009-07-01T16:20:33	Shirts	FredJones	Firstname: Fred
			Surname: Jones
2009-07-05T10:30:21	Shirts	4	Name: Frilly Pink Shirt
			Description: A Frilly Pink Shirt
			ThumbnailUri: frillypinkshirt.png

Challenges of Storing Different Entity Types

Although the Table service is flexible enough to store entities of different types in the same table, as shown in table 3, you should be very careful if you’re considering such an approach. If every entity you retrieve has a different schema, you’ll need to write some custom code that will deserialize the data to the correct object type.

Following this approach will lead to more complex code, which will be difficult to maintain. This code is likely to be more error prone and difficult to debug. We encourage you to only store entities of different types in a single table when absolutely necessary.
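To give a feel for the custom code involved, here is a minimal sketch that picks a type by inspecting which properties a row contains. It assumes you’ve already read each row’s properties into a dictionary of strings; ProductInfo, CustomerInfo, and MixedTableReader are hypothetical names, and a real application might prefer an explicit type-discriminator property over guessing from the shape.

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down illustrative types; the key properties are omitted.
public class ProductInfo
{
    public string Name { get; set; }
    public string Description { get; set; }
}

public class CustomerInfo
{
    public string Firstname { get; set; }
    public string Surname { get; set; }
}

// Hypothetical helper for illustration; not part of any Azure library.
public static class MixedTableReader
{
    // Decides which type a row represents from its property names.
    public static object Materialize(IDictionary<string, string> propertyBag)
    {
        if (propertyBag.ContainsKey("Firstname") && propertyBag.ContainsKey("Surname"))
        {
            return new CustomerInfo
            {
                Firstname = propertyBag["Firstname"],
                Surname = propertyBag["Surname"]
            };
        }

        if (propertyBag.ContainsKey("Name"))
        {
            // Description is optional here; it stays null when absent.
            string description;
            propertyBag.TryGetValue("Description", out description);
            return new ProductInfo { Name = propertyBag["Name"], Description = description };
        }

        throw new InvalidOperationException("Unrecognized entity shape.");
    }
}
```

Every new entity type adds another branch, and callers must cast the returned object, which is exactly the kind of complexity that makes this approach hard to maintain.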

Challenges of Extending Entities

On a similar note, if you need to modify the definition of existing entities, you should take care to ensure that your existing entities don’t break your application after the upgrade.

There are a few rules you should keep in mind to prevent you from running into too much trouble:

Treat entity definitions as data contracts; breaking the contract will have a serious effect on your application, so don’t do it lightly.
Code any new properties as additional rather than required. This strategy means that existing data will still deserialize into the new data structure. If your code requires existing entities to contain data for the new properties, you should migrate your existing data to the new structure.
Continue to support existing property names for existing data. If you need to change a property name, you should either support both the old and new names in your new entity or support two versions of your entity (old definition and new definition). If you only want to support one entity definition, you’ll need to migrate any existing data to the new structure.
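The last rule, supporting both the old and new names of a renamed property, can be sketched like this. The example assumes the stored properties have been read into a dictionary of strings; PropertyNameCompat and ReadRenamed are hypothetical names, and renaming Description to Summary in the usage note is an invented scenario.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of any Azure library.
public static class PropertyNameCompat
{
    // Reads the value stored under the new property name, falling back
    // to the old name for entities written before the rename.
    // Returns null when neither name is present.
    public static string ReadRenamed(
        IDictionary<string, string> propertyBag, string newName, string oldName)
    {
        string value;
        if (propertyBag.TryGetValue(newName, out value))
        {
            return value;
        }
        if (propertyBag.TryGetValue(oldName, out value))
        {
            return value;
        }
        return null;
    }
}
```

If Description were renamed to Summary, calling ReadRenamed(bag, "Summary", "Description") would return the right text for both old and new rows, with the new name taking precedence when both exist.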

Now that you’ve seen how entities are stored within the Table service, let’s look at what makes this scalable.