Working with the Table service REST API : Querying data (part 3) - Filtering data with LINQ & Selecting data using the LINQ syntax

3/17/2011 5:11:55 PM

4. Filtering data with LINQ

In the previous section, we looked at how to filter queries server-side using the REST API. We’ll now look at how LINQ queries map onto the REST API.

As you may have guessed, LINQ queries eventually get resolved to the REST API URIs like the ones we looked at in the previous section. This means that although LINQ has a large and rich syntax, only those methods that map directly to the REST API can be supported.

While you’re debugging a LINQ query in Visual Studio, you can either hover over or put a watch on a context object (such as shirtContext in figure 1) and you’ll be able to see the underlying REST API query. Figure 1 shows the REST API query for a LINQ query that returns all products in the Shirts partition.

Figure 1. Mapping a LINQ query back to the REST API

Let’s now look at the typical queries that you’ll be able to perform.

Equality Comparisons

As you can see from the list in table 2, only equality, range comparisons, and Boolean comparisons can be performed using the Table service. The following queries are typical equality comparisons that can be performed:

where shirt.RowKey == "Red Shirt"

where shirt.Description != "A Red Shirt"

where shirt.PartitionKey == "Shirts"
  && shirt.Description != "A Red Shirt"

Range Comparisons

The Table service supports the filtering of range data using range queries. For example, the following WHERE clause will return those shirts priced at $50 or more, and less than $70:

where shirt.Price >= 50 && shirt.Price < 70
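As a rough illustration of how such a clause ends up on the wire, the following sketch builds a REST URI carrying an equivalent OData filter. The account name myaccount is hypothetical, and the exact filter text that the client library generates (parentheses, numeric literal suffixes) may differ from this hand-written one.

```csharp
using System;

// Hypothetical account and table names, used only for illustration.
string account = "myaccount";
string table = "Products";

// Hand-written OData filter equivalent to:
//   where shirt.Price >= 50 && shirt.Price < 70
string filter = "Price ge 50 and Price lt 70";

// Percent-encode the filter so that it is safe to place in the query string.
string uri = $"http://{account}.table.core.windows.net/{table}?$filter={Uri.EscapeDataString(filter)}";

Console.WriteLine(uri);
```

The operators ge, lt, and and are the OData counterparts of >=, <, and && in the LINQ clause.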

Because data is stored in the Table service as native types, rather than as string representations, the Table service will perform comparison routines using the native types rather than string comparisons. The following query will return all shirts whose price is greater than or equal to $50.20:

where shirt.Price >= 50.20

If this query were performed as a string comparison (which you would have to do with Amazon SimpleDB), it would not return shirts priced at $100, because the string 100 sorts before 50.20 when compared character by character. To get correct results, both values would have to be stored zero-padded to the same width, such as 050.20 and 100.00.

In Windows Azure Table service, the only time you need to worry about performing equivalent string comparisons is if you store a non-string value (such as a number or a date) as a partition or row key. Partition and row keys are always represented as strings in the Table service, so if you need to perform range comparisons on these keys, you’ll need to ensure that the stored strings all have the same length, for example by zero-padding numbers.
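The effect of string length on key ordering can be sketched in memory. The example below assumes that keys are compared ordinally (character by character), and ToKey is a hypothetical helper, not part of any storage library.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: format a numeric ID as a fixed-width, zero-padded key.
static string ToKey(int id) => id.ToString("D10", CultureInfo.InvariantCulture);

// Unpadded: "9" sorts after "10" because the character '9' is greater than '1'.
bool unpaddedInOrder = string.CompareOrdinal("9", "10") < 0;

// Padded: "0000000009" sorts before "0000000010", matching numeric order.
bool paddedInOrder = string.CompareOrdinal(ToKey(9), ToKey(10)) < 0;

Console.WriteLine($"unpadded 9 sorts before 10: {unpaddedInOrder}");
Console.WriteLine($"padded 9 sorts before 10: {paddedInOrder}");
```

With a fixed width, a range comparison on the key strings selects the same entities as the corresponding numeric range would.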

Boolean Logic

As stated earlier, the Table service does respect property types. This means you can perform Boolean logic against entity properties that are defined as bool. For example, you could perform the following WHERE clause against a shirt that’s marked as a genuine Hawaiian shirt:

where shirt.IsMadeInHawaii && shirt.Price > 50

Prefix Queries

Using the range comparison and Boolean logic, you can manipulate your LINQ and REST queries to return all entities that start with a particular string. For example, if you wanted to return all shirts that were present in any of Partition1, Partition2, Partition3, or Partition4, you could use the following query:

where shirt.PartitionKey.CompareTo("Partition1") >= 0 &&
     shirt.PartitionKey.CompareTo("Partition5") < 0
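To see which keys such a range picks up, here is a small in-memory sketch of the same comparison. It assumes ordinal string ordering and uses made-up partition keys; it runs against a local array, not the Table service.

```csharp
using System;
using System.Linq;

// Sample partition keys standing in for data held by the Table service.
string[] partitionKeys =
{
    "Partition0", "Partition1", "Partition10", "Partition2",
    "Partition3", "Partition4", "Partition5"
};

// Keep every key k where "Partition1" <= k < "Partition5", compared ordinally.
var matches = partitionKeys
    .Where(k => string.CompareOrdinal(k, "Partition1") >= 0
             && string.CompareOrdinal(k, "Partition5") < 0)
    .ToList();

Console.WriteLine(string.Join(", ", matches));
```

Because the comparison is made on strings, a key such as Partition10 also falls inside the range: it begins with Partition1 and sorts before Partition5.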

LINQ to Objects Queries

Even though only a small subset of the LINQ syntax is available to be executed by the Table service, you can still perform in-memory LINQ queries (LINQ to Objects). In-memory LINQ queries do provide full access to the LINQ syntax, but all queries are executed on the client side, so they require the full dataset to be returned by the Table service first. This approach isn’t suitable for situations where you’re working with a large set of data.
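A minimal sketch of the client-side approach follows. The list stands in for entities that have already been downloaded, and the names and prices are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for entities already downloaded from the Table service,
// for example by calling ToList() on a server-side query.
List<(string Name, double Price)> downloaded = new()
{
    ("Red Shirt", 55.0),
    ("Blue Shirt", 20.0),
    ("Hawaiian Shirt", 65.0),
};

// Ordering and aggregation run entirely on the client (LINQ to Objects).
var byPrice = downloaded.OrderByDescending(s => s.Price).ToList();
double average = downloaded.Average(s => s.Price);

foreach (var shirt in byPrice)
{
    Console.WriteLine($"{shirt.Name}: {shirt.Price}");
}
Console.WriteLine($"Average price: {average}");
```

Every entity in the list had to cross the network before the ordering or the average could be computed, which is why this pattern suits only small result sets.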

By now you should have a taste of the types of queries that you can perform against the Table service. Let’s now look at how you can shape the data that’s returned from your queries.

5. Selecting data using the LINQ syntax

As you’ll have noticed in the supported LINQ syntax list (table 2), there was no mention of the SELECT statement. You can use the SELECT statement to return the entire entity, but you can’t use SELECT to instruct the Table service to only return a subset of the entity properties.

Returning an Entire Entity Using Select

To illustrate the limitations of using SELECT, let’s look again at a LINQ query that returns a product entity in its entirety:

var shirts = from shirt in shirtContext.Products
             where shirt.PartitionKey == "Shirts"
             select shirt;

This LINQ query was used earlier to return all entities that reside in the Shirts partition of the Products table. The following code is an Atom XML extract of one of the entities returned by the preceding LINQ query:

<content type="application/xml">
   <m:properties>
      <d:PartitionKey>Shirts</d:PartitionKey>
      <d:RowKey>shirts0</d:RowKey>
      <d:Timestamp m:type="Edm.DateTime">2009-07-29T21:14:45.022Z</d:Timestamp>
      <d:Description>A Shirt</d:Description>
      <d:Name>shirtshirts0</d:Name>
   </m:properties>
</content>

As you can see from the XML for the returned entity, every property of the product entity is returned by the Table service (PartitionKey, RowKey, Timestamp, Description, and Name).

If the Products table were held in SQL Server rather than the Table service, and the LINQ statement were executed against the database using LINQ2SQL or LINQ2Entities, the following SQL statement would be generated and executed on the SQL Server database:

SELECT PartitionKey, RowKey, Timestamp, Description, Name
FROM Products
WHERE PartitionKey = 'Shirts'

Shaping the Query

If you’re using LINQ2SQL or LINQ2Entities with a SQL Server database, and you don’t need to return the entire entity, you might choose to write a more efficient LINQ query that only requests and returns specific columns from the SQL Server Database. The following SQL statement requests just the Name and Description properties:

SELECT Name, Description
FROM Products
WHERE PartitionKey = 'Shirts'

The preceding SQL statement is less intensive to execute on the server (as there is less data being queried) and it will also use less network bandwidth due to the reduced dataset being returned to the application.

When you’re using LINQ2SQL or LINQ2Entities, you can replace a less efficient select clause like this:

select shirt

with a projection that generates the more efficient SQL statement:

select new
   {
      Name = shirt.Name,
      Description = shirt.Description
   };

The earlier LINQ query that selected the whole entity would then look like this:

var shirts = from shirt in shirtContext.Products
             where shirt.PartitionKey == "Shirts"
             select new
             {
                Name = shirt.Name,
                Description = shirt.Description
             };

Unfortunately, because the Table service doesn’t support data shaping using the SELECT statement, you’d get a nasty exception if you attempted to run the preceding LINQ query. As a result, whenever you execute queries against the Table service, every property of the entity will always be returned as part of the query.

If you really do need to shape the returned data in your application, and you don’t mind that the entire entity will be returned from the server, you can always shape it locally using the following code:

var shirts = from newShirt in
             (
                from shirt in shirtContext.Products
                where shirt.PartitionKey == "Shirts"
                select shirt
             ).ToList()
             select new
             {
                Name = newShirt.Name,
                Description = newShirt.Description
             };

The preceding code uses the same LINQ query as in section 2 to filter the data in the Table service and return entire entities. Calling the ToList method on the inner LINQ query executes that query against the Table service and loads the complete entities into memory.

Finally, the result of the ToList method is fed into the outer LINQ to Objects query, which performs in-memory shaping of the data, returning a new anonymous type containing the two properties that you want.

You should be aware that although this query returns the entities shaped as you specify, it won’t improve server-side or bandwidth efficiency. If you have a very large entity with an infrequently used property that you don’t need in a particular query, this unused property will still be returned by the Table service.

6. Paging data

By default, SELECT queries will only return 1,000 items in a single result set. Not only is this the default amount of data returned, but it’s also the maximum amount of data returned.

If you wish to return a smaller amount of data, you can set this with the Take statement in LINQ, as follows:

(from shirt in shirtContext.Products
where shirt.PartitionKey == "Shirts"
select shirt).Take(100);

The preceding LINQ statement will return the first 100 items in the Shirts partition. The LINQ Take extension method will be resolved to the following query string parameter in the URI for the REST API call:

&$top=100

If more items could be returned by the query than are present in the result set, continuation tokens will be provided to allow you to retrieve the next set of data in the query. This method of using continuation tokens effectively provides a method of paging.

If you wanted to return all items in the Shirts partition of the Products table, but it potentially contains more than 1,000 items, you could run the following REST API query:

http://silverlightukstorage.table.core.windows.net/Products?$filter=PartitionKey%20eq%20'Shirts'

Because the query matches more than 1,000 items, you’ll receive the following continuation tokens as headers in the response:

x-ms-continuation-NextPartitionKey: Shirts
x-ms-continuation-NextRowKey: 1001

If you wanted to return all the items in the Shirts partition that were not returned as part of the original query, you could retrieve the next set of data using the following query:

http://silverlightukstorage.table.core.windows.net/Products?$filter=PartitionKey%20eq%20'Shirts'&NextPartitionKey=Shirts&NextRowKey=1001

The preceding query would return the products in the Shirts partition from RowKey 1001 onwards, up to a maximum of 1,000 entities. If still more entities remain, the response will carry a further set of continuation tokens.
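The request-and-continue cycle can be sketched as a loop. The example below simulates the service with an in-memory list of 2,500 integer row keys; FetchPage is a hypothetical stand-in for one REST call and is not part of any storage library. A real client would issue HTTP requests and read the x-ms-continuation headers instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simulated partition holding 2,500 row keys (integers for simplicity).
List<int> allRows = Enumerable.Range(1, 2500).ToList();
const int PageSize = 1000; // maximum number of entities per response

// Hypothetical stand-in for one REST call: returns one page of rows plus the
// next row key, or null when no continuation token would be returned.
(List<int> Rows, int? NextRowKey) FetchPage(int? startRowKey)
{
    int start = startRowKey ?? 1;
    var rows = allRows.Where(r => r >= start).Take(PageSize).ToList();
    int nextStart = start + rows.Count;
    return (rows, nextStart <= allRows.Count ? nextStart : (int?)null);
}

var results = new List<int>();
int? token = null;
int requests = 0;
do
{
    // Pass the previous token (if any) to pick up where the last page ended.
    var (page, nextToken) = FetchPage(token);
    results.AddRange(page);
    token = nextToken;
    requests++;
} while (token != null);

Console.WriteLine($"Fetched {results.Count} entities in {requests} requests");
```

The loop keeps requesting pages until a response arrives without a continuation token, which is the signal that the result set is complete.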
