StartsWith

Using StartsWith to Filter on RowKeys

There are many scenarios where filtering on partial RowKeys makes sense. One of these scenarios is Azure Diagnostics Log analysis where events are partitioned by time based PartitionKeys and by compound RowKeys. This allows us to filter and find information effectively.

Event RowKeys are composed of deployment IDs, role names, instance names, categories and other information:

8637d014bcf94452a1e48f393a11674b___Brisebois.WorkerRole___Brisebois.WorkerRole_IN_0___0000000001652031520___WADLogsLocalQuery

Querying WADLogsTable Effectively

WADLogsTable
The following example, shows how to target a specific table partition and filter events based on a StartsWith pattern.

var storageAccount = Microsoft.WindowsAzure.Storage.CloudStorageAccount.Parse(connectionString);
var client = storageAccount.CreateCloudTableClient();

var table = client.GetTableReference("WADLogsTable");

// Querying Windows Azure Diagnostics by Partition for a partial RowKey
var query = new FindWithinPartitionStartsWithByRowKey("0635204061600000000", "8637d014bcf94452a");
var result = query.Execute(table);


The FindWithinPartitionStartsWithByRowKey query returned many events from my pet project ‘This day on Azure‘. As we can observe, the RowKey structure allows us to partially filter from left to right. This example shows data from a single Cloud Service. But there are times when many Cloud Service uses the same Azure Storage Account to persist Azure Diagnostics.

{Microsoft.WindowsAzure.Storage.Table.DynamicTableEntity}
    ETag: "W/\"datetime'2013-11-18T21%3A17%3A38.0626707Z'\""
    PartitionKey: "0635204061600000000"
    Properties: Count = 9
    RowKey: "8637d014bcf94452a1e48f393a11674b___Brisebois.WorkerRole___Brisebois.WorkerRole_IN_0___0000000001652031489___WADLogsLocalQuery"
    Timestamp: {2013-11-18 9:17:38 PM +00:00}

{Microsoft.WindowsAzure.Storage.Table.DynamicTableEntity}
    ETag: "W/\"datetime'2013-11-18T21%3A17%3A38.0626707Z'\""
    PartitionKey: "0635204061600000000"
    Properties: Count = 9
    RowKey: "8637d014bcf94452a1e48f393a11674b___Brisebois.WorkerRole___Brisebois.WorkerRole_IN_0___0000000001652031490___WADLogsLocalQuery"
    Timestamp: {2013-11-18 9:17:38 PM +00:00}

Filtering Within a Partition

In order to produce a StartsWith search pattern with Azure Table Storage queries, one must query based on RowKey (this also works for PartitionKeys) and set the lower bound greater or equal to the StartsWith pattern. Then we must increment the last character from the lower bound string in order to produce the upper bound of the query. This will produce the desired StartsWith capability that we all have come to cherish.

public class FindWithinPartitionStartsWithByRowKey
    : IQuery<CloudTable, List<DynamicTableEntity>>
{
    private readonly string partitionKey;
    private readonly string startsWithPattern;

    public FindWithinPartitionStartsWithByRowKey(string partitionKey,
        string startsWithPattern)
    {
        if (partitionKey == null)
            throw new ArgumentNullException("partitionKey");
        if (startsWithPattern == null)
            throw new ArgumentNullException("startsWithPattern");

        this.partitionKey = partitionKey;
        this.startsWithPattern = startsWithPattern;
    }

    public List<DynamicTableEntity> Execute(CloudTable model)
    {
        var query = new TableQuery();

        var length = startsWithPattern.Length - 1;
        var lastChar = startsWithPattern[length];

        var nextLastChar = (char) (lastChar + 1);

        var startsWithEndPattern = startsWithPattern.Substring(0, length) + nextLastChar;

        var prefixCondition = TableQuery.CombineFilters(
            TableQuery.GenerateFilterCondition("RowKey",
                QueryComparisons.GreaterThanOrEqual,
                startsWithPattern),
            TableOperators.And,
            TableQuery.GenerateFilterCondition("RowKey",
                QueryComparisons.LessThan,
                startsWithEndPattern)
            );

        var filterString = TableQuery.CombineFilters(
            TableQuery.GenerateFilterCondition("PartitionKey",
                QueryComparisons.Equal,
                partitionKey),
            TableOperators.And,
            prefixCondition
            );

        var entities = model.ExecuteQuery(query.Where(filterString));

        return entities.ToList();
    }
}

Filtering Within a Partition Range

This query is essentially the same as the previous query, but it allows us to extend our search to multiple partitions. Remember that smaller data sets produce faster results.

public class FindWithinPartitionRangeStartsWithByRowKey
    : IQuery<CloudTable, List<DynamicTableEntity>>
{
    private readonly string partitionLowerBound;
    private readonly string partitionUpperBound;
    private readonly string startsWithPattern;

    public FindWithinPartitionRangeStartsWithByRowKey(string partitionLowerBound,
        string partitionUpperBound,
        string startsWithPattern)
    {
        if (partitionLowerBound == null)
            throw new ArgumentNullException("partitionLowerBound");
        if (partitionUpperBound == null)
            throw new ArgumentNullException("partitionUpperBound");
        if (startsWithPattern == null)
            throw new ArgumentNullException("startsWithPattern");

        this.partitionLowerBound = partitionLowerBound;
        this.partitionUpperBound = partitionUpperBound;
        this.startsWithPattern = startsWithPattern;
    }

    public List<DynamicTableEntity> Execute(CloudTable model)
    {
        var query = new TableQuery();

        var length = startsWithPattern.Length - 1;
        var lastChar = startsWithPattern[length];

        var nextLastChar = (char) (lastChar + 1);

        var startsWithEndPattern = startsWithPattern.Substring(0, length) + nextLastChar;

        var prefixCondition = TableQuery.CombineFilters(
            TableQuery.GenerateFilterCondition("RowKey",
                QueryComparisons.GreaterThanOrEqual,
                startsWithPattern),
            TableOperators.And,
            TableQuery.GenerateFilterCondition("RowKey",
                QueryComparisons.LessThan,
                startsWithEndPattern)
            );

        var partitionFilterString = TableQuery.CombineFilters(
            TableQuery.GenerateFilterCondition("PartitionKey",
                QueryComparisons.GreaterThanOrEqual,
                partitionLowerBound),
            TableOperators.And,
            TableQuery.GenerateFilterCondition("PartitionKey",
                QueryComparisons.LessThanOrEqual,
                partitionUpperBound)
            );

        var filterString = TableQuery.CombineFilters(
            partitionFilterString,
            TableOperators.And,
            prefixCondition
            );

        var entities = model.ExecuteQuery(query.Where(filterString));

        return entities.ToList();
    }
}

Reusable Queries

In 2012, I wrote about building reusable and testable queries. This query object interface has been tested on several of my projects and has proven to clean up solutions. It provides a way to name questions so that we can discover them by using tools like R# (ReSharper). It also has the added benefit of promoting code reuse and strongly encourages self documenting code.

New developers can browse the solution and discover the application’s capabilities by reading file names. Each name describes a question that is used throughout the application.

I have used this technique on Cloud Services, Web APIs and even on Clients. And every time I put this into practice, the code got cleaner and easier to work with.

A query isn’t just SQL or LINQ, it’s a question that is asked about or to something. In this specific case, we are asking questions to Azure Table Storage, but this same pattern can be applied to business domains, databases and plain old objects.

public interface IQuery<in TModel, out TResult>
{
    TResult Execute(TModel model);
}

5 responses to #Azure Table Storage – Using StartsWith to Filter on RowKeys

  1. 

    What are the speed implications of this?

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s