Today I was watching the "Windows Azure Storage: What’s Coming, Best Practices, and Internals" session from Build 2013. These are the best practices that were presented.
- ServicePointManager.UseNagleAlgorithm = false;
- ServicePointManager.Expect100Continue = false;
- ServicePointManager.DefaultConnectionLimit = 100 (Or more)
As I previously mentioned in “Why Are WebRequests Throttled? I Want More Throughput!”, the Windows Azure Storage team recommends disabling Nagle for small messages of less than 14 kilobytes. They also recommend raising the default connection limit; by default it is set to 2, which isn’t much for applications on the cloud. Another recommendation made by the team was to disable the Expect: 100-Continue handshake for requests you expect to succeed.
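Taken together, the three settings above can be applied once at application start-up. A minimal sketch (the limit of 100 is the session’s suggested starting point, not a hard rule, and `connectionString` elsewhere in these examples stands in for your own configuration):

```csharp
using System.Net;

// Apply once at start-up (e.g. in Global.asax or the role's OnStart).
// These are process-wide settings affecting all outgoing HTTP requests.
ServicePointManager.UseNagleAlgorithm = false;    // avoid Nagle delays on small payloads
ServicePointManager.Expect100Continue = false;    // skip the extra 100-Continue round trip
ServicePointManager.DefaultConnectionLimit = 100; // the default of 2 is far too low for cloud apps
```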
The use of the .NET 4.5 framework is greatly encouraged. A lot of work has gone into the Garbage Collector in order to drastically improve it. Using the latest framework is especially important for Worker Roles that continuously process data.
Keep latency to a minimum. The recommendation to accomplish this is to locate your Windows Azure Storage Accounts close to the consumers. Whether your data is used by computers or by your customers, you must take every precaution to keep them waiting as little as possible. In a post entitled “Size Matters!” I explored the idea of keeping the data transferred over the network as small as possible. Using both of these techniques is imperative when it comes to responding to clients as fast as possible.
Optimize what you send & receive. When working with the Windows Azure Blob Storage Service, the Windows Azure Storage team recommends using range reads, metadata and HEAD requests in order to limit the amount of data transferred over the network. When it comes to the Windows Azure Table Storage Service, they recommend using Upserts, Merge, Projection and Point Queries in order to minimize both the number of requests and the volume of data traveling over the network. And with the Windows Azure Queue Storage Service, the team recommends updating messages when needed and tuning batch sizes to fine-tune the number of calls made by your application.
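For the Blob service, a sketch of what “only fetch what you need” looks like with the storage client library (container and blob names are hypothetical):

```csharp
using System.IO;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

var account = CloudStorageAccount.Parse(connectionString);
var blob = account.CreateCloudBlobClient()
                  .GetContainerReference("data")
                  .GetBlockBlobReference("large-file.bin");

// HEAD request: fetches properties and metadata only, no blob body
blob.FetchAttributes();
long size = blob.Properties.Length;

using (var ms = new MemoryStream())
{
    // Range read: download only the first 64 KB instead of the whole blob
    blob.DownloadRangeToStream(ms, 0, 64 * 1024);
}
```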
My previous post, “What if 20,000 Windows Azure Storage Transactions Per Second Isn’t Enough?”, talks about the scalability targets for Windows Azure Storage Accounts, which is a major topic throughout the session. It’s recommended to use multiple Windows Azure Storage Accounts in order to get the most out of the Windows Azure Storage Services. The presenter also recommended spreading your data over multiple datacenters. As mentioned above, placing data close to your users will not only give you the benefit of using multiple Windows Azure Storage Accounts, but it will also help keep network latency to a minimum.
Pushing the concept a bit further, we should always try to distribute the load over many partitions to avoid spikes. A partition is a single blob in the Blob service, a single queue in the Queue service, or a set of entities sharing a PartitionKey in the Windows Azure Table Storage Service. A single blob has a target performance of 60 MBytes/sec, which means that working with 2 blobs concurrently yields a possible performance of 120 MBytes/sec. By spreading your work over multiple partitions, the Windows Azure Storage Services are able to adapt, by load balancing their indexes, in order to reduce strain and provide acceptable performance.
Caching is a topic I have yet to blog about, but it’s crucial! Caching will enable you to satisfy more requests per second than a Windows Azure Storage Account, which is limited to 20,000 requests per second. Accessing data on the Windows Azure Storage Services can be unpredictable; caching can help bring latency under control by providing more consistent response times. Finally, caching has an extra benefit that we usually do not think about: it provides a fallback solution for times when the origin data isn’t accessible for a short period.
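One simple way (my own sketch, not from the session) to put an in-process cache in front of storage reads is `System.Runtime.Caching.MemoryCache`; the 5-minute TTL and the `fetchFromStorage` delegate are illustrative:

```csharp
using System;
using System.Runtime.Caching;

static class StorageCache
{
    static readonly MemoryCache Cache = MemoryCache.Default;

    // Returns the cached value when present; otherwise falls through to
    // Windows Azure Storage and caches the result for subsequent requests.
    public static T GetCached<T>(string key, Func<T> fetchFromStorage) where T : class
    {
        var cached = Cache.Get(key) as T;
        if (cached != null) return cached;

        T value = fetchFromStorage(); // the real storage call
        Cache.Set(key, value, DateTimeOffset.UtcNow.AddMinutes(5)); // short TTL for illustration
        return value;
    }
}
```

A side benefit, as noted above: while the entry is still cached, a short storage outage goes unnoticed by your clients.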
Security is of great importance; you should always use HTTPS when you work with the Windows Azure Storage Services.
Not controlling parallelism within your Cloud Service can lead to a flood of calls being made to the Windows Azure Storage Services, and can ultimately lead to your application being throttled by the service. Consequently, your application would no longer be able to manipulate any data.
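One way to keep parallelism in check (my own sketch, not prescribed by the session) is to gate every storage call behind a `SemaphoreSlim`; the cap of 16 in-flight requests is illustrative, and the delegate stands in for your actual storage operation:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ThrottledStorageClient
{
    // At most 16 storage requests in flight at once (tune to your workload)
    private static readonly SemaphoreSlim Throttle = new SemaphoreSlim(16);

    public static async Task CallStorageAsync(Func<Task> storageOperation)
    {
        await Throttle.WaitAsync();
        try
        {
            await storageOperation(); // the real call to the storage service
        }
        finally
        {
            Throttle.Release();
        }
    }
}
```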
Debugging on the cloud is something that used to be complicated. It’s strongly recommended to enable logging & metrics on each of your Windows Azure Storage Services. This can be done via REST, the client API or the portal, and it enables clients to self-diagnose issues, including performance-related issues. It’s also important to note that data can be automatically disposed of according to user-specified retention policies. For example, you could have longer retention times for hourly metrics than for real-time metrics.
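Via the client API, enabling analytics looks roughly like this sketch, assuming the 2.0-era storage client library (the 7-day retention is an arbitrary example):

```csharp
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Shared.Protocol;

var client = CloudStorageAccount.Parse(connectionString).CreateCloudBlobClient();

// Read the current analytics settings for the Blob service, then update them
ServiceProperties props = client.GetServiceProperties();

props.Logging.LoggingOperations = LoggingOperations.All; // log reads, writes and deletes
props.Logging.RetentionDays = 7;                         // auto-dispose logs after a week
props.Metrics.MetricsLevel = MetricsLevel.ServiceAndApi; // per-API aggregates as well
props.Metrics.RetentionDays = 7;

client.SetServiceProperties(props);
```

The same call exists on the table and queue clients, so each service can carry its own retention policy.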
Best Practices For Blobs
- Try to match your read size with your write size and avoid reading small ranges from blobs with large blocks. You can use the following properties to control read and write size: CloudBlockBlob.StreamMinimumReadSizeInBytes / StreamWriteSizeInBytes
- Upload folder contents by uploading multiple files in parallel
- Upload files by uploading blocks in parallel
  - Uploading concurrently means multiple workers upload different blobs
  - Uploading in parallel means multiple workers upload different blocks for the same blob
- Uploading multiple blobs concurrently will execute faster than uploading multiple blocks to the same blob. This is because uploading multiple blocks to a single blob in parallel affects a single partition and will be limited by the partition’s performance targets, whereas uploading multiple blobs in parallel works on many different partitions and will probably be limited by the Virtual Machine’s bandwidth.
- A single blob is bound by the limits of a single partition
- Accessing multiple blobs concurrently scales really well
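A sketch of the folder-upload case above: each file goes to its own blob, several at a time, so the work spreads across partitions (folder path, container name, block size and degree of parallelism are all illustrative):

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

var container = CloudStorageAccount.Parse(connectionString)
                                   .CreateCloudBlobClient()
                                   .GetContainerReference("uploads");

Parallel.ForEach(
    Directory.EnumerateFiles(@"C:\data"),
    new ParallelOptions { MaxDegreeOfParallelism = 4 }, // concurrent blobs, not blocks
    path =>
    {
        var blob = container.GetBlockBlobReference(Path.GetFileName(path));
        blob.StreamWriteSizeInBytes = 4 * 1024 * 1024; // 4 MB blocks; match read size to write size
        using (var fs = File.OpenRead(path))
        {
            blob.UploadFromStream(fs);
        }
    });
```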
Best Practices For Table Storage
- For critical queries, select on PartitionKey and RowKey to avoid creating hotspots. Querying on any other field will result in a full table scan. Table scans are expensive and should be avoided at all costs for latency-sensitive scenarios.
- Batch entities by PartitionKey in order to update them together.
- Since Windows Azure Table Storage Service is schema-less, you can store multiple types in the same table.
- Each table’s index is based on the PartitionKey and the RowKey. If you need to query for other data, you will need to concatenate columns to form composite keys that you should store in the RowKey. In other circumstances, you may also want to store composite keys in the PartitionKey. If this happens, be sure that you have more than one entity per partition. Store related entities together to reduce IO and improve performance.
- Entity locality: as I mentioned in “Storing Windows Azure Storage Table Entities in Descending Order”, the PartitionKey and the RowKey determine the sort order for your entities.
- Keep an eye on the Table Service Client Layer; the 2.1 version is said to have dramatic performance improvements and a better NoSQL interface.
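The point-query and batching practices above can be sketched with the storage client library, assuming a hypothetical `OrderEntity` derived from `TableEntity` whose constructor takes a PartitionKey and a RowKey:

```csharp
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

var table = CloudStorageAccount.Parse(connectionString)
                               .CreateCloudTableClient()
                               .GetTableReference("orders");

// Point query: PartitionKey + RowKey together, the fastest possible lookup
var retrieve = TableOperation.Retrieve<OrderEntity>("customer-42", "order-0001");
var order = (OrderEntity)table.Execute(retrieve).Result;

// Batch: entities sharing a PartitionKey can be upserted in one round trip
var batch = new TableBatchOperation();
batch.InsertOrMerge(new OrderEntity("customer-42", "order-0002"));
batch.InsertOrMerge(new OrderEntity("customer-42", "order-0003"));
table.ExecuteBatch(batch);
```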
Best Practices For Queue Storage
- Make sure message processing is idempotent, because messages become visible again if the client worker fails to delete them, and a message risks getting processed more than once.
- Benefit from updating messages: extend the visibility time based on the message, or save intermediate state
- Message Count should be used to scale workers
- Dequeue Count should be used to identify poison messages and to validate the invisibility time used
- Use Blobs to store large messages: Increase throughput by having larger batches containing smaller messages.
- Use multiple queues to get more than a single queue partition and overcome the performance targets of a single queue
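Several of these practices show up together in a typical worker loop; a sketch (queue name, the 5-message batch, the 5-minute invisibility window and the poison threshold of 3 are all illustrative):

```csharp
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

var queue = CloudStorageAccount.Parse(connectionString)
                               .CreateCloudQueueClient()
                               .GetQueueReference("work-items");

// Batch dequeue: 5 messages at a time, invisible for 5 minutes
foreach (CloudQueueMessage msg in queue.GetMessages(5, TimeSpan.FromMinutes(5)))
{
    if (msg.DequeueCount > 3)
    {
        // Poison message: it keeps reappearing, so quarantine or drop it
        queue.DeleteMessage(msg);
        continue;
    }

    // ... idempotent processing goes here ...

    // Long-running work? Extend the invisibility window rather than letting
    // the message reappear mid-processing
    queue.UpdateMessage(msg, TimeSpan.FromMinutes(5), MessageUpdateFields.Visibility);

    queue.DeleteMessage(msg); // delete only once processing has succeeded
}
```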