I’ve been playing around with Windows Azure Blob Storage Service for a while now and I have noticed that using directories greatly facilitates debugging and maintenance.
Imagine a container that holds 400 000 block blobs. The Windows Azure tools in Visual Studio 2012 allow you to list 100 000 blobs. When this limit is reached you are notified, and there is nothing you can do to continue listing. At that point you can turn to other tools such as CloudBerry Explorer and Azure Management Studio. Both tools will allow you to browse the full list of blobs.
Listing 100 000 blobs takes a considerable amount of time. Visual Studio lists blobs as a flat list, which isn’t very useful once you surpass 1000 blobs. CloudBerry Explorer and Azure Management Studio allow you to browse by directory.
The following blob contains information about the location of Microsoft Canada.
By its name, we know that this will be listed in the container root. Directories are built into Windows Azure Blob Storage Service. To enable directory listing for this blob, we need to tweak its name.
By using the '/' character in the blob name, we create a hierarchy that can be used to retrieve blobs by browsing through directories.
Listing the contents of the container using CloudBerry Explorer or Azure Management Studio will list the root directory from the blob name. In this case it will list "quebec". Listing the contents of this directory will result in a list of directories containing "montreal". The microsoft-canada-corporation.json blob will be accessible from this last directory.
Listing using directories
Listing without directories
As shown above, the directory tree is part of the blob name.
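The directory walk described above can be sketched in a few lines of Python. This is only a local simulation of delimiter-based listing over a flat list of names. It does not call the storage service, and `list_directory` is a hypothetical helper, not part of any Azure SDK. The full blob name is inferred from the walk-through above, and `readme.txt` is made up for illustration.

```python
def list_directory(blob_names, prefix="", delimiter="/"):
    """Return (subdirectories, blobs) found directly under `prefix`.

    Simulates browsing one level of a '/'-delimited hierarchy
    over a flat list of blob names.
    """
    directories = set()
    blobs = []
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        # Look only at the part of the name that follows the prefix.
        remainder = name[len(prefix):]
        head, sep, _ = remainder.partition(delimiter)
        if sep:
            # A delimiter remains, so this blob lives in a subdirectory.
            directories.add(prefix + head + delimiter)
        else:
            blobs.append(name)
    return sorted(directories), sorted(blobs)


container = [
    "quebec/montreal/microsoft-canada-corporation.json",  # inferred from the text
    "readme.txt",  # hypothetical blob stored in the container root
]

print(list_directory(container))
print(list_directory(container, prefix="quebec/"))
print(list_directory(container, prefix="quebec/montreal/"))
```

Each call returns only one level of the tree, which is why browsing by directory stays quick even when the container holds a very large number of blobs.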
Using directories doesn’t change anything when you are accessing blobs directly. But if you need to browse or filter blobs then directories are the way to go.
Save yourself time by storing:
- Blobs by categories and dates
- Batches of data in the same directory
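One way to apply the list above is to build blob names from a category and a date. The sketch below is a minimal example, assuming a `category/yyyy/mm/dd/file` layout. The helper name `build_blob_name`, the `locations` category and the date are all hypothetical and not taken from the original setup.

```python
from datetime import date


def build_blob_name(category, day, file_name):
    """Build a blob name of the form category/yyyy/mm/dd/file_name."""
    # Zero-padded date parts keep directories sorted chronologically.
    return "/".join([category, day.strftime("%Y/%m/%d"), file_name])


# Hypothetical category and date, for illustration only.
name = build_blob_name("locations", date(2013, 1, 15),
                       "microsoft-canada-corporation.json")
print(name)
```

With a layout like this, a whole day or a whole batch can be browsed or filtered on its own by listing the matching directory.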
By employing this practice, I have been able to save myself countless hours. I no longer have to wait monstrous amounts of time for blobs to be listed when I’m performing manual maintenance checks. I’m now able to browse through my data in a much more structured and efficient manner.
Directories in Windows Azure Blob Storage Service are like directories on your hard drive: if they are used properly, they help organize your data. When you have too many files in a directory, listing the contents of that directory will take a considerable amount of time. In the end, keeping reasonably sized directories will help maintain acceptable performance for code that has to list the contents of directories in order to perform its work.
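To check whether directories stay reasonably sized, you can count blobs per directory from a list of blob names. The following is a rough sketch. The helper names, the sample names and the `limit` value are assumptions chosen for illustration, not recommendations from the service.

```python
from collections import Counter
import posixpath


def blobs_per_directory(blob_names):
    """Count how many blobs sit directly in each directory."""
    # posixpath.dirname returns everything before the last '/';
    # blobs in the container root are counted under ''.
    return Counter(posixpath.dirname(name) for name in blob_names)


def oversized_directories(blob_names, limit):
    """Return the directories holding more than `limit` blobs, sorted by name."""
    counts = blobs_per_directory(blob_names)
    return sorted(d for d, n in counts.items() if n > limit)


# Hypothetical blob names, for illustration only.
names = [
    "quebec/montreal/a.json",
    "quebec/montreal/b.json",
    "quebec/montreal/c.json",
    "quebec/laval/d.json",
    "readme.txt",
]

print(blobs_per_directory(names))
print(oversized_directories(names, limit=2))
```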
The more blobs, the more you have to think about building an efficient hierarchy.