Archives For Blob Storage


I never would have imagined that the laws of physics would matter so much in a world where virtualization is the new normal.

Data Locality is Important

Data Locality refers to the ability to move the computation close to the data. This is important because when performance is key, I/O quickly becomes our number one bottleneck. Data access times can vary from milliseconds to seconds depending on factors like hardware specifications and network capabilities.

Let’s explore Data Locality through the following scenario. I have eight files containing data about multiple trucks, and I need to identify trips. A trip consists of many segments, including short stops; if the driver stops for coffee and starts again, it’s still considered the same trip. The strategy depicted below is to read each file and group the data points by truck. This can be referred to as mapping the data. Then we compute the trips for each group in parallel over multiple threads. This can be referred to as reducing the data. Finally, we merge the results into a single CSV file so that we can easily import them into other systems like SQL Server and Power BI.
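
To make that strategy concrete, here’s a minimal PowerShell sketch of the map, reduce and merge steps. The file location, the TruckId and Timestamp columns, and the 30-minute stop threshold are all assumptions for illustration, not the post’s actual code.

# Map: read every file and group the data points by truck.
# Assumes CSV files with TruckId and Timestamp columns (hypothetical schema).
$points = Get-ChildItem 'C:\data\trucks\*.csv' |
    ForEach-Object { Import-Csv $_.FullName }
$groups = $points | Group-Object -Property TruckId

# Reduce: compute the trips for each truck in parallel, one job per group.
$jobs = foreach ($group in $groups) {
    Start-Job -ArgumentList @(,$group.Group) -ScriptBlock {
        param($truckPoints)

        # Stops shorter than this threshold stay within the same trip.
        $maxStop = New-TimeSpan -Minutes 30

        $tripId   = 0
        $previous = $null
        foreach ($p in ($truckPoints | Sort-Object { [datetime]$_.Timestamp })) {
            if ($previous -and
                ([datetime]$p.Timestamp - [datetime]$previous.Timestamp) -gt $maxStop) {
                $tripId++   # a long stop ends the trip; a new one begins
            }
            $p | Add-Member -MemberType NoteProperty -Name TripId -Value $tripId -PassThru
            $previous = $p
        }
    }
}

# Merge: gather the results into a single CSV for SQL Server or Power BI.
$jobs | Wait-Job | Receive-Job |
    Export-Csv 'C:\data\trips.csv' -NoTypeInformation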

Single Machine

The single-machine configuration results were promising, so I decided to break the process apart and distribute it across multiple Task Virtual Machines (TVMs). Azure Batch is the perfect service to schedule these jobs. Continue Reading…


Has Something Gone Wrong?

Generally, we choose to leverage Read-Access Geo-Redundant Azure Storage Accounts (RA-GRS) because we can use them as part of our disaster recovery (DR) plan. And sometimes, we forget that the devil is in the details. DR plans are rarely tested, and they can cause headaches when they are. So let’s relieve some of those headaches.

Headache…

“Geo Replication Lag” for GRS and RA-GRS Accounts is the time it takes for data stored in the Primary Region of the storage account to replicate to the Secondary Region of the storage account. Because GRS and RA-GRS Accounts are replicated asynchronously to the Secondary Region, data written to the Primary Region of the storage account will not be immediately available in the Secondary Region. Customers can query the Geo Replication Lag for a storage account, but Microsoft does not provide any guarantees as to the length of any Geo Replication Lag under this SLA.

The Recovery Time Objective (RTO) and the Recovery Point Objective (RPO) are the first items that come up in DR discussions. When we use RA-GRS we control the RTO, because we decide when to start reading from the secondary location. The RPO is a bit different, because it varies with physics and load. The best way to get the current Recovery Point (RP) is to get the last sync time of the RA-GRS account in question. This post is all about getting the right information when we need it, because we need facts to make the right decisions. Continue Reading…
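
As a taste of what the full post covers, here’s a hedged sketch of reading the last sync time from PowerShell, assuming the 2015-era Microsoft.WindowsAzure.Storage client library is loaded and that $accountKey holds a real key; the account name is a placeholder.

# Build a storage account reference from placeholder credentials.
$creds   = New-Object Microsoft.WindowsAzure.Storage.Auth.StorageCredentials('myaccount', $accountKey)
$account = New-Object Microsoft.WindowsAzure.Storage.CloudStorageAccount($creds, $true)
$client  = $account.CreateCloudBlobClient()

# Service stats are only served from the secondary endpoint.
$options = New-Object Microsoft.WindowsAzure.Storage.Blob.BlobRequestOptions
$options.LocationMode = [Microsoft.WindowsAzure.Storage.RetryPolicies.LocationMode]::SecondaryOnly

# GeoReplication.LastSyncTime is our current Recovery Point (RP).
$stats = $client.GetServiceStats($options, $null)
$stats.GeoReplication.LastSyncTime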


A Monster VM Azure Resource Manager Template

In April I wrote a post about building a monster Virtual Machine using PowerShell on Microsoft Azure. Since then, Microsoft has released version 2 of the Azure Resource Manager (ARM). This version allows us to define a Virtual Machine, its data disks and its Desired State Configuration (DSC) VM extensions as a template. Seeing this as a great opportunity, I decided to convert my original PowerShell script into an ARM template that creates a Virtual Machine with a striped data disk; a deployment sketch follows the spec below.

The Target Virtual Machine Configuration

16 Cores
112 GB of RAM
800 GB of local SSD for temp disk
32 TB for the data disk
50,000 IOPS for the data disk
512 MB per second for the data disk
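
Deploying such a template could look like the following sketch, based on the pre-1.0 Azure PowerShell cmdlets; the resource group, template file and template parameters (vmName, vmSize) are assumptions, though the Standard_D14 size does match the 16-core, 112 GB spec.

# Switch the module into Resource Manager mode (pre-1.0 Azure PowerShell).
Switch-AzureMode AzureResourceManager

New-AzureResourceGroup -Name 'MonsterVM-RG' -Location 'West US'

# vmName and vmSize are hypothetical parameters defined by the template.
New-AzureResourceGroupDeployment -ResourceGroupName 'MonsterVM-RG' `
                                 -TemplateFile '.\monster-vm.json' `
                                 -vmName 'monstervm' `
                                 -vmSize 'Standard_D14'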

Continue Reading…


In my last blog post, I shared how I use a Microsoft Azure Virtual Machine as my development machine. It’s been a month since I made this drastic change, and I must admit that it’s been pretty cool! The Azure Virtual Machine has given me more agility in my hectic schedule. I’ve been able to start my day from one machine, move to a different machine, and continue from where I left off. I no longer deal with putting my machine to sleep or trying to remember where I was. When I log off from one machine, I log back in from another and pick up right where I left off.

It’s Getting Tight

As I move from project to project, the resource requirements vary, and sometimes I need more than my current VM size provides. Fortunately, we can scale our Virtual Machines up and down through PowerShell and the Azure Portal. Continue Reading…
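
For example, scaling up through PowerShell can be as simple as this sketch with the classic service management cmdlets; the cloud service name, VM name and target size are placeholders.

# Resize the VM in place; Azure reboots it into the new size.
Get-AzureVM -ServiceName 'mydevservice' -Name 'mydevvm' |
    Set-AzureVMSize -InstanceSize 'Standard_D4' |
    Update-AzureVM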


Microsoft Azure is all about the opportunity to push back on known boundaries. In the last couple of months, I dealt with scenarios where on-premises Data Centers ran out of capacity. Consequently, we were not able to push our tests as far as we would have liked. Taking the workloads to Microsoft Azure gave us the opportunity to stretch them to their limits.

In one of these scenarios, the workload was limited to 16 aging physical machines, and the time required to process the data was not acceptable. We needed to find ways to reduce the compute time, and had no means to accomplish this on-premises. To push this workload to its limits, we deployed it on Microsoft Azure and provisioned it with well over 256 cores. We ended up processing workloads so fast that we could now run them multiple times a day. This newly discovered agility gave us the ability to refine the workload processes without disturbing ongoing business activities.

The first scenario was all about the lack of compute resources needed to push a workload to its full potential. In the second scenario, we needed an impressive amount of resources within a single Virtual Machine. This blog post is all about how we created this Monster VM, starting from the requirements below.

Building a Monster VM

The Requirement

16 Cores
112 GB of RAM
800 GB of local SSD for temp disk
32 TB for the data disk
50,000 IOPS for the data disk
512 MB per second for the data disk
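
To ground those numbers, here’s a hedged sketch of how the 32 TB data disk could be assembled with the classic cmdlets: 32 one-terabyte disks attached to the VM, then striped inside the guest. The service and VM names are placeholders, and this is not necessarily the exact script from the post.

$vm = Get-AzureVM -ServiceName 'monsterservice' -Name 'monstervm'

# Attach 32 x 1 TB (1023 GB) empty data disks, one per LUN.
for ($lun = 0; $lun -lt 32; $lun++) {
    $vm = $vm | Add-AzureDataDisk -CreateNew `
                                  -DiskSizeInGB 1023 `
                                  -DiskLabel "data-$lun" `
                                  -LUN $lun
}

$vm | Update-AzureVM

# Inside the guest, the disks are then striped into a single volume
# (for example, with a Storage Spaces pool) to aggregate capacity and IOPS.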

Let’s think about that spec for a second. That’s monstrous! And we’re going to build it! Continue Reading…


The Challenge

As developers, we deal with lots of complexity, and this is a good thing. It forces us to be creative, and sometimes to go beyond our known universe to overcome challenges.

Microsoft Azure is designed to help us make the right choices. It imposes performance targets through a multitude of mechanisms like throttling and quotas. One of these, as I’m sure you have come to know, is that we cannot scale a Cloud Service to zero instances. Let’s stop and think about this limitation for a second. How would you creatively overcome this challenge? Continue Reading…
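
One possible answer, offered here as a sketch and not necessarily the approach the full post lands on, is to delete the deployment entirely when zero instances are needed and redeploy the package later; the service name and file paths are placeholders.

# "Scale to zero" by removing the deployment; billing stops with it.
Remove-AzureDeployment -ServiceName 'myservice' -Slot Production -Force

# Later, redeploy the package to bring the instances back.
New-AzureDeployment -ServiceName 'myservice' `
                    -Slot Production `
                    -Package 'C:\Service\cloud_package.cspkg' `
                    -Configuration 'C:\Service\ServiceConfiguration.cscfg'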


Find The Closest #Azure Data Center

I regularly get asked how to choose a Microsoft Azure Data Center for optimal deployments. The answer is to pick the Data Center that is nearest to your users. For applications that have a broad user base, it’s preferable to deploy multiple instances of the application and to use Microsoft Azure Traffic Manager to direct users to the closest Data Center. This typically provides the best user experience. Continue Reading…
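
One quick way to compare candidate regions is to time a small request against an endpoint in each one. This sketch assumes you have a reachable storage endpoint per region; the account names are placeholders.

# Placeholder blob endpoints, one per candidate region.
$endpoints = @{
    'East US'        = 'https://myeastusaccount.blob.core.windows.net/'
    'West Europe'    = 'https://mywesteuaccount.blob.core.windows.net/'
    'Southeast Asia' = 'https://myseasiaaccount.blob.core.windows.net/'
}

# Time a HEAD request per region; the response itself is irrelevant,
# only the round-trip time matters.
foreach ($region in $endpoints.Keys) {
    $ms = (Measure-Command {
        try { Invoke-WebRequest -Uri $endpoints[$region] -Method Head } catch { }
    }).TotalMilliseconds
    '{0,-15} {1,8:N0} ms' -f $region, $ms
}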


Uploading Content to Azure Blobs

I use Azure Blobs on a regular basis. They’re generally really useful and have helped me through some tough situations. Working with Blobs is simple. You can interact with them through Visual Studio, third-party tools, REST, and even PowerShell.

The following PowerShell command demonstrates how I upload content to Azure Storage.

Set-BlobContent -StorageAccountName 'scaleupdowndemopkgs' `
                -StorageContainer 'packages' `
                -FilePath 'C:\Service\cloud_package.cspkg' `
                -BlobName 'extra_small_vm_cloud_package.cspkg'
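
Set-BlobContent is a helper function rather than a built-in cmdlet. Here’s a hedged sketch of what it might wrap, using the classic Azure PowerShell storage cmdlets:

function Set-BlobContent {
    param($StorageAccountName, $StorageContainer, $FilePath, $BlobName)

    # Resolve the account key and build a storage context.
    $key = (Get-AzureStorageKey -StorageAccountName $StorageAccountName).Primary
    $ctx = New-AzureStorageContext -StorageAccountName $StorageAccountName `
                                   -StorageAccountKey $key

    # Make sure the target container exists, then upload the file.
    New-AzureStorageContainer -Name $StorageContainer -Context $ctx `
                              -ErrorAction SilentlyContinue | Out-Null

    Set-AzureStorageBlobContent -File $FilePath `
                                -Container $StorageContainer `
                                -Blob $BlobName `
                                -Context $ctx `
                                -Force
}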

Continue Reading…