Archives For Patterns

Getting to Know Containers

Containers have sparked genuine interest over the last few years. As a developer, I’ve had my fair share of “It Works on My Machine” days, where I spent a surprising amount of my time trying to identify why my code wouldn’t run in a given environment. Did I make a mistake? Did someone else make a mistake? Uncertainty, risk and the Human Factor definitely make for adrenaline-packed all-nighters. Continue Reading…

Quick Thoughts

Businesses need to be agile to compete in today’s global economy. Programmers use various tools and techniques in order to meet this business requirement. The challenge is great and quite complex. Going too fast without the right approach can lead to ephemeral success.

I believe that Microservices give us the agility and architectural patterns that empower us to scale and create value for the business at a far greater pace than a traditional tiered architecture does.

Forget about 3-tier architectures; they just don’t scale. Stateless services need to rebuild their internal state for every call, and they can generate tremendous pressure on data stores. Consequently, this generates back pressure that bubbles up through the layers of our solution and reaches the edge. Back pressure then translates into unavailable services. The key is Data Locality and Stateful Services.

Continue Reading…

2016-07-09

Speaking at DevTeach 2016

Years ago, I attended the DevTeach conference and was fortunate to participate in conversations that helped me overcome many challenges over the years that followed. This week I had the opportunity to speak at DevTeach in Montreal. For this event, I chose a topic that I’m really passionate about and needed to cover a lot of ground in a short amount of time.

The talk progressed from a public cloud, to an architectural pattern, to a hyper-scale microservice platform and finally to a programming model.

My goal with this talk was primarily to introduce Actors and Service Fabric, and then to provide attendees with additional information in the downloadable slides about the patterns that I feel are important to consider when building microservices.

Caught by surprise, I had a full room and a lot of great questions. Thanks everyone for making this a success. Continue Reading…

The 95th Percentile

Imagine a reality where you can detect and fix issues without your users noticing that something went wrong.

We all aspire to measure performance in some way, and choosing what to measure can be a challenge in itself. By default, we think about averages, and we forget that there are many other possible measurements. Continue Reading…
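
To make the contrast between averages and percentiles concrete, here is a minimal sketch. The latency values are invented for illustration, and the nearest-rank method used here is only one of several common ways to define a percentile.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical response times in milliseconds: 18 fast requests, 2 slow ones.
latencies_ms = [100] * 18 + [2000] * 2

mean_ms = statistics.mean(latencies_ms)  # pulled upward by the two outliers
p50_ms = percentile(latencies_ms, 50)    # what a typical request sees
p95_ms = percentile(latencies_ms, 95)    # what the slowest requests see

print(mean_ms, p50_ms, p95_ms)
```

In this made-up sample the average describes no request that actually happened, while the median and the 95th percentile together show that most requests are fast and a few are very slow.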

Never would have imagined that the laws of physics would be so important in a world where virtualization is the new normal.

Data Locality is Important

Data Locality refers to the ability to move the computation close to the data. This is important because when performance is key, IO quickly becomes our number one bottleneck. Data access times vary from milliseconds to seconds because of factors like hardware specifications and network capabilities.

Let’s explore Data Locality through the following scenario. I have eight files containing data about multiple trucks, and I need to identify trips. A trip consists of many segments, including short stops. So if the driver stops for coffee and starts again, this is still considered the same trip. The strategy depicted below is to read each file and to group data points by truck. This can be referred to as mapping the data. Then we can compute the trips for each group in parallel over multiple threads. This can be referred to as reducing the data. And finally, we merge the results in a single CSV file so that we can easily import it to other systems like SQL Server and Power BI.
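
The map, reduce and merge steps could be sketched roughly as follows. This is not the original implementation: the in-memory "files", the `(truck_id, timestamp)` row shape and the 15-minute stop threshold are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Assumption: a stop shorter than this keeps the truck on the same trip.
MAX_STOP_SECONDS = 15 * 60


def group_by_truck(files):
    """Map step: collect every timestamp per truck across all input files."""
    groups = defaultdict(list)
    for rows in files:
        for truck_id, timestamp in rows:
            groups[truck_id].append(timestamp)
    return groups


def compute_trips(item):
    """Reduce step: split one truck's timestamps into trips at long gaps."""
    truck_id, timestamps = item
    ordered = sorted(timestamps)
    trips = []
    start = previous = ordered[0]
    for current in ordered[1:]:
        if current - previous > MAX_STOP_SECONDS:
            trips.append((truck_id, start, previous))
            start = current
        previous = current
    trips.append((truck_id, start, previous))
    return trips


def find_trips(files):
    """Group by truck, then compute each truck's trips on a thread pool."""
    groups = group_by_truck(files)
    with ThreadPoolExecutor() as pool:
        per_truck = pool.map(compute_trips, sorted(groups.items()))
        return [trip for trips in per_truck for trip in trips]


def to_csv(trips):
    """Merge step: write all trips into a single CSV document."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["truck_id", "trip_start", "trip_end"])
    writer.writerows(trips)
    return out.getvalue()


# Two hypothetical "files", each a list of (truck_id, seconds) rows.
files = [
    [("A", 0), ("A", 60), ("B", 0)],
    [("A", 360), ("A", 4000), ("B", 120)],
]
trips = find_trips(files)
print(to_csv(trips))
```

Because each truck's group is independent of the others, the reduce step needs no coordination between workers, which is what makes it a natural candidate for distribution later on.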

Single Machine

The single machine configuration results were promising. So I decided to break it apart and distribute the process across many task Virtual Machines (TVM). Azure Batch is the perfect service to schedule jobs. Continue Reading…

Big Compute or Big Data?

This question comes up on a fairly regular basis, so I thought it would be interesting to share my understanding in the hope of helping you make the right decision.

Both are enablers, and they create opportunities through various approaches. When the problem is understood, and the algorithms vary by parameter, then Big Compute is definitely an approach to consider. When we know our input data, and are experimenting with various algorithms, Big Data is a clear winner.

This being said, let’s try to materialize this into something more concrete.

Big Compute shines at large scales. Easily parallelizable workloads are the best use cases, because they allow us to break the workload into independent tasks. This is where we can gain the most from large numbers of compute cores. Big Compute is all about executing any software package, written in any language, by passing in variables. This creates an amazing opportunity for developers to optimize their code to be extremely efficient. Optimizations include concurrency management, memory management, limiting IOPS and reducing network communication. Possible scenarios are well-known algorithms like Monte Carlo simulations, rendering and workflows.
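
As a small illustration of a workload made of independent tasks that differ only by their parameters, here is a sketch of a Monte Carlo estimate of pi. The task shape `(seed, samples)` is an assumption for the example, and the local thread pool merely stands in for the pool of compute nodes; in a real Big Compute setup each task would run on its own core or machine.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(task):
    """One independent task: count random points that fall inside the
    unit quarter circle. Only the parameters (seed, samples) vary."""
    seed, samples = task
    rng = random.Random(seed)  # private generator, so tasks share no state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits, samples


def estimate_pi(tasks):
    """Run every task, then combine the partial counts into one estimate."""
    # Assumption: a thread pool stands in for the remote compute nodes.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(count_hits, tasks))
    hits = sum(h for h, _ in results)
    total = sum(n for _, n in results)
    return 4 * hits / total


# Eight hypothetical tasks that differ only by their seed.
tasks = [(seed, 20_000) for seed in range(8)]
pi_estimate = estimate_pi(tasks)
print(pi_estimate)
```

The tasks never communicate with each other, so adding more of them (or more cores to run them) scales the workload without changing the code of any single task.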

Big Data is all about empowering us to experiment with our data by providing us with tools, query languages and scripting capabilities that are geared at giving us a lot of agility. Tinkering with algorithms is the perfect use case. We know our data, and want to extract insights from it. This means that we’re going to clean it, shape it and question it. Big Data is built for this; it makes it possible to iterate through multiple versions of our algorithms in a way that’s difficult with Big Compute.

So now that we’ve nailed this down, which is right for your workload?

Share your thoughts in the comments below

Microsoft Azure has a lot of options when it comes to provisioning environments and resources. In my last post about troubleshooting Virtual Machine allocations, I enumerated a few of these options and briefly mentioned that I favored destroying environments over shutting them down. In this post, I will try to give more context around my preference.
Continue Reading…

The Greenfield to Brownfield Transition

There’s a special moment in every project where it transitions from a Greenfield (new) project to a Brownfield (legacy) project. This moment usually occurs close to the second release.

As developers, we love Greenfield projects. It’s a new adventure where creativity runs wild and free.

The transition of a project from Greenfield to Brownfield is an event that we can all appreciate. It’s delicate and requires attention from everyone on the team. Continue Reading…

Automate Everything!

These two words immediately caught my attention! This is one of the hardest things for me as a developer. I spend most of my time designing and coding away, but I hardly spend any time with PowerShell… so I’ll be blunt: as a developer, I must learn PowerShell and I must change my Definition of Done (DoD). I am not done until my feature can be deployed repeatedly and reliably. Automation makes me sleep better at night because my deployments yield predictable results.

This new e-book is all about taking advantage of what the cloud has to offer. It’s packed with best practices for DevOps, data storage and high availability. Since the authors took a pattern-based approach, each chapter can be read independently.

Download all formats (PDF, Mobi and ePub), as well as a link to the companion content hosted by the Microsoft Virtual Academy.

Using Time-based Partition Keys in #Azure Table Storage

In a previous post about storing Azure Storage Table entities in descending order, I combined a time-based key with a GUID in order to create a unique key. This is practical when you need to use combined keys for the Row Key or Partition Key. But it’s not practical for logs.

A better solution for logs is to generate a Partition Key based on time. This allows you to query for logs by time period. There are many ways to generate time-based partitions, so I will cover the two that I use the most. Continue Reading…
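
To give a feel for the idea, here is one possible sketch of a time-based Partition Key. The hourly granularity and the `yyyyMMddHH` format are assumptions chosen for illustration, not necessarily either of the approaches covered in the full post, and the helper names are hypothetical. The filter string uses the `ge` and `lt` comparison operators that Table Storage queries accept.

```python
from datetime import datetime, timezone


def hourly_partition_key(moment):
    """Build a fixed-width key such as '2016070908' for the UTC hour.
    Zero-padded digits sort lexicographically in chronological order."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H")


def partition_range_filter(start, end):
    """Build a filter covering the hourly partitions from start
    (inclusive) up to end's hour (exclusive)."""
    return (
        f"PartitionKey ge '{hourly_partition_key(start)}' "
        f"and PartitionKey lt '{hourly_partition_key(end)}'"
    )


# Hypothetical time window for a log query.
start = datetime(2016, 7, 9, 8, 15, tzinfo=timezone.utc)
end = datetime(2016, 7, 9, 12, 0, tzinfo=timezone.utc)

key = hourly_partition_key(start)
query_filter = partition_range_filter(start, end)
print(key)
print(query_filter)
```

With keys like these, every log entry written during the same hour lands in the same partition, and a time-period query becomes a simple range comparison on the Partition Key.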