High Availability, and Disaster Recovery in Serverless solutions, does it matter?

Many a time when I talk to customers about Serverless architectures the topic of High Availability and Disaster Recovery comes up, the perception usually is that in a Serverless world we don’t care about these because, well; we are Serverless!

TL;DR: In this post, we discuss design and planning considerations for building robust serverless solutions. The goal is to focus on “what” should you plan when it comes to serverless and also provide an overview of “how” to design around these considerations. We cover an array of topics ranging from Ingestion EndPoint(s), Storage, Security and of course High Availability and Disaster Recovery. We also discuss governance considerations such as Compliance, Developer Experience and Release pipeline in the serverless world.

Note that while the references in this post refer to use of Azure Services, the concepts can be applied to any cloud provider and offerings.

Let’s get started!

Characteristics of a robust solution?

A robust solution can mean many things, to scope and level set our discussion we focus on the following characteristics of a solution:

Reliability: The system should work correctly in case of failures and faults.
Scalability: The system should be able to grow with same levels of performance if not better.
Maintainability: The system should organize itself to be productive and modifiable in the future.

Martin Kleppman describes these characteristics in amazing detail in his book Designing Data-Intensive Applications, a highly recommended read if you have anything to do with building scalable and quality software.

The Planning Sprint

The first question when thinking about Serverless comes around — do I need to plan for Serverless?. The answer is: If you are building a production quality solution then absolutely Yes.

The planning considerations, however, change from what you would do in a typical data center oriented architecture. The focus is not on how many servers I need to scale or how will I handle replication but rather what are the thresholds for the provider service and how much reliability can be provided by it. In my opinion, this is the right definition of being serverless; Your focus has changed to an abstraction of the underlying infrastructure, and you worry more about the service capabilities and thresholds than about the underlying hardware or virtualization.

Serverless Planning = Plan for service capabilities and thresholds and not for the infrastructure that runs the service.

I highly recommend that you run a planning sprint to determine requirements and how they will affect the provider service constraints. Firstly, a planning sprint (or Sprint 0) will give an opportunity to decide if Serverless makes sense for your workloads (the discussion of Serverless v/s Containers should happen here). Second, it also allows you to analyze the capabilities of the service to determine if you are choosing the right service for our job. Finally, it addresses concerns about geographical reach, compliance, and data sovereignty and future scale of the solution.

The what of Serverless

Below are areas to focus during the planning sprint, asking these questions allow us to look at things to consider when building serverless solutions:

These are guidelines and you may have more categories and questions based on unique requirements.

Ingestion Endpoint(s):

Understand how public and internal endpoints will handle requests.

Typical Size of a message? (Average, Max)?: Getting the average and max helps understand two aspects for scale: how much forecasted messages should I expect over a sustained period and what will be the max message size I need to accommodate during processing. This can impact which service you choose for consumption and processing of these messages, e.g., Azure IoT Hub today support message sizes up-to 256 KB so if you want to send larger messages, you will employ other techniques such as file upload or using Azure Storage Blobs. Understanding the size can also help decide if we split our messages before sending or can we trim the message itself so unnecessary bits are not sent over the wire improving bandwidth and processing times.

To know more about service throttles and limits for all Azure services, refer here

Who will send the messages? : Understand clients for your services or application:
- Plan for client(s) uniquely when required: The planning for a passive client like a browser will require fewer considerations around scale as compared to a Device which can send continuous streams of data and reach the throttle of a provider service quickly.
- Is there a direct connection (socket) possible?: This can determine how many active connections you need to configure in your public endpoint and whether the provider services will be able to handle them. It will also provide opportunity to tweak your services for optimum scale. For example, you can use the below configuration in Azure Functions to increase the incoming request pipeline and concurrent requests:

{
   "http":{
  "routePrefix":"api",
  "maxOutstandingRequests":20,
  "maxConcurrentRequests":10,
  "dynamicThrottlesEnabled":false
   }
}

Is there an intermediary between the client and endpoint: This could be a gateway server on-premise, an ISP VPN or a server running on Edge. The message processing and identity for messages coming from a gateway will be different as compared to a direct connection. Additionally, you may need to do IP Whitelisting, VNET configuration and hence need to understand if the provider service supports such functionality (refer to Offloading tasks to multiple services below for details on this topic).
- Burst mode: An additional thing to consider here is Burst mode scenarios where all the clients start sending messages at a particular time series increasing the load significantly and triggering a threshold for a downstream service. While true serverless services like Azure Functions are built to handle such scenarios, there could be a lag due to the provisioning of multiple consumption units and may also result in time-outs. In such a case you may want to dissect your workloads or move specific clients onto a dedicated tier to allow better distribution of the incoming requests.
- Offloading tasks to multiple services : In a Serverless Compute model we deal with Functions which can be considered as small single task that make up a business functionality. Since functions are lightweight, it is an good approach to offload some of the work required in message exchanges to other components. For example, when building REST API’s you can offload tasks like Load balancing, SSL Termination, Vanity URL, DNS resolution to an API Gateway, similarly you can offload authentication and authorisation to Identity service like Azure AD, the management of services can be offloaded to a service like API management. Finally, the geo-redundancy can be achieved using a service like Azure Traffic Manager. By using a bouquet of services, the Serverless Compute layer (aka Azure Functions) can focus solely on responding to triggers or handling events, and the remaining eco-system can work on ensuring the robustness of the solution.
What message format(s) are we dealing with?: The consideration here is whether the downstream services supports the message formats you want to send. Example, Azure IoT Hub today allows you to send binary data, but if you are analyzing data using Azure Stream Analytics (ASA), it supports CSV, JSON, AVRO as message formats today. So if you are sending data in BSON, or your proprietary format you will have to transform the payload before ASA can process the messages. You can use Azure Logic Apps to do the transformation, but now your architecture has changed and has more moving parts to manage.
Can we do batching at the site:: Batching small messages (e.g., Telemetry coming from a thermostat) is always recommended since it saves bandwidth and optimizes parallelism in downstream services. When possible try to batch, however, do consider the size limits of the service. Another consideration here whether the downstream service can process messages in batches since this can impact the load levelling of the solution. In most cases, this should not be a problem, but it is worth considering each service capability to process batches before making the decision.
Are there any workflow requirements to define a correlation between the data?: In the Serverless world, we are driving an Event Driven architecture for our solution. While the events model provides great loose coupling between our services, they also make it difficult to manage things like transactions, workflows, and correlation between components. Plan for how the incoming messages will need to be processed when state needs to be transferred across services. Ensure you use an orchestrator like Azure Logic Apps or Azure Durable Functions when component-to-component interaction is required. Additionally, leverage cloud patterns like Retry, Circuit Breaker, and Saga to ensure you can replay or rollback events in case of failures.
Are there any quality of service (QoS) attributes that apply to messages : Most Services provided by a cloud provider provide At-Least once messaging as a guarantee. This is primarily because of the considerations around the CAP theorem and the cost involved to build infrastructure that will provide higher guarantees. At-Least will work well for most interactions especially if you have appropriate retry logic and message handling employed in your service. If you have scenarios where message reliability is a must; first, think twice on why do you need such guarantees, in most case you won’t! In case you still convince yourself of a higher guarantee like exactly once, isolate the payloads that require such guarantee and use them sparingly.
Are there any specific protocols required to be able to send data to the cloud: A lot of IoT systems have custom protocols instead of the popular HTTP protocol. Consider the protocols supported by the provider services during service technical feasibility. In case the protocol is not supported you may have to build your custom protocol gateway layers which can impact your decision to use a Serverless service v/s building your custom component.
Frequency in which messages are being sent?: In a provider Service world, you are bound by the Units that you deploy to a Service (e.g., IoT Hub Units, Cosmos DB Request Units, Stream Units, etc.). The idea is a pattern known as the Scale Unit pattern which provides predictable monitoring and deployment of service units as your scale up, out or down. Since each service is bottled in a unit based model, you need to have consideration around how the incoming message will impact the units you have allocated for your service. But, in a true serverless world, this should not matter since the platform should automatically scale up or out, right? While it is true for Serverless services like Azure Functions (Consumption Plan), it does not apply to all services today. Also even in the case of core serverless services, there is going to be some degradation or lag when a new consumption unit gets deployed based on your load. While this lag is usually minimal (in ms), it can impact your response time if you are running a mission-critical application.

Storage

How data is stored and processed by downstream services.

Real Time v/s Batch Routing: Determine how the downstream systems will process the incoming messages. Your choice of service needs to align with how soon the data needs to be processed. For example, if you are processing data in motion, you need a near real-time service such as Azure Stream Analytics to process the data before it goes to other downstream systems. Similarly, if you are processing records over a period, you would instead want to employ Azure Time Series for processing. It is recommended to conceptually design your system using a model like the Lambda architecture and then decide which platform services match your requirements better.
Does the incoming data has any tags to classify / categories data?: Platform Services are getting more smarter as they learn about the needs of customers. You should explore features within these services that can provide out of box solution to complicated logic processing algorithm and then enrich your incoming messages to enable the use of the services. For example, if you want to route your incoming device data based on message properties or message body, IoT Hub provides a feature called Message Routing which can send messages to different downstream services based on a parameter in your message. It is handy if you are employing Hot Path vs. Cold Path Analytics since the same stream can now be sent to multiple downstream services without writing a single line of code.
Retention Policies and Archival: A lot of times planning for archival can be challenging but if you know how your data is growing and how much of it will move into cold storage you can employ some neat features provides by the platform services to reduce your cost and improve performance. For Example, Azure Storage Blob support a Tier based feature which allows you to move data from Hot, Cool to Archive Tiers, the pricing of each tier significantly varies and allows reducing data cost instead of using a single plan for both current and archival data.
Storage used by Serverless Compute: Azure Functions use storage accounts especially Blobs, Tables for its internal operations. What this means is that your Azure Function performance can be impacted by Storage limits and IOPS. Also, while developing Azure Functions, you need to plan for associated storage accounts including separating them per Function App, handling logs separately. If you are using Azure Durable Function, they leverage Azure Storage Queues for state management so you will need to consider additional implication when using Azure Functions.

Security

Leverage Security Threat models and Fault-Tolerant guidelines to prevent malicious attacks on your solution.

Transport and Messaging: Consider security at both layers
- Almost all provider services by default provide a secure transport channel of communication (HTTPS, AMQPS, etc.) for communication. Leverage this as a standard.
- Consider your authentication scheme and whether the service supports the negotiation through that scheme. Example, Azure Functions by default provide token-based security (Function, Anonymous, and Admin), if you need additional security such oAuth, you can leverage services like Azure API Management that can enable more secure scenarios.
- When thinking about using Third-party authentication schemes (e.g., Facebook Google, etc.) consider their SLAs. If you solely rely on a provider that has no SLA, your users may get locked out in case the external services goes down.
- Think end-end security and not just public endpoints. With a Serverless architecture, you will end up with a bunch of services that talk with each other to provide a solution. Some of these services may not have public endpoint however they still need to be secured to ensure the end-end protection of services.
Encryption and Encoding: Two key considerations if you have a system that encrypts or encode data when passing events between systems.
- Custom processes will be required if you are processing messages using platform services since they support standard formats
- The message size will increase and can impact overall persistence targets as well as response times because of encrypting/decrypting and encoding/decoding procedures.
Most platform services are secured at the transport layer so use these techniques sparingly for specific workloads where data security is a must. Note, I am not recommending here that you should loosen your security procedure but rather spend time in choosing your workloads and classify which messages need encryption. A way to plan for this is to build a Security and Fault Tolerance model and determine which messages can have a significant impact in case the system is compromised.
PII data: Whether your application has a public or internal endpoint; if users are accessing it, you need to think about their Privacy. This becomes a little tricky when using platform services since your solution is deployed on an infrastructure where the provider will also have a privacy policy. Understand the privacy policies described by the platform and align with your policy.
MTTR: Build a Mean Time To Respond strategy when it comes to security. You cannot stop hackers from always fiddling with your public services (especially if you are famous). With a Service provider, this becomes even lesser control for your organization. In the worst case scenario, your service or the platform provider service gets compromised, plan for a response strategy where you can limit the attack surface. For example, have proper monitoring in place and use analytics to determine variations in patterns, in case a change is detected block the impacted users, devices and issue patches through the automated build that limits the widespread of the issue.

Availability and Disaster Recovery

Availability out of box: The good thing about living in the serverless world is that you get availability out of the box. All services provide high availability capabilities and in most cases either autoscale or provide easy configuration to handle workloads. So technically, most of it is taken care. However, when thinking about availability, don’t restrict to the SLA provided by a “single” service; instead, focus on the end-end solution. Since we are dealing with multiple services, ensure that your solution uptime is not impacted by the aggregate SLA provided by a combination of provider services.
Transient Fault Handling: Serverless services provide some level of protection against transient failures through internal implementations of the Retry and Circuit Breaker patterns. For example, the WebJobs SDK which is the basis of Azure Functions provides these as part of the platform runtime.
- In addition to the default services, you can also use frameworks like Polly in your custom code to enable implementation of such patterns.
- Not all services provide transient fault handling capabilities so ensure you have appropriate measures on the calling end of the services. (e.g. EventHub triggers today does not have a automatic retry and the calling function needs to ensure retry logic.)
Disaster Recovery (DR):: There is minimal DR capabilities provided by the platform services today so if you are looking at a complete DR solution you will have to do extensive planning. Let’s look at some of these by breaking the DR conversation into the following components:
- Serverless Compute: Azure Functions lies under this umbrella and will constitute any custom code that you are running as part of your solution. A recommendation here is to use Stateless functions as much as possible, once you do that you can enable Disaster recovery by leveraging a service like Azure Traffic Manager. As of today, Azure Functions can be configured as an AppService in Traffic Manager and allows you to use any Routing strategy. Watch out for my next post on how to configure DR for Azure Functions to get more details.
- Data Replication: All data storage services in Azure include Azure SQL, CosmosDB, Azure Storage provide geo-replication of data across Azure data centers. These can be enabled to ensure that all data at rest can be moved to a different paired region and is available in case of a data center failure. Note that you will have to plan for the consistency pattern for the data based on your workloads, for example, if you choose eventual consistency there could be a possibility of data loss due to asynchronous replication.
- In-stream Processing: When we think about in-stream processing, I refer to Message queues and Job pipelines like Azure Stream Analytics. This is the tricky part when it comes to using provider services. Almost none of these services provide a message replication solution and even if they do there is minimal guarantees on data loss. Few ways to approach such situations are:
  - The first approach is to identify your workloads and see if they can live with the in-stream message data loss. So, basically losing messaging that are in the queue or currently under processing. This would require a robust client which can replay the message and is not possible in all scenarios.
  - Create a active-active cluster where the same message is directed to both data centers. While this will ensure message replication it can create problems around data duplication.
  - Some services like ServiceBus provides a mechanism where you can create NameSpace pairing to ensure primary data is copied to a secondary region in an asynchronous fashion.
- Service Availability: Last but not the least, ensure that the services that your leveraging are available in paired regions to enable a DR scenario. For example, Azure App Insights is currently available in Southeast Asia but not its paired region East Asia.
Throttling: Up-till now we have been discussing how to ensure the service is up and running, however, in some scenarios you want to assign thresholds to your service so that you can deny requests instead of continue to process them. Throttling pattern is a great way to ensure your service is healthy and not exceeding internal thresholds that you have set for service performance. In case of Serverless a lot of these is done for you by default. For example, based on the Unit model you select the provider service will automatically have a threshold defined and will issue HTTP 429 requests when the thresholds are reached. Additionally, when using Azure Functions in a Consumption plan you can put a throughput threshold per function to define when to throttle your endpoints. Plan for throttling and time-outs on your service to ensure the client have a predictable experience and can handle such response gracefully.

Maintenance

Tooling: One of the key considerations when it comes to Serverless will be whether there is sufficient tooling available for the development team to build an end-end solution. Several things to consider here:
- Programming Language: The choice of language will depend on whether the platform supports it. This becomes especially important when you have a development team with existing skills, for example GoLang is not supported by Azure Functions today. Also, some languages might be in experimental support and will not be ready for Production (e.g. TypeScript).
- Dependency Frameworks: The version of runtime frameworks that you need for your solutions will also be important. Example: Azure Function 1 runtime support Node 6.5.0 for production deployment, however the current LTS version is 9.6+.
- Cross-platform support: development teams who need to deploy on Linux and Windows need to ensure the runtime and Client SDKs are supported on required OS distributions.
- IDE support: check if the development tools are available and integral as part of the IDE. If not then look for third-party extensions available for the scenarios.
  
  A note for Visual Studio Code in case you are developing Azure Functions, it is perhaps the best cross-platform IDE available today with an intuitive Azure Function extension that makes development and deployment to Azure a breeze. If you have not checked it out, download it here.
- The DevOps cycle will be significantly impacted if you have don’t have the right tools in hand. Ensure that the service not just supports a Portal deployment but also command line and integration with CI / CD tools like Jenkins, VSTS, etc.
- Azure Pre-Compiled v/s Scripted functions: A note on Azure pre-compiled v/s Scripted functions. A lot of samples and videos that you see out there use the Azure Portal for development, when you develop in the Portal the function is called a Scripted function. While they are good for sample scenario, when developing a production system, I recommend you create a pre-compiled function using an IDE and deploy it using Azure tooling. A key reason is that scripted versions do not support versioning so every time you run such a function, the runtime creates an assembly and deploy it per function. This not only impacts scale but also makes it difficult to do change management for future iterations.
Monitoring: Another important aspect of Maintenance stems from how you monitor the system. The better the monitoring, the quicker you can find errors and issues and keep the system healthy. Few considerations when it comes to monitoring:
- End-End Telemetry: Most provider services have monitoring built-in which includes capturing events as well as monitoring dashboards such as Azure Monitor. While this is great from a particular service perspective when dealing with the entire solution you need to get data about event flows within the system and not just individual services. Services such as LogAnalytics and OMS greatly help in log aggregation and then displaying meaningful insights about the solution instead of just a single service. Additionally, Application Insights can be used to transmit custom logging data to these log aggregators to ensure end-end telemetry of the system can be obtained.
- Additionally, for custom logging scenarios leverage Semantic Logging frameworks that can assist with the integration of multiple sinks and channels without making changes to your logging API.

Compliance

Standard and Policies: : A Serverless solution is your solution running in a provider infrastructure so it is important to understand the implications around compliance and how much control and configuration you can enable.
- Provider Lock-In: The idea behind Serverless solution is to host your solutions in a provider environment. This by default encourages a provider lock-in since all of the services used by the solution will be specific to the vendor. But is that a bad thing? I would say it depends, in my experience a lot of customers who stick to a cloud do not move from too often or unless they experience some serious limitations and cost benefits. Since this is an infrequent action, I would suggest embracing the vendor services instead of being conservative and thinking about generic approaches. I do not say that because I work for a Cloud Provider, but instead, I have seen customers go down this rabbit hole of being generic and limiting their use of capabilities of a provider service resulting in a solution that could have been much better if they committed themselves to the provider service. This is a big decision for an organization though so carefully assess how you want to proceed.
  
  Azure Function is leading the way towards an Open direction by open sourcing the Function runtime; this enables sophisticated hybrid scenarios as well as portability to other clouds. Hopefully, other cloud vendors will be able to provide a standard runtime, so at-least the custom development on serverless can become portable.
- Regulations: In addition to lock-in, consider any legal implications of using the services.
  - Are there any standards or policies that are required to be adhered to for the data that is being persisted?
  - Are there any security standards need to be respected to ensure data security and compliance at rest?
  - Are there any requirements to ensure data is available in a specific region (e.g., all data must be persisted within a country)?
  Some of above questions can tailor or limit the use of provider services depending on their availability in a region so read the fine print carefully.

Understand the platform constraints

Apart from the customer requirements, it is essential to understand the limitations and throttles of the Serverless platform. This is critical since you are dealing with a bouquet of services and you would want to look at the end-end execution of operations to ensure you can get performance, scale, and fault-tolerance across the stack and not just for a specific service.

The Azure team has done a great job in providing best practices for most of Azure Services, you can check them out here:

Hope this post gave an in-depth tour of the considerations for a Serverless architecture. Finally, remember, as you delve into the Serverless solution, you would realize you have choices but you need to cognizant of each choice, and it can impact your long terms scalability, availability, and reliability of the solution.

Would love to hear you thoughts and comments and If you have guidelines or practices for developing serverless architectures, please do share 🙂 …

Building Serverless API’s with TypeScript and Azure Function Proxies

TL;DR: In this post, we build a microservice that uses Azure Functions and other awesome Serverless technologies provided by Azure. We will cover the following features:

Azure functions currently has support for TypeScript in preview and we will be using the current features available to develop a read/write REST API.
We leverage the Azure Function Bindings to define Input and Output for our functions.
We will look at Azure Function Proxies that provide a way to define consistent routing behavior for our function and API calls.

If you want to jump in; the source is available on GitHub here (https://github.com/niksacdev/sample-api-typescript).

TypeScript support for Azure Functions is in preview state as of now; please use caution when using these in your production scenarios.

Problem Context

We will be building a Vehicle microservice which provides CRUD operations for sending vehicle data to a CosmosDB document store.

The architecture is fairly straightforward and looks like this:

Let’s get started …

Setting up TypeScript support for Azure Functions

VSCode has amazingly seamless support for Azure Functions and TypeScript including a development, linting, debugging, and deployment extension, so it was a no-brainer to use that for our development. I use the following extensions:

Additionally, you will need the following to kick-start your environment:

azure-function-core-tools: You would need these for setting up the function runtime in your local development. There are two packages here, and if you are using a Mac environment like me, you will need the 2.0 preview version.
```
 npm install -g azure-functions-core-tools@core
		
```
Node.js (duh!): Note that the preview features currently works with 8.x.x. I have tried it on 8.9.4 which is the latest LTS (Latest LTS: Carbon), so you may have to downgrade using nvm if you are using the 9.X.X versions.

Interestingly, the Node version supported by Functions deployed in Azure is v6.5.0 so while you can locally play with higher versions you will have to downgrade to 6.5.0 when deploying to Azure as of today!

You can now use the Function Runtime commands or the Extension UI to create your project and Functions. We will use the Extension UI for our development:

Assuming you have installed the extension and connected to your Azure environment, the first thing we do is create a Function project.

Click on Create New Project and then select the folder that will contain our Function App.

The extension creates a bunch of files required for the FunctionApp to work. One of the key files here is host.json which allows you to specify configuration for the Function App. If you are creating HTTPTriggers, some settings that I would recommend tuning to improve your throttling and performance parameters:

{
    "functionTimeout": "00:10:00",
    "http": {
        "routePrefix": "api/vehicle",
        "maxOutstandingRequests": 20,
        "maxConcurrentRequests": 10,
        "dynamicThrottlesEnabled": false
    },
    "logger": {
        "categoryFilter": {
            "defaultLevel": "Information",
            "categoryLevels": {
                "Host": "Error",
                "Function": "Error",
                "Host.Aggregator": "Information"
            }
        }
    }
}

The maxOutstandingRequests can be used to control latency for the function by setting a threshold limit on the max request in waiting and execution queue. The maxConcurrentRequests allows control over concurrent http function requests to optimize resource consumption. The functionTimeOut is useful if you would want to override the timeout settings for the AppService or Consumption Plan which default limit of 5 minutes. Note that configuration in host.json are applied to all functions.

Also note that I have a custom value for route attribute (by default this is api/{functioname}). By adding the prefix, I am specifying that all HTTP functions in this FunctionApp will use the api/vehicle route. This is a good way to set the bounded context for your Microservice since the route will now be applied to all functions in this FunctionApp. You can also use this to define versioning schemes when doing canary testing. This setting can be used in conjunction with the route attribute in a function function.json, the Function Runtime appends your Function route with this default Host route.

Note that this behaviour can be simplified using Azure Function Proxies, we will modify these routes and explore more later in the Azure Function Proxies section.

To know more options available in host.json, refer here

Our project is now created, next, we create our Function.

Click Create Function and follow the onscreen instructions to create the function in the same folder as the Function App.

Since TypeScript is in preview, you will notice a (Preview) tag when selecting the language. This was a feature added in a new build for the extension, if you don’t see TypeScript as the language option, you can enable support for preview languages using the VSCode settings page, specify the following in your user settings:

“azureFunctions.projectLanguage": "TypeScript”

The above will use TypeScript as the default language and will skip the language selection dialog when creating a Function.

Select the HTTP Trigger for our API and then provide a Function Name.
Select Authorization as Anonymous .

Never use Anonymous when deploying to Azure

You should now have a function created with some boilerplate TypeScript code:

The function.json defines the configuration for your function including the Triggers and Bindings; the Index.ts is our TypeScript Function Handler. Since TypeScript is a transpiler, your Function needs the output .js files for deployment and not the .ts file. A common practice is to move these output files into a different directory so you don’t accidentally check them in. However, if you move them to a different folder and run the function locally you may get the following error:

vehicle-api: Unable to determine the primary function script. Try renaming your entry point script to 'run' (or 'index' in the caseof Node), or alternatively you can specify the name of the entry point script explicitly by adding a 'scriptFile' property to your function metadata.

To allow using a different folder, add a scriptFile attribute to your function.json and provide a relative path to the output folder.

Make sure to add the destination folder to .gitignore to ensure the output .js and .js.map files are not checked in.

"scriptFile": "../vehicle-api-output-debug/index.js"

The one thing that does not get added by default is a tsconfig.json and tslint.json. While the function will execute without these, I always feel that having these as part of the base setup helps in better coding practices. Also, since we are going to use Node packages, we will add a packages.json and install the TypeScript definitions for node

npm install @types/node —save-dev

We now have our Function and FunctionApp created, but there is one last step required before proceeding, setting up the debug environment. At this time, VSCode does not provide support for debugging Azure Functions written in TypeScript. However, you can enable support for TypeScript fairly easily. I came across this blog from Tsuyoshi Ushio that describes exactly how to do it.

Now that we have all things running, let’s focus on what our functions are going to do.

Building our Vehicle API

Developing the API is no different from your usual TypeScript development. From a Function perspective, we will split each operation into a Function. There is a huge debate whether you should have a monolith function API or a per operation (GET, POST, PUT, DELETE) API. Both approaches work, but I feel that within a FunctionApp you should try to segregate the service as much as possible, this is to align with the Single Responsibility Principle. Also, in some cases, you may achieve better scale by implementing a pattern like CQRS where your read and write operations go to separate functions. On the flip side, too many small Functions can become a management overhead, so you need to find the right balance. Azure Function Proxies provide a way to surface multiple endpoints using a consistent routing behavior, we will leverage this for our API in the discussion below.

In a nutshell, a FunctionApp is a Bounded Context for the Microservice, each Function is an operation exposed by that Microservice.

For our Vehicle API we will create two functions:

vehicle-api-get
vehicle-api-post

You can also create a Put, Delete similarly.

So, how do we make sure that each API is called only for the designated REST operation? You can define this in the function.json using the methods array.

For example, the vehicle-api-get is a HTTP GET operation and will be configured as below:

{
      "authLevel": "anonymous", --DONT DO THIS
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "route":"",
      "methods": [
        "get"
      ]
},

Adding CosmosDB support to our Vehicle API

The following TypeScript code allows us to access a CosmosDB store and retrieve data based on a Vehicle Id. This represents the HTTP GET operation for our Vehicle API.

import { Collection } from "documentdb-typescript";

export async function run(context: any, req: any) {
    context.log("Entering GET operation for the Vehicle API.");
    // get the vehicle id from url
    const id: number = req.params.id;

    // get cosmos db details and collection
    const url = process.env.COSMOS_DB_HOSTURL;
    const key = process.env.COSMOS_DB_KEY;
    const coll = await new Collection(process.env.COSMOS_DB_COLLECTION_NAME, process.env.COSMOS_DB_NAME, url, key).openOrCreateDatabaseAsync();

    if (id !== 0) {
        // invoke type to get id information from cosmos
        const allDocs = await coll.queryDocuments(
            {
                query: "select * from vehicle v where v.id = @id",
                parameters: [{name: "@id", value: id }]
            },
            {enableCrossPartitionQuery: true, maxItemCount: 10}).toArray();

            //  build the response
            context.res = {
                body: allDocs
                };
    } else {
                context.res = {
                    status: 400,
                    body: `$"No records found for the id: {id}"`
                };
    }

    // context.done();
}

Using Bindings with CosmosDB

While the previous section used code to perform the GET operation, we can also use Bindings for CosmosDB that will allow us to perform operations on our CosmosDB Collection whenever the HTTP Trigger is fired. Below is how the HTTP POST is configured to leverage the Binding with CosmosDB:

{
  "disabled": false,
  "scriptFile": "../vehicle-api-output-debug/vehicle-api-post/index.js",
  "bindings": [
    {
      "authLevel": "anonymous", --DONT DO THIS
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "route": "data",
      "methods": [
        "post"
      ]
    },
    {
      "type": "documentDB",
      "name": "$return",
      "databaseName": "vehiclelog",
      "collectionName": "vehicle",
      "createIfNotExists": false,
      "connection": "COSMOS_DB_CONNECTIONSTRING",
      "direction": "out"
    }
  ]
}

Then in your code, you can simply return the incoming JSON request and Azure Function takes care of pushing the values into CosmosDB.

export function run(context: any, req: any): void {
    context.log("HTTP trigger for POST operation.");
    let err;
    let json;
    if (req.body !== undefined) {
        json = JSON.stringify(req.body);
    } else {
        err = {
            status: 400,
            body: "Please pass the Vehicle data in the request body"
        };
    }
    context.done(err, json);
}

OneClick deployment to Azure using VSCode Extensions

Deployment to Azure from the VSCode extension is straightforward. The interface allows you to create a FunctionApp in Azure and then provides a step by step workflow to deploy your functions into the FunctionApp.

If all goes well, you should see output such as below.

Using Subscription "".
Using resource group "".
Using storage account "".
Creating new Function App "sample-vehicle-api-azfunc"...
>>>>>> Created new Function App "sample-vehicle-api-azfunc": https://<your-url>.azurewebsites.net <<<<<<

00:27:52 sample-vehicle-api-azfunc: Creating zip package...
00:27:59 sample-vehicle-api-azfunc: Starting deployment...
00:28:06 sample-vehicle-api-azfunc: Fetching changes.
00:28:14 sample-vehicle-api-azfunc: Running deployment command...
00:28:20 sample-vehicle-api-azfunc: Running deployment command...
00:28:26 sample-vehicle-api-azfunc: Running deployment command...
00:28:31 sample-vehicle-api-azfunc: Running deployment command...
00:28:37 sample-vehicle-api-azfunc: Running deployment command...
00:28:43 sample-vehicle-api-azfunc: Running deployment command...
00:28:49 sample-vehicle-api-azfunc: Running deployment command...
00:28:55 sample-vehicle-api-azfunc: Running deployment command...
00:29:00 sample-vehicle-api-azfunc: Running deployment command...
00:29:06 sample-vehicle-api-azfunc: Running deployment command...
00:29:12 sample-vehicle-api-azfunc: Running deployment command...
00:29:17 sample-vehicle-api-azfunc: Running deployment command...
00:29:24 sample-vehicle-api-azfunc: Syncing 1 function triggers with payload size 144 bytes successful.
>>>>>> Deployment to "sample-vehicle-api-azfunc" completed. <<<<<<

HTTP Trigger Urls:
  vehicle-api-get: https://sample-vehicle-api-azfunc.azurewebsites.net/api/vehicle-api-get

Some observations:

The extension bundles everything in the App folder including files like local.settings.json and the output .js directories, I could not find a way to filter these using the extension.
Another problem that I have faced is that currently neither the extension or the CLI provides a way to upload Application Settings as Environment Variable so they can be accessed by code once deployed to Azure, so these have to be manually added to make things work. For this sample, you will need to add the following key-value pairs in the FunctionApp -> Application Settings added through the Azure Portal so they can be available as Environment Variables!

"COSMOS_DB_HOSTURL": "https://your cosmos-url:443/",
"COSMOS_DB_KEY": "your-key",
"COSMOS_DB_NAME":"your-db-name",
"COSMOS_DB_COLLECTION_NAME":"your-collection-name"
"COSMOS_DB_CONNECTIONSTRING":"your-connection-string"

If you are only running it locally, you can use the local.settings.json, there is also a way through CLI to publish the local settings values into Azure using the --publish-local-settings flag, but hey there is a reason these are local values!
The Node version supported by Azure Functions is v6.5.0 so while you can locally play with higher versions, you will have to downgrade to 6.5.0 as of today.

In case you guys have a better way to deploy to Azure, do let me know :).

Configuring Azure Function Proxies for our API

At this point, we have a working API available in Azure. We have leveraged the CQRS approach (loosely) to have a separate Read API and a separate Write API, to the client, however, maintaining code with multiple endpoints can quickly become cumbersome. We need a way to package our API into a facade that is consistent and manageable, this is where Azure Function Proxies comes in.

Azure Function Proxies is a toolkit available as part of the Azure Function stack and provide the following features.:

Building consistent routing behavior for underlying functions in the FunctionApp and can even include external endpoints.
Provides a mechanism to aggregate underlying apis into a single API facade. In a way, it is a lightweight Gateway service to your underlying Functions.
Provide a MockUp Proxy to test your endpoint without having integration points. This is useful when testing the request routing with dummy data.
One of the key aspects added to Proxies is support for OpenAPI which allows more out of box connectors to other services.
Support for Out of Box AppInsights support where a proxy can publish events to AppInsights to generate endpoint metrics for not just functions but also for legacy API’s.

If you are familiar with the Application Request Routing (ARR) stack in IIS, this is somewhat similar. In fact, if you look at the Headers and Cookies for the request processed by the Proxy, you should see some familiar attributes 😉
......
Server →Microsoft-IIS/10.0
X-Powered-By →ASP.NET
......
Cookies: ARRAffinity

Let’s use Function Proxies for our API.

In the previous sections, I showed how we could use the routePrefix in host.json in conjunction with route in function.json. While that approach works, we have to add configuration for each function which can become a maintenance overhead. Additionally, if I want an external API to have the same route path that will not be possible using the earlier approach. Proxies can help overcome this barrier.

Using proxies, we can develop logical endpoints while keeping the configuration centralized. We will use Azure Function Proxies to surface our two functions as a consistent API Endpoint, so essentially to the client, it will look like a single API interface.

Before we continue, we will remove the route attributes we added to our functions and only keep the variable references and change the routeprefix to just "". Our published Function Endpoint(s) now should look something like this:

Http Functions:
        vehicle-api-get: https://sample-vehicle-api-azfunc.azurewebsites.net/{id}
        vehicle-api-post:https://sample-vehicle-api-azfunc.azurewebsites.net/vehicle-api-post/

This is obviously not intuitive, with multiple Functions it can become a nightmare for the client to implement our Service. We create two Proxies that will define the route path and match criteria for our Functions. You can easily create proxies from the Azure UI Portal, but you can also create your proxy.json. The below shows how to define proxies and associate with our Functions.

  {
    "$schema": "http://json.schemastore.org/proxies",
    "proxies": {
        "VehicleAPI-Get": {
            "matchCondition": {
                "route": "api/vehicle/{id}",
                "methods": [
                    "GET"
                ]
            },
            "backendUri": "https://sample-vehicle-api-azfunc.azurewebsites.net/{id}"
        },
        "VehicleAPI-POST": {
            "matchCondition": {
                "route": "/api/vehicle",
                "methods": [
                    "POST"
                ]
            },
            "backendUri": "https://sample-vehicle-api-azfunc.azurewebsites.net/vehicle-api-post"
        }
    }
}

As of today, there is no upload proxy.json functionality in Azure but you can easily copy paste into the Portal Advanced Editor.

We have two proxies defined here. The first is for our GET operation and the other for POST. In both cases, we have been able to define a consistent routing mechanism for selected REST verbs. The key attributes here are the route and backendUri which allows us to map a public route to an underlying endpoint. Note that the backendUri can be anything that needs to be called under the same API facade, so we can club multiple services through a common gateway routing using this approach.

Can you do this with other Services, I would have to say, Yes. You can implement similar routing functionality with Application Gateway, NGINX and Azure API Management. You can also use an MVC framework like Express and write a single function that can do all this routing. So, evaluate the options and choose that works best for your scenario.

Testing our Vehicle API

We now have our Vehicle API endpoints exposed through Azure Function Proxies. We can test it using any HTTP Client. I use Postman for the requests, but you can use any of your favorite clients.

GET Operation

The exposed endpoint from the Proxy is:

https://sample-vehicle-api-azfunc.azurewebsites.net/api/vehicle/{id}

Our GET request fetches the correct results from CosmosDB

POST Operation

The exposed endpoint from the Proxy is:

https://sample-vehicle-api-azfunc.azurewebsites.net/api/vehicle/

Our POST request pushes a new record into CosmosDB:

There we have it. Our Vehicle API that leverages Azure Function Proxies and TypeScript is now up and running!

Do have a look at the source code here (https://github.com/niksacdev/sample-api-typescript) and please provide your feedback.

Happy Coding :).

The Zero Config MicroService using Kubernetes and Azure KeyVault

Mom said don’t talk to strangers, in the new world Mom said never share your secret key!

TL;DR: In this blog post, we demonstrate the value of Centralised configuration and secret stores by leveraging Kubernetes and Azure KeyVault for an ASP.NET Core microservice. It allows the team to create a toolchain that developers and ops engineers can use to entirely avoid creation and management of configuration files (appSettings.json, blah.xml) and focus more on the actual application development.

If you are more interested in the source code and how to setup all pieces, you can jump directly to the GitHub repo here(https://github.com/niksacdev/samples.microservice “The Zero Config Microservice”).

The sample provides the following capabilities:

A sample microservice project to demonstrate the use of Azure KeyVault and Kubernetes ConfigMaps for Configuration, it stresses on the importance of separating DevOps functions from Developer functions by having no appSettings, secret.json, blah.json Files in code. All data is injected through Kubernetes or asked specifically from Azure Key Vault.

Other features in the sample:

use of Serilogfor structured logging,

use of the repository for Azure CosmosDB, a generic repository that can be used and extended for CosmosDB Document operations.

deployment of asp.net core microservice container to Kubernetes

The nightmare begins…

It’s 2:00 AM, Adam is done making all changes to his super awesome code piece, the tests are all running fine, he hit commit -> push -> all commits pushed successfully to git. Happily, he drives back home. Ten mins later he gets a call from the SecurityOps engineer, – Adam did you push the Secret Key to our public repo?

YIKES!! that damn blah.config file Adam thinks, how could I have forgotten to include that in .gitignore, the nightmare has already begun ….

We can surely try to blame Adam here for committing the sin of checking in sensitive secrets and not following the recommended practices of managing configuration files, but the bigger question is that if the underlying toolchain had abstracted out the configuration management from the developer, this fiasco would have never happened!

The virus was injected a long time ago…

Since the early days of .NET, there has been the notion of app.config and web.config files which provide a playground for developers to make their code flexible by moving common configuration into these files. When used effectively, these files are proven to be worthy of dynamic configuration changes. However a lot of time we see the misuse of what goes into these files. A common culprit is how samples and documentation have been written, most samples out in the web would usually leverage these config files for storing key elements such as ConnectionStrings, and even password. The values might be obfuscated but what we are telling developers is that “hey, this is a great place to push your secrets!”. So, in a world where we are preaching using configuration files, we can’t blame the developer for not managing the governance of it. Don’t get me wrong; I am not challenging the use of Configuration here, it is an absolute need of any good implementation, I am instead debating the use of multiple json, XML, yaml files in maintaining configuration settings. Configs are great for ensuring the flexibility of the application, config files, however, in my opinion, are a pain to manage especially across environments and soon you end up in Config Hell.

A ray of hope: The DevOps movement

In recent years, we have seen a shift around following some great practices around effective DevOps and some great tools (Chef, Puppet) for managing Configuration for different languages. While these have helped to inject values during CI/CD pipeline and greatly simplified the management of configuration, the blah.config concept has not completely moved away. Frameworks like ASP.NET Core support the notion of appSettings.json across environments, the framework has made it very effective to use these across environments through interfaces like IHostingEnvironment and IConfiguration but we can do better.

Clean code: Separation of Concerns

One of the key reasons we would want to move the configuration away from source control is to delineate responsibilities. Let’s define some roles to elaborate this, none of these are new concepts but rather a high-level summary:

Configuration Custodian: Responsible for generating and maintaining the life cycle of configuration values, these include CRUD on keys, ensuring the security of secrets, regeneration of keys and tokens, defining configuration settings such as Log levels for each environment. This role can be owned by operation engineers and security engineering while injecting configuration files through proper DevOps processes and CI/CD implementation. Note that they do not define the actual configuration but are custodians of their management.
Configuration Consumer: Responsible for defining the schema (loose term) for the configuration that needs to be in place and then consuming the configuration values in the application or library code. This is the Dev. And Test teams, they should not be concerned about what the value of keys are rather what the capability of the key is, for example, a developer may need different ConnectionString in the application but does not need to know the actual value across different environments.
Configuration Store: The underlying store that is leveraged to store the configuration, while this can be a simple file, but in a distributed application, this needs to be a reliable store that can work across environments. The store is responsible for persisting values that modify the behavior of the application per environment but are not sensitive and does not require any encryption or HSM modules.
Secret Store: While you can store configuration and secrets together, it violates our separation of concern principle, so the recommendation is to leverage a separate store for persisting secrets. This allows a secure channel for sensitive configuration data such as ConnectionStrings, enables the operations team to have Credentials, Certificate, Token in one repository and minimizes the security risk in case the Configuration Store gets compromised.

The below diagram shows how these roles play together in a DevOps Inner loop and Outer loop. The inner loop is focussed on the developer teams iterating over their solution development; they consume the configuration published by the outer loop. The Ops Engineer govern the Configuration management and push changes into Azure KeyVault and Kubernetes that are further isolated per environment.

Kubernetes and Azure KeyVault to the rescue

So we have talked a lot about governance elements and the need to move out of configuration files, lets now use some of the magic available in Kubernetes and Azure Key Vault to implement this. In particular, we would be using the following capabilities of these technologies:

Why not just use Kubernetes you may ask?

That’s a valid question since Kubernetes supports both a ConfigMap store and Secret store. Remember our principle around the separation of concerns and how we can ensure enhanced security by separating out configuration from secrets. Having secrets in the same cluster as the configuration store can make them prone to higher risks. An additional benefit is Maintainability. Azure KeyVault gives the ability to provide a distributed “Secure Store as a Service” option that provides not just secret management but also Certificates and Key management as part of the service. The SecOps engineers can lock down access to the store without need of cluster admins permissions which allows for clear delineation of responsibilities and access.

PS: There is a discussion going on in Kubernetes groups to provide under the hood support for saving Kubernetes secrets directly in Azure KeyVault, “when” and “if” that happens we will have the best of both worlds implemented using the abstractions of Kubernetes, fingers crossed!

Let’s get to our scenario now.

The Scenario

We will be building a Vehicle microservice which provides CRUD operations for sending vehicle data to a CosmosDB document store. The sample micro-service needs to interact with the Configuration stores to get values such as connectionstring, database name, collection name, etc. We interact with Azure KeyVault for this purpose. Additionally, the application needs the Authentication token for Azure Key Vault itself, these details along with other Configuration will be stored in Kubernetes.

If you are more interested in the source code and how to setup all pieces, you can jump directly to the GitHub repo here.

Mapping the above diagram to our roles and responsibilities earlier:

The Ops engineer/scripts are the Configuration Custodian and they are the only ones who work in the outer loop to manage all the configuration. They would have CI/CD scripts that would inject these configurations or use popular framework tools to enable the insertion during the build process.
The Vehicle API is the ASP.NET Core 2.0 application and is the Configuration Consumer here; the consumer is interested in getting the values without really worrying about what the value is and which environment it belongs to. The ASP.NET Core framework provides excellent support for this through its Configuration extensibility support. You can add as many providers as you like and they can be bound an IConfiguration object which provides access to all the configuration. In the below code snippet, we provide the configuration to be picked up from environment variables instead of a configuration file. The ASP.NET Core 2.0 framework also supports extensions to include Azure KeyVault as a configuration provider, and under the hood, the Azure KeyVault client allows for secure access to the values required by the application.

 // add the environment variables to config
  config.AddEnvironmentVariables();

 // add azure key vault configuration using environment variables
 var buildConfig = config.Build();

 // get the key vault  uri
 var vaultUri = buildConfig["kvuri"].Replace("{vault-name}",   buildConfig["vault"]);
 // setup KeyVault store for getting configuration values
 config.AddAzureKeyVault(vaultUri, buildConfig["clientId"], buildConfig["clientSecret"]);

AzureKeyVault is the Secret Store for all the secrets that are application specific. It allows for the creation of these secrets and also managing the lifecycle of them. It is recommended that you have a separate Azure KeyVault per environment to ensure isolation. The following command can be used to add a new configuration into KeyVault:

#Get a list of existing secrets
az keyvault secret list --vault-name  -o table  

#add a new secret to keyvault
az keyvault secret set -n MySecret --value MyValue --description "my custom value" --vault-name

Kubernetes is the Configuration Store and serves multiple purposes here:

Creation of the ConfigMaps and Secret: Since we are injecting the Azure KeyVault authentication information using Kubernetes, the Ops engineer will provide these values using two constructs provided in the Kubernetes infrastructure. ConfigMaps and Secrets. The following shows how to add config maps and secrets from the Kubernetes command line:

kubectl create configmap vault --from-literal=vault=   
kubectl create configmap kvuri --from-literal=kvuri=https://{vault-name}.vault.azure.net/
kubectl create configmap clientid --from-literal=clientId= 
kubectl create secret generic clientsecret --from-literal=clientSecret=

The clientsecret is the only piece of secure information we store in Kubernetes, all the application specific secrets are stored in Azure KeyVault. This is comparatively safer since the above scripts do not need to go in the same git repo. (so we don’t check them in by mistake) and can be managed separately. We still control the expiry of this secret using Azure KeyVault, so the Security engineer still has full control over access and permissions.

Injecting Values into the Container: During runtime, Kubernetes will automatically push the above values as environment variables for the deployed containers, so the system does not need to worry about loading them from a configuration file. The Kubernetes configuration for the deployment looks like below. As you would notice, we only provide a reference to the ConfigMaps and Secret that have been created instead of punching in the actual values.

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: vehicle-api-deploy #name for the deployment
  labels:
    app: vehicle-api #label that will be used to map the service, this tag is very important
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vehicle-api #label that will be used to map the service, this tag is very important
  template:
    metadata:
      labels:
        app: vehicle-api #label that will be used to map the service, this tag is very important
    spec:
      containers:
      - name: vehicleapi #name for the container configuration
        image: <yourdockerhub>/<youdockerimage>:<youdockertagversion> # **CHANGE THIS: the tag for the container to be deployed
        imagePullPolicy: Always #getting latest image on each deployment
        ports:
        - containerPort: 80 #map to port 80 in the docker container
        env: #set environment variables for the docker container using configMaps and Secret Keys
        - name: clientId
          valueFrom:
            configMapKeyRef:
              name: clientid
              key: clientId
        - name: kvuri
          valueFrom:
            configMapKeyRef:
              name: kvuri
              key: kvuri
        - name: vault
          valueFrom:
            configMapKeyRef:
              name: vault
              key: vault
        - name: clientsecret
          valueFrom:
            secretKeyRef:
              name: clientsecret
              key: clientSecret
      imagePullSecrets: #secret to get details of private repo, disable this if using public docker repo
      - name: regsecret

There we go! We now have a micro-service that is driven by the powers of DevOps and eliminates the need for any configuration files. Do check out the source code in the Github repo and share your feedback.

happy coding 🙂 …

Architecture Blueprint Canvas for IoT

Architecting IoT solutions can become complex very quickly considering the various components involved and multiple interactions between device-services-external sources. In this post, I would like to share a set of Visio templates with the goals to assist a collaborative and brainstorming discussion during IoT design workshops and architectural discussions with customers and teams.

The templates are based on the building blocks of Azure IoT reference architecture but extend it to other attributes such as integration, extensibility etc and can be used for any IoT platform.

The templates are available here: https://github.com/niksacdev/abcforiot

Would love to hear your feedback on relevance on any improvements we can make to these.☺

The template provides the following features:

A systematic approach for eliciting business and technical requirements, identifying priorities and recording decisions and next steps for the following components in an IoT Solution: Devices, Messages, Field Gateway, Protocol Gateway, Device Registry, Device Management, Hot Path Analytics, Warm Path Analytics, Cold Path Analytics, Reporting, Client Applications, Integration and Extensibility.
It allows evaluating nonfunctional requirements including Security, Scalability, Supportability, and Governance.
IoT project teams can expedite their planning and requirement elicitation phase, identify dependencies and design constraints, and solidify design and architecture elements of their IoT Solution.

There are two flavors of the templates available today:

Basic: This is primarily the bare bone template for architecting a solution without documenting any decision points or brainstorming discussion. It is targeted towards an individual architect or a team that has already formulated an architecture and want to document their design across the different IoT components.
Collaborative: This is an advanced version of the basic template. It allows for team collaboration and brainstorming by providing sections for taking a decision and defining action items. A team can leverage the template to brainstorm ideas for problems for each component and then document them as a design decision in the template itself. For more information on how to use the template, refer here

A future goal is to develop some pre-configured scenarios (e.g. Connected Car, Manufacturing etc.) using these templates that can then be used to jump-start IoT Architecture discussions.

The Litmus Test: is your company ready for MicroServices?

A large manufacturing customer recently asked me the question: is microservices the right architecture for our project?

It made me think: it could be but are you the right organization for a microservices architecture?

I agree with a lot of pundits out there that microservices have been a significant disruption in building, deploying and managing complex distributed solutions especially after we identified that the glorified large codebases we built only a few years back were actually evil monoliths now bleeding time and money!

Most discussions around microservices generally lean towards technologies like Design Patterns, Continuous Deployments, Containers, and Orchestrators, etc. However, in my opinion, the move to a microservices-based solution has less to do with the technology adoption and more with the processes within an organization, to put it in other words:

We are treating MicroServices as a solution to a technology problem, rather, it is a technology-aided solution to an organization problem.

In this post, I summarize few guidelines for an organization to consider before thinking about moving to a microservices architecture. I also introduce an approach I call as the Rapid Classification Model for microservices adoption that provides a systematic maturity model for organizations to evaluate and progress in this space. The text is not groundbreaking, nor is it comprehensive, in-fact technology leaders like Martin Fowler , communicate these in almost of all their talks. However, I still see customers being oblivious to this simple truth and jump on the microservices bandwagon only to realize the hard way that they are not ready.

Symptoms of an anti-progressive MicroService organization:

Sorry for the harsh term anti-progressive but I think it’s important for companies to realize the factors which are not letting them move towards a microservices architecture. Let’s start with discussing the symptoms of organizations that may not be ready to make progress to a microservices world as yet. These symptoms highlight the behavior and culture of an organization and how it can impact their decision or move towards a microservices paradigm. While these have nothing to do with the technical aspects of a solution, these factors significantly contribute to the delays that may occur when developing a microservices type solution within these organizations:

Company profile: The first symptom can be fairly obvious, it can be evaluated from the company profile and business domain whether a company would move to adopt a newer paradigm quickly. For example, a 50-year global organization with an average employee force of 10+ years retention will have much bigger challenges to adopting a new technology option as compared to a local mid-size company. Additionally, domains like manufacturing, health care incorporate various legal and compliance norms which can make decision and approval processes slow.
Who’s the boss: In my experience, a dynamic C-Exec driving the change and adoption strategy is always more influential that a hierarchical board or middle-level managers. Wait, what has this to do with MicroServices? Let’s take an example: The CEO of a company initiated using a chat platform for company-wide communication and encouraged its adoption to make it a success. A Developer from the IT team who was doing his personal evaluation posted an idea about a cool feature in Docker that can simplify multi-stage deployment and how it can save the project team time. The Developer Head saw the post and started a small team to see value in the idea, and if can be re-used, that led the incubation of a team to validate the concept, and they realized some instant saving for newer projects. The structure and organization of a company significantly influence the choices and hence the ability to transform into a microservices favorable organization.
Product life-cycle: Any existing product company is always seeking improvements and efficiencies however the type of industry they work in can influence the culture the teams drive regarding releases and deployments. In a Car Manufacturing plant, a typical release cycle was 1-2 year(s) including a 6-month assembly line testing. It works well for equipment, hardware perspective but, because of the culture, the IT team also followed the same model, so the software release cycles were also aligned with model releases. Now, one of the tenants for a microservices architecture is to release stuff fast and engage a continuous learning cycle. An environment, where the versions take 1-2 years will take more time to adapt a more frequent release cycle with Continuous Integration schedule.
Risk takers vs. Risk Averse: does the word Cloud scare the sh’t out of the company CIO? Does your company wait till the next-next version of an OS is released before the deploy a new update? Are you still working on 2003 version of Microsoft Word? These are all signs that the organization is averse towards taking up new technologies and may not be ready for a microservices paradigm that allows developers to embrace new technologies and (if implemented correctly) allow complete technology stack changes without having any downtime to existing services.
Divided we fall: Teams are such an important aspect of a MicroServices based culture that they need to be onboard to have a “solution” strategy and not a single team or owner strategy towards MicroServices(s).
The “Wannabe Syndrome”: In a recent interaction, a hyperactive CIO called us to discuss their next big project that can leverage microservices since he heard that companies like Google, Microsoft, Amazon, and Netflix are leveraging it. Just because it is in the news, does not mean you can implement it in your environment. Depending on the above-discussed factors, it may take a lot of process change before we can commit to a real microservices architecture. Proper process planning and a defined roadmap need to be in place to have successful microservices architecture, this may mean changing processes, shuffling teams, hiring and firing people and purchasing new systems and services. Note that the effort may start as a single team effort to prove value. However, there should be appropriate buy-in on implementing the strategy for the “solution.” Having a half-monolith without an upgrade plan is perhaps, even more, worse and time-consuming in the long run than the monolith itself.

Note that none of the above references stereotype or categorize a particular domain or industry as not being aligned with leveraging MicroServices, these are merely personal observations and there will always be exceptions to these.

MicroServices Adoption Guidelines

Ok, so we have looked at what could prevent an organization from efficiently moving towards the great microservices world. But is there a cure? Well there are certainly few adoption considerations that can assist companies wanting to make progress towards a microservices architecture, let’s talk about some of them:

MindSet

A stagnant mindset customer can most likely never be a successful microservices adopter. Microservices is an evolving design, you cannot constraint it on Day 1. It also refers to the choice of technology, estimation and in some cases selection of team members. These may be upgraded, replaced, deprecated several times during the product life-cycle. For companies which and not known to the Agile and DevOps world, this can be very painstaking as they do not expect things to change so frequently. We need to be fluid enough to make changes to what works right for the project, of course, this does not mean making significant upgrades without putting any thought or proofing out. The point to make here is that if we have an open mindset we can accept feedback so as to improve the system.

Rapid Classification Model for microservices:

In the ever changing world of microservices, I suggest we follow a process I call as the Rapid Classification Model for MicroServices. The context of rapids came to my from my earlier experiences of white water rafting in Rishikesh, India.

In the Rapid classes model, each rapid is given a different class based on its wave strength, riffles, channel and passage clarity. The trainers would first take us to the easy rapids so we can learn to maneuver and understand and adapt to the surrounding. Only when we were confident did we move towards the narrower passages which then became easier to travel. I think the same concept can be applied towards microservices adoption. Here is the microservices rapid classification scheme:

As we can see, in Class I, we take lesser risks while ensuring our foundations are getting stronger. Class II allows us to add multiple teams and more services ensuring we can tackle many complex workloads. Class III-V then are ensuring governance structures, processes and a culture of continuous learning are adopted by the teams. In this way, an organization can systematically move towards a microservices environment with confidence.

Removal of the “Accountant” System

Manual reporting procedures and relying on a project manager to generate an excel report does not work, teams need to take accountability and ownership of delivery. Practices like DevOps needs to be infused in the team rather than Dev Leads and Project Managers holding responsibility.

Automation = WorkLife Balance

Automated tools should not just be used for Development and Testing but also for project management activities like cost and schedule projections and slippages. Dev Leads and Architects need to spend time writing and reviewing code than building PowerPoints and Excel sheets. The more automated processes in the system, the more visibility you can obtain and dare to go home on the “release day”!

Focus on Re-usability

The coder mindset needs to be to write code that can be re-used. The primitive of this is in fact in every programming language out there – a function or a method. It is supposed to be a piece of code that can be re-used by other calling clients or functions – imbibe the principle in all services, and we see an auto-cultivation of frameworks. Importantly don’t hold the team performance accountable on re-use, that will just generate a junk of IP that nobody is going to use.

Velocity spikes

There will always be a learning curve for new and old teams as they pick and choose technology choices, the management needs to factor in those as risks and have risk reserves. Expecting a high velocity from a team starting on using MicroServices.

The Gold Build syndrome

The production environment may only stay for a while before a new change is pushed, so no point maintaining and labeling Gold builds or releases and they will change rapidly. Treat each build as production build and run automated checks to ensure they are working, have canary environments and rollback strategies if things don’t work the way they should.

Flying high on microservices:

What, now you tell me not to use microservices!!! … While microservices have a lot of benefits, they do not come without a cost. New ideas always great but their cost impact analysis should be immediate to understand project impact. This may seem like the old school method of validation but the idea it to vet out something that will work in production at least for a short period and will not require significant re-work or re-engineering when a new version releases. E.g. NodeJS keeps changing, is it really worth keeping up with those changes? – in some cases yes, in other cases maybe not.

It is not a silver-bullet

That’s the point of the microservices architecture in the first place, if there were a silver bullet, it would be a monolith :).

Thank You!

Cross Platform IoT: Developing a .NET Core based simulator for Azure IoT Hub using VS for Mac!

This post is part of a Cross-Platform IoT series, to see other posts in the series refer here.

The source for the solution is available on GitHub here.

March 7th was a significant milestone for the Microsoft Visual Studio team. With Visual Studio completing 20 years and the launch of Visual Studio 2017, the team demonstrated that Visual Studio continues to lead the path for .NET development on Windows. While the Visual Studio 2017 is not cross-platform and does not work on other OS like MacOS (yet ;)), Microsoft, for some time, has also launched a preview version for another cousin of Visual Studio – The Visual Studio for Mac!!

VS for Mac at first impression seems like a cosmetic redo for the Xamarin Studio (post acquisition of Xamarin by Microsoft) but as the new updates keep coming it seems to be bringing all the goodies of Visual Studio to the native MacOS platform. In this post, I will show how to use Visual Studio for Mac to build a .NET Core solution that will act as a simple device emulator and will send and receive messages commands to Azure IoT Hub.

Wait, don’t we already have VS Code? Yeah, that’s correct, VS Code is there and provides some awesome cross-platform development support. I am a big fan of its simplicity, speed and extension eco-system. #LoveVSCode. However, VS for Mac includes some additional features such as Project templates, Build and Run from IDE support, Native Xamarin App development experience, and now Visual Studio Test Framework support which makes it a step closer to the Visual Studio IDE available in Windows. Which is better? Well, time will tell, for the purpose of this post, however, we will use VS for Mac running on MacOS Sierra (10.12.3).

Getting VS for Mac

VS for Mac is in preview right now, and you should use it for development scenarios and with caution. At the time of writing, VS team had released Preview 5 builds, and that is what we used for this post.

You can download the .dmg from the Visual Studio website here to install the application or use homebrew for the installation:

brew cask install visual-studio

The setup process is fairly straightforward. Once installed, launch the Visual Studio app, and you should be presented with a welcome screen similar to below:

VS for Mac (current build) does not install .NET Core SDK on your machine by default and you will need to manually install it. To install .NET Core SDK on MacOS we will again use homebrew. The VS team provides step by step instructions on how to do this here.

Note that the installation asks for installing openssl which is a dependency for .NET Core, it mainly uses the libcrypto and libssl libraries. In some cases, you may see a warning like this. Warning: openssl is a keg-only and another version is linked to opt. To continue installation use the following command:

brew install --force openssl

Also, ignore the warning Warning: openssl-1.0.2k already installed, it's just not linked. or use brew unlink openssl before executing the above command

Writing our simulator app for IoT Hub

The source for this project is available at the GitHub repo here.

Now that we have VS for Mac and .NET Core SDK installed and setup, we are going to build a .NET Core Simulator app which will be performing the following operations:

Send a message (ingress) using a .NET Core simulator to Azure IoT Hub using AMQP as the protocol.
Receive messages from Azure IoT Hub (egress) using a .NET Core consumer.
Send a command to a device and receive an acknowledgment from the device (coming soon).

The process to create the project is very similar to the experience of Visual Studio on Windows.

Click New Project from the welcome screen to open the New Project dialog. We will select a .NET Core App project of type Console Application. At the time of writing C# and F# are supported applications for .NET Core projects in VS for Mac.
In the next screen, we provide a project and solution name and configure our project for git.

VS for Mac will now setup the project and solution. This first thing it does is to restore any required packages. Since VS for Mac has support for NuGet, it just uses Nuget package restore internally to get all the required dependencies installed. Once all dependencies are restored, you should see a screen similar to below. We have our .NET Core project created in VS for Mac!

Before we start working on the simulator code, let’s ensure we have Source Control enabled for our project. In VS for Mac, the Version Control menu allows you to configure your SVN of Git repo. Note that this functionality is derived from Xamarin Studio, you can use the guidance here to set up version control for your project. We will be using a git repo and GitHub for our project.

Now that we have our project and version control sorted out, let’s start working on the code for our IoT Simulator. As noted earlier, the simulator will perform the following actions:

Send a message to IoT Hub using Device credentials
- Act as a Consumer of the message from IoT Hub
- Receive Commands and send response acknowledgment

The solution has multiple projects and is described in the GitHub repo here. The repo. also discusses how to use the sample. Instead of talking through all the projects, I will talk about the logic to send a message to a device when using .NET Core assemblies, this should give an idea of how to use VS for Mac and .NET Core with IoT Hub.

Sending messages over AMQP

The easiest option to work with IoT Hub is the Device and Service SDKs. Unfortunately, at the time of writing this post, we do not have a .NET Core version of these SDKs. If you try to add the Nuget packages you should get incompatibility errors like below:

Package Microsoft.Azure.Devices.Client 1.2.5 is **not compatible** with netcoreapp1.1 (.NETCoreApp,Version=v1.1). Package Microsoft.Azure.Devices.Client 1.2.5 supports:
net45 (.NETFramework,Version=v4.5)
portable-monoandroid10+monotouch10+net45+uap10+win8+wp8+wpa81+xamarinios10 (.NETPortable,Version=v0.0,Profile=net45+wp8+wpa81+win8+MonoAndroid10+MonoTouch10+Xamarin.iOS10+UAP10)
  uap10.0 (UAP,Version=v10.0)

So what are our options here:

If you want to use the HTTPS protocol, you can build an HTTP Client and run it through the REST API model.
If you want to use AMQP as the protocol, you can use the AMQP Lite library available on Nuget. AMQP Lite has support for .NET Core today and provides underlying functionality to send and receive packets to IoT Hub using AMQP as the protocol. We will be using this for your sample.
If you want to use MQTT, there are few Nuget packages like M2Mqtt that you can try to use. I have not tried them yet.

When the IoT Hub releases the .NET Core versions, they should be the recommended way to process messages with IoT Hub

Adding the Nuget package

Adding NuGet packages in VS for Mac is straightforward. Simply right click on your project -> Add -> Add Nuget package. It opens the NuGet dialog box that allows searching for all available packages. In our case, we search for the AMQP Lite package and add it to our project.

Note that currently all packages are shown including full .NET Framework packages. VS for Mac, however, validates if the package can be added to a .NET Core project, in case the package has binaries or dependencies that rely on full .NET Framework, you should see an error in the package console

**Checking compatibility** for Microsoft.Azure.Devices.Client 1.2.5 with .NETCoreApp,Version=v1.1.
Package Microsoft.Azure.Devices.Client 1.2.5 is not compatible with netcoreapp1.1 (.NETCoreApp,Version=v1.1). Package Microsoft.Azure.Devices.Client 1.2.5 supports:
net45 (.NETFramework,Version=v4.5)
portable-monoandroid10+monotouch10+net45+uap10+win8+wp8+wpa81+xamarinios10 (.NETPortable,Version=v0.0,Profile=net45+wp8+wpa81+win8+MonoAndroid10+MonoTouch10+Xamarin.iOS10+UAP10)
uap10.0 (UAP,Version=v10.0)

Setting up the environment variables

Once we have the AMQP Lite assemblies, we now need our IoT Hub details. You can use the Azure CLI tools for IoT mentioned in my previous post to fetch this information. In the sample, I leverage a JSON configuration handler to dump the configuration in a JSON file and read from it during runtime.

{
   "settings":{
      "connectionStrings":[
         {
            "name":"youriothubname",
            "connectionString":"youriothub.azure-devices.net",
            "sasKey":"yourdevicekey",
            "sasKeyName":"device"
         }
      ],
      "deviceId":"D1234"
   }
}

Sending the message to IoT Hub

The final part of the puzzle is to write code that will open a connection and send a message to IoT Hub. We leverage the AMQP Lite library to perform these actions. In the sample, I follow a Strategy pattern to execute methods on the underlying library, it gives us the flexibility to change underlying implementations to a different library (for example when the IoT Hub SDK become .NET Core compliant) without making a lot of code changes.

// Create a connection using device context
				Connection connection  = await Connection.Factory.CreateAsync(new Address(iothubHostName, deviceContext.Port));
				Session session  = new Session(connection);

				string audience = Fx.Format("{0}/devices/{1}", iothubHostName, deviceId);
				string resourceUri = Fx.Format("{0}/devices/{1}", iothubHostName, deviceId); 
				// Generate the SAS token
				string sasToken = TokenGenerator.GetSharedAccessSignature(null, deviceContext.DeviceKey, resourceUri, new TimeSpan(1, 0, 0));
				bool cbs = TokenGenerator.PutCbsToken(connection, iothubHostName, sasToken, audience);
				if (cbs)
				{
					// create a session and send a telemetry message
					session = new Session(connection);
					byte[] messageAsBytes = default(byte[]);
					if (typeof(T) == typeof(byte[]))
					{
						messageAsBytes = message as byte[];
					}
					else
					{
						// convert object to byte[]
 					}

					// Get byte[] from 
					await SendEventAsync(deviceId, messageAsBytes, session);
					await session.CloseAsync();
					await connection.CloseAsync();
				}

The above code first opens a connection using the AMQPLite Connection type. It then establishes an AMQP session using the Session type and generates the SAS token based on the Device credentials. The sample also uses a protobuf serializer to encode the payload as a byte [] and finally send it to IoT Hub using SendEventAsync.

If you run the samples.iot.simulator.sender .NET Core console app in the sample, it calls the ExecuteOperationAsync to send the payload and the required environment variables to the AMQP Lite library. The results of a successful message sent would be displayed in a console window.

Consuming messages from IoT Hub

The sample also demonstrates the other scenarios such as consuming messages however you can also use an out of box Azure Service like Azure Stream Analytics to process these messages. Here is an example of using Stream Analytics for event processing.

Phew! This was a long post, it demonstrates some of the capabilities as well as challenges of developing .NET Core solutions with VS for Mac. I think VS for Mac is a great addition to the IDE toolkit especially for developers who are used to the Visual Studio Windows environment. There are few rough edges that need here and there but remember this is just a preview! … Happy Coding 🙂

The source for the solution is available on GitHub here.

This post is part of a Cross-Platform IoT series, to see other posts in the series refer here.

Cross Platform IoT: Troubleshooting IoT Hub using IoT Hub Diagnostics

This post is part of a Cross-Platform IoT series, to see other posts in the series refer here.

In the previous posts we looked at the Azure IoT CLI and IoT Hub Explore which assisted us in the management of Hub, Devices as well as sending a receiving messages. In this post, we discuss a troubleshooting utility that further assists in IoT Hub development and support.

Why may we need a diagnostics utility?

There are times when things go wrong and you are trying to figure out what could be the issue. The trickiest of all are generally network connectivity issues, whether its a firewall blocking a port or a bandwidth issues resulting in intermittent connections. While IoT Hub supports Operation Monitoring through the Azure Management Portal as well as the IoT Hub Monitoring endpoint, digging into the logs and crash dumps can take multiple support tickets and valuable time. The IoT Hub Diagnostic utility was a recent addition to the IoT Hub toolkit and is a useful “ping” type utility that provides some early feedback on probable issues in case IoT Hub is not behaving as expected.

The tool is also written in node.js and is available as a Open Source (MIT) solution. To install the tool on your machine, run the following command in a terminal:

npm install -g iothub-diagnostics

Initiating the diagnostic tool is fairly simple, all you need is your IoT Hub connection string. There are no other options or commands :).

From my tests, it seems the tool does require the iothubowner permission, I did not try with all policies so it may be possible to restrict access or use a lesser rights policy.

iothub-diagnostics "HostName=youriothub.azure-devices.net;SharedAccessKeyName=iothubowner;SharedAccessKey=="

Once the connection is established, the tool kicks in to perform a series of tests:

The firsts are network tests to ensure DNS resolution, the utility is just pinging a URL here to ensure connectivity to internet is possible. The default pingURL is at the moment hardcoded to www.microsoft.com. The code does have the option to provide your own DNS or Addresses so you can customize this as desired.
It then checks for ports availability and TLS Cipher requirements, again these tests include a ping on HTTPS (443), the default httpsRequestUrl is hardcoded to https://www.microsoft.com/
The final checks are performed on the IoT Hub itself where first a temporary device is registered and a series of tests are run to ensure D2C (device to cloud) and C2D (cloud to device connectivity using all supported protocols (HTTPS (443), AMQP (5672 and 5671) and MQTT (1833 and 8883) ).

Once the tests are completed, you should see a similar result:

2017-03-16T07:38:51.193Z - info: *******************************************
2017-03-16T07:38:51.196Z - info: * Executing the Microsoft IOT Trace tool. *
2017-03-16T07:38:51.196Z - info: *******************************************
2017-03-16T07:38:51.197Z - info:  
2017-03-16T07:38:51.198Z - info: --- Executing network tests ---
2017-03-16T07:38:51.222Z - info:  
2017-03-16T07:38:51.223Z - info: Starting DNS resolution for host 'www.microsoft.com'...
2017-03-16T07:38:51.254Z - info: --> Successfully resolved DNS to 23.204.149.152.
2017-03-16T07:38:51.255Z - info:  
2017-03-16T07:38:51.255Z - info: Pinging IPV4 address '23.204.149.152'...
2017-03-16T07:38:51.291Z - info: --> Successfully pinged 23.204.149.152
2017-03-16T07:38:51.291Z - info:  
2017-03-16T07:38:51.291Z - info: Sending https request to 'https://www.microsoft.com/'
2017-03-16T07:38:51.444Z - info: --> Completed https request
2017-03-16T07:38:51.445Z - info:  
2017-03-16T07:38:51.445Z - info: --- Executing IOT Hub tests ---
2017-03-16T07:38:54.731Z - info:  
2017-03-16T07:38:54.732Z - info: Starting AMQP Test...
2017-03-16T07:38:59.141Z - info: --> Successfully ran AMQP test.
2017-03-16T07:38:59.142Z - info:  
2017-03-16T07:38:59.142Z - info: Starting AMQP-WS Test...
2017-03-16T07:39:03.460Z - info: --> Successfully ran AMQP-WS test.
2017-03-16T07:39:03.466Z - info:  
2017-03-16T07:39:03.466Z - info: Starting HTTPS Test...
2017-03-16T07:39:08.036Z - info: --> Successfully ran HTTPS test.
2017-03-16T07:39:08.059Z - info:  
2017-03-16T07:39:08.060Z - info: Starting Mqtt Test...
2017-03-16T07:39:11.828Z - info: --> Successfully ran Mqtt test.

In case of any test failure, the appropriate exception are logged and displayed.

Overall, the diagnostic tool is a nifty little utility for quickly checking your environment for common problems around Connectivity and Device operations. Note that the tool does not provides traces or network monitoring support for the IoT Hub servers, it is only for client connectivity and testing your IoT hub basic operations.

There is definitely scope for adding more features and logging support so we can see more details about the types of tests are being run. I would also want to see this integrated into a CI/CD pipeline after extending the basic operation tests with our own custom test suites.

This post is part of a Cross-Platform IoT series, to see other posts in the series refer here.

Cross Platform IoT: Devices operations using IoT Hub Explorer

This post is part of a Cross-Platform IoT series, to see other posts in the series refer here.

IoT Hub Explorer is a node.js based cross platform tool that allows management of device for an existing IoT Hub.

This is one of my favorite tools for Azure IoT Hub development and troubleshooting. It is a command line tool that can run on Windows, Mac or Linux and provides an easy way to execute operations on Azure IoT Hub such as device creation, send a message to an existing device, etc. It comes in handy when you are doing Dev. Testing or even demonstrating the capabilities of the Azure IoT service. You can theoretically use this for troubleshooting a production environment as well, but I would recommend having appropriate telemetry and an operational management pipeline for those scenarios. One of the very useful features I like about this tool is the ability to monitor messages on devices for the IoT Hub, basically getting event data and operational statistics about devices in the hub. We will use this tool to create a device and then watch IoT Hub for messages. IoT Hub Explorer is available on GitHub here.

Note that the Azure IoT CLI (covered in the last post) also has support for managing devices and may soon overlap with functionalities of IoT Hub Explorer. When that happens, Azure CLI should become the preferred tool for all IoT Hub operations.

Let’s use IoT Hub Explorer to create and monitor a device. Before we do that, we need to install it. Being a node package, you can install it through npm.

npm install -g iothub-explorer

Since IoT Hub Explorer is a separate utility, we will need to first login using our IoT Hub connection string. Open a bash terminal and enter the following:

iothub-explorer login "HostName=yourhub.azure-devices.net;SharedAccessKeyName=iothubowner;SharedAccessKey=yourkey"

If you do not have the connection string handy, you can use the az iot hub show-connection-string -g youresourcegroup command described in the previous section to fetch your IoT Hub connection string. The login command should open a temporary session with the IoT Hub token policy provided. The default TTL for this session is 1 hour.

Session started, expires on Wed Mar 15 2017 19:59:05 GMT-0500 (CDT)
Session file: /Users/niksac/Library/Application Support/iothub-explorer/config

Note that the above command uses the connection string for iothubowner policy which gives complete control over your IoT Hub.

Creating a new device

To create a new device using IoT Hub explorer, type the following command:

iothub-explorer create -a

The -a flag is used to auto-generate a Device Id and credentials when creating the device. You can also specify a custom Device Id or a Device JSON file to customize the device creation. There are other options to specify credentials such as the symmetric key and X.509 certificates. We will cover those in a separate IoT Hub Security post, for now, we use the default credentials generated by IoT Hub.

If all goes well, you should see an output similar to below:

deviceId:                   youdeviceId
generationId:               63624558311459675
connectionState:            Disconnected
status:                     enabled
statusReason:               null
connectionStateUpdatedTime: 0001-01-01T00:00:00
statusUpdatedTime:          0001-01-01T00:00:00
lastActivityTime:           0001-01-01T00:00:00
cloudToDeviceMessageCount:  0
authentication: 
  symmetricKey: 
    primaryKey:   symmetrickey1=
    secondaryKey: symmetrickey2=
  x509Thumprint: 
    primaryThumbprint:   null
    secondaryThumbprint: null
connectionString:           HostName=youriothub.azure-devices.net;DeviceId=youdeviceId;SharedAccessKey=symmetrickey=

Few important things, the obvious one here is the connectionString. It provides a unique Device Connection string to talk to the device. The privileges for the device connection string are based on the device policy defined in IoT Hub and gives them DeviceConnect permissions only. The policy-based control allows our endpoints to be secure and limited to use with only the specific device. You can learn more about IoT Hub device security here. Also, notice that the device is enabled but the status is disconnected. What this means is that the device has been successfully registered in IoT Hub. However, there are no active connections to it.

Sending and Receiving messages

Lets initiate a connection by sending a device ingestion request. There are multiple ways in IoT Hub Explorer to send and receive messages. One of the options that are useful is a simulate-device command. The simulate-device command allows the tool to behave like a device ingestion and command simulator. It can be used to send Telemetry custom messages and Commands on behalf of the device. The functionality comes handy when you want to perform some Dev. Integration tests on your device, and you don’t want to write too much code. You can create messages and view the send and receive flow simultaneously. The command also exposes properties such as send-interval and send-count and receive-countwhich allows for configuring the simulation. Note that this is not a penetration or load testing tool, it is enabling a feature for you to perform some initial tests before performing more involved test cases. Let send a series of messages on our created device (from Part 1) and then receive a Command message.

Send a message

The following command sends 5 messages every 2 seconds to a specified Device Id.

niksac$ iothub-explorer simulate-device --send "Hello from IoT Hub Explorer" --device-connection-string "HostName=youriothubname.azure-devices.net;DeviceId=D1234;SharedAccessKey==" --send-count 5 --send-interval 2000

The output for the messages will look like this:

Message #0 sent successfully
Message #1 sent successfully
Message #2 sent successfully
Message #3 sent successfully
Message #4 sent successfully
Device simulation finished.

Monitoring Messages

Another useful feature in IoT Hub Explorer is that it allows you to monitor the event of your device or IoT Hub as a whole. This comes in very handy when you want to troubleshoot your IoT Hub instance. For example, you want to check if messages are flowing correctly in IoT Hub. You can use the monitor-events command to log all events for the device of Hub to the terminal; you can also use the monitor-ops command to monitor the operation endpoint for IoT Hub.

To monitor events, enter the following:

iothub-explorer monitor-events --login "HostName=youriothub.azure-devices.net;SharedAccessKeyName=iothubowner;SharedAccessKey=="

The above created a listener which is now looking for any activating happing on the entire IoT Hub. As mentioned earlier, you can specify a Device connection string to monitor for a specific device.

Now, when you send a message or command to any device in your IoT Hub, you should start seeing the output in the terminal. For example, if you open a monitor-event listener in a terminal window and then execute the simulate-device --send command again, you should see following output in the terminal:

Monitoring events from all devices...
==== From: D1234 ====
Hello from IoT Hub Explorer
====================
==== From: D1234 ====
Hello from IoT Hub Explorer
====================
==== From: D1234 ====
Hello from IoT Hub Explorer
====================
==== From: D1234 ====
Hello from IoT Hub Explorer
====================
==== From: D1234 ====
Hello from IoT Hub Explorer
====================

There are multiple other useful commands in IoT Hub Explorer such as Import, Export devices, regenerate SAS tokens, device management commands. You should play around with the IoT Hub Explorer commands and options; it can save you from writing some piece of code for common operations.

In the next post, we talk about another useful utility for troubleshooting and diagnostics.

This post is part of a Cross-Platform IoT series, to see other posts in the series refer here.

Cross Platform IoT: Using Azure CLI with Azure IoT Hub

This post is part of a Cross-Platform IoT series, to see other posts in the series refer here.

A brilliant addition to the cross platform toolkit from Microsoft is the Azure CLI. The Azure CLI (Command Line Interface) provides an intuitive and rich command line experience to perform various operations on Azure resources. The support for this tool is great, and in some cases, I feel this is better than the PowerShell version of CLI out there :). Azure CLI has two versions today:

The node.js version of the Azure CLI – It has been out for a while and provided support for all resources in Azure including Resource Groups, VMs, Storage Accounts, etc.
A new version which is based on Python was released some time back correctly for better support with Unix / Linux platforms – Azure CLI 2.0. We will be using Azure CLI 2.0 for our samples.

So what’s a CLI got to do with IoT Hub? Well, the team also introduced an IoT module which allows management of IoT Hub operations from the command line. We will show how to enable the IoT Hub module for Azure CLI and then use the commands to create and manage our IoT Hub Service.

There seems to be a feature parity between the two versions, you can use either of them for running the IoT modules. Here is an introduction to the goals behind Azure CLI 2.0.

Let’s start by installing the IoT Components for Azure CLI, we will then use the CLI to create and perform operations on an IoT Hub.

If you have not already installed Azure CLI 2.0, you can install it from here. The link provides instructions for installing CLI on any OS; we will be using it for MacOS. The python scripts ensure dependencies are installed and also sets the PATH variable for the “az” command from a terminal. The script attempts to add the configuration to ~/.bashrc which I have observed does not work well with MacOS Terminal. To ensure that you can use the “az” alias for all Azure CLI commands, you should add the following environment variable to your ~/.bash_profile configuration instead. From any bash terminal enter nano ~/.bash_profile to edit the configuration and type the following and save your settings:

export PATH=\~/bin:$PATH

I don’t see a .bash_profile?
In case you don’t have a .bash_profile (this is common with macOS), you can simply create one using the following:

cd ~/
touch .bash_profile

To make sure Azure CLI is working, open a terminal and type

az --version

This should show a list of all components installed as part of the CLI:

acs (2.0.0)
appservice (0.1.1b5)
batch (0.1.1b4)
cloud (2.0.0)
component (2.0.0)
configure (2.0.0)
container (0.1.1b4)
core (2.0.0)
documentdb (0.1.1b2)
feedback (2.0.0)
keyvault (0.1.1b5)
network (2.0.0)
nspkg (2.0.0)
profile (2.0.0)
redis (0.1.1b3)
resource (2.0.0)
role (2.0.0)
sql (0.1.1b5)
storage (2.0.1)
vm (2.0.0)
Python (Darwin) 2.7.10 (default, Jul 30 2016, 19:40:32) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)]

Note that the IoT component is not installed by default, to install the IoT component, run the following commands:

az login
az account set --subscription  
az component update --add iot
az provider register -namespace Microsoft.Devices

The above commands will login and set your account to the desired Azure subscription. It will then install the IoT components and register the IoT provider. If you now run az --version, you should see the IoT component installed amongst the list.

acs (2.0.0)
appservice (0.1.1b5)
......
iot (0.1.1b3)
......

The IoT component is further divided into two groups device and hub:

The hub group is for commands specific to the IoT hub itself. These include create, update, delete, list, show-connection-string, show-quota-metrics for the creation, etc.
The device group allows for commands that are relevant for device operations. These include create, delete, list, update, show, show-connection-string, import, and export.

You can use az iot device -h or az iot hub -h to see all commands and subgroups.

Now, let’s create our IoT Hub, use the following commands to create a new resource group in the selected subscription and then create a new IoT Hub in the newly created resource group.

az group create --name yourresourcegroupname --location westus
az iot hub create --name youriothubname --location yourlocation --sku S1 --unit 1 --resource-group yourresourcegroupname

Most of the values are self explanatory. I specify the SKU as S1 but if you want to use the free tier you can specify F1 as the SKU (the allowed values are: {F1,S1,S2,S3}. Also, the unit here refers to the IoT Hub units that you want to create with the IoT Hub, 1 unit is more than enough for development purposes. The optional location attribute is the region where your IoT Hub will be created, you can ignore this to use the default location of the resource group. List of available regions (at the time of writing) are: {westus,northeurope,eastasia,eastus,westeurope,southeastasia,japaneast,japanwest,australiaeast,australiasoutheast,westus2,westcentralus}.

If everything goes OK, you should see the response similar to below:

{
  "etag": "AAAAAAC2NAc=",
  "id": "/subscriptions/yournamespace/providers/Microsoft.Devices/IotHubs/youriothubname",
  "location": "westus",
  "name": "youriothubname",
  "properties": {
       ....
      }
    },
    ... 
''   "type": "Microsoft.Devices/IotHubs"
}

Voila!, we just created an IoT Hub from the command line on a Mac! We will use this IoT hub to add devices and perform other operations in our next post.

Some other useful commands in Azure IoT CLI:

To see the configuration again just type az iot hub list -g yourresourcegroupname you should see a list of all IoT Hub created in your resource group. If you omit the resource group, you can see all IoT Hubs in the subscription. To display the details of a specific IoT Hub, you can use az iot hub show -g yourresourcegroupname --name yourhubname.

IoT Hub by default creates a set of policies that can be used to access IoT Hub. To view a list of policies enter az iot hub policy list --hub-name youriothubname. Policies are very important when dealing with IoT Hub operations, you should always leverage the “least privilege” principle and choose your policy based on the operation being conducted. For example, the iothubowner policy grants you admin rights on the IoT Hub and should not be used for example to connect a device.

To get just the connection strings for the created IoT Hubs enter az iot hub show-connection-string -g yourresourcegroupname.

Note that by default the above command returns the connection string for iothubowner policy which gives complete control over your IoT Hub. It is recommended that you define granular policies in IoT Hub to ensure a least privilege access. To do this, use the --policy-name option and provide a policy name.

You can also use az iot device commands to create a device; we will use IoT Hub Explorer to demonstrate those features instead. IoT Hub Explorer provides some additional commands including device management and command and control. Personally, I do see the prospect for all IoT Hub Explorer functionality coming into the future version of the IoT component for Azure CLI.

Looking under the hood of Azure CLI

One of the beautiful things about Azure CLI is that is uses REST API under the hood to create the resources and entities in Azure. If you are curious how the underlying operations are being performed or simply want to troubleshoot for any errors, you can use the -debug option with any command. The debug command provides a trace of what calls are being made and any underlying exceptions that may have occurred during processing. For example, for a command like:

az iot hub show-connection-string -g yourresourcegroupname --debug"

You should see something like below:

Command arguments ['iot', 'hub', 'show-connection-string', '-g', 'yourrgname']
Current active cloud 'AzureCloud'
{'active_directory': 'https://login.microsoftonline.com',
 'active_directory_graph_resource_id': 'https://graph.windows.net/',
 'active_directory_resource_id': 'https://management.core.windows.net/',
 'management': 'https://management.core.windows.net/',
 .......
'storage_endpoint': 'core.windows.net'}
Registered application event handler 'CommandTableParams.Loaded' at 
Registered application event handler 'CommandTable.Loaded' at 
Successfully loaded command table from module 'iot'.
..........
g': 'gzip, deflate'
msrest.http_logger :     'Accept': 'application/json'
msrest.http_logger :     'User-Agent': 'python/2.7.10 (Darwin-16.4.0-x86_64-i386-64bit) requests/2.13.0 msrest/0.4.6 msrest_azure/0.4.7 iothubclient/0.2.1 Azure-SDK-For-Python AZURECLI/2.0.0'
msrest.http_logger :     'Authorization': 'Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsIng1dCI6ImEzUU4wQlpTN3M0bk4tQmRyamJGMFlfTGRNTSIsImtpZCI6ImEzUU4wQlpTN3M0bk4tQmRyamJGMFlfTGRNTSJ9.eyJhdWQiOiJodHRwczovL21hbmFnZW1lbnQuY29yZS53aW5kb3dzLm5ldC8iLCJpc3MiOiJodHRwcz0cy53aW5kb3dzLm5ldC83MmY5ODhiZi04NmYxLTQxYWYtOTFhYi0yZDdjZDAxMWRiNDcvIiwiaWF0IjoxNDg5NjEzODc5LCJuYmYiOjE0ODk2MTM4NzksImV4cCI6MTQ4OTYxNzc3OSwiX2NsYWltX25hbWVzIjp7Imdyb3VwcyI6InNyYzEifSwiX2NsYWltX3NvdXJjZXMiOnsic3JjMSI6eyJlbmRwb2ludCI6Imh0dHBzOi8vZ3JhcGgud2luZG93cy5uZXQvNzJmOTg4YmYtODZmMS00MWFmLTkxYWItMmQ3Y2QwMTFkYjQ3L3VzZXJzL2UyZTNkNDdlLTRiYzAtNDg1Yy04OTE1LWYyMDVkYzRlODY5YS9nZXRNZW1iZXJPYmplY3RzIn19LCJhY3IiOiIxIiwiYWlvIjoiQVFBQkFBRUFBQURSTllSUTNkaFJTcm0tNEstYWRwQ0pwWnVGcnREQ05QTTBrNjB1NVl0RElLTjZ5cklUVHJna0Fod3JuQXJBd2NMQXFqczlNQ0RhazRtM0E2cjN3T09YZ1FxaWZlUVFIRC1TQ0JXOVVJdjNabzFnMERXMElvYWRLRWgtQ0R3X01XY2dBQSIsImFtciI6WyJwd2QiLCJtZmEiXSwiYXBwaWQiOiIwNGIwNzc5NS04ZGRiLTQ2MWEtYmJlZS0wMmY5ZTFiZjdiNDYiLCJhcHBpZGFjciI6IjAiLCJlX2V4cCI6MTA4MDAsImZhbWlseV9uYW1lIjoiU2FjaGRldmEiLCJnaXZlbl9uYW1lIjoiTmlrIiwiaW5fY29ycCI6InRydWUiLCJpcGFkZHIiOiI2NS4zNi44OC4xNjYiLCJuYW1lIjoiTmlrIFNhY2hkZXZhIiwib2lkIjoiZTJlM2Q0N2UtNGJjMC00ODVjLTg5MTUtZjIwNWRjNGU4NjlhIiwib25wcmVtX3NpZCI6IlMtMS01LTIxLTEyNDUyNTA5NS03MDgyNTk2MzctMTU0MzExOTAyMS0xMzUwMDI4IiwicGxhdGYiOiI1IiwicHVpZCI6IjEwMDM3RkZFODAxQjZCQTIiLCJzY3AiOiJ1c2VyX2ltcGVyc29uYXRpb24iLCJzdWIiOiI2Z3d4WUFKem4zM3h6WVhfVWM4cHpNcGc4dzk1NVlHYTJ2VnlpazMtVDZ3IiwidGlkIjoiNzJmOTg4YmYtODZmMS00MWFmLTkxYWItMmQ3Y2QwMTFkYjQ3IiwidW5pcXVlX25hbWUiOiJuaWtzYWNAbWljcm9zb2Z0LmNvbSIsInVwbiI6Im5pa3NhY0BtaWNyb3NvZnQuY29tIiwidmVyIjoiMS4wIn0.jEAMzSd4bV0x_hu8cNnQ7fActL6uIm97U7pkwCz79eaSfEnBdqF8hXEJlwQh9GYL0A3r8olNdjr1ugiVH6Y0RFutn7iD8E-etkVI9btE1aseZ3vvZqYeKPhA1VytBsTpb4XO2ZI094VynTeALoxe7bbuEzl5YaeqtbC5EM0PMhPB04o7K1ZF49nGKvA385MHJU3G-_JT3zV-wdQWDj5QKfkEJ0a9AsQ9fM7bxyTdm_m5tQ4a-kp61r92e1SzcpXgCD7TXRHxrZ2wa65rtD8tRmHt6LOi7a4Yx2wPFUpFoeQAcN7p7RKW6t_Cn8eyyvWrrUXximBcTB4rtQTgXCfVUw'

As you can see, Azure CLI makes the rest calls on our behalf using the current tokens and performs the required operations. Once the responses are received, it formats the response in the desired format and displays it on the terminal.

Here is the output without the --debug option:

[
  {
    "connectionString": "HostName=youriothub.azure-devices.net;SharedAccessKeyName=iothubowner;SharedAccessKey=yourkey=,
    "name": "youriothub"
  }
]

Another useful operation provided by Azure CLI is the output option. It allows for formatting of output using different parsers. For example, the using the same command above, we can format the output from JSON (default) to a table.

 az iot hub show-connection-string -g yourresourcegroupname -o table"

The output should now be presented in a tabular format:

ConnectionString                                                                                                                        --------------------------------------------------------------------------------------------------------------------------------------  --------------
HostName=youriothub.azure-devices.net;SharedAccessKeyName=iothubowner;SharedAccessKey=yourkey= 
Name
youriothub

Finally, you can create powerful scripts using Azure CLI using features like query and combine them with conventional unix pipe commands to send the output from one command to another.

For example, the below command uses a query to get all IoT Hub created in the west region under my subscription and then formats the output as a tsv:

az iot hub list --query "[?contains(location,'westus')].{Name:name}" -o tsv

The query language (JMESPath) is very powerful when filtering requests, you can find more details about the query language here.

That’s all for now. In the next post, we will use Azure IoT Hub Explorer to create a device and then perform send and receive operations upon it.

This post is part of a Cross-Platform IoT series, to see other posts in the series refer here.

Cross Platform IoT: Developing solutions using Azure IoT on a Mac!

Not so long ago, a search on the words “Mac” and “Microsoft” would reveal results like rivalry, locked-in platform, competitors, etc. The story has dramatically changed in the last few years especially from a Microsoft perspective. Microsoft is embracing Open Source and Cross Platform and opening up gates for developers and end-users to benefit from the Microsoft platform not just on Windows but across any OS or platform.

In this blog series, I will cover some of the Microsoft Azure IoT Cross-platform tools that are available for developers to build IoT Solutions on several operating systems, most of them are Open Source licensed! My focus will specifically be on development and troubleshooting tools for Azure IoT Hub on a non-Windows platform. We will be covering the following tools:

Azure CLI for IoT
IoT Hub Explorer
IoT Hub Diagnostic Utility
Developing .NET Core apps on Visual Studio for MacOS (Preview)

When should you use these tools

Since most of these tools are cross-platform, they can and should become part of your development and troubleshooting toolkit when developing Azure IoT applications. A few use cases for these utilities include:

Import/export devices as part of a workflow or script.
These may be included as part of your CI/CD builds to verify if the environment is setup and working correctly.
Troubleshooting a Dev / Test environments.
Monitoring devices for events during solution development. This is pretty handy to test if a message event is coming through or a command is being passed without writing any code!
And of course, building and deploying cross-platform applications on any OS platform.

Let’s get started

There are multiple pieces to the puzzle, let’s list them down:

In Part 1, we will install the IoT Module for Azure CLI 2.0 and use the CLI to create a new Resource group and then create an IoT Hub instance in our Azure Subscription.
Part 2, we will use IoT Hub Explorer to perform some common device operations on our created IoT Hub instance and create a device Identity that we can play with.
Part 3, we discuss the IoT Diagnostic tool for troubleshooting IoT Hub.
Finally, in Part 4 we will develop a .NET Core application and connect to our IoT Hub to publish messages and send commands. All of that using Visual Studio for Mac.

Note that in this series we will not connect to an actual device but only show an emulation. The purpose is to demonstrate how we can do rapid development using the provided cross-platform tools. If you are interested in connecting your device to Azure IoT, you can always look at the great samples available here.

	Leonard Gates on High Availability, and Disaste…
	Ashlee D on Using Win10 Core and Azure IoT…
	Sarbjit Singh on The Zero Config MicroService u…
	Boris Tarasov on The Zero Config MicroService u…
	formulahendry on Cross Platform IoT: Devices op…