High Availability and Disaster Recovery in Serverless solutions: does it matter?

Many a time when I talk to customers about Serverless architectures, the topic of High Availability and Disaster Recovery comes up. The perception usually is that in a Serverless world we don’t need to care about these because, well, we are Serverless!

TL;DR: In this post, we discuss design and planning considerations for building robust serverless solutions. The goal is to focus on “what” you should plan for when it comes to serverless and also to provide an overview of “how” to design around these considerations. We cover an array of topics ranging from Ingestion Endpoint(s), Storage, and Security to, of course, High Availability and Disaster Recovery. We also discuss governance considerations such as Compliance, Developer Experience, and the Release pipeline in the serverless world.

Note that while the references in this post are to Azure services, the concepts can be applied to any cloud provider and its offerings.

Let’s get started!

Characteristics of a robust solution

A robust solution can mean many things; to scope and level-set our discussion, we focus on the following characteristics of a solution:

  • Reliability: The system should continue to work correctly in the face of faults and failures.
  • Scalability: The system should be able to grow while maintaining the same levels of performance, if not better.
  • Maintainability: The system should be organized so that it can be worked on productively and modified in the future.

Martin Kleppmann describes these characteristics in amazing detail in his book Designing Data-Intensive Applications, a highly recommended read if you have anything to do with building scalable and quality software.

The Planning Sprint

The first question that comes up when thinking about Serverless is: do I need to plan for Serverless? The answer is: if you are building a production-quality solution, then absolutely yes.

The planning considerations, however, change from what you would do in a typical data-center-oriented architecture. The focus is not on how many servers I need or how I will handle replication, but rather on what the thresholds of the provider service are and how much reliability it can provide. In my opinion, this is the right definition of being serverless: your focus shifts to an abstraction over the underlying infrastructure, and you worry more about service capabilities and thresholds than about the underlying hardware or virtualization.

Serverless Planning = Plan for service capabilities and thresholds and not for the infrastructure that runs the service.

I highly recommend that you run a planning sprint to determine your requirements and how they map to the provider service constraints. First, a planning sprint (or Sprint 0) gives you an opportunity to decide if Serverless makes sense for your workloads (the Serverless vs. Containers discussion should happen here). Second, it allows you to analyze the capabilities of each service to determine if you are choosing the right service for the job. Finally, it addresses concerns about geographical reach, compliance, data sovereignty, and the future scale of the solution.

The what of Serverless

Below are the areas to focus on during the planning sprint; asking these questions allows us to look at the things to consider when building serverless solutions:

These are guidelines; you may have more categories and questions based on your unique requirements.

Ingestion Endpoint(s):

Understand how public and internal endpoints will handle requests.

  • Typical size of a message (average, max)?: Getting the average and max helps you understand two aspects of scale: how many messages you should expect over a sustained period and the maximum message size you need to accommodate during processing. This can impact which service you choose for consuming and processing these messages, e.g., Azure IoT Hub today supports message sizes up to 256 KB, so if you want to send larger messages, you will have to employ other techniques such as file upload or Azure Storage Blobs. Understanding the size can also help you decide whether to split messages before sending, or whether you can trim the message itself so unnecessary bits are not sent over the wire, improving bandwidth and processing times.

To know more about service throttles and limits for all Azure services, refer here

  • Who will send the messages?: Understand the clients of your services or application:
    • Plan for client(s) uniquely when required: Planning for a passive client like a browser requires fewer considerations around scale than planning for a device that can send continuous streams of data and quickly reach the throttle of a provider service.
    • Is there a direct connection (socket) possible?: This can determine how many active connections you need to configure on your public endpoint and whether the provider services will be able to handle them. It also provides an opportunity to tweak your services for optimum scale. For example, you can use the configuration below in Azure Functions to increase the incoming request pipeline and concurrent requests:
{
    "http": {
        "routePrefix": "api",
        "maxOutstandingRequests": 20,
        "maxConcurrentRequests": 10,
        "dynamicThrottlesEnabled": false
    }
}
  • Is there an intermediary between the client and the endpoint?: This could be a gateway server on-premises, an ISP VPN, or a server running on the Edge. Message processing and identity for messages coming from a gateway will be different from those for a direct connection. Additionally, you may need to do IP whitelisting or VNET configuration, and hence need to understand whether the provider service supports such functionality (refer to Offloading tasks to multiple services below for details on this topic).
    • Burst mode: An additional thing to consider here is burst-mode scenarios, where all the clients start sending messages at a particular time, increasing the load significantly and triggering a threshold for a downstream service. While true serverless services like Azure Functions are built to handle such scenarios, there could be a lag due to the provisioning of additional consumption units, which may also result in time-outs. In such a case you may want to dissect your workloads or move specific clients onto a dedicated tier to allow better distribution of the incoming requests.
    • Offloading tasks to multiple services: In a Serverless compute model we deal with Functions, which can be considered small, single tasks that make up a business functionality. Since functions are lightweight, it is a good approach to offload some of the work required in message exchanges to other components. For example, when building REST APIs you can offload tasks like load balancing, SSL termination, vanity URLs, and DNS resolution to an API gateway; similarly, you can offload authentication and authorization to an identity service like Azure AD, and the management of services to a service like API Management. Finally, geo-redundancy can be achieved using a service like Azure Traffic Manager. By using a bouquet of services, the Serverless compute layer (aka Azure Functions) can focus solely on responding to triggers or handling events, and the remaining ecosystem can work on ensuring the robustness of the solution.
  • What message format(s) are we dealing with?: The consideration here is whether the downstream services support the message formats you want to send. For example, Azure IoT Hub today allows you to send binary data, but if you are analyzing data using Azure Stream Analytics (ASA), it supports CSV, JSON, and AVRO as message formats today. So if you are sending data in BSON or your own proprietary format, you will have to transform the payload before ASA can process the messages. You can use Azure Logic Apps to do the transformation, but now your architecture has changed and has more moving parts to manage.
  • Can we do batching at the site?: Batching small messages (e.g., telemetry coming from a thermostat) is always recommended since it saves bandwidth and optimizes parallelism in downstream services. When possible, batch; however, do consider the size limits of the service. Another consideration is whether the downstream service can process messages in batches, since this can impact the load levelling of the solution. In most cases this should not be a problem, but it is worth checking each service's ability to process batches before making the decision.
  • Are there any workflow requirements that define a correlation between the data?: In the Serverless world, we are driving an event-driven architecture for our solution. While the events model provides great loose coupling between our services, it also makes it difficult to manage things like transactions, workflows, and correlation between components. Plan for how incoming messages will be processed when state needs to be transferred across services. Use an orchestrator like Azure Logic Apps or Azure Durable Functions when component-to-component interaction is required. Additionally, leverage cloud patterns like Retry, Circuit Breaker, and Saga to ensure you can replay or roll back events in case of failures.
  • Are there any quality of service (QoS) attributes that apply to messages?: Most services provided by a cloud provider offer at-least-once messaging as a guarantee. This is primarily because of considerations around the CAP theorem and the cost involved in building infrastructure that would provide higher guarantees. At-least-once will work well for most interactions, especially if you have appropriate retry logic and idempotent message handling in your service (see the sketch after this list). If you have scenarios where stronger reliability is a must, first think twice about why you need such guarantees; in most cases you won't! If you still convince yourself you need a higher guarantee like exactly-once, isolate the payloads that require it and use it sparingly.
  • Are there any specific protocols required to be able to send data to the cloud?: A lot of IoT systems use custom protocols instead of the popular HTTP protocol. Consider the protocols supported by the provider services during the technical feasibility assessment. If a protocol is not supported, you may have to build a custom protocol gateway layer, which can impact your decision to use a Serverless service vs. building a custom component.
  • Frequency at which messages are being sent?: In a provider-service world, you are bound by the units that you deploy for a service (e.g., IoT Hub units, Cosmos DB Request Units, Streaming Units, etc.). The idea is a pattern known as the Scale Unit pattern, which provides predictable monitoring and deployment of service units as you scale up, out, or down. Since each service is bottled into a unit-based model, you need to consider how the incoming messages will impact the units you have allocated for your service. But in a true serverless world this should not matter, since the platform should automatically scale up or out, right? While that is true for serverless services like Azure Functions (Consumption plan), it does not apply to all services today. Also, even for core serverless services, there is going to be some degradation or lag when a new consumption unit gets deployed based on your load. While this lag is usually minimal (in ms), it can impact your response time if you are running a mission-critical application.
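
Since most provider services only guarantee at-least-once delivery, the same message can reach your code more than once. Below is a minimal TypeScript sketch of an idempotent handler; hasBeenProcessed, markProcessed, and handleBusinessLogic are hypothetical placeholders that you would back with a durable store (for example, a Cosmos DB collection or a storage table keyed on the message id).

// Hypothetical helpers, not part of any SDK; back these with a durable store.
declare function hasBeenProcessed(messageId: string): Promise<boolean>;
declare function markProcessed(messageId: string): Promise<void>;
declare function handleBusinessLogic(payload: unknown): Promise<void>;

interface IncomingMessage {
    messageId: string;
    payload: unknown;
}

export async function processMessage(message: IncomingMessage): Promise<void> {
    // Duplicate deliveries are expected under at-least-once guarantees,
    // so skip anything that has already been handled.
    if (await hasBeenProcessed(message.messageId)) {
        return;
    }
    await handleBusinessLogic(message.payload);
    await markProcessed(message.messageId);
}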

Storage

How data is stored and processed by downstream services.

  • Real-time v/s batch routing: Determine how the downstream systems will process the incoming messages. Your choice of service needs to align with how soon the data needs to be processed. For example, if you are processing data in motion, you need a near-real-time service such as Azure Stream Analytics to process the data before it goes to other downstream systems. Similarly, if you are processing records over a period of time, you may instead want to employ Azure Time Series Insights. It is recommended to conceptually design your system using a model like the Lambda architecture and then decide which platform services match your requirements better.
  • Does the incoming data have any tags to classify/categorize it?: Platform services are getting smarter as they learn about the needs of customers. You should explore features within these services that can provide out-of-the-box solutions to complicated processing logic, and then enrich your incoming messages to enable the use of those features. For example, if you want to route your incoming device data based on message properties or the message body, IoT Hub provides a feature called Message Routing which can send messages to different downstream services based on a parameter in your message. It is handy if you are employing Hot Path vs. Cold Path analytics, since the same stream can be sent to multiple downstream services without writing a single line of code.
  • Retention policies and archival: Planning for archival can be challenging, but if you know how your data is growing and how much of it will move into cold storage, you can employ some neat features provided by the platform services to reduce cost and improve performance. For example, Azure Blob Storage supports a tier-based feature that allows you to move data between Hot, Cool, and Archive tiers; the pricing of each tier varies significantly, which reduces data cost compared to using a single plan for both current and archival data (see the sketch after this list).
  • Storage used by Serverless compute: Azure Functions uses storage accounts, especially Blobs and Tables, for its internal operations. This means that your Azure Functions performance can be impacted by storage limits and IOPS. Also, while developing Azure Functions, you need to plan for the associated storage accounts, including separating them per Function App and handling logs separately. If you are using Azure Durable Functions, they leverage Azure Storage Queues for state management, so you will need to consider additional implications when using them.
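
As an illustration of the tiering point above, here is a minimal sketch of moving a blob to the Archive tier; it assumes the @azure/storage-blob npm package, and the container and blob names are illustrative.

import { BlobServiceClient } from "@azure/storage-blob";

async function archiveTelemetryBlob(connectionString: string): Promise<void> {
    const service = BlobServiceClient.fromConnectionString(connectionString);
    const container = service.getContainerClient("telemetry");           // illustrative name
    const blob = container.getBlockBlobClient("telemetry-2018-01.json"); // illustrative name
    // Move older data to the cheaper Archive tier instead of keeping it on Hot.
    await blob.setAccessTier("Archive");
}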

Security

Leverage security threat models and fault-tolerance guidelines to prevent malicious attacks on your solution.

  • Transport and Messaging: Consider security at both layers
    • Almost all provider services provide a secure transport channel (HTTPS, AMQPS, etc.) by default. Leverage this as a standard.
    • Consider your authentication scheme and whether the service supports negotiation through that scheme. For example, Azure Functions by default provides key-based authorization levels (Function, Anonymous, and Admin); if you need additional security such as OAuth, you can leverage services like Azure API Management that enable more secure scenarios.
    • When thinking about using third-party authentication schemes (e.g., Facebook, Google, etc.), consider their SLAs. If you solely rely on a provider that has no SLA, your users may get locked out if the external service goes down.
    • Think end-to-end security and not just public endpoints. With a Serverless architecture, you will end up with a bunch of services that talk to each other to provide a solution. Some of these services may not have a public endpoint; however, they still need to be secured to ensure end-to-end protection.
  • Encryption and encoding: Two key considerations if you have a system that encrypts or encodes data when passing events between systems.
    • Custom processing will be required if you are handling such messages with platform services, since they support standard formats.
    • The message size will increase and can impact overall persistence targets as well as response times because of encrypting/decrypting and encoding/decoding procedures.

    Most platform services are secured at the transport layer, so use these techniques sparingly for specific workloads where data security is a must. Note, I am not recommending that you loosen your security procedures, but rather that you spend time classifying which messages actually need encryption. A way to plan for this is to build a security and fault-tolerance model and determine which messages would have a significant impact if the system were compromised.

  • PII data: Whether your application has a public or an internal endpoint, if users are accessing it, you need to think about their privacy. This becomes a little tricky when using platform services, since your solution is deployed on infrastructure where the provider also has a privacy policy. Understand the privacy policies described by the platform and align them with your own policy.
  • MTTR: Build a Mean Time To Respond strategy when it comes to security. You cannot stop hackers from fiddling with your public services (especially if you are famous), and with a service provider your organization has even less control. For the worst-case scenario, where your service or the platform provider's service gets compromised, plan a response strategy that limits the attack surface. For example, have proper monitoring in place and use analytics to detect variations in patterns; when a change is detected, block the impacted users and devices and issue patches through your automated build to limit the spread of the issue.

Availability and Disaster Recovery

  • Availability out of the box: The good thing about living in the serverless world is that you get availability out of the box. All services provide high-availability capabilities and in most cases either autoscale or provide easy configuration to handle workloads, so technically most of it is taken care of. However, when thinking about availability, don't restrict yourself to the SLA provided by a single service; instead, focus on the end-to-end solution. Since we are dealing with multiple services, ensure that your solution uptime accounts for the composite SLA of the combined provider services; for example, chaining three services that each offer 99.95% yields a composite availability of roughly 99.85%.
  • Transient fault handling: Serverless services provide some protection against transient failures through internal implementations of the Retry and Circuit Breaker patterns. For example, the WebJobs SDK, which is the basis of Azure Functions, provides these as part of the platform runtime.
    • In addition to the defaults, you can also use frameworks like Polly in your custom code to implement such patterns.
    • Not all services provide transient fault handling, so ensure you have appropriate measures on the calling end of the services (e.g., Event Hub triggers today do not have an automatic retry, and the calling function needs to implement retry logic; see the retry sketch after this list).
  • Disaster Recovery (DR): There are minimal DR capabilities provided by the platform services today, so if you are looking at a complete DR solution, you will have to do extensive planning. Let's look at some of these by breaking the DR conversation into the following components:
    • Serverless compute: Azure Functions lies under this umbrella and constitutes any custom code that you are running as part of your solution. A recommendation here is to use stateless functions as much as possible; once you do that, you can enable disaster recovery by leveraging a service like Azure Traffic Manager. As of today, Azure Functions can be configured as an App Service endpoint in Traffic Manager and allows you to use any routing strategy. Watch out for my next post on how to configure DR for Azure Functions for more details.
    • Data replication: All data storage services in Azure, including Azure SQL, Cosmos DB, and Azure Storage, provide geo-replication of data across Azure data centers. These can be enabled to ensure that data at rest is copied to a paired region and is available in case of a data center failure. Note that you will have to plan the consistency model for the data based on your workloads; for example, if you choose eventual consistency, there is a possibility of data loss due to asynchronous replication.
    • In-stream processing: By in-stream processing I refer to message queues and job pipelines like Azure Stream Analytics. This is the tricky part when it comes to using provider services: almost none of these services provide a message replication solution, and even those that do give minimal guarantees against data loss. A few ways to approach such situations:
      • The first approach is to identify your workloads and see if they can live with in-stream message data loss, i.e., losing messages that are in the queue or currently being processed. This requires a robust client that can replay the messages, which is not possible in all scenarios.
      • Create an active-active topology where the same message is directed to both data centers. While this ensures message replication, it can create problems around data duplication.
      • Some services like Service Bus provide a mechanism where you can create namespace pairing to ensure primary data is copied to a secondary region asynchronously.
    • Service availability: Last but not least, ensure that the services you are leveraging are available in paired regions to enable a DR scenario. For example, Azure App Insights is currently available in Southeast Asia but not in its paired region, East Asia.
  • Throttling: Up to this point we have been discussing how to keep the service up and running; however, in some scenarios you want to assign thresholds to your service so that you can deny requests instead of continuing to process them. The Throttling pattern is a great way to ensure your service stays healthy and does not exceed the internal thresholds you have set for service performance. In the case of Serverless, a lot of this is done for you by default. For example, based on the unit model you select, the provider service will automatically have a threshold defined and will issue HTTP 429 responses when the thresholds are reached. Additionally, when using Azure Functions in a Consumption plan, you can put a throughput threshold per function to define when to throttle your endpoints. Plan for throttling and time-outs on your service so that clients have a predictable experience and can handle such responses gracefully (the retry sketch below shows one way a caller can back off on 429s).
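
To make the retry and throttling discussion concrete, here is a minimal TypeScript sketch of a retry helper with exponential backoff that treats HTTP 429/408/5xx responses as transient. The status-code inspection and delay values are illustrative assumptions; libraries such as Polly offer the same pattern for .NET out of the box.

// A generic retry wrapper for transient failures (throttling, timeouts).
async function withRetry<T>(
    operation: () => Promise<T>,
    maxAttempts: number = 4,
    baseDelayMs: number = 200
): Promise<T> {
    for (let attempt = 1; ; attempt++) {
        try {
            return await operation();
        } catch (err) {
            // Assumes the error carries an HTTP status code; adjust for your client.
            const status = (err as any).statusCode || (err as any).status;
            const transient = status === 429 || status === 408 || status >= 500;
            if (!transient || attempt >= maxAttempts) {
                throw err;
            }
            // Exponential backoff with a little jitter before the next attempt.
            const delayMs = baseDelayMs * Math.pow(2, attempt - 1) + Math.random() * 100;
            await new Promise((resolve) => setTimeout(resolve, delayMs));
        }
    }
}

// Usage (illustrative): const doc = await withRetry(() => readVehicleById(id));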

Maintenance

  • Tooling: One of the key considerations when it comes to Serverless is whether there is sufficient tooling available for the development team to build an end-to-end solution. Several things to consider here:
    • Programming language: The choice of language will depend on whether the platform supports it. This becomes especially important when you have a development team with existing skills; for example, GoLang is not supported by Azure Functions today. Also, some languages might only have experimental support and will not be ready for production (e.g., TypeScript).
    • Dependency frameworks: The versions of runtime frameworks that you need for your solution also matter. Example: the Azure Functions v1 runtime supports Node 6.5.0 for production deployment, even though far newer Node releases are already available.
    • Cross-platform support: Development teams who need to deploy on Linux and Windows need to ensure the runtime and client SDKs are supported on the required OS distributions.
    • IDE support: Check whether the development tools are available and integrated into the IDE. If not, look for third-party extensions that cover your scenarios.

      A note on Visual Studio Code in case you are developing Azure Functions: it is perhaps the best cross-platform IDE available today, with an intuitive Azure Functions extension that makes development and deployment to Azure a breeze. If you have not checked it out, download it here.

    • The DevOps cycle will be significantly impacted if you don't have the right tools in hand. Ensure that the service supports not just portal deployment but also the command line and integration with CI/CD tools like Jenkins, VSTS, etc.
    • Azure pre-compiled v/s scripted functions: A lot of the samples and videos you see out there use the Azure Portal for development; when you develop in the Portal, the function is called a scripted function. While scripted functions are good for sample scenarios, when developing a production system I recommend you create a pre-compiled function using an IDE and deploy it using Azure tooling. A key reason is that scripted versions do not support versioning, so every time such a function runs, the runtime creates and deploys an assembly per function. This not only impacts scale but also makes change management difficult for future iterations.
  • Monitoring: Another important aspect of maintenance is how you monitor the system. The better the monitoring, the quicker you can find errors and issues and keep the system healthy. A few considerations when it comes to monitoring:
    • End-to-end telemetry: Most provider services have monitoring built in, which includes capturing events as well as monitoring dashboards such as Azure Monitor. While this is great from an individual-service perspective, when dealing with the entire solution you need data about event flows within the system and not just within individual services. Services such as Log Analytics and OMS greatly help with log aggregation and can then display meaningful insights about the solution instead of just a single service. Additionally, Application Insights can be used to send custom telemetry to these log aggregators to obtain end-to-end telemetry of the system (a minimal example follows this list).
    • Additionally, for custom logging scenarios leverage Semantic Logging frameworks that can assist with the integration of multiple sinks and channels without making changes to your logging API.
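
As an example of the custom telemetry mentioned above, here is a minimal sketch of emitting a custom event from Function code so it lands alongside the platform telemetry. It assumes the applicationinsights npm package; the event name and properties are illustrative.

import * as appInsights from "applicationinsights";

// The instrumentation key is read from the standard environment variable.
appInsights.setup(process.env.APPINSIGHTS_INSTRUMENTATIONKEY).start();
const telemetry = appInsights.defaultClient;

export function trackVehicleProcessed(vehicleId: string, durationMs: number): void {
    // Custom events can be aggregated and queried together with
    // service-level logs to get an end-to-end view of the solution.
    telemetry.trackEvent({
        name: "VehicleProcessed",          // illustrative event name
        properties: { vehicleId },
        measurements: { durationMs }
    });
}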

Compliance

  • Standards and policies: A Serverless solution is your solution running on a provider's infrastructure, so it is important to understand the implications around compliance and how much control and configuration you can enable.
    • Provider lock-in: The idea behind a Serverless solution is to host your solution in a provider environment. This by default encourages provider lock-in, since all of the services used by the solution will be specific to the vendor. But is that a bad thing? I would say it depends. In my experience, customers who commit to a cloud do not move away from it often, unless they hit serious limitations or find significant cost benefits elsewhere. Since this is an infrequent action, I would suggest embracing the vendor services instead of being conservative and designing for generic approaches. I do not say that because I work for a cloud provider; rather, I have seen customers go down the rabbit hole of being generic and limiting their use of a provider service's capabilities, resulting in a solution that could have been much better had they committed to the provider service. This is a big decision for an organization, though, so carefully assess how you want to proceed.

      Azure Functions is leading the way toward openness by open-sourcing the Functions runtime; this enables sophisticated hybrid scenarios as well as portability to other clouds. Hopefully, other cloud vendors will provide a standard runtime, so at least custom development on serverless can become portable.

    • Regulations: In addition to lock-in, consider any legal implications of using the services.
      • Are there any standards or policies that are required to be adhered to for the data that is being persisted?
      • Are there any security standards that need to be respected to ensure data security and compliance at rest?
      • Are there any requirements to ensure data is available in a specific region (e.g., all data must be persisted within a country)?

      Some of the above questions can tailor or limit the use of provider services depending on their availability in a region, so read the fine print carefully.

Understand the platform constraints

Apart from the customer requirements, it is essential to understand the limitations and throttles of the Serverless platform. This is critical since you are dealing with a bouquet of services, and you want to look at the end-to-end execution of operations to ensure you get performance, scale, and fault tolerance across the stack and not just for a specific service.

The Azure team has done a great job of providing best practices for most Azure services; you can check them out here:

I hope this post gave you an in-depth tour of the considerations for a Serverless architecture. Finally, remember: as you delve into Serverless solutions, you will realize you have many choices, and you need to be cognizant of each one, since it can impact the long-term scalability, availability, and reliability of your solution.

Would love to hear your thoughts and comments, and if you have guidelines or practices for developing serverless architectures, please do share 🙂 …

Building Serverless API’s with TypeScript and Azure Function Proxies

TL;DR: In this post, we build a microservice that uses Azure Functions and other awesome Serverless technologies provided by Azure. We will cover the following features:

  • Azure Functions currently has preview support for TypeScript, and we will be using the currently available features to develop a read/write REST API.
  • We leverage the Azure Function Bindings to define Input and Output for our functions.
  • We will look at Azure Function Proxies that provide a way to define consistent routing behavior for our function and API calls.

If you want to jump in; the source is available on GitHub here (https://github.com/niksacdev/sample-api-typescript).

TypeScript support for Azure Functions is in preview state as of now; please use caution when using these in your production scenarios.

Problem Context

We will be building a Vehicle microservice which provides CRUD operations for storing vehicle data in a Cosmos DB document store.

The architecture is fairly straightforward and looks like this:

Let’s get started …

Setting up TypeScript support for Azure Functions

VSCode has amazingly seamless support for Azure Functions and TypeScript, including development, linting, debugging, and deployment extensions, so it was a no-brainer to use it for our development. I use the following extensions:

Additionally, you will need the following to kick-start your environment:

  • azure-functions-core-tools: You will need these to set up the Functions runtime in your local development environment. There are two packages here; if you are using a Mac environment like me, you will need the 2.0 preview version.
     npm install -g azure-functions-core-tools@core
    		
  • Node.js (duh!): Note that the preview features currently work with Node 8.x. I have tried it on 8.9.4, which is in the current LTS line (Carbon), so you may have to downgrade using nvm if you are on the 9.x versions (see the commands below).
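
For example, switching to a supported Node version with nvm looks like this (assuming nvm is already installed):

nvm install 8.9.4
nvm use 8.9.4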

Interestingly, the Node version supported by Functions deployed in Azure is v6.5.0, so while you can play with higher versions locally, you will have to downgrade to 6.5.0 when deploying to Azure as of today!

You can now use the Function Runtime commands or the Extension UI to create your project and Functions. We will use the Extension UI for our development:

Assuming you have installed the extension and connected to your Azure environment, the first thing we do is create a Function project.

Click on Create New Project and then select the folder that will contain our Function App.

The extension creates a bunch of files required for the FunctionApp to work. One of the key files here is host.json, which allows you to specify configuration for the Function App. If you are creating HTTP triggers, here are some settings that I recommend tuning to improve your throttling and performance behavior:

{
    "functionTimeout": "00:10:00",
    "http": {
        "routePrefix": "api/vehicle",
        "maxOutstandingRequests": 20,
        "maxConcurrentRequests": 10,
        "dynamicThrottlesEnabled": false
    },
    "logger": {
        "categoryFilter": {
            "defaultLevel": "Information",
            "categoryLevels": {
                "Host": "Error",
                "Function": "Error",
                "Host.Aggregator": "Information"
            }
        }
    }
}

The maxOutstandingRequests setting can be used to control latency for the function by putting a threshold on the maximum requests in the waiting and execution queues. The maxConcurrentRequests setting allows control over concurrent HTTP function requests to optimize resource consumption. The functionTimeout is useful if you want to override the timeout settings for the App Service or Consumption plan, which has a default limit of 5 minutes. Note that configuration in host.json is applied to all functions.

Also note that I have a custom value for the routePrefix attribute (by default the route is api/{functionname}). By adding the prefix, I am specifying that all HTTP functions in this FunctionApp will use the api/vehicle route. This is a good way to set the bounded context for your microservice, since the route will now be applied to all functions in this FunctionApp. You can also use this to define versioning schemes when doing canary testing. This setting can be used in conjunction with the route attribute in a function's function.json; the Function runtime appends your function route to this default host route.

Note that this behaviour can be simplified using Azure Function Proxies, we will modify these routes and explore more later in the Azure Function Proxies section.

To know more options available in host.json, refer here

Our project is now created; next, we create our Function.

Click Create Function and follow the onscreen instructions to create the function in the same folder as the Function App.

  • Since TypeScript is in preview, you will notice a (Preview) tag when selecting the language. This feature was added in a recent build of the extension; if you don't see TypeScript as a language option, you can enable support for preview languages from the VSCode settings page by specifying the following in your user settings:
"azureFunctions.projectLanguage": "TypeScript"

The above will use TypeScript as the default language and will skip the language selection dialog when creating a Function.

  • Select the HTTP Trigger for our API and then provide a Function Name.
  • Select Authorization as Anonymous.

Never use Anonymous when deploying to Azure

You should now have a function created with some boilerplate TypeScript code:

  • The function.json defines the configuration for your function, including the Triggers and Bindings; index.ts is our TypeScript function handler. Since TypeScript is transpiled to JavaScript, your Function needs the output .js files for deployment, not the .ts files. A common practice is to move these output files into a different directory so you don't accidentally check them in. However, if you move them to a different folder and run the function locally, you may get the following error:
vehicle-api: Unable to determine the primary function script. Try renaming your entry point script to 'run' (or 'index' in the caseof Node), or alternatively you can specify the name of the entry point script explicitly by adding a 'scriptFile' property to your function metadata.

To allow using a different folder, add a scriptFile attribute to your function.json and provide a relative path to the compiled output file.

Make sure to add the destination folder to .gitignore to ensure the output .js and .js.map files are not checked in.

"scriptFile": "../vehicle-api-output-debug/index.js"
  • One thing that does not get added by default is a tsconfig.json and tslint.json. While the function will execute without these, I always feel that having them as part of the base setup helps enforce better coding practices. Also, since we are going to use Node packages, we will add a package.json and install the TypeScript definitions for Node:
npm install @types/node --save-dev
  • We now have our Function and FunctionApp created, but there is one last step required before proceeding: setting up the debug environment. At this time, VSCode does not provide built-in support for debugging Azure Functions written in TypeScript; however, you can enable it fairly easily. I came across this blog from Tsuyoshi Ushio that describes exactly how to do it.

Now that we have all things running, let’s focus on what our functions are going to do.

Building our Vehicle API

Developing the API is no different from your usual TypeScript development. From a Function perspective, we will split each operation into its own Function. There is a huge debate about whether you should have a monolithic function API or a per-operation (GET, POST, PUT, DELETE) API. Both approaches work, but I feel that within a FunctionApp you should segregate the service as much as possible, to align with the Single Responsibility Principle. Also, in some cases, you may achieve better scale by implementing a pattern like CQRS, where your read and write operations go to separate functions. On the flip side, too many small Functions can become a management overhead, so you need to find the right balance. Azure Function Proxies provide a way to surface multiple endpoints with consistent routing behavior; we will leverage this for our API in the discussion below.

In a nutshell, a FunctionApp is a Bounded Context for the Microservice, each Function is an operation exposed by that Microservice.

For our Vehicle API we will create two functions:

  • vehicle-api-get
  • vehicle-api-post

You can also create a Put, Delete similarly.

So, how do we make sure that each API is called only for the designated REST operation? You can define this in the function.json using the methods array.

For example, the vehicle-api-get is a HTTP GET operation and will be configured as below:

{
      "authLevel": "anonymous", --DONT DO THIS
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "route":"",
      "methods": [
        "get"
      ]
},

Adding CosmosDB support to our Vehicle API

The following TypeScript code allows us to access a CosmosDB store and retrieve data based on a Vehicle Id. This represents the HTTP GET operation for our Vehicle API.

import { Collection } from "documentdb-typescript";

export async function run(context: any, req: any) {
    context.log("Entering GET operation for the Vehicle API.");
    // get the vehicle id from url
    const id: number = Number(req.params.id);

    // get cosmos db details and collection
    const url = process.env.COSMOS_DB_HOSTURL;
    const key = process.env.COSMOS_DB_KEY;
    const coll = await new Collection(process.env.COSMOS_DB_COLLECTION_NAME, process.env.COSMOS_DB_NAME, url, key).openOrCreateDatabaseAsync();

    if (id !== 0) {
        // invoke type to get id information from cosmos
        const allDocs = await coll.queryDocuments(
            {
                query: "select * from vehicle v where v.id = @id",
                parameters: [{name: "@id", value: id }]
            },
            {enableCrossPartitionQuery: true, maxItemCount: 10}).toArray();

        // build the response
        context.res = {
            body: allDocs
        };
    } else {
        context.res = {
            status: 400,
            body: `No records found for the id: ${id}`
        };
    }

    // context.done();
}

Using Bindings with CosmosDB

While the previous section used code to perform the GET operation, we can also use Bindings for CosmosDB that will allow us to perform operations on our CosmosDB Collection whenever the HTTP Trigger is fired. Below is how the HTTP POST is configured to leverage the Binding with CosmosDB:

{
  "disabled": false,
  "scriptFile": "../vehicle-api-output-debug/vehicle-api-post/index.js",
  "bindings": [
    {
      "authLevel": "anonymous", --DONT DO THIS
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "route": "data",
      "methods": [
        "post"
      ]
    },
    {
      "type": "documentDB",
      "name": "$return",
      "databaseName": "vehiclelog",
      "collectionName": "vehicle",
      "createIfNotExists": false,
      "connection": "COSMOS_DB_CONNECTIONSTRING",
      "direction": "out"
    }
  ]
}

Then in your code, you can simply return the incoming JSON request, and Azure Functions takes care of pushing the values into CosmosDB.

export function run(context: any, req: any): void {
    context.log("HTTP trigger for POST operation.");
    let err;
    let json;
    if (req.body !== undefined) {
        json = JSON.stringify(req.body);
    } else {
        err = {
            status: 400,
            body: "Please pass the Vehicle data in the request body"
        };
    }
    context.done(err, json);
} 

OneClick deployment to Azure using VSCode Extensions

Deployment to Azure from the VSCode extension is straightforward. The interface allows you to create a FunctionApp in Azure and then provides a step by step workflow to deploy your functions into the FunctionApp.

If all goes well, you should see output such as below.

Using Subscription "".
Using resource group "".
Using storage account "".
Creating new Function App "sample-vehicle-api-azfunc"...
>>>>>> Created new Function App "sample-vehicle-api-azfunc": https://<your-url>.azurewebsites.net <<<<<<

00:27:52 sample-vehicle-api-azfunc: Creating zip package...
00:27:59 sample-vehicle-api-azfunc: Starting deployment...
00:28:06 sample-vehicle-api-azfunc: Fetching changes.
00:28:14 sample-vehicle-api-azfunc: Running deployment command...
00:28:20 sample-vehicle-api-azfunc: Running deployment command...
00:28:26 sample-vehicle-api-azfunc: Running deployment command...
00:28:31 sample-vehicle-api-azfunc: Running deployment command...
00:28:37 sample-vehicle-api-azfunc: Running deployment command...
00:28:43 sample-vehicle-api-azfunc: Running deployment command...
00:28:49 sample-vehicle-api-azfunc: Running deployment command...
00:28:55 sample-vehicle-api-azfunc: Running deployment command...
00:29:00 sample-vehicle-api-azfunc: Running deployment command...
00:29:06 sample-vehicle-api-azfunc: Running deployment command...
00:29:12 sample-vehicle-api-azfunc: Running deployment command...
00:29:17 sample-vehicle-api-azfunc: Running deployment command...
00:29:24 sample-vehicle-api-azfunc: Syncing 1 function triggers with payload size 144 bytes successful.
>>>>>> Deployment to "sample-vehicle-api-azfunc" completed. <<<<<<

HTTP Trigger Urls:
  vehicle-api-get: https://sample-vehicle-api-azfunc.azurewebsites.net/api/vehicle-api-get

Some observations:

  • The extension bundles everything in the app folder, including files like local.settings.json and the output .js directories; I could not find a way to filter these using the extension.
  • Another problem I have faced is that currently neither the extension nor the CLI provides a way to upload Application Settings as Environment Variables so they can be accessed by code once deployed to Azure, so these have to be added manually to make things work. For this sample, you will need to add the following key-value pairs in FunctionApp -> Application Settings through the Azure Portal so they become available as Environment Variables:
"COSMOS_DB_HOSTURL": "https://your cosmos-url:443/",
"COSMOS_DB_KEY": "your-key",
"COSMOS_DB_NAME":"your-db-name",
"COSMOS_DB_COLLECTION_NAME":"your-collection-name"
"COSMOS_DB_CONNECTIONSTRING":"your-connection-string"
  • If you are only running it locally, you can use local.settings.json; there is also a way through the CLI to publish the local settings values into Azure using the --publish-local-settings flag (see the command after this list), but hey, there is a reason these are local values!
  • The Node version supported by Azure Functions is v6.5.0 so while you can locally play with higher versions, you will have to downgrade to 6.5.0 as of today.
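
For reference, the CLI form of that flag looks like the following; the Function App name is the one created earlier, and as noted above, publishing local settings is usually not what you want.

func azure functionapp publish sample-vehicle-api-azfunc --publish-local-settings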

In case you guys have a better way to deploy to Azure, do let me know :).

Configuring Azure Function Proxies for our API

At this point, we have a working API available in Azure. We have (loosely) leveraged the CQRS approach to have a separate read API and a separate write API. For the client, however, dealing with multiple endpoints can quickly become cumbersome. We need a way to package our API into a facade that is consistent and manageable; this is where Azure Function Proxies come in.

Azure Function Proxies is a toolkit available as part of the Azure Functions stack and provides the following features:

  • Builds consistent routing behavior for the underlying functions in the FunctionApp, and can even include external endpoints.
  • Provides a mechanism to aggregate underlying APIs into a single API facade. In a way, it is a lightweight gateway service in front of your Functions.
  • Provides a mock-up proxy to test your endpoint without having the integration points in place. This is useful when testing request routing with dummy data.
  • Supports OpenAPI, which enables more out-of-the-box connectors to other services.
  • Supports Application Insights out of the box, where a proxy can publish events to App Insights to generate endpoint metrics not just for functions but also for legacy APIs.

If you are familiar with the Application Request Routing (ARR) stack in IIS, this is somewhat similar. In fact, if you look at the Headers and Cookies for the request processed by the Proxy, you should see some familiar attributes 😉

......
Server →Microsoft-IIS/10.0
X-Powered-By →ASP.NET
......
Cookies: ARRAffinity

Let’s use Function Proxies for our API.

In the previous sections, I showed how we could use routePrefix in host.json in conjunction with route in function.json. While that approach works, we have to add configuration for each function, which can become a maintenance overhead. Additionally, if I want an external API to share the same route path, that is not possible using the earlier approach. Proxies can help overcome this barrier.

Using proxies, we can develop logical endpoints while keeping the configuration centralized. We will use Azure Function Proxies to surface our two functions as a consistent API Endpoint, so essentially to the client, it will look like a single API interface.

Before we continue, we will remove the route attributes we added to our functions, keep only the variable references, and change the routePrefix to an empty string (""). Our published Function endpoints should now look something like this:

Http Functions:
        vehicle-api-get: https://sample-vehicle-api-azfunc.azurewebsites.net/{id}
        vehicle-api-post: https://sample-vehicle-api-azfunc.azurewebsites.net/vehicle-api-post/

This is obviously not intuitive; with multiple Functions it can become a nightmare for the client to consume our service. We create two proxies that define the route path and match criteria for our Functions. You can easily create proxies from the Azure Portal UI, but you can also author your own proxies.json. The snippet below shows how to define the proxies and associate them with our Functions.

  {
    "$schema": "http://json.schemastore.org/proxies",
    "proxies": {
        "VehicleAPI-Get": {
            "matchCondition": {
                "route": "api/vehicle/{id}",
                "methods": [
                    "GET"
                ]
            },
            "backendUri": "https://sample-vehicle-api-azfunc.azurewebsites.net/{id}"
        },
        "VehicleAPI-POST": {
            "matchCondition": {
                "route": "/api/vehicle",
                "methods": [
                    "POST"
                ]
            },
            "backendUri": "https://sample-vehicle-api-azfunc.azurewebsites.net/vehicle-api-post"
        }
    }
}

As of today, there is no functionality to upload a proxies.json in Azure, but you can easily copy-paste it into the Portal's Advanced Editor.

We have two proxies defined here: the first is for our GET operation and the other for POST. In both cases, we have been able to define a consistent routing mechanism for the selected REST verbs. The key attributes here are route and backendUri, which allow us to map a public route to an underlying endpoint. Note that the backendUri can be anything that needs to be called under the same API facade, so we can combine multiple services behind a common gateway route using this approach.

Can you do this with other services? I would have to say yes. You can implement similar routing functionality with Application Gateway, NGINX, or Azure API Management. You can also use an MVC framework like Express and write a single function that does all of this routing (a rough sketch follows below). So evaluate the options and choose what works best for your scenario.
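
As a rough sketch of that last option, a community adapter such as azure-function-express (an assumption on my part, not something used elsewhere in this post) can wrap an Express app so that a single HTTP-triggered function serves all the routes:

import { createHandler } from "azure-function-express";
import * as express from "express";

const app = express();

// All vehicle routes live in one Express app behind a single function.
app.get("/api/vehicle/:id", (req, res) => {
    res.json({ id: req.params.id }); // illustrative handler
});

app.post("/api/vehicle", (req, res) => {
    res.status(201).send();          // illustrative handler
});

// The exported handler is what the Azure Functions runtime invokes.
export = createHandler(app);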

Testing our Vehicle API

We now have our Vehicle API endpoints exposed through Azure Function Proxies. We can test them using any HTTP client. I use Postman for the requests, but you can use any of your favorite clients.

GET Operation

The exposed endpoint from the Proxy is:

https://sample-vehicle-api-azfunc.azurewebsites.net/api/vehicle/{id}

Our GET request fetches the correct results from CosmosDB

POST Operation

The exposed endpoint from the Proxy is:

https://sample-vehicle-api-azfunc.azurewebsites.net/api/vehicle/

Our POST request pushes a new record into CosmosDB:

There we have it. Our Vehicle API that leverages Azure Function Proxies and TypeScript is now up and running!

Do have a look at the source code here (https://github.com/niksacdev/sample-api-typescript) and please provide your feedback.

Happy Coding :).