CapTech Home Page

Blog May 9, 2018

Alexa Tutorial - Developing Skills with Azure Functions

Background

As Alexa devices become more popular and more people grow accustomed to voice-powered systems, companies are increasingly investing in this new interaction model. Although the typical path for developing Alexa skills has been to use AWS Lambda functions, in this blog I will demonstrate how any backend API, specifically Azure Functions, can be used instead. This may appeal to companies and developers that already use Azure, or that want to develop libraries that can be shared across other voice systems such as Cortana and Google Home.

The skill I will be developing is called Roll the Dice. It lets users roll a die with a specified number of sides and returns the result of the roll. In this example, it supports rolling only one die at a time. This walkthrough focuses on the interactions and bootstrapping of the Alexa skill and the Azure Function API, so the logic of rolling a die is kept simple. The ability to roll a die is already a built-in skill for Alexa and other voice systems.

To implement this system, or to follow along and build a similarly architected one, you will need an Amazon Developer account, a Microsoft Azure account and subscription, and Visual Studio 2017 v15.3 or later with the Azure development workload, which includes support for Azure Functions. A free trial of Azure is sufficient to develop and test this skill.

Development

Creating the Alexa Skill

First, we will create the skill in the Amazon Developer Portal so that we have a basic understanding of the structure of the request before creating the API.

  1. Log in to the Amazon Developer Portal.
  2. Navigate to "Alexa Skills Kit."
  3. Navigate to "Create Skill" on the right side of the screen.
  4. Follow the steps in the wizard to create the skill. I am naming this skill "Roll the dice" and selecting the "Custom" skill model.
  5. Select "Invocation" from the Interaction Model menu and provide a name. My invocation name will be the same as the skill name: "Roll the dice."
  6. Select "Intents" from the Interaction Model menu and add a new Intent. This will be a custom intent named "RollDice". Below are some of the utterances that I provided:
    • Roll a die with {DiceType} sides
    • Roll a {DiceType} sided die
    • Roll a dice
    These utterances help Amazon identify the varying parts of a spoken request. This skill is simple and does not require many sample utterances, but the more complicated the request and the more parameters it needs, the more utterances you should provide. To read more about generating sample utterances, see this blog on the Amazon Developer Portal. Because the {DiceType} parameter dictates the number of sides for the die, I will add it as a numeric Intent Slot so that Alexa can pass it as a parameter for the backend API to use.
  7. Finally, I will select "Build the model" to compile the utterances and intent slots to check for any errors.
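
For reference, the invocation name, intent, slot, and utterances from the steps above correspond roughly to the interaction model below, as it might appear in the skill's JSON Editor. This is a sketch under assumptions: it assumes the numeric slot uses the built-in AMAZON.NUMBER type and that the invocation name is stored in lowercase. The built-in intents that the console normally adds (help, stop, cancel) are omitted.

```json
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "roll the dice",
      "intents": [
        {
          "name": "RollDice",
          "slots": [
            { "name": "DiceType", "type": "AMAZON.NUMBER" }
          ],
          "samples": [
            "roll a die with {DiceType} sides",
            "roll a {DiceType} sided die",
            "roll a dice"
          ]
        }
      ]
    }
  }
}
```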

Creating the Azure Function

Next we will create the Azure Function and develop the code that supports the skill. As you will see, this is simply an API that accepts and returns an object represented in JSON. Although we are deploying it as an Azure Function, the same code could be used in an ASP.NET Web API project and deployed to IIS or containerized with Docker.
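
To make the shape of that JSON concrete, here is a trimmed sketch of what Alexa might POST to the function for "roll a 20 sided die". The identifiers and timestamp are placeholders, and a real request carries more fields (for example a context object and user details) than are shown here.

```json
{
  "version": "1.0",
  "session": {
    "new": false,
    "sessionId": "amzn1.echo-api.session.EXAMPLE",
    "application": { "applicationId": "amzn1.ask.skill.EXAMPLE" }
  },
  "request": {
    "type": "IntentRequest",
    "requestId": "amzn1.echo-api.request.EXAMPLE",
    "timestamp": "2018-05-09T12:00:00Z",
    "locale": "en-US",
    "intent": {
      "name": "RollDice",
      "slots": {
        "DiceType": { "name": "DiceType", "value": "20" }
      }
    }
  }
}
```

A response along these lines would make Alexa speak the result and keep the session open. The rolled value shown is only an illustration.

```json
{
  "version": "1.0",
  "response": {
    "outputSpeech": {
      "type": "PlainText",
      "text": "I rolled a 20 sided die and got a 13."
    },
    "shouldEndSession": false
  }
}
```

Note that the slot value arrives as a string, which is why the backend has to parse it into a number before rolling.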

  1. Log in to the Azure Portal and create a new Function App Resource. Ensure your app name is unique.
  2. Below are the settings that I used for the new Azure Function.
  3. Open Visual Studio and create a new project.
  4. Choose the Azure Functions project type. I named mine "RollTheDice."
  5. Select "HTTP Trigger" with "None" as the storage account and "Function" as the Access Rights. We are using the V1 / .NET Framework template instead of the V2 / .NET Core template to support the NuGet package we will be using. If .NET Core is preferred, you will need to use another NuGet package that has the Alexa objects predefined, such as Alexa.NET. That package does require minor alterations to the HttpRequestMessage passed into the Function.
  6. Add AlexaSkillsKit.NET and Newtonsoft.Json NuGet Packages to the project.
  7. Add a class that will map to the Azure Function, RollTheDice.
    using System.Net.Http;
    using System.Threading.Tasks;
    using Microsoft.Azure.WebJobs;
    using Microsoft.Azure.WebJobs.Extensions.Http;
    using Microsoft.Azure.WebJobs.Host;

    namespace RollTheDice
    {
        public static class RollTheDice
        {
            // HTTP-triggered entry point: hands the raw Alexa request to the speechlet
            // and returns the speechlet's HTTP response to Alexa.
            [FunctionName("RollTheDice")]
            public static async Task<HttpResponseMessage> Run(
                [HttpTrigger(AuthorizationLevel.Function, "get", "post", Route = null)] HttpRequestMessage req,
                TraceWriter log)
            {
                var speechlet = new RollTheDiceSpeechlet();
                return await speechlet.GetResponseAsync(req);
            }
        }
    }
      
  8. Add the speechlet class that will process the intents, RollTheDiceSpeechlet.
    using AlexaSkillsKit.Speechlet;
    using AlexaSkillsKit.UI;
    using System;

    namespace RollTheDice
    {
        public class RollTheDiceSpeechlet : SpeechletBase, ISpeechletWithContext
        {
            public SpeechletResponse OnIntent(IntentRequest intentRequest, Session session, Context context)
            {
                try
                {
                    // Default to 6 sides if not specified
                    if (!int.TryParse(intentRequest.Intent.Slots["DiceType"].Value, out int numSides))
                        numSides = 6;

                    // Guard against zero or negative values
                    numSides = Math.Max(1, numSides);

                    // Next(min, max) excludes max, so add 1 to include the top face
                    var rollResults = new Random().Next(1, numSides + 1);
                    return new SpeechletResponse
                    {
                        ShouldEndSession = false,
                        OutputSpeech = new PlainTextOutputSpeech { Text = $"I rolled a {numSides} sided die and got a {rollResults}." }
                    };
                }
                catch (Exception ex)
                {
                    // Speak the error message back to the user
                    return new SpeechletResponse
                    {
                        ShouldEndSession = false,
                        OutputSpeech = new PlainTextOutputSpeech { Text = ex.Message }
                    };
                }
            }

            public SpeechletResponse OnLaunch(LaunchRequest launchRequest, Session session, Context context)
            {
                return new SpeechletResponse
                {
                    ShouldEndSession = false,
                    OutputSpeech = new PlainTextOutputSpeech { Text = "Welcome to Roll the Dice. Ask me to roll the dice." }
                };
            }

            public void OnSessionEnded(SessionEndedRequest sessionEndedRequest, Session session, Context context)
            {
                // No cleanup is needed for this skill
                return;
            }

            public void OnSessionStarted(SessionStartedRequest sessionStartedRequest, Session session, Context context)
            {
                // No setup is needed for this skill
                return;
            }
        }
    }
  9. Go to Build > Publish from the application menu.
  10. Select "Start."
  11. Select Existing and click "Publish."
  12. Ensure you are signed in to the Microsoft Account associated with your Azure subscription. Select the subscription and then navigate to the Azure Function that was created.
  13. Select "Ok" and it will build and publish to Azure.
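
One detail of the speechlet in step 8 that is easy to get wrong is the range of the roll. Random.Next(maxValue) returns a value from 0 up to but not including maxValue, and the two-argument overload Next(minValue, maxValue) likewise excludes its upper bound, so a roll that should be able to land on the top face needs an upper bound of sides + 1. A minimal sketch of the roll pulled out into a standalone helper might look like the following. DiceRoller and Roll are hypothetical names used for illustration and are not part of the original project.

```csharp
using System;

// Hypothetical helper, not part of the original project.
public static class DiceRoller
{
    private static readonly Random Rng = new Random();
    private static readonly object Gate = new object();

    // Returns a value from 1 to sides, inclusive.
    // Values below 1 are treated as a one-sided die.
    public static int Roll(int sides)
    {
        // Clamp so that "max + 1" below cannot overflow.
        int max = Math.Min(Math.Max(1, sides), int.MaxValue - 1);

        // Random is not thread-safe, so serialize access to the shared instance.
        lock (Gate)
        {
            // Next(min, max) includes min and excludes max.
            return Rng.Next(1, max + 1);
        }
    }
}
```

With a helper like this, the speechlet could call DiceRoller.Roll(numSides) instead of creating a new Random for each request, and the range logic can be unit tested without going through Alexa.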

Integrate and Test

Now that the Alexa skill is created and the Azure Function application is deployed, we need to integrate the two and test.

  1. Sign back into the Amazon Developer Portal and go to your skill's "Endpoint" configuration.
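  2. Paste the URL of the new function as found in the function's Azure Portal overview. Because the function was created with "Function" access rights, the URL needs to include the function key (the code query string parameter). Since Azure provides SSL certificates, make sure to select the subdomain SSL trusted certificate option.
  3. Build the interaction model by selecting the interaction model and then "Build Model."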
  4. Go to the "Test" tab and test your skill!

Conclusion

Although Amazon makes it very easy to use AWS Lambda functions as the backend for Alexa skills, it is also easy to use other deployment options and frameworks, as long as they can accept a standard HTTP request. If you plan on developing a skill in the future, I encourage you to consider all options. Serverless applications are usually best for isolated, stateless skills, but you may find that more complex skills with heavier processing and state are more performant or cheaper when deployed as a standard API on a server. You can also add an Alexa API to an existing service to more easily leverage your existing libraries. There is always an option to migrate in the future.