How to create your first Alexa skill

By Roman Schejbal
Published 8 years ago


For the last couple of weeks, Graham, Marcel, Sinem and I, from Red Badger, have been experimenting with Amazon’s Echo Dot, an electric hockey puck that uses voice recognition powered by the Amazon Alexa voice assistant.

In this post, I’d like to explain how one goes about creating their first Alexa skill.

Unboxing

The first thing to do after unboxing is to download the Alexa app from your phone’s app store and follow the instructions to connect the device to WiFi. Once connected, Alexa should be ready and listening for requests, questions or commands.

One caveat

At the time of writing, if you want to run a custom Alexa skill on your own device, you’ll need to set the device language to US English. That took us some hard googling to find out, so you don’t have to.

Designing a voice user interface

There are some best practices for designing a voice user interface for Alexa, and I’d recommend following them. You can find them, with examples, in the documentation.

I’ll just list two here that I consider crucial:

Make it clear that the user needs to respond - after you present options to the user, ask a question so they know they are expected to say something
Don’t assume users know what to do - present the available options clearly so users know how to answer and control your Alexa skill

Defining the voice interface

Amazon Developer Portal (ADP) is the place where we set up our skill. It’s separate from the AWS console and, as far as we know, there isn’t a way of updating an Alexa skill programmatically. (Which makes us quite sad, because we really like to automate our deployments.)

When creating a skill in the developer portal, the first things we need to define are the Name and the Invocation Name of our skill, which are pretty self-explanatory. In our case, we put “Jarvis” into both fields and proceed to the next step.

The interaction model

Alexa’s interaction model consists of three elements: the intent schema, slot types and utterances.

Intent Schema

What is an intent? Intents are actions that users can perform with your skill. An intent schema is a simple JSON definition of those intents.

This is the schema for our Jarvis skill:

{
  "intents": [
    {
      "intent": "MakeFood",
      "slots": [
        {
          "name": "food",
          "type": "AMAZON.Food"
        }
      ]
    },
    {
      "intent": "AnswerFood",
      "slots": [
        {
          "name": "food",
          "type": "AMAZON.Food"
        }
      ]
    }
  ]
}

We have two arbitrary intent definitions:

MakeFood -- this intent is triggered when the user asks Jarvis to make food straight away, for example by saying "Alexa, tell Jarvis to make pancakes," as we’ll see in the Utterances definition
AnswerFood -- this intent represents a simple answer to a question from Jarvis, like "pancakes", and can only be triggered if the session has already been initialized, as we’ll see later in our code
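The schema has to be strict JSON, so a stray trailing comma is enough to get it rejected. A minimal sketch of a local sanity check follows; `checkIntentSchema` is a hypothetical helper of our own, not part of any Amazon tooling, and it only checks the few fields used in this post.

```javascript
// Hypothetical pre-flight check for an intent schema in the format shown above.
// Returns a list of human-readable problems; an empty list means nothing was flagged.
function checkIntentSchema(jsonText) {
  let schema;
  try {
    schema = JSON.parse(jsonText); // strict JSON: trailing commas are rejected
  } catch (err) {
    return [`not valid JSON: ${err.message}`];
  }
  if (!schema || !Array.isArray(schema.intents)) {
    return ['missing "intents" array'];
  }
  const problems = [];
  schema.intents.forEach((entry, i) => {
    if (typeof entry.intent !== 'string') {
      problems.push(`intent #${i} has no name`);
    }
    (entry.slots || []).forEach((slot) => {
      if (!slot.name || !slot.type) {
        problems.push(`${entry.intent}: every slot needs a name and a type`);
      }
    });
  });
  return problems;
}

const good = '{"intents":[{"intent":"MakeFood","slots":[{"name":"food","type":"AMAZON.Food"}]}]}';
const bad = '{"intents":[{"intent":"MakeFood"},]}'; // note the trailing comma

console.log(checkIntentSchema(good)); // []
console.log(checkIntentSchema(bad).length); // 1
```

Running something like this before pasting the schema into the developer portal catches syntax slips early, which matters when there is no automated deployment to do it for you.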

Slot / (Custom) slot types

Think of a slot as an argument to an intent. You can have multiple slots for an intent. For both of the intents above we use Amazon’s built-in slot type called AMAZON.Food, which is basically a predefined list of foods. We could use a custom slot type, but then we’d have to list the food options manually. I think that Alexa only uses these definitions to work out which intent should be triggered and what to feed into each slot as a value. So it will also work with things that are not in the list, like shoes for example.
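To make the "slot as argument" idea concrete, here is a small sketch of reading a slot value out of an intent request. `getSlotValue` is a hypothetical helper, and `sampleRequest` is a trimmed-down stand-in for the request Alexa sends; real requests carry more fields.

```javascript
// Hypothetical helper: read a slot's spoken value from an IntentRequest,
// returning null when the slot is missing or has no value.
function getSlotValue(request, slotName) {
  const slots = (request.intent && request.intent.slots) || {};
  const slot = slots[slotName];
  return slot && slot.value ? slot.value : null;
}

// Trimmed-down request, shaped like the one sent for
// "Alexa, tell Jarvis to make pancakes".
const sampleRequest = {
  type: 'IntentRequest',
  intent: {
    name: 'MakeFood',
    slots: { food: { name: 'food', value: 'pancakes' } },
  },
};

console.log(getSlotValue(sampleRequest, 'food')); // pancakes
```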

Utterances

These are the phrases people say to interact with our skill, or a voice-to-intent mapping if you prefer.

Again, this is the setup for Jarvis:

MakeFood to cook {food}
MakeFood to make {food}
AnswerFood I'm thinking {food}
AnswerFood I'd like {food}
AnswerFood I want {food}
AnswerFood {food}
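The mapping idea can be illustrated with a toy matcher. This is only a sketch of the concept and certainly not how Alexa resolves speech internally: `matchUtterance` is a hypothetical function that treats each sample utterance as a template, using the lowercase slot name `food` from the intent schema.

```javascript
// Sample utterances in the "IntentName phrase" format used by the portal.
const utterances = [
  'MakeFood to cook {food}',
  'MakeFood to make {food}',
  "AnswerFood I'm thinking {food}",
  "AnswerFood I'd like {food}",
  'AnswerFood I want {food}',
  'AnswerFood {food}',
];

// Toy matcher (NOT Alexa's real algorithm): turn each template into a regex
// and return the first intent whose phrase matches the spoken text.
function matchUtterance(spoken) {
  for (const line of utterances) {
    const [intent, ...words] = line.split(' ');
    const slotNames = [];
    const pattern = words.join(' ')
      .replace(/[.*+?^$()|[\]\\]/g, '\\$&') // escape regex metacharacters, keep braces
      .replace(/\{(\w+)\}/g, (_, name) => {
        slotNames.push(name);
        return '(.+)'; // a slot captures whatever was said in its place
      });
    const match = new RegExp(`^${pattern}$`, 'i').exec(spoken.trim());
    if (match) {
      const slots = {};
      slotNames.forEach((name, i) => {
        slots[name] = { name, value: match[i + 1] };
      });
      return { intent, slots };
    }
  }
  return null;
}

console.log(matchUtterance('to make pancakes').intent); // MakeFood
console.log(matchUtterance('I want shoes').slots.food.value); // shoes
```

Note how "shoes" ends up in the slot even though it is not a food: in this toy version, as in our experience with the real thing, the slot takes whatever was said in that position.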

Once we finish configuring our Interaction Model, we get to the Configuration step -- this is where we connect Alexa to our Lambda endpoint.

Build and host code

Log in to the AWS Console and navigate to AWS Lambda. Click the region drop-down and select either US East (N. Virginia) or EU (Ireland), as Lambda functions for Alexa skills must be hosted in one of these two regions.

Our Lambda needs to return a JSON response that looks something like this. This is the output we need when Jarvis is invoked without a command, e.g. “Alexa, open Jarvis” or “Alexa, ask Jarvis”.

{
  version: '1.0',
  sessionAttributes: {},
  response: {
    outputSpeech: {
      type: 'PlainText',
      text: 'Jarvis can cook food for you, what would you like?'
    },
    reprompt: {
      outputSpeech: {
        type: 'PlainText',
        text: 'What did you say you would like to eat?'
      },
    },
    shouldEndSession: false,
  }
}

So, the full source code for Jarvis looks like this. (We still need to compile it with Babel before shipping it to Lambda.)

// Build an Alexa response; pass a reprompt and shouldEndSession = false
// to keep the session open and wait for the user's answer.
const makeResponse = (text, reprompt = false, shouldEndSession = true) => ({
  version: '1.0',
  sessionAttributes: {},
  response: {
    outputSpeech: {
      type: 'PlainText',
      text
    },
    reprompt: reprompt ? {
      outputSpeech: {
        type: 'PlainText',
        text: reprompt
      },
    } : {},
    shouldEndSession,
  }
});

export const handler = function (event, context, callback) {
  const { type } = event.request;
  // The session object lives on the event itself, not on event.request
  const { session } = event;

  if (type === 'LaunchRequest') {
    // Skill opened without a command: ask what to cook and keep listening
    context.succeed(makeResponse(
      'Jarvis can cook food for you, what would you like?',
      'What did you say you would like to eat?',
      false
    ));
  } else if (type === 'IntentRequest') {
    const { intent: { name, slots } } = event.request;

    // AnswerFood only makes sense as a reply within an existing session
    if (name === 'AnswerFood' && !session.new && slots.food) {
      // make your call to a cooking service here
      context.succeed(makeResponse(`${slots.food.value}, that's great. I'm on it sir.`));
    } else if (name === 'MakeFood' && slots.food) {
      // make your call to a cooking service here
      context.succeed(makeResponse(`${slots.food.value}, that's great. I'm on it sir.`));
    } else {
      context.succeed(makeResponse(
        'I did not understand your request. For now I can only cook, what would you like to eat?',
        'What did you say you would like to eat?',
        false
      ));
    }
  } else if (type === 'SessionEndedRequest') {
    context.succeed('Good bye');
  }
};
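Before uploading, it can be handy to call the handler locally with a fake event. The sketch below assumes the handler replies synchronously through `context.succeed`, as the Jarvis handler does. `invokeLocally`, `makeEvent` and `standInHandler` are hypothetical names for illustration, and the stand-in handler is only there so the example runs on its own; swap in the compiled Jarvis handler to test the real thing.

```javascript
// Hypothetical helper: call a Lambda-style handler with a fake `context`
// and return whatever the handler passes to context.succeed.
// Assumes the handler calls context.succeed synchronously.
function invokeLocally(handler, event) {
  let result;
  const context = {
    succeed: (value) => { result = value; },
    fail: (err) => { throw err; },
  };
  handler(event, context, () => {});
  return result;
}

// Minimal event builder; real Alexa events carry many more fields.
function makeEvent(request, isNewSession) {
  return { version: '1.0', session: { new: isNewSession }, request };
}

// Stand-in handler so this sketch is self-contained.
function standInHandler(event, context) {
  context.succeed({
    sawType: event.request.type,
    newSession: event.session.new,
  });
}

const launch = makeEvent({ type: 'LaunchRequest' }, true);
const answer = makeEvent({
  type: 'IntentRequest',
  intent: { name: 'AnswerFood', slots: { food: { name: 'food', value: 'pancakes' } } },
}, false);

console.log(invokeLocally(standInHandler, launch));
console.log(invokeLocally(standInHandler, answer));
```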

Testing the skill

Once our Lambda is live and ready, we go back to Amazon’s developer portal, fill our Lambda’s ID (its ARN) into the form and hit the "Next" button, which brings us to the Testing section. Type in one of our defined utterances and click on "Ask Jarvis". Does it work? If so, your skill should also be available on your Echo / Echo Dot and you should be able to test it straight away.

Roman Schejbal, software engineer, Red Badger.
