Hey Alexa, tell me something about yourself.
Hey there! I’m your virtual assistant, developed by Amazon. I’m a cloud-based AI capable of voice interaction, music playback, making to-do lists, setting alarms, streaming podcasts, playing audiobooks, providing weather, traffic, sports, and other real-time information, and a lot more.
Alexa, that sounds interesting. Now tell me more about virtual assistants.
So get ready to dive in!! Here you go…
A voice assistant is a digital assistant that uses voice recognition, language processing algorithms, and voice synthesis to listen for specific voice commands and return relevant information or perform specific functions as requested by the user.
Alexa, what are the technologies used by virtual assistants?
Hey user, you know, virtual assistants use artificial intelligence and voice recognition to increase the accuracy and effectiveness of their results. The technology behind them is fascinating.
Voice recognition technology identifies a speaker and authenticates that he or she is indeed that individual. Unlike speech recognition, which identifies the words spoken, voice recognition analyzes countless patterns and elements that distinguish one person’s voice from another. Voice recognition is now being used in every facet of our lives, personally and professionally.
Artificial intelligence uses machines to simulate and replicate human intelligence. AI introduces the learning, problem-solving, and decision-making capabilities of the human mind. Four approaches were later developed that define AI: thinking humanly, thinking rationally, acting humanly, and acting rationally. The first two deal with reasoning, while the latter two deal with actual behavior. Modern AI is typically seen as a computer system designed to accomplish tasks that normally require human intelligence.
Machine learning is a subset of AI in which an application learns by itself. It reprograms itself as it digests more data, performing the specific task it’s designed for with increasingly greater accuracy. Machine learning applications (also called machine learning models) are often based on a neural network, a network of algorithmic calculations that attempts to mimic the perception and thought process of the human brain.
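To make the "learns as it digests more data" idea concrete, here is a minimal, purely illustrative sketch: a one-parameter model that fits the relationship y = 2x by gradient descent, adjusting its single weight a little with each example it sees. (This is a toy example, not a real ML library.)

```python
def train(samples, lr=0.1, epochs=50):
    """Fit a one-weight model y = w * x by repeated small corrections."""
    w = 0.0  # the single "weight" the model adjusts
    for _ in range(epochs):
        for x, y in samples:
            error = w * x - y    # how wrong the current model is
            w -= lr * error * x  # nudge the weight to shrink the error
    return w

data = [(1, 2), (2, 4), (3, 6)]  # training examples of y = 2x
w = train(data)
print(round(w, 3))  # converges very close to 2.0
```

The more examples (and passes over them) the model digests, the closer its weight gets to the true value, which is exactly the "increasingly greater accuracy" described above.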
Natural language processing (NLP):
Virtual assistants use natural language processing (NLP) to match user text or voice input to executable commands. Many continually learn using artificial intelligence techniques including machine learning.
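As a rough illustration of how NLP can map user input to an executable command, here is a minimal keyword-matching sketch. Real assistants use trained language models rather than keyword overlap, and all the intent names below are hypothetical.

```python
# Hypothetical intents, each with a few trigger keywords.
INTENTS = {
    "PlayMusicIntent": {"play", "music", "song"},
    "SetAlarmIntent": {"set", "alarm", "wake"},
    "WeatherIntent": {"weather", "forecast", "rain"},
}

def match_intent(utterance):
    """Pick the intent whose keywords best overlap the utterance."""
    words = set(utterance.lower().split())
    best = max(INTENTS, key=lambda name: len(INTENTS[name] & words))
    if not (INTENTS[best] & words):
        return None  # nothing matched at all
    return best

print(match_intent("play my favorite song"))     # PlayMusicIntent
print(match_intent("what's the weather today"))  # WeatherIntent
```

A production system would also handle synonyms, word order, and context, which is where the machine learning techniques mentioned above come in.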
Hybrid Emotion Inference Model (HEIM):
Humans can easily infer emotions from the tone of other people’s voices. An ML model called the Hybrid Emotion Inference Model (HEIM), which uses Latent Dirichlet Allocation (LDA) to extract text features and a Long Short-Term Memory (LSTM) network to model acoustic features, can be deployed to reveal the emotions behind our voices.
Also, virtual assistants follow two different approaches: task-oriented and knowledge-oriented.
A task-oriented approach uses goals and tasks to achieve what the user needs; it often integrates with other apps to help complete tasks. A knowledge-oriented approach uses analytical data to help users with their tasks, focusing on online databases and already-recorded knowledge.
Alexa, it was great learning about virtual assistants. Can you tell me how Alexa works technically?
Alexa works by analyzing an “order”:
Well, to make Alexa do any task, you first have to say the wake word, which by default is “Alexa,” or press the action button on your device. (You can change the wake word to ‘Amazon,’ ‘Echo,’ or ‘Computer.’) Only after detecting the wake word does Alexa listen to your request.
For that, the device uses built-in technology that matches what you’ve said to the acoustic patterns of the wake word. This technology is called “keyword spotting.” After detecting the wake word, the device sends your request to Amazon’s secure cloud, where the wake word is verified and your request is processed. After confirmation, an answer to your request is sent back.
Invocation name: the invocation name is the keyword used to trigger a specific “skill.” Users can combine the invocation name with an action, command, or question. All custom skills must have an invocation name to start them.
Utterance: utterances are the phrases users say when making requests to Alexa (for example, ‘Taurus’ is an utterance). Alexa identifies the user’s intent from the given utterance and responds accordingly. So, basically, the utterance decides what the user wants Alexa to perform.
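To see how the two pieces fit together, here is an illustrative sketch that splits a spoken phrase into the invocation name and the utterance. The connecting words (“ask,” “to,” “for”) follow common Alexa phrasing, and the “daily horoscope” skill name is made up for the example.

```python
import re

def parse_invocation(phrase):
    """Split e.g. 'ask daily horoscope for Taurus' into its parts."""
    m = re.match(r"ask (.+?) (?:to|for) (.+)", phrase.lower())
    if not m:
        return None
    return {"invocation_name": m.group(1), "utterance": m.group(2)}

print(parse_invocation("ask daily horoscope for Taurus"))
# {'invocation_name': 'daily horoscope', 'utterance': 'taurus'}
```

Here “daily horoscope” triggers the skill, and “taurus” is the utterance the skill uses to work out the user’s intent.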
Hey Alexa, wait a minute. First of all, tell me: what is an Alexa skill?
Well, skills are voice-driven Alexa capabilities. Alexa skills are like apps: you can enable and disable them, using the Alexa app or a web browser, in the same way that you install and uninstall apps.
Alexa, can I create my own skill?
Yes, for sure, by using the Alexa Skills Kit (ASK).
ASK is a software development framework that enables you to create content, called skills.
Alexa, how can I access a skill?
A user accesses content in a skill by asking Alexa to invoke it, and Alexa is always ready to invoke new skills. When a user says the wake word, “Alexa,” the device streams the speech to the Alexa service in the cloud. Alexa recognizes the speech, determines what the user wants, and then sends a request to invoke the skill that can fulfill it. The Alexa service handles speech recognition and natural language processing; your skill runs as a service on a cloud platform.

Alexa communicates with your skill using a request-response mechanism over an HTTPS interface. When a user invokes an Alexa skill, your skill receives a POST request containing a JSON body. The request body contains the parameters your skill needs to understand the request, perform its logic, and then generate a response.
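The request-response exchange can be sketched as follows. The JSON shapes below follow the general structure of Alexa skill requests and responses, trimmed down for illustration; the “HelloIntent” name and the reply text are made up.

```python
import json

def handle_request(request_json):
    """Parse an incoming POST body and build a spoken response."""
    request = json.loads(request_json)
    intent_name = request["request"]["intent"]["name"]  # what the user wants
    speech = {"HelloIntent": "Hello from my first skill!"}.get(
        intent_name, "Sorry, I don't know that one.")
    response = {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }
    return json.dumps(response)

# A simplified incoming request, as the Alexa service might POST it:
incoming = json.dumps({
    "version": "1.0",
    "request": {"type": "IntentRequest", "intent": {"name": "HelloIntent"}},
})
print(handle_request(incoming))
```

In a real skill this handler would run on a cloud platform (for example, AWS Lambda) and the Alexa service would take care of turning the response text back into speech.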
Moreover, every Alexa skill has a voice interaction model that defines the words and phrases users can say to make the skill do what they want. This model determines how users communicate with and control your skill.
Alexa supports two types of voice interaction models:
Pre-built voice interaction model — In this model, ASK defines the set of words users say to invoke a skill.
Custom voice interaction model — The custom model gives you the most flexibility but is the most complex. You design the entire voice interaction. With the custom model, you typically must define every way a user might communicate the same request to your skill.
With either type of voice interaction model, you develop your skill to receive voice requests, process the request, and respond appropriately.
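As a sketch of what a custom voice interaction model might look like, here is a trimmed-down model in the general shape ASK uses (an invocation name, intents, and sample utterances). The skill name, intent, and slot type below are invented for illustration.

```python
import json

# A hypothetical custom interaction model for a "daily horoscope" skill.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "daily horoscope",
            "intents": [
                {
                    "name": "GetHoroscopeIntent",
                    # Every way a user might phrase the same request:
                    "samples": [
                        "what is the horoscope for {Sign}",
                        "give me the horoscope for {Sign}",
                        "{Sign}",
                    ],
                    "slots": [{"name": "Sign", "type": "ZodiacSign"}],
                },
            ],
        }
    }
}

print(json.dumps(interaction_model, indent=2))
```

The `samples` list is where the “define every way a user might communicate the same request” work happens, and the `{Sign}` slot is the variable part Alexa extracts from the utterance.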
Keep learning, keep growing!!