
September 24, 2023 - 8 minutes

Beyond Siri: The Evolution of Natural Language Processing in AI

Discover how voice assistants are the product of natural language processing advancements in recent years.

Ironhack - Changing The Future of Tech Education

Artificial Intelligence

When you think of artificial intelligence, you probably think of talking houses and robots that can do absolutely everything for us. And while movies and popular culture have led us to believe that robots will one day be capable of doing everything we can (and probably better than we can, too), artificial intelligence is so much more than that; it can recommend the next show for us to watch on our favorite streaming service or even track common symptoms across cancer patients to suggest the right treatment. 

While artificial intelligence does have lots of interesting and useful applications, it’s hard to ignore the super cool artificial intelligence advancements like ChatGPT, Amazon’s Alexa, or Apple’s Siri. And in fact, natural language processing is one of the most transformative advancements in artificial intelligence that really makes us feel like there’s a robot with us, helping to make our lives easier. 

But what is natural language processing? Is it separate from AI? How do Siri and Alexa actually work? We’ll dive into these burning questions and many more in this article. 

What is Natural Language Processing?

Let’s start at the very beginning: natural language processing (NLP) is the branch of artificial intelligence that combines machine learning and linguistic knowledge to teach computers how to understand both text and the spoken word, just like humans can. 

While this might sound very straightforward, natural language processing is more challenging than meets the eye. Why? Well, human language consists of: 

  • Extreme variation from speaker to speaker: think about everyone you know who has an accent, be it a native speaker from a different region, a fluent speaker who has a slight hint of an accent, or someone who is learning to speak your language. Then, take into consideration the number of dialects that exist and you’ll realize how challenging it is to understand the spoken word. 

  • Mistakes, intentional or not: people frequently make mistakes when writing and even more so when talking. They also use slang or jargon that can completely change the meaning of a sentence, which poses quite the challenge for computers. 

  • Irony and sarcasm: irony and sarcasm are staples of everyday speech, typically signaled by a facial expression or a change in tone of voice. But computers, which can only process the exact words they’re being fed, struggle to understand exactly what the speaker means. 

  • Multiple meanings: lots of words have multiple meanings, and the intended meaning depends on many things: the surroundings of the speaker, their body language, facial expressions, or tone of voice. This ambiguity forces computers to decide what the speaker’s intention is without having everything they need to make that decision. 

To help computers tackle these challenges and fully understand exactly what they’re being fed, natural language processing scientists work to help computers become even more capable of properly identifying and responding to human speech through the following methods: 

  • Speech recognition: for a program to understand and respond to what a human is saying, the first step is to convert spoken words into text for the computer to digest. Seems simple enough, right? It isn’t, because this involves making sense of mumbled or mispronounced words, in addition to understanding various dialects and accents. 

  • Grammatical tagging: computers have a better chance of properly understanding human speech if they can identify each word’s part of speech, such as verb, noun, or adjective. Teaching computers what words mean when used in specific grammatical structures helps them better understand human speech. 

  • Named entity recognition: teaching computers to identify words that refer to specific entities, such as countries, places, or people’s names, helps avoid confusion and makes it easier to recognize patterns whenever those places or names are mentioned. 

  • Sentiment analysis: this is one of the toughest to get right, as it requires a deep understanding of how humans use irony and sarcasm. Showing computers examples of how emotion affects the meaning of a sentence helps make the overall analysis of speech more accurate. 
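To make the text-based methods above more concrete, here is a minimal Python sketch of grammatical tagging, named entity recognition, and sentiment analysis. It is a toy illustration, not how any real voice assistant works: the lookup tables (POS_LEXICON, KNOWN_PLACES, SENTIMENT_SCORES) and function names are hypothetical and written by hand, whereas real systems learn this knowledge from large amounts of data.

```python
import re

# Hypothetical, hand-written lookup tables (assumptions for illustration only).
POS_LEXICON = {
    "i": "PRON", "the": "DET", "a": "DET",
    "love": "VERB", "visited": "VERB", "was": "VERB",
    "food": "NOUN", "weather": "NOUN", "delay": "NOUN",
    "great": "ADJ", "terrible": "ADJ",
    "and": "CONJ", "but": "CONJ",
}
KNOWN_PLACES = {"spain", "madrid", "paris"}
SENTIMENT_SCORES = {"love": 1, "great": 1, "terrible": -1}


def tokenize(text):
    """Split text into lowercase word tokens, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def tag_parts_of_speech(tokens):
    """Grammatical tagging: pair each token with a part of speech ('UNK' if unknown)."""
    return [(token, POS_LEXICON.get(token, "UNK")) for token in tokens]


def find_places(tokens):
    """Named entity recognition (places only): keep tokens found in the place list."""
    return [token for token in tokens if token in KNOWN_PLACES]


def score_sentiment(tokens):
    """Sentiment analysis: sum word scores; above 0 is positive, below 0 is negative."""
    return sum(SENTIMENT_SCORES.get(token, 0) for token in tokens)


tokens = tokenize("I visited Madrid and the food was great")
print(tag_parts_of_speech(tokens))
print(find_places(tokens))      # ['madrid']
print(score_sentiment(tokens))  # 1
```

Note that a sarcastic sentence like "Great, another delay" also gets a positive score of 1 here, because the sketch only looks at individual words. That is exactly the irony problem described earlier.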

Now that you’re clear on what NLP is and the challenges we face, let’s review the history of NLP and see how we’ve arrived at the NLP we know today. 

The History of Natural Language Processing

Artificial intelligence began as futuristic ideas about robots and only started to become a reality in the 1950s, when scientists like Alan Turing began researching whether computers could exhibit intelligent behavior. Although the study of NLP began in the 1950s, it really took off in the 1980s as machine learning algorithms were introduced to assist with language processing. 

Machine learning consists of providing machines with vast amounts of data that they can process and then make sense of. Through this, they identify trends and patterns that can help them make better decisions and analyses. Over the years, NLP has become even more advanced, thanks to the ever-growing amount of data available to learn from. 
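The idea of finding patterns in data can be sketched in a few lines of Python. The example below is a deliberately tiny illustration under stated assumptions: the training sentences, the labels, and the train and predict helpers are all made up for this article, and real machine learning models are far more sophisticated than counting shared words.

```python
from collections import Counter, defaultdict


def train(examples):
    """Count how often each word appears under each label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def predict(counts, text):
    """Pick the label whose training sentences share the most words with the text."""
    words = text.lower().split()
    scores = {label: sum(word_counts[w] for w in words)
              for label, word_counts in counts.items()}
    return max(scores, key=scores.get)


# Hypothetical training data: a handful of labeled requests.
examples = [
    ("play some jazz music", "music"),
    ("play my favorite song", "music"),
    ("what is the weather today", "weather"),
    ("will it rain tomorrow", "weather"),
]

model = train(examples)
print(predict(model, "play a song"))          # music
print(predict(model, "is it going to rain"))  # weather
```

With only four examples the model is easy to fool, which hints at why more data tends to lead to better results.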

Natural Language Processing Today 

Today, some of the most common examples of natural language processing are Siri, Alexa, and other voice assistants. Let’s discover how NLP technology has created these seemingly personal assistants that can understand our speech and are ready to help with whatever we need. 

Major companies like Apple, Amazon, and Google have all created their own virtual assistants, so we’ll refer to them generally and discuss the technology that makes them so useful and innovative. When these assistants were first released, people tended to have fun with the technology, asking for jokes or simply carrying on a conversation, but the truth is that voice-guided virtual assistants are incredibly valuable.

Voice assistants can: 

  • Follow your instructions to open an app, play a certain song, or make a purchase 

  • Give you directions to your destination 

  • Engage in a brief conversation 

  • Read from web pages/apps

  • Tell jokes and imitate human reactions 

  • Give you reminders as scheduled 

These may seem like minor things that you could do on your own and yes, you probably could. But voice-guided virtual assistants bring the following advantages to your life: 

  • Increasing accessibility by not requiring any physical actions; in addition, read-aloud options allow users who are visually impaired to fully use their devices. 

  • Handling simple tasks like making a dinner reservation, sending a text, or setting an alarm. 

  • Automating repetitive and common tasks so that your time is free to handle more challenging or intense responsibilities. 

How voice assistants actually work 

Now that you know what they can do and why they’re so valuable, let’s break down the actual processing of voice assistants to better understand how natural language processing technology works here. 

Step 1: Voice Recognition 

This might seem like the simplest part of the entire process, but it’s actually the most complicated and requires advanced technology. Why? Because of everything we mentioned before: understanding the vast variations in human language is an incredible challenge, and one that took a long time to overcome. 

To make sense of your words, the software records your voice, converts it into a data file, and sends it to servers. We didn’t mention this earlier, but another challenge is that the voice assistant must separate your voice from your surroundings, filtering out other speakers and ambient noise that make it harder to understand you. 
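As a rough illustration of separating a voice from quiet background noise, the Python sketch below splits a recording into short frames and keeps only the loud ones. The sample values, frame size, threshold, and function names are all assumptions chosen for this example; real assistants rely on far more sophisticated signal processing, and this simple approach cannot tell one speaker from another.

```python
def frame_energy(frame):
    """Average absolute amplitude of one frame of audio samples."""
    return sum(abs(sample) for sample in frame) / len(frame)


def keep_loud_frames(samples, frame_size, threshold):
    """Split samples into frames and keep those at or above the energy threshold."""
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples), frame_size)]
    return [frame for frame in frames if frame_energy(frame) >= threshold]


# Hypothetical recording: quiet noise, a burst of speech, then quiet noise again.
recording = [0.01, -0.02, 0.01, 0.0,
             0.5, -0.6, 0.4, -0.5,
             0.02, 0.01, -0.01, 0.0]

speech_frames = keep_loud_frames(recording, frame_size=4, threshold=0.1)
print(speech_frames)  # [[0.5, -0.6, 0.4, -0.5]]
```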

Step 2: Connecting to Servers 

Once your words have been converted into a file and sent to the server, it’s time for the magic to happen. Your words go through the different methods we mentioned above like grammatical tagging, speech recognition, and sentiment analysis to identify both what words you’re saying and what your true meaning is. 

Major companies like Apple and Google already have giant databases of resources to help with your query, but they can also simply search the internet to find the answer to your question (which is why an internet connection is required for using virtual assistants!).

Step 3: Understanding your Meaning 

With a clear understanding of the words you spoke, the software is now tasked with understanding why you said them. It needs to figure out the meaning behind your words, taking into account your tone of voice, inflections, and any irony. Natural language processing takes center stage here, making sense of your meaning, not just your words. 
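One simple way to picture this step is as mapping the recognized text to an intent (what you want done) plus any details needed to carry it out. The Python sketch below does this with a few hand-written patterns. The intent names, the patterns, and the parse_intent function are hypothetical, and this approach ignores tone of voice and irony entirely, which real systems have to work much harder to capture.

```python
import re

# Hypothetical intents and patterns (assumptions for illustration only).
INTENT_PATTERNS = [
    ("set_alarm", re.compile(
        r"\b(?:set|create) (?:an )?alarm for "
        r"(?P<time>\d{1,2}(?::\d{2})?\s?(?:am|pm)?)")),
    ("play_music", re.compile(r"\bplay (?P<song>.+)")),
    ("get_directions", re.compile(
        r"\b(?:directions|navigate) to (?P<place>.+)")),
]


def parse_intent(utterance):
    """Return (intent_name, details) for the first matching pattern."""
    text = utterance.lower().strip()
    for name, pattern in INTENT_PATTERNS:
        match = pattern.search(text)
        if match:
            return name, match.groupdict()
    return "unknown", {}


print(parse_intent("Set an alarm for 7:30 am"))
print(parse_intent("Give me directions to the airport"))
print(parse_intent("Tell me a joke"))
```

The first call returns the set_alarm intent with the time "7:30 am", the second returns get_directions with the place "the airport", and the third falls through to "unknown" because no pattern matches.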

Drawbacks of natural language processing 

We’ve advanced significantly in recent years, but there’s still a lot of work to be done to improve the technology. As of today, these are the most common drawbacks with technologies like Alexa and Siri: 

  • Listening issues: in loud environments or when multiple people are talking around you, voice assistants may be unable to hear you or even determine who they should be listening to. 

  • Dialect/accent issues: users who have a strong accent, be it in their local dialect or a second language, can struggle to be understood by voice assistants, especially if they aren’t fully familiar with how to pronounce the words. 

  • Language limitations: despite being known worldwide, most voice assistants are limited to English or a few other languages like Spanish, French, and German. For voice assistants to be truly universal and inclusive, all languages must be included. However, this will require a significant investment in the data needed to make NLP work in these languages. 

  • Internet access: because the software pulls from the internet to evaluate speech, internet access is required to use voice assistants, which can limit both their accessibility and function.

As you can tell, natural language processing has advanced significantly over recent years, culminating in the creation of something that can help make our lives easier every single day. And at Ironhack, we’ve recently revamped our curriculum to reflect these changes in the tech world, ensuring our graduates are prepared to enter the workforce with the knowledge they need to land their dream jobs. 

If you’re interested in taking that first step towards learning what you need to know about artificial intelligence, you’re in the right place. Check out our bootcamps today and become the next AI expert. 
