It Is Easy to Recognise Speech

Wait a Minute! What Did You Say?

It Is Easy to Wreck A Nice Beach

Or

It Is Easy to Recognise Speech

We have all seen digital voice assistants shipped on consumer devices: Alexa, Siri, Cortana (you get it, name a word ending with an ‘a’ or an ‘i’!). This does not mean voice recognition is easy. It isn’t. Accuracy in speech recognition remains a challenge, even though machine learning has brought breakthrough improvements.

Digital Voice Assistants to the Rescue

Google, Amazon (Alexa), and Apple (Siri) each use their own cloud API to recognise speech and then process your request.
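Google also exposes its speech recognition as a public cloud API. As a rough illustration of what such a call looks like, here is a minimal sketch using the Cloud Speech-to-Text Python client; the audio file name is a placeholder and authentication setup is omitted:

```python
# Minimal sketch: sending a short audio clip to Google Cloud Speech-to-Text.
# "request.wav" is a placeholder file; credentials/setup are omitted.
from google.cloud import speech

client = speech.SpeechClient()
with open("request.wav", "rb") as f:
    audio = speech.RecognitionAudio(content=f.read())
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
)
response = client.recognize(config=config, audio=audio)
for result in response.results:
    print(result.alternatives[0].transcript)  # best transcription hypothesis
```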


You’re probably thinking:

I like accuracy, but isn’t it possible to run this operation locally on a device and achieve the same performance level?

The Word Error Rate (WER) measures the proportion of words that are wrongly converted from audio to text: substitutions, insertions, and deletions, divided by the number of words in the reference transcript. The web giants use their huge datasets to train voice recognition models on real-life data. Much of this data is free to access: consider TEDx, for example, with 1,700 video recordings, or the audio streams you can find in YouTube content.
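To make the metric concrete, here is a minimal sketch of how WER can be computed with a word-level edit distance (our own illustration, not any vendor’s API):

```python
# Minimal WER sketch: (substitutions + deletions + insertions) / reference length,
# computed with a word-level Levenshtein distance.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("it is easy to recognise speech",
          "it is easy to wreck a nice beach"))  # ~0.67
```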

Don’t Try This At Home

Training requires terabytes of data and is computationally intensive. Warning: don’t try this at home, your room will heat up quickly! A nicer and more energy-efficient approach is to run this operation in a properly cooled data center. You will save money by using pay-as-you-go hardware resources, billed by the hour and released when not in use.

Inference, on the other hand, applies a trained model to new data. Despite their memory and processor constraints, smartphones and other embedded devices can run inference locally, provided they can load a machine learning model and have the right software support. With Apple’s Core ML release, this capability is now accessible to every iOS developer.
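In practice, an existing trained model is converted to the Core ML format and shipped with the app. Here is a minimal sketch of that conversion with Apple’s coremltools package; the model file name is a hypothetical example, not a model from this article:

```python
# Hypothetical sketch: converting a trained Keras model to Core ML.
# "voice_model.h5" is an assumed file name used for illustration only.
import coremltools as ct
import tensorflow as tf

keras_model = tf.keras.models.load_model("voice_model.h5")
mlmodel = ct.convert(keras_model)        # produce a Core ML model
mlmodel.save("VoiceModel.mlmodel")       # ready to drop into an Xcode project
```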

Learn more: read our previous article about training and inference.

Add 3 Digital Voice Functions in 2020

The digital voice assistant field is moving fast, with more integration in apps and devices as we have seen with Microsoft and Alexa.

The applications foreseen with voice processing capabilities include:

  • Searching for moments. As our video libraries grow, can we search them efficiently by keyword, sentence, or sentiment?
  • On the B2B side, voice and AI can introduce significant changes in customer relationships. Salesforce is running a demo to include sentiment analysis in their Einstein solution.
  • Accessing customer complaint descriptions and searching this knowledge base efficiently is today a manual, error-prone process that drives operating costs up. What would you change in your organisation if you could apply search and AI capabilities to the feedback given by your customers?


Capturing the voice of the customer is essential to understanding client expectations, preferences, and aversions. Your operations teams can benefit from advanced voice and AI features.

You can add a Sixth Sense to your VoIP services. Learn more: ivoip.io


What is all the hype about artificial intelligence?

Artificial Intelligence has inspired many science fiction novels and movies. Is it still science fiction?

No, not anymore. AI is alive! Nowadays, artificial intelligence is disrupting every sector of the economy that invests heavily in R&D, including energy, automotive, transport, and finance. Advances in technology have allowed AI to develop rapidly in recent years.

Beyond this evolution, how can AI impact your business?

The evolution of artificial intelligence

Photo: computer scientist Geoffrey Hinton, who studies neural networks used in artificial intelligence applications, at Google’s Mountain View, California, headquarters on March 25, 2015.

The building block of artificial intelligence is the neural network (NN), loosely inspired by the structure of the human brain. Although neural network theory has been around since 1943 (McCulloch & Pitts), for decades neural networks did not perform well: they lacked the computing power needed to drive them. In 2006, Geoffrey Hinton (one of the godfathers of AI, today working with the Google Brain research team) changed the course of history: his deep neural network approach went on to outperform the most advanced voice recognition techniques. While the concept is generic and can be applied to many research fields, it has since evolved into what we call “deep learning”.

Over the past 10 years, technology development has continued to accelerate, and now achieves superhuman performance on some tasks. For instance, the UK artificial intelligence startup DeepMind, bought by Google in 2014, developed software that learned to play dozens of Atari video games at or above human level (Mnih et al., 2015). Awesome? Even more impressive was Google’s achievement in May 2017, when AlphaGo beat Ke Jie, the world’s top-ranked Go player, as well as a team of five top professionals. A superhuman achievement, again!

Today’s smartphones have more powerful CPUs than the desktop computers of a decade ago. Consequently, AI technology now has the potential to fit into every embedded device in our pockets.

Yet AI is still in its infancy! It is now accessible to everyone, and the community is growing fast. The main benefit of AI is making sense of digital data; after all, it is built on top of Big Data.

What about your data set and your use-case?

Machine learning is the art of teaching an algorithm to perform a highly specific operation for us. Neural nets and deep learning are pure mathematics: algebra that requires a lot of computation, processing power, and data. Without enough data, a machine learning model will not converge and therefore will not find a good solution. Again, AI requires lots of data, and clever management of that data!

Trust our machine learning wizards!

Whatever we do, we generate data! There are plenty of data sources available today. These data sources can be used for research as well as within production environments. This is excellent news for ML enthusiasts and ML wizards seeking new challenges, like us!

Over the past two years, ML has given outstanding results in several computer science fields including voice recognition, image classification, time series analysis and fraud detection. Our work is to create smart data technologies for the telecommunications and IoT industries, for your business and your requirements.

With data science teams located both in Annecy (France) and Tokyo (Japan), we are ready to dedicate our expertise to the success of your project.

We are an experienced Elasticsearch partner and ML wizards ready to write a data success story together.

For more information, please contact us at: info@redmintnetwork.fr


Throw Away Your Code And Gather More Data

The Big Hype in Machine Learning Is Not Going to Stop

Why is that? Well, maybe (and this is only our humble opinion) because we realise we can now solve difficult problems with machine learning (ML) that we haven’t been able to solve in the past. This capability opens a new era in data science and technology. It is like a brand new Age of Enlightenment: there are new discoveries every month, and people are really excited about it!


Throw Away Your Code And Gather More Data!

During the last 40 years, engineers and scientists used to describe and solve difficult problems with complex algorithms. This approach takes time, effort, and state-of-the-art techniques, and progress is slow. A common example is image classification. Say you have to write an application to tell apples and oranges apart. You could:

  • extract color information, shape information, texture information, etc.;
  • put these features together and write clever statistics; and
  • guess the label using the above feature statistics.

Excellent! You can now separate apples from oranges, and the error rate is below 5%. But what if you add lemons to the equation? Your code, and the rules written to discriminate between labels, will have to change.
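To make this concrete, here is a toy sketch of what such hand-written rules might look like; the features and thresholds are invented purely for illustration:

```python
# Toy hand-crafted classifier: the features and thresholds below are made up.
# Every new fruit forces us to rewrite these rules by hand.
def classify_fruit(mean_red: float, mean_green: float, roundness: float) -> str:
    if mean_red > 0.6 and mean_green < 0.4 and roundness > 0.8:
        return "apple"
    if mean_red > 0.5 and mean_green >= 0.4 and roundness > 0.85:
        return "orange"
    return "unknown"  # adding lemons means inventing yet another rule

print(classify_fruit(mean_red=0.7, mean_green=0.3, roundness=0.9))  # "apple"
```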

With machine learning, the focus is shifting to the data. The data decides which features are important and which ones are not. Think about it in two steps:

  1. Training. This step requires gathering enough data and formatting it properly. The more data, the more accurate the ML model will be. When training is supervised, we assign explicit labels to each example: this is a lemon image, so use the ‘lemon’ label; this is an orange image, so use the ‘orange’ label; and so on. The output of the training step is commonly called a model. Through training, we teach the model the correct answer to our problem.
  2. Inference. A model is available, so let’s test it. Will this model be able to infer the correct answer when it sees new lemons? This process is called inference (a minimal sketch of both steps follows below).
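Here is a minimal sketch of the two steps using scikit-learn. This is our own toy illustration: the feature values are made up, and in practice they would be extracted from real fruit images.

```python
# Toy two-step example: supervised training, then inference on unseen data.
from sklearn.linear_model import LogisticRegression

# 1. Training: (mean_red, mean_green, roundness) features with explicit labels.
X_train = [[0.70, 0.30, 0.90], [0.60, 0.50, 0.95], [0.80, 0.80, 0.70],
           [0.75, 0.25, 0.85], [0.55, 0.45, 0.92], [0.85, 0.75, 0.65]]
y_train = ["apple", "orange", "lemon", "apple", "orange", "lemon"]
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# 2. Inference: the trained model labels a fruit it has never seen before.
print(model.predict([[0.82, 0.78, 0.68]]))  # likely ['lemon'] on this toy data
```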

This second step, inference, is fast and cheap. Once the model is loaded into memory, matrix multiplications and floating-point calculations do the math. This capability is now generally available and already runs inside embedded devices: Apple has released Core ML, an API that runs on iOS and lets developers perform ML inference directly on the device. Other embedded vendors, ARM included, have yet to release similar API support for their SoCs, unless it is available through a compatible library.

The hard step is training, as it requires time and strong data science knowledge. Training is data and computation intensive: you’ll want specialised and dedicated hardware! Finding the best model to fit a given data set is therefore time consuming. Without the right hardware (or cloud ML remote compute power), the most difficult problems will need months of computation and the results may not be the ones you expect.

Leaving The Past Behind

If you don’t have it yet, you don’t know what you’re missing. With image classification, models trained on the ImageNet dataset have reached top-5 accuracy close to 100%. Even better, Google has released Inception, trained on a giant image dataset using its internal resources, and the model is available (Apache License 2.0) on GitHub.
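One common way to use a pretrained Inception model today is through the Keras Applications weights (not necessarily the exact release mentioned above). A minimal sketch, where the image path is a placeholder:

```python
# Minimal sketch: classify one image with a pretrained InceptionV3 model.
# "my_photo.jpg" is a placeholder path.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.inception_v3 import (
    InceptionV3, preprocess_input, decode_predictions)

model = InceptionV3(weights="imagenet")             # downloads pretrained weights
img = tf.keras.utils.load_img("my_photo.jpg", target_size=(299, 299))
x = preprocess_input(np.expand_dims(tf.keras.utils.img_to_array(img), axis=0))
print(decode_predictions(model.predict(x), top=3))  # top-3 predicted labels
```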

The same applies to digit classification and the popular MNIST dataset, where inference accuracy is beyond anything achieved by earlier techniques.

Everywhere you look, you will find difficult problems that computer science has not been able to solve in the last decade. It is time to take a fresh look at them. These are exciting times to dig deeper and create new features and new solutions, with an edge on the competition!