Have you ever had the feeling of being listened to? Many of us, after merely discussing a product, see ads mysteriously tied to the conversation. Today, microphones are everywhere: smartphones, computers, watches, voice assistants, and so on. To counter this intrusive listening, a team of three machine learning experts at Columbia University has developed an algorithm that produces sounds almost inaudible to humans but that jam automatic speech recognition, preventing our own devices from spying on us. The work was presented at a recent ICLR conference.
Natural language processing is a field of artificial intelligence (AI). It makes it possible to transcribe human speech into text, understand it, and act on a request or hold a conversation, as Siri or Alexa do. AI algorithms fall mainly into two groups: recognition and generation. In natural language processing, recognition amounts to analyzing and understanding words, and generation amounts to synthesizing them. This work spans both areas and adds a particularly innovative twist: the predictive attack.
In earlier software, the algorithms used to prevent eavesdropping were not effective enough in real-time conversation. The characteristics of the sound signal change while the attack is being carried out, and those changes made it almost impossible for a machine to keep up with a person's words. The main challenges are optimization and speed: the algorithm must be able to anticipate changes in pitch and speaking rate. Humans are unpredictable, and the machine must adapt accordingly.
"Our algorithm manages to prevent a malicious microphone from correctly picking up your voice in 80% of cases. It works even when we know nothing about the malicious microphone, such as its location or even the software it uses," explains Mia Chiquier, the first author of the study, from Columbia's computer science department.
A sound that fits the prediction
How is this performance achieved? By creating "predictive attacks." A computer broadcasts a signal whose frequency content varies with the speaker's vocal characteristics (at frequencies close to 16 kHz, a barely audible sound). The algorithm first learns to "recognize" human speech and to predict what will be said next. It then only needs to generate a perturbation matched to that prediction which, once added to the speech, makes it unrecognizable to an automatic speech recognition tool.
To implement this system of predictive attacks, the tool uses deep learning: an artificial neural network trained on a database of speech sounds and frequencies. The system improves as the amount of data it studies grows over time.
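The core idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' model: the `predict_perturbation` function below is a placeholder that emits quiet noise scaled to the recent signal energy, where the real system uses a trained deep network that predicts future speech and crafts a perturbation to fool recognizers. The frame size and sample rate are assumptions for the sketch (16 kHz matches the frequency range mentioned in the article).

```python
import numpy as np

SAMPLE_RATE = 16_000   # article mentions frequencies near 16 kHz
FRAME = 512            # hypothetical frame length (~32 ms)

def predict_perturbation(past_audio: np.ndarray) -> np.ndarray:
    """Stand-in for the trained predictive model: given only the PAST
    audio, emit a quiet perturbation for the NEXT frame. Here it is
    just low-amplitude noise shaped by recent signal energy."""
    energy = np.sqrt(np.mean(past_audio[-FRAME:] ** 2) + 1e-12)
    rng = np.random.default_rng(0)
    return 0.05 * energy * rng.standard_normal(FRAME)

def camouflage_stream(speech: np.ndarray) -> np.ndarray:
    """Process audio frame by frame. Because each perturbation depends
    only on frames already heard, it can be played back in real time
    alongside the speaker's *future* words - the 'predictive attack'."""
    out = speech.copy()
    for start in range(FRAME, len(speech) - FRAME + 1, FRAME):
        past = speech[:start]
        out[start:start + FRAME] += predict_perturbation(past)
    return out

# Demo on one second of synthetic "speech" (a rising chirp).
t = np.linspace(0, 1, SAMPLE_RATE, endpoint=False)
speech = 0.3 * np.sin(2 * np.pi * (100 + 200 * t) * t)
protected = camouflage_stream(speech)
print(protected.shape)
```

The key design point, which the sketch preserves, is causality: the perturbation for each frame is computed before that frame is spoken, which is what lets the real system run live instead of needing the whole recording in advance.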
For now, the algorithm remains at the prototype stage: much work remains to make the system widely available and accessible to all. Ideally, the team wants the technology extended to languages other than English, and packaged as an app that can be installed on any electronic device you wish to secure.
This study also raises some ethical issues: we now need algorithms to protect us from other algorithms. "It is as if we were so happy to have built a car that we forgot to design the steering wheel and the brakes," explains Jianbo Shi, a computer scientist at the University of Pennsylvania in Philadelphia (USA). This is, in effect, evidence of a systematic regulatory failure against the massive collection of data for targeted marketing. And even if such anti-spying systems come into wide use, others will almost certainly try to adapt their recognition methods to overcome the jamming noise, or to reverse its effects.
Odyssey by Piettre
Opening image: examples of spectrograms used by the researchers (Credit: Mia Chiquier / Columbia University).