Google, Amazon and Apple claim that their AI-based virtual assistants make life easier, whether on a smartphone or at home. Last month, a couple from the Waasmunster region of Belgium unexpectedly learned how these supposedly automated assistants really work.
Tim Verheyden, a journalist with the Belgian public broadcaster VRT, contacted the couple with a mysterious audio file. To their surprise, they clearly heard the voices of their son and grandchild, as captured by Google's virtual assistant on a smartphone.
Verheyden says he gained access to the file and more than 1,000 others through a Google contractor who is part of a paid global workforce that reviews audio captured by the assistant from devices including smart speakers, phones and security cameras. One recording contained the couple's address and other information suggesting that they were grandparents.
Most of the recordings reviewed by VRT, including the one involving the Waasmunster couple, were intentional; users asked for weather information or pornographic videos, for example. WIRED reviewed transcripts of the files shared by VRT, which released a report on its findings on Wednesday. According to the broadcaster, in about 150 of the recordings the assistant appears to have activated by mistake after misinterpreting what it heard.
Some of these fragments captured phone calls and private conversations. They include announcements that someone needs the bathroom and discussions of personal matters, including a child's growth rate, wound healing and one person's love life.
Google says that it transcribes a fraction of the audio from the assistant to improve its automated voice processing technology. However, the sensitive data contained in the recordings, and the instances of Google's algorithms listening in unprompted, make some people uncomfortable, including the worker who shared audio with VRT and some privacy experts. Privacy experts believe that Google's practices could violate the European Union's privacy rules, known as GDPR, introduced last year, which provide special protections for sensitive data such as medical information and require transparency about how personal data is collected and processed.
VRT started talking with the Google contractor following a Bloomberg article. That article described how audio from Amazon's Alexa, including unintended recordings, is transcribed by company staff and contractors in Boston, Costa Rica and India. The Google contractor said that he was transcribing about 1,000 clips a week in Dutch and Flemish and that he was concerned about the sensitivity of some recordings. He showed VRT how he logged in to a private version of a Google app called Crowdsource to access the recordings that were assigned to him.
In one case, the contractor reported having transcribed a recording in which a woman seemed to be in distress. "I thought physical violence was involved," he said in the English subtitles of VRT's video report. "It becomes real people you listen to, not just voices." The contractor added that Google has not provided clear guidance on what workers should do in such cases.
In a statement, a Google spokesman said the company had opened an investigation because the contractor had breached its data security rules. According to the statement, Google uses "language experts from around the world" to transcribe audio recordings from the company's assistant, but reviews only about 0.2% of all recordings, which are not associated with user accounts.
Google reviewers may not see account data, but they can still hear very private information, for example about health. Jef Ausloos, a researcher at the Center for Computing and Intellectual Property Law at the University of Leuven, Belgium, told VRT that Google's system may not comply with the GDPR, which requires explicit consent for the collection of health data.
Michael Veale, a technology policy researcher at the London-based Alan Turing Institute, said Google's disclosures did not appear to meet the requirements of the GDPR, even for data considered non-sensitive. The group of national data protection regulators that oversees the application of the GDPR has stated that companies must be transparent about the data they collect and how it is processed. "You have to be very specific about what you are implementing and how," says Veale. "I think Google did not do it because it would look scary."
The Google spokesman said the company would consider how it could better explain to users how their data is used to improve the company's speech technology.
Veale has filed a complaint about Apple's Siri with the Irish data regulator, arguing that the service violates the GDPR because users do not have access to their Siri recordings. He added that Apple had replied that its systems handle the data with enough care that the audio files of his own voice would not be considered personal data. Google and Amazon allow users to review and delete their recordings. Amazon now allows users to say "Alexa delete everything I've said today" to purge their history.
Corrected 19/10/19 at 7 p.m. ET: The Google contractor who spoke on Belgian television said that he was reviewing 1,000 audio clips a week. An earlier version of this article indicated that he was reviewing 1,000 clips per month.