Google is defending its practice of letting human employees, most of whom appear to be contract workers around the world, listen to audio recordings of conversations between users and its Google Assistant software. The response comes after the Belgian public broadcaster VRT NWS detailed how contractors sometimes listen to sensitive audio data that Google Assistant captured by accident.
In a blog post published today, Google says it takes precautions to protect the identity of users and that it has "a number of protections in place to prevent" so-called false accepts, that is, when Google Assistant activates on a device such as a Google Home speaker without the wake word having been intentionally spoken by a user.
The company also has human workers analyze these conversations to help Google's software operate in multiple languages. "This is a critical part of the process of building speech technology, and is necessary to creating products like the Google Assistant," writes David Monsees, a product manager on Google's Search team who authored the blog post.
"We just learned that one of these language reviewers has violated our data security policies by leaking confidential Dutch audio data," Monsees adds, referring to the audio clips the Belgian contractor shared with VRT NWS. "Our Security and Privacy Response teams have been activated on this issue, are investigating, and we will take action. We are conducting a full review of our safeguards in this space to prevent misconduct like this from happening again."
In addition, Google says that only 0.2 percent of all audio snippets are reviewed by language experts. "Audio snippets are not associated with user accounts as part of the review process, and reviewers are directed not to transcribe background conversations or other noises, and to only transcribe snippets that are directed to Google," Monsees adds.
Google adds that it provides users with a variety of tools to review the audio stored by Google Assistant devices, including the ability to manually delete those snippets and to configure automatic deletion timers. "We're always working to improve how we explain our settings and privacy practices to people, and will be reviewing opportunities to further clarify how data is used to improve speech technology," Monsees concludes.
The blog post does not address the extent to which workers around the world review these requests for general natural-language improvements, rather than just to verify that translations are accurate.
It is generally understood in the artificial intelligence industry that human annotators are needed to make sense of raw AI training data, and that such workers are employed by companies like Amazon and Google, where they have access to both audio recordings and text transcriptions of conversations between users and smart home devices. That way, workers can review the exchanges, annotate the data correctly, and log any errors, so that software platforms like Google Assistant and Amazon Alexa can improve over time.
But neither Amazon nor Google has ever been fully transparent about this, which has sparked a number of controversies over the years, and those controversies have only intensified in recent months. Ever since Bloomberg reported in April that Amazon was using a large number of contract workers to train Alexa, the big tech companies in the smart home sector have been forced to reckon publicly with how these AI products and platforms are developed, maintained, and improved over time.
If you want this data deleted, you must jump through many hoops. And in the case of Amazon and Alexa, some of this data is stored indefinitely even after a user has chosen to delete the audio, the company revealed last week. Google's privacy controls appear more robust than Amazon's: Google lets you disable the storage of audio data entirely. But both companies are now contending with a public that is growing more aware of how artificial intelligence software is beta tested and tinkered with in real time, even as it powers devices in our bedrooms, kitchens, and living rooms.
In this case, the Belgian news outlet reported that roughly 150 of the 1,000 Google Assistant recordings provided by the contractor had been captured by accident, without the wake word being spoken. It is disconcerting that the employee in question was able to obtain this data so easily, violating both users' privacy and Google's apparent safeguards. Even more troubling is how the worker says he was able to overhear sensitive events in users' homes, such as a potential threat of physical violence captured by a false accept, when the worker heard a female voice that seemed to be in distress.
It is clear that owning a Google Home or similar Assistant-enabled device, and allowing it to listen to your sensitive daily conversations and spoken internet requests, involves at least some level of privacy tradeoff. Using any Google product does, because the company makes money by collecting that data, storing it, and selling targeted ads against it. But these findings contradict Google's claims that it does everything in its power to protect its users' privacy and that its software does not listen unless the wake word is spoken. Clearly, someone somewhere in the world may in fact be listening. And sometimes it is someone who is not supposed to be.