Google responded this week to a report from Belgian public broadcaster VRT NWS revealing that subcontractors have access to Google Assistant voice recordings, including those containing sensitive information such as addresses, conversations between parents and children, business calls and other kinds of private information. As a result of the report, Google says it is now preparing to investigate and take action against the contractor who leaked this information to the press.
In a blog post, the company said it partners with language experts from around the world, who review and transcribe a "small set of questions" to help Google better understand different languages.
Only about 0.2% of all audio samples are reviewed by language experts. These snippets are not associated with Google Accounts during the review process, the company says. Background conversations and other noises are not supposed to be transcribed.
The broadcaster listened to more than 1,000 recordings and found that 153 of them had been captured accidentally. In other words, it was clear that the user did not intend to ask Google for help. In addition, the report revealed that it was often possible to determine a user's identity, because the recordings themselves revealed personal information. Some recordings contained extremely sensitive information, such as "bedroom conversations", medical inquiries or people in what appeared to be domestic violence situations, to name a few.
Google defended the transcription process as a necessary part of providing voice assistant technologies to its international users.
But instead of focusing on its lack of transparency with consumers about who actually listens to their voice data, Google said it is going after the leaker.
"[Transcription] is an essential part of the speech technology development process, and it is necessary to create products such as Google Assistant," writes David Monsees, Google's Product Manager of Search, in the blog post. "We have just learned that one of these language reviewers violated our data security policies by disclosing confidential Dutch audio data. Our security and privacy teams have been activated on this issue, are investigating and we will take action. We are conducting a comprehensive review of our safeguards in this space to prevent such misconduct from occurring again," he said.
As voice assistant devices become more and more a part of everyday consumer life, there is growing interest in how technology companies handle voice recordings: who is listening, which recordings are being kept and for how long.
This is not a problem that only Google is facing.
Earlier this month, Amazon responded to a US senator's inquiry into how it handles consumers' voice recordings. The inquiry followed a CNET investigation that revealed that Alexa recordings are kept unless users manually delete them, and that some voice transcripts are never deleted. In addition, a Bloomberg report recently revealed that Amazon workers and contractors had access to the recordings, as well as an account number, the user's first name and the serial number of the device.
In addition, a coalition of consumer privacy groups recently filed a complaint with the US Federal Trade Commission alleging that Amazon Alexa violates COPPA (the US Children's Online Privacy Protection Act) by failing to obtain the necessary consent for the company's use of children's data.
Neither Amazon nor Google has done everything in its power to alert consumers to how their voice recordings are used.
This lack of disclosure and transparency may be another sign to US regulators that technology companies are not able to make responsible decisions about consumer data privacy.
The timing of the news is not ideal for Google. According to reports, the US Department of Justice is preparing for a possible antitrust investigation into Google's business practices and is closely monitoring the company's behavior. Given this increased scrutiny, one might think that Google would go over its privacy policies with a fine-tooth comb, especially in areas recently under fire such as its consumer voice data practices, to ensure that consumers understand how their data is stored, shared and used.
Google also noted today that people can choose not to have their audio data stored. Users can either disable audio data storage completely, or choose to have data deleted automatically every three months or every 18 months.
The company also said it would work to better explain how this voice data is used in the future.
"We are still working to improve the way we explain our privacy settings and practices to users, and we will explore opportunities to further refine how data is used to improve speech technology," said Monsees.