Google’s Messages and Dialer apps for Android devices collect and send data to Google without explicit notice or consent and without offering a way to opt out, which may violate EU privacy law.
According to a research paper, “What data do Google dialer and messaging apps on Android send to Google?” [PDF], by Douglas Leith, professor of computer science at Trinity College Dublin, Google Messages (for text messages) and Google Dialer (for phone calls) send data about user communications to the Google Play Services Clearcut logger service and to Google’s Firebase Analytics service.
“The data sent by Google Messages includes a hash of the message body, allowing the sender and recipient to be linked when exchanging messages,” the paper says. “The data sent by Google Dialer includes the time and duration of the call, which likewise allows the two handsets involved in a call to be linked. Phone numbers are also sent to Google.”
The time and duration of other user interactions with these applications are also transmitted to Google. And Google offers no way to object to this data collection.
Google Messages (com.google.android.apps.messaging) is installed on over a billion Android phones. It is offered by AT&T and T-Mobile on Android phones in the US and comes preinstalled on newer phones from Huawei, Samsung and Xiaomi. Similarly, Google Dialer (aka Phone by Google, com.google.android.dialer) has the same reach.
The preinstalled versions of both apps, the paper says, lack app-specific privacy policies explaining what data is collected, something Google requires of third-party developers. And when a request was made via Google Takeout for the Google account data associated with the apps used for testing, the data Google provided did not include the telemetry data observed.
Both apps currently link to Google’s consumer privacy policy on Google Play, but that policy is not app-specific and is not necessarily obvious to those who get the apps preinstalled.
From the Messages app, Google takes the message content and a timestamp, generates a SHA256 hash (the output of an algorithm that maps the human-readable content to an alphanumeric digest), and then passes part of the hash, specifically a value truncated to 128 bits, to Google’s Clearcut Logger and Firebase Analytics.
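To make the mechanism concrete, here is a minimal Python sketch of a truncated hash of that kind. The function name and, above all, the way the timestamp and message text are joined before hashing are assumptions made for illustration; the article does not describe the exact input layout, and this is not Google’s code.

```python
import hashlib


def truncated_message_hash(message: str, timestamp: int) -> str:
    """Return the first 128 bits of SHA-256 over timestamp + message, as hex."""
    # ASSUMPTION: the timestamp and text are joined as "<timestamp>:<message>"
    # and encoded as UTF-8. The real input format is not given in the article.
    payload = f"{timestamp}:{message}".encode("utf-8")
    digest = hashlib.sha256(payload).digest()  # full digest: 32 bytes (256 bits)
    return digest[:16].hex()  # keep only the first 16 bytes (128 bits)


if __name__ == "__main__":
    # Hypothetical message and timestamp values, for illustration only.
    print(truncated_message_hash("See you at 8", 457000))
```

The same message and timestamp always produce the same 32-character hex value, which is what would let records from two handsets be matched to each other.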
Hashes are designed to be difficult to reverse, but in the case of short messages, Leith said he thinks some of them could be reversed to recover some of the message content.
“Colleagues tell me yes, in principle it will probably be possible,” Leith said in an email to The Register today. “The hash contains an hourly timestamp, so hashes would need to be generated for all combinations of timestamps and target messages and compared to the observed hash for a match – feasible, I think, for short messages given the power of modern computing.”
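The search Leith describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reconstruction of any real attack: the helper names, the candidate list, the hour values, and the hypothetical "<hour>:<message>" input layout are all invented for the example.

```python
import hashlib
from itertools import product
from typing import Iterable, Optional, Tuple


def truncated_hash(message: str, hour: int) -> str:
    # ASSUMPTION: hypothetical "<hour>:<message>" input layout, UTF-8 encoded.
    digest = hashlib.sha256(f"{hour}:{message}".encode("utf-8")).digest()
    return digest[:16].hex()  # first 128 bits of the SHA-256 digest


def guess_message(
    observed: str, candidates: Iterable[str], hours: Iterable[int]
) -> Optional[Tuple[int, str]]:
    """Try every (hour, candidate) pair and return the first one that matches."""
    for hour, message in product(hours, candidates):
        if truncated_hash(message, hour) == observed:
            return hour, message
    return None  # no candidate reproduced the observed hash


if __name__ == "__main__":
    # Hypothetical observed value: a short reply sent in hour 457003.
    observed = truncated_hash("ok", 457003)
    print(guess_message(observed, ["yes", "no", "ok"], range(457000, 457024)))
```

The cost is the number of candidate messages multiplied by the number of candidate hours, which is why the approach is plausible only for short or predictable messages.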
The dialer app also logs incoming and outgoing calls, as well as the time and duration of the call.
As the paper notes, Google Play Services discloses that it collects some data for security and fraud prevention, for maintaining the Google Play Services APIs and core services, and for providing Google services such as syncing, bookmarks, and contacts. However, it does not detail or explain the collection of message content or of callers and call recipients. The paper says few details are given about the data actually collected.
“I was surprised to see that this data was being collected by these Google apps,” Leith said.
Leith shared his findings with Google last November and said he has had several discussions with Google’s director of engineering for Google Messages about proposed changes.
The paper outlines nine recommendations Leith made and six changes Google has made or plans to make to address the concerns it raises. Changes that Google has agreed to:
- Redesign the app onboarding flow so users are notified that they are using a Google app and given a link to Google’s consumer privacy policy.
- Stop Google Messages from collecting the sender’s phone number through the CARRIER_SERVICES log source, the 5 SIM ICCID, and a hash of the sent/received message body.
- Stop logging call events in Firebase Analytics from Google Dialer and Messages.
- Shift the collection of telemetry to use the least durable identifier available whenever possible, rather than tying it to a user’s Android persistent ID.
- Make clear when caller ID and spam protection are enabled and how to disable them, while also looking for ways to use less information, or fuzzed information, for security features.
Google confirmed to The Register on Monday that the paper’s account of its interactions with Leith is accurate. “We welcome partnerships – and feedback – from academics and researchers, including those at Trinity College,” said a Google spokesperson. “We have worked constructively with this team to address their feedback and we will continue to do so.”
The paper raises questions about whether Google’s apps comply with GDPR, but cautions that legal conclusions are beyond the scope of technical analysis. We asked Google whether it believes its apps meet GDPR obligations, but got no response.
Leith said it’s unclear whether Google’s commitments fully address the concerns he’s expressed.
“Specifically, they say they will introduce a toggle in the Messages app to allow users to opt out of data collection, but that this opt-out won’t cover data Google deems ‘essential,’ meaning they will continue to collect certain data even if users opt out,” he said. “In my testing, I had already turned off Google data collection by disabling the Google ‘Usage and diagnostics’ option in the phone settings, and therefore the data I reported was presumably already considered essential by Google. I think we’ll have to wait and see.”
Leith said there are two major issues related to Google Play Services, which is installed on almost all Android phones outside of China.
“The first is that log data sent from Google Play Services is tagged with the Google Android ID, which can often be linked to an individual’s true identity – so the data is not anonymized,” he explained. “Second, we know very little about what data Google Play Services sends, and for what purposes. This study is the first to shed some light, but it’s just the tip of the iceberg.” ®
Updated to add
In a follow-up comment two days after this article was published, a Google spokesperson said the data was collected for diagnostic purposes: