I see nothing wrong with this as long as the checkbox is prominently presented, preferably on its own "page" of the installer, it can be easily updated post-installation, and the scope of agreed-upon data collection isn't increased for a given installation after agreement.
> Any user can simply opt out by unchecking the box, which triggers one simple POST stating, “diagnostics=false”. There will be a corresponding checkbox in the Privacy panel of GNOME Settings to toggle the state of this.
By POST do they mean a POST to the same web endpoint? Why would opting out require notifying them of your installation? I get that they'd want to track installation counts, but I would expect opting out to be entirely silent. Or do they mean that opting out once you've opted in would notify them of such?
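For illustration only, here is roughly what a one-off opt-out POST like the one quoted could look like. This is a sketch under assumptions: the endpoint URL and the form encoding are hypothetical, since the proposal names neither, and the request is only built here, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: the proposal does not say where the POST goes.
ENDPOINT = "https://metrics.example.invalid/desktop"


def build_opt_out_request(endpoint: str = ENDPOINT) -> Request:
    """Build (but do not send) a POST whose body is just diagnostics=false."""
    # Assumption: a plain form-encoded body; the real wire format is unknown.
    body = urlencode({"diagnostics": "false"}).encode("ascii")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = build_opt_out_request()
print(req.get_method(), req.full_url, req.data)
```

Even a body this small still reveals that an installation exists (and, at the network level, where it connected from), which is the concern raised above.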
> And to reiterate, the service which stores this data would never store IP addresses.
Any idea what country / jurisdiction they plan to host this data and service?
> Apport would be configured to automatically send anonymous crash reports without user interruption.
I'm unfamiliar with apport, but crash reports may contain partial memory dumps with information you don't want collected.
Also,
> Popcon would be installed. This will allow us to spot trends in package usage and help us to focus on the packages which are of most value to our users.
Far less worrying, but I can see some people cringing on that one.
Barring those two this is the kind of checkbox presented on install I'd definitely encourage people to leave checked.
I hope they checked this with their lawyers, otherwise they will be sued out of existence when the EU GDPR comes into effect (it only applies to EU citizens).
Opt-out is not legal anymore... I'm not saying they are not allowed to collect, they can, just not by default. And because they are requesting "Location", they fall under GDPR even in layman's terms.
They don't collect any personal data. The GDPR defines "personal data" with
"‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;"
Nothing in the collected data is "relating to an identified or identifiable natural person", so the checkbox is not even necessary.
> The data collected could be used to identify a single person.
> How many people have the exact same combination of CPU, amount of RAM, location, disk sizes, GPU vendor/model, disk layout and installed applications?
> If the data is unique enough, it's PII.
You cannot connect a unique combination of CPU, GPU, and disk size with me, y0ghur7_xxx, if you don't collect my IP address or my name in some other way, so no, it's not "personal data" as per GDPR.
They may well be, the issue is something called re-identification. Even with anonymised data, even when using differential privacy, re-identification can be possible if you're not very careful. The more data you have, the more likely re-identification is.
A unique random cookie is not PII automatically. It is only PII if it can be associated with something like a name (not limited to a name). Anonymous data with unique identities is not under the consent requirements of the GDPR.
It's interesting. On its own it won't let you identify anyone, even though it's unique. But if there is out there any database mapping computer configurations to identities then it does. There doesn't seem to be a reason to believe that such a database does not exist.
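The uniqueness question above can be made concrete with a small sketch. The records below are made up for illustration (they are not data from the proposal), and the field choice is an assumption; the point is only to show how one would measure how many reports are the sole instance of their hardware/location combination.

```python
from collections import Counter

# Made-up example reports: (cpu, ram_gb, gpu, disk_gb, timezone).
reports = [
    ("i5-7200U", 8, "Intel HD 620", 256, "Europe/Berlin"),
    ("i5-7200U", 8, "Intel HD 620", 256, "Europe/Berlin"),
    ("Ryzen 7 1700", 32, "GTX 1070", 2000, "Europe/Rome"),
    ("i7-4790K", 16, "RX 480", 500, "America/Chicago"),
    ("Celeron N3050", 2, "Intel HD", 32, "Asia/Tokyo"),
]


def anonymity_set_sizes(records):
    """For each record, count how many records share its exact combination."""
    counts = Counter(records)
    return [counts[r] for r in records]


def unique_fraction(records):
    """Fraction of records whose combination appears exactly once."""
    sizes = anonymity_set_sizes(records)
    return sum(1 for s in sizes if s == 1) / len(sizes)


print(anonymity_set_sizes(reports))
print(unique_fraction(reports))
```

A record with an anonymity set of size 1 is unique in the dataset. Whether that makes it "personal data" still depends on whether anything can link it back to a person, which is exactly the disagreement in this thread.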
What about crash reports that will be sent automatically to Canonical? I hope they would take precautions to scrub personal information from the data before sending them.
I'm unconvinced that what they're collecting is personal information. The closest would probably be location, but that's very coarse and is based on user selection not IP (which isn't stored). I might be wrong though, what data were you thinking could be problematic?
I'm not a lawyer, but my understanding is that anything that identifies a consumer or group of consumers falls under GDPR.
The General Data Protection Regulation is not only about personal information, but also about what data is collected by companies, who has access to it, and why and how it is processed.
I agree that they are not collecting personal information. They are collecting information about your computer system. According to the ICO definition of personal data, it means "any information relating to an identifiable person who can be directly or indirectly identified in particular by reference to an identifier." (https://ico.org.uk/for-organisations/guide-to-the-general-da...).
> We would like to add a checkbox to the installer, exact wording TBD, but along the lines of “Send diagnostics information to help improve Ubuntu”. This would be checked by default.
This is about a choice presented as a checkbox that is checked by default during installation, not about a feature silently enabled by default you have to then opt out of.
The GDPR asks for a deliberate action to opt in (“silence, pre-ticked boxes or inactivity” should not constitute consent). So they should make it a very prominent choice to opt in. No more opt-out in Europe.
> “silence, pre-ticked boxes or inactivity” should not constitute consent
Could you give a reference for that?
I found this with a couple of examples[0] but my understanding so far is that as long as it's a separate isolated step dedicated to that question with clearly worded language explaining the collection, followed by a very explicit "I agree" button or checkbox, then it's fine. The step itself (not the checkbox) constitutes wilful agreement, as it requires action for that dedicated step: it's definitely not "inactivity" nor "silent". Here I assume the button to move to the next step is not simply labeled "next step", nor is it actionable by simply pressing the enter or space key, possibly by mistake.
I find your workflow reasonable, but it would fall short under 'free'. Negative and positive action are two different things and in this case the wording is quite clear ("clear affirmative act", "pre-ticked box [..] should not").
Opt-in under GDPR means that you have to specifically click on a checkbox, button, or URL and say "I consent to the collection of this data (list of things that will be collected) on behalf of Canonical Ltd."
Consent is just one way to legally process personal data. If their basis for processing it is something other than consent, an opt-out is them going above their responsibility.
True. However, it's hard to argue diagnostics data for Ubuntu would be performance of contract (art 6.1: b). It could be argued to be legitimate interest (art 6.1: f), but then the right to object comes into play.
Yes, it definitely wouldn’t be performance of a contract!
I have no insight into their reasoning here, I just wanted to bring this fact up given I see people talking about consent all the time when that is not the only basis allowed under the law.
A safe bet in this case would be a "double opt-in" approach, in which they have a checkmark during the installer, and then send an interactive notification asking you to verify your decision after the installation. In such a case, tracking would need to be enabled only after the user has clicked "yes" on a notification.
The idea here is that the user has to do something explicitly to enable tracking. Clicking "next" on an installer isn't an explicit user action.
The data collected is probably not fit to be classified as personal data. GDPR does not automatically forbid collection of anything involved with a user, otherwise things like page popularity ranking would not be possible. There was a court ruling stating that IP addresses could be considered personal data because an ISP would have a log that associated the IP with a person. As long as they are only collecting things like "what packages are installed", "how powerful is the system", "when did it last update", and anonymize the last section of the IP, they should be fine.
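As a sketch of what "anonymize the last section of the IP" could mean in practice: zero out the host part of the address before it is stored. The prefix lengths here (/24 for IPv4, /48 for IPv6) are assumptions chosen for illustration, not something stated in the proposal, which says IP addresses would not be stored at all.

```python
import ipaddress


def truncate_ip(addr: str) -> str:
    """Zero the trailing bits of an address so it no longer names one host."""
    ip = ipaddress.ip_address(addr)
    # Assumed prefix lengths: keep the first 3 octets (IPv4) or first 48 bits (IPv6).
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


print(truncate_ip("203.0.113.77"))
print(truncate_ip("2001:db8:1234:5678::1"))
```

Truncation reduces precision but does not by itself settle whether the remaining prefix, combined with the other collected fields, could still single someone out.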
Speaking with an expert on this recently, she is under the impression that individual opt-in boxes for every distinct type of data will be necessary. So a single box will be insufficient.
"We would like to add a checkbox to the installer, exact wording TBD, but
along the lines of “Send diagnostics information to help improve Ubuntu”.
This would be checked by default."
That seems like a transparent way of collecting what appears to be not very personal data.
This is almost nothing compared to what Microsoft collects with Windows 10. And beyond that, Canonical is honest and presents an opt-out, while Microsoft does waaaay more without even telling you, and hides the many opt-outs in loads of hidden settings dialogs.
Still, it's good to keep a sharp eye on Canonical. I have not forgotten about the "lens" sending your local queries to third parties like Amazon by default..
That was shitty but they stopped doing it years ago (in 16.04 specifically), and it was easily disabled when it was there.
Meanwhile Windows 10 just keeps getting more and more invasive, and makes it more and more difficult to turn things off. Disabling web search in the start menu requires using either the registry editor or group policy now! And when you search for how to disable things, everything you find is out of date because they keep changing it every 6 months. The trend has been going from actual option in settings menu -> require registry edit or group policy -> system ignores the registry and it becomes completely impossible to disable. I was pretty mad when they did this to the lock screen and it continues to be a mild annoyance to me every day.
I'm surprised they didn't do this already. This is crucial data for them to actually improve the product.
The only thing I'm a bit concerned about is automatically sending crash reports. This highly depends on the programming language the crashing program was written in. It is very much possible for a program to include sensitive or personally identifying information in the stack traces, right?
One thing in Ubuntu's favour, they are upfront and clear about their intentions. So long as switching the data collection off actually does switch it off, I am fine with them going ahead with this.
It will also enable a program called Popcorn that will track and report the relative frequency of your app usage. That is somewhat personal according to some people at least. And this option toggles Apport to send back crash reports without asking permission each time, which means it could(?) leak sensitive data in memory?
I think you mean popcon? Short for popularity contest[1]. It's been part of Debian forever, not sure when it first showed up but it was part of Woody at least. With Debian you could configure it to use PGP and encrypt the reports you send in. I generally disabled it on work computers but left it on at home. That probably didn't help them as much but I just couldn't see leaving this type of feature on for servers and computers that weren't mine.
Yes, that's true. I do think that that's the better way to do it.
I understand Canonical is trying to make a user-oriented system that "Just Works", and collecting this data can certainly help with that, but I do think that this should be Opt In, not Out. If nothing else, to support the precedent that, in general, Opt In is better for privacy than Opt Out.
I've seen popcon in Debian before, tbh I thought it was already in Ubuntu! Some will be happy with it and some won't; as long as it's optional and presented clearly I don't see an issue. Sending crash reports without asking permission... that should probably be set independently from the rest of this opt-out.
This generally sounds reasonable, but I'm somewhat concerned about this part:
> Apport would be configured to automatically send anonymous crash reports without user interruption.
Crash reports could easily contain sensitive and private data (e.g. file names/paths or usernames), and I wouldn't be comfortable sending them to an unknown group of people by default. Canonical would have to convince the user that anonymization works really reliably for me to activate that feature.
Right now I'm always torn between fear of private data leaking and my will to support them in fixing annoying crashes (nemo crashes almost daily for me).
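To show why reliable anonymization is hard, here is a naive scrubbing sketch. It is not how Apport works (nothing here is taken from Apport); the regex, the placeholder, and the sample line are all assumptions made up for illustration.

```python
import re

# Assumption: home directories live under /home/<name>.
HOME_RE = re.compile(r"/home/[^/\s]+")


def scrub(text: str, username: str) -> str:
    """Replace home-directory paths and bare occurrences of the username."""
    text = HOME_RE.sub("/home/<user>", text)
    return re.sub(rf"\b{re.escape(username)}\b", "<user>", text)


# Made-up crash line, not real Apport output.
line = "open('/home/alice/taxes 2017.ods') failed for alice"
print(scrub(line, "alice"))
```

The username is gone, but the file name "taxes 2017.ods" still leaks, and anything held in a memory dump would not be touched by text rules like these at all.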
In addition to the GDPR concerns mentioned in sibling posts, this type of collection is likely covered by Article 5.3 of the EU ePrivacy Directive, which requires consent for storing or reading information from end-users' devices (also known as the "cookie rule").[0] The Dutch Data Protection Authority recently applied this rule to Microsoft's collection of telemetry data through Windows 10.[1] Notably, this rule is not limited to personal data; it applies to all "information."
Why not? Some people just don't care, and those who don't care won't bother to check the option. So to me it's more logical (in order to have more data) to check the box by default.
Same as KozmoNau7's reply: you should opt in to send data rather than opt out. This is the same case as the Windows and .NET telemetry, which people took a very strong stance against. If Canonical wants to do the same, we should at least view them with the same eyes we view Microsoft's moves.
"Any user can simply opt out by unchecking the box"
They should at least do it the other way round and have people opt in instead of opt out. Still I would prefer a distro that does not have such functionality in the first place.
Since Ubuntu did their first "phone home" experiments in 2015, I stayed away from it. And I will keep it that way. It was just too much of a user hostile action. Shuttleworth even defended it. Showed me that his values are not compatible with mine.
Linux Mint has been a very user friendly replacement so far.
I wish GDPR allowed citizens to personally sue companies and take a cut of the winnings, like the disability regs here. It would give the regulation teeth from day 1 and send a strong message.
Please use the original, less inflammatory title: "more diagnostics data from desktop."
Understandably, Canonical wants to collect diagnostic data useful for software development and bug fixing. Other major OS providers like Google, Microsoft, and Apple do this.
I like the fact that Canonical is doing it in an open manner: anyone can opt out by unchecking a box at installation or afterwards on the privacy panel; and they will make aggregate diagnostic data available to the public.
> So you ask the user during install. Then the data is sent on first
> boot. At what point can the user inspect the data, given that some of
> it can't be collected until after installation is finished? It seems
> like the first opportunity will be after it has been sent, unless you
> ask the user a second time. So why not just ask them on first boot,
> when you have already gathered all the data? That way user can inspect
> the data there and then before deciding how to answer.
Yes, I think the first opportunity would be after it has been sent. I'm
generally against asking more questions on login though, I think it would
be clunky.
Do I understand it correctly that they want to send the data the first time unconditionally after installation, and only then present the user with the opt-out choice? I really hope I understood it wrong.