More diagnostics data from desktop (ubuntu.com)
60 points by Santosh83 on Feb 15, 2018 | 73 comments


I see nothing wrong with this as long as the checkbox is prominently presented, preferably on its own "page" of the installer, it can be easily updated post-installation, and the scope of agreed-upon data collection isn't increased for a given installation after agreement.

> Any user can simply opt out by unchecking the box, which triggers one simple POST stating, “diagnostics=false”. There will be a corresponding checkbox in the Privacy panel of GNOME Settings to toggle the state of this.

By POST do they mean a POST to the same web endpoint? Why would opting out require notifying them of your installation? I get that they'd want to track installation counts, but I would expect opting out to be entirely silent. Or do they mean that opting out once you've opted in would notify them of such?
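
For illustration, the opt-out request quoted above might look something like the sketch below. The endpoint URL is a made-up placeholder (the mail doesn't name one), and the request is only constructed here, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: the proposal does not say where the POST would go.
ENDPOINT = "https://metrics.example.invalid/desktop"


def build_opt_out_request(endpoint: str = ENDPOINT) -> Request:
    """Build (but do not send) a POST whose body is 'diagnostics=false'."""
    body = urlencode({"diagnostics": "false"}).encode("ascii")
    return Request(endpoint, data=body, method="POST")


req = build_opt_out_request()
print(req.get_method(), req.full_url, req.data)
```

Even a body this small would reach the server along with a source IP and a timestamp, which is why what the server logs matters as much as what the client sends.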

> And to reiterate, the service which stores this data would never store IP addresses.

Any idea what country / jurisdiction they plan to host this data and service?


The only potentially dangerous step I see is:

> Apport would be configured to automatically send anonymous crash reports without user interruption.

I'm unfamiliar with apport, but crash reports may contain partial memory dumps with information you don't want collected.

Also,

> Popcon would be installed. This will allow us to spot trends in package usage and help us to focus on the packages which are of most value to our users.

Far less worrying, but I can see some people cringing on that one.

Barring those two this is the kind of checkbox presented on install I'd definitely encourage people to leave checked.


I suppose it's to have a "total number of installs" statistic.


They can already get an approximate number by counting downloads of package updates. It's deceptive to send any information if the user opted out.


I doubt it's possible since many systems update from closest mirrors, most of which are not Canonical's servers.


But the point is that's none of their business.


I hope they checked this with their lawyers, otherwise they will be sued out of existence when the EU GDPR comes into effect (it only applies to EU citizens).

Just check the "Consent" section at https://gdprchecklist.io/

Opt-out is not legal anymore... I'm not saying they are not allowed to collect, they can, just not by default. And because they are requesting "Location", they fall under GDPR even in lamest terms.


They don't collect any personal data. The GDPR defines "personal data" with

"‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;"

Nothing in the collected data is "relating to an identified or identifiable natural person", so the checkbox is not even necessary.


> Nothing in the collected data is "relating to an identified or identifiable natural person", so the checkbox is not even necessary.

The data collected could be used to identify a single person.

How many people have the exact same combination of CPU, amount of RAM, location, disk sizes, GPU vendor/model, disk layout and installed applications?

If the data is unique enough, it's PII.
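
A rough way to put a number on "unique enough" is to count how many reports share each full combination of attributes. The sketch below uses made-up records and a field list that mirrors the one above; it is not Canonical's actual schema.

```python
from collections import Counter

# Made-up reports: (CPU, RAM, timezone/location, disk size, GPU).
reports = [
    ("i5-7200U", "8GB", "Europe/Rome", "256GB", "Intel HD 620"),
    ("i5-7200U", "8GB", "Europe/Rome", "256GB", "Intel HD 620"),
    ("i7-8700K", "32GB", "Europe/Rome", "2TB", "NVIDIA GTX 1080"),
    ("Ryzen 7 1700", "16GB", "Asia/Kolkata", "512GB", "AMD RX 580"),
]


def unique_fraction(rows):
    """Fraction of rows whose full attribute combination appears exactly once."""
    counts = Counter(rows)
    singles = sum(1 for row in rows if counts[row] == 1)
    return singles / len(rows)


print(unique_fraction(reports))
```

In this toy data set half of the reports are one of a kind. Adding more columns (disk layout, installed packages) can only keep that fraction the same or push it up.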


> The data collected could be used to identify a single person.

> How many people have the exact same combination of CPU, amount of RAM, location, disk sizes,GPU vendor/model, disk layout and installed applications ?

> If the data is unique enough, it's PII.

You can not connect a unique combination of cpu, gpu, disk size, with me, y0ghur7_xxx, if you don't collect my ip address or my name in any other way, so no, it's not "personal data" as per GDPR.


And if a crash report for Firefox/Iceweasel/Chromium/Brave is sent, that contains a memory dump including your username?

Seems like enough to make the connection, if someone got ahold of all the data.


They wrote in the mail that crash reports are anonymised.


They may well be, the issue is something called re-identification. Even with anonymised data, even when using differential privacy, re-identification can be possible if you're not very careful. The more data you have, the more likely re-identification is.


A unique random cookie is not PII automatically. It is only PII if it can be associated with something like a name (not limited to a name). Anonymous data with unique identities is not under the consent requirements of the GDPR.


It's interesting. On its own it won't let you identify anyone, even though it's unique. But if there is out there any database mapping computer configurations to identities then it does. There doesn't seem to be a reason to believe that such a database does not exist.


What about crash reports that will be sent automatically to Canonical? I hope they would take precautions to scrub personal information from the data before sending them.


I'm unconvinced that what they're collecting is personal information. The closest would probably be location, but that's very coarse and is based on user selection not IP (which isn't stored). I might be wrong though, what data were you thinking could be problematic?


I'm not a lawyer, but my understanding is that anything that identifies a consumer or group of consumers falls under GDPR.

It's the General Data Protection Regulation: it's not only about personal information, but also about what data is collected by companies, who has access to it, and why/how it is processed.


I agree that they are not collecting personal information. They are collecting information about your computer system. According to the ICO definition of personal data, it means "any information relating to an identifiable person who can be directly or indirectly identified in particular by reference to an identifier." (https://ico.org.uk/for-organisations/guide-to-the-general-da...).

Nothing like that is being collected.


> We would like to add a checkbox to the installer, exact wording TBD, but along the lines of “Send diagnostics information to help improve Ubuntu”. This would be checked by default.

This is about a choice presented as a checkbox that is checked by default during installation, not about a feature silently enabled by default you have to then opt out of.


The GDPR asks for a deliberate action to opt-in (“silence, pre-ticked boxes or inactivity” should not constitute consent). So they should make it a very prominent choice to opt-in. No more opt-out in Europe.


> “silence, pre-ticked boxes or inactivity” should not constitute consent

Could you give a reference for that?

I found this with a couple of examples[0] but my understanding so far is that as long as it's a separate isolated step dedicated to that question with clearly worded language explaining the collection, followed by a very explicit "I agree" button or checkbox, then it's fine. The step itself (not the checkbox) constitutes wilful agreement, as it requires action for that dedicated step: it's definitely not "inactivity" nor "silent". Here I assume the button to move to the next step is not simply labeled "next step", nor is it actionable by simply pressing the enter or space key, possibly by mistake.

[0]: https://dma.org.uk/article/gdpr-in-practice-tick-box-consent...


It's right there in the regulation under (32) [0]

I find your workflow reasonable, but it would fall short under 'free'. Negative and positive action are two different things and in this case the wording is quite clear ("clear affirmative act", "pre-ticked box [..] should not").

[0] http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX...


Opt-in by GDPR means that you have to specifically click on a checkbox, button, or URL and say "I consent to the collection of this data (list of things that will be collected) on behalf of Canonical LTD."


Consent is just one way to legally process personal data. If their basis for processing it is something other than consent, an opt-out is them going above their responsibility.


True. However, it's hard to argue diagnostics data for Ubuntu would be performance of contract (art 6.1: b). It could be argued legitimate interest (art 6.1: f), but then the right to object comes into play.


Yes, it definitely wouldn’t be performance of a contract!

I have no insight into their reasoning here, I just wanted to bring this fact up given I see people talking about consent all the time when that is not the only basis allowed under the law.


That's still tracking by default.

A safe bet in this case would be a "double opt-in" approach, in which they have a checkmark during the installer, and then send an interactive notification asking you to verify your decision after the installation. In such a case, tracking would need to be enabled only after the user has clicked "yes" on a notification.

The idea here is that the user has to do something explicitly to enable tracking. Clicking "next" on an installer isn't an explicit user action.
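
As a sketch of that "double opt-in" idea (the class and field names are invented for illustration, not anything from the proposal), sending would be gated on two separate flags:

```python
from dataclasses import dataclass


@dataclass
class TelemetryConsent:
    installer_box_checked: bool = False    # state of the installer checkbox
    confirmed_after_install: bool = False  # set only by an explicit "yes" click

    def confirm(self) -> None:
        """Record the user's explicit 'yes' on the post-install notification."""
        self.confirmed_after_install = True

    @property
    def may_send(self) -> bool:
        # Both steps are required; the installer default alone is never enough.
        return self.installer_box_checked and self.confirmed_after_install


consent = TelemetryConsent(installer_box_checked=True)
print(consent.may_send)  # False: nothing is sent yet
consent.confirm()
print(consent.may_send)  # True: only after the explicit confirmation
```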


I believe GDPR specifically disallows "checked by default" for consent.


The data collected is probably not fit to be classified as personal data. GDPR does not automatically forbid collection of anything involved with a user, otherwise things like page popularity ranking would not be possible. There was a court ruling stating that IP addresses could be considered personal data because an ISP would have a log that associated the IP with a person. As long as they are only collecting things like "what packages are installed", "how powerful is the system", "when did it last update", and anonymize the last section of the IP they should be fine.
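
For what "anonymize the last section of the IP" could mean in practice, here is a minimal sketch that zeroes the last octet of an IPv4 address. This illustrates the suggestion in this comment only; the proposal itself says IP addresses would not be stored at all, and the function name is made up.

```python
import ipaddress


def truncate_ipv4(address: str) -> str:
    """Zero the last octet of an IPv4 address (keep only the /24 prefix)."""
    network = ipaddress.ip_network(f"{address}/24", strict=False)
    return str(network.network_address)


print(truncate_ipv4("203.0.113.77"))
```

Whether a truncated address still counts as personal data is a legal question the code can't answer; it only shows the mechanical step.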


Speaking with an expert on this recently, she is under the impression that individual opt-in boxes for every distinct type of data will be necessary. So a single box will be insufficient.


Oh, and it is most definitely opt in. Ubuntu will not get away with that.


"We would like to add a checkbox to the installer, exact wording TBD, but along the lines of “Send diagnostics information to help improve Ubuntu”. This would be checked by default."

That seems like a transparent way of collecting what appears to be not very personal data.


This is almost nothing compared to what Microsoft with Windows 10 collects. And beyond that, Canonical is honest and presents an opt out, while Microsoft does waaaay more without even telling, and hides all the many opt-outs in loads of hidden setting dialogs.

Still, it's good to keep a sharp eye on Canonical. I have not forgotten about the "lens" sending your local query to 3rd parties like Amazon by default.


Canonical is no saint,

https://www.eff.org/deeplinks/2012/10/privacy-ubuntu-1210-am... (Amazon Ads and Data Leaks in ubuntu 12.10)

https://arstechnica.com/information-technology/2012/12/richa... (Richard Stallman calls Ubuntu “spyware” )

https://www.wired.com/2013/11/fixubuntu/ (Linux Outfit Canonical Launches Campaign to Silence Privacy Critic)


That was shitty but they stopped doing it years ago (in 16.04 specifically), and it was easily disabled when it was there.

Meanwhile Windows 10 just keeps getting more and more invasive, and makes it more and more difficult to turn off. Disabling web search in the start menu requires using either the registry editor or group policy now! And when you search for how to disable things, everything you find is out of date because they keep changing it every 6 months. The trend has been going from actual option in settings menu -> require registry edit or group policy -> system ignores the registry and it becomes completely impossible to disable. I was pretty mad when they did this to the lock screen and it continues to be a mild annoyance to me every day.


Nobody is a saint; not even the saints. But making an effort to do the right thing, no matter their history, should be respected.


I'm surprised they didn't do this already. This is crucial data for them to actually improve the product.

Only thing I'm a bit concerned about is automatically sending crash reports. This highly depends on the programming language the crashing program was written in. It is very much possible for a program to include sensitive or personally identifying information in the stack traces right?


One thing in Ubuntu's favour, they are upfront and clear about their intentions. So long as switching the data collection off actually does switch it off, I am fine with them going ahead with this.

Other OSes have much to learn...


Agreed; telemetry is valuable, and I have no problem with it... provided there is an option to turn it off.


Title is editorialized and should be changed. They are not saying they will do it, they would like to do it and they are asking for feedback.


Well, Apple and Microsoft already have that data about my systems, so jump on board.

And if you don't want Canonical having this data you're probably savvy enough to uncheck the box, or you're already some other flavour.


The article actually asks for input on the collection of diagnostic data.

The title by Santosh83 says instead "Ubuntu will collect data about your system by default, starting with 18.04".


Seems like you can opt out easily, and the information given appears reasonable, so I will quite happily check this box.


It will also enable a program called Popcorn that will track and report the relative frequency of your app usage. That is somewhat personal according to some people at least. And this option toggles Apport to send back crash reports without asking permission each time, which means it could(?) leak sensitive data in memory?


I think you mean popcon? Short for popularity contest[1]. It's been part of Debian forever, not sure when it first showed up but it was part of Woody at least. With Debian you could configure it to use PGP and encrypt the reports you send in. I generally disabled it on work computers but left it on at home. That probably didn't help them as much but I just couldn't see leaving this type of feature on for servers and computers that weren't mine.

[1]https://popcon.debian.org/


Debian's installer requires explicit opt-in consent to install popcon. It's not enabled by default.


Yes, that's true. I do think that that's the better way to do it.

I understand Canonical is trying to make a user oriented system that "Just Works" and collecting this data can certainly help with that, but I do think that this should be Opt In, not Out. If nothing else, to support the precedent that, in general, Opt In is better for privacy than Opt Out.


I've seen popcon in Debian before, tbh I thought it was already in Ubuntu! Some will be happy with it and some won't; as long as it's optional and presented clearly I don't see an issue. Sending crash reports without asking permission... that should probably be independently set from the rest of this opt-out.


This generally sounds reasonable, but I'm somewhat concerned about that part:

> Apport would be configured to automatically send anonymous crash reports without user interruption.

Crash reports could easily contain sensitive and private data (e.g. file names/paths or usernames) and I wouldn't be comfortable sending them to an unknown group of people by default. Canonical would have to convince the user that anonymization is working really reliably for me to activate that feature.

Right now I'm always torn between fear of private data leaking and my will to support them in fixing annoying crashes (nemo crashes almost daily for me).
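
To give an idea of what scrubbing would involve, here is a naive sketch that masks the user name in `/home/<user>` paths. It is an assumption-laden example, not a description of what Apport actually does, and the helper name is made up.

```python
import re

# Matches /home/<user>; other home directory layouts are not handled.
HOME_DIR = re.compile(r"/home/[^/\s]+")


def scrub_home_paths(text: str) -> str:
    """Replace the user name component of /home/<user> paths with a placeholder."""
    return HOME_DIR.sub("/home/<user>", text)


line = "open('/home/alice/Documents/statement.pdf') failed"
print(scrub_home_paths(line))
```

Note that this only hides the login name. A file name that itself contains a real name would pass straight through, which is part of why reliable anonymization is hard to promise.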


> private data (e.g. file names/paths or usernames)

Can you give an example of a path, file or user name that would cause problems to you (or anyone else) if Ubuntu engineers see it when fixing bugs?


PDFs from my bank start with my full name. Submitting a path to such file (or just a name of the file) would identify me automatically.

I'm fine with Canonical knowing me as an "Ubuntu user", I'm not fine with Canonical knowing my full name and that I use Ubuntu.


> /home/elon_musk/cancer_prognosis

> /home/elon_musk/tesla_bankruptcy_docs


~/Pictures/Gay_porn/


How is an Ubuntu engineer seeing that some random computer somewhere has a gay porn folder a problem for you specifically?


Would you mind printing your output of "tree /" so we can take a look?


I never claimed it was. But you seem to be claiming that literally nobody would have a problem with being outed this way.


~/valvedev/hl3


In addition to the GDPR concerns mentioned in sibling posts, this type of collection is likely covered by Article 5.3 of the EU ePrivacy Directive, which requires consent for storing or reading information from end-users' devices (also known as the "cookie rule").[0] The Dutch Data Protection Authority recently applied this rule to Microsoft's collection of telemetry data through Windows 10.[1] Notably, this rule is not limited to personal data; it applies to all "information."

[0] http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:020... [1] https://www.autoriteitpersoonsgegevens.nl/sites/default/file...


Seems sane and reasonable amount of data collection.


Welcome to Microsoft's world :)

I don't mind sharing data for software that I use for free, but this kind of bothers me a lot:

> This would be checked by default

Lol no, I know where you are going with this but please do NOT enable this by default.


Why not? Some people just don't care, and those who don't care won't bother to check the option. So to me it's more logical (in order to have more data) to check the box by default.


GDPR expressly forbids default checked boxes or "silent acceptance", setups which require opt-out to prevent tracking/data collection.

Data collection is OK, but it must be opt-in only.

You're only looking at it from a "more data = better" perspective, instead of a privacy perspective, which is a lot more important.


It has been mentioned already above.

We are talking here about anonymized diagnostic data, not private identifiable information.


Crash reports can contain any data that happens to be in memory, including private identifiable information.

Additionally, "anonymized" data frequently turns out to be identifiable when enough of it is collected.


Same as KozmoNau7's reply: you should opt in to send data rather than opt out. It's the same case as the Windows and .NET telemetry, which people took a very strong stance against. If Canonical wants to do the same, we should at least view them with the same eyes with which we see Microsoft's movements.


Microsoft released products with opt-in tracking.

Canonical sent an email to a public list asking for feedback.

Your comment would stand if Ubuntu 18.04 was already out and this tracking was already in place, but it's not. It's being discussed.


"Any user can simply opt out by unchecking the box"

They should at least do it the other way round and have people opt in instead of opt out. Still I would prefer a distro that does not have such functionality in the first place.

Since Ubuntu did their first "phone home" experiments in 2015, I stayed away from it. And I will keep it that way. It was just too much of a user hostile action. Shuttleworth even defended it. Showed me that his values are not compatible with mine.

Linux Mint has been a very user friendly replacement so far.


Honestly fine with this, mostly given how bad 17.10 is. These folks are basically flying blind — a mess of a release.


I wish GDPR allowed citizens to personally sue companies and take a cut of the winnings like the disability regs here. It would give the regulation teeth from day 1 and send a strong message.


It should really be Opt-In. It just tastes like Microsoft-Behavior.


Please use the original, less inflammatory title: "more diagnostics data from desktop."

Understandably, Canonical wants to collect diagnostic data useful for software development and bug fixing. Other major OS providers like Google, Microsoft, and Apple do this.

I like the fact that Canonical is doing it in an open manner: anyone can opt out by unchecking a box at installation or afterwards on the privacy panel; and they will make aggregate diagnostic data available to the public.


  > So you ask the user during install. Then the data is sent on first
  > boot. At what point can the user inspect the data, given that some of
  > it can't be collected until after installation is finished? It seems
  > like the first opportunity will be after it has been sent, unless you
  > ask the user a second time. So why not just ask them on first boot,
  > when you have already gathered all the data? That way user can inspect
  > the data there and then before deciding how to answer.

  Yes, I think the first opportunity would be after it has been sent.  I'm
  generally against asking more questions on login though, I think it would
  be clunky.

Do I understand it correctly that they want to send the data the first time unconditionally after installation, and only then present the user with the opt-out choice? I really hope I understood it wrong.



