
Wait, copilot operates as some privileged user (that can bypass audit?), not as you (or better, you with some restrictions)

That can’t be right, can it?



As someone else mentioned, the file isn't actually accessed by Copilot; rather, Copilot is reading the pre-indexed contents of the file from a search engine...

Really, Microsoft should be auditing the search that Copilot executes. It's actually a bit misleading to audit the file as accessed when Copilot has only read the indexed content of the file; I don't say I've visited a website when I've found a result of it in Google


Oh, so there's a complete copy (or something that can be reassembled into a copy) completely OUTSIDE of audit controls. That's so much worse. :0


It's roughly the same problem as letting a search engine build indexes (with previews!) of sites without authentication. It's kinda crazy that things were allowed to go this far with such a fundamental flaw.


Yep. Many years ago I worked at one of the top brokerage houses in the United States. They had a phenomenal in-house Google search engine that made it really easy to navigate the whole company and find information.

Then someone discovered production passwords on a site that was supposed to be secured but wasn’t.

Found such things in several places.

The solution was to make searching work only if you opted-in your website.

After that internal search was effectively broken and useless.

All because a few actors did not think about or care about proper authentication and authorization controls.


I'm unclear on what the "flaw" is - isn't this precisely the "feature" that search engines provide to both sides and that site owners put a ton of SEO effort into optimizing?


If you have public documents, you can obviously let a public search engine index them and show previews. All is good.

If you have private documents, you can't let a public search engine index them and show previews. Even if you add an authentication wall that stops normal users from opening a document directly, they could still see part of it in Google's preview.

My explanation sounds silly because surely nobody is that dumb, but this is exactly what they have done. They gave access to ALL documents, both public and private, to an AI, and then got surprised when the AI leaked some private document details. They thought they were safe because users would face an authentication wall if they tried to open a document directly. But that doesn't help if Copilot simply tells you all the secrets in its own words.
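A minimal sketch of the flaw described above, with entirely hypothetical names and data: an index that stores previews of every document, queried either without re-checking ACLs (the broken path) or with a per-user ACL check at query time.

```python
# Hypothetical sketch: a search index holding previews of both public and
# private documents. The unsafe path returns previews with no ACL check;
# the safe path re-checks the document ACL at query time.

index = {
    "q3-report": {"preview": "Revenue up 4%...", "allowed": {"alice"}},
    "press-release": {"preview": "Public launch...", "allowed": {"alice", "bob", "*"}},
}

def search_unsafe(term):
    # Returns previews for every matching document, ignoring ACLs --
    # the "authentication wall" on the document itself never runs.
    return [d["preview"] for name, d in index.items() if term in name]

def search_safe(term, user):
    # Re-checks the ACL before exposing any indexed preview.
    return [
        d["preview"]
        for name, d in index.items()
        if term in name and (user in d["allowed"] or "*" in d["allowed"])
    ]

# bob leaks the private preview through the unsafe path:
assert search_unsafe("report") == ["Revenue up 4%..."]
# ...but not through the safe one:
assert search_safe("report", "bob") == []
assert search_safe("report", "alice") == ["Revenue up 4%..."]
```

The point is that the access decision has to happen at query time, against the caller's identity, not once at indexing time.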


You say that, but it happens — "Experts Exchange", for example, certainly used to try to hide the answers from users who hadn't paid while encouraging search engines to index them.


That's not quite the same. Experts Exchange wanted the content publicly searchable, and explicitly allowed search engines to index it. In this case, many customers probably aren't aware that there is a separate search index that contains much of the data in their private documents that may be searchable and accessible by entities that otherwise shouldn't have access.


That's not necessarily what happened in the article. He wasn't able to access private docs. He was just able to tell Copilot to not send an audit log.


> Really, Microsoft should be auditing the search that Copilot executes. It's actually a bit misleading to audit the file as accessed when Copilot has only read the indexed content of the file; I don't say I've visited a website when I've found a result of it in Google

Not my domain of expertise, but couldn't you at some point argue that the indexed content itself is an auditable file?

It's not necessarily a literal file, but if it contains enough information to be considered sensitive, then where is the significant difference?


Not only could you do that, you should do that.


That makes sense on a technical level, but from a security and compliance perspective, it still doesn't really hold up


AI usage almost by definition needs everything indexed at all times to be useful. Letting one rummage through your stuff without 100% ownership is just madness to begin with, and avoiding deep indexing would make the thing mostly useless, unless regular permission systems were put in place (and then we're kind of back where we were without AIs).


> I don't say I've visited a website when I've found a result of it in Google

I mean, it depends on how large the index window is, because if Google returned the entire webpage content without you ever leaving (AMP moment), you did visit the website. Fine line.


The challenge then is to differentiate between "I wanted to access the secret website/document" and "Google/Copilot gave me the secret website/document, but it was not my intention to access that".


Access is access. Regardless of whether you intended to view the document, you are now aware of its content in either case, and an audit entry must be logged.


Strongly agree. Consider the case of a healthcare application where, in the course of business, staff may search for patients by name. When "Ada Lovelace" appears even briefly in the search-as-you-type results for some "Adam _lastname", has their privacy been compromised? I think so, and the audit log should reflect that.

I'm a fan of FHIR (a healthcare API standard, though far from widely adopted), and it has a secondary set of definitions for audit log patterns (BALP) that recommends this kind of behaviour. https://profiles.ihe.net/ITI/BALP/StructureDefinition-IHE.Ba...

"[Given a query for patients,] When multiple patient results are returned, one AuditEvent is created for every Patient identified in the resulting search set. Note this is true when the search set bundle includes any number of resources that collectively reference multiple Patients."
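A rough sketch of the BALP pattern quoted above, with hypothetical types and data: one audit record per distinct patient surfaced anywhere in the result set, even when the user never opens an individual record.

```python
# Hypothetical sketch of the BALP recommendation: every patient
# identified in a search result set gets its own audit entry, including
# the transient "Ada Lovelace" hit while typing "Adam".

audit_log = []

def record_disclosure(user, patient_ids):
    # One audit entry per distinct patient identified in the result set.
    for pid in sorted(set(patient_ids)):
        audit_log.append({"user": user, "patient": pid, "event": "query-disclosure"})

def search_patients(user, prefix, db):
    results = [p for p in db if p["name"].startswith(prefix)]
    # Merely appearing in the results counts as a disclosure.
    record_disclosure(user, (p["id"] for p in results))
    return results

db = [
    {"id": "p1", "name": "Ada Lovelace"},
    {"id": "p2", "name": "Adam Smith"},
]
hits = search_patients("staff1", "Ada", db)
assert len(hits) == 2       # both names match the prefix "Ada"
assert len(audit_log) == 2  # and both disclosures are logged
```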


What's the solution then? Chain two AIs, where the first is fine-tuned on / has RAG access to your content and tells a second one, which actually produces the output, which files are relevant (and logged)?

Or just a system prompt "log where all the info comes from"...


Someone please confirm my idea (or remedy my ignorance) about this rule of thumb:

Don't train a model on sensitive info, if there will ever be a need for authZ more granular than implied by access to that model. IOW, given a user's ability to interact w/ a model, assume that everything it was trained on is visible to that user.


Sure sounds like, for Microsoft, an audit log is optional when it comes to cramming garbage AI integrations in places they don't belong.


In Windows, if a process has the Backup privilege it can bypass any permissions, and its use is not audited by default because actual backup applications would create too much audit volume. Any process that holds this privilege can use it, but the privilege is disabled by default, so it requires deliberate enablement. It is fairly easy to enable in managed code like C#. The same goes for the Restore privilege.


No need to go that far if any random app can read your entire user directory and everyone just accepts elevation prompts without reading them.


... and? In an audited environment you'd carefully vet how the backups work. That functionality is inside the security boundary so to speak.

I don't believe it's integrated with (any bypass of) auditing but the same "ignore permissions" capability exists on Linux as CAP_DAC_READ_SEARCH and is primarily useful for the same sort of tasks.


So... basically like when Delve was first introduced and was improperly security trimming things it was suggesting and search results.

... Or, a very long time ago, when SharePoint search would display results and synopses for search terms on documents a user couldn't open: they could still see that the document existed and get a matching paragraph or two. The best example of the problem I would give people was users searching for things like "Fall 2025 layoffs": if the document existed, then things were being planned...

Ah Microsoft, security-last is still the thing, eh?


I would say "insecure by default".

I talked to some Microsoft folks around the Windows Server 2025 launch, where they claimed they would be breaking more compatibility in the name of their Secure Future Initiative.

But Server 2025 will load malicious ads on the Edge start screen[1] if you need to access the web interface of an internal thing from your domain controller, and they gleefully announced including winget, a wonderful malware-delivery tool with zero vetting or accountability, in Server 2025.

Their response to both points was that I could disable those if I wanted to. Which I can, but that was definitely not the point. You can make a secure environment based on Microsoft technologies, but it will fight you every step of the way.

[1] As a fun fact, this actually makes Internet Explorer a drastically safer browser than Edge on servers! By default, IE's ESC mode on servers basically refused to load any outside websites.


I've always felt that Microsoft's biggest problem is the way it manages all of the different teams, departments, features, etc. They are completely disconnected and have competing KPIs. I imagine the edge advertising team has a goal to make so much revenue, and the security team has a goal to reduce CVEs, but never the twain shall meet.

Also you probably have to go up 10 levels of management before you reach a common person.


Just because malware authors have used winget doesn't mean package managers are virus-infested by default; it's used to deliver plenty of MS's own tools, you just need to be restrictive (or do you remove apt-get from Debian-descendant distros too?).

100% agreed on the Edge front page showing up on server machines being nasty, though. Server deployments should always have an empty page as the browser default (always a heartburn when you're trying to debug issues with some newly installed webapp and that awful "news" front page pops up).


I really need to emphasize winget is way, way different than a Linux software repository. Debian's repository is carefully maintained and packages have to reach a level of notability for inclusion. Even the Microsoft Store uses overseas reviewers paid by Microsoft to review if store apps meet their guidelines.

winget has none of that. winget is run by one Microsoft dude who, when pressed about reviewing submissions, gave moderator powers to some random, unvetted GitHub users. There are no criteria for inclusion: if you can package it and get it past the automated scanner, it ships. And anyone can submit changes to any winget package. They built a feature to let a developer restrict a package so that only a trusted user can update it, but never implemented it (doing so requires a "business process", and for the one-man sideshow that winget is, setting that up is beyond Microsoft's ability).

winget is a complete joke that no professional could stand for if they understand how amateur hour it is, and the fact it is now baked into every Windows install is absolutely embarrassing. But I bet shipping it got that Microsoft engineer a promotion!


What stands out to me is that winget has the appearance of, and is often perceived as, a package manager, yet it's more of a CLI front end to an index. That index seems to point either to the Windows Store or to a URL for a regular setup file, which it'll run silently (Adobe Acrobat is the example that springs to mind).
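A caricature of that model, with a hypothetical manifest format (not winget's actual schema): the "package manager" boils down to a name-to-URL lookup plus a silent installer launch, with no curation step anywhere.

```python
# Hypothetical caricature of a URL-index "package manager": resolve a
# name to a download URL from a community-editable manifest. A real tool
# would download and execute the installer silently; here we just build
# the command that would run.

manifests = {
    "ExampleApp": {
        "url": "https://example.com/ExampleApp-setup.exe",
        "silent_args": "/quiet /norestart",
    },
}

def resolve(package):
    m = manifests[package]
    # Whoever last edited this manifest entry controls what runs.
    return f'{m["url"]} {m["silent_args"]}'

assert resolve("ExampleApp") == "https://example.com/ExampleApp-setup.exe /quiet /norestart"
```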


Exactly! It's like curl|bash for Windows but where you don't even see the URL.


Is that any different than chocolatey, scoop, or homebrew casks?


homebrew casks have acceptance criteria.


100% agree on the home-page nastiness too.

Also, in Edge the new tab page is loaded from MS servers, even if you disable all the optional stuff. It looks like something local (it doesn't have a visible url) but this is misleading. If you kill your internet connection you get a different, simpler new tab page.

The Edge UI doesn't let you pick a different new tab page but you can change it using group policy.


Servers don't have a desktop GUI, so there is no way to run a browser on a real server installation. That's done specifically to limit the attack surface. This applies to almost all Windows Server roles, except a very few such as ADFS, which Microsoft has been struggling to migrate for decades. It definitely applies to the root of all security, the AD DC.

If you've elected to create a Frankenstein of a domain controller and a desktop/gaming PC and are using it to browse any websites, all consequences are entirely on you.


Hi! It sounds like you are not a systems engineer! Let me help:

When installing Windows Server, there is a "core" experience and a "desktop" experience option. The former is now the default, but nearly all enterprise software not made by Microsoft (and some that is) requires the latter, including many tools that expect to run on domain controllers! Some software claims to require the GUI, but you can trick it into running without one if you're clever and adventurous.

No GUI is definitely the future and the way to go when you can, but even the most aggressive environments with avoiding the GUI end up with a mix of both.

Speaking of a gaming PC: Edge on Windows Server is so badly implemented that I have a server pegged at 100% CPU by a botched install of "Edge Game Mode", a feature that lets you use Edge in an overlay while gaming. I don't think it should have been auto-installed on Windows Server, but I guess those engineers at Microsoft making triple my salary know better!


Windows technicians are only proficient in ClickOps, so, yes. It has a GUI.


Tell that to all that old .NET Framework and other server code relying on various more or less random Windows features to do their jobs in enterprises.


Insecure by default. I remember that at the previous place I worked we used ASP.NET Web Forms. One of the major headaches I had to deal with is that, by default, Microsoft allows all users to view a page. I had to create huge scripts that went through the entire page tree and checked each one's rights (also walking up directories, because of course we also had cascading positive and negative rights), and output the results in the audits we ran automagically each week.

One of the major issues was that we could never properly secure the main page, because of some fuckery. On the main page we'd redirect you to the login if you weren't logged in, but that happened only after you'd already gone through the page-access validation checks, so when I tried to secure that page you wouldn't be redirected. I can't remember how, or even if, I solved this...
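A toy version of that kind of rights audit, with a hypothetical structure (the original was ASP.NET-specific): resolve a page's effective access by walking up the tree to the nearest explicit allow/deny rule, so that a deny at a deeper level overrides an allow inherited from above.

```python
# Hypothetical sketch: cascading positive/negative rights over a page
# tree. Effective access is the first explicit rule found walking from
# the page up toward the root; default is deny (unlike the ASP default
# complained about above).

rules = {
    "/":           {"everyone": "allow"},
    "/admin":      {"everyone": "deny", "admins": "allow"},
    "/admin/logs": {},  # no explicit rule: inherits from /admin
}

def effective_access(path, role):
    while True:
        node = rules.get(path, {})
        if role in node:
            return node[role]
        if "everyone" in node and role != "everyone":
            # an explicit everyone-rule at this level applies to all roles
            return node["everyone"]
        if path == "/":
            return "deny"  # default-deny at the root
        path = path.rsplit("/", 1)[0] or "/"

assert effective_access("/admin/logs", "admins") == "allow"
assert effective_access("/admin/logs", "staff") == "deny"   # deny cascades down
assert effective_access("/", "staff") == "allow"
```

An audit script would then just iterate over every page and role and report the resolved access, instead of trusting each page's local configuration.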



That was a laugh-out-loud moment in that film.


lol. I’ve avoided MS my entire (30+ year) career. Every now and then I’m reminded I made the right choice.


I woke up to MS in 2023[0]. Never again.

[0]: https://www.scottrlarson.com/publications/publication-transi...


Brilliant.


No, it accesses data with the user's privileges.


Are you telling me I, a normal unprivileged user, have a way to read files on windows that bypasses audit logs?


I'm guessing they're making an implicit distinction between access as the user vs. with the privileges of the user.

In the second case, the process has permission to do whatever it wants and elects to restrain itself, which is obviously subject to many more bugs than the first approach.


If there is a product defect? Sure.

The dude found the bug, reported the bug, they fixed the bug.

This isn’t uncommon; there are bugs like this frequently in complex software.


I think you just defined away the entire category of vulnerability known as "privilege escalation".


This isn’t an example of escalation. Copilot is using the user’s token similar to any other OAuth app that needs to act on behalf of the user.


If that is true, then how did it not get logged? The audit should not be under the control of the program making the access.


You're conflating two issues. The Purview search used to get the bad result wasn't clearly described, so it's unclear what system is doing the logging.


If someone (Copilot, in this case) has built a search index that covers all the files on your computer, and left it accessible to your user account... yes


Judging by what I've been seeing in the field in the last half-decade, this doesn't surprise me one bit. Zero forward thinking and comprehensive analysis of features before they are built, with tickets just being churned out by incessant meetings that only end because people get tired. And the devs just finish the tickets without ever asking why a feature is being built or how it actually has to interact with the rest of the system.

Multiply that by years, by changing project managers and endless UX rewrites, a huge push for DEI over merit, junior- and outsourcing-heavy hiring and forced promotions, and you end up with this mess that is "technically" working and correct, but no one can quantify the potential loss, or the real progress that could have been made if actually competent individuals were put in charge.


It's not necessarily that Copilot has superuser access, it's more like the audit system isn't wired tightly enough to catch all the ways Copilot can retrieve data


I can only assume that Microsoft/OpenAI have some sort of backdoor privileges that allows them to view our messages, or at least analyze and process them.

I wouldn't be surprised.


I've disabled Copilot; I don't even find it useful. I think most people who use Copilot have not seen better.


Do you mean the code completions or the agentic chat interface?

The latter is at least sort of usable for me, while the former is an active hindrance in the sense that it delays the appearance of much-more-useful Intellisense completions.

Having said that, even the agentic chat is not really a win for me at work. It lacks ... something that it needs in order to work on our large C++ codebase. Maybe it needs fine-tuning? Maybe it just needs tools that only pull in relevant bits of context (something like the Visual Studio "peek definition" so that it doesn't context-rot itself with 40 thousand lines of C++)? IDK.

For personal projects Claude Code is really good at the C++ type system, although inclined to bail before actually completing the task it's given.

So I feel like there's potential here.

But as you say, stock Copilot is Not It.


Have you met Microsoft?

This is the organization that pushed code-signing as their security posture for a decade.



