Show HN: Neural-hash-collider – Find target hash collisions for NeuralHash (github.com/anishathalye)
623 points by anishathalye on Aug 19, 2021 | hide | past | favorite | 351 comments


The README (https://github.com/anishathalye/neural-hash-collider#how-it-...) explains in a bit more detail how these adversarial attacks work. This code pretty much implements a standard adversarial attack against NeuralHash. One slightly interesting part was replacing the thresholding with a differentiable approximation. I figured I'd share this here in case anyone is interested in seeing what the code to generate adversarial examples looks like; I don't think anyone in the big thread on the topic (https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX/issue...) has shared attack code for NeuralHash in particular yet.
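To illustrate the idea of replacing the thresholding with a differentiable approximation (this is a toy sketch, not the actual collider code): below, a random linear map stands in for the real network, and each bit's hard threshold is relaxed to a logistic loss so gradient descent can push the hash toward a chosen target. All dimensions and names are invented for the example.

```python
import math
import random

random.seed(0)

# Toy stand-in for a perceptual hash: a fixed random linear map whose
# outputs are thresholded at zero to produce hash bits. (NeuralHash uses
# a deep network; this only illustrates the attack structure.)
DIM, BITS = 16, 8
W = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(BITS)]

def logits(x):
    return [sum(w[i] * x[i] for i in range(DIM)) for w in W]

def hard_hash(x):
    return [1 if v >= 0 else 0 for v in logits(x)]

def sigmoid(v):
    v = max(min(v, 30.0), -30.0)  # clamp to avoid overflow
    return 1.0 / (1.0 + math.exp(-v))

def attack(x, target, steps=20000, lr=0.01):
    """Gradient descent on a differentiable surrogate of the hash.

    Each bit contributes a logistic loss log(1 + exp(-s*v)), where s is
    the desired sign (+1/-1) and v the pre-threshold logit; the gradient
    of that loss with respect to v is -s * sigmoid(-s*v).
    """
    x = list(x)
    for _ in range(steps):
        ls = logits(x)
        if [1 if v >= 0 else 0 for v in ls] == target:
            break  # the hard (thresholded) hash already matches
        for j, (v, t) in enumerate(zip(ls, target)):
            s = 2 * t - 1
            g = -s * sigmoid(-s * v)
            for i in range(DIM):
                x[i] -= lr * g * W[j][i]
    return x

start = [0.01 * random.gauss(0, 1) for _ in range(DIM)]
target = [1, 0, 1, 1, 0, 0, 1, 0]
adv = attack(start, target)
print(hard_hash(adv) == target)
```

The real attack additionally penalizes the distance from the original image so the adversarial result still looks like the starting picture; that term is omitted here for brevity.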


Nicely done. Thank you for sharing.

I'd like to share the following paper for anyone else who may be interested. It is about watermarking rather than a preimage attack.

"Adversarial Embedding: A robust and elusive Steganography and Watermarking technique" https://arxiv.org/abs/1912.01487

Unfortunately, the existence of invisible watermarking demonstrates a separate attack on the hash. Instead of a preimage attack, this might be able to change the hash of an image that is suspected of already being a match. A true-positive would be changed into a false-negative.


Well done! Here is my version that uses SciPy's L-BFGS-B optimizer: https://gist.github.com/unrealwill/d64d653a7626b825ef332aa3b...


Ongoing related threads:

Apple defends anti-child abuse imagery tech after claims of ‘hash collisions’ - https://news.ycombinator.com/item?id=28225706 - Aug 2021 (401 comments)

Hash collision in Apple NeuralHash model - https://news.ycombinator.com/item?id=28219068 - Aug 2021 (662 comments)

Convert Apple NeuralHash model for CSAM Detection to ONNX - https://news.ycombinator.com/item?id=28218391 - Aug 2021 (177 comments)

(I just mean related to this particular project. To list the threads related to larger topic would be...too much.)


The integrity of this entire system now relies on the security of the CSAM hash database, which has just dramatically increased in value to potential attackers.

All it would take now is for one CSAM hash to become publicly known; then someone uploads colliding iPhone wallpapers to wallpaper download sites. That many false positives would overload whatever administrative capacity there is to review reports within a matter of days.


There's no need for someone to get the entire CSAM database. If they go on the darknet and find enough images (or hashes) that would trip Apple's system, that would be enough. I'd assume any publicly available image on the darknet would likely also be in the CSAM database.


Exactly. It would be trivial for anyone to compile a list of possible known CSAM hashes. It would be illegal to do so, but only one person has to do it, and then the list of probable-positive hashes can be distributed legally around the web.


A human doesn't need to see the material. One can automate this by crawling websites with known bad material and computing the hash of every image. Surely some of those hashes are of CSAM; we don't know which, but the list is still just as dangerous.


No, there’s another private hash function that also has to match the known CSAM image for an image to be considered a match.

That one can’t be figured out through this technique.


But surely if they're doing this in anticipation of E2EE user data, that procedure becomes moot?

So either they have no intention to actually protect user data, or the system is trivially broken; either way a pretty damning look for Apple.


No, because the second hash is only performed after the first hash has already matched, and so, using the threshold secret sharing crypto, the server has learned the escrowed encryption key for the image. (Well, for its “visual derivative”, at any rate.)


So, Apple retains the ability to decrypt E2EE data? That’s… worse?


Only E2EE data that matches known CSAM hashes, and only when 30 matches have been found, and only the “visual derivative” of those images.
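The "only when 30 matches have been found" behavior comes from the threshold secret-sharing layer. A minimal sketch using textbook Shamir secret sharing over a prime field (Apple's actual construction is a more elaborate private-set-intersection protocol; the prime, threshold, and key below are made up for illustration):

```python
import random

P = 2**61 - 1  # a Mersenne prime; the field for this toy scheme
random.seed(1)

def make_shares(secret, threshold, n):
    # Random polynomial of degree threshold-1 with constant term = secret;
    # each "matching photo" would carry one share in its safety voucher.
    coeffs = [secret] + [random.randrange(P) for _ in range(threshold - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation evaluated at x = 0 recovers the constant term.
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

key = 123456789
shares = make_shares(key, threshold=30, n=100)
print(reconstruct(shares[:30]) == key)  # 30 shares suffice
print(reconstruct(shares[:29]) == key)  # one short of the threshold: no luck
```

With fewer than 30 shares, interpolation yields an unrelated value, so the server learns nothing about the key until the threshold is crossed.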


They couldn't possibly be using the "other perceptual hash algorithm commonly used for this stuff," PhotoDNA, could they?

I mean, hopefully not, but at this point, it's reasonable to call just about everything into question on the topic.


Their goal is to minimize false positives, so that would not be in their best interest.


Which means all it takes now is one disillusioned Apple employee to leak the details of that private hash function, and the whole system is compromised.


One disillusioned employee that has access to the secret algorithm.


Before they make it to human review, photos in decrypted vouchers have to pass the CSAM match against a second classifier that Apple keeps to itself.


To assume CP is reviewed manually is simply wrong. You don't want to put such a weight on an individual. You want to automate it as much as possible, with as few false positives (and false negatives) as possible.

For example, in the case of a wallpaper, let's say it's the Windows XP wallpaper. There's no human skin color in it at all, so you can easily be reasonably sure it isn't CP. You wouldn't need advanced ML for that.

And they can have multiple checksums, just like a tarball or package can have a CRC32, an MD5, and a SHA512. Just because one of these matches doesn't mean the others do. The only problem is keeping these hash DBs secret, but that could very well be a reason the scanning isn't done locally.
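The multiple-checksum point can be sketched with the standard library (these are cryptographic/checksum digests of raw bytes, unlike the perceptual hash under discussion; the data here is made up):

```python
import hashlib
import zlib

def digests(data: bytes) -> dict:
    # Three independent digests of the same bytes, like a release tarball
    # might publish. Colliding one (trivial for CRC32, practical for MD5)
    # does nothing to make the others match.
    return {
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "md5": hashlib.md5(data).hexdigest(),
        "sha512": hashlib.sha512(data).hexdigest(),
    }

def matches(candidate: bytes, expected: dict) -> bool:
    # Only count it as a match if *every* digest agrees.
    return digests(candidate) == expected

original = b"example image bytes"
expected = digests(original)
print(matches(original, expected))
print(matches(b"tampered bytes", expected))
```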


Apple explicitly states that it is reviewed manually.


To assume it’s never reviewed manually is absolutely terrifying.


That is not what I asserted. I asserted that it is only done when other options are exhausted.


A lot has been said about using this as an attack vector by possibly poisoning a victim's iPhone with an image that matches a CSAM hash.

But could this not also be used to circumvent the CSAM scanning by converting images that are in the CSAM database to visually similar images that won't match the hash anymore? That would effectively defeat the CSAM scanning Apple and others are trying to put into place completely and render the system moot.

One could argue that these spoofed images could also be added to the CSAM database, but what if you spoof them to have hashes of extremely common images (like common memes)? Adding memes to the database would render the whole scheme unmanageable, no?

Or am I missing something here?

So we'd end up with a system that:

1. Can't be reliably used to track actual criminal offenders (they'd just be able to hide) without rendering the whole database useless.

2. Can be used to attack anyone by making it look like they have criminal content on their iPhones.


> Or am I missing something here?

Wouldn't it be easier for offenders to avoid Apple products? That requires no special computer expertise and involves no risk on their part.


That would be more effective! But that's not something Apple is trying to solve here (or could ever solve).

They are trying to prevent CSAM images from being stored and distributed using Apple products. If that goal is easily circumvented, the whole motivation for this (anti-)feature becomes invalid.

In a way, this architecture could potentially even make Apple products more attractive to CSAM distributors, since they now have a known way to fly under the radar (something that is arguably harder/riskier on other image-sharing platforms, where the matching happens server-side).

One reasonable strategy Apple could adopt against that is to constantly fine-tune the NeuralHash algorithm, hopefully catching more and more offenders. If that works reasonably well, it might deter criminals from the platform, because an image that flies under the radar now might not fly under the radar in the future.

NB. I'm not trying to say Apple is doing the right thing here, especially since the above arguments put the efficacy of this architecture under scrutiny.


> They are trying to prevent CSAM images from being stored and distributed using Apple products.

If what Apple is aiming for is a more complete version of E2EE on their servers, maybe that's just an unintended consequence of the implementation, and the very reason why they're surprised that this received so much pushback. If Apple wanted to offer encryption for all user files in iCloud and leave no capability to decrypt the files themselves, they'd still need to be able to detect CSAM to protect themselves from liability. In that case, scanning on the device would be the only way to make it work.

If that were the case, I still wouldn't believe that moving the scan to the device fundamentally changes anything. Apple has to conduct a scan regardless, or they'll become a viable option for criminals to store CSAM. But in Apple's view, their implementation would mean they'd likely be the first cloud company that could claim to have zero knowledge of the data on their servers while still satisfying the demands of the law.

Supposing that's the case, maybe what it would demonstrate is that no matter how you slice it, trying to offer a fully encrypted, no-knowledge solution for storing user data is fundamentally incompatible with societal demands.

But since Apple didn't provide such an explanation, we can only guess at their strategy. They could have done a much better job of describing their motivations, instead of hoping that public sentiment would let this pass the way all the other scanning mechanisms actually did in the past.


Allegedly other image sharing services already do CSAM detection after the upload. Switching handset OS removes any image-level defensive capability the abusers may employ, so they might face a higher risk using these other services.


What we're describing at this point is effectively the same as a system of automatically flagging users as potential criminals based on something as manipulable as a filename.


In addition to the attacks, such as converting legit image to be detected as CSAM (false positive) or circumventing detection of the real CSAM image (false negative), which have been widely discussed in HN, I think this can also be used to mount a DOS attack or to censor any images.

It works like this. First, find your target images, which are either widely available (like internet memes, for the DOS attack) or images you want to censor. Then, compute their NeuralHashes. Next, use the hash-collision tool to make real CSAM images have the same NeuralHash as the target images. Finally, report these adversarial CSAM images to the government. The result is that the attacker successfully adds the targeted NeuralHashes to the CSAM database, and people who store these legitimate images will then be flagged.


Really naive question. What's to stop apple from using two distinct and separate visual hashing algorithms? Wouldn't the collision likelihood decrease drastically in that scenario?

Again, really naive but it seems like if you have two distinct multi-dimensional hashes it would be much harder to solve the gradient descent problem.


I'm fairly sure they do, actually. It was in one of the articles earlier today that Apple has a distinct, secret algorithm they perform on suspected CSAM server side after it gets flagged by the client side neural hash. Then only after 30 such images from a single user are identified as CSAM by both algorithms will they be sent to a human reviewer who will confirm their contents. Then, finally, law enforcement will be alerted.

There has been a lot of hyperbole going around and the original premise that this is a breach of privacy is still true, but in my opinion the actual repercussions of attacks and collisions are being grossly exaggerated. One would have to create a collision with known CSAM for both algorithms (one of which is secret) which also overlaps with a legal porn image that could be misconstrued as CSAM by a human reviewer, or at the very least create and distribute hundreds of double collisions to DOS the reviewers.


But relying on an algorithm staying secret is security-by-obscurity 101. You can rely on a cryptographic key staying secret; you can't rely on the design of an algorithm staying secret (I do agree there's a little blurring of these lines with large, trained models, but the gist remains - you can't just hope that nobody sees the structure/weights of your second hash function).


Given that training the same network (with the same structure) will result in different weights and different hashes with high probability, I would argue that the weights actually have all the important properties of a secret key. You would just need to treat them as one operationally, i.e., make sure as few people as possible have access, and use truly random instead of pseudorandom numbers during training.


Why can you rely on a cryptographic key staying secret, but not on a trained model staying secret? Both are pieces of binary data that you keep on your own machines and do not give to others.

They are exactly the same case.


There's a massive difference. In fact, the data describing the trained model is not at all analogous to a cryptographic key.

A cryptographic key is a piece of information which, as long as it remains secret, should be sufficient to protect the confidentiality and integrity of your system. This means that your system should remain secure even if your adversary knows everything else apart from the key, including the details of the algorithm you use, the hardware you have, and even all your previous plaintexts and ciphertexts (inputs and outputs). If the key fails to have this property, your cryptosystem is broken.

The trained model (or the weights of a NN) does not have this property at all. Keeping the model secret does not ensure the confidentiality or integrity of the system. E.g. just knowing some inputs and outputs of the secret model allows you to train your own classifier which behaves similarly enough to let you find perceptual hash collisions. If you treat your model as a cryptosystem, this would be a known-plaintext attack: any system vulnerable to these is considered completely and utterly broken.
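The known-plaintext point can be demonstrated at toy scale. In the sketch below (every dimension and model is invented for the example), a surrogate is trained only on input/output pairs of a "secret" thresholding model, then checked against the secret model on inputs it never saw:

```python
import random

random.seed(2)

DIM, BITS = 8, 4
SECRET_W = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(BITS)]

def black_box_hash(x):
    # The "secret" model: the attacker only observes inputs and outputs.
    return [1 if sum(w[i] * x[i] for i in range(DIM)) >= 0 else 0
            for w in SECRET_W]

def predict(weights, x):
    return [1 if sum(w[i] * x[i] for i in range(DIM)) >= 0 else 0
            for w in weights]

# Query the black box, then fit one perceptron per output bit.
queries = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(2000)]
labels = [black_box_hash(x) for x in queries]

surrogate = [[0.0] * DIM for _ in range(BITS)]
for _ in range(10):  # epochs
    for x, bits in zip(queries, labels):
        for j in range(BITS):
            pred = 1 if sum(surrogate[j][i] * x[i] for i in range(DIM)) >= 0 else 0
            if pred != bits[j]:
                sign = 1 if bits[j] == 1 else -1
                for i in range(DIM):
                    surrogate[j][i] += sign * x[i]

# Agreement with the secret model on fresh inputs it never trained on.
fresh = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(500)]
agree = sum(black_box_hash(x) == predict(surrogate, x) for x in fresh)
print(agree / 500)
```

On this toy, the surrogate tracks the secret model closely on unseen inputs; real black-box attacks on deep models use the same idea with heavier machinery.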

You'd have to keep all of the following secret: the model, all its inputs, all its outputs. If you manage to do that, this might be secure. Might. But probably not. See also Part 2 of my FAQ, which happens to cover this question. [1]

[1] https://news.ycombinator.com/item?id=28232625


> E.g. just knowing some inputs and outputs of the secret model allows you to train your own classifier which behaves similarly enough to let you find perceptual hash collisions.

This seems highly unlikely. You could train a model to find those exact known hashes, but I highly doubt you could get it to accurately find any other unknown hash.

> You'd have to keep all of the following secret: the model, all its inputs, all its outputs.

These are all, in fact, secret.


> This seems highly unlikely. You could train a model to find those exact known hashes, but I highly doubt you could get it to accurately find any other unknown hash.

Your "highly doubt" is baseless. Black box attacks (where you create adversarial examples only using some inputs and outputs, but not the model) on machine learning models are not new. They have been demonstrated countless times [1]. You don't need to know the network at all.

> These are all, in fact, secret.

This is not the case, since regular, unprivileged Apple employees can and will look at the inputs and outputs of the model (the visual derivatives and their hashes). It's also irrelevant.

You insist that there is some kind of analogy between "keeping the model secret" and keeping a "cryptographic key" secret. There is no such analogy. It makes no sense. It is simply not there. Keeping a cryptographic key secret keeps confidentiality and integrity. Keeping your model secret accomplishes neither of these.

[1] https://towardsdatascience.com/adversarial-attacks-in-machin...


> Your "highly doubt" is baseless. Black box attacks (where you create adversarial examples only using some inputs and outputs, but not the model) on machine learning models are not new. They have been demonstrated countless times [1]. You don't need to know the network at all.

This is not a machine learning model as such, though, and is used differently than they are.

> This is not the case, since regular, unprivileged Apple employees can and will look at the inputs and outputs of the model

Can they?


Security by obscurity has protected a lot of things successfully...


In this context, security by obscurity is literally what protects private keys.


> It was in one of the articles earlier today that Apple has a distinct, secret algorithm they perform on suspected CSAM server side

But then they still need to upload the original image to the server, so what was the reason for doing the scanning client-side when they upload it anyway?


>and what was the reason for doing the scanning client-side then when they still upload it?

Probably so that China cannot force Apple to hand over arbitrary images in iCloud (or all images in iCloud). With Apple's design the only images China can get from you are malicious images that people send you. If Apple scanned every image serverside without any clientside scanning, then theoretically China could get all newly-uploaded iCloud images.


They do: https://news.ycombinator.com/item?id=28230029

They also keep the second hash function private.


They do. The system isn't vulnerable to these collision attacks; the people saying it is are just not aware of how the system works.


It's common for two unrelated images to come up as false positives when comparing hashes across different, unrelated perceptual hashing methods.
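For scale: if hash bits behaved like independent fair coins (they don't — a perceptual hash's bits are deliberately correlated with image content, which is exactly where the false positives come from), exact collisions would essentially never occur by chance. A back-of-the-envelope birthday bound, assuming the 96-bit NeuralHash length:

```python
from math import comb

BITS = 96                 # NeuralHash output length
n_images = 1_000_000_000  # a billion images, for scale

# Expected number of exact collisions among n uniformly random hashes
# (birthday bound): roughly C(n, 2) / 2**BITS.
expected = comb(n_images, 2) / 2**BITS
print(expected)  # well under 1e-10: negligible under this uniform model
```

The gap between this negligible figure and observed cross-method false positives is the non-uniformity of real perceptual hashes.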


To me the most interesting findings from this fiasco were:

1. People actually do use generally publicly available services to store and distribute CP (as suggested by the number of reports made by Facebook)

2. A lot of people evidently use iCloud Photo Library to store images of things other than pictures they took themselves. This is not really surprising; I've learned that the answer to "does anybody ever...?" questions is always "yes". It is a bit weird, though, since the photo library is terrible for this use case.


> 2. A lot of people evidently use iCloud Photo Library to store images of things other than pictures they took themselves

Not in the context of CSAM, but this is iOS’s “appliance” user interface coming back to bite it. The iOS photos app doesn’t appear to have a way to show the user only the photos they took.

Apps like Twitter, browsers, chat apps, and screenshots all get to save their photos in the photo library. I believe iOS 15 has a way to filter photos by originating app, but for most users currently, it's hard not to use iCloud Photo Library for photos they didn't take themselves.

Interestingly, users save chat history including images, from apps like iMessage and WhatsApp, on iCloud too. I’m not sure what happens to e2e encryption for backed up data.


I also remember that WhatsApp has a default where it stores any image viewed in a conversation in the library.


Yes, originally WhatsApp used to auto-save images received in chats into the photo library. But they added a setting for that years ago, not sure what the default is now.


My iCloud Photo Library is full of pictures saved from the Web and screenshots.

That is just the way it works on iOS and it is really annoying to have random cute dog pictures saved from Reddit crop up in "Your Furry Friends"-Compilations.


Before Files, I never actually saved any pictures, and I painstakingly, routinely deleted screenshots.

The first is now a bit easier since the share sheet also has "save to files" (only some of the time though, for no apparent reason). The second is a bit easier as there at least is a Screenshot automatic album.

But yes, I see why people do this, but I wish Apple provided a better way to not pollute the iCloud library.


> (as suggested by the amount of reports done by Facebook)

The number of reports is not the number of actual incidents. It could be that Facebook's algorithms are really shitty and have millions of false positives. NCMEC and similar organizations like to brag about the number of reports because Big Number Good.


Even though Apple scanning our images is a horrible privacy practice, I don't get why 𝚜̶𝚘̶ ̶𝚖̶𝚊̶𝚗̶𝚢̶ some people think this is an ineffective idea.

Surely you can easily fabricate innocent images whose NeuralHash matches the database. But how are you going to send them to victims and convince them to save them to their photo library? The moment you send one via WhatsApp, FB will stop you because (they think) it is a problematic image. And even if the image did land, it has to look like some cats and dogs or the receiver will just ignore it. (Even worse, the receiver may report you.) And even if your image does look like cats and dogs, it has to pass another automatic test at the server side that uses another obfuscated, constantly updated algorithm. After that, even more tests if Apple really wants them.

That means your image needs to collide ≥ three times, one open, one obfuscated, and one Turing.

Gmail scans your attachments and most people are cool with it. I highly doubt that Apple has any reason to withdraw this.


It's a good question, and unfortunately you're probably right.

It boils down to this: If you can prevent [some organisation] from potentially destroying civilisation, how much effort would be too much effort, and how much uncertainty is too much uncertainty?

For most, there's a trade off. If someone believes that the technology is sufficient for any country to implement a brutal civil rights destruction campaign, and that this is 50% likely, and all you need to do is upload some harmless images to your icloud to thwart it, why wouldn't you? For example, maybe a certain political party could regain power in a massive way and start doing away with gay rights, locking up anyone with pictures of men kissing each other on their phones. Of course, other countries have other types of extremists that look like they might take over government, or have already taken over governments. In those countries, tools like this are already in place, and this new one could be very powerful.

So if you could upload some innocuous images and save hundreds of lives in 20 or more countries in the world, even if you don't fear for your own country, would you?

Assuming you said yes, the question is, could everyone who agrees give Apple enough false positives, force enough human moderators to have to inspect the images, that the whole scheme becomes financially un-viable on Apple's end?

Of course, the only thing we could do here is slow this "progress" down. Maybe we can use that extra time to share the message that "this kind of technology is not ok" and that being naive about what this tech will be used for is almost guaranteed to kill more people than it saves.

But while so few people are thinking of it in these terms, or if everyone believes the attempt to be futile, it can't work. It's like anything - by the time people realise there's a problem, it's often too late to fix the problem.


> save hundreds of lives, would you?

Sure, why not.

But let me ask some questions, because at this point I am not sure whether people want Apple's system to be robust or jammable. If our fear is that Apple will tune the system to detect pictures of two men kissing, wouldn't an easily jammable system work in our favor, because we could DDoS it, or threaten to, anytime we want?


It's a tricky one, certainly.

There is probably a graph somewhere showing how much effort to fix is too much effort, vs how much bad will in the community this project is inducing, vs how much value this project has.

So no, Apple's system being jammable is a great "booby prize" right up to the point where Apple fixes the algorithms, or the Chinese government starts reporting these false positives as bugs and saying that Apple must fix them before its devices can be sold in China.

And so, one has to assume that if the algorithms can be fixed, they will be fixed. If we can DDoS Apple's human checking capability while it's still young, we might be able to prevent more resource being sunk into it - though I agree, that's unlikely. If we can do all that AND make it clear that this is going to cause nothing but bad will, and if we can get enough governments to regulate against it, then there may be hope.

So your question was, why do people think this software isn't going to work, and the answer is we really hope it isn't going to work and really hope that enough people get onboard with the efforts to subvert it and really hope that the message sinks in to Apple that this was a doomed project that they should abandon for good.

But with enough money and time, there is no problem that can't be overcome. So even if it is a currently jammable child-protection system, in the future it will almost certainly become a robust human-rights-violation system. In the space between those two points there is hope. Slim hope, but hope nonetheless.


I really appreciate this comment, as anytime a new security issue creates a fuss, I feel like I'm the only one wondering what the real attack vector is. I'm genuinely glad people are so thoroughly investigating this new Apple policy, but at the same time I feel like I'm the only one dumb enough not to understand what I should be actually concerned about.


It's all theater. We need to stop talking about it and get trusted security researchers picked at random to be deployed to Apple for an audit. They sign an NDA, they work in a clean room with no way to exfiltrate data, they get full access to all the algorithms, source code, trained networks, all test data, access to Apple's infra to test as they please, etc. And then they need to be the definitive authority on this matter to give us info and suggestions. Not Apple, not PR departments, not the EU parliament, not the US gov. And most certainly not us.


Almost nobody is arguing the effectiveness of the idea. That would be missing the point entirely.


The first comment the algorithm chose to show me contains the substring "and this system is transparently broken". Isn't that a remark on the effectiveness of the idea, or did I misread?

(In case you have a problem with me saying "so many", that I can fix.)


> I don't get why 𝚜̶𝚘̶ ̶𝚖̶𝚊̶𝚗̶𝚢̶ some people think this is an ineffective idea.

Because such architectural flaws become absolute train wrecks when scaled. Remember the Clipper Chip? This is like that: cryptographers pointing out fundamental flaws that may seem like minor issues to most of the users who were going to be compelled to use it - but at scale those flaws result in the direct opposite of the stated objectives.

It feels weird having to explain scalability on HN... everyone here should know that if your little scheme is struggling pre-rollout then trying to power through will only magnify your troubles. So it is hard to account for that blind spot that defenders of this thing seem to have.


Unless Apple implements this with a backdoor on iCloud, the worst case scenario here is that they receive millions of false positives per day and terminate the program after two weeks.

The scalability issue seems to work in our favor because, perhaps, the normal usage will overwhelm the human reviewers Apple prepared and we don't even need to send troll images.

And for the entire time, our data remains untouched.


So Apple customers go from paying for a status symbol luxury item to paying for the privilege of participating in this, at best, pointless exercise wherein they rely upon the goodwill and competence of Apple's employees to not get them swatted. The mental gymnastics needed in order to guard one's ego on this issue can't be healthy either.


Some people sync iMessages. That might even be the default.


I believe you're being downvoted because this is thoroughly covered in the thread. I suggest you read it all again.


I did my best skimming comments the algorithm showed me, including related posts' comments. But man am I bad at reading comprehension.

If I miss anything then surely I am willing to be corrected. But so far I don't see comments that show us how to penetrate the four-layer system (local hash check, semantic check by user, on-server hash check, and human reviewer).


Ah ok! Here's the relevant part of the thread for that. https://news.ycombinator.com/item?id=28229832


That NSFW picture is, in honor of Sean Lock, a challenging wank


So, what does Apple get out of all this, except negative attention, erosion of their image, possible privacy lawsuits, etc?

I just don't understand what Apple's motivation would have been here. Surely this fallout could have been anticipated?


Is it like other stories I hear on HN where one guy or team is trying to get a promotion so keeps pushing their project? And people were afraid to oppose it? I’m baffled how this kept going up to implementation even in the birthplace of the reality distortion field.


Something this major, company brand changing, would have required a lot of executive sign-offs, presumably even Cook's personal blessing.


Cynical speculation:

Apple have decided their position of not being able to provide access to law enforcement is becoming a liability. They're probably under intense pressure from several governments on that front.

This is a way to intentionally let their hand be forced into scanning for arbitrary hashes on devices at the behest of governments, taking pressure off Apple and easing their relations with governments. They take a PR hit now, but it's not too bad since it's ostensibly about fighting child abuse, and Apple's heart is clearly in the right place. When later, inevitably, the hashes start to include other material, Apple can say their hands are tied on the matter - they can no longer use the "can't do it" defense and are forced to comply. This is much simpler than having to fight about it all the time.


My guess is that internally, they've realized they have a big CP problem on iCloud. That's a huge liability.


I have serious doubts about that. CSAM really isn't an issue in the US, culturally and legally.


Really? It isn't an issue?


It is?


It gets the FBI off their back about not doing enough to stop the spread of CP.


I find it hard to believe CSAM was so pervasive on iDevices that they'd feel compelled to do something about it.

As far as we know (and I'm sure lots of eyeballs are looking now) Android doesn't do this.

And frankly, why would Apple care that the FBI isn't cozy with them? Their entire brand is "security and privacy", which kind of goes against most three-letter agencies anyway.


At least according to [1]:

"Last year, for instance, Apple reported 265 cases to the National Center for Missing & Exploited Children, while Facebook reported 20.3 million, according to the center’s statistics. That enormous gap is due in part to Apple’s decision not to scan for such material, citing the privacy of its users."

If you were a law enforcement agency and noticed this discrepancy, would you believe that you'd be letting some number of child abusers get away because of that difference in 20 million reports? iCloud probably doesn't have the same level of adoption as Facebook, but the gap is still very large.

[1] https://www.nytimes.com/2021/08/05/technology/apple-iphones-...


I agree CSAM isn't likely to be pervasive in the photo libraries on iOS devices.

Android does not do on-device scanning, but Google does scan photos after they are uploaded to their cloud photo service. It's not on-device scanning, but the effect is functionally identical: photos that are being uploaded to the cloud are being scanned for CSAM. The only real distinction is who owns the CPU which computes the hash.

I doubt it's the FBI pressuring Apple. My suspicion is it's fear of the US Congress passing worse, even more privacy-invading laws under the guise of combating CSAM. If Apple's lobbyists can show that iPhones are already searching for CSAM, arguments for such laws get weaker.


> Android doesn't do on-device scanning, but Google does scan photos after they are uploaded to their cloud photo service.

So did Apple, and pretty much all cloud hosting providers.

This on-device scanning is what's new, and very out of character for Apple.

> If Apple's lobbyists can show that iPhones are already searching for CSAM, arguments for such laws get weaker.

I'm not aware of any big anti-CSAM push being made by Congress. CSAM just isn't really a big issue in the US, the existing laws, and culture, are pretty effective already.


CSAM can never be a policy issue with two sides, because everyone is in agreement that we need to protect children. The higher powers want to prevent child abuse, and CSAM is directly tied to child abuse. When people argue that "think of the children" can be weaponized to attack their freedoms, they wouldn't dare try to argue against the premise that children are harmed because of CSAM - not because the arguments will fall on the deaf ears of some governmental agents trying to push an agenda, but because the premise itself is sound.

As a result, people will focus their arguments instead on the technological flaws in the current implementation of on-device scanning or slippery slope arguments that are unlikely to become reality, the feature will be added anyway with no political opposition, and in the end Apple and/or the government will get what they want, for what they consider the greater good.

I think that absolute privacy in society as a whole isn't attainable with those values in place, and it raises many questions regarding to what extent the Internet should remain free from moderation. Are there really no kinds of information that are so fundamentally damaging that they should not be allowed to exist on someone's hard drive? If not, who will be in control of moderating that information? Maybe we will have to accept that some tradeoffs between privacy and stability need to be made for the collective good, in limited circumstances.


There is a lower limit to privacy (as a human right) – below which societies would cease to be "free" (liberal democracies?). But that's not a discussion people seem to want to have when talking about their good intentions of fighting against horrible things.


> I'm not aware of any big anti-CSAM push being made by Congress.

Right now. The best time for Apple to do this is when it cannot be painted as a defensive move against any specific legislation. The CSAM argument has been used many times in the past, and it's certain to be used many more times in the future.


Apple did not scan uploaded images, and Apple has never scanned iPhone images. Last year Apple reported 245 cases to the National Center for Missing & Exploited Children, while Facebook reported ~50M and Google ~4M; Microsoft reported as well, and Apple was at the bottom of the list.

https://www.nytimes.com/2020/02/07/us/online-child-sexual-ab...


We do know that Apple has been scanning email attachments sent via iCloud email. I don't think it's ever been claimed that Apple has ever scanned anyone's iCloud Photo Library.

Ethics aside, on-device scanning has the benefit of Constitutional protection, at least in the USA. Because the searching is being performed on private property, any attempt by the Government to try to expand the scope of searches would be a clear-cut 4th Amendment violation.

(Whereas if the scanning is done in the cloud, Government can compel searches and that would fall under the "third party doctrine" which is an end-run around the 4th Amendment.)


> Because the searching is being performed on private property,

Is it though? A device someone bought two years ago suddenly starts reporting its owner to the FBI's anti-CSAM unit without the owner's realistic consent, which does seem like an end-run around unpermissioned government searches. It's not reasonable to say "throw away your $1200 device if you don't consent", is it? Nor can a person reasonably avoid iOS updates that force this feature to be active.

> any attempt by the Government to try to expand the scope of searches

We've seen private companies willfully censor individuals at the government's behest under the current administration - will Apple begin expanding the search and reporting mechanisms just to stay in whatever administration's good grace?

Like I said, this is extremely out of character, and very off-brand for Apple. Why would someone trust Apple going forward? Even Google's Android doesn't snitch on its owners to law enforcement... Setting aside all the ways for nefarious actors to abuse this system and sic LE on innocent individuals.


> Is it though?

Yes. Your phone is your private property, just like your house or your car. Searching your private property requires a warrant or reasonable suspicion, otherwise it's a 4th Amendment violation.

This twitter thread is worth a read.

https://twitter.com/pwnallthethings/status/14248736290037022...


The 4th amendment only protects you from the government searching your property. Otherwise, the Microsoft telemetry which reports back to Microsoft what software you have installed and what apps you are running would be illegal.


So, what does a person do if they do not consent to this search? Tough?

You can't realistically avoid the iOS update. Apple has effectively given consent on your behalf... How will that fly?


If you do not consent to having your photos scanned for CSAM, turn off iCloud Photo Library. Same as how you opt out of CSAM scanning of your photo library on Android.

If you're concerned about other forms of scanning compelled by the Government, you never consented to the search. So even if Apple complied, the search is invalid and cannot be used to prosecute you.


> If you're concerned about other forms of scanning, you didn't consent—so even if Apple complied, the search is invalid and cannot be used to prosecute you.

This is a dangerously false understanding of the law. Stop giving legal advice. You are not a lawyer.


Are you saying that if the US Government compelled Apple to scan millions of citizens' private property for non-CSAM images, this would not be a clear-cut violation of the Fourth Amendment?

I'm curious, do you think that the Third Party Doctrine applies here?


I think the realistic danger here is that the US Government no longer needs to compel this type of activity. Reference Twitter and Facebook/Instagram voluntarily censoring content at the mere suggestion of the current administration/party in power.


Let's be practical for a minute. What specific image would Apple voluntarily search for on behalf of the US Government? I sincerely can't think of anything.


Images, leaked government files, anti-administration phrases, unflattering memes of the president, statements that contradict the government's current stance, etc.

All things current social media companies seem willing to censor at the suggestion of the administration.


You seriously think Apple would voluntarily search private devices for images which aren't illegal and don't even hint at any action which is illegal?

I don't think you're being serious.


Why not? Facebook and Twitter have done exactly that in the past year. Why is it far-fetched for Apple, given this astonishing reversal of course on branding?

The only realistic alternative to Apple is Android... And Google is pretty darn transparent in their spying on users. Apple just did a 180 degree about-face on all the branding they've built over the last decade. Why should anyone trust Apple again?

Look, this whole neural-hash thing took what, 2 weeks for people to fabricate collisions? This just illustrated how poorly conceived and ill-thought the entire plan was from Apple. It's not beyond reason to assume any of these things given the evidence we currently have.


It does not appear one can opt certain folders out of this scan. If you enable iCloud backups, it scans the entire shebang.

As previously mentioned, Android doesn't scan all photos on your device... Google scans content uploaded to their servers. Which is reasonable... It's their servers, they can host what they want. Your iPhone is your iPhone.


> If you enable iCloud backups, it scans the entire shebang.

Citation?


Do I need one? Where in iOS can you choose which folders to opt-into CSAM scanning? I only see an all-or-nothing option for iCloud photos.


Yes, you do need a citation, because I've not heard Apple (or anyone else) claim that iCloud Backups or iCloud Drive are being subject to CSAM scans.

From everything I've read, from Apple and other sources, if the photo is about to be uploaded to iCloud Photo Library then it is scanned for CSAM. If it's not, it isn't.


How does one choose individual photos to not upload?


You store the photos you want to keep private in another app. I'm sure there are lots in the App Store.

Still waiting on that citation.


If the default behavior is not to exclude photo rolls from this new feature, I'm not sure where the argument exists. Telling iOS users they should download some app to keep photos private is absurd.


If a photo is about to be uploaded to iCloud Photo Library then it is scanned for CSAM. If it's not, it isn't.

Still waiting on that citation.


Are we arguing the same thing? How does one opt-out a specific photo? It's not possible as far as I know.


I've no idea what your point is. I've tried offering answers for all these random questions, but I'm still waiting for you to offer a citation for the claim you made earlier.


> I agree CSAM isn't likely to be pervasive in the photo libraries on iOS devices.

Where does this assumption come from? Because of iOS's lower market share? Are you implying they are more prevalent on Android devices? On desktop computers? I don't understand the logic.


The assumption comes from a general observation that most normal people don't tend to use their photo library to store legal porn (other than home made) and I haven't seen any argument for why CSAM aficionados are expected to be any less careful. There are surely plenty of apps out there for keeping separately encrypted vaults of files/photos, and I'm sure many are very easy to use.

You don't have to be particularly tech savvy to know it's a bad idea to co-mingle your deepest darkest secrets alongside photos of your mum and last night's dinner. Especially when discovering those secrets would lead to estrangement, or prison.

As for the few who might be doing it currently, that's likely to plummet quickly. If you think Apple's move caused waves in the Hacker News crowd, just imagine how much it has blown up in the CSAM community right now. I dare say it's probably all they've been talking about for the past two weeks.


Yeah, they’re probably all saying “well I know for sure I’ll never use an Apple device for my CP from now on!” From Apple’s point of view, that’s mission accomplished.


It is more believable they are introducing this tech for larger international security reasons still kept under wraps.


Possible avoidance of being told how they have to do it later.

They will have to do it either way, and the fact that they are even telling us how they plan to do it is more than we can say for every other cloud service.

This is better than all the alternatives at this point. Like it or not. If you don't like it, you might need to get up to speed on what the other services you may already be using are doing.


Government coercion.


They had to have been coerced.


Apple is scanning files locally before they are uploaded to iCloud, in order to avoid having to store unencrypted photos within iCloud while still detecting CSAM. All the other storage providers already scan all the images uploaded to their servers. I guess you can decide which is better. Here is Google's report on it:

https://transparencyreport.google.com/child-sexual-abuse-mat...


> in order to avoid storing unencrypted photos within iCloud

To be clear, Apple does not utilize E2E encryption in iCloud. They can (and already do) scan iCloud contents.


Apple has said this is not the final version of the hashing algorithm they will be using: https://www.vice.com/en/article/wx5yzq/apple-defends-its-ant...


Does it matter? Unless they're going to totally change the technology I don't see how they can do anything but buy time until it's reverse engineered. After all, the code runs locally.

If Apple wants to defend this they should try to explain how the system will work even if generating adversarial images is trivial.


Apple has outlined[1] multiple levels of protection in place for this:

1. You have to reach a threshold of matches before your account is flagged.

2. Once the threshold is reached, the matched images are checked against a different perceptual hash algorithm on Apple servers. This means an adversarial image would have to trigger a collision on two distinct hashing algorithms.

3. If both hash algorithms show a match, then “visual derivative” (low-res versions) of the images are inspected by Apple to confirm they are CSAM.

Only after these three criteria are met is your account disabled and referred to NCMEC. NCMEC will then do their own review of the flagged images and refer to law enforcement if necessary.

[1]: https://www.apple.com/child-safety/pdf/Security_Threat_Model...
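The three stages above can be sketched in code. This is a hypothetical illustration only: the function names, the `Match` type, and the dict-based "hash database" are all made up, and Apple's real pipeline uses threshold secret sharing and a private server-side perceptual hash, neither of which is public. (The threshold of 30 is the figure discussed elsewhere in this thread.)

```python
from dataclasses import dataclass

THRESHOLD = 30  # illustrative; Apple has only described "a threshold of matches"

@dataclass
class Match:
    derivative: bytes  # the low-res "visual derivative" carried in the voucher

def review_account(matches, server_side_hash, known_hashes, reviewer_says_csam):
    # 1. Nothing happens until the account crosses the match threshold.
    if len(matches) < THRESHOLD:
        return "no action"
    # 2. Re-check every match with a second, server-only perceptual hash,
    #    so an adversarial image must fool two distinct algorithms.
    confirmed = [m for m in matches
                 if server_side_hash(m.derivative) in known_hashes]
    if not confirmed:
        return "no action"
    # 3. Human reviewers inspect the visual derivatives of confirmed matches.
    if any(reviewer_says_csam(m.derivative) for m in confirmed):
        return "refer to NCMEC"
    return "no action"
```

Each stage short-circuits: a collision against only the on-device hash dies at step 2, and a step-2 collision that doesn't look like CSAM dies at step 3, at least in theory.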


I do want to note that decrypting the low-res images would have to happen before step 2.


Doesn't disabling the account kind also defeat the whole purpose?

I mean assuming the purpose is to catch child abusers and not merely to use this particular boogeyman to introduce a back door for later use.


Will the high-resolution images be collected and used as evidence? Or just the visual derivatives? That's not clear.


Currently, most likely.

I don’t believe Apple has said whether or not they send them in their initial referral to NCMEC, but law enforcement could easily get a warrant for them. iCloud Photos are encrypted at rest, but Apple has the keys.

(Many have speculated that this CSAM local scanning feature is a precursor to Apple introducing full end-to-end encryption for all of iCloud. We’ll see.)


NeuralHash collisions are interesting, but the way Apple is implementing their scanner it's impossible to extract the banned hashes directly from the local database.

There are other ways to guess what the hashes are, but I can't think of legal ones.

> Matching-Database Setup. The system begins by setting up the matching database using the known CSAM image hashes provided by NCMEC and other child-safety organizations. First, Apple receives the NeuralHashes corresponding to known CSAM from the above child-safety organizations. Next, these NeuralHashes go through a series of transformations that includes a final blinding step, powered by elliptic curve cryptography. The blinding is done using a server-side blinding secret, known only to Apple. The blinded CSAM hashes are placed in a hash table, where the position in the hash table is purely a function of the NeuralHash of the CSAM image. This blinded database is securely stored on users’ devices. The properties of elliptic curve cryptography ensure that no device can infer anything about the underlying CSAM image hashes from the blinded database.

https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...
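A toy sketch of that blinding step, with modular exponentiation standing in for elliptic-curve scalar multiplication purely for illustration. The group, the 96-bit hash stand-in, and all names are made up; the point is only the property the quoted paragraph describes: the published table depends on a server-side secret, so a device holding the table cannot recover the underlying hashes.

```python
import hashlib
import secrets

P = 2**127 - 1  # a Mersenne prime standing in for the elliptic-curve group

def neural_hash(image_bytes):
    # stand-in for NeuralHash: take 96 bits of a cryptographic digest
    return int.from_bytes(hashlib.sha256(image_bytes).digest()[:12], "big")

class Server:
    def __init__(self, csam_hashes):
        self._secret = secrets.randbelow(P - 3) + 2  # server-side blinding secret
        # the distributable database contains only blinded values
        self.blinded_db = {self.blind(h) for h in csam_hashes}

    def blind(self, h):
        # deterministic one-way transform keyed by the secret exponent;
        # recovering h from blind(h) would require solving a discrete log
        return pow(h % P or 2, self._secret, P)

server = Server({neural_hash(b"known-bad-1"), neural_hash(b"known-bad-2")})
```

Since only the server can evaluate `blind`, the device can compute a NeuralHash locally but cannot test it against the table itself, which is why the actual protocol layers a private-set-intersection scheme on top.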


You can extract the hashes with a few hours spent on the darknet. Doing that is certainly illegal and not to mention VERY morally wrong, but criminals exist and criminals won't hesitate to abuse this as a mechanism for framing, extortion, or ransom.

It's also possible for someone (Attacker A) to go on the darknet and get a list of 96-bit neural hashes, and then publish or sell this list somewhere to another party, Attacker B. The second party would never have to interact with CSAM.

Imagine Ransomware v2: We have inserted 29 photos of CSAM-matching material into your photo library. Pay X monero to this address in 30 minutes, or we will insert 2 additional photos, which will cross the threshold and may result in serious and life-changing consequences to you[1].

The difference here (versus the status quo) is that an easily-broken perceptual hash enables the attacker to never send or possess any CSAM images[2]. From my experience of being a victim of various hackers, I know a lot of them won't touch CSAM because they know it's wrong, but they'll salivate at an opportunity to weaponise automated CSAM scanning.

[1]: If you think Apple's human review will mitigate this attack, you can permute legal pornography to match CSAM signatures. If Apple's reviewers see 30 CSAM matches and the visual derivatives look like porn, they will be legally required to report it to NCMEC (a statutory quasi-government agency staffed by the FBI), even if all the photos are actually of consenting adults.

[2]: If you never possess nor touch CSAM, it might be harder for you to be charged with CP offenses. You might be looking at CFAA, blackmail, or extortion charges, while your victim faces child pornography charges. This is basically an "amplification attack" on the real-world judicial system.


One of the grosser clean room designs.

It's certainly possible, but I posit that the exploit chain necessary to get the capability to inject photos onto an arbitrary user's iPhone is valuable enough that it's more likely to be used for spying by repressive regimes than straight up blackmail-- and if you had such a capability, why bother with hash-colliding permutations of legal pornography? Why not plant CSAM directly onto the user's device?

Nearly all cloud storage services implement a scanner like this, and permit the same level of blackmail with a simpler attack chain, such as phishing Dropbox credentials to inject illegal material.

I think the more interesting attacks are governments colluding to add politically motivated non-CSAM material to the lists and then requiring Apple allow them to perform the human review to discover dissidents.


If you plant CSAM, you must possess, distribute, and transmit CSAM. That creates moral and legal barriers. The median ransomware actor probably finds CSAM repulsive and wrong.

If you plant material that merely matches CSAM hashes, you do none of that. The median ransomware actor might find this to be the fastest way to collect a thousand monero.

Also, you can distribute 30 media items per message via WhatsApp. There is a configurable setting for WhatsApp to save all received photos to your iCloud photo library. No exploits needed; you could probably weaponise this via a WhatsApp bot.


> If Apple's reviewers see 30 CSAM matches and the visual derivatives look like porn

Even worse, just get a "teen" porn screengrab, pass it through the collider and you have pretty much a smoking gun


The "visual derivative" is not something any of us have been shown an example of either. Whatever it is, I suspect you only need to be vaguely in the same ballpark (I would wager humanoid shaped skin tones maybe).

So I suspect it would be easier than that (particularly since this whole hashing scheme has been surrounded by a lot of clear garbage: "1 in a trillion" -> on-demand collisions in a couple of weeks?).


I think visual derivative is just a beating-around-the-bush way of saying “thumbnail.”


Those are their words, which they feel no need to elaborate on. Obviously they seem to just be doing the "technically the truth" thing - which shows that someone realized no one would like hearing what it actually is.


Yeah, it was very pointedly, awkwardly worded. It’s intended for human reviewers to distinguish a false positive from a real positive. An eigenvector-mapped image isn’t going to do that, a heavily Gaussian-blurred image isn’t going to do that; it needs to be something that a minimum-wage person who’s only been trained for a day or two can distinguish as “CSAM” or “not CSAM”, and that means it’s a thumbnail of sorts.


This attack does not work, as Apple uses two hashing algorithms, one on the device which is now public, and one which is secret. You would have to collide both, which would be hard enough if you knew what the second one was, which you don't.


To defeat this, all you need to be is a state actor with a database of child porn at your disposal (which is stored for exactly the purpose of training detection systems). Then you run the hashing algorithm against images you know are in the database (Apple suggested that they would accept suggestions by some kind of multi-Country vote). Then you can pull out the hashes and figure out how to trigger false positives on the important vectors.

Next, embed your images in sites of interest, like:

* A meme in some group

* A document or 'leak'

* An email to a journalist

Wait for somebody to save it to their Apple device. Wait for it to be flagged, and then use that as 'reasonable means to conduct a search'. When asking for a warrant, the agency would say something like "we detected possible CSAM on a device, the likelihood of a false match is extremely low" - a judge will hardly press further.

You now essentially have a weapon where you can search any Apple device in the name of preventing the distribution of CSAM.

Failing that, you could just have `document_leak.pdf` and download a file that is both a valid PDF and a child porn image, depending on which program you open it with.


There's already a problem that Apple can't verify the hashes. Say a government wants to investigate a certain set of people. Those people probably share specific memes and photos. Add those hashes to the list and now you have reasonable cause to investigate these people.

Honestly this even adds to the danger of hash collisions because now you can get someone on a terrorist watch list as well as the kiddy porn list.


Apple is the one doing the first line of investigation.


This doesn’t work, for two reasons:

1) There’s no way to know the perceptual hash value of Apple’s private NeuralHash function that is run server-side on the derivative of the image to verify a hit really is CSAM. So while you could cause a collision with the on-device NeuralHash if you possessed illegal content, you wouldn’t know whether you successfully faked Apple’s private implementation.

2) An Apple reviewer must verify the image is illegal before it’s passed along to law enforcement.


Is this an example of homomorphic encryption? Checking for hashes in a 'blinded' table, I mean.


Some people seem to be confused why a hash collision of a cat and a dog matters. Here's a potential attack: share (legal) NSFW pictures that are engineered to have a hash collision with CSAM to get someone else in trouble. The pictures are flagged as CSAM, and they also look suspicious to a human reviewer (maybe not enough context in the image to identify the subject's age). To show that this can be done with real NSFW pictures, here is an example, using an NSFW image from a subreddit's top posts of all time.

Here is the image (NSFW!): https://i.ibb.co/Ct64Cnt/nsfw.png

Hash: 59a34eabe31910abfb06f308


Does anyone save porn to their personal photo libraries? Especially porn as suspicious as the image you posted?


Going by what some people on Reddit say, it seems to be the case. https://old.reddit.com/r/datahoarder/search?q=porn&restrict_...

Probably not the weird image I posted, which looks obviously suspicious. But maybe someone will make a program to find "cleaner" hash collisions that don't look suspicious.


https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX/issue...

I posted some examples that look like totally normal images. They're no harder to produce; you just need to noise-shape the gradient descent so that the introduced noise has a spectrum similar to the image. E.g. just feeding back a Gaussian-highpassed version of the error signal is sufficient.
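A minimal numpy sketch of that noise-shaping idea. Everything here is illustrative: the real NeuralHash is a CNN followed by hyperplane LSH, and this stand-in replaces the whole network with a random linear projection plus tanh (the same differentiable-thresholding trick the collider uses). A box-blur highpass stands in for the Gaussian highpass.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64  # toy 64x64 grayscale "image"

# Stand-in for the hash: sign of 96 random projections, smoothed with tanh
# so the objective is differentiable.
W = rng.normal(size=(96, N * N)) / N

def highpass(g, k=5):
    # Subtract a local box-blur average, keeping only high frequencies so
    # the perturbation hides in texture rather than in smooth image regions.
    img = g.reshape(N, N)
    pad = np.pad(img, k // 2, mode="edge")
    low = np.mean([pad[i:i + N, j:j + N] for i in range(k) for j in range(k)],
                  axis=0)
    return (img - low).ravel()

target = np.sign(rng.normal(size=96))  # hash bits we want to collide with
x = rng.uniform(0, 1, N * N)           # starting image, flattened

def loss(x):
    return 0.5 * np.sum((np.tanh(W @ x) - target) ** 2)

start = loss(x)
for _ in range(200):
    h = np.tanh(W @ x)
    grad = W.T @ ((h - target) * (1 - h ** 2))  # d(loss)/dx
    x -= 0.1 * highpass(grad)  # feed back only the high-passed error signal
end = loss(x)
```

A real attack would also clip pixels to a valid range and keep a distortion penalty against the original image; this sketch only shows the spectral shaping of the update.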


The CSAM detection system supposes people save actual child porn to their personal photo libraries.


Keep in mind that you have to also collide with another perceptual hash function that only Apple has to trigger a match.


> Keep in mind that you have to also collide with another perceptual hash function that only Apple has to trigger a match.

If it's another neural network I wouldn't be shocked if the adversarial preimages worked across both-- it's not uncommon for blackbox generalization to work for adversarial examples. It would be very likely if someone (maybe the attacker) made their own version of neuralhash and then generated examples that passed both theirs and apple's public one.

Privacy-wise, if there were two perceptual hash functions, Apple should have used the more restrictive one on the devices too - because even if they decide not to report you, your privacy is still invaded if they inspect your images at all.

The neuralhash function is extremely easy to attack. We should not have any confidence in the competence of its authors, so we shouldn't expect their undisclosed mechanism to provide a great deal of protection.

A secret second hash also will not be secret against a state attacker who will have access to this function by virtue of being trusted to create the databases for Apple.

There is, however, a very simple technique they could use that would provide almost perfect protection: They could stop invading the privacy of their users and refrain from scanning their private content!


Does Google Chrome scan downloaded images?


You seem to be assuming a human cannot tell the difference between some random NSFW content and some legit known CSAM, 30 times over. Try again.


Apple's reviewers don't have access to the original CSAM to know if it's a match or not. That stays with NCMEC. If they see some legal porn that looks like it could be illegal, they'd likely flag it as a match.


Roughly thirty images is the threshold for the system to activate, but would they need to review all thirty images to pass it on or would they just need to verify one image looks visually like CSAM in order to pass it along?

It seems unlikely that, if there was anything they verified as CSAM, they wouldn't pass it on just because they also found a false positive among those thumbnails.


Great work. I hope people keep hacking at the system to undermine its credibility. This idea is just beyond insane, and the plan to have manual checks of users' photos on their own devices sounds like what China is doing - not great.


I am strongly against Apple’s decision to do on-device CSAM detection, but: wasn’t there a secondary hash whose database is not shared? In theory you need to collide with both to truly defeat the design, right?


You just need a sample of something that is evidently so pervasive that we're building a nationwide dragnet to stop it.


I doubt finding it would really be that hard, if you wanted to be on a list somewhere in some government database, but even armed with a full image that is in the NCMEC database, the problem is that the second hashing algorithm runs on the server and presumably relies on security through obscurity… so it would be hard to collide with it on purpose unless you are an insider. That’s my understanding, although details have been a bit shaky at times.


I don't see how.

They're hashing on feature space (so trivial cropping and such doesn't defeat this) but they have two totally separate methods of matching those hashes? Doesn't sound right to me...


Apparently the images in question would get sent to the server, and all calculation happens there.

> In a call with reporters regarding the new findings, Apple said its CSAM-scanning system had been built with collisions in mind, given the known limitations of perceptual hashing algorithms. In particular, the company emphasized a secondary server-side hashing algorithm, separate from NeuralHash, the specifics of which are not public. If an image that produced a NeuralHash collision were flagged by the system, it would be checked against the secondary system and identified as an error before reaching human moderators.

https://www.theverge.com/2021/8/18/22630439/apple-csam-neura...

For one reason or another Apple really wants to create this precedent, so it’s only natural they’re doing every last thing to make the feature hard to defeat.


Hard to exploit is better phrasing.


Yes, but it’s much easier to just ignore that and proclaim how weak Apple’s system is.


This is just getting wilder and wilder by the day, how spectacularly this move has backfired. As others have commented, at this point all you need is someone willing to sell you the CSAM hashes on the darknet, and this system is transparently broken.

Until that day, just send known CSAM to any person you'd like to get in trouble (make sure they have iCloud sync enabled), be it your neighbour or a political figure, and start a PR campaign accusing the person of being investigated for it. The whole concept is so inherently flawed it's crazy they haven't been sued yet.


The "send known CSAM" attack has existed for a while but never made sense. However, this technology enables a new class of attacks: "send legal porn, collided to match CSAM perceptual hashes".

With the previous status quo:

1. The attacker faces charges of possessing and distributing child pornography

2. The victim may be investigated and charged with child pornography if LEO is somehow alerted (which requires work, and can be traced to the attacker).

Poor risk/reward payoff, specifically the risk outweighs the reward. So it doesn't happen (often).

---

With the new status quo of lossy, on-device CSAM scanning and automated LEO alerting:

1. The attacker never sends CSAM, only material that collides with CSAM hashes. They will be looking at charges of CFAA, extortion, and blackmail.

2. The victim will be automatically investigated by law enforcement, due to Apple's "Safety Voucher" system. The victim will be investigated for possessing child pornography, particularly if the attacker collides legal pornography that may fool a reviewer inspecting a 'visual derivative'.

Great risk/reward payoff. The reward dramatically outweighs the risk, as you can get someone in trouble for CSAM without ever touching CSAM yourself.

If you think ransomware is bad, just imagine CSAM-collision ransomware. Your files will be replaced* with legal pornography that is designed specifically to collide with CSAM hashes and result in automated alerting to law enforcement. Pay X monero within the next 30 minutes, or quite literally, you may go to jail, and be charged with possessing child pornography, until you spend $XXX,XXX on lawyers and expert testimony that demonstrates your innocence.

* Another delivery mechanism for this is simply sending collided photos over WhatsApp, as WhatsApp allows for up to 30 media images in one message, and has settings that will automatically add these images to your iCloud photo library.


Before they make it to human review, photos in decrypted vouchers have to pass the CSAM match against a second classifier that Apple keeps to itself. Presumably, if it doesn’t match the same asset, it won’t be passed along. This is explained towards the end of the threat model document that Apple posted to its website. https://www.apple.com/child-safety/pdf/Security_Threat_Model...


What happens if someone leaks or guesses the weights on that "secret" classifier? The whole system is so ridiculous even before considering the amount of shenanigans the FBI could pull by putting in non-CSAM hashes.


For better or worse, opaque server-side CSAM models are the norm in the cloud photo hosting world. I imagine that the consequences would be roughly the same as if Google's, Facebook's or Microsoft's "secret classifiers" were leaked.


But in the cloud setting they have the plaintext of what was uploaded. The attack described above is about abusing the lack of information Apple has, so that they will report an innocent user to the authorities.


The voucher that Apple can decrypt once enough positives have been received contains a scaled-down version of the original. How else would Apple be able to even run a second hash function on the same picture?


Can't they just make a new one and recompute the 2nd secret hash on the whole data set fairly easily?

Also, the whole point is that it's fairly easy to create a fake image that collides with one hash, but doing it for two is exponentially harder. It's hard to see how you could have an image that collides with both hashes (of the same image, mind you).


Two hash models are functionally equivalent to a particular type of single double-sized hash model. So it shouldn't be any harder to recompute against a 2nd hash, if that 2nd hash were public.

Of course, it won't be public (and if it ever became public they'd replace it with a different secret hash).
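If both hashes were public, "recompute against both" really is just summing the two losses in the same gradient-descent loop. A toy sketch of that claim, using random linear models as stand-ins for the two hash networks and a tanh surrogate for the thresholding (as the collider does); nothing here is Apple's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 64, 16                          # input dim, bits per hash

# Two stand-in "hash models": sign(W @ x) gives an H-bit hash.
W1 = rng.normal(size=(H, D)) / np.sqrt(D)
W2 = rng.normal(size=(H, D)) / np.sqrt(D)
t1 = rng.choice([-1.0, 1.0], size=H)   # target bits for model 1
t2 = rng.choice([-1.0, 1.0], size=H)   # target bits for model 2

x = np.zeros(D)
lr = 0.1
for _ in range(2000):
    # tanh is the differentiable surrogate for the sign threshold.
    y1, y2 = np.tanh(W1 @ x), np.tanh(W2 @ x)
    # Gradients of 0.5*(tanh(Wx) - t)^2; the joint loss is just their sum,
    # i.e. one loss over the concatenated 2H-bit hash.
    g1 = (y1 - t1) * (1.0 - y1 ** 2)
    g2 = (y2 - t2) * (1.0 - y2 ** 2)
    x -= lr * (W1.T @ g1 + W2.T @ g2)

assert np.array_equal(np.sign(W1 @ x), t1)   # collides with hash 1
assert np.array_equal(np.sign(W2 @ x), t2)   # ...and with hash 2
```

The optimizer doesn't care whether it's satisfying one 2H-bit constraint or two H-bit constraints; it's the secrecy of the second model, not its existence, that adds difficulty.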


If you have both models it is easy. If Apple manages to keep the server model private then it is hard.


You don’t need to have the weights. “Transfer attack” is a thing.


You can still hack someone's phone and upload actual CSAM images. That exposes the attacker to additional charges, but they're already facing extortion and all that anyway. I don't understand the "golly gee whizz, they'd have to commit a severe felony first in order to launch that kind of attack" argument.

Don't know why this hasn't already been used on other cloud services, but maybe it will be now that it's been more widely publicized.


How... exactly did they train that CSAM classifier, seeing as the training data would be illegal to possess? I'd be most interested in an answer to that one. They are willing to make that training data set a matter of public record on the first trial, yes?

Or are we going to say secret evidence is just fine nowadays? Bloody mathwashing.


They didn't train a classifier, just a hashing function.


honestly asking — why is it illegal?


It may not be, so honestly I think my objection is best dismissed. Once I ran down the actual chain I mostly sorted things out with a cooler head.

However, the line of thinking was if Apple has a secondary classifier to run against visual derivatives, the intent is it can say "CSAM/Not CSAM". Since the NeuralHash can collide, that means they'd need something to take in the visual derivatives, and match it vs an NN trained on actual CSAM. Not hashes. Actual.

Evidence, as far as I'm aware, is admitted to the public record, and the link needs to exist and be documented in a public and auditable way. To me that implies any results of an NN would require the initial training set to be included for replicability, if we were really out to maintain the full integrity of the chain of evidence used as justification for locking someone away. That means a snapshot of the actual training source material, which means large CSAM dump snapshots being stored for each case using Apple's classifier as evidence. Even if you handwave the government being blessed to hold onto all that CSAM as fitting comfortably in the law enforcement action exclusions, it's still littering digital storage somewhere with a lot of CSAM. Also, Apple would have to update their model over time, which would require retraining, which would require sending that CSAM source material somewhere other than NCMEC or the FBI (unless both those agencies now rent out ML training infrastructure for you to do your training on, leveraging their legal carve-out, and I've seen no mention of that).

Thereby, I feel that logistically speaking, someone is committing an illegal act somewhere, but no one wants to rock the boat enough to figure it out, because it's more important to catch pedophiles than muck about with blast craters created by legislation.

I need to go read the legislation more carefully, so just take my post as a grunt of frustration at how it seems like everyone just wants an excuse/means to punish pedophiles, but no one seems to be making a fuss over the devil in the details, which should really be the core issue in this type of thing, because it's always the parts nobody reads or bothers articulating that come back to haunt you in the end.


i did a bit of reading as well and came across this. you might find it useful or interesting: https://www.law.cornell.edu/uscode/text/18/2258A at the end (h1-4), it details that providers must preserve the information they submit and also take steps to limit access to only people who need it. in this sense then, it’s not illegal for companies to possess csam. it’s not a big leap to then assume that storing csam for the development of detection software is legal (or at least has been thoroughly cleared with the courts, which is about the same). photodna was developed twelve years ago, and i can’t find anything about microsoft ever being charged with possession or distribution of cp.


Interesting!

Thank you, that was what I was looking for that closes the gap somewhat.


Somehow this didn't solidify my trust in Apple! By this standard you can probably mount a half-decent defence of "ignorance" even if you are caught sending the colliding material. Add this whole debacle on top of what's going on in the EU parliament, and 2021 has been WILD for privacy.


It seems like I'm not going to sleep tonight.

Sure, there is hyperbole in OP's comment (CSAM ransomware and automated law enforcement aren't a thing yet), but we're a few steps from that reality.

Even worse, how long will it take until other cloud storage services such as Dropbox, Amazon S3, Google Drive et al implement the same features? Or worse, required by law to do so?

This sounds like the start of an exodus from the cloud, at least in the non-developer consumer space.


Cloud services generally already do this, for example, here is Google's report:

https://transparencyreport.google.com/child-sexual-abuse-mat...


Yeah, I was talking in hyperbole, but the possible attack vectors this system enables are so powerful that I felt it warranted. Under this system you are able to artificially DDoS organizations that verify whether CP was sent, by sending legitimate, low-res porn whose hash has been modified. You can trigger legitimate investigations by sending CSAM through WhatsApp or through social engineering. You can also fuck with Apple by sending obvious spam.

* With regard to the legislative branch: they can even mandate changes to this system that they aren't allowed to disclose. Once this system is in place, what is stopping governments from forcing in other sets of hashes for matching?


And this is just one step away from Apple and Microsoft building this scanning into the OS itself (into the kernel/filesystem code, why not?!). This is beyond insane. Stallman was right. Our devices aren't ours anymore.

Now, to be fair, there would be a secondary private hash algorithm running on Apple's servers to minimize the impact of hash collisions, but what's important is that once a file matches a hash locally, the file isn't yours anymore -- it will be uploaded unencrypted to Apple's servers and examined. How easy would it be to shift focus from CSAM into piracy to "protect intellectual property"? Or some other matter?


Yup. As others have pointed out, if Apple were willing to lie about the extent of this system and its inception date, why should we suddenly trust that they won't extend its functionality? They themselves explicitly state that the program will be extended, so if this is the starting point, I don't think I will be around for the ride.

It's a shame as I really love some of their privacy-minded features (e.g. precision of access to the phone's sensors and/or media).


> Even worse, how long will it take until other cloud storage services such as Dropbox, Amazon S3, Google Drive et al implement the same features? Or worse, required by law to do so

They already do this. Google and Facebook have even issued reports detailing their various success rates…


So, everyone is going to turn off their iCloud sync and they won’t be a target anymore?


Well, according to the reports that are generally the source of these collisions, the hashing code has been on devices since around December 2020 (iOS 14.3).

https://old.reddit.com/r/MachineLearning/comments/p6hsoh/p_a...

If Apple hasn't been honest about WHEN it was built in and added to their code base, why would anyone take their word for HOW it's being used, or for many of the other statements they are putting in their documents, at least until they are verified.


It doesn't necessarily mean that it will stop them from being a target, because Apple says this[1]:

> This program is ambitious, and protecting children is an important responsibility. These efforts will evolve and expand over time.

[1] https://www.apple.com/child-safety/


> This program is ambitious, and protecting children is an important responsibility. These efforts will evolve and expand over time.

"Think of the children" is the most recognizable trope in TV and film. They couldn't have phrased that to be more Orwellian.


Yes, until they add local scanning to macOS / iOS / iPad OS.


The attacker faces no charges because the colliding image can be a harmless meme.


LEO is not alerted automatically, where’d you get that idea?


They'd more or less have to be. Well, not necessarily 'police', but NCMEC.

I did work in automating abuse detection years back, and the US govt clearly tells you that you are not to open/confirm suspected, reported, or happened-upon cp. There are a lot of other seemingly weird laws and rules around it.


Those laws don’t apply if it’s part of the reporting process. Apple’s stated that they do a manual review to decide whether to send a report to NCMEC or not, just like other companies do.


Of course they do. If they didn't, every seedy pedo would be in the process of making a "report." It's probably also why Apple is using 'visual derivatives' for confirmation, rather than the image, though I can't find info on exactly how low resolution 'visual derivatives' are.

It is of course possible that companies may get some special sign off from LE/NCMEC to do this kind of work - I won't argue with you on that as I truly don't know. I can just tell you my company did not, and was very harshly told how to proceed despite knowing the nature of what we were trying to accomplish. But, we weren't anywhere near Apple big.

I remember chatting with our legal team, who made it explicit that the laws didn't cover carve-outs; basically, 'seeing' was illegal. But as you can imagine, police didn't come busting down our doors for happening upon it and reporting it. If you have links to law where this is not the case, I'll gladly eat crow. I've never looked myself and relied on what the lawyers had said.


They will be if you collide a low-res image that resembles CSAM.

Why would the person doing manual review risk their job if they're unsure? Naturally they will just play it safe and report the images.


Not resembles. The adversarial image has to match a private perceptual hash function of the same CSAM image that the NeuralHash function matched before a human reviewer ever looks at it.
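For intuition on what any perceptual hash (neural or otherwise) is doing, here is the classic textbook "average hash" in a few lines. To be clear, this is not Apple's private function, which is unpublished; it's just the generic technique:

```python
import numpy as np

def average_hash(img, hash_size=8):
    """Toy perceptual hash: block-average the image down to
    hash_size x hash_size, then threshold each cell against the mean."""
    h, w = img.shape
    bh, bw = h // hash_size, w // hash_size
    blocks = img[:bh * hash_size, :bw * hash_size].reshape(
        hash_size, bh, hash_size, bw)
    small = blocks.mean(axis=(1, 3))
    return (small > small.mean()).flatten()

def hamming(a, b):
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(1)
img = rng.random((64, 64))                                   # "original"
noisy = np.clip(img + rng.normal(0.0, 0.01, img.shape), 0, 1)  # perturbed
unrelated = rng.random((64, 64))                             # different image

h_img = average_hash(img)
# Small perturbations barely move the hash; unrelated images land far away.
assert hamming(h_img, average_hash(noisy)) < hamming(h_img, average_hash(unrelated))
```

NeuralHash replaces the block-averaging with a learned feature extractor, but the match-under-small-perturbations property (and the threshold step) is the same idea.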


Do you have any material on this private function?


Not beyond the documents Apple has shared. Presumably it will be kept that way given it prevents an adversarial attack against it.


Why wait? Just send them the pictures on Facebook Messenger or Gmail or Dropbox today.


I can't tell if you are being sarcastic. In case you are not, isn't the act of sending those pictures completely illegal?


People here are proposing intentionally creating image assets which collide with perceptual hashes of known CSAM (ignoring whether that is legal or ethical) and sharing those assets to effectively SWAT unaware targets.


They still seem to be under the impression that a neuralhash collision would be enough to do this, which it isn’t.


Oh, I think I misunderstood you. I thought you meant instead of "sending images that collide with perceptual hashes of known CSAM", why not "send actual CSAM via 'Facebook Messenger or Gmail or Dropbox', and since those services also use some other detection algorithm, it will also incriminate the receiver."


Those services will take your account through the same, if not more invasive, process if you are found with a hash match like the ones being proposed in these comments. Unlike Apple, they’ve built interfaces that surface all your account activity to reviewers.


> Unlike Apple, they’ve built interfaces that surface all your account activity to reviewers.

You can't know this without independent audits.


In some ways, you can start to see the value in Apple’s system which lets the device user inspect what is stored in the associated data for later review.


I haven't seen anyone proposing actually doing it, but I think a lot of people are rightly pointing out that bad actors, black hats and the Russian mob are going to have a field day with their ability to do so.


I’m not sure how you can conclude the speculation is “right” without engaging with the fact that this hypothetical is addressed directly in the threat model document and hasn’t been pulled off successfully against any of the other services which do similar scanning. Why can’t I buy kompromat as a service for your Gmail account?


Nah that's so 2020, 2021 is all about low resolution legitimate porn being transformed to match CSAM. Get with the times!


Those will trip up 2020’s systems as well!


But why low resolution porn?


So that you are able to bypass the manual reviews. It still looks like CSAM, but it isn't.


Imagine being a parent who took pictures of their own children bathing naked in the backyard.

I don't know about you, but my parents certainly have lots of embarrassing pictures of me in their photo album.

There will be so many false positives in that system, it's ridiculous. It doesn't necessarily have to be a false colliding hash; there are legitimate use cases that, by definition, are impossible to train neural nets on unless Apple is using the data illegally.


That’s not how Apple’s system works. It’s not an image classifier. Only actual images that are derivatives of known CSAM images (a database of 250k images) will match. Random images of kids will not match those at any greater frequency than any other image.


Counter-question: At what point is child porn actually child porn, socially and statistically speaking?

If I share that picture of my child with my friends and loved ones on Facebook - at what "scale" is it considered to be added to that database as child porn?

1k shares? 10k? Who's eligible to decide that? The judiciary? I think this scenario is a constitutional crisis, because there's no good solution to it in terms of law and order.


I think you're underestimating the severity of child abuse by orders of magnitude. CSAM is a database of child rape, not child nudity.


For now. You don't know what will be added next.

China will demand it to include pictures of the Tiananmen massacre.


Well that’s a pretty orthogonal concern to the above comment that was worried about getting flagged for sharing pics of their own kids


idk what's in the database, whether it's rape or nudes or both. Although depictions of sexual acts versus simple nudity seems like a logical place to draw a line, all the lines on adult pornography are arbitrarily drawn based on "community standards", and we're only a few decades away from state-level bans on any nudes as "porn" in the US, including artistic photos. (Not to mention anti-sodomy laws).

Even if what's in the database is 100% violently criminal as you suggest, and even if it remains limited to that material, we already have a process in place that denies the accused of even seeing the evidence against them if a hash matches. What a horrific, orwellian situation if someone sent you hash matches, the police raid your house and now you can't even see what they think they have or prove your own innocence.


You would presumably have the 30+ images on your device or in iCloud to prove your innocence.

For you to get caught up in this dragnet, 30+ images have to match NeuralHashes of known illegal images, thumbnails of those images have to also produce a hit when run through a private hash function that only Apple has, and two levels of reviewers have to confirm the match as well.


What does Apple even do in this situation? That media won't match known CSAM, but if you modify childhood images so that its hash matches CSAM, what does Apple do. There are just SO MANY things that can and will go wrong as people try to exploit this system.


You can’t modify your childhood images so their hash matches CSAM, because the visual derivative won’t match.


In the digital age, I certainly wouldn't be taking such pictures, let alone uploading them to cloud storage. Not because of any concerns about neural hashing, but simply because I wouldn't want such pictures of my children getting leaked / stolen / hacked.


Why would anyone save CSAM to their photo library?


A hash collision allows you to create material that matches CSAM signatures, without being CSAM. This opens up a new class of attacks.

Specifically, many criminal actors don't touch CSAM because it's wrong. But some of these criminal actors will happily abuse legal systems, e.g. SWATTing.


I would gladly have a mobile phone full of memes that have been modified to match, just for the lulz. I honestly think every meme should be put through just to have "illegal memes"


Illegal memes. Finally. Illegal Pepe will be the crowning jewel of my rare Pepe collection.


Holy cow. An Illegal Pepe is just too good not to have.


Maybe this is how 4chan finally demolishes itself.


> A hash collision allows you to create material that matches CSAM signatures, without being CSAM.

This is not correct. Hash collisions won’t match the visual derivative.


Sorry, this is not even wrong.

The visual derivative is just a resized, very-low-resolution version of the uploaded image. "Matching the visual derivative" is completely meaningless. The visual derivative is not matched against anything, and there is no "original" visual derivative to match against.

If enough signatures match, Apple employees can decrypt the visual derivatives and see if these extremely low resolution images look to the naked eye like they could come from CSAM. If so, they alert the authorities. Given a way to obtain hash collisions, generating non-CSAM images that pass the visual derivative inspection is completely trivial.


> Sorry, this is not even wrong.

Probably a mistake to say things like this, when the public documentation contradicts you.

> The visual derivative is not matched against anything, and there is no "original" visual derivative to match against.

Bullshit.

Here is the relevant paragraph from Apple’s documentation:

“as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database. If the CSAM finding is confirmed by this independent hash, the visual derivatives are provided to Apple human reviewers for final confirmation.”

https://www.apple.com/child-safety/pdf/Security_Threat_Model...


I just want to be clear if I understand this... many images can result in the same hash, but the hash can and will be reversible into one image? And that image is a low res porn photo derived from the algorithm's guesswork? So once a hash matches they don't check if there was a collision and the photo is completely unrelated, they just see the CG porn? If that's the case then why even look at the derived image?


No, this is not what's going on at all. The employees never see the original photos in the government CSAM hash database. Apple doesn't even have these photos: it's precisely the kind of content that they don't want to store on their servers. If some conditions are satisfied, the employees gain access to the visual derivatives (low-resolution copies) of your photos, and they judge whether these look like they could plausibly be related to CSAM materials.

The exact details of the algorithm are not public, but based on the technical summary that Apple provided, it almost certainly goes something like this.

Your device generates a secret number X. This secret is split into multiple fragments using a sharing scheme. Your device uses this secret number every time you upload a photo to iCloud, as follows:

1. Your device hashes the photo using a (many-to-one, hence irreversible) perceptual hash.

2. Your device also generates a fixed-size low resolution version of your image (the "visual derivative"). The visual derivative is encrypted using the secret X.

3. Your device encrypts some of your personally identifying information (device ids, Apple account, phone number, etc.) using X.

4. The hash, the encrypted visual derivative, and the encrypted personally identifying information are combined into what Apple calls the "safety voucher". A fragment of your key is attached to the safety voucher, and the voucher is sent to Apple over the internet. The safety vouchers are sent in a "blinded" way (with another encryption key derived using a Private Set Intersection scheme detailed in the technical summary), so that Apple cannot link them to specific files, devices or user accounts unless there's a match.

5. Apple receives the safety voucher. If the hash in the received safety voucher matches that of known CSAM content in the government-provided hash database (as determined by the private set intersection scheme), the voucher is saved and stored by Apple, and the fragment of your secret key X is revealed and saved. (You'd assume that they filter out / discard your voucher if there's no match; but the technical summary doesn't explicitly confirm this; this means that they may store and use it in the future to run further scans).

6. If your account uploads a large number of matching vouchers, then Apple will gather enough fragments to reassemble your entire secret key X. Now that they know your secret key, they can use it to decrypt the "visual derivatives" stored in all your saved vouchers.

7. An Apple employee will then inspect the "visual derivatives", and if your photos look like CSAM (more precisely, this employee can't rule out by visual inspection that your photos are CSAM-related), they will proceed to use your secret key X (which they now know) to decrypt the personally revealing information contained in your safety voucher, and report you to the authorities.

Keep in mind that the employee looking at the visual derivative does not, and cannot, know what the original image is supposed to look like. The only judgment they get to make is whether the low-resolution visual derivative of your photo looks like it can plausibly be CSAM-related or not. Plainly speaking, they will check if a small, say 48x48 pixel, thumbnail of your photo looks vaguely like naked people or not.
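The threshold mechanism in steps 5 and 6 can be illustrated with textbook Shamir secret sharing over a prime field. This is a simplification of what's in Apple's technical summary: the real system combines the sharing with the PSI layer so that shares only become usable on matching vouchers, but the reassembly idea is the same:

```python
import random

P = 2**127 - 1  # a Mersenne prime; toy field modulus

def split(secret, n, k):
    """Split `secret` into n shares; any k of them reconstruct it.
    Shares are points on a random degree-(k-1) polynomial with f(0)=secret."""
    coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
    def f(x):
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % P
        return acc
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x=0 over the field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

device_key = random.randrange(P)
# One share rides along with each uploaded photo's safety voucher; the
# server cannot recover the key until it holds the threshold (here 30).
shares = split(device_key, n=100, k=30)
assert reconstruct(shares[:30]) == device_key  # 30 matches: key revealed
assert reconstruct(shares[:29]) != device_key  # 29 matches: still hidden
```

Below the threshold, any subset of shares is consistent with every possible key, which is what makes the "Apple learns nothing until enough matches accumulate" claim information-theoretic rather than just procedural.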


> The exact details of the algorithm are not public,

The relevant parts are.

> but based on the technical summary that Apple provided, it almost certainly goes something like this.

It doesn’t go like that. You are simply wrong.


Seems like that would rule out using the system to detect ‘tank man’ images.


That bit you quoted seems to be actually correct. It does not mention visual derivatives at all.

That said, I think your statement is a bit too strong, but generally true. A hash collision is not going to inherently be visually confusing. However, you claim that it is impossible for an image to be both visually confusing and a hash collision, which seems unlikely. The real question is going to be how much more effort it takes to do both.


I didn’t claim it was impossible, just that hash collisions won’t match both.

Also, the information needed to create a full match simply is not available.


Are those not the same statement?

Unless you're relying on it being computationally infeasible, but I'm not sure we know enough to consider that true at this point. Usually when we make statements on those grounds we do so with substantial proof. I don't think we know enough to do so here. I'm not even sure how feasible it is when you throw DL into the mix.


From the docs: “as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database.”


> Are those not the same statement?

No.


Most people wouldn't, of course. In this scenario you'd get someone to download the CSAM unknowingly. If they have iCloud sync enabled, it automatically uploads to iCloud, thereby triggering the system. At that point the authorities will be alerted by Apple, and you can inform media outlets. They in turn will ask law enforcement, who will confirm the investigation, and the reputation of the person investigated will be tarnished.


Also as dannyw pointed out, you don't even have to send CSAM to trigger the system. If they found you you would still be charged, but not with possession of CSAM.


What exactly would you be charged with? Why would law enforcement even be involved in a case of false positives?


The sender would of course be charged with wasting police efforts, defamation attempts, etc. In the case of false positives the receiver of course wouldn't be charged; it's more about the fact that this system can be manipulated with too much ease. Even if you're not charged, an investigation takes time away from already limited law enforcement resources. I'm also not interested in buying products from a company that blatantly spies on me. Today it's CSAM, but as others have pointed out, the hashes can be changed to look for anything.


Do you mean charging the sender of the trick images or the receiver?


Well, that depends on the situation. Regardless, the sender would be charged if found, but if they were able to get legitimate CSAM onto the receiver's phone, the receiver could possibly be charged too, or at least investigated. Just the idea of getting investigated in these kinds of attacks, much less being publicly exposed as under investigation, is a horrible thought.


You specifically said someone would be sent known CSAM. How would that get added to their photo library?


I meant *could*. My point is that social engineering is a clear weak link in this system. They could also be sent regular photos whose hash matches the database, or this repo could be used to transform a regular pornographic photo's hash, making manual confirmation on Apple's part hard.


What kind of social engineering would lead an innocent person to save known CSAM to their photo library?


None needed. You could just send a photo to the target through WhatsApp, and the photo would be automatically synced with iCloud.


Wouldn't the photo be scanned for CSAM by WhatsApp first?


WhatsApp messages are end-to-end encrypted, so no.


What kind of social engineering would lead an innocent person to install malware on their devices? Or do you think people like that want to take part in an illegal DDoS botnet?


I think there’s a difference between “I’ll click this totally legit button to protect my computer from viruses” and “I’ll save this picture of a child being raped to my photo library.”

A lot of people may not know how to avoid malware. But I don’t think very many of them would be so inept as to accidentally long press on child porn and tap “Add to Photos”.


... and "I'll save this picture of a hilarious kitten to my photo collection"...

Fixed it for you.

The image to be saved doesn't have to be disturbing at all to trigger a hash collision.

The linked repo has code to modify an image to generate a hash collision with another unrelated image.

That's the whole point.


If some commenters can be believed about their experience with the database, there are a bunch of completely innocuous images in it because they're from the same photosets or distributed alongside CSAM.

Is that enough to cause an investigation? Maybe, maybe not, but I wouldn't want it to be a risk.


Photos in the database are classified for their content. Only images classified as A1 (A: prepubescent minor, 1: sex act) are being included in the hash set on iOS. So this doesn't even include A2 (2: lascivious exhibition), B1 or B2 (B: pubescent minor) let alone images which are in the database and aren't classified as any of A1, A2, B1 or B2.

While I've no doubt that there's a lot of "before and after" images (which are still technically CSAM even if they're not strictly child porn) and possibly many innocuous images, they would not have been flagged as "A1".

I'm sure there's probably still a few images flagged as A1 which shouldn't be in the database at all, but that number is going to be small. How many of these incorrectly flagged images are going to make their way into your photo library? One? Two?

You need 30 in order for your account to be flagged.


If someone is deliberately targeting you with them, 30 isn't very hard to reach.


I think it’s implausible that someone can become aware of 30 images which are miscategorised as A1 CSAM. How would this malicious entity discover them? What’s the likelihood that this random array of odd images could make it into a target’s photo library?

And what’s the likelihood that a human reviewer will see these 30 odd images and press the “yep it’s CSAM” button?

More likely, as soon as Apple’s human reviewers see these oddball images, they’re going to investigate, mark those hashes as invalid, then contact their upstream data supplier, who will fix their data, and now those implausible images are useless.


Lending your phone to someone for a call, then a quick airdrop. Legitimate-looking emails with buttons. There's probably a list somewhere of proven attack vectors.


I posted another comment that was misunderstood as well. Folks, no one is proposing to download actual CSAM images to your photo lib. You could be duped thinking you downloaded an image of a beautiful sunset which was carefully manipulated to match the hash of an actual CSAM image.


The even worse part here is that not only could it impact an image of a beautiful sunset, which would fail the human check, it could impact a low quality version of legal porn, which could easily pass the human check and get passed on to law enforcement.

A sufficiently advanced catfishing attack could probably take advantage of this to get someone raided and have all their electronics confiscated.

Just send someone a zip of photos and let them extract it...


This is the really scary part. Of course, getting someone to download blobs that correlate to CSAM would be one thing, but downloading regular photos that have nefarious hashes is a trend /pol/ could start in an afternoon.


The parent was proposing to “just send known CSAM”.

But OK, say someone sends you a sunset that fools the hasher. Then what? Of course one match won’t do anything, so you’d need to download however many matching sunsets. Then what? The Apple reviewer would see they’re sunsets and you’d challenge the flag saying they’re sunsets. And if somehow NCMEC got involved, they’d see they’re just sunsets. And if law enforcement got involved, they’d see they’re just sunsets.

These proofs of concept might seem interesting from a ML pov, but all they do is just highlight why Apple put so many layers of checking into this.


> But OK, say someone sends you a sunset that fools the hasher. Then what? Of course one match won’t do anything, so you’d need to download however many matching sunsets. Then what?

A real attack would be to take legal porn images and make them collide with illegal images, so when a human goes to review the scaled down derivative images, those images very well look like they could be CSAM. Since there are many of them, they'd get sent to law enforcement. Then law enforcement would raid the victim's home and take all of their electronic devices in order to determine if they can be charged with a crime or not.


This is where the "fog of war" kicks in. What with doors being busted down, police departments making press releases, etc., I can easily imagine that the victim could be prosecuted, convicted and sent away because no one understood the subtlety that their legal porn was not in fact CSAM.


The fog of war is largely in the realm of post-puberty minors, photos of which are not being included in Apple's corpus of hashes. I find it difficult to believe that anyone could mistake or otherwise "fog of war" a photograph of an adult and a prepubescent minor.

And that's assuming someone develops a hash collision which doesn't substantially mangle the photograph like the example offered on Github.

Specifically, only images categorised as "A1" are being included in the hash set on iOS. The category definitions are:

  A = prepubescent minor
  B = pubescent minor
  1 = sex act
  2 = "lascivious exhibition"
The categories are described in further detail (ugh) in this PDF, page 22: https://www.prosecutingattorneys.org/wp-content/uploads/Pres...


> Specifically, only images categorised as "A1" are being included in the hash set on iOS.

Do we know that for sure?

Apple has changed their mind enough times in the last week and a half that I'm convinced they're in full on defensive "wing it and say whatever will get people off our backs!" mode.

You can't read the threat modeling PDF and conclude that it was run through the normal Apple document review process. It reads nothing like a standard Apple document - it reads like a bunch of sleep deprived people were told to whip it up and publish it.


That document is over six years old. It has nothing to do with Apple.


I don't really want to do the research, so I'll take your word for it.

But by fog of war I was thinking more like the victim already has some sleazy (though marginally legal) stuff on their computer, or a search led to a find of pot in their house, or they lied to try and get out of the rap, or perhaps the FBI offered them a deal and they took it because they saw no way out, or perhaps they were simply an unlikable individual who the jury took a dislike to.

Basically that things are not always clear cut, and they come out of the wrong side of things, in a situation created by Apple's surveillance.


Even if I grant all of the above, I don't see how any of that is impacted by the distinction between on-cloud scanning and on-device scanning of photos which are being uploaded to the cloud.

Surveillance is surveillance. It's a bit more obnoxious that a CPU which I paid money for is being used to compute the hashes instead of some CPU in a server farm somewhere (which I indirectly paid for) but the outcome is the same. The risk of being SWAT-ed is the same.


It would still be mentally draining to be accused of CP. Can you imagine how terrified one would be if they saw a warning message with a blurred sunset? I don't know exactly how the system works, but from Apple's press release, it hides the image and gives a warning to the user. This would not go well on social media.


Remember, while you are refuting all this to each party, you are actually in the process of defending yourself against one of the worst criminal accusations possible. Your life will be investigated, your devices will be investigated - the amount of stress and reputational harm this causes is insane.


The point isn't to trick NCMEC, but rather create a DoS attack so no actual triggers can get through the noise.


I thought the point was to SWAT some innocent person? The goal keeps changing.


But who would want that?

We all want privacy but it seems odd to try to DoS this, with high risk for yourself and very little to gain.

Might be useful when the system turns into mass political surveillance tho.


As I've commented elsewhere, DoS can be easily mitigated by implementing another layer with basic object recognition to filter out false positive collisions.


> You could be duped thinking you downloaded an image of a beautiful sunset

If it was anything like the image used to demonstrate this technique on Github, it's unlikely that anyone would describe that sunset as "beautiful". They'd be more likely to describe it as "bugger, this JPEG file is corrupted."


Attacks never get worse over time.

It was quite literally less than 24h from "Oh, hey, I can collide this grey blob with a dog!" to "Hey, this thing that looks like cat hashes to the same thing as this dog!"

You really think this is going to end at this proof of concept stage?


Of course it will get better. But it's not going to end at "Hey, this photograph of a sunset is visually unchanged" while now matching CSAM. That's just not plausible. It's not how these classifiers work.

Regardless, this whole thing is moot because there are two classifiers, only one of which has been made public. Before any matches can make it to human review, photos in decrypted vouchers have to pass the CSAM match against a second classifier that Apple keeps to itself.


Match the first classifier, and your file gets uploaded unencrypted to Apple. Which is fine if it's probable CSAM. But what if they switch efforts to combat, say, piracy?


So your concern is that Apple will start doing something evil at any moment without your consent. That's been true of any computer platform since the advent of software updates. You can construct such hypotheticals about any company you like.


That’s not how the technology works. The files are never decrypted. Instead, if enough hashes match, a “visual derivative” is revealed. What a “visual derivative” is hasn’t been explained, but most people seem to think it’s a low-res version of the file.


Yes but that would be harmless because the visual derivative wouldn’t match.


Except that it isn’t. The hashes don’t enable an attack.


Ok so now all we have to do is get a phone, load it with adversarial images that have hashes from the CSAM database and we wait and see what happens. Basically a honeypot. Get some top civil rights attorneys involved. Take the case to the Supreme Court. Get precedence set right.

Lawfare


The adversarial images have to match both the NeuralHash output of CSAM, plus another private perceptual hash that points to the same image that only Apple has access to, plus a human reviewer needs to agree it is CSAM, and this has to happen for 30 images.


Do you think the reviewer will dismiss the alert if only 29 images look like CSAM and the last one looks like a Beagle? What if only 1 looks like CSAM and the other 29 are animal pictures? It's a safe bet that they will report your account for the 1 that looks like CSAM.


30 images are required to match known bad NeuralHash’s before Apple has any access to look at any of those 30 images.


Where would you get the CSAM hashes?


Give it a few days, and you'll probably find someone selling a list of CSAM neural hashes on darknet marketplaces.


Or tweeting out a bunch of them. They're just 12 byte numbers.


I bet there's a list of hashes already up in Pastebin.


The client has to be able to check for them in some way - just run that algorithm against every image you can scrape from Tor/Freenet and I suspect you'll have results rather quickly.

Or you can probably just wait a minute and pay an... enterprising individual to sell you such a list on a darknet market, or perhaps even find one posted on the clearnet soon enough.


No, the client doesn’t have access to the CSAM hashes. And matches are verified on the server, not on the client.


The poster meant the algorithm to compute the hash has to be on the local device. And it's already been found.

https://old.reddit.com/r/MachineLearning/comments/p6hsoh/p_a...


Indeed, if they're proposing to only decrypt select images the client needs to know pass/fail at some point. Whether that's before or after sending the hashes to Apple's server really doesn't matter as bulk checks will likely be a part of API anyways. We'll have to wait for further reverse engineering to get full details here though.


That was scary fast. Is there a point in using this algorithm for its intended purpose now?


If the intended purpose is to lead by example and eventually mandate code on every computing device (phone and computer) that scans all files against a government provided database, then yes, that purpose still exists and this algorithm still works for it.

Just wait and watch - I guarantee you that Apple will be talking about CSAM in at least one anti-trust legal battle about why they shouldn't be broken up. Because a walled garden means they can oppress citizens on behalf of governments better.


Yes, because this isn’t a weakness in the design. There is nothing scary fast about it. It was obvious and anticipated in the threat model.


Why does it matter? The photo looks nothing like the target.

If someone looks at the two images, wouldn’t they see they’re not the same, and therefore that the original image was mistakenly linked with the target?


Apple's reviewers, by law, cannot look at the target. No one except NCMEC is allowed to possess the target (CSAM material).

So Apple will be looking at a low-res grayscale image of whatever the collided image is, which could be legal adult pornography (let's say: a screengrab of legal "teen" 18+ porn), but the CSAM filter tells it that it's abuse material!

What would you do as the Apple reviewer?

(Hint: You only have one option, as you are legally mandated to report).


But NCMEC will then review the reports and see that it doesn't match the target.


> Apple's reviewers, by law, cannot look at the target

This is false.

> No one except NCMEC is allowed to possess the target (CSAM material).

False. No one is allowed to knowingly possess it, without taking certain actions forthwith when they become aware of it. Obviously, prior to it being reviewed as it is, neither the reviewer nor Apple has knowledge that it is actual or even particularly likely CSAM.


I think you misread what was meant by "target".

Yeah, Apple might be able to look at the uploaded image. But the reviewers don't have a copy of the original image added to the database, which is the "target".


You’re correct, but the uploaded image is sufficient, as it would be obvious to anyone reviewing it that it isn’t CSAM material.

If it was, then would it matter if it wasn’t the original?


That’s the point; you can’t identify that an image is CSAM just by looking at it.


You could pollute the pool and overwhelm their human review process, making it untenable to operate. And that's if you just wanted to pollute it with obvious non-CSAM content.


Well done! Hopefully all of this progress toward demonstrating how easy it is to manipulate neural hash will get Apple to rollback the update...


Counter-point: hijinks like this are defeated by including the original image instead of the image derivative in associated data. At that point, the system works in the exact same way as the photo scanning status quo.


Can they though? To the general public the optics of rolling back the update now would be that they are not fighting CSAM.


Can someone explain the profile of criminals they expect to catch with this system? People who are tech-savvy enough to go on the darknet and find CSAM content but simultaneously stupid enough to upload these images to iCloud?

And they think there are enough of these people to create this very complicated system and risk a PR disaster?


Here in Denmark, a 15-year-old girl and boy were filmed while having sex, and the video spread around among teenagers, apparently mostly through Facebook Messenger.

In 2018, the police indicted 1000 of them (tracking them down with Facebook's help). Legal results were a child-porn law judgement for 334 of them, and simpler penalties for 400.

The child-porn judgement was mostly suspended sentences, but it precludes working with children (as a teacher or even sports trainer if children under 15yo are involved) for between 10 and 20 years.

If there was a system that would have caught it sooner, prior to sharing, the spread would be minimized. The police took 3 years to form a plan to indict the 1000+ people.


Facebook reported 68.1 million CSAM images last year. If these people were such criminal masterminds, why are Facebook’s numbers so high?


Is it possible to host this online as a meme filter?

I think every meme should get pumped through this, just for lulz.


So when do we start the protest? Everyone could plant a bunch of these false positives on their devices. If enough people did it, it'd cost them.


So is this going to be used for DDOSing their photo verifying service?


Can someone explain to me what NeuralHash is?


NeuralHash is a hashing algorithm made by Apple to create hashes from images. Where other hashing algorithms would look at the pixel values, NeuralHash creates hashes based on the visual features of an image.

You can read more about it here: https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...
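A toy illustration (my own sketch, not Apple's algorithm — this is the classic "average hash" idea, which NeuralHash conceptually generalizes by swapping the hand-crafted pipeline for a neural network):

```python
# Toy "perceptual hash": downscale, then threshold each cell against the
# mean brightness. Hash bits come from visual features, not raw bytes.
def average_hash(pixels, size=4):
    # pixels: 2D list of grayscale values; naive box downscale to size x size
    h, w = len(pixels), len(pixels[0])
    cells = []
    for i in range(size):
        for j in range(size):
            block = [pixels[y][x]
                     for y in range(i * h // size, (i + 1) * h // size)
                     for x in range(j * w // size, (j + 1) * w // size)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    return tuple(1 if c >= mean else 0 for c in cells)

# A small uniform brightness change leaves the hash untouched, unlike a
# cryptographic hash, where flipping one input bit scrambles the output.
img = [[(x * y) % 256 for x in range(8)] for y in range(8)]
bumped = [[min(255, p + 3) for p in row] for row in img]
print(average_hash(img) == average_hash(bumped))  # True
```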


thank you very much


Isn't it sufficient that they change the function every day or something?


how long until they start scanning a device's framebuffer in realtime?

why stop at CSAM? Pirated material like movies next?


> how long until they start scanning a device's framebuffer in realtime?

Some smart TVs do automated content recognition so the manufacturers can spy on what you're watching and sell the data to the highest bidders.


Roku's "privacy" policy is a hoot to read for stuff like this.

It's basically, "If we've come up with a way to grab it, we do. And send it to our servers. And do what we want with it."

It literally includes:

> We may receive information about the browser and devices you use to access the Internet, including our services, such as device types and models, unique identifiers (including, for Roku Devices, the Advertising Identifier associated with that device), IP address, operating system type and version, browser type and language, Wi-Fi network name and connection data, and information about other devices connected to the same network.

Emphasis mine. They literally have given themselves permission to nmap your LAN and upload the results!


They are not doing this for the fun of it. If they didn't have to, they would not do it at all.

You have made a huge leap from scanning for pre-existing CSAM while in transit to a cloud service to scanning frame buffers on device in real-time. You should get some type of Olympic medal for such a leap.

This tech is to catch the lowest possible hanging fruit of the dumbest of all CSAM-sharing/saving folks as required by law.


Real-time framebuffer hashing might be a big leap, but what about local filesystem scanning built into the OS?


I give it 5 years


I've seen a lot of comments "muddying the waters" (intentionally or not) about whether hash colliders like the one demonstrated above can be used to carry out an attack. So I wrote up a quick FAQ addressing the most common points.

Part 1/2

Q: I heard that Apple employees inspect a "visual derivative" of your photos before reporting you to the authorities. Doesn't this mean that, even if you modify images so their hash matches CSAM, the visual derivative won’t match?

A: No. "Matching the visual derivative" is completely meaningless. The visual derivative of your photo cannot be matched against anything, and there is no such thing as an "original" visual derivative to match against. Let me elaborate.

The visual derivative is nothing more than a low resolution thumbnail of the photo that you uploaded. In this context, a "derivative" simply refers to a transformed, modified or adapted version of your photo. So a "visual derivative" of your photo means simply a transformed version of your photo that still identifiably looks like the photo you uploaded to iCloud.

This thumbnail is never matched against known CSAM thumbnails. The thumbnail cannot be matched against known CSAM thumbnails, most importantly because Apple doesn't possess a database of such thumbnails. Indeed, the whole point of this exercise is that Apple really doesn't want to store CSAM on their servers!

Instead, an Apple employee looks at the thumbnails derived from your photos. The only judgment call this employee gets to make is whether it can be ruled out (based on the way the thumbnail looks) that your uploaded photo is CSAM-related. As long as the thumbnail contains a person, or something that looks like the depiction of a person (especially in a vaguely violent or vaguely sexual context, e.g. with nude skin or with injuries) they will not be able to rule out this possibility based on the thumbnail alone. You can try it yourself: consider three perfectly legal and work-safe thumbnails of a famous singer [1]. The singer is underage in precisely one of the three photos. Can you tell which one?

All in all, there is no "matching" of the visual derivatives. There is a visual inspection, which means that if you reach a certain threshold, a person will look at thumbnails of your photos. Given the ability to produce hash collisions, an adversary can easily generate photos that fail visual inspection. This can be accomplished straightforwardly by using perfectly legal violent or sexual material to produce the collision (e.g. most people would not suspect foul play if they got a photo of genitals from their Tinder date). But more sophisticated attacks [2] are also possible, especially since the computation of the visual derivative happens on the client, so it can and will be reverse engineered.

Q: I heard that there is a second hash function that Apple keeps secret. Isn't it unlikely that an adversarial image would trigger a collision on two distinct hashing algorithms?

A: No, it's not unlikely at all.

The term "hash function" is a bit of a misnomer. When people hear "hash", they tend to think about cryptographic hash functions, such as SHA256 or BLAKE3. When two messages have the same hash value, we say that they collide. Fortunately, cryptographic hash functions have several good properties associated with them: for example, there is no known way to generate a message that yields a given predetermined hash value, no known way to find two different messages with the same hash value, and no known way to make a small change to a message without changing the corresponding hash value. These properties make cryptographic hash functions secure, trustworthy and collision-resistant even in the face of powerful adversaries. Generally, when you decide to use two unrelated cryptographic hash algorithms instead of one, you make finding a collision at least twice as difficult for the adversary.

However, the hash functions that Apple uses for identifying CSAM images are not "cryptographic hash functions" at all. They are "perceptual hash functions". The purpose of a perceptual hash is the exact opposite of a cryptographic hash: two images that humans see/hear/perceive (hence the term perceptual) to be the same or similar should have the same perceptual hash. There is no known perceptual hash function that remains secure and trustworthy in any sense in the face of (even unsophisticated) adversaries. Most importantly, it is not guaranteed that using two unrelated perceptual hash functions makes finding collisions more difficult. In fact, in many contexts, these adversarial attacks tend to transfer: if they work against one model, they often work against other models as well [3].
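A quick demonstration of the avalanche property of cryptographic hashes described above, using Python's standard hashlib:

```python
import hashlib

def bits(data):
    # SHA-256 digest as a 256-character bit string
    return bin(int(hashlib.sha256(data).hexdigest(), 16))[2:].zfill(256)

a = bits(b"the same photo")
b = bits(b"the same photp")  # a single character changed
flipped = sum(x != y for x, y in zip(a, b))
print(flipped)  # roughly half of the 256 output bits differ
```

A perceptual hash has to do the exact opposite: a one-character (or one-pixel) change should flip zero bits, and that goal is fundamentally at odds with collision resistance.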

To make matters worse, a second, secret hash function can be used only after the collision threshold has been passed (otherwise, it would have to be done on the device, but then it cannot be kept secret). Since the safety voucher is not linked directly to a full resolution photo, the second hashing has to be performed on the tiny "visual derivative", which makes collisions all the more likely.

Apple's second hash algorithm is kept secret (so much so that the whitepapers released by Apple do not claim and do not confirm its existence!). This means that we don't know how well it works. We can't even rule out the second hash algorithm being a trivial variation (or completely identical) to the first hash algorithm. Moreover, it's unlikely that the second algorithm was trained on a completely different dataset than the first one (e.g. because there are not many such hash algorithms that work well; moreover, the database of known CSAM content is really quite small compared to the large datasets that good machine learning algorithms require, so testing is necessarily limited). This suggests that transfer attacks are likely to work.


FAQ Part 2/2

Q: If the second, secret hash algorithm is based on a neural network, can we think of its weights (coefficients) as some kind of secret key in the cryptographical sense?

A: Absolutely not. If (as many suspect) the second hash algorithm is also based on some feature-identifying neural network, then we can't think of the weights as a key that (when kept secret) protects the confidentiality and integrity of the system.

Due to the way perceptual hashing algorithms work, having access to the outputs of the algorithm is sufficient to train a high-fidelity "clone" that allows you to generate perfect adversarial examples, even if the weights of the clone are completely different from the secret weights of the original network.

If you have access to both the inputs and the outputs, you can do much more: by choosing them carefully [4], you can eventually leak the actual secret weights of the network. Any of these attacks can be executed by an Apple employee, even one who has no privileged access to the actual secret weights.
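A sketch of the chosen-input idea (purely illustrative, assuming a linear toy model rather than anything Apple ships): if the "secret" hash were h(x) = sign(W @ x) and you could observe the pre-threshold outputs on inputs of your choosing, the weights fall out immediately. A real network is nonlinear, but enough input/output pairs play the same role.

```python
import numpy as np

rng = np.random.default_rng(0)
W_secret = rng.standard_normal((8, 16))  # the "secret key" weights

def oracle(x):
    # query access to pre-threshold outputs; no access to W_secret itself
    return W_secret @ x

# Feeding the standard basis vectors reads off W's columns one at a time.
W_recovered = np.column_stack([oracle(e) for e in np.eye(16)])
print(np.allclose(W_secret, W_recovered))  # True
```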

Even if you have proof positive that nobody could have accessed the secret weights directly, the entire key might have been leaked anyway! Thus, keeping the weights secret from unauthorized parties does not suffice to protect the confidentiality and integrity of the system, which means that we cannot think of the weights as a kind of secret key in the cryptographical sense.

Q: I heard that it's impossible to determine Apple's CSAM image hashes from the database on the device. Doesn't this make a hash attack impossible?

A: No. The scheme used by Apple (sketched in the technical summary [6]) ensures that the device doesn't _learn_ the result of the match purely from the interaction with server, and that the server doesn't learn information about images whose hash the server doesn't know. The claim that it's "impossible to determine Apple's CSAM image hashes from the database on the device" is a very misleading rephrasing of this, and not true.

Q: Doesn't Apple claim that there is only a one in one trillion chance per year of incorrectly flagging a given account?

A: Apple does claim this, but experts on photo analysis technologies have been calling bullshit [8] on their claim since day one.

Moreover, even if the claimed rate was reasonable (which it isn't), it was derived without adversarial assumptions, and using it is incredibly misleading in an adversarial context.

Let me explain through an example. Imagine that you play a game of craps against an online casino. The casino will throw a virtual six-sided die, secretly generated using Microsoft Excel's random number generator. Your job is to predict the result. If you manage to predict the result 100 times in a row, you win and the casino will pay you $1000000000000 (one trillion dollars). If you fail to predict the result of a throw, you lose and pay the casino $1 (one dollar).

In an ordinary, non-adversarial context, the probability that you win the game is much less than one in one trillion, so this game is very safe for the casino. But this number, one in one trillion, is based on naive assumptions that are completely meaningless in an adversarial context. If your adversary has a decent knowledge of mathematics at the high school level, the serial correlation in Excel's generator comes into play, and the relevant probability is no longer one in one trillion. It's 1 in 216 instead! When faced with a class of sophomore math majors, the casino will promptly go bankrupt.
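For concreteness (assuming, as in the analogy, the attacker only needs to guess the first three throws before the correlation lets them predict the rest):

```python
# Naive vs. adversarial odds from the casino analogy.
naive = (1 / 6) ** 100   # guessing 100 independent fair throws
adversarial = 1 / 6**3   # guess 3 throws, then exploit the serial
                         # correlation to predict the remaining 97
print(naive)             # astronomically below one in one trillion
print(adversarial)       # ~0.00463, i.e. 1 in 216
```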

Q: Aren't these attacks ultimately detectable? Wouldn't I be exonerated by the exculpatory evidence?

A: Maybe. IANAL. I wouldn't want to take that risk. Matching hashes are probably not sufficient to convict you, and possibly not sufficient to take you into custody, but they're more than sufficient to make you a suspect. Reasonable suspicion is enough to get a warrant, which means that your property may be searched, your computer equipment may be hauled away and subjected to forensic analysis, etc. It may be sufficient cause to separate you from your children. If you work with children, you'll be fired for sure. It'll take years to clear your name.

And if they do charge you, it will be in Apple's best interest not to admit to any faults in their algorithm, and to make it as opaque to the court as possible. The same goes for NCMEC.

Q: Why should I trust you? Where can I find out more?

A: You should not trust me. You definitely shouldn't trust the people defending Apple using the claims above. Read the EFF article [7] to learn more about the social dangers of this technology. Consult Apple's Threat Model Summary [5], and the CSAM Detection Technical Summary [6]: these are biased sources, but they provide sketches of the algorithms and the key factors that influenced the current implementation. Read HackerFactor [8] for an independent expert perspective about the credibility of Apple's claims. Judge for yourself.

[1] https://imgur.com/a/j40fMex

[2] https://graphicdesign.stackexchange.com/questions/106260/ima...

[3] https://arxiv.org/abs/1809.02861

[4] https://en.wikipedia.org/wiki/Chosen-plaintext_attack

[5] https://www.apple.com/child-safety/pdf/Security_Threat_Model...

[6] https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...

[7] https://www.eff.org/deeplinks/2021/08/apples-plan-think-diff...

[8] https://www.hackerfactor.com/blog/index.php?/archives/929-On...


You are glossing over how an adversary can generate an image that meets the following requirements:

  a) hashes to the same value as known csam image A with the public NeuralHash algorithm 

  b) has a derivative (e.g. lower res thumbnail) that when processed with a _private_ perceptual hash algorithm also matches known csam image A.
What is your proposal for solving b? For a, it’s possible to iteratively generate NeuralHashes that get closer and closer to the value you are attempting to equal, while that isn’t possible for step b.


…and? Does OP think reviewers will think a picture of a cat is CSAM?


The input image is a parameter, so you could start with some image that would be easier to confuse at the low resolutions the reviewers use. The missing piece in this case is that the database of target hashes is unknown.


Easier to confuse, or even something like a pic of a 20 year old where it's effectively impossible to be sure from the image itself.


Hi, this looks interesting but I have no idea what any of this means lol. Is this some way of hiding a picture within a picture, or am I way off the mark?


Apple began scanning for CSAM with Neuralhash. This allows you to turn an image into a specific neuralhash thus possibly triggering its (automatic) CSAM detection. Imagine if a picture of a cat could cause Apple to think you have CP on your device.


Well, you'd have to do it 30 times to trigger the system, and then someone at apple moderation would look at those 30 pictures of cats and hit "next" vs "supervisor"


Good that there’s some human supervision. But, I know I have more than 30 photos of my dog. Also don’t like the idea of false positives auto-sharing some of my camera roll.


It's only if you back it up to iCloud, the signatures of the CP used as references are rotated, and they're also not public. The chances of you randomly triggering the system is effectively 0 unless you're uploading CP to your iCloud.


Wait, wasn't all the hullabaloo over this scanning not requiring an upload to iCloud anymore?

They're scanning anything you upload to iCloud (and have been for some time) but now also scan everything on your device too.


No. They calculate a hash on the device, but they only do it as part of the iCloud upload. So whether the hashing happens on the device or on the server, the same images get hashed either way.


Photos of your dog are not going to trigger it. Someone would need to engineer the 30 photos of your dog tweaked to hash to a particular value, and then convince you to save them to your device and then upload to iCloud. And then some portion/abstraction of the dog photo would need to convince a reviewer they were looking at CSAM.

The more likely path to trouble is legal NSFW material that's been engineered.


You can be pretty sure they'll report your account if at least 1 low-res thumbnail ("visual derivative") looks like an image of naked people/a sexual act.


You're not trying hard enough on how to bypass the human component methinks.

Use porn as the base images. The more petite, flat and young looking, the better. The moderators are already going to be tuned in to csam, so all you need to do is to give them a slight push.


Oh damn, that’s crazy. Very cool project. Thanks for sharing and the explanation.


What are you trying to prove here? It takes a human not noticing that a glitchy image of a cat is not the same as a picture of a dog, 30 times.

Yeah, collisions are technically possible. Apple has accounted for that. What is your point?

Hashes are at the core of a lot of tech, and collisions are way easier and more likely in many of those cases, but suddenly this is an issue for y'all?


I don't see how this is fixable on their end.

Several people have suggested simply layering several different perceptual hash systems, with the assumption that it's difficult to find an image that collides in all of them. This is pretty suspect - there's a reason we hold a decades-long competition to select secure hash functions. Basically, a function can't generally achieve cryptographic properties (like collision resistance, or difficulty of preimage computation) without being specifically designed for it. By its nature, any perceptual hash function is trivially not collision resistant, and any set of neural models is highly unlikely to be preimage-resistant.

The really tough thing to swallow for me is the "It was never supposed to be a cryptographic hash function! It was always going to be easy to make a collision!" line. If this was such an obvious attack, why wasn't it mentioned in any of the 6+ security analyses? Why wasn't it mentioned as a risk in the threat model?


> First, as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database.

https://www.apple.com/child-safety/pdf/Security_Threat_Model...


And that hash function is kept private, so you can’t just iterate to find a match of both hashes.


Correct, gradient descent attacks like this depend on being able to differentiate your hash function.
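A toy version of that attack in numpy (my own sketch, not the collider's actual code: a random linear map stands in for the network, and a hinge-style surrogate stands in for the differentiable approximation of the thresholding):

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((8, 16))     # toy differentiable "feature map"
target = rng.choice([-1.0, 1.0], 8)  # hash bits we want to collide with
x = rng.standard_normal(16) * 0.1    # starting "image"

for _ in range(2000):
    margins = target * (W @ x)       # positive margin = bit already matches
    violated = margins < 1.0         # bits not yet robustly matched
    if not violated.any():
        break
    # gradient step on a smooth relaxation of the thresholded hash;
    # the hard sign() is not differentiable, the hinge surrogate is
    x += 0.05 * (target[violated] @ W[violated])

print(np.array_equal(np.sign(W @ x), target))  # True
```

Without access to gradients of the real hash function (because its weights are private), this loop has nothing to descend on, which is why the secret second hash blocks this particular technique.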


Honestly missed this.

Is that security-through-obscurity? If the model and weights for the second hash function became public, then we could still construct a collision on both functions, right?


It’s not security through obscurity. The implementation is not necessarily a secret; it’s the model configuration which can only be differentiated if known. Since part of the threat is adversarial embeddings, it’s perfectly fine to assume you can keep a private model confidential that adversaries can’t differentiate.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: