Hacker News

You want to use my code, without ever knowing I wrote it? You want to use my hard work, regurgitated anonymously, stripped of all credit, stripped of all attribution, stripped of all identity and ancestry and citation? FUCK YOU!

Training must be opt in, not opt out.

Every artist, every creative individual, must EXPLICITLY OPT IN to having their hard work regurgitated anonymously by Copilot or Dall-E or whatever.

If you want to donate your code or your painting or your music so it can easily be "written" or "painted", in whole or in part, by everyone else, without attribution, then go ahead and opt in.

But if they don't EXPLICITLY OPT IN, you can't use the artist's or author's creative work for training.

All these code/art washing systems that absorb and mix and regurgitate the hard work of creative people must be strictly opt in.



Every human is using the hard work of other humans down through the entirety of history and mostly without credit or attribution. None of us exists in a vacuum and we are all copying each other constantly.

Should students need to attribute the copyrighted textbooks and lessons that they learned from for all their future work?

Should artists attribute every reference they've used? Even if they draw stick figures based on the reference? Even if they only use small parts from multiple references?

What's the difference between a machine learning something and a human learning it?

In practical terms, for open source and permissive licenses, I think it makes the most sense to create new licenses that include no-training clauses for rights holders who dislike machine learning.

Dall-E's use of training on non-permissive copyrighted web-scraped data seems more complicated and I imagine there will eventually be lawsuits to figure that out.


I just don't understand this at all. I publish my code as open source when I can because I want others to find it useful, either by using the software that I wrote or by reusing the code. If I didn't want that, I wouldn't publish the code. But I do want it, so I'm glad there's a way for people to access it more easily.

I understand the argument from an artist's perspective much more, since they don't really have the option to publish their work in a way that any AI or any other artist can't copy off of.


Simply being public doesn't mean it's in the public domain - this applies to movies, art, code, etc.

Examples of restrictive but public license terms include requiring others to share their source code if it's derived from yours, allowing individuals to use a product but not allowing businesses to use it (businesses can use it under a different - likely paid-for - license), or requiring attribution or acknowledgement that they used your code.

There is an argument for fair use if it counts as a substantial derivative, which is a different discussion from why people make it publicly viewable without making it flat out public domain.


That's great for you. I hope you choose a license and copyright terms that enable this specific vision.

The vast majority of open source licenses and copyright terms specifically stipulate the legal requirements for reproducing even just parts of the code, which at a minimum require reproducing the license and copyright notice with all software that includes the licensed and copyrighted code.


Do you place your published code in the public domain or use something like CC0? Or do you use a license with some strings (e.g. attribution) attached?



Is your code public?



