Hacker News

The submitted code must satisfy one of clauses (a), (b), or (c), and separately clause (d), of: https://github.com/nodejs/node/blob/main/CONTRIBUTING.md#dev...

If the submitter picks (a), they assert that they wrote the code themselves and have the right to submit it under the project's license. If (b), the code was taken from somewhere else with clear license terms compatible with the project's license. If (c), the contribution was written by someone else who asserted (a) or (b), and is submitted without changes.

Since LLM-generated output is based on public code but lacks attribution and the license of the original, it is not possible to pick (b). (a) and (c) cannot be picked given the submitter's disclaimer in the PR body.
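For context, the clauses above are the Developer's Certificate of Origin. Projects that adopt it typically have contributors assert it per commit via a Signed-off-by trailer, which `git commit -s` appends automatically. A minimal sketch, assuming git is installed (the name and email are placeholders):

```shell
# Create a throwaway repo and make a signed-off commit.
tmp=$(mktemp -d)
cd "$tmp"
git init -q

# -s/--signoff appends a Signed-off-by trailer using the committer identity.
git -c user.name="Jane Doe" -c user.email="jane@example.com" \
    commit -q --allow-empty -s -m "docs: example commit"

# Show the trailer that asserts the certificate.
git log -1 --format=%B | grep "Signed-off-by:"
# prints: Signed-off-by: Jane Doe <jane@example.com>
```

Whether a given project requires the trailer on every commit, or treats submitting the PR as assent, is project policy; the trailer is just the conventional mechanism.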




Not sure if you are intentionally misrepresenting (a), but here is the full text:

(a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or


That seems exclusive of LLMs, as the user didn't create the contribution, the LLM did.

It's exclusive of code where you wrote 0% of it.

"in part" is a trivial bar to clear.


I guess under a very strict reading, where you take the output and insert a newline somewhere... but that seems against the intent.

Orthogonal to? Irrespective of the use of?

If there's a "the original" the LLM is copying then there's a problem.

If there isn't, then (b) works fine, the code is taken from the LLM with no preexisting license. And it would be very strange if a mix of (a) and (b) is a problem; almost any (b) code will need some (a) code to adapt it.


> the code is taken from the LLM with no preexisting license

That's not good enough to comply with (b). The code must be specifically covered by an open-source license, it's not enough for it to just not have a license.


There's a difference between "no license, all rights reserved" and "no license, public domain". Up until recently, you could assume that not having a license meant the former. But treating the latter as the same would just be silly.

As far as I'm concerned, public domain counts as "an appropriate open source license".


> As far as I'm concerned, public domain counts as "an appropriate open source license".

For material whose author is known and has explicitly placed it in the public domain, sure. For code that fell off the back of a truck, not so much.


I'm of course assuming the legal status quo holds, where code properly generated by an LLM is explicitly public domain. No shadiness involved.

(There's always a risk of an LLM copying something verbatim by accident, but if the designers are doing their job that chance gets low enough to be acceptable. Human code has that risk too after all. (And for situations that aren't an accident, with the human intentionally using snippets to draw out training text, then if they submit that code in a patch it's just a human violating copyright with extra steps.))


> code properly generated by LLM is also explicitly public domain

Where? I hadn't heard of any such ruling.


https://en.wikipedia.org/wiki/Artificial_intelligence_and_co...

This page has a pretty good overview.

> Both the federal and circuit courts in the District of Columbia have upheld the Copyright Office's refusal to register copyrights for works generated solely by machines, establishing that machine ownership would conflict with heritable property rights as established by the Copyright Act of 1976.[16] As of March 2026, the Supreme Court of the United States has denied hearing challenges to the Copyright Office's decision.[17]


To many, it qualifies under either (a) or (b), and therefore (c) as well. Under (a), you can think of the LLM as augmenting your own intelligence. Under (b), the license terms of LLM output are essentially that you can do whatever you want with it. The alternative is avoiding use of AI because of copyright or plagiarism concerns.

It would be considered (a) since the author would own the copyright on the code.

Owning copyright of something and writing it are very different things

Not in the US. Copyright exists from the moment the work is created.

Source: https://www.copyright.gov/help/faq/faq-general.html


Citation needed.

Whether AI output can fall under copyright at all is still up for debate, with some early rulings indicating that the fact that you prompted the AI does not automatically grant you authorship.

Even if it does, it hasn't been settled yet what the impact of your AI having been trained on copyrighted material is on its output. You can make a not-completely-unreasonable argument that AI inference output is a derivative work of AI training input.

Fact is, the matter isn't settled yet, so any open-source project should assume the worst possible outcome. In practice, that means a massive AI-generated PR like this should be treated like a nuke that could go off at any moment.


The two main points are that:

1. Copyright cannot be assigned to an AI agent.

2. Works require human creativity to be applied in order to be copyrightable.

For point 2, this would apply to cases where the AI one-shots a generic prompt. But for these large PRs, where multiple prompts are used and a human has decided what the design should be and how the API should look, you get the human creativity required for copyright.

In regards to being a derivative work I think it would be hard to argue that an LLM is copying or modifying an existing original work. Even if it came up with an exact duplicate of a piece of code it would be hard to prove that it was a copy and not an independent recreation from scratch.

>the worst possible outcome

The worst possible outcome is they get sued and Anthropic defends them from the copyright infringement claim due to Anthropic's indemnity clause when using Claude Code.


That indemnity clause is only for Team, Enterprise and API users. Do you know what was used here?

Also the commercial version is limited to “…Customer and its personnel, successors, and assigns…”. I am very much not a lawyer and couldn’t find definitions of these in the agreement but I am not sure how transferable this indemnity would be to an open source project.


I reviewed it and it looks like personal Claude Code subscriptions are not covered, so it's riskier than I claimed.

Why write open-source software at all, when the government could outlaw open source entirely? What if an asteroid destroys Earth and there are no humans left to enjoy your work? At some point, you have to agree that a risk isn't worth worrying about. Your "worst possible outcome" is just an arbitrary outcome that clears some subjective risk threshold of yours, and it's certainly not one I agree with. Furthermore, calling it a "nuke" is a bad analogy, because that implies it can't be defused once armed. In reality, we're dealing with legal definitions, which can be redefined as easily as defined.

> And it's certainly not one I agree with

Well, it's a good thing you're not on the hook for defending against it, then.

Like I said in another comment, you don't have a license just because they're cool and look neat. You have them specifically to guard against people like patent trolls, who are trying to wreck your shit and take your lunch money. It's not an abstract risk.


> Well, it's a good thing you're not on the hook for defending against it, then

If you are on the hook for defending against it, and your risk assessment is based on emotional, irrational fear and not an objective understanding of the risks, then you're doing people a disservice and should step down.


This is not how law works. Stop pretending that you’re a lawyer. You do not “always assume the worst”. Stop giving legal advice. You’re very clearly a developer in over his head. Law is not an engineering problem. Legislation is not a technical specification. Christ.

No, they're absolutely correct, and they're not saying either of those things. They're pointing out an enormous hidden risk. Yanno, like an engineer is supposed to do.

You don't have a license because it's what all the cool kids are doing, you have one in case shit goes sideways and someone decides to try and ruin your day. You do, in fact, have to assume the worst.

The "nuke" here is some litigious company -- let's call them Patent Troll Rebranded (PTR) -- discovers that the LLM reproduced large amounts of their copyrighted code. Or it claims to have discovered it. They have large amounts of money and lawyers to fight it out in court and you are a relatively shoestring language foundation.

Either you have to unwind years of development to remove the offending code or you're spending six figures or more to defend yourself in court, all because you didn't bother to anticipate things that are anticipatable.



