
While the author is correct in general, I would like to add a counterpoint regarding em-dashes specifically. Yes, many people use them like this - and many website frameworks will automatically replace the keyboard's not-really-a-minus symbol with an em-dash. So that is not a sign of LLM-generated slop.

What LLMs also do though, is use em-dashes like this (imagine that "--" is an em-dash here): "So, when you read my work--when you see our work--what are you really seeing?"

You see? LLMs often use em-dashes without spaces before and after, as a period replacement. Now that is probably only what an Oxford professor would write; I've never seen a human write text like that. So those specific em-dashes are a sure sign of generated slop.





It could also—hear me out here—be me just using compose + --- .

(Not that I used n- or m-dashes previously; I used commas, like this!)

But some people learn n- and m-dash, it turns out. Who knew!


> What LLMs also do though, is use em-dashes like this (imagine that "--" is an em-dash here): "So, when you read my work--when you see our work--what are you really seeing?"

>You see? LLMs often use em-dashes without spaces before and after, as a period replacement.

It would not make any sense at all to use periods in the places where those em-dashes are supposedly "replacing" periods in the example.


Tbh whether I use spaces around em dashes depends more on the font than anything. Some fonts have em dashes that are so long that putting spaces around them would be ridiculous.

> LLMs often use em-dashes without spaces before and after, as a period replacement. Now that is only what an Oxford professor would write probably, I've never seen a human write text like that. So those specific em-dashes is a sure sign of a generated slop.

Evidently, you've never read text from anyone whose job requires writing, publishing, and/or otherwise communicating under rules established in (e.g.) the Chicago Manual of Style.


Those people broadly fall under the "Oxford professor" catch-all phrase. Obviously. I was talking about 99.99% of random internet texts, which do not conform to any manual of style and are not written by literature majors. If I see a text authored by some known figure or published in a respectable journal/site, then I don't need to detect LLM slop in the first place. But when I do want to know whether a text is generated, it is usually anonymous or written by a less sophisticated crowd.


