LLMs already have problems with fact vs fiction. I don't see how Reddit of all p...

uptownfunk · on April 18, 2023

I think the value is in the examples it provides of language.

nekoashide · on April 18, 2023

Filtering for the top-upvoted comments can strip out the useless information, and then the model can be trained on actual data and refined.

Arrath · on April 18, 2023

Except when the top-voted comments are hivemind-approved 'funny' quips/responses, or are in reply to exercises in creative writing, like half the posts in relationshipadvice, iwantthemanager, nuclear/pettyrevenge, etc.

aydyn · on April 18, 2023

Is this a joke that I'm missing? Top Reddit posts are frequently trash filled with misinformation.

minimaxir · on April 18, 2023

Many popular LLMs already include a large amount of Reddit comment data, which is (usually) cited in their respective papers.

surgical_fire · on April 18, 2023

Reddit also has a problem with fact vs fiction.