There's a <50% chance you'll make money off of it because it's zero-sum, and if insiders make money on average then other traders (i.e., you) have to lose money on average.
The issue is that the odds aren't actually 50/50 on you buying either side of the trade; one half will look like a better deal (and given public information, it is a better deal) so you'll buy that half. Then when the market resolves, it'll turn out that insiders knew some piece of information that made the other half of the trade a better choice.
So the person who uses public information is actually a bigger fool than someone who uses no information whatsoever and just picks a side of the bet. If insiders know that a volcano is going to erupt and offer a 50/50 payout, people who know the historical improbability will bet that it will NOT erupt, thinking it is an easy payout. But there will be some idiots out there who bet that it will erupt and win on dumb luck.
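To put rough numbers on the volcano example, here is a back-of-the-envelope sketch. The 90% figure and the names `P_ERUPT_GIVEN_MARKET`, `public_info_trader` and `coin_flip_trader` are made up for illustration; the only assumption that matters is that an insider offers the even-odds bet mostly when they already know the eruption is coming.

```python
def expected_profit(p_win: float, stake: float = 1.0) -> float:
    """Expected profit of an even-odds bet: +stake with p_win, -stake otherwise."""
    return stake * (2.0 * p_win - 1.0)


# ASSUMPTION (illustrative only): given that the insider is offering the bet
# at all, the volcano erupts 90% of the time, whatever the history says.
P_ERUPT_GIVEN_MARKET = 0.9


def public_info_trader() -> float:
    """Always bets NO, because eruptions are historically rare."""
    return expected_profit(1.0 - P_ERUPT_GIVEN_MARKET)


def coin_flip_trader() -> float:
    """Picks YES or NO at random, ignoring all information."""
    p_win = 0.5 * P_ERUPT_GIVEN_MARKET + 0.5 * (1.0 - P_ERUPT_GIVEN_MARKET)
    return expected_profit(p_win)


if __name__ == "__main__":
    print("public-info trader:", round(public_info_trader(), 3))
    print("coin-flip trader:  ", round(coin_flip_trader(), 3))
```

Under these assumed numbers the coin-flipper roughly breaks even while the "informed" trader loses most of their stake on average, which is the adverse selection point in a nutshell.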
The insiders don't consistently have opportunities to profit from their information. It's not every day that the war against Iran changes direction, and that whale has to wait for the next 180 degree turn to make their move.
"The community" is astroturfed as hell though. Anthropic pays influencers to promote Claude Code and likely bots a ton as well, so it's hard to come to any kind of consensus online. Even if everyone was acting in good faith, some people will have a much better experience than others because of the domain they're working in (e.g. AI being much better at frontend and commonly used libraries).
The only real way to evaluate a model is to test it yourself but that's exhausting for each new model and not comprehensive anyway.
Yeah, it's crazy that there is no trustworthy source for model reviews. I'd love to know how well the new Deepseek 4 actually performs, for example, but I don't want to spend the next week testing it out. Reddit used to be a somewhat useful gauge, but now there are posts on how 4 is useless right next to posts on how amazing it is. And I have no idea if this is astroturfing, or somebody using a quantized version, or different workloads, or what.
I also find it increasingly difficult to evaluate the models I actually do use. Sometimes each new release seems identical or only marginally better than the previous version, but when I then go back two or three versions, I suddenly find that older model to be dramatically worse. But was that older model always that quality, or am I now being served a different model under the same version name?
One challenge is that model evaluation is typically domain/application specific. Model performance can also depend on the system prompt and the input/context.
Regarding evaluation, I've found that tools like promptfoo (and in some cases custom tools built on top of it) are useful. They help when evaluating new models/versions and when modifying the system prompt to guide the model, especially if you can define visualizations and assertions that accurately test what you are trying to achieve.
This can be difficult for tasks like summarization, code generation, or creative writing that don't have clear answers. Still, some basic evaluation metrics and test cases can be useful, as is being able to easily do side-by-side comparisons by hand.
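For anyone who hasn't tried this style of testing, here is a minimal sketch of the idea in plain Python. This is not promptfoo's API; `EvalCase`, `run_suite`, `compare` and the two stub models are hypothetical names made up for illustration, and a real setup would call an actual model instead of the stubs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Model = Callable[[str], str]  # anything that maps a prompt to a reply


@dataclass
class EvalCase:
    label: str
    prompt: str
    check: Callable[[str], bool]  # assertion on the model's output


def run_suite(model: Model, cases: List[EvalCase]) -> Dict[str, bool]:
    """Run every case against one model and record pass/fail per label."""
    return {case.label: case.check(model(case.prompt)) for case in cases}


def compare(models: Dict[str, Model], cases: List[EvalCase]) -> Dict[str, float]:
    """Pass rate per model, for a quick side-by-side comparison."""
    scores = {}
    for name, model in models.items():
        results = run_suite(model, cases)
        scores[name] = sum(results.values()) / len(results)
    return scores


# Hypothetical stand-ins for two model versions (no real API calls).
def model_a(prompt: str) -> str:
    return "4" if "2+2" in prompt else "Hello!"


def model_b(prompt: str) -> str:
    return "5" if "2+2" in prompt else "hello there"


CASES = [
    EvalCase("arithmetic", "What is 2+2?", lambda out: "4" in out),
    EvalCase("greeting", "Say hello", lambda out: "hello" in out.lower()),
]

if __name__ == "__main__":
    print(compare({"model_a": model_a, "model_b": model_b}, CASES))
```

The value is mostly in keeping the cases fixed: rerun the same suite whenever the model version or the system prompt changes, and regressions show up without having to rely on memory.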
I'll be launching the game in a couple months so you can see it then :)
There are bugs, it doesn't work perfectly, but that's just part of testing and refinement at this point.
My initial prompt was just: "let's work on converting this java game to c++ using panda3d. you're a panda3d c++ expert. you will be the agent that owns the project, creating the plan, and the delegating each step to sub-agents that create each system in the correct order."
It created like 17 different tasks and sub-agents, and Opus 4.7 orchestrated it. I did personally validate which rendering engine would be good for the project etc. first.
Being good at predicting is a broad skill which is widely applicable, while insider information only works for the specific thing the information is about. Far from every market will have insiders with special information, e.g. on who will win an election.
Not really. I have a pretty solid 5+% edge over a long time period even on the competitive markets I bet in. On many markets I think it's closer to 10-20%. These are things that inside information can't really help as much as you'd think.
And even on markets where someone would benefit from inside information... insiders leak a lot more information than you'd think before it tends to hit the market. Even reading the news can tell you more than you'd think if you look at it right. The single biggest hint is "Why am I reading this, and why now?". News stories on geopolitics almost never arise naturally, and that question will get you to a LOT of information that was not explicitly stated.
> Why do we need a third party gambling apparatus involved?
Wisdom of crowds > wisdom of individual firms, also a market solution actually would work in this case imo. Manipulating the weather seems easy enough to detect and much more expensive than any benefit you'd get, and there aren't really any negative externalities.
I doubt the utility of this simple aphorism. Primarily "the crowd" has a strong bias and would only consist of people willing to take the time to put a financial stake on their position.
> Manipulating the weather seems easy enough to detect
What are you basing this off of?
> and much more expensive than any benefit you'd get
Cloud seeding is not particularly expensive. The problem is the people performing this work may never interact with your market. You're literally playing a rigged game without any clue.
> and there aren't really any negative externalities.
Between these I think Amazon is less bad. It's a monopoly & monopsony which causes a lack of innovation and (eventually) higher prices but it's also a much more efficient way to sell things and it doesn't destroy the fabric of society or anything. Meta though is just as bad if not worse than any gambling site out there. Its products are optimized to destroy your attention span, feed you polarizing content, destroy your mental health and waste hours of your time every day all while ironically making you less connected to other people because users won't get off their phones and have a conversation.
Amazon extracts a lot of the value of a purchase from the seller's take. Sellers risk sanctions if they sell a product cheaper thru their brand website.
Prediction markets are far, far more slippery. Anyone working at one of these places had other options & chose to sell their morals so I think it's perfectly reasonable to not hire them.
I kind of agree, but presumably this would happen more among people maintaining security-critical projects. In that case it'd be a net positive for other projects to get infected first, since if they aren't delaying package updates by 24 hours then security probably isn't quite as important. Which also makes it better in general because hackers will be less incentivized to write viruses if all the really juicy targets will only download them after they've gone undetected for e.g. 7 days.
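For concreteness, the delay rule is simple to express. A minimal sketch, assuming you already have publish timestamps for each version; `newest_allowed` and the release data below are hypothetical and not tied to any particular package manager.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def newest_allowed(
    releases: Dict[str, datetime], now: datetime, cooldown: timedelta
) -> Optional[str]:
    """Return the most recently published version that is at least
    `cooldown` old, or None if every release is still too fresh."""
    eligible = {v: t for v, t in releases.items() if now - t >= cooldown}
    if not eligible:
        return None
    return max(eligible, key=lambda v: eligible[v])


# Made-up release history for illustration.
NOW = datetime(2025, 1, 10, 0, 0, tzinfo=timezone.utc)
RELEASES = {
    "1.0.0": datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc),
    "1.1.0": datetime(2025, 1, 5, 0, 0, tzinfo=timezone.utc),
    "1.2.0": datetime(2025, 1, 9, 18, 0, tzinfo=timezone.utc),  # 6 hours old
}

if __name__ == "__main__":
    print(newest_allowed(RELEASES, NOW, timedelta(hours=24)))
    print(newest_allowed(RELEASES, NOW, timedelta(days=7)))
```

A longer cooldown trades freshness for time in which someone else can notice a compromised release first.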