varenc's comments

What is the economic value of all these AI chat logs? I can see them being useful for building advertising profiles. But I wonder if they're also just sold as training data to people trying to build their own models?

Pretty easy to match up those logs with browser fingerprinting to identify the actual user. Then you have "do you want to purchase what Mr. Foo Bar is prompting the LLM?"

If full AGI dreams are achieved and 80% of jobs disappear, leading to mass unemployment, then we need to do something to support the huge numbers of people who no longer have any income. Taxes to support a UBI program seem like one solution. Or maybe the labor market can shift to find opportunities for humans that AI can't replace, and we'd avoid the mass unemployment.

But feels like we're a long way from that right now.


We have "disappeared" ~97% of jobs since the Industrial Revolution started, and no increased unemployment has materialized.

Until you understand how something that counterintuitive happened, you should not speculate on how AI replacing current jobs will play out!


If you're so sure that new jobs will appear (and, a critical omission, that they will be any good), surely you would be willing to ask the capital interests for whom these arguments are self-serving to put their money where their mouth is and backstop a guarantee?

No?

Hmmmmmm.


This analogy happens a lot, and it might be true, but it's not clear to me that they're comparable.

The Industrial Revolution mostly ate mechanical labor and created more 'thinking' and knowledge worker jobs closer to the top of the stack. AGI goes after the information / decision-making layer itself. And it's unclear how much remains once those are automated.


I consider the Industrial Revolution to still be ongoing, since jobs have constantly been automated away by technology for 250 years. Some like to split that time into separate eras. In that paradigm we're now in the Fifth Industrial Revolution (Industry 5.0).

Whatever you call it, jobs keep getting "stolen" by technology, and yet employment rates stay high and average living standard keeps rising.

I'm genuinely fascinated by how this keeps happening, decade after decade, and yet most people are convinced the opposite is happening. I'm old enough to remember this exact discussion from 50 years ago.

We all see and interact with jobs that did not exist 20 years ago, and many of us work those jobs. And yet... this knowledge is somehow compartmentalized away from future expectations.

If you want a theoretical framework for why this keeps happening, my thought is that unemployed humans are an unused resource. And capitalism is really good at finding ways to use those.


I suspect that the reason might be that the Industrial Revolution happened over 200 years ago. That provides a lot of time for 97% of jobs to progressively disappear without disrupting society too much (except for all the revolutions and world wars). That would be quite different than if AI caused any significant percentage of jobs to disappear in a much shorter period of time.

Can you give a source for the 97% claim?

I have two ways to think of it, and both give similar numbers.

A: 250 years ago, 98% worked in farming. Today it's 2% (who produce more food!). Assume today's workers in the other 2% of old jobs are at least twice as productive, and you get that ~3% of today's population produces as much as 100% did back then.

B: It's hard to directly estimate how much GDP per person has increased in 250 years, but the typical number economists arrive at is around 30x. Which means about 3.3% of today's workforce produces as much as the whole workforce did back then.

Both A and B can be critiqued, but the precise numbers don't really matter for the argument.
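
Spelled out as arithmetic (same assumptions as A and B above; this is just the numbers restated, not a new claim):

  # Estimate A: 2% still farm; the old 2% of non-farm jobs now need
  # half the people, assuming they're at least twice as productive.
  a = 0.02 + 0.02 / 2   # 0.03 -> ~3% of today's workforce

  # Estimate B: if output per person is ~30x, matching the old total
  # output takes 1/30 of today's workforce.
  b = 1 / 30            # ~0.033 -> ~3.3%

  print(round(a, 3), round(b, 3))  # 0.03 0.033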


We don't need taxes to pay a UBI.

If "every country" is in debt, who owns the debt exactly? ... (it's not real debt)


The problem with socialism is that eventually one runs out of other people's money.

For an example of what unlimited borrowing and money printing results in, look up Germany in 1921-1923.


We're 45 years into the trickle-down experiment and we can now tell if what trickled down was gold or piss.

(It was piss.)


Yeah, when folks say they want to go back to the '50s, I immediately ask, "Bring back the 90% tax bracket? Yes, please."

Sure, but then at least we don't have the ultra wealthy coming up with ways to make everyone else's lives worse.

If we took Elon Musk's money away and simply burnt it, that would still be a net win for society as a whole.


I’m pretty sure the poster you’re replying to is hinting at MMT. And as for your own statement:

Money is a nation’s currency. It’s actually the property of that nation’s people, and you only get a lease on it.

If you disagree, then try to do something like ceding the land that you “own” to another nation and see how that goes.


Lots of people already have Apple TVs and the Tailscale integration is pretty good and can serve as an always online exit node. So no new hardware required. Could even remotely walk a non-techie through the process without too much effort.

Personally, I've just upgraded my family's wifi to Ubiquiti and can then use Tailscale (WireGuard) running on the gateway as a proxy! (with their permission)


Is it that common outside the US? I know of exactly one family here in Germany that has an Apple TV.

The only folks using Apple TV in 2026 are like 60+ yrs old.

I've literally not seen one in anyone's home for probably 5+ years. And even then nobody used them.

Apple TV was one of those products that relatively few people bought, but they were loud about buying it, so it seemed more popular than it was. Then other devices like the Roku ($20) quickly replaced it.

I'm in the USA.


Roku became adware and most of my friends/family switched to AppleTV

They’re not insanely common even in the US, since Roku and Android sticks are cheaper and I don’t live in a wealthy area, but they’re not hard to get or unheard of.

The distinction between AppleTV, the hardware, and Apple TV+, the streaming service, was lost on many. Now that they are “Apple TV 4K” hardware and “Apple TV” service, it’s even harder to convey the correct meaning.


It is in the UK, but I don’t think it is on the continent.

I've never seen one in Poland.

Interesting to learn you can identify the real country/area of origin using probe latency. Though could this be fooled? Like, what if the VPN IP just added 100-300ms of latency to all of its outgoing traffic? Ideally you'd vary the latency based on the requesting IP's location. And also just ignore typical probe requests like ICMP (ping). And ideally all the IPs near the end of the traceroute would do all this too.

To use an example, 74.118.126.204 claims to be a Somalian IP address, but ipinfo.io identifies it as being from London based on latency. Compare `curl ipinfo.io/74.118.126.204/json` vs `curl ipwhois.app/json/74.118.126.204` to see. If that IP ignored pings and added latency to all outgoing packets, I wonder if that would stymie ipinfo's ability to identify its true origin.
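
To make the "add latency" idea concrete, here's a toy sketch of a reply-padding responder (purely illustrative; a real VPN endpoint would do this in the kernel or network layer, not a userspace echo server, and the port number is arbitrary):

  import asyncio

  EXTRA_DELAY_S = 0.150  # pretend to be ~150ms farther away than we are

  class DelayedEcho(asyncio.DatagramProtocol):
      def connection_made(self, transport):
          self.transport = transport

      def datagram_received(self, data, addr):
          # Schedule the reply instead of sending immediately, so
          # round-trip measurements overestimate the distance to us.
          asyncio.get_running_loop().call_later(
              EXTRA_DELAY_S, self.transport.sendto, data, addr)

  async def main():
      loop = asyncio.get_running_loop()
      await loop.create_datagram_endpoint(DelayedEcho, local_addr=("0.0.0.0", 9999))
      await asyncio.sleep(3600)  # serve for an hour

  asyncio.run(main())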


It isn't just latency, but "triangulation".

  [IPinfo] pings an IP address from multiple servers across the world and identify the location of the IP address through a process called multilateration. Pinging an IP address from one server gives us one dimension of location information meaning that based on certain parameters the IP address could be in any place within a certain radius on the globe. Then as we ping that IP from our other servers, the location information becomes more precise. After enough pings, we have a very precise IP location information that almost reaches zip code level precision with a high degree of accuracy. Currently, we have more than 600 probe servers across the world and it is expanding.
u/reincoder, https://news.ycombinator.com/item?id=37507355
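
For intuition, here's a toy version of that multilateration idea (my own sketch, not IPinfo's actual algorithm): each probe's RTT caps how far away the target can be, since light in fiber covers roughly 100 km per millisecond of round trip.

  import math

  KM_PER_MS_RTT = 100  # light in fiber ~200,000 km/s, halved for the round trip

  def haversine_km(lat1, lon1, lat2, lon2):
      # Great-circle distance between two points, in km.
      p1, p2 = math.radians(lat1), math.radians(lat2)
      dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
      a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
      return 2 * 6371 * math.asin(math.sqrt(a))

  def estimate_location(probes, candidates):
      # probes: (lat, lon, rtt_ms); candidates: (name, lat, lon).
      # Pick the candidate that least violates the distance bounds.
      def violation(c):
          _, lat, lon = c
          return sum(max(0.0, haversine_km(plat, plon, lat, lon) - rtt * KM_PER_MS_RTT)
                     for plat, plon, rtt in probes)
      return min(candidates, key=violation)[0]

  # A 3ms RTT from a London probe alone rules out Mogadishu (~6,900 km away).
  probes = [(51.5, -0.1, 3.0), (40.7, -74.0, 75.0), (1.3, 103.8, 180.0)]
  cities = [("London", 51.5, -0.1), ("Mogadishu", 2.0, 45.3), ("New York", 40.7, -74.0)]
  print(estimate_location(probes, cities))  # -> London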

There's quite a bit of effort in this space.

In my first job out of school, I did security work adjacent to Fortune 50 banks, and the (now defunct) startup I worked at partnered with some folks working on Pindrop (https://www.pindrop.com/).

Their whole thing at the time was detecting when it was likely that a support call was coming from a region other than the one the customer was supposed to be in (read: fraudulent) by observing latency and noise on the line (the name is a play on "We're listening closely enough to hear a pin drop".)

Long story short, it's a lot more than just the latency that can clue someone in on the actual source location, and even if you introduce enough false signal to make it hard to identify where you actually are, it's easy to spot that and flag you as fake, even if it's hard to say exactly what the real source is.


I work for IPinfo.

We also run traceroutes. Actually, we run a ton of active measurements from our ProbeNet. The amount of location data we process is staggering.

https://ipinfo.io/probenet

Latency is only one dimension of the data we process.

We are pinging IP addresses from 1,200+ servers in 530 cities, so if you add synthetic latency, chances are we can detect that. Then the latency-related location hint score will go down, and we will prioritize the dozens of other location hints we have.

But we do welcome anyone trying to fool us that way. We would love to investigate it!


Do you run traceroutes and pings in both directions?

In the case of a ping you might think it shouldn't matter but I can imagine a world where a VPN provider configures a server in London to route traffic via Somalia only when a user establishes a connection to the "Somalia" address of the server. You could only test this if you did a traceroute/ping through the VPN.

And I'm not saying this is what's happening but if you just ping the IP from your infra, couldn't stuff like anycast potentially mess you up?

In the case of traceroutes, you only see the route your traffic takes to the VPN, you don't see the route it takes to get back to you, which I think is really important.


We run traceroutes and latency measurements from many different locations, so we are looking at aggregate behavior rather than any single path. When you combine data from hundreds of ProbeNet PoPs over time, asymmetric routing mostly shows up as noise. When that happens, latency based hints lose weight and we lean more on other signals.

We have seen this in practice. For example, when we deployed servers in Gambia, even traffic between local networks often left the country and came back, due to limited peering and little use of the national IXP. Still, the overall routing patterns were learnable once you looked at enough paths.

For VPNs, we are measuring the location of the endpoint IP itself, not user traffic inside a tunnel. If routing only changes after a tunnel is established, that is a service level behavior, not the network location of the IP.

Anycast and tunneling are things we explicitly detect. They tend to create clear patterns like latency clustering or unstable paths; when we see those, we flag them as anycast IPs and default to their geofeed location.

See the classic: https://ipinfo.io/1.1.1.1


If the VPN IP and the last ~4 hops in the traceroute just ignored ICMP pings, or just all inbound traffic, it sounds like that'd make your detection harder?

I've found that this isn't even that uncommon. One of the example VPN IPs in the article had the last 3 hops in the traceroute ignoring ICMP (though TCP traceroute worked). The VPN IP itself didn't, but it easily could!

(feel free to ignore lest we not give bad actors ideas)


This can fool someone from one location, and only in one direction: you can add latency but never remove it (if a probe near Somalia expects ~10ms, a VPN actually in London can't reduce its latency to simulate being in Somalia). So the padding has to be dynamic to fool multiple locations and stay plausible.

But anyway, *you can't fool the last-hop latency* (unless you control the last hop, and you can't control the whole path); it's basically impossible to fool that.


Not that simple.

If they added latency to all packets then London would still have the lowest latency.


Does this really work? I would think the ping time would be dominated not by the speed of light but by the number of hops and connection quality.

As a hypothetical example, an IP in a New York City data center is likely to have a shorter ping to a London data center than a rural New York IP address does.


The speed of light sets a minimum bound even if you don't account for that, and these are coming up less than the minimum bound.
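
The bound is easy to compute; a back-of-envelope sketch (the distance figure is approximate):

  dist_km = 6900            # rough great-circle distance, London <-> Mogadishu
  fiber_km_per_s = 200_000  # light in fiber travels at roughly 2/3 c
  min_rtt_ms = 2 * dist_km / fiber_km_per_s * 1000
  print(f"{min_rtt_ms:.0f} ms")  # ~69ms, so a few-ms ping from London rules out Somalia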

It also reminds me of this old story: https://web.mit.edu/jemorris/humor/500-miles


It would be even slower, since light travels slower in optical fiber, and there's time associated with each repeater as well.

That is a great one!

It's possible to deduce password hashes by timing responses over the internet if the server isn't using a constant-time comparison. Noise is just that: noise.
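
For reference, the standard server-side fix is a comparison whose timing doesn't depend on where the inputs differ; a minimal Python sketch:

  import hmac

  def naive_check(secret: bytes, guess: bytes) -> bool:
      # Early-exit comparison: the time taken leaks how much of the
      # guess matched, which an attacker can measure over many requests.
      return secret == guess

  def safe_check(secret: bytes, guess: bytes) -> bool:
      # hmac.compare_digest takes time independent of where the inputs
      # differ, so response timing leaks nothing about the secret.
      return hmac.compare_digest(secret, guess)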

With enough packets you can trilaterate an approximate location. Adding random jitter will just delay it a bit.

More than a bit!

Once you know the exit IP you can just find network(s) advertising it.

The VPN provider only controls their network, not their upstream.

So you can set a minimum latency on your responses, but your upstream networks won't be doing this.


If you add 300ms of latency, then yes, you defeat this detection mechanism.

Only if the detection mechanism is looking at that single IP and from a single location.

Find the ASN(s) advertising that network and figure out their location.

Even within the ASN there may still be multiple hops, and those IPs may be owned by others (e.g. the hosting facility) who are not playing the same latency games.


We operate servers for the purpose of measuring the internet using a wide variety of methods. We have more than 1,200 of these servers distributed across 530 cities, running not only ping but traceroute and many other types of active measurements.

In addition to active measurement and research, there are many other sources of data we use, and we are actively investing in R&D to develop new ones. Adding just 300ms of latency at the endpoint would simply appear as noise to us. We have dozens of other location hints that cut through the noise.

We welcome people to try to break the system. Perhaps it is possible to dupe this system.


Ideally, there'd be a way to subtract lag. (A non-causal network switch? Would be big business...)

If you ping it from the UK and the ping is under 10ms, then you know it's there. And you are triangulating from multiple countries.

You could vary the additional latency based on the location of the IP you're replying to? Or just hash the requesting IP and use that as a seed to generate that particular IP's random extra latency, which always stays the same for that IP. That feels like enough to make triangulation hard. Though I'm just spitballing.
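
Continuing the spitballing, the per-IP seed idea might look something like this (entirely hypothetical):

  import hashlib

  def extra_delay_ms(src_ip: str, lo: int = 100, hi: int = 300) -> int:
      # Derive a stable delay from the prober's IP: repeated measurements
      # from the same probe always see the same padding, so simple
      # averaging never converges on the true latency.
      h = hashlib.sha256(src_ip.encode()).digest()
      return lo + int.from_bytes(h[:4], "big") % (hi - lo + 1)

  print(extra_delay_ms("203.0.113.7"))  # same IP in -> same delay out, every time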

How did they get leaked? Just someone getting into your personal Claude Code logs? I'm surprised that, if it was just that, Google would even be aware they'd leaked.

Claude was looking up env vars during the coding session, which ended up in a ~/.claude/projects/ log. I wanted to make the [construction] logs public with the code. Didn't think that was a leak vector.

How would Google or OpenAI have alerted you? Anthropic could alert you because they scan for their keys and detected one of theirs in the logs. If anything, it's bad that Anthropic only notified you about their key, and not the other keys that leaked.

They all partner with GitHub to detect leaked credentials. In order to have API keys, I need to have an account with each service with a valid email. So all three of them had the same information and channels available to reach me. It wouldn't have mattered how the keys got leaked; in the current setup, Anthropic would have reached me first and deactivated my key.

Claude (or other LLMs, for that matter) wouldn't know they leaked the keys because I did, by trying to make the construction logs public. I just wasn't expecting the logs to have keys in them from my env vars.
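
One cheap guard before publishing logs like these is a regex sweep for key-shaped strings. A rough sketch (the prefixes are the commonly documented ones; the patterns and the log path/layout are assumptions):

  import re
  from pathlib import Path

  KEY_PATTERNS = [
      re.compile(r"sk-ant-[A-Za-z0-9_-]{20,}"),  # Anthropic-style
      re.compile(r"sk-[A-Za-z0-9_-]{20,}"),      # OpenAI-style
      re.compile(r"AIza[0-9A-Za-z_-]{35}"),      # Google-API-key-style
  ]

  for path in Path.home().glob(".claude/projects/**/*.jsonl"):
      text = path.read_text(errors="ignore")
      for pattern in KEY_PATTERNS:
          for match in pattern.findall(text):
              print(f"{path}: possible key {match[:12]}...")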



Comments moved thither. Thanks!

(But @dang doesn't work - you have to email hn@ycombinator.com to get reliable delivery. I found out about this dupe because someone else did that.)


100% on the AIME (assuming it's not in the training data) is pretty impressive. I got like 4/15 when I was in HS...

The no-tools part is what's impressive; with tools, every model gets 100%.

If I recall, AIME answers are always integers from 000 to 999. And most of the problems are the type where, if you have a candidate number, it's reasonable to validate its correctness. So it's easy to brute force all 1,000 candidates with code.

tl;dr: humans would do much better too if they could use programming tools :)
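
For illustration, the brute-force shape looks like this, where is_valid() stands in for a problem-specific check (the example condition is AIME-flavored, not an actual contest problem):

  def is_valid(n: int) -> bool:
      # e.g. "smallest n whose square ends in 444"
      return (n * n) % 1000 == 444

  # AIME answers are integers 0-999, so just try every candidate.
  answer = next(n for n in range(1000) if is_valid(n))
  print(answer)  # 38, since 38**2 = 1444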


uh, no, it's not solved by looping over candidate numbers when it uses tools

Link? I couldn't find the IQ test in his projects. And I'm super skeptical that neal.fun is trying to trick people into $30/month subscriptions.

edit: I turned off my ad blocker and discovered the site is showing some ads. Guessing you clicked on an ad?

Also, it's pretty ironic, because one of his projects showcases dark patterns: https://neal.fun/dark-patterns/


Apple has gone a similar way, effectively killing kernel extensions for the same reasons. In theory all the kernel extension use cases have been replaced with "System Extensions", but of course it's not the same.

> What’s the middle for taking cyanide ? 1g? 1kg?

Cyanide has an LD50 (the dose lethal to 50%) in the 1-2 mg/kg range when taken orally. So the "middle" for taking cyanide is probably 1.5 mg/kg: 90 mg for someone weighing 60 kg.

Sadly the middle ground in other topics is less easy to define!


Yeah but what’s the bottom and the top?

Bottom is 0? Or is it 1mg? Or what?

Top is a dose that kills a human? A dose that kills an elephant? One that makes you feel dizzy?

What’s too much? What’s being in the middle?

You see, “in the middle” is stupid logic. The point is that being in the middle, or moderate, is not always “good”.

Also, it’s hard to define. Tbh, only illogical people throw around phrases like “in the middle”, etc.

