> I had a friend who would drink a gallon of whole milk a day to maintain weight because he did so much at the gym.
That honestly might be an absorption issue, not an intake issue - you can hit aerobic limits enough for your body to skip digesting stuff & just shove protein directly out of the stomach instead of bothering to break it down.
My experience with this was a brief high altitude climb above 5km in the sky, where eating eggs & ramen stopped working and only glucon-d kept me out of it.
The way I like to think of it is that the fat in your body can be eaten or drunk, but needs to be breathed out as CO2 to leave it.
The rate at which you can put it in and the rate of letting it go are completely different.
UUIDv7 is only bad for range partitioning and privacy concerns.
The "naturally sortable" is a good thing for postgres and for most people who want to use UUID, because there is no sorted distribution buckets where the last bucket always grows when inserting.
I want to see something like HBase or S3 paths when UUIDv7 gets used.
> UUIDv7 is only bad for range partitioning and privacy concerns.
It's no worse for privacy than other UUID variants if the "privacy" you're worried about leaking is the creation time of the UUID.
As for range partitioning, you can of course choose to partition on the hash of the UUIDv7, at the cost of giving up cheaper writes / faster indices. On the other hand, that of course gives up locality, which is a common challenge of partitioning schemes. It depends on the end-to-end design of the system, but I wouldn't say that UUIDv7 is inherently good or bad, or better/worse than other UUID schemes.
Isn't it at least a bit worse than v4, which has no timestamp at all? There might be concerns around non-secure randomness being used to generate the bits, but I don't feel like it's accurate to claim that's indistinguishable from a literal timestamp.
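The timestamp really is sitting in the clear in the most significant 48 bits, so extracting it is trivial; a minimal sketch using Python's uuid module (assuming the value is a UUIDv7):

import uuid
from datetime import datetime, timezone

def uuid7_created_at(u: uuid.UUID) -> datetime:
    # Per RFC 9562, the top 48 bits of a UUIDv7 are a Unix timestamp in milliseconds.
    ms = u.int >> 80
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

With a v4 there is nothing comparable to extract; the whole value is random apart from the version/variant bits.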
Why is it bad for range partitioning? If anything, it's better? With UUIDv7, you can basically partition on the primary key, and thus have a "global" unique constraint.
This is useful, but only for someone who wants to do JIT work without writing assembly code, yet can read assembly code back into C (or can automate that part).
Instead of doing all manual register allocations in the JIT, you get to fill in the blanks with the actual inputs after a more (maybe) diligent compiler has allocated the registers, pushed them and all that.
There's a similar set of implementation techniques in Apache Impala, where the generated JIT code only calls into library functions instead of emitting inline operations, so that they can rely on shorter compile times for the JIT and deeper optimization passes for the called functions.
It's a good write-up but I wish this blog post went just a little bit deeper with the investigation to confirm whether this is the issue (ollama.tymscar.com having an AAAA record); it's missing the answer to "Why is the JVM trying (or initializing toward) an IPv6 path first and not gracefully falling back?"
> Clearly useful to people who are already competent developers
> Utterly useless to people who have no clue what they're doing
> the same way that a fighter jet is not useless
AI is currently like a bicycle, while we were all running hills before.
There's a skill barrier, and it's getting less complicated each week.
The marketing goal is to say "Push the pedal and it goes!" as if it were a car on a highway, but it is a bicycle: you have to keep pedaling.
The effect on the skilled-in-something-else folks is where this is making a difference.
If you were thinking of running, the goal was to strengthen your tendons to handle the pavement. And a 2hr marathon pace is almost impossible to do.
Like a bicycle makes a <2hr marathon distance "easy" for someone who does competitive rowing, while remaining impossible for those who have been training to do foot races forever.
Because the bicycle moves the problem from unsprung weights and energy recovery into a VO2 max problem, also into a novel aerodynamics problem.
And if you need to walk a rock garden, now you need to lug the bike along with you. It is not without its costs.
This AI thing is a bicycle for the mind, but a lot of people go only downhill and with no brakes.
True, AI moves the problem somewhere else. But I’m not sure the new problems are actually easier to solve in the long run.
I’m a reasonable developer with 30+ years of experience. Recently I worked on an API design project and had to generate a mock implementation based on a full OpenAPI spec. Exactly what Copilot would be good at. No amount of prompting could make it generate a fully functional Spring Boot project doing both the mock API and presenting the spec at a URL at the same time. Yet it did a very neat job at just the mock for a simpler version of the same API a few weeks prior. Go figure.
> if you are into load balancing, you might also want to look into the 'power of 2 choices'.
You can do that better if you don't use a random number for the hash but instead flip a coin (well, check a bit of the hash of a hash), to make sure hash expansion works well.
This trick means that when you go from N -> N+1 buckets, the keys that move all land in the new N+1 bucket instead of being rearranged across all of them.
I saw this two decades ago and, after seeing your comment, felt like getting Claude to recreate what I remembered from back then & write a fake paper [1] out of it.
See the MSB bit in the implementation.
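Roughly, the idea looks like this (a linear-hashing style sketch of what I remember, not the code from the paper):

def bucket_for(h: int, num_buckets: int) -> int:
    # Buckets below the split pointer have already been split and look at
    # one extra bit of the hash; growing from N to N+1 buckets splits exactly
    # one old bucket, and every key that moves lands in the new bucket.
    level = 1 << (num_buckets.bit_length() - 1)  # largest power of two <= num_buckets
    split = num_buckets - level                  # buckets already split this round
    b = h % level
    if b < split:
        b = h % (2 * level)  # the extra bit: stay put, or move to the new bucket
    return b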
That said, consistent hashes can split ranges by traffic not popularity, so back when I worked in this, the Membase protocol used capacity & traffic load to split the virtual buckets across real machines.
Hot partition rebalancing is hard with a fixed algorithm.
> Is the big win that you can send around custom little vector embedding databases with a built in sandbox?
No, this is a compatibility layer for future encoding changes.
For example, ORCv2 has never shipped because we tried to bundle all the new features into a new format version, ship all the writers with the features disabled, then ship all the readers with support and then finally flip the writers to write the new format.
Specifically, there was a new flipped bit version of float encoding which sent the exponent, mantissa and sign as integers for maximum compression - this would've been so much easier to ship if I could ship a wasm shim with the new file and skip the year+ wait for all readers to support it.
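Roughly, the decomposition step would look like this (an illustrative Python sketch, not the actual ORCv2 writer code):

import struct

def split_float64_streams(values):
    # Pull each IEEE 754 float64 apart into sign, exponent and mantissa
    # integers so the three streams can be compressed separately.
    signs, exponents, mantissas = [], [], []
    for v in values:
        bits = struct.unpack("<Q", struct.pack("<d", v))[0]
        signs.append(bits >> 63)
        exponents.append((bits >> 52) & 0x7FF)
        mantissas.append(bits & ((1 << 52) - 1))
    return signs, exponents, mantissas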
We'd have made progress with the format, but we'd also be able to deprecate a reader impl in code without losing compatibility if the older files carried their own information.
Today, something like Spark's variant type would benefit from this - the sub-columnarization that does would be so much easier to ship as bytecode instead of as an interpreter that contains support for all possible recombinations from split up columns.
PS: having spent a lot of nights tweaking tpc-h with ORC and fixing OOMs in the writer, it warms my heart to see it sort of hold up those bits in the benchmark
I'm less confident. Your description highlights a real problem, but this particular solution looks like an attempt to shoehorn a technical solution onto a political/people problem. It feels like one of those great ideas that, years later, results in thousands of different decoders, breakages, and a nightmare to maintain. Then someone starts an initiative to move decoding from being bundled to instead just defining the data format.
Sometimes the best option is to do the hard political work: improve the standard and get everyone moving with it. People have pushed Parquet and Arrow, which are absolutely great technologies that I use regularly, but 8 years after someone asked how to write Parquet in Java, the best answer is to use DuckDB: https://stackoverflow.com/questions/47355038/how-to-generate...
Not having a good Parquet writer for Java shows a poor attempt at pushing forward a standard. Similarly, Arrow has problems in Java land. If they can't be bothered to consider how to actually implement and roll out standards for a top-5 language, I'm not sure throwing WASM into the mix will fix it.
> Python is never the best language to do it in, but is almost always the second-best language to do it in.
I've been writing Python since the last century, and this year is the first time I'm writing production-quality Python code; everything up to this point has been first-cut prototypes or utility scripts.
The real reason it has stuck with me while others came and went is the REPL-first attitude.
A question like
>>> 0.2 + 0.1 > 0.3
True
is much harder to demonstrate in other languages.
The REPL isn't just for the code you typed out, it does allow you to import and run your lib functions locally to verify a question you have.
It is not without its craziness with decorators, fancy inheritance[1] or operator precedence[2], but you don't have to use it if you don't want to.
Welcome to Rakudo™ v2025.06.1.
Implementing the Raku® Programming Language v6.d.
Built on MoarVM version 2025.06.
To exit type 'exit' or '^D'
[0] > 0.1 + 0.2 > 0.3
False
Raku uses a rational type by default for those literals, which gives an exact value. If you use Python's Fraction type, it would be equivalent to your Raku. The equivalent in Raku to the Python above would be:
The REPL example intends to show what the program does, not whether or not something is intuitive for you.
Second, using your same argumentation,
>>> 010 + 006 == 14
True
Is also wrong.
It's based on a misunderstanding of what representations of numbers in programming languages are.
In Python (and almost all other languages), 0.1 means the IEEE float closest to the decimal number 0.1, and arithmetic operations are performed according to the IEEE standard.
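You can ask Python for the exact value of that nearest float, which makes the distinction concrete:

>>> from decimal import Decimal
>>> Decimal(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')
>>> (0.1).as_integer_ratio()
(3602879701896397, 36028797018963968)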
I am not “missing the point”, I am disagreeing with you. (Hopefully in an agreeable way)
I am making the point that using a decimal literal (eg 0.1) representation for an IEEE double is a bad choice and that using it as a representation for a Rat (aka Fraction) is a better choice.
I 100% accept your point that in Python 0.1+0.2>0.3 is true which is why I prefer Raku’s number system.
Note that "On overflow of the denominator during an arithmetic operation a Num (floating-point number) is returned instead." A Num is an IEEE 754 float64 ("On most platforms" says https://docs.raku.org/type/Num)
Python always uses IEEE 754 float64, also on most platforms. (I don't know of any Python implementation that does otherwise.) If you want rationals you need the fractions module.
>>> from fractions import Fraction as F
>>> F("0.1") + F("0.2") == F("0.3")
True
>>> 0.1 + 0.2 == 0.3
False
This corresponds to Raku's FatRat, https://docs.raku.org/type/FatRat. ("unlike Rat, FatRat arithmetics do not fall back Num at some point, there is a risk that repeated arithmetic operations generate pathologically large numerators and denominators")
that said, decimals (eg 0.1) are in fact fractions, and the subtlety that 0.1 decimal cannot be precisely represented by a binary floating point number in the FPU is ignored by most languages where the core math is either integer or P754
bringing Rational numbers in as a first class citizen is a nice touch for mathematicians, scientists and so on
another way to look at it for Raku is that
Int → integers (ℤ)
Rat → rationals (ℚ)
Num → reals (ℝ)
"0.1" is what the language specification says it is, and I disagree with the view that it's ignored by most languages when it's often clearly and explicitly stated.
That most people don't know IEEE 754 floats, and do things like store currency as floats, is a different matter. (For that matter, currency should be stored as decimal, because account rules can be very particular about how rounding is carried out.)
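For example, with Python's decimal module the rounding rule is an explicit choice rather than whatever the hardware float does (amounts here are just for illustration):

>>> from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN
>>> Decimal("2.665").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
Decimal('2.67')
>>> Decimal("2.665").quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)
Decimal('2.66')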
Similarly, 3 * 4 + 5 may 'in fact' be 17 .. sometimes. But it's 27 with right-to-left precedence ... and 19683 in APL where * means power (3 to the power of 9). While 3 + 4 * 5 may be 35 or 23 (or 1027 in APL).
FWIW, FatRat is ℚ, not Rat. Rat switches to Num if the denominator is too high, as I quoted.
Bringing it back to Python, ABC (which influenced Python's development) used a ratio/fraction/FatRat natively, which handled the 0.1 + 0.2 == 0.3 issue, but ran into the 'pathologically large numerators and denominators' problem even for beginning students.
I see Rat as a way to get the best of both worlds, but I'm certain it has its own odd edge cases, like I suspect x + 1/13 - 1/13 might not be the original value if x + 1/13 caused a Rat to Num conversion.
when I was a kid in junior school, I was taught that 0.1 means 1/10: something like, the . is a division sign, the digits to the right are the numerator, and the denominator is 10 to the power of the digit's position
true, in fact the syntax of Python consumes the literal '0.1' as a double [float64] ... so ok maybe I was a bit strong that my fact trumps the Python fact (but it still feels wrong to say that 0.1 + 0.2 > 0.3)
---
I welcome your correction on FatRat ... btw I have just upgraded https://raku.land/zef:librasteve/Physics::Unit to FatRat. FatRat is a very useful string to the bow and imo cool that it's a core numeric type.
We are on the same page that the Rat compromise (degrade to P754) is optimal.
---
As you probably know, but I repeat here for others, Raku has the notion of https://docs.raku.org/language/numerics#Numeric_infectiousne... which means that `x + 1/3` will return a Rat if x is an Int, or a Num if x is a Num. All "table" operators - sin, cos, log and so on - are assumed to return irrationals (Num).
You likely also learned in school that some calculators do left-right evaluation while other, more expensive ones, do PEMDAS. And a few do postfix instead. You might also have learned that most calculators didn't handle 1/3 as a fraction, in that 1 / 3 * 3 is 0.99999999.
Python is a fancy calculator.
To be clear, while in the mathematical sense, yes, sin, cos, and log generally return irrationals, in their IEEE 754 forms they return an exact value within 1 ulp or so of that irrational number. Num is a rational. ;)
>>> x=5**0.5
>>> x
2.23606797749979
>>> x.as_integer_ratio()
(629397181890197, 281474976710656)
Scheme uses the phrase "numerical tower" for the same sort of implicit coercion.
I just realized that in school you likely also learned that 1.0 and 1.000 are two different numbers for physical measurements as the latter implies a higher measurement precision.
Given how slow Python is, isn't it embarrassing that 0.2 + 0.1 > 0.3 ?
I have some test Rust code where I add up about a hundred million 32-bit floating point numbers in the naive way, and it takes maybe a hundred milliseconds, and then I do the same but accumulating in a realistic::Real because hey how much slower is this type than a floating point number, well that's closer to twenty seconds.
But if I ask Python to do this, Python takes about twenty seconds anyway, and yet it's using floating point arithmetic so it gets the sum wrong, whereas realistic::Real doesn't because it's storing the exact values.
If you have a hundred million integers to add, please, by all means, use Rust, or C, and intrinsics for the features not yet in your favorite math libraries. You can call Rust from Python, for instance. Polars is excellent, BTW. This seamless integration between a nice language amenable to interactive experimentation and highly optimized code produced in many other languages I don't want to write code in is what makes Python an excellent choice in many business cases.
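A toy sketch of what that looks like in practice (Polars here; the summation loop runs in its compiled Rust core, Python only orchestrates):

import polars as pl

s = pl.Series("x", [0.1] * 10_000_000)
print(s.sum())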
Er, yes, I'm aware why this happened, my point is that this happens in the hardware floating point, but Python is as slow as the non-accelerated big rationals in my realistic::Real (it's actually markedly slower than the more appropriate realistic::Rational but that's not what my existing benchmarks cared about)
Not really. It's a limitation of the IEEE floating point format used in most programming languages. Some numbers that look nice in base 10 don't have an exact representation in base 2.
Rational numbers where the denominator is a multiple of the prime factors of the base have an exact fractional representation in that base.
1/3 doesn't have an exact representation in base 10 or base 2. 1/5th does have an exact representation in base 10 (0.2), but doesn't in base 2. 1/4th has an exact representation in base 10 (0.25) and in base 2 (0.01)
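That rule is easy to check mechanically; a quick sketch (the function name is mine):

from fractions import Fraction
from math import gcd

def terminates(frac: Fraction, base: int) -> bool:
    # A reduced fraction has a finite expansion in `base` exactly when every
    # prime factor of its denominator also divides the base.
    d = frac.denominator
    while (g := gcd(d, base)) > 1:
        d //= g
    return d == 1

print(terminates(Fraction(1, 5), 10), terminates(Fraction(1, 5), 2))  # True False
print(terminates(Fraction(1, 4), 10), terminates(Fraction(1, 4), 2))  # True True
print(terminates(Fraction(1, 3), 10), terminates(Fraction(1, 3), 2))  # False False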
You shouldn't add hundred million 32-bit floating point numbers in the naive way. You should use Kahan or Neumaier summation. In Python these are available as math.fsum() and (in recent Python releases) the built-in sum function.
If you did
total = 0.0
for value in data:
    total += value
instead of
total = sum(data)
then yes, the answer will take longer and be less accurate. But the naive native Rust equivalent will be less accurate than Python's sum(data).
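For reference, the classic Kahan loop is only a few lines; a sketch (math.fsum and the newer built-in sum are still what you should actually reach for):

def kahan_sum(values):
    # Compensated summation: carry the low-order error lost in each addition
    # into the next one instead of throwing it away.
    total = 0.0
    compensation = 0.0
    for x in values:
        y = x - compensation
        t = total + y
        compensation = (t - total) - y
        total = t
    return total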
> You shouldn't add hundred million 32-bit floating point numbers in the naive way.
That's entirely correct, you shouldn't do this. And yet people do for one reason and another. I'm aware of Kahan summation (and Neumaier's improvement), but it wasn't the point of the benchmarks I happened to be performing when this topic arrived.
You will not be surprised to learn there's a Kahan adaptor crate for Rust's iterator, so (with that crate) you can ask for the Kahan sum of some floating point iterator just the same way as you could ask for the naive sum. I suppose it's conceivable that one day Rust will choose to ship a specialisation in the stdlib which uses Kahan (as Python did in newer versions) but that seems unlikely because it is slower and you could explicitly ask for it already if you need it.
You don't like Python's use of IEEE 754 float64 for its "float" type because it's already so slow that you think Python should use a data type which better fits the expectations of primary school math training.
Then to demonstrate the timing issue you give an example where you ignore the simplest, fastest, and most accurate Python solution, using a built-in function which would likely be more accurate than what's available in stock Rust.
If accuracy is important for the first, why is it not important for the second?
Are you aware of the long history of compiled versions of Python (PyPy, numba, and more), plus variants like Cython, where the result has near Rust performance levels?
Were the core float not IEEE 754, those compiled versions would either be dog slow (to stay compatible with Python's core float) or give results which are different from CPython's.
Even if they did not exist, there would be a lot of questions about why a given program in C, Rust, Pascal, or any other float64-based system, gives different answers when translated to Python.
FWIW, I, like others, could not reproduce your 20 seconds timing. Python is indeed slow for this sort of task, but even for explicit iteration my code is an order of magnitude faster than you reported.
I think that a language which (as I understand it) already gives you a big number if you exceed the bounds of its provided integer types might just as well provide rationals instead of the IEEE floating point types, unless those IEEE types are very fast; but to me it looks like, on the whole, they're not.
I was not aware of Cython (which sounds like a terrible idea, but each to his own) nor Numba, though I have worked with PyPy in the past. I'm afraid that the idea that somehow every Python implementation would be compatible caused me to choke. Python doesn't really care about compatibility; the behaviour of the sum function you brought up has changed twice since Python 3.
This is an old machine, so I can well believe you can do the iteration faster. My initial interest happened because by total coincidence I was writing benchmarks for realistic which try out f32 and f64 vs realistic::Real for various arithmetic operations, and so I wondered well, isn't even Python much faster and (with my naive iteration) it was not.
As you are hopefully aware, new programmers are equally likely to run into languages where the default is 32-bit IEEE floating point, and so 0.1 + 0.2 > 0.3 is false for them, as they are to encounter a language like Python with 64-bit IEEE floats. I'd expect, as with Python's experience with their hash tables, the kind of people writing Python as their main or even only language would always be pleased to have simpler, less surprising behaviour, and the rationals are much simpler - they're just slower.
The int/long unification occurred a long time ago. ("long" was Python's name for BigNum.) There is now no such thing as "exceed the bounds of its provided integer types".
Guido van Rossum, who started and led the Python project for many years, previously worked with ABC, which used rationals as the default type. In practice this caused problems, as it was all too easy to end up with "pathologically large numerators and denominators" (quoting https://docs.raku.org/type/FatRat). That experience guided him to reject rationals as the default numeric type.
Pathologically large numerators and denominators make rationals not "just slower" but "a lot slower".
> somehow every Python implementation would be compatible
It's more of a rough consensus thing than full compatibility.
> Python doesn't really care about compatibility
Correct, and it causes me pain every year. But do note that historic compatibility is different from cross-implementation compatibility, since there is a strong incentive for other implementations to do a good job of staying CPython compatible.
FWIW, the laptop where I did my timings is 5 years old.
The new programmers in my field generally have Python as their first language, and don't have experience with float32.
I also know that float32 isn't enough to distinguish rationals I need to deal with, since in float32, 2094/4097 == 2117/4142 == 0.511106, while in float64 those ratios are not equal, as 0.5111056870881132 != 0.5111057460164172.
(I internally use uint16_t for the ratios, but have to turn them into doubles for use by other tools.)
sum() of a list with 100M floats took 0.65 seconds. The explicit loop took 1.5 seconds on CPython 3.13.
But again, yes, Rust performance runs rings around CPython, but that's not enough to justify switching to an alternative numeric type given the many negatives, and Python's performance isn't as dire as you suggest.
Welcome to Rakudo™ v2025.06.1.
Implementing the Raku® Programming Language v6.d.
Built on MoarVM version 2025.06.
To exit type 'exit' or '^D'
[0] > 0.1 + 0.2 > 0.3
False
It's the result if your programming language thinks 0.2 + 0.1 means you want specifically the 64-bit IEEE floating point binary arithmetic.
But, where did we say that's what we want? As we've seen, it's not the default in many languages and it isn't mandatory in Python; it's a choice, and the usual argument for that choice would be "it's fast", except Python is slow, so what gives?
Interesting observation about Python providing the worst of all possible worlds: unintuitive arithmetic without any of its speed advantages.
But in answer to “where did we say that's what we want?” I would say, as soon as we wrote the expression, because we read a book about how the language works before we tried to use it. After, for example, reading a book¹ about Julia, we know that 0.1 + 0.2 will give us something slightly larger than 0.3, and we also know that we can type 1//10 + 2//10 to get 3//10.
> I would say, as soon as we wrote the expression, because we read a book about how the language works
I'm comfortable with that rationale in proportion to how much I believe the programmer read such a book.
I haven't taken the course we teach, say, Chemists; I should maybe go audit that. But I would not be surprised if either it never explains this, or the explanation is very hand-wavy: something about it not being exact, maybe invoking old-fashioned digital calculators.
The amusing thing is when you try to explain this sort of thing with a calculator, and you try a modern calculator: it is much more effective at this than you expected or remember from a 1980s Casio. The calculator in your modern, say, Android phone knows all the stuff you were shown in school. It isn't doing IEEE floating point arithmetic, because that's only faster, and you're a human using a calculator, so "faster" in computer terms isn't important; it has prioritized being correct instead, so that pedants stop filing bugs.
> unintuitive arithmetic without any of its speed advantages.
Using floating-point in Python is still much faster than using an exact type in Python.
$ # At this scale, we need to be aware of and account for the timing overhead
$ python -m timeit 'pass'
50000000 loops, best of 5: 8.18 nsec per loop
$ # The variable assignment defeats constant folding in the very primitive optimizer
$ python -m timeit --setup 'x = 0.1; y = 0.2' 'x + y'
10000000 loops, best of 5: 21.2 nsec per loop
$ python -m timeit --setup 'from decimal import Decimal as d; x = d("0.1"); y = d("0.2")' 'x + y'
5000000 loops, best of 5: 62.9 nsec per loop
$ python -m timeit --setup 'from fractions import Fraction as f; x = f(1, 10); y = f(2, 10)' 'x + y'
500000 loops, best of 5: 755 nsec per loop
You aren't answering my question. I asked worik why they're claiming that something that happens is false. It's insane to claim that reality isn't reality. Run that in a Python REPL and that is the result you get.
That said, changing how you think about programming... even with jshell I still think Java in classes and methods (and trying to pull in larger frameworks is not as trivial as java.lang packages). However, I think Groovy (and a good bit of Scala) in a script writing style.
jshell itself is likely more useful for teaching than for development - especially once you've got a sufficiently complex project and the IDE integration becomes more valuable than the immediate feedback.
Still, something to play with and one of the lesser known features of Java.