Oh, cool! SmithForth[0] is how I originally learned about x86-64 microarchitecture. It's a Forth that bootstraps off hand-coded x86-64 opcodes. I decided to go the other direction and decompile the binary by hand. It really is a beautiful piece of code. Highly recommended reading.
Also, if you're excited by Forth and Lisp, you might like Forsp[1]. It uses a call-by-push-value evaluation strategy in a way that really makes the language feel like a true hybrid of the two. The implementation is also brilliantly simple.
Anyway, thank you for the article. I lurk on the DuskOS mailing list and wish I could find out where the Forthers gather IRL so I could osmote more of their world into my own.
> I'm not entirely convinced it actually helped me as a "tool of thought"
This is so real. I had the same issue and was only able to break through by collabing with professional APLers. It's very non-ideal for the curious autodidacts.
I'd love to share, hone, and flesh out what the Tool of Thought looks like in practice with interested others. For anyone here to whom that sounds fun, feel free to reach out. Contact info is in my profile.
The author is clearly a novice who has just tinkered with array languages a little. Extrapolating from that to absolute properties of a tool seems a bit overeager, don't you think? For example, do you think you could design a legible analog circuit without significant time learning the craft?
My experience is with APL, but I think it is capable of producing some of the most readable and maintainable codebases out there. I've worked for years in various projects using Python, Java, C, Scheme, and a smattering of many other languages. However, it's really hard to overstate the clarity that comes from data-oriented design and dogged elimination of API boundaries.
It just takes a long time to learn to write good software in good APL style. In many ways, the best practices in vogue these days around declarative and functional programming tend to work against writing good array code. This means that the learning curve for experienced programmers is, perhaps paradoxically, steeper than that for a totally naive beginner.
I really wish I knew some way to directly convey the experience of working with APL at a high level. There's really nothing else much like it.
> My experience is with APL, but I think it is capable of producing some of the most readable and maintainable codebases out there.
Cool! Do you have any public examples to point to? I would be curious to see how a larger project looks, given that I only use array languages for side projects, so my code is often not very legible (e.g. https://github.com/jcmorrow/advent_of_code/blob/master/day_2...).
My YAML loader[0] is a condensed example of the architecture and techniques I'm thinking of. It's a couple years old, so the code is dirty by my current standards, but the couple of times I've gone back to read it, I have found the YAML-specific concerns to be quite salient.
That said, it's specifically written for an audience that is familiar with the details of YAML in particular, with parsing more generally, and of course with APL expressions of ideas. In fact, that is a big part of what makes the code readable and maintainable: it is optimized for communicating to the expert worker, not for onboarding new developers. The latter is more appropriately handled via other means IMHO.
The poster child for this style of APL is Co-dfns[0]. It's a production APL compiler and a much larger example of the code I'm talking about. The entrypoint to read is cmp/PS.apl.
If you're interested, I'm willing to have a chat and talk more about what makes this kind of APL eminently workable, in which case there are also some personal private examples I could share as well. Feel free to reach out to the contact info on my profile here.
When I first encountered Co-dfns years ago, the thing was impenetrable, but after learning APL to a high level, it now reads like a simple, direct expression of intent. The code even clearly communicates design tradeoffs and the intended focus of experimentation. Put more on the nose: to me the code ends up feeling primarily like extremely readable communication of ideas between like-minded humans. This is a very rare thing in software development in my experience.
IMHO, ideas around "readable code" and "good practices" in software development these days optimize for large, high-turnover teams working on large codebases. Statistically speaking, network effects mean that these are the codebases and developer experiences we are most likely to hear about. However, as an industry, I think we are relatively blind to alternatives. We don't have sufficient shared language and cognitive tooling to understand how to optimize software dev for small, expert teams.
I've used btrfs for 5-ish years in the most mundane, default setup possible. However, in that time, I've had three instances of corruption across three different drives, all resulting in complete loss of the filesystem. Two of these were simply due to hard power failures, and another was due to a flaky CPU.
AFAIU, btrfs effectively absolves itself of responsibility in these cases, claiming the issue is buggy drive firmware.
Business incentives are aligned around incremental delivery, not around efficient encoding of the target domain. The latter generally requires deep architectural iteration, meaning multiple complete system overhauls and/or rewrites, which by now are even vilified as a trope.
Mostly, though, I think there is just availability bias here. The simple, solid systems operating at scale and handled by a 3-person team are hard to notice over the noise that naturally arises from a 1,000-person suborganization churning on the same problem. Naturally, more devs will only experience the latter, and due to network effects, funding is also easier to come by.
I kind of feel like set -o errexit (i.e. set -e) provides enough unexpected semantics that explicit error handling makes more sense. One thing that often trips people up is this:
    set -e
    [ -f nonexistent ] && do_something
    echo 'this line runs'

but

    set -e
    f(){ [ -f nonexistent ] && do_something; }
    f
    echo 'this line does not run'
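The difference comes from how errexit is specified: a command that fails in the middle of an && list doesn't trigger an exit, but calling f is an ordinary simple command whose status is the failed list, so the script dies at the call site. Wrinkles like that are why I lean toward spelling the handling out. A rough sketch of the explicit style I mean, with needed_file, do_something, some_command, and the messages all as placeholders:

    if [ -f needed_file ]; then
        do_something
    else
        printf 'needed_file is missing, skipping\n' >&2
    fi

    if ! some_command; then
        printf 'some_command failed\n' >&2
        exit 1
    fi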
> Python is a great scripting language, and won't blow your foot off if you try to iterate through an array.
I kind of hate that every time the topic of shell scripting comes up, we get a troop of comments touting this mindless nonsense. Python has footguns, too. Heck, it's absolutely terrible and hacky if you try to do concatenative programming with it. Does that mean it should never be used?
Instead of bashing the language, why not learn bash the language? IME, most of the industry has just absorbed shell programming haphazardly through osmosis, and almost always tries to shove the square pegs of OOP and FP into the round hole that is bash. No wonder people are having a bad time.
In contrast, a data-first design that heavily normalizes data into line-oriented tables and passes information around in pipes results in simple, direct code IME. Stop trying to use arrays and embrace data normalization and text. Also, a lot of pain comes from simply not learning the facilities; e.g. the set builtin obviates most uses of string munging and eval:
set -- "$@" --file 'filename with spaces.pdf'
set -- "$@" 'data|blob with "dangerous" characters'
set -- "$@" "$etc"
some_command "$@"
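To make the line-oriented-table point concrete, here is a rough sketch of the style I mean; the users.tsv layout, its field order, and the notify command are invented for illustration, not from any real system:

    # users.tsv: name<TAB>role<TAB>last_login, one record per line
    awk -F '\t' '$2 == "admin"' users.tsv |
        sort -t $'\t' -k3,3 |
        cut -f1 |
        while IFS= read -r name; do
            notify "$name"   # stand-in for whatever per-record action you actually need
        done

Each stage reads and writes the same normalized text, so any step can be inspected, reordered, or replaced right at the prompt.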
Anyway, the senseless bash hate is somewhat of a pet peeve of mine. Exeunt.
All languages have foot guns, but bash is on the more explodey end of the scale. It is not senseless to note that if you can use a safer tool, you should consider it.
C/C++ got us really far, but greenfield projects are moving to safer languages where they can. Expert low-level programmers, armed with all of the available linting tools, are still making unfortunate mistakes. At some point we should switch to something better.
In my years of reading and writing bash as well as Python for sysops tasks, I'd say that bash is the more reliable workhorse of the two. Python tends to encourage a kind of overengineering, resulting in more bugs overall. Many times I've seen hundreds of lines of Python or TypeScript result from the attempt to replace just a few lines of bash!
The senselessness I object to is not the conscientious choice of tooling or discussion of the failings thereof; it's the fact that every single bash article on here sees the same religious refrain, "Python is better than bash. Period." It's like if every article about vim saw a flood of comments claiming that vim is okay for light editing, but for any real programming we should use a real editor like emacs.
If you open vim expecting emacs but with a few different bindings, then it might just explode in your face. If you use bash expecting to be able to program just like Python but with slightly different syntax, then it's not surprising to feel friction.
IME, bash works exceptionally well with a data-oriented, text-first approach to program architecture. It's just unfortunate that very little of the industry is even aware of this style of programming.
Pentation? How quaint. For other Large Number fans, David Metzler has a wonderful playlist that goes way down the rabbit hole of the fast growing hierarchy:
Almost all of these mind-bogglingly large numbers are built around recursion, like the Ackermann function, which effectively has an argument for the number of Knuth up arrows to use. Then you can start thinking about feeding the Ackermann function into that slot and enjoy the sense of vertigo at how insane that becomes.
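For anyone who wants the machinery spelled out, the recursion behind the arrows (standard definitions, nothing specific to this thread) is just "iterate the previous operation", and the two-argument Ackermann function climbs the same ladder:

    a \uparrow b = a^b, \qquad
    a \uparrow^{n+1} 1 = a, \qquad
    a \uparrow^{n+1} b = a \uparrow^{n} \left( a \uparrow^{n+1} (b - 1) \right)

    A(m, n) = 2 \uparrow^{m-2} (n + 3) - 3 \quad \text{for } m \ge 2

Here \uparrow^0 is multiplication and \uparrow^1 is plain exponentiation, which is why the low rows of Ackermann come out as 2n + 3 and 2^(n+3) - 3.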
I find it fascinating how quickly the machinery of specifying large numbers probes the limits of what's definable within the mathematical systems we use.
It's funny how, for a certain subset of math, a researcher's life can be condensed to "arguing about which number is the biggest (preschool)" -> "learning about math" -> "arguing about which number is the biggest (academia)"
There is something awesome about incredibly large finite numbers. People gush about infinity, but I find it to be a pretty boring concept compared to finite numbers too large to even be written in this universe.
Infinity is aspirational. Infinity is a concept, simple and self-evident, yet paradoxical and oxymoronic.
I get kind of intrigued by these large-number things at first, but ultimately it feels like kind of a less elegant version of the same thing? It's all this mucking about with multiples and powers of multiples and powers, when we've already taken this to the limit; we can just stand back and marvel at that. What are we looking for? We already know you can always add another digit, so why invent more and more complicated ways to do that?
This isn't meant to be contradictory to what you're saying or anything, just interesting to explore these different perspectives on what mathematical concepts capture the imagination.
I'm wondering if there's a connection between large number hunters, unwritten rule proponents in sports and games, and modular synth collectors. There's a sort of satisfaction derived from finding and defining order according to aesthetic structures that are largely arbitrary but also emotionally resonant.
Meanwhile, infinity is for those that embrace chaos, minimalism, nothingness.
Like, there is a perfectly finite number that is so large that there simply isn't enough information in the universe to encode it in any format. How cool is that to just think about for a while?
I think such a number is going to have strange properties; for example, some number bigger than that unencodable number might be encodable because of a special pattern that allows a special non-surjective recursive function to encode it. I am just wondering if there really is a smallest number for which no greater number is encodable.
It is not obvious to me that the definition of an encodable function has bounded growth: is it true that f(1) - f(0) for encodable f always has a bound given by the amount of data used to encode f? What is that bound?
The parent suggested that the number couldn't be encoded due to its largeness rather than its value. So while any number n with Kolmogorov complexity K(n) > 10^100 cannot be recursively encoded in the known universe, that number n need only be 10^100 bits long. On the other hand, a number that's too large to be recursively encoded in the known universe would have to exceed BBλ2(10^100), where BBλ2 is an optimal busy beaver for prefix-free binary programs [1].
Yes, I understood what the parent suggested. I am pointing out that such a number may have strange properties, like the fact that a number larger than it can have a smaller Kolmogorov complexity. Then I am questioning whether there is a number such that every number larger than it has such a large Kolmogorov complexity that it cannot be encoded. The question therefore becomes: is there a limit to the size of physically describable numbers? Or is there always going to be some larger number with trivial Kolmogorov complexity?
Postulate: You cannot define a largest physically describable number.
My assumption is that, due to the very nature of Kolmogorov complexity (and other Gödel-related / halting-problem-related / self-referential descriptions), this is not an answerable or sensible question.
It falls under the same language-enabled recursion problems as:
- The least number that cannot be described in less than twenty syllables.
- The least number that cannot be uniquely described by an expression of first-order set theory that contains no more than a googol (10^100) symbols.
If you could encode it that way, then it's incoherent. After all, that encoding exists within the universe. If it resolved to a value, that would disqualify that value from being correct because of the self reference.
Not really, as that implies that you have a list of numbers that can be encoded within the universe, but the universe would run out of room keeping that list.
There is enough information if you assume reality is continuous. Pick a point A to be the origin. Then you can encode the number by placing something at 1/N meters away from the origin.
Relativistic speeds can contract the length of any object as measured by an outside observer. If an object the size of 1 Planck length travels fast enough, you won't be able to measure it, as from your position it would be smaller than the Planck length as it passes by.
It's not impossible (afaik) for things to be smaller than the Planck length. We just don't have the ability (maybe ever) to measure something smaller than this limit.
Now, good luck finding something the size of 1 Planck length, and also accelerating it to relativistic speeds.
Frustratingly, attempts to discretize space invariably run into problems with relativity, since they effectively impose a preferred frame of reference. I.e. you can impose a minimum distance, but relativistic length contraction means that observers measure different minima and in different directions.
Apparently, some of these models imply that the speed of light ends up depending on wavelength, which makes them amenable to empirical tests. My understanding is that these discrete-space models have failed to line up with experiment, at least within the limits of measurement.
That's currently unknown. For all current practical purposes, kind of. The Planck length sets a limit on the spatial resolution of any information, so a finite region with (universally) bounded entropy per conceivable bucket on that scale still has finite entropic capacity.
The current theories use continuous space and time. However, we can't encode information into space itself. We would have to use some configuration of matter, and then there are limits to how well-defined a particle's position can be coming from the uncertainty principle.
On the other hand, general relativity implies that if you put enough matter in a small enough space it becomes a black hole, and then we can't access the information in it.
IANA physicist but I think this line of thought is probably speculative at the moment because it involves both general relativity and quantum mechanics and it isn't known how they should work together.
For me it is far more interesting to consider the number of atoms in the solar system, as it implies a pretty obvious limit on the data storage that humanity can meaningfully use. Obviously only a tiny fraction of those atoms can be used to store information, and furthermore the energy required to actually perform that storage is enormous compared to the energy we have available at present.
Would you like to also be able to communicate about this number? You might have to reserve some atoms to form a being that could actually enjoy it. Considering such a perspective, the decoder for observing it should probably be embedded in the number itself.
You are stretching the definition of "is" here. Almost all finite numbers are impossible to actually count up to, so they exist only in the same sense that infinite numbers exist.
You might already know this, but the busy beaver function grows faster than any computable function. So although the best known lower bound of BB(6) can be expressed with just pentation, generally speaking the BB function is certainly beyond any standard operation in terms of fast growth.
It's been shown to be surpassed at n=150, which as you note is likely very generous. Hypertree would typically only require a few more states. Hypertree doesn't grow meaningfully faster than Tree. Hypertree(3) is what would be called a salad number, combining a very fast growing function with some very weak one(s) such as iteration, which in the Fast Growing Hierarchy corresponds to taking the successor of an ordinal.
The growth of BB is certainly mind-boggling; however, I personally find its growth rate so untouchable as to obviate any attempt at understanding. There's nothing to gain purchase on.
The fast growing hierarchy, on the other hand, provides oodles of structure for us to explore, and we can construct numbers that are vastly larger than anything BB(6) is likely to hit. In fact, this is why we use the fast growing hierarchy to approximate really big numbers all the time!
When we take something like f_Γ_0 and try to unpack even just the barest surface of its size, I get a feeling of vastness similar to those videos of diving endlessly into fractals.
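For concreteness, the hierarchy in question is the standard one (again, not something from this thread): successor levels iterate the previous level n times, and limit levels diagonalize along a fundamental sequence:

    f_0(n) = n + 1, \qquad
    f_{\alpha+1}(n) = f_\alpha^{\,n}(n), \qquad
    f_\lambda(n) = f_{\lambda[n]}(n) \ \text{for limit ordinals } \lambda

where f_\alpha^n means n-fold application and \lambda[n] is the n-th term of a chosen fundamental sequence for \lambda. Already f_\omega grows roughly like the Ackermann function, and \Gamma_0 sits unfathomably far up the ordinal ladder from \omega.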
If you have a child who likes math, I highly recommend "Really Big Numbers" by Richard Schwartz. Tons of nice illustrations on how to "take bigger and bigger steps".
[0]:https://dacvs.neocities.org/SF/
[1]:https://xorvoid.com/forsp.html