Hacker News | cosmos0072's comments

I need this *so* often that I programmed my shell to execute 'cd ..' every time I press KP/ i.e. '/' on the keypad, without having to hit Return.

Other single-key bindings I use often are:

KP* executes 'ls'

KP- executes 'cd -'

KP+ executes 'make -j `nproc`'


How? Readline macros?


Literally with my own shell: https://github.com/cosmos72/schemesh


The math looks suspicious to me, or at least how it is presented.

If, as stated, accessing one register requires ~0.3 ns and available registers sum up to ~2560 B, while accessing RAM requires ~80 ns and available RAM is ~32 GiB, then it means that memory access time is O(N^1/3) where N is the memory size.

Thus accessing the whole N bytes of memory of a certain kind (registers, or L1/L2/L3 cache, or RAM) takes N * O(N^1/3) = O(N^4/3).

One could argue that the title "Memory access is O(N^1/3)" refers to memory access time, but that contradicts the very article's body, which explains in detail "in 2x time you can access 8x as much memory" both in text and with a diagram.

Such a statement would require that accessing the whole N bytes of memory of a certain kind takes O(N^1/3) time, while the measurements themselves produce a very different estimate: accessing the whole N bytes of memory of a certain kind requires O(N^4/3) time, not O(N^1/3).
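Spelling out the arithmetic (a sketch, using the article's own assumption that a single random access into a pool of N bytes costs roughly c * N^(1/3)):

    T_{\text{access}}(N) \approx c\,N^{1/3}
    T_{\text{full scan}}(N) \approx N \cdot c\,N^{1/3} = c\,N^{4/3}

So under the article's per-access model, touching every byte once is O(N^4/3), which is the discrepancy being pointed out here.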


I did not interpret the article as you did, and thought it was clear throughout that the author was talking about an individual read from memory, not reading all of a given amount of memory. "Memory access, both in theory and in practice, takes O(N^⅓) time: if your memory is 8x bigger, it will take 2x longer to do a read or write to it." Emphasis on "a read or write".

I read "in 2x time you can access 8x as much memory" as "in 2x time you can access any byte in 8x as much memory", not "in 2x time you can access the entirety of 8x as much memory". Though I agree that the wording of that line is bad.

In normal big-O notation, accessing N bytes of memory is already O(N), and I think it's clear from context that the author is not claiming that you can access N bytes of memory in less time than O(N).


Nobody has ever had this confusion about the access time of hash tables except maybe in the introductory class. What you’re describing is the same reasoning as any data structure. Which is correct. Physical memory hierarchies are a data structure. Literally.

I’m confused by GP’s confusion.


This is completely false. All regularly cited algorithm complexity classes are based on estimating a memory access as an O(1) operation. For example, if you model memory access as O(N^1/3), linear search worst case is not O(N), it is O(N^4/3): in the worst case you have to make N memory accesses and N comparisons, and if each memory access takes N^1/3 time, this requires N^4/3 + N time, which is O(N^4/3).


> linear search worst case is not O(N), it is O(N^4/3)

No, these are different 'N's. The N in the article is the size of the memory pool over which your data is (presumably randomly) distributed. Many factors can influence this. Let's call this size M. Linear search is O(N) where N is the number of elements. It is not O(N^4/3), it is O(N * M^1/3).
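Keeping the two variables separate, the cost of the linear search described here would be (a sketch of the formula just stated, with c some hardware constant):

    T_{\text{linear search}}(N, M) \approx N \cdot c\,M^{1/3}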

There's a good argument to be made that M^(1/3) should be considered a constant, so the algorithm is indeed simply O(N). If you include M^(1/3), why are you not also including your CPU speed? The speed of light? The number of times the OS switches threads during your algorithm? Everyone knows that an O(N) algorithm run on the same data will run at different speeds on different hardware. The point of Big-O is to have some reasonable understanding of how much worse it will get if you need to run this algorithm on 10x or 100x as much data, compared to some baseline that you simply have to benchmark because it relies on too many external factors (memory size being one).

> All regularly cited algorithm complexity classes are based on estimating a memory access as an O(1) operation

That's not even true: there are plenty of "memory-aware" algorithms that are designed to maximize the usage of caching. There are abstract memory models that are explicitly considered in modern algorithm design.


You can model things as having M be a constant - and that's what people typically do. The point is that this is a bad model that breaks down when your data becomes huge. If you're trying to see how an algorithm will scale from a thousand items to a billion items, then sure - you don't really need to model memory access speeds (though even this is very debatable, as it leads to very wrong conclusions, such as thinking that adding items to the middle of a linked list is faster than adding them to the middle of an array, for large enough arrays - which is simply wrong on modern hardware).

However, if you want to model how your algorithm scales to petabytes of data, then the model you were using breaks down, as the cost of memory access for an array that fits in RAM is much smaller than the cost of memory access for the kind of network storage that you'll need for this level of data. So, for this problem, modeling memory access as a function of N may give you a better fit for all three cases (1K items, 1G items, and 1P items).

> That's not even true: there are plenty of "memory-aware" algorithms that are designed to maximize the usage of caching.

I know they exist, but I have yet to see any kind of popular resource use them. What are the complexities of Quicksort and Mergesort in a memory aware model? How often are they mentioned compared to how often you see O(N log N) / O(N²)?


> It is not O(N^4/3), it is O(N * M^1/3). There's a good argument to be made that M^(1/3) should be considered a constant

Math isn't mathing here. If M^1/3 >> N, like in memory-bound algorithms, then why should we consider it a constant?

> The point of Big-O is to have some reasonable understanding of how much worse it will get if you need to run this algorithm on 10x or 100x as much data

And this also isn't true, and is easily disproved by counterexample: an O(N) linear search over an array is sometimes faster than O(1) search in a hash map.


This is untrue; complexity classes for algorithms are measured with respect to specific operations. For example, comparison sorting algorithms are typically cited as O(n * log(n)), but this refers to the number of comparisons, irrespective of what the complexity of memory accesses is. It's possible that comparing two values to each other has a time complexity of O(2^N), in which case sorting such a data structure would be impractical, and yet it would still be the case that the sorting algorithm itself has a time complexity of O(N log(N)), because time complexity is with respect to some given set of operations.

Another common scenario where this comes up and actually results in a great deal of confusion and misconceptions is hash maps, which are said to have a time complexity of O(1), but that does not mean that if you actually benchmark the performance of a hash map with respect to its size, the graph will be flat or asymptotically approach a constant value. Larger hash maps are slower to access than smaller hash maps, because the O(1) isn't intended to be a claim about the overall performance of the hash map as a whole, but rather a claim about the average number of probe operations needed to look up a value.

In fact, in the absolute purest form of time complexity analysis, where the operations involved are literally the transitions of a Turing machine, memory access is not assumed to be O(1) but rather O(n).


Time complexity of an algorithm specifically refers to the time it takes an algorithm to finish. So if you're sorting values where a single comparison of two values takes O(2^n) time, the time complexity of the sort can't be O(n log n).

Now, you very well can measure the "operation complexity" of an algorithm, where you specify how many operations of a certain kind it will do. And you're right that the complexities typically quoted for comparison sorting algorithms are often not time complexities; they are the number of comparisons you'll have to make.

> hash maps, which are said to have a time complexity of O(1), but that does not mean that if you actually benchmark the performance of a hash map with respect to its size, that the graph will be flat or asymptotically approaches a constant value.

This is confused. Hash maps have an idealized O(1) average case complexity, and O(n) worst case complexity. The difficulty with pinning down the actual average case complexity is that you need to trade off memory usage vs chance of collisions, and people are usually quite sensitive to memory usage, so they will end up having more and more collisions as n gets larger, while the idealized average case analysis assumes that the hash function has the same chance of collisions regardless of n. Basically, the claim "the average case time complexity of hashtables is O(1)" is only true if you maintain a very sparse hashtable, which means its memory usage will grow steeply with size. For example, if you want to store thousands of arbitrary strings with a low chance of collisions, you'll probably need a bucket array of a size like 2^32. Still, if you benchmark the performance of hashtable lookup with respect to its size, while using a very good hash function and maintaining a very low load factor (so, using a large amount of memory), the graph will indeed be flat.
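For reference, the textbook expected-probe counts (standard uniform-hashing results, not measurements of any particular implementation) make the role of the load factor explicit:

    \alpha = n / m \quad \text{(stored items / buckets)}
    \text{separate chaining, unsuccessful lookup: } \Theta(1 + \alpha)
    \text{open addressing, unsuccessful lookup: } \le \frac{1}{1 - \alpha}

Keeping α bounded by a constant (by growing the bucket array along with n) is exactly what makes the expected cost O(1); letting α creep up with n is what makes real tables slow down.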


> if you model memory access as O(N^1/3), linear search worst case is not O(N), it is O(N^4/3)

This would be true if we modeled memory access as Theta(N^{1/3}) but that's not the claim. One can imagine the data organized/prefetched in such a way that a linear access scan is O(1) per element but a random access is expected Theta(N^{1/3}). You see this same sort of pattern with well-known data structures like a (balanced) binary tree; random access is O(log(n)), but a linear scan is O(n).


> "in 2x time you can access 8x as much memory"

is NOT what the article says.

The article says (in three ways!):

> if your memory is 8x bigger, it will take 2x longer to do a read or write to it.

> In a three-dimensional world, you can fit 8x as much memory within 2x the distance from you.

> Double the distance, eight times the memory.

The key words there are "a", which means a single access, and "distance", which is a measure of time.

N is the amount of memory, and O() is the time to access an element of memory.


The operation GP is thinking of is a full scan, and that will always take at least n · n^(1/3) time. Though if done right all of that latency will be occupied with computation and allow people to delude themselves into thinking it doesn’t matter.

But when something is constrained two or three ways, it drastically reduces the incentive to prioritize tackling any one of the problems with anything but small incremental improvements.


> The operation GP is thinking of is a full scan, and that will always take at least n · n^(1/3) time.

It doesn't. Full scans are faster than accessing each memory address in an unordered way.

Let's look at a Ryzen 2600X. You can sustain 32 bytes per cycle from L1, 32 bytes per cycle from L2, and 20 bytes per cycle from L3. That's 64KB, 512KB, and 16MB caches all having almost the same bandwidth despite very different latencies.

You can also imagine an infiniband network that fills 2 racks, and another one that fills 50000 racks. The bandwidth of a single node is the same in both situations, so even though latency gets worse as you add more nodes and hops, it's going to take exactly O(n) time for a single thread to scan the entire memory.

You can find correlations between memory size and bandwidth, but they're significantly weaker and less consistent than the correlations between memory size and latency.


I don’t think I can agree to ignore the latency problem. Intermachine Latency is the source of a cube root of N term.


Once you start receiving data in bulk, the time it takes is quantity of data divided by your connection speed. Latency doesn't factor in.

Technically you need to consider the time it takes to start receiving data. Which would mean your total time is O(n + ∛n). Not O(n * ∛n). But not all nodes are ∛n away. The closest nodes are O(1) latency. So if you start your scan on the close nodes, you will keep your data link saturated from the start, and your total time will be O(n). (And O(n + ∛n) simplifies to O(n) anyway.)


I don't think we have such a lower bound: from a “theoretical” point of view (in the sense of the post), your processor could walk over the cube of memory and collect each bit one by one. Each move+read costs O(1) (if you move correctly), so you get O(n) to read the whole cube of n bits.

If you require the full scan to be done in a specific order however, then indeed, in the worst case, you have to go from one end of the cube to the other between consecutive reads, which incurs an O(n^{1/3}) multiplicative cost. Note that this does not constitute a theoretical lower bound: it might be possible to detect those jumps and use the time spent in a corner to store a few values that will become useful later. This does look like a fun computational problem; I don't know its exact worst-case complexity.


I think the article glossed over a bit about how to interpret the table and the formula. The formula is only correct if you take into account the memory hierarchy, and think of N as the working set size of an algorithm. So if your working set fits into L1 cache, then you get L1 cache latency, if your working set is very large and spills into RAM, then you get RAM latency, etc.


I hate big O notation. It should be O(N) = N^(1/3)

That way O is the function and it's a function of N.

The way it's currently notated, O is effectively the inverse function.


Yes, the notation seems wrong, but that's because O(...) is a set, not a function. The function is what goes inside.

So it should be f€O(...) instead of f=...

(don't know how to write the "belongs" symbol on iOS).


Here you go: ∈ (so f ∈ O(...))


That helps but how do I learn it?


O(N) is not a function, though, so the notation is doing a good job of saying this. When we say "the complexity class of an algorithm is O(log N)", we mean "the function WorstCaseComplexity(N) for that algorithm is in the class O(log N)". The function WorstCaseComplexity(N) measures how much time the algorithm will take for any input of size N in the worst case scenario. We can also say "the average case complexity of quicksort is O(N log N)", which means "the function AverageCaseComplexity(N) for quicksort is in the class O(N log N)", where AverageCaseComplexity(N) is a function that measures how much time quicksort will need to finish for an "average" input of size N.

Saying that a function f(n) is in the class O(g(n)) means that there exists an M and a C such that f(n) < C * g(n) for any n > M. That is, it means that, past some point, f(n) is always lower than C * g(n), for some constant C. For example, we can say the function f(n) = 2n + 1 is in the class O(n), because 2n + 1 < 7n for any n > 1. Technically, we can also say that our f(n) is in the complexity class O(2^n), because 2n + 1 < 2^n for any n >= 3, but people don't normally do this. Technically, what we typically care about is not O(f(n)), it's more of a "least upper bound", which is closer to a Big-Theta bound.
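Written out, the definition being used here is:

    f \in O(g) \iff \exists C > 0,\ \exists M,\ \forall n > M:\ f(n) \le C \cdot g(n)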


There are a lot of misconceptions and things being confused together in your post. For one, complexity classes are related to decision problems, not algorithms. An algorithm does not have a complexity class; rather, it's decision problems that belong to a complexity class. 3SAT is a decision problem that belongs to the NP-complete complexity class regardless of any particular algorithm that implements a solution to it.

Secondly, it's a common misconception that O(f(n)) refers to worst case complexity. O(f(n)) expresses an upper bound on asymptotic growth. This upper bound can be used to describe best case complexity, worst case complexity, average case complexity, or many other quantities. An upper bound does not mean worst case.

For example I often perform code reviews where I will point out that a certain operation is implemented with the best case scenario having O(log(N)), but that an alternative implementation has a best case scenario of O(1) and consequently I will request an update.

I expect my engineers to know what this means without mixing up O(N) with worst-case scenarios, in particular I do not expect an improvement to the algorithm in the worst case scenario but that there exists a fast/happy path scenario that can avoid doing a lot of work.

Consequently the opposite can also happen, an operation might be implemented where the worst case scenario is o(N), note the use of little-o rather than big-o, and I will request a lower bound on the worst case scenario.


> An algorithm does not have a complexity class, rather it's decision problems that belong to a complexity class. 3SAT is a decision problem that belongs to the NP-complete complexity class regardless of any particular algorithm that implements a solution to it.

Both algorithms and problems have their own complexity. The complexity of an algorithm is the time it takes that algorithm to finish. For example, quicksort is an algorithm, and its average case complexity is in O(n log n). I called O(n log n) a complexity class; this was indeed sloppy on my part - I was just trying to find a name for this set.

> Secondly, it's a common misconception that O(f(n)) refers to worst case complexity.

I know, and my post was very explicitly avoiding this misconception. I explicitly gave the example that the average case complexity of quicksort is in O(n log n).

> Consequently the opposite can also happen, an operation might be implemented where the worst case scenario is o(N), note the use of little-o rather than big-o, and I will request a lower bound on the worst case scenario.

I'm not sure what you're trying to say in this part. o(n) is basically the set of all sublinear functions, e.g. log(n), but also 1. Any function in o(n) is also in O(n), though the converse doesn't hold. So, if you know an algorithm has worst-case complexity in o(n), you know more about the algorithm than if I tell you it has worst-case complexity in O(n). Knowing a lower bound is still useful in either case, of course, since a constant time algorithm and a logarithmic time algorithm are still both in both sets.


No it shouldn't. The function you're talking about is typically called T(N), for "time". The problem is that you can't write T(N) = N^(1/3) because it's not exactly N^(1/3) -- for one thing, it's approximate up to a constant factor, and for another thing, it's only an upper bound. Big-O solves both of these issues: T(N) = O(N^(1/3)) means that the function T(N) grows at most as fast as N^(1/3) (i.e.: forms a relationship between the two functions T(N) and N^(1/3)). The "T(N) =" is often silent, since it's clear when we're talking about time, so at the end you just get O(N^(1/3)).


All their numbers are immediately suspect since they admit to using ChatGPT to get their numbers. Oh, and their wonderful conclusion that something that fits fully in the CPU's caches is faster than something sitting a few hops away in main memory.


For me it's more about "feeling" badly written code, as if it stinks and is unpleasant to look at, and conversely enjoying nicely written one, and seeing its elegance and beauty.

Especially when I have to familiarize with code written by someone else, I usually start by "cleaning" it - small refactorings such as splitting overlong functions and simplifying expressions, that are trivial to prove as correct even without a deep knowledge of the code purpose.

Only when the code looks sufficiently clean and familiar do I start adding the requested new features.


This is also how I have traditionally looked at new codebases. I don't necessarily actually clean it up (I dislike changing someone's style unless it is completely necessary) but I build up a mental model similar to what you're outlining. It's been really interesting with these LLM coding tools. I may not disagree with the implementation in terms of logic (it just does what I've prompted it to do after all) but something about it will 'stink'. Amusingly this feeling is the true 'vibe' of 'vibe coding' to me.

I sometimes wonder if this is the result of experience/education (I'm not a compsci major) or if it's just a naturally different way to think about systems.


I also do this sort of restless refactoring. I find interacting with a new codebase is a kind of brain hack that (subjectively) helps me get up to speed faster.


> So, people who use native REPLs, what do you do with them?

In my case, I use my interactive shell https://github.com/cosmos72/schemesh every day as login shell.

You can look at it as heavily customized Scheme REPL, where everything not inside parentheses is parsed and executed as shell syntax, and everything inside parentheses is parsed and executed as Scheme syntax.

Having arithmetic and procedure definition within the login shell definitely feels liberating, at least to me


I've used the Picolisp REPL like that, though not as a login shell proper, but as the shell where I actually do stuff. Mainly due to the ease with which it integrates with the below shell through 'in, 'out and 'fork.


You can choose the font with `twin --hs=X11,font=X11FONTNAME` or `twin --hw=xft,font=TRUETYPEFONTNAME`

I will have a look too; the terminal should not crash or stop being responsive.


Thank you. Weird that this isn't in the man page or --help message. Another font didn't fix anything anyway.


I tried compiling https://github.com/panzi/ansi-img and running it inside the truecolor branch of twin, i.e. https://github.com/cosmos72/twin/tree/truecolor and it correctly displayed an example JPG I tried - no artefacts, and terminal remained responsive.

As I wrote above, I am debugging it and fixing some known issues, thus truecolor support is not yet in the default branch


Author here :)

I've been using Twin as my everyday terminal emulator and terminal multiplexer since ~2000, slowly adding features as my free time - and other interests - allowed.

As someone pointed out, the look-and-feel is reminiscent of Borland Turbo Vision. The reason is simple: I started writing it in the early '90s on DOS with a Borland C compiler, and I used the Borland Turbo Vision look-and-feel as a visual guideline (never actually looked at the code, though).

The port to Linux happened in 1999 (it was basically dormant before that), and Unicode support was progressively added around 2015-2016 (initially UCS-2, i.e. only the lowest 64k codepoints, then full UTF-32 internally, with the terminal emulator accepting UTF-8). There are still some missing features, most notably: no grapheme clusters, no fullwidth (Asian etc.) support, no right-to-left support.

Right now I'm adding truecolor support (see https://github.com/cosmos72/twin/tree/truecolor) - it's basically finished, I'm ironing out some remaining bugs, and thinking whether wire compatibility with older versions is worth adding.

And yes, documentation has been stalled for a very long time.

In retrospect, I should have switched from C to C++ much earlier: lots of ugly preprocessor macros accumulated over time, and while I rewrote the C widget hierarchy as C++ classes, several warts remain.


Do symbols for legacy computing work with it? Especially the 1/8ths vertical/horizontal blocks?


If you mean the Unicode glyphs listed at https://en.m.wikipedia.org/wiki/Block_Elements they are supported - you just need a display driver that can render them. For example, `twin --hw=xft` (it's the default) or `twin --hw=X11`, both with a font that contains them


Xe means the Unicode block that is actually named "Symbols For Legacy Computing". It's not in the BMP. Some bloke named Bruce was doing TUI windows with scrollbars and sizer/menu boxes some years before TurboVision and code page 437. (-:



Reading the flags one: Unscii has font coverage, if you want to try that out on the emulators whose fonts were problems.

* https://news.ycombinator.com/item?id=43812026


Given that it's drawing TUI windows, the MouseText characters for doing that very thing on the Apple IIe would seem even more pertinent.

* https://tty0.social/@JdeBP/114409020672330885


Alas, it's not finished. You've made the mistakes that all of us have made, and haven't caught up with us, most of us having fixed those mistakes a few years back when implementing 24-bit RGB was in vogue.

This is not, as the function name suggests, a colon, but per ITU/IEC T.416 it should be:

https://github.com/cosmos72/twin/blob/truecolor/server/hw/hw...

And not only should this have colons too, but per ITU/IEC T.416 there's a colour space parameter that goes in there:

https://github.com/cosmos72/twin/blob/truecolor/server/hw/hw...

The unfortunate part is that when rendering to a terminal, you don't have any available mechanism apart from hand-decoding the family part of the TERM environment variable, and knowing who made which mistakes, to determine which of the 7 possible colour mechanisms are supported. They are:

1. ECMA-48 standard 8 colour, SGRs 30 to 37, 39, 40 to 47, and 49

2. AIXTerm 16 colour, ECMA-48 plus SGRs 90 to 97 and 100 to 107

3. XTerm 256 colour, ITU T.416 done wrongly with SGR 38;5;n and SGR 48;5;n

4. XTerm 256 colour corrected, ITU T.416 done right with SGR 38:5:n and SGR 48:5:n

5. 24-bit colour take 1, ITU T.416 done wrongly with SGR 38;2;r;g;b and SGR 48;2;r;g;b

6. 24-bit colour take 2, ITU T.416 done wrongly with SGR 38:2:r:g:b and SGR 48:2:r:g:b

7. 24-bit colour take 3, ITU T.416 done right with SGR 38:2::r:g:b::: and SGR 48:2::r:g:b:::
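To make the wire formats concrete, here is a minimal sketch (in Go, just printing the raw escape sequences for variants 5 and 7; which variant a given terminal actually accepts is exactly the detection problem described above):

    package main

    import "fmt"

    func main() {
        const esc = "\x1b"
        r, g, b := 255, 128, 0
        // Variant 5: the widespread but non-standard form, parameters separated by semicolons.
        fmt.Printf("%s[38;2;%d;%d;%dmsemicolon form%s[0m\n", esc, r, g, b, esc)
        // Variant 7: the ITU T.416 colon form, with an empty colour-space-ID slot after the "2"
        // (the trailing tolerance parameters are optional and omitted here).
        fmt.Printf("%s[38:2::%d:%d:%dmcolon form%s[0m\n", esc, r, g, b, esc)
    }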

Few people support 4, and although quite a lot of us have finally got to supporting 7 it isn't quite universal. Egmont Koblinger, I, and others have been spreading the word where we can over the last few years.

This is where I was at in 2019:

https://github.com/jdebp/nosh/blob/trunk/source/TerminalCapa...

There are a few updates to that that are going to come out in 1.41, but when it comes to colour they're mainly things like recognizing the "ms-terminal" and "netbsd6" terminal types in the right places.


Yep, I am well aware of the `;` vs `:` confusion in both 256 color and 24-bit color control sequences.

Short of hand-coding "which terminal supports which variant" I do not know any standard mechanism to detect that (beyond the well-known $TERM=...-256color and $COLORTERM=truecolor or $COLORTERM=24bit)
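For what it's worth, the environment-variable part of that heuristic is easy to sketch (in Go; the function name is mine, and it deliberately says nothing about the ';' vs ':' variants, which is the part that cannot be detected reliably):

    package main

    import (
        "fmt"
        "os"
        "strings"
    )

    // guessColourDepth applies only the well-known conventions mentioned above:
    // $COLORTERM=truecolor/24bit, then a $TERM ending in "-256color", otherwise assume 16 colours.
    func guessColourDepth() string {
        switch strings.ToLower(os.Getenv("COLORTERM")) {
        case "truecolor", "24bit":
            return "24-bit"
        }
        if strings.HasSuffix(os.Getenv("TERM"), "-256color") {
            return "256"
        }
        return "16"
    }

    func main() {
        fmt.Println("guessed colour depth:", guessColourDepth())
    }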

I guess I'll have to add command line options to choose among the variants 1...7 you helpfully listed above.

My main use is to render twin directly on X11, which avoids all these issues, and while rendering inside another terminal is important and is not going away, I am OK with a few minor color-related limitations (note: limitations, not bugs) in such a setup, especially if the other terminal does not follow the relevant standards.


> This is not, as the function name suggests, a colon, but per ITU/IEC T.416 it should be

Digressing, but I’m fascinated to see ODA still being referenced, even if only some small part of it


Compiling an expression to a tree of closures, and a list of statements to a slice of closures, is exactly how I optimized gomacro (https://github.com/cosmos72/gomacro), my Go interpreter written in Go.

There are more tricks available there, for example unrolling the loop that calls the list of closures, and having a `nop` closure that is executed when there's nothing to run but execution is not yet at the end of the unrolled loop.
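A stripped-down sketch of the unrolling trick (illustrative only, not the actual gomacro code; a real interpreter's environment is of course much richer than the single int used here):

    package main

    import "fmt"

    // Each "compiled" statement is a closure over a (here trivial) environment.
    type stmt func(env *int)

    // nop is the padding closure: it runs when the unrolled loop overshoots the real statements.
    var nop stmt = func(*int) {}

    // run executes the compiled statements four at a time; the tail is padded with nop
    // so the unrolled body never needs a bounds check in the middle.
    func run(code []stmt, env *int) {
        for len(code)%4 != 0 {
            code = append(code, nop)
        }
        for i := 0; i < len(code); i += 4 {
            code[i](env)
            code[i+1](env)
            code[i+2](env)
            code[i+3](env)
        }
    }

    func main() {
        env := 0
        code := []stmt{
            func(e *int) { *e += 1 },
            func(e *int) { *e *= 10 },
            func(e *int) { fmt.Println("env =", *e) },
        }
        run(code, &env)
    }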


Impressive! I would love to learn how to implement that in Julia. Could you help me understand how you did that?

I'd love to see if it's possible to create a libc-free, dependency-free executable without Nim (https://nim-lang.org/).


I put together this example after reading the article:

https://github.com/skx/simple-vm

Simpler than the full gomacro codebase, but perhaps helpful.


For optimal speed, you should move as much code as possible outside the closures.

In particular, you should do the `switch op` at https://github.com/skx/simple-vm/blob/b3917aef0bd6c4178eed0c... outside the closure, and create a different, specialised closure for each case. Otherwise the "fast interpreter" may be almost as slow as a vanilla AST walker.
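To illustrate the point with a hypothetical sketch (not the actual simple-vm code): the slow version keeps the dispatch inside the closure, so every execution pays for the switch; the fast version dispatches once up front and returns a specialised closure.

    package main

    import "fmt"

    type op func(stack []int) []int

    // compileSlow returns a closure that still switches on the opcode at run time.
    func compileSlow(opcode string) op {
        return func(stack []int) []int {
            switch opcode {
            case "add":
                n := len(stack)
                return append(stack[:n-2], stack[n-2]+stack[n-1])
            case "dup":
                return append(stack, stack[len(stack)-1])
            }
            return stack
        }
    }

    // compileFast dispatches once, at "compile" time; each case returns a closure
    // that does only its own work.
    func compileFast(opcode string) op {
        switch opcode {
        case "add":
            return func(stack []int) []int {
                n := len(stack)
                return append(stack[:n-2], stack[n-2]+stack[n-1])
            }
        case "dup":
            return func(stack []int) []int {
                return append(stack, stack[len(stack)-1])
            }
        }
        return func(stack []int) []int { return stack }
    }

    func main() {
        prog := []op{compileFast("dup"), compileFast("add")}
        stack := []int{21}
        for _, f := range prog {
            stack = f(stack)
        }
        fmt.Println(stack) // [42]
    }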


That's fair, thanks for taking the time to look and giving the feedback.


The core idea is simple: do a type analysis on each expression you want to "compile" to a closure, and instantiate the correct closure for each type combination.

Here is a pseudocode example, adapted from gomacro sources:

https://gist.github.com/cosmos72/f971c172e71d08030f92a1fc5fa...

This works best for "compiling" statically typed languages, and while much faster than an AST interpreter, the "tree of closures" above is still ~10 times slower than natively compiled code. And it's usually also slower than JIT-compiled code.
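A rough Go sketch of the idea (mine, not the gist's code; gomacro's real closures capture a full environment rather than the toy one used here):

    package main

    import "fmt"

    // Toy environment: just a slice of int variables.
    type env struct{ ints []int }

    // intExpr is the closure type chosen when type analysis says an expression is an int.
    // A real interpreter has one such closure type per static type (float64, string, ...).
    type intExpr func(*env) int

    func compileIntVar(idx int) intExpr {
        return func(e *env) int { return e.ints[idx] }
    }

    func compileIntConst(c int) intExpr {
        return func(e *env) int { return c }
    }

    // compileAddInt is selected once, at "compile" time, for the int+int case;
    // the returned closure contains no type checks or dispatch at all.
    func compileAddInt(x, y intExpr) intExpr {
        return func(e *env) int { return x(e) + y(e) }
    }

    func main() {
        // "compile" the expression v0 + 100, having inferred int + int
        compiled := compileAddInt(compileIntVar(0), compileIntConst(100))
        fmt.Println(compiled(&env{ints: []int{23}})) // prints 123
    }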


You mean a command or builtin `(` inside a traditional shell such as bash or zsh?

It would be quite limited:

internal state would not persist between invocations,

and it would only be able to exchange unstructured data (a stream of bytes) with the shell


Porting to a different Scheme implementation requires some effort: schemesh needs a good, bidirectional C FFI and an (eval) that allows any Scheme form, including definitions.

For creating a single `schemesh` executable with the usual shell-compatible options and arguments, the Scheme implementation also needs to be linkable as a library from C:

Chez Scheme provides a `kernel.o` or `libkernel.a` library that you can link into C code, then call the C functions Sscheme_init(), Sregister_boot_file() and finally Scall0(some_scheme_repl_procedure) or Sscheme_start()


Rash and schemesh start from similar ideas: create a shell scriptable in some dialect of Lisp.

Rash has several limitations, sometimes due to design choices, that schemesh solves:

1. no job control

2. multi-line editing is limited

3. from what I understand, shell syntax is available only at REPL top level. Once you switch to Lisp syntax with `(`, you can return to shell syntax only with `)`. This means you cannot embed shell syntax inside Lisp syntax, i.e. you cannot do `(define j {find -type f | less})`

4. shell commands are Lisp functions, not Lisp objects. Inspecting and redirecting them after they have been created is difficult

5. Rash is written in Racket, which has larger RAM footprint than schemesh running on vanilla Chez Scheme: at startup, ~160MB vs. ~32MB

6. Racket/Rash support for multi-language at REPL is limited: once you do `#lang racket`, you cannot go back to `#lang rash`


> 3. from what I understand, shell syntax is available only at REPL top level. Once you switch to Lisp syntax with `(`, you can return to shell syntax only with `)`. This means you cannot embed shell syntax inside Lisp syntax, i.e. you cannot do `(define j {find -type f | less})`

It's possible I misunderstand what you mean because I'm not sure what piping to less is supposed to accomplish here, but this is not true. The following program works just fine:

    #lang rash

    (require racket/port
             racket/string)

    (define (echo!!! message)
      (define output {echo $message |> port->string |> string-trim})
      (string-append output "!!!"))

    (echo!!! "Hello")


Nice! Then my point 3 above is wrong and should be deleted.


Yes, Rash has a variety of limitations. Let me give some more context to these:

>1. no job control

Racket is missing a feature in its rktio library needed to do job control with its process API, which Rash uses. At one point I added one or two other minor features needed for job control, but I ran out of steam and never finished the final one. It's a small feature, even, though now I don't remember much of the context. I hope I wrote enough notes to go back and finish this some day.

>2. multi-line editing is limited

I always intended to write a nice line editor that would do this properly. But, again, I never got around to it. I would still like to, and I will probably take a serious look at your line editor some time.

The design was intended as something to use interactively as well as for scripting. But since I never improved the line editing situation, even I only use it for scripting. After documentation issues, this is the most pressing thing that I would fix.

>3. from what I understand, shell syntax is available only at REPL top level. Once you switch to Lisp syntax with `(`, you can return to shell syntax only with `)`. Thus means you cannot embed shell syntax inside Lisp syntax, i.e. you cannot do `(define j {find -type f | less})`

As mentioned, this is not correct; you can recursively switch between shell and Lisp.

>4. shell commands are Lisp functions, not Lisp objects. Inspecting and redirecting them after they have been created is difficult

This one is a design flaw. I've meant to go back and fix it (eg. just retrofitting a new pipe operator that returns the subprocess pipeline segment as an object rather than its ports or outputs), but, of course, haven't gotten around to it.

>5. Rash is written in Racket, which has larger RAM footprint than schemesh running on vanilla Chez Scheme: at startup, ~160MB vs. ~32MB

Yep.

>6. Racket/Rash support for multi-language at REPL is limited: once you do `#lang racket`, you cannot go back to `#lang rash`

So actually `#lang` is not supported at all in the REPL. It's neither supported in the Racket REPL nor the rash repl. In practice, what `#lang` does is (1) set the reader for a module, and (2) set the base import for the module, IE what symbol definitions are available. With the REPL you have to do this more manually. The repl in Racket is sort of second class in various ways, in part due to the “the top level is hopeless” problems for macros. (Search for that phrase and you can find many issues with repls and macros discussed over the years in the Racket mailing list.) Defining a new `#lang` in Racket includes various pieces about setting up modules specifically, and since the top level repl is not a module, it would need some different support that is currently not there, and would need to be retrofitted for various `#lang`s. But you can start a repl with an arbitrary reader, and use `eval` with arbitrary modules loaded or symbols defined. My intention with a rash line editor would also have been to make some infrastructure for better language-specific repls in racket generally. But, well, obviously I never actually did it. If I do make time for rash repl improvements in the near future, it will just as likely be ways for using it more nicely with emacs rather than actually writing a new line editor... we'll see.

I'm always sad when I think about how I've left Rash to languish. In grad school I was always stressed about publication (which I ultimately did poorly at), which sapped a lot of my desire and energy to actually get into the code and make improvements. Since graduating and going into industry, and with kids, I've rarely felt like I have the time or energy after all of my other responsibilities to spend time on hobby projects. Some day I would like to get back into it, fix its issues, polish it up, document it properly, etc. Alas, maybe some day.

