This is one of the best uses I've found for Singeli[0]. Here's how I implemented an AVX2 transpose kernel similar to that in transpose_Vec256_kernel, for generic type and vector/kernel size (I use unpack instructions rather than shuffle and blend, which I think is probably faster since it's just one instruction for each interaction):
The language is oriented towards compile-time array programming instead of managing a bunch of individual vectors. So you have runtime vec_select{} (docs at [1]), mirrored by compile-time select{}, and the indices generated by pairs{} can be used in either.
That's not O(N log log N), it's more like N^2. Prime sieves are hard to implement well with immutable arrays for obvious reasons; there are some cool methods but they're definitely harder. I'm ashamed to be part of a community that won't cop to this.
The algorithm iterates over numbers ⍺ from 2 to N, removing the multiples that are greater than ⍺ and no greater than N from p. If the removal with ~ has to inspect all of p, then all the primes are there so we have asymptotically at least N/log N entries by the prime number theorem and we get N^2/log N time (when ⍺ is over N/2, no multiples are in range so this can be skipped, but that just cuts a constant factor 2 from the time). Conceivably p could be marked sorted, so the entries to remove could be found with a binary search. This is a bit harder to analyze, but I think each prime under √N will cause the list to change, and incur N/log N data movement. So you get at least (N/log N)^(3/2) cost, still quite a bit worse than linear.
Edit: changed the algorithm while I was writing... the new one is better, it keeps one list p of primes and one list ω of not-yet-marked-out numbers. However, primes are removed from ω one at a time, so that each of the N/log N primes has to be moved for each one before it, giving at least (N/log N)^2 cost (I mean, maybe an interpreter could binary search and also recognize when ~ only drops the first entry and do that by slicing? But the (N/log N)^(3/2) from above definitely holds). Mutating a bit array in place is pretty important to classical sieve performance.
It's possible to get O(N^(3/2)/log N) in an ordinary interpreter with some changes to the code, assuming linear-time ~ with hashing. The idea is to leave the primes in ⍵ and stop when it stops changing, which will happen at the first prime over √N. To get the complexity multiply by O(N) for each step. It's also a small change to get the multiples to remove to start at n×n instead of n.
i←¯1⋄{⍵~n×n↓⍳1+⌊N÷n←i⊃⍵⊣i+←1}⍣≡1↓1+⍳N
I think this is about as good as can be done with ~ instead of marking out bits. And I wouldn't say it's as easy as the imperative version!
Eep. You're right. Evidently, I didn't even know what a sieve was in this context and wrote a search instead. You got me to do a bit of research. Actually, this discussion is exactly where I think APL shows one of it's strengths. It feels like a human communication tool more than any other PL I've mucked about with. The hard parts here are not language issues but fundamental understanding ones.
It's a tad tricky to carefully analyze the asymptotics of my above prime generator, since the search space of Without (~) shrinks on each iteration. I think Merten's theorem gives an estimate of e^-γ/log(p_i), which we need to sum for all primes up to N. Taking prime density 1/log(n) and integrating N/(log x)^2 over our range is O(N^2/log N), I think.
> Mutating a bit array in place is pretty important to classical sieve performance.
It's very strange to see handwriting lumped in with typewriting, to be described as limited relative to screens! Iverson notation was a 2D format (both in handwriting and typeset publications) making use of superscripts, subscripts, and vertical stacking like mathematics. It was linearized to allow for computer execution, but the designers described this as making the language more general rather than less:
> The practical objective of linearizing the typography also led to increased uniformity and generality. It led to the present bracketed form of indexing, which removes the rank limitation on arrays imposed by use of superscripts and subscripts.
I think this is more true than they realized at that time. The paper describes the outer product, which in Iverson notation was written as a function with a superscript ∘ and in APL became ∘. followed by the function. In both cases only primitive functions were allowed, that is, single glyphs. However, APL's notation easily extends to any function used in an outer product, no matter how long. But Iverson notation would have you write it in the lower half of the line, which would quickly start to look bad.
I don't think this really describes neon_prefixsum_fast as a whole? The algorithm does use a Hillis-Steele sum on sums of 4 values, but each of these is computed with a sequential sum, interleaving those with a transposed order. In terms of what's added to what, it's actually quite a bit like my "Sequential broadcasting" picture from [0]. The reference I'd use for a general form is "Parallel Scan as a Multidimensional Array Problem"[1], breaking 16 elements into a 4x4 array; the paper describes how the scan splits into a row-wise scan, plus values obtained from an exclusive scan on carries from the rows.
I was interested in how it would make sense to define complex numbers without fixing the reals, but I'm not terribly convinced by the method here. It seemed kind of suspect that you'd reduce the complex numbers purely to its field properties of addition and multiplication when these aren't enough to get from the rationals to the reals (some limit-like construction is needed; the article uses Dedekind cuts later on). Anyway, the "algebraic conception" is defined as "up to isomorphism, the unique algebraically closed field of characteristic zero and size continuum", that is, you just declare it has the same size as the reals. And of course now you have no way to tell where π is, since it has no algebraic relation to the distinguished numbers 0 and 1. If I'm reading right, this can be done with any uncountable cardinality with uniqueness up to isomorphism. It's interesting that algebraic closure is enough to get you this far, but with the arbitrary choice of cardinality and all these "wild automorphisms", doesn't this construction just seem... defective?
It feels a bit like the article's trying to extend some legitimate debate about whether fixing i versus -i is natural to push this other definition as an equal contender, but there's hardly any support offered. I expect the last-place 28% poll showing, if it does reflect serious mathematicians at all, is those who treat the topological structure as a given or didn't think much about the implications of leaving it out.
More on not being able to find π, as I'm piecing it together: given only the field structure, you can't construct an equation identifying π or even narrowing it down, because if π is the only free variable then it will work out to finding roots of a polynomial (you only have field operations!) and π is transcendental so that polynomial can only be 0 (if you're allowed to use not-equals instead of equals, of course you can specify that π isn't in various sets of algebraic numbers). With other free variables, because the field's algebraically closed, you can fix π to whatever transcendental you like and still solve for the remaining variables. So it's something like, the rationals plus a continuum's worth of arbitrary field extensions? Not terribly surprising that all instances of this are isomorphic as fields but it's starting to feel about as useful as claiming the real numbers are "up to set isomorphism, the unique set whose cardinality matches the power set of the natural numbers", like, of course it's got automorphisms, you didn't finish defining it.
You need some notion of order or of metric structure if you want to talk about numbers being "close" enough to π. This is related to the property of completeness for the real numbers, which is rather important. Ultimately, the real numbers are also a rigorously defined abstraction for the common notion of approximating some extant but perhaps not fully known quantity.
There's a related idea in mathematics, the proof that the real numbers are a vector space over the rational numbers. If you scramble the basis vectors, you obtain an isomorphic vector space, but it is effectively a "permutation" of |R. Of course, vector spaces don't even have multiplication, but one interesting thing is that the proof requires the axiom of choice.
I think that actually constructing a "nontrivial" model of C using the field conception might require choosing a member from each of an infinite family of sets, i.e. it requires applying the axiom of choice, similar to the way you construct R as a vector space.
Ooh, I've run into this one before! I'm a big fan of interval index[0], which performs a binary search, so Josh's suggestion is the one I prefer as well (the implementation might sometimes optimize by transforming it into a table lookup like the other solutions). Searching for +`≠¨ in my BQN files turns up a few uses, including one used to group primitives into types in a utility involved in compiling BQN[1] and an annotated function used a few times in the markdown processor that builds BQN's website[2].
"Ken Iverson - The dictionary of J contains an introductory comment that J is a dialect of APL, so in a sense the whole debate is Ken's fault! He is flattered to think that he has actually created a new language."
Dunno why electroly is dragging me into this but I believe you've misread the article. When it says "His languages take significantly after APL" it means the languages themselves and not their implementations.
I think the article expresses no position. Most source code for array languages is not, in fact, inspired by APL. I encourage you to check a few random entries at [0]; Kap and April are some particularly wordy implementations, and even A+ mostly consists of code by programmers other than Whitney, with a variety of styles.
I do agree that Whitney was inspired to some extent by APL conventions (not exclusively; he was quite a Lisp fan and that's the source of his indentation style when he writes multi-line functions, e.g. in [1]). The original comment was not just a summary of this claim but more like an elaboration, and began with the much stronger statement "The way to understand Arthur Whitney's C code is to first learn APL", which I moderately disagree with.
I unfortunately glossed over the part of the original comment that gives it substance: "The most obvious of the typographic stylings--the lack of spaces, single-character names, and functions on a single line--are how he writes APL too."
That's backing for a claim.
Also, I haven't once written APL. I think this might've been borderline trolling, just because of how little investment I have in the topic in reality. Sorry.
It looks like a weirdo C convention to APLers too though. Whitney writes K that way, but single-line functions in particular aren't used a lot in production APL, and weren't even possible before dfns were introduced (the classic "tradfn" always starts with a header line). All the stuff like macros with implicit variable names, type punning, and ternary operators just doesn't exist in APL. And what APL's actually about, arithmetic and other primives that act on whole immutable arrays, is not part of the style at all!
It's just. So gross. Say it. Sudden interruption of slime coming up your throat. Like walking out the door into a spiderweb. Alphabetically I was mistaken but in every way that matters I was right.
Ordinarily I'd make fun of the Germans for giving such an ugly name to a nice concept, but I've always found "comfortable" to be rather unpleasant too (the root "comfort" is fine).
https://github.com/dzaima/CBQN/blob/v0.11.0/src/singeli/src/...
The language is oriented towards compile-time array programming instead of managing a bunch of individual vectors. So you have runtime vec_select{} (docs at [1]), mirrored by compile-time select{}, and the indices generated by pairs{} can be used in either.
[0] https://github.com/mlochbaum/Singeli/
[1] https://github.com/mlochbaum/Singeli/tree/master/include#sim...
reply