
The original Wordle came with a pre-baked ordered list of 2315 "secret" words, from which the daily secret word was looked up (I think based on local time). The list was right there in the JavaScript code of the game (alongside the list of 12972 allowed guess words). It covered dates from 2021-06-19 to 2027-10-20.

Then in January 2022, the NYT bought Wordle, and started tweaking both lists, first shrinking the secret word list to 2309 entries, but leaving the logic otherwise intact. Fast forward to today, I looked up the current code [1], and it seems that there are now 14855 allowed words. The first 12546 are ordered alphabetically (0: "aahed", 12545: "zymic"), and the next 2309 are not. This may suggest that the latter are the secret words, but the logic for picking them has changed: I found no obvious sequence, when compared to the last few days' secret words. So it's either a more complex sequence, or the secret word is picked server-side.
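If the original scheme really was a date-based lookup, it would amount to something like the following sketch (a guess on my part: the launch date of 2021-06-19 comes from the list's date range, and the modulo wrap-around is an assumption, since the original list covered exactly one pass through the dates):

```c
/* Hypothetical sketch of the original client-side word selection:
   the daily index is just the number of days since launch, wrapped
   around the secret-word list length. */
int wordle_index(long days_since_launch, int list_len) {
    long i = days_since_launch % list_len;
    return (int)(i < 0 ? i + list_len : i);
}
```

So day 0 (2021-06-19) maps to the first entry, and anyone with the list can read off any future day's word.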

In any case, I guess they decided to re-shuffle the list now, at day 1689 / 2309, in order to avoid giving particularly assiduous players an additional bit of information: they can exclude all previous secret words. (To be accurate, I think this would be 1.897 bits, but my information theory is rusty.)

[1] https://www.nytimes.com/games-assets/v2/9003.896ec900f2a1ce8...


Wordle now credits an "Editor" for each day, so I guess that person is the one who hand-picks the word for that day?


I agree that their results are impressive. Just to be clear, however:

1. They compare their solver with a 1e-4 error tolerance to Gurobi with 1e-6. This may seem like a detail, but in the context of how typical LPs are formulated, this is a big difference. They have to do things this way because their solver simply isn't able to reach better accuracy (meanwhile, you can ask Gurobi for 1e-9, and it will happily comply in most cases).

2. They disable presolve, which is 100% reasonable in a scientific paper (makes things more reproducible, gives a better idea of what the solver actually does). If you look at their results to evaluate which solver you should use, though, the results will be misleading, because presolve is a huge part of what makes SOTA solvers fast.


Hmm... I am reading [1] right now. In Table 7 and Table 11 of [1], they report comparison results with Gurobi's presolve enabled and a 1e-8 error tolerance. Am I missing anything?

Their performance isn't quite as good as Gurobi's barrier method, but it's still within a reasonable factor, which is impressive.


Regarding presolve: When they test their solver "with presolve", they use Gurobi's presolve as a preprocessing step, then run their solver on the output. To be clear, this is perfectly fair, but from the perspective of "can I switch over from the solver I'm currently using", this is a big caveat.

They indeed report being 5x slower than Gurobi at 1e-8 precision on Mittelmann instances, which is great. Then again, Mittelmann himself reports them as 15x off COPT, even when allowed to use 1e-4. This is perfectly explainable (COPT is great at benchmarks; there is the presolve issue above; the Mittelmann instance set is a moving target), but I would regard the latter number as more useful from a practitioner's perspective.

This is not to diminish PDLP's usefulness. If you have a huge instance, it may be your only option!


I think it's partially excusable. Most LP solvers target large-scale instances, but instances that still fit in RAM. Think single-digit millions of variables and constraints, maybe a billion nonzeros at most. PDLP is not designed for this type of instance and gets trounced by the best solvers at this game [1]: more than 15x slower (shifted geometric mean) while being 100x less accurate (1e-4 tolerances where other solvers work with 1e-6).

PDLP is targeted at instances for which factorizations won't fit in memory. I think their idea for now is to give acceptable solutions for gigantic instances when other solvers crash.

[1] https://plato.asu.edu/ftp/lpfeas.html


Hans D. Mittelmann, you are my top dude when it comes to web design. I salute your aesthetic.

[1] https://plato.asu.edu/


FYI, the table does not include the commercial top dogs (IBM CPLEX, Gurobi, FICO Xpress), since they all withdrew, reportedly due to politics.


Indeed, those are the "big four" solver businesses in the West, and probably also the most reliably good solvers. But by the time Gurobi withdrew from the benchmarks (a few weeks ago), COPT was handily beating them in the LP benchmarks, and closing in on them in the MIP benchmarks. Solver devs like to accuse each other of gaming benchmarks, but I'm not convinced anyone is outright cheating right now. Plus, all solver companies have poached quite a bit from each other since CPLEX lost all its devs, which probably levels the playing field. So overall, I think the Mittelmann benchmarks still provide a good rough estimate of where the SOTA is.


The article is very recent (September 2023 [1]) and mentions that "rax is used to return integer values" in the SysV ABI (hence, implicitly, the 64-bit SysV ABI). Also confirmed in this excerpt:

> Note that the top 56 bits in rax are not zeroed – they contain junk. This is fine b/c the compiler will only make callers check the lowest bit of a register for boolean operations. This is why changing the compiler’s “understanding” (ie the cast) is necessary.

... and yet, the function ABI is clearly i386 (fetching arguments from stack) and indeed everything is compiled with -m32 [2] (i.e. 32-bit SysV ABI). This is a strange contradiction in 2023. On the Intel/AMD side, x86_64 has been prevalent (and the default) for... more than 15 years?

It does not invalidate the article's point, but it is slightly confusing.

[1] https://dxuuu.xyz/

[2] https://godbolt.org/z/ff8r44nKn


The article has too many x86 assumptions. The casts are most definitely _not_ free, because there are multiple operations like sign-extending, zero-extending, etc. which depend on the ABI. The fact that these are all almost free in x86 is irrelevant since some architectures will have to go out of their way to implement the operation (albeit the ABI is usually designed to make these operations simple). Similarly, I could also imagine another architecture having a shorter instruction to set a register to 0/1, or even one where _Bool is defined with the machine word size and thus doesn't need any cast whatsoever (not even sign extension).

On such an architecture, "casting to bool" would be free, but other, narrowing type conversions would not be.
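To make the distinction concrete (this is generic C semantics, not code from the article): a cast to _Bool always normalizes the value to 0 or 1, whereas a narrowing conversion merely keeps the low bits, so the two can disagree:

```c
#include <stdbool.h>

/* (bool)x is 1 for ANY nonzero x, even if its low bits are all zero,
   so the compiler may need a real comparison, not just a truncation. */
int as_bool(int x)  { return (bool)x; }

/* A narrowing conversion keeps only the low 8 bits instead. */
int low_byte(int x) { return (unsigned char)x; }
```

With x = 0x100, as_bool(x) is 1 but low_byte(x) is 0, which is exactly why a bool cast can't be implemented as a plain truncation.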


Above two comments are very helpful, thanks. I've corrected the -m32 issue.

And thanks for pointing out the x86/ABI assumptions. Had not considered that. It's certainly interesting to think about.


This representation of a jagged array is literally the "Compressed Sparse Row" (CSR) format for sparse matrices [1]:

- a single array of floats with all the nonzero matrix entries, ordered row-by-row:

  double values[nnz];
- a single array of ints with the corresponding column index:

  int columns[nnz];
- a single array of ints, of length (m+1), with the start of each row in the two other arrays (the (m+1)th element stores nnz, which will be useful later):

  int start[m + 1];
So to print a whole matrix, you do this:

  for (int i = 0; i < m; i++) {
    int index = start[i];                 // where row i begins in columns/values
    int count = start[i + 1] - start[i];  // number of nonzeros in row i

    for (int t = 0; t < count; t++) {
      int j = columns[index + t];
      double v = values[index + t];
      printf("element (%d, %d) has value %g\n", i, j, v);
    }
  }
To a modern programmer, this may look very old-fashioned (Fortran-ish), but I have tried many alternatives (like grouping columns and values together in a struct, or using pointers), and in my numerical codes, nothing gets even close in terms of performance. It must be said that CSR and CSC (its transpose) are very compact in memory, which is often a bottleneck. Also, since so many sparse numerical codes in benchmarks have used this representation for decades, it could be that CPU designers make sure that it works well.
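As an illustration of why this layout performs so well (my addition, not part of the comment above): the workhorse operation of most sparse codes, matrix-vector multiplication, streams values[] and columns[] contiguously, which is about as cache-friendly as sparse code gets:

```c
/* y = A*x for an m-row CSR matrix, using the three arrays described
   above. The inner loop reads values[] and columns[] sequentially;
   only the x[columns[t]] gather is irregular. */
void csr_matvec(int m, const int *start, const int *columns,
                const double *values, const double *x, double *y) {
    for (int i = 0; i < m; i++) {
        double sum = 0.0;
        for (int t = start[i]; t < start[i + 1]; t++)
            sum += values[t] * x[columns[t]];
        y[i] = sum;
    }
}
```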

[1] https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_spars...


Thanks for this comment, it was super interesting! Apparently Matlab uses a similar format (Compressed Sparse Column) to represent sparse matrices.


Have you tried using precomputed byte offsets rather than indices for all the integers? Those would definitely speed up the non-power-of-2-sized case, but I'd be curious if they might speed this up a little bit too, since they should remove some bit shifting instructions.


You often need the indices for other operations (indexing into other arrays). CSC, its transpose CSR, and the doubly compressed forms DCSC/DCSR are optimized for linear algebra, for the most part.


A very nice (and fast) C implementation is CSparse (now part of SuiteSparse): https://people.engr.tamu.edu/davis/suitesparse.html


I’m fairly sure anecdotes like this inform the conversation about structs of arrays versus arrays of structs.

There are a couple of languages that let you treat them the same way. Having list comprehensions (plural) over multidimensional data would be handy.


How do you insert elements at the end of a row?


Ligatures are an interesting contrast to Julia programmers' habit of using actual (non-Latin) Unicode characters in their code. It seemed to me that the use of Greek letters and symbols is actively encouraged in the Julia documentation. I think they advise using editors that allow inputting special characters through their LaTeX equivalent (but source files are then saved to disk with the Unicode characters).

I am not very familiar with the latest trends, could anyone else chime in here? Do Julia coders use fonts with ligatures? That could indeed be one case where confusion is possible. Or maybe not?


A lot of Julia folks use the JuliaMono font, which has a minimal set of ligatures: https://juliamono.netlify.app/#contextual_alternates (just above this segment, they mention this very article as something they agree with).

I personally don't like any ligatures, especially when people use them in presentations to the public, like the article mentions. Confusion between operators is rare, but possible, e.g. when using Catalyst (https://docs.sciml.ai/Catalyst/stable/catalyst_functionality...).

> It seemed to me that the use of Greek letters and symbols is actively encouraged in the Julia documentation.

I don't remember getting that sense from the docs, but I may be wrong. The general convention is that Unicode at the level of user code is fine, and a lot of the community (including me) likes it because it makes it much closer to the actual equations and scientific notation we're working with. For libraries, it's strongly encouraged to have ASCII equivalents for any Unicode operators, keywords, etc. that you expose, and all the major libraries do this. The same is true in base Julia as well, every non-ASCII Unicode name has an ASCII equivalent too.


Whatever font you prefer in the terminal, Julia Mono is an excellent fallback due to its high symbol coverage.


You know what, I'd tried JuliaMono in my editor (gVim) and for some reason didn't like it there, but making it my console font makes my terminal look so much better. Thanks for the recommendation.


It's definitely a slightly quirky font. It took me a bit to decide that I like it, but after a bit, I've gotten pretty used to it and it has become my default mono font.


I'm not familiar with Julia, but I think a useful distinction can be made between input methods and ligatures. An input method is what you have to type to get a particular "code point" or character. Japanese, for example, is often typed using a Latin keyboard, and as Latin characters are typed, the application substitutes the appropriate Japanese character code point. For example, if I type the characters "n" (U+006E) and "i" (U+0069) using a Japanese input method, a に (U+306B HIRAGANA LETTER NI) (or 二 U+4E8C CJK UNIFIED IDEOGRAPH-4E8C or ニ U+30CB KATAKANA LETTER NI, or others as the case may be) can be selected to be entered. The actual character bytes will be substituted. You typed "n" and "i" but what gets stored is U+306B.

This is in contrast to ligatures: say there's a "≤" ligature; I type "<" then "=", and the "≤" glyph is substituted during rendering, but the individual characters "<" and "=" are still the stored character string. (You could also choose to insert the "≤" character directly, and that would then be a single code point, but I think for the purposes of this discussion, people are mostly talking about cases where the character string "<=" is rendered as "≤".)

If Julia (like APL) uses characters outside the ASCII range as syntax, I suspect that there are input methods to assist in entering them as opposed to ligatures representing them on screen. But that's just naïve supposition on my part.

Another interesting case is TLA+ [1], where the language shares some LaTeX operators which can be rendered for display.

[1]: https://lamport.azurewebsites.net/tla/tla.html


MemComputing mainly present themselves as optimizers: Their machine can solve real optimization problems, in practice, today. This is very good because, through that lens, we can skip the BS and go straight to the facts. The evaluation criteria for any optimization approach/algorithm are:

- the quality of the solutions found (in the best case: optimal solutions)

- the presence or absence of optimality guarantees (for example, some algorithms provide optimal solutions very often, but cannot guarantee it 100%)

- the time (or computational cost) needed to find such solutions

Furthermore, the state-of-the-art (SOTA) is well known for most types of optimization problems. In this article, they present a list of real and practical applications, so let us have a look at them one by one:

1.1 Traffic flow optimization

Simple flow problems can be solved in polynomial time (and quickly in practice), so there is no need for anything fancy. Once you introduce additional constraints or discrete variables, the SOTA is mixed-integer programming for offline problems. For online problems, it's more complicated. In both cases, I am not aware of any application in which MemComputing reaches the SOTA.

1.2 Vehicle routing & scheduling

Here, depending on your computing time constraints, needs for solution quality and/or optimality guarantees, the SOTA can be constraint programming (gecode, OptaPlanner), local search heuristics (LocalSolver), or mixed-integer programming (CPLEX/XPress/Gurobi) with column generation. MemComputing is nowhere to be seen again.

1.3 Supply chain optimization

This is a very broad field, but in general mixed-integer programming is king here.

2.1 Protein folding

Here there is quite an objective measure: the biennial CASP competition. This is where AlphaFold made a splash in 2020. MemComputing has never participated.

2.2 Genome sequencing

I am not knowledgeable enough to comment here.

2.3 Radiotherapy treatment

I am not very knowledgeable here either, but last I looked mixed-integer programming approaches were favored.

3.1 Portfolio risk optimization

Various types of branch-and-bound solvers. Mixed-integer linear/quadratic/convex programming. No MemComputing.

3.2 Detecting market instabilities

No idea.

3.3 Optimizing trading trajectories

No idea.

4.1 Training neural networks

Many people here know how this is done. Stochastic gradient descent on GPUs or TPUs. No MemComputing involved. How can they even claim to be active in this field?

4.2 Detecting statistical anomalies

Vague.

4.3 Classifying unstructured datasets

No idea.

The problem is that if you invent a new optimization algorithm, it is very easy to find one instance of one problem for which your algorithm works well. They did exactly that in a paper [1]: they took a library of mixed-integer programming problems containing 270 benchmark instances, and published a whitepaper showing that they beat a SOTA solver on one of them. A single instance out of 270!

The really hard part is the opposite: given a class of problems, find an algorithm that beats the SOTA. MemComputing has never done that. Combined with their propensity for grand claims backed by misleading evidence, MemComputing has accumulated a lot of ill will from the academic community over the years. My suspicion is that, while on the surface this post seems to contrast their approach with quantum computing, what they are really trying to do is ride the quantum computing hype wave.

[1] https://arxiv.org/abs/2003.10644


The scientific article seems to be open access [1].

Before people draw links to recent large language model breakthroughs: Although they do use techniques from computational linguistics, there are no neural networks involved. This is more like old-school AI.

They have essentially a giant optimization problem, and they (approximately) model it as a lattice parsing problem with a stochastic context-free grammar. They can solve that to optimality in O(n^3), which is too slow for some applications. So they propose an O(n) heuristic (hence no optimality guarantees, but the model was approximate to begin with anyway, and the heuristic is a lot faster), which is the reason for the name of their code: "LinearDesign".

[1] https://www.nature.com/articles/s41586-023-06127-z


Much appreciated! Do you have a good source on what a lattice parsing problem is?


"The lattice parsing problem refers to the task of parsing a word lattice, which is a graph structure that represents multiple possible sequences of words that could have generated a given speech signal [1]. The word lattice is a weighted directed acyclic graph, where each node represents a word hypothesis and each edge represents a transition between two words. The weights on the edges represent the likelihood of the transition. The goal of lattice parsing is to find the most likely sequence of words that generated the speech signal, given the word lattice [1]. Lattice parsing is a challenging problem because the word lattice can be very large and contain many alternative paths, making it difficult to find the most likely path efficiently [1]. Several techniques have been proposed to address this problem, including bi-directional LR parsing from an anchor word, augmented chart data structure, and attention shifting for parsing speech [2][3][4]."

1. https://doi.org/10.21437/interspeech.2016-1583

2. https://doi.org/10.3115/997939.997950

3. https://dl.acm.org/doi/10.3115/991146.991188

4. https://doi.org/10.21236/ada105028

----

(full disclosure this might not be correct, I tried this with an LLM approach we're beta testing at my job called scite Assistant that answers with real references - no hallucinations, just curious how the response is against someone that knows the field a bit more..!)


The video link in the article works in Chrome but not Firefox mobile (for some reason I can't comprehend).

Working link in case you are interested:

https://www.youtube.com/watch?v=ujjkbLZETHs


I ran into the same issue. Opening the video link in Chrome incognito works. Going directly to the video link after visiting the article (with blocked embedded video) causes a connection refused error to YouTube.

I haven't seen this with YouTube before. It must have to do with the embed settings being disabled for the video and a cookie getting set. If the cookie is present, the connection to the server is refused.


Also doesn't work in (Chromium-based) Brave. Might be Google product-tying its YouTube dominance to Chrome dominance.


> The best-known algorithms for NP-complete problems are essentially searching for a solution from all possible answers. The traveling salesman problem on a graph of a few hundred points would take years to run on a supercomputer.

This shortcut in explaining NP-completeness is frustratingly widespread. It is also very wrong. A 49-city TSP was solved (to provable optimality) by hand by 1954 [1]. Concorde [2] can solve TSP instances with more than a hundred thousand cities [3]. In practice, for a graph with a few hundred points, you will most likely find an optimal tour in minutes on an iPhone (yes, there is an app for that [4]).

And it's not just about TSP. SATs [5] and MIPs [6] with hundreds of thousands of variables are solved every day to provable optimality. How? Well, sure, the best-known algorithms for NP-complete problems are of course exponential in the worst case. But they are definitely not "essentially searching for a solution from all possible answers", that would indeed be prohibitively costly. We have algorithms that perform much better on practical instances (dynamic programming, backtracking, branch-and-bound, etc.).
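To make the point concrete, here is a sketch of one such technique: the Held-Karp dynamic program for exact TSP. (This is not what Concorde actually uses — Concorde is a branch-and-cut solver — but it already illustrates the gap: O(n^2 * 2^n) states instead of enumerating all (n-1)! tours.)

```c
#include <string.h>
#include <limits.h>

#define N 5  /* number of cities, kept tiny for illustration */

/* Exact TSP tour length via Held-Karp: dp[mask][u] is the cheapest
   way to start at city 0, visit exactly the cities in mask, and
   end at city u. */
int tsp(int d[N][N]) {
    static int dp[1 << N][N];
    memset(dp, 0x3f, sizeof dp);  /* 0x3f3f3f3f acts as "infinity" */
    dp[1][0] = 0;                 /* tour starts at city 0 */
    for (int mask = 1; mask < (1 << N); mask++)
        for (int u = 0; u < N; u++) {
            if (!(mask & (1 << u)) || dp[mask][u] >= 0x3f3f3f3f)
                continue;
            for (int v = 0; v < N; v++) {  /* extend the path to city v */
                if (mask & (1 << v))
                    continue;
                int nm = mask | (1 << v);
                if (dp[mask][u] + d[u][v] < dp[nm][v])
                    dp[nm][v] = dp[mask][u] + d[u][v];
            }
        }
    /* close the tour back to city 0 */
    int best = INT_MAX, full = (1 << N) - 1;
    for (int u = 1; u < N; u++)
        if (dp[full][u] + d[u][0] < best)
            best = dp[full][u] + d[u][0];
    return best;
}
```

Even this textbook method solves n = 20 in seconds, where brute force is already hopeless; the branch-and-cut machinery in Concorde then scales to the instance sizes mentioned above.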

[1] https://www.math.uwaterloo.ca/tsp/uk/history.html

[2] https://www.math.uwaterloo.ca/tsp/concorde/index.html

[3] https://www.math.uwaterloo.ca/tsp/star/about.html

[4] https://apps.apple.com/us/app/concorde-tsp/id498366515

[5] http://www.satcompetition.org/

[6] http://plato.asu.edu/ftp/milp.html

