Why I sometimes like to write my own number crunching code (dragan.rocks)
67 points by dragandj on June 26, 2021 | hide | past | favorite | 36 comments


I wonder if always telling people not to do anything themselves and to use ready-made libraries is robbing the industry (and especially juniors) of lots of opportunities to learn. Instead of being in control of what you do, you learn to use something made by other people to simplify things. This is a bit like cooking, where in the last 10-20 years I've seen appliances appearing everywhere. Some are very good (I love my slow cooker), some I'm more perplexed about (a stand mixer? I usually get by with either my hands for dense doughs like pizza or bread, or a hand mixer for more liquid/light doughs). If everybody uses libraries, how are we training new library writers? Maybe part of the solution would be to tell people to use libraries, and then to read the code, try to understand it, and contribute.


I feel like one of the unintended consequences of discouraging people from "rolling their own crypto" is that developers who obey that rule never learn what good crypto is and isn't. This, I believe, will lead to more insecure software in the long run.


At the risk of being overly opinionated, I think that developers do not learn what is good or bad cryptography from writing it. We learn what is good or bad code from writing and maintaining it. People learn what is good or bad cryptography from studying the mathematics of cryptography and listening to cryptographers. However, it's very possible for good code to be terrible cryptography for reasons you are very unlikely to learn by writing the code.

The catch is that the reasonable compromises acceptable in other specialist areas often backfire in cryptography. "It's fine, this isn't serious, we'll fix it later" is something you can live with when it just results in a half-baked ORM, but much less so when it results in seriously flawed cryptography that endangers people.

Cryptocat - and decryptocat - is a good example of both points.


I think a pretty reasonable compromise is to reserve custom implementations of potentially dangerous things like crypto for personal projects. As an end user I'd rather run software that just uses sodium, but as a developer I agree that it's necessary that we get experience doing it ourselves.


> I wonder if sometimes always telling people to not do anything themselves and use already made libraries is robbing the industry (and especially juniors) of lots of opportunities to learn.

In an ideal world, that is what university should be for. I rolled a whole bunch of my own numerical code while working on my degree, and got to learn in a 'safe' environment how hard it actually can be and all the pitfalls there are. When people say don't roll your own crypto, what they really mean is don't roll your own crypto and use it in production.

Completely off topic. If I was forced to choose between a slow cooker and a stand mixer, I would keep my stand mixer every time.


> In an ideal world, that is what university should be for.

That's true, but since we're supposed to be learning our whole career, this could work for CS fundamentals but not for new things that appear later.

> Completely off topic. If I was forced to choose between a slow cooker and a stand mixer, I would keep my stand mixer every time.

A big reason I love my slow cooker is that I only have a small induction plate that can either burn food or not cook it at all.


Another case where writing your own numerical algorithms can be faster than using a standard library is when you know you're only working on a small subset of the possible input domain and can write a highly specialized function.

For example, a certain function may take arbitrary-dimensional complex matrices as input. However, if you know that you will only be passing in 3x3 real positive semidefinite matrices, you can probably write a faster function that only works on those matrices by using all that extra information you have. In Matlab I've gotten order-of-magnitude speedups over the standard functions by rolling my own highly specialized functions.
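To make the idea concrete, here is a minimal sketch in Python/NumPy (the function name `quad_form_3x3` and the matrix values are made up for illustration): a hand-unrolled quadratic form that exploits symmetry and the fixed 3x3 size, checked against the general-purpose expression.

```python
import numpy as np

def quad_form_3x3(A, x):
    """Hypothetical specialized routine: x^T A x for a real symmetric
    3x3 matrix A, hand-unrolled for the fixed size and exploiting
    symmetry (each off-diagonal term is computed once and doubled)."""
    x0, x1, x2 = x
    return (A[0, 0] * x0 * x0 + A[1, 1] * x1 * x1 + A[2, 2] * x2 * x2
            + 2.0 * (A[0, 1] * x0 * x1 + A[0, 2] * x0 * x2
                     + A[1, 2] * x1 * x2))

# A symmetric, diagonally dominant (hence positive semidefinite) matrix.
A = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.25],
              [0.0, 0.25, 3.0]])
x = np.array([1.0, 2.0, 3.0])

# Must agree with the general-purpose expression x @ A @ x.
assert np.isclose(quad_form_3x3(A, x), x @ A @ x)
```

Whether the specialized version actually wins depends on the language and runtime; the point is only that the extra assumptions (size, symmetry, realness) give you something to trade on.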


The author has written books about this area, so he's pretty confident that he can write numerical libs. That might be right, but for most people, when you come across some need, it's probably not going to be in your specialty.

I might need a numerical lib at some point, but I also know that there are things like numerical instability, gremlins in this area, some of which I know about but not all.

Something popular with lots of people working on it is likely to be far better for me than rolling my own.

Of course, this means that in the vast majority of cases, the best way forward is to research which lib is best supported and go with that.


It seems like there's been a crash in the FP language market: Scala, F#, ... Clojure.

Anyone any idea why?

Scala 3 almost looks attractive.


In my experience, OOP is taught to basically everyone in schools (AP CS is usually in Java, and so was my DSA class) which makes the jump into FP hard to wrap your head around: I've tried a few things in Haskell, and I kept trying to use weird facsimiles of objects in places they should not go.

And then Rust came along and wrapped a ton of FP concepts into easy-to-conceptualize OOP packages, and that reduced my motivation to actually learn FP properly.

Still slowly trying to learn these things, though; I don't think Rust really managed to capture all the advantages.


FP is also just not as suitable for solving many common CRUD problems as an OOP or imperative design is. An FP solution will be some combination of more verbose, slower, harder to maintain, and less flexible.


I don't think that's true. I've used Phoenix with Elixir and it was a joy to use, easy to understand, and pretty fast. As for "more verbose and harder to maintain," I don't think that's true for statically typed functional programming languages; the compiler does a lot of work that would be tests in other languages, and they're usually quite terse.

For the "slower" part, it depends. Slower compared to what? Most will be faster than Ruby or Python, most will be slower than C++, and they're usually in the ballpark of Java (by that I mean with fewer features but faster than Spring; at least these are the results on TechEmpower).

For the "less flexible" part, I don't know what you mean by that. Harder to extend since you don't have reflection/dynamism? That may be true, but for me these mechanisms usually make a codebase harder to maintain.


> For the "slower" part, it depends.

Boy, does it depend. Sometimes you'll get a pathological memory allocation that makes your Haskell solution the slowest possible. It's this kind of stuff that makes avoiding FP languages a good idea in usual production. You've turned "do X" into "track down compiler implementation details and fix them."

> For the less flexible, I don't know what you mean by that.

I'm going to suggest you lack experience, then. OOP is optimized for the lowest common denominator. That is usually sufficient. You can do amazing things with FP, but it's not usually necessary, and usually gets in the way of accurately describing what your computing machinery is doing.


> Sometimes you'll get a pathological memory allocation that makes your haskell solution the slowest possible. It's this kind of stuff that makes avoiding FP languages a good idea in usual production.

Isn't that mostly due to Haskell being lazy? OCaml is strict and has more predictable performance for example.

> OOP is optimized for the lowest common denominator.

Again, I'm not sure what you mean by this. OOP can be a bit too good at hiding the data flow. One of the most popular books in the OOP space, Clean Code, suggests separating objects into "data structures" and "proper objects", with the former having lots of data and few functions, and the latter the opposite. That sounds to me like using objects means you have to be more careful than when using types and plain functions.

> You can do amazing things with FP, but it's not usually necessary, and usually gets in the way of accurately describing what your computing machinery is doing.

The same could be said about OOP, especially with the design pattern parts. You can write very complex code in FP, and you can write very complex code in OOP. The heavy use of reflection eschewing type safety is a good example of complex code in OOP. If you want to look at easy to understand FP, Elm is a good example. React as a whole is also an example of people moving to FP practices because it's easier to understand.


Perhaps because a lot of the ideas have been picked up by other languages with bigger ecosystems: Swift, Kotlin, Rust, etc.?


Swift itself has almost no market share in the web space, just like Scala, F#, Clojure, OCaml have almost no market share in the Apple ecosystem. Scala also wasn't used in the android ecosystem, which is I think still the major user of Kotlin. On the other hand, Kotlin is also on the server now, but I don't know if it's replacing Java or Scala (maybe both?). Another thing is that Java itself is adopting lots of functional programming features (records, pattern matching recently).

I don't think I've seen any mention of migration from F#, Clojure, OCaml, Haskell, Scala, ... to Rust, but maybe it's easier to pick up than C++ when high performance is needed by teams already used to an ML-like language?


It seems that some functional programming languages are hard to sell to management due to limited adoption, talent pool concerns, and integration with existing systems and code. In the case of Clojure, despite being able to run on the JVM, Clojure has an unfamiliar Lisp syntax to most Java developers and is dynamically typed, which makes the life of IDEs harder, since dynamically typed languages are hard to refactor, it's hard to figure out which type a function returns, and so on. In the case of Scala, Kotlin seems to be eating its lunch because it was born with full tooling support from JetBrains and endorsement by Google on Android. In addition, Kotlin is FP enough without over-engineering and compiles faster than Scala. Regarding F#, the problem might be that most tutorials about the .NET platform are in C#, most codebases are in C#, and Microsoft does not promote F# to the same extent as C#. Moreover, C# is getting more and more functional features.

More and more languages are adding support for functional-programming features, including Java, C#, Swift, D (DLang), Rust, and also C++11, C++17, and C++20.


what do you mean, the FP job market?


So, uhm, why did the cosine similarity give values bigger than one in the library, and how can we be confident that this code does not have that issue?


The answer is 10 / 14, which expands to his answer if you write out the decimals. The library just truncated it in its output. Doing your own cosine similarity is really hard to mess up.


I guess I'm concerned with rounding errors in the intermediate results leading to a similarity of 1.00000001 or something. When you say 10/14, I don't know what you are referring to btw. I couldn't easily find the original bug or library.


> When you say 10/14, I don't know what you are referring to btw.

I did the calculation manually; the answer is exactly 10 divided by 14. Basically you just take ( 1*3 + 2*2 + 3*1 ) / ( sqrt(1*1 + 2*2 + 3*3) * sqrt(3*3 + 2*2 + 1*1) ), and since both norms are sqrt(14), that's 10 / 14. Computing cosine similarities is really simple, as I said.

Edit: The funny thing is that his answer is actually a bit too low, since he truncates instead of rounding at the end. 10 / 14 just continues as 142857 and repeats after that. So if the library's value is smaller than his, then the library is wrong and not him. Anyway, blindly trusting the results of a library is just dumb; they make mistakes, and often enough you are right and they are wrong.
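The calculation being discussed is small enough to sketch directly. Here is a plain Python version (a generic illustration, not the library or the author's code) that reproduces the 10/14 result and shows the clamping trick libraries sometimes use, since floating-point rounding in the norms can nudge a similarity just past 1.0:

```python
import math

def cosine_similarity(a, b):
    """Plain cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# The thread's example: exactly 10/14 up to floating-point rounding.
sim = cosine_similarity([1, 2, 3], [3, 2, 1])

# Because sqrt and the division round, the result can land a hair
# above 1.0 for near-parallel vectors; callers that feed it to acos
# often clamp it into [-1, 1] for that reason.
clamped = max(-1.0, min(1.0, sim))
```

This doesn't identify the specific bug asked about upthread, but it shows where a value-bigger-than-one could plausibly come from in any implementation.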


The matrix library only truncates the printout of matrix entries. The output is, of course, a full 32-bit or 64-bit floating-point number.


good question


I felt that it was a good article, very straightforward, and he made an interesting point about numerical packages often being written by graduate students.

Personally, I would like to learn the math, but I would want to first learn from the 'established' books (i.e. the books the author read on his way to this point), only then will I be in a position to judge the real worth of the author's books.


The author is overconfident and underestimates library writers to a fault. Packing a bunch of dot products into a matrix is only the fastest way to compute a batch of cosine similarities if you are relegated to a parallel matrix product function.

If you write the procedure in raw CUDA, for example, it is faster to simply broadcast the dot products across threads. That is exactly what the matrix multiply is doing except without the overhead of matrix creation and with potentially greater locality of memory accesses.

Edit: I do think it is worthwhile to write your own code so that you better understand what is happening under the hood, so good on the author for that.


> The author is overconfident and underestimates library writers to a fault.

Clicking around the website a bit, it would seem that the author is a library writer (some specifically targeting GPUs), and the article is a plug for the books he's written about numerical analysis. How confident are you?


Not very confident in general. But I have written a LOT of raw CUDA code.


This pattern comes up a lot in comments on this type of post. Author makes a dumb but easy-to-follow example to make a point, and then someone points out how the example is dumb. Try to avoid getting hung up on examples -- in this case nobody reading this post should come away thinking they now know more about cosine similarity than a library author. I dare say the author's main points were:

- it's not that much code

- it's not terrifyingly hard to understand

- you might open up performance benefits in your specific circumstance just by knowing how it's calculated after writing the code to do it yourself


Lest you think mine was a shallow, knee-jerk analysis, this is my area of subject matter expertise. I get all the nuances, but I just felt like defending library authors, whom he unjustly threw under the bus, because that stuff is really hard. Like really hard.

Also, if you are using libraries written by grad students, then you probably need to find better libraries. All of the major battle-hardened ones like BLAS, LAPACK, CUDNN, TF, FFTW, etc., have had major big brains looking at them for years. Decades in some cases.


I didn't read it that way. I took the audience to be more junior programmers to whom numeric library code can seem inscrutable and terrifying. In my experience "write it yourself first" is very liberating advice that every programmer should take to heart. I would also cut the author a little slack since he is a library author and the post is a kind of ad for his own numeric library. It could have been worded better though.


> in this case nobody reading this post should come away thinking they now know more about cosine similarity than a library author

However, if this post leaves you open to the option that you might know more about $THING than a library author... that is probably a very healthy possibility to consider.

(Consider. Some are that grad student who got assigned to write the library because they were too useless to do anything else. Others are David M Gay.)


Beating library implementations is usually very easy, since you know your use case perfectly and they don't. You'd have to be far less competent than the library authors to be unable to do better with a tailor-made solution. The main reason not to do it is that it takes time and makes the code harder for new hires to get into.


Indeed! That you should be optimistic about it is the best thing to take away from the post.


The author is a library writer!


My mistake. I took this statement in isolation: "...consider that machine learning libraries are frequently written by grad students on their path to discovery. It's a domain expert with poor programming skills. Or it might be a case of a good programmer who only barely understands the domain…"



