I could not disagree more with that post. Mathematical notation is to Python what English would be to Assembly Language: it is a high-level notation, designed to let you see the forest rather than get lost in the trees.
Take Maxwell's equations for example. After learning vector calculus and becoming familiar with its notation, you notice how much meaning you can extract at a quick glance from the four Maxwell's equations expressed in differential form. The purpose is not to obfuscate the meaning (like the author implies when he mentions the Obfuscated C contest) but rather to reveal/encapsulate as much information as possible in a way that is easy to understand at a glance. You want to know what would happen if magnetic monopoles existed? Easy to do by just adding a few extra terms that are obvious by symmetry of the equations.
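For instance, in SI units (and in one common sign convention; ρ_m and J_m are just my labels here for the magnetic charge and current densities), the symmetrized set would read:

    \nabla \cdot \mathbf{E} = \rho_e / \varepsilon_0
    \nabla \cdot \mathbf{B} = \mu_0 \rho_m
    \nabla \times \mathbf{E} = -\partial \mathbf{B}/\partial t - \mu_0 \mathbf{J}_m
    \nabla \times \mathbf{B} = \mu_0 \varepsilon_0 \, \partial \mathbf{E}/\partial t + \mu_0 \mathbf{J}_e

Set ρ_m = 0 and J_m = 0 and you recover the usual four equations.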
And, if you go beyond vector calculus, you can see an even clearer picture by writing
F = dA (equivalent to the usual two homogeneous Maxwell Equations)
d*F = *J (equivalent to the remaining two)
> Mathematical notation is to Python what English would be to Assembly Language
Which is often the problem: it looks specific, but it really isn't; it's full of implicit assumptions about what the reader should know. It's not executable, because it's a language designed to be written by hand rather than executed by a computer.
However, the teaching of math would greatly benefit from an explicit executable form that makes no assumptions, i.e. a programming language. Gerry Sussman makes this case, and he's pretty convincing.
> However, the teaching of math would greatly benefit from an explicit executable form that makes no assumptions
I'm skeptical of this logic. I think nobody would be able to learn mathematics if they had to keep it all in their head at once. "Unobfuscated" maths, i.e. maths that has every single logical deduction written out from first principles so that you can (to use your words) "execute it", would be impossible to understand because, fundamentally, we are not computers.
Instead we learn some, we internalise it, and then we can appreciate the new notation that encapsulates what we have learnt before without exposing us to the underlying complexity that we simply don't need to solve higher problems.
Mathematics is beautiful. If anything, it's programming languages that have the catching up to do.
Edit: more generally, the blog post seems to be a case of "when all you have is a hammer, everything looks like a nail" -- if you spend all day programming, then it might make sense to try and relate mathematics back to something you're more comfortable with. That doesn't mean that it's the most sensible way for everyone else to learn mathematics.
> I'm skeptical of this logic. I think nobody would be able to learn mathematics if they had to keep it all in their head at once.
That's what functions are for.
> Instead we learn some, we internalise it, and then we can appreciate the new notation
Those new notations are functions, simplified abstractions of more complex processes that allow us to think in larger concepts and not sweat the details.
> Mathematics is beautiful. If anything, it's programming languages that have the catching up to do.
You think that people learning math will find it less confusing if they have no context (which, in mathematics, is largely equivalent to "assumptions") for what they are learning and have to build their mathematical knowledge from first principles? Or am I misunderstanding your post?
I don't think anything, I was recalling what I heard Sussman say at a conference, which he makes clear in http://en.wikipedia.org/wiki/SICM. Though I saw him a good 8 years after the book, so I'm sure he'd grown some new opinions.
He is of the opinion that if you can't write the algorithm, you don't understand it; forcing students to write the algorithm aids in teaching them a real understanding. He spent quite a bit of time complaining about the vagueness of mathematical notation due to its implicit assumptions for the purposes of teaching and the superiority of executable code for the task.
On a side note, the speech was aimed at both engineers and programmers, and the audience was a mix of both. He did a wow demo with a circuit on the overhead, calculating all the voltages and resistances on the fly from chosen starting values, which made all the programmers smile really big and all the engineers' jaws drop. Everyone was impressed because he flew through it, but the programmers much less so, because what he did was just how a programmer thinks, albeit very fast. He was demonstrating to the engineers the benefits of thinking like a programmer. As he is both and teaches both, he clearly connected some dots that they generally don't.
> He is of the opinion that if you can't write the algorithm, you don't understand it; forcing students to write the algorithm aids in teaching them a real understanding.
Absolutely.
I had memorized the notation for derivatives. Until I realized how the notation tied to finite numerical approximations, I never really understood it.
If you don't know what I'm talking about, the right way to approximate (d^2/dx^2)(f(x)) is to write an operator d defined as d(f)(x) = f(x + dx) - f(x). And now your numerical approximation for the second derivative is (d(d(f))/(dx * dx))(x).
Now replace your d operator with the much better d(f)(x) = f(x + dx/2) - f(x - dx/2), and see how good your numerical approximations become!
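Here's roughly what I mean in Python (just a quick sketch; sin at x = 1 is my own arbitrary test case):

    import math

    def forward_difference(f, dx):
        # The operator d where d(f)(x) = f(x + dx) - f(x)
        return lambda x: f(x + dx) - f(x)

    def central_difference(f, dx):
        # The operator d where d(f)(x) = f(x + dx/2) - f(x - dx/2)
        return lambda x: f(x + dx / 2) - f(x - dx / 2)

    def second_derivative(f, x, dx, d):
        # (d(d(f)) / (dx * dx))(x)
        return d(d(f, dx), dx)(x) / (dx * dx)

    # Exact answer is -sin(1) ≈ -0.841471
    print(second_derivative(math.sin, 1.0, 1e-3, forward_difference))  # ≈ -0.84201, error ~ dx
    print(second_derivative(math.sin, 1.0, 1e-3, central_difference))  # ≈ -0.841471, error ~ dx^2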
The notation actually means something. When we switched from infinitesimals to limits, we lost sight of that. When you try to write the algorithm, you're going to be reminded of its meaning, fast.
That is fine for you; not everyone needs to know about the Wiener path integral formulation of quantum mechanics ... which is based on functions that are continuous everywhere but nowhere differentiable (the set of functions that are differentiable, and with which you are familiar, is a set of measure zero, i.e. it is comparable to using only integers instead of real numbers). The limit process, which you discard as not meaning anything, is essential to understanding the Wiener path integral formalism. Fortunately, mathematicians and physicists did not "lose sight of that" ... and are able to go beyond what your limited understanding would limit you to. I don't mean this in a derogatory way - you obviously don't need to understand at a deeper level than you do; few people do ... but to those that need this deeper level, the mathematical formalism that you dismiss is, in fact, essential.
Edit: my apology for making an unwarranted assumption about what your level of understanding is.
> I don't mean this in a derogatory way - you obviously don't need to understand at a deeper level than you do; few people do ...
Do you like to assume much?
I have a master's in mathematics. I likely understand a lot more than you give me credit for.
I do not "discard" the limit process. I'm saying that the d/dx notation, which we get from the old infinitesimal approach, has insights buried in it that most students miss. I certainly missed it for a long time.
Limits won for good reason - they were the first consistent approach to defining the derivative that mathematicians understood, and were a key step on the way to resolving the inconsistencies in previous definitions that Fourier series had revealed.
But limits are not actually "essential". I am personally aware of multiple completely rigorous formalizations of Calculus, the second most famous of which is Robinson's non-standard analysis, which has absolutely nothing looking like a limit in sight.
Programming languages aren't explicit executable forms, they are abstractions. Unless you are working with machine code or assembly, you're getting the abstract ideas, which get expanded by the compiler. What it sounds like you're advocating is to write out 3+3+3+3+3 instead of 3*5 (the latter assumes you know what multiplication is).
They're executable abstractions, thus they are explicit. And no, that's not what I'm advocating at all. There is no assumption in 3 * 5; * is a defined function, and if you need to know how it works you can look. What's important is that it's computable. Notation often isn't; too many implicit assumptions rely on a trained eye to execute correctly.
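To make the point concrete, a toy definition (obviously not how any real runtime implements it) is enough:

    def times(a, b):
        # '*' bottoms out in a definition you can go read; repeated addition is one such definition
        total = 0
        for _ in range(b):
            total += a
        return total

    print(times(3, 5))  # 15, same as 3 * 5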
I couldn't possibly disagree more with your post. :)
The language of math requires exact precision which makes extracting what the series of abstract symbols "mean" incredibly difficult. Trying to learn new mathematical concepts from wikipedia is damn near impossible. Here are two examples.
Trying to figure out what those walls of symbols mean requires quite a lot of focus. Dig up source code for either and it's orders of magnitude easier to understand. The images for spherical harmonics convey more meaning and understanding in a fraction of a second than spending 3 hours with a piece of paper and the equations would.
Wikipedia is not designed to be a pedagogical repository; arguing that the mathematical language sucks because a terse summary in wikipedia is not enough to understand the material is missing the point.
Would you be able to calculate the probability of finding an electron within a certain distance from your pretty pictures? I would be able to do so fairly easily from those equations. What about computing special functions in higher-dimensional systems? The generalization from equations is straightforward; our minds cannot comprehend higher dimensions in pictures.
As you wrote, yes, it requires quite a lot of focus ... but that's the price one has to pay in order to be able to build models that describe the world and allow us to make predictions. That is not to say that adding pictures or explanatory sentences is unnecessary; but you can't replace the equations by only a combination of words and pictures and still make predictions as good (such as, for example, in quantum electrodynamics, where predictions and experimental results agree to better than one part in 10^10 ... equivalent to something like 1 mm precision in measuring the distance from coast to coast of the US).
Scientific theories are those that are testable. You can't pretend to understand a scientific theory if you can't use your knowledge of it to make quantitative predictions. You need math for that.
> Wikipedia is not designed to be a pedagogical repository; arguing that the mathematical language sucks because a terse summary in wikipedia is not enough to understand the material is missing the point.
Nicely worded. I think the real issue is that many people want mathematics to be easy to read and understand. Unfortunately, much of mathematics is not easy to read nor is it easy to understand regardless of how it is written. Mathematics, like many other things, often requires a lot of intellectual scaffolding to be built from the bottom up. There is no shortcut to the top.
For the sake of an argument, let's assume you are right, and the language of math sucks.
Then, we have had centuries of mathematicians inventing ever changing notations and ending up with a notation that sucks.
If so, those mathematicians must be extremely stupid. Surely persons of average intelligence must be able to come up with better notations, and find proofs for stuff that is out of reach of those morons?
I do not rule out that notation can be improved. After all, it took millennia to move from Greek and Roman arithmetic systems to Arabic notation. There is no reason to assume we have reached a global optimum. However, I also find it unlikely that mathematical notations truly are stupid.
Your argument is like someone complaining about the difficulty of discriminating between "37" and "73" because it is way easier to see the difference between 37 and 73 scratches in a piece of wood.
> If so, those mathematicians must be extremely stupid.
Not necessarily. Before modern times, mathematicians worked alone and didn't need a notation that others would understand. Newton, for example -- his calculus notation wasn't very good, and over time Leibniz's notation ended up being used instead (completely apart from the issue of who invented the calculus).
> Surely persons of average intelligence must be able to come up with better notations, and find proofs for stuff that is out of reach of those morons?
This doesn't support the thesis that mathematicians are stupid. Most mathematicians only care that they understand their own notation, and give less weight to its comprehensibility to others. And some mathematicians in the distant past deliberately used obscure notation to conceal their work from others.
I would look to educators to come up with a more comprehensible system of notation, not necessarily those at the frontiers of mathematics.
Apart from the fact that you mean the "Royal Society", yes, both true. They also came into conflict about priority with respect to the calculus. And surprisingly enough, that debate rages on.
This is hardly true. Mathematical notation changes all the time; in fact, inconsistency is a significant problem, because mathematicians love to experiment with notation, definitions, etc.
Seriously, just look at a mathematical logic book from 50 years ago vs. one today. You won't even recognize half the words, let alone the symbols.
I agree, it's a fun exercise to try and come up with a "better" form of notation that works in the general case. I think that most people will find that modern mathematics notation is really good stuff.
Both of your examples are new to me as well; however, I can make sense of the topics discussed. Just to use your example, the first equation for B-splines defines it as a function from a tuple of ordered reals to a real vector of dimension d. The real difficulty with math is not so much the symbols, which give a precise but very legalistic way of defining things, as the underlying concept or intuition they represent, the deep semantics if you will. The best analogy is law. The way a lawyer or judge defines things will be different from the way a lay person would, due to the difference in precision. The descriptions will be technically correct but will make you second-guess what they really mean. I wouldn't go so far as to say it sucks, but math is definitely scientific legalese.
1. I came into this thread expecting to find a math expert of some sort defending their "programming language". Thank you for not letting me down!
2. It depends on context. The people writing and manipulating mathematical formulae LOVE the shorthand, I'm sure... but for consumers of the information, perhaps it's less than ideal.
This thread would be more informative if the post author had said _what_ he was trying to read. Most math is written to communicate with other mathematicians, so if he grabbed some random paper off of ArXiv then it's not at all surprising he found it hard to read. It also makes a difference if he's trying to actually learn about a subject, or just to apply some formulas without necessarily understanding them. Math is written differently for specialist and beginner audiences, with the very terse "CRC Handbook of X" off on some other axis.
No, that's not why math is painful to read. The reason math is painful to read is because it's either obvious, in which case it's not painful to read, or because it contains new ideas that you need to work hard on before you can grok them.
There is no short cut. You can't write math like a novel, you can't read math like a novel. There are ideas that are deep and difficult, and unless you actually spend time working on the material you simply won't "get it."
Time and again I have explained to people something that's comparatively simple, and easily within their grasp. I've tried different approaches, different formulations, different arcs through the material, and time and time again the person has followed every step in detail, and just not "got it." There is no enlightenment, there is no sense of what's happening. It's only after they've invested the time that they have come to a sense of what's happening, and then the light dawns. Occasionally they say things like "Why didn't you explain it like that in the first place?" Then I point out that I did, but at that stage, they weren't ready for it.
It's like following a SatNav. You get to your destination having followed the turn-by-turn instructions, and yet, in truth, you really have no idea where you are. You have traveled from one local pocket to another local pocket, with no understanding of the space through which you've been.
Math is painful to read because you're trying to follow the turn-by-turn directions, instead of taking control of your own understanding and working to gain knowledge and insight, scope and intuition.
You won't understand it without doing the work, and complaining that it's badly written is the smallest part of the problem.
> The reason math is painful to read is because it's either obvious, in which case it's not painful to read, or because it contains new ideas that you need to work hard on before you can grok them.
The same could be said for code that uses only single letter variables and functions.
It'd be a lot easier with readable variable names.
Apologies in advance: I don't have time to write this as clearly and effectively as I want to, but I felt I had to express an opinion here and raise some points. To misquote Pascal: this is a long letter because I lack the time to write a short one.
A difference is that when you're reading someone else's code you're usually doing it to try to fix a problem in a specific part of the code, and specifically not trying to understand the full setting, the full architecture, and the grand idea(s).
I could give you an example of a proof, explained in detail, easy to follow every step, and at the end you still wouldn't have gained any understanding of the field, the area, the ideas, or the underlying reasons why things are true. You will have been helped in your small scale specific task in getting control of one tiny aspect, but without working on it for yourself, gaining the insights for yourself, doing the work for yourself, you won't get the bigger picture.
When I work with code, especially someone else's code, I almost invariably end up replacing the long, descriptive variable names with single letter names within the scope of a page. It's easier to read a short line than a long line that's trying to describe, again, the meaning and purpose of every variable. Being able to keep definitions and meanings of variables over a page or two of scope is trivial, especially when you have an IDE that can remind you if necessary. Personally, I don't use one, but I've seen them used to good effect.
When working intensively on a piece of code, keeping track of what "v" means comes naturally; it's repeated often enough that you have effectively used spaced repetition and have learned it. Having to constantly skim over the full_description_of_the_entire_variable_encoded_in_the_name becomes a chore.
And no, I don't check back in the version with the single letter names, because others don't share my style, ability, or point of view, and they insist on having verbose code that reads like a novel. That's fine for fixing small errors in local code, but that's not what math is about.
I find it interesting that every mathematician I've discussed this with says that coding and math are fundamentally different activities. Almost every coder I know seems to think somehow they are the same, even though they haven't done any really advanced math.
Yes. In reading other people's code, I've had coding style make the difference between taking about a day and taking about half an hour (two programs to do the same thing in essentially the same way).
There are two kinds of people who write this sort of code: the first kind, who do this:
val result = directProduct(cyclicGroupOfDegree3, finiteAbelianGroupOfDegree7)
and the second kind, people like me, who do this:
// Compute the direct product of 2 cyclic groups
val z = dP( cg1, cg2 )
You can easily guess that the 2nd kind are math majors. If my math professor started writing everything out in plain English like the first example, he'd never be able to cover a theorem per class. He'd still be writing the first lemma and the hour would be up.
So he resorts to 1-character symbols, precise notations, and terse comments. The idea is: if the reader doesn't grok direct products or cyclic groups, he's fucked anyways, so why bother? And if he does grok them, why impose cognitive overload by spelling it all out in great detail? Just use 1-character symbols and move on.
Now both these styles are in direct conflict with each other, and in the Fortran/C/C++ community during the 80s-90s ( when every respectable programmer had a copy of Numerical Recipes on their desk ), you would emulate the 2nd kind.
In the 2000s & later, people got rid of their Numerical Recipes & exchanged them for "14 days to EJB" and "Learn Python in 21 days" and "Master Ruby in 7 days" and the like...the community started becoming a lot less math-y and a lot more verbose, and style 1 is back in vogue. Nowadays I get pulled up constantly in code review for using single character names....but I think this too will pass :)
cyclicGroupOfDegree3 is a terrible name, unless you need to be really specific about degrees. cyclicGroup, please. (and I could imagine: cyclicGroup.degree() == 3)
dP is also a terrible name. My first thought goes to derivatives. directProduct is the right name. dProduct, dProd (if you like the Numerical Recipes style, expounded below) are better than dP, but still wrong for a library.
So first, let's assume directProduct is a library somewhere; maybe one you've even created. So let's reconstruct:
val z = directProduct(cg1, cg2)
Better, and more believable. And if the declarations of cg1 and cg2 are obvious (ie, the lines directly preceding) then you might have a case. I imagine directProduct(group1, group2) would actually be a happy medium. And if you use z in the next line or two, I'd let that slide.
The thing about Numerical Recipes is that often you're taking math syntax and coding it. Often doing so requires a good deal of commenting and temporary variables. One thing the book gets very wrong is its function declarations (the function body is a separate argument) -- at the very least, rooted in a past of 80-character lines and before code reviews. The first example in the book:
void flmoon(int n, int nph, long* jd, float* frac)
ought, in a modern era, be something like:
void MoonPhaseToDay(int n_cycles, int phase, long* julian_day, float* fractional_day);
If for no other reason than I can have some hope at understanding the second argument, or finding it again. You'll also notice flmoon is a misnomer -- it computes more than full moons!
> The idea is: if the reader doesn't grok direct products or cyclic groups, he's fucked anyways, so why bother? And if he does grok them, why impose cognitive overload by spelling it all out in great detail? Just use 1-character symbols and move on.
This case analysis works if you only consider people who've been sitting in on the class the whole time, but someone who appears mid-way through the semester (or analogously, starts looking at some project's code without having been involved in its development) may understand group theory and still have trouble following the lecture.
> // Compute the direct product of 2 cyclic groups
I would much rather give the function a name that describes its purpose (and give a full description in a comment at the function's definition) than annotate every use of the function.
Notation is what it is because it serves our ( ie. mathematicians') interest so well. If every one of us started writing
Integral Of Quadratic Polynomial = Cubic Polynomial plus constant
maybe we would increase the size of the audience a tad bit....but the downside would be, we'd move at a glacial pace & never make progress.
Things like direct product and cyclic groups are basic....almost trivial even.
If a non-programmer asks you "Why do you guys say
int x = 10;
float y = 0.2
Why not
x is a whole number whose value is 10
y is a fraction whose value is one fifth.
you can sit down & reason with him for a while....but if he insists that everything be spelled out in such great verbose detail, you will at some point pick up your Starbucks coffee and say "Dude, this programming thing, it's not for you. The average program is like 10s of 1000s of LOC and if I start writing everything out in plain English, I'm going to get writer's cramp & file for disability insurance."
Trust me, math gets immensely complicated very, very fast. The only way to have even a fighting chance of keeping up is terse notation ( and frequent breaks ).
One reason for this schism is the lack of rigor.
eg. When a programmer says "function", he is an order of magnitude less rigorous than what a mathematician means by that word. You ask a programmer what probability is, and he will say "you know, whether something will happen or not, how likely it is to happen, so if it doesn't happen we say 0, if it is sure to happen we say 1, otherwise it's some number between 0 & 1. Then you have Bayes rule, random variable, distribution, blah blah...I can google it :))"
You ask a mathematician what probability is...even the most basic definition would be something like "a function from the sample space to the closed interval [0,1]". Note how incredibly precise that is. By the word "function", the mathematician has told you that if you take the cross product of the domain, i.e. a set of unique outcomes of your experiment, with the range, which is the closed interval [0,1], you'll get a shit-ton of tuples, and if you then filter out those tuples so that every outcome from the domain has exactly one image in the range, then that is what we call "probability". And this is just the beginning...the more advanced the mathematician is, the more precise he'll get. I've seen hardcore hackers who've designed major systems that use numerical libraries walk out of a measure theory class on day 1, simply because they overestimate how much they know. Calling APIs is very, very different from doing math. The professor is like the compiler - he isn't going to care whether or not you know what a measure is or what a topological space is...it's a given that you've done the work that's laid out in the pre-reqs, and if you haven't, go write an API or something, don't bother the mathematician....at least that's the general attitude in most American universities I've seen. If you tell him "describe its purpose and give a full description" he will look at you as if you are from Mars, and then tell you to enroll in the undergraduate section of Real Analysis 101 :)
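For a toy sample space, the "function from outcomes to [0,1]" picture is easy to write down (a sketch only; I'm building the function directly rather than filtering the cross product, and a fair die is just my example):

    from fractions import Fraction

    # Sample space: the outcomes of rolling a fair six-sided die
    sample_space = {1, 2, 3, 4, 5, 6}

    # Probability as a function: each outcome gets exactly one image in [0, 1]
    probability = {outcome: Fraction(1, 6) for outcome in sample_space}

    assert sum(probability.values()) == 1
    print(probability[3])  # 1/6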
> If a non-programmer asks you "Why do you guys say int x = 10; float y = 0.2? Why not: x is a whole number whose value is 10; y is a fraction whose value is one fifth?"
Knowing what language we're working in is enough to know that's exactly what that code says. This is not the case in mathematical notation. C × A could be the direct product of two groups, or maybe the tensor product of two graphs, or perhaps the product of two lattices, or it could be one of who knows how many other things. The issue here is not high precision, but the opposite: heavy reliance on convention and context in order to be unambiguous.
"C x A" is a combination of C and A. C, x, and A are defined once somewhere at the beginning of the paper/book.
Should I take this (which doesn't actually hold in the general case) to mean that every document uses its own language? That isn't exactly good for readability. We certainly don't consider every software project to be written in its own language (and no, the semantics of C does not tell us what the dP function does).
I would argue that the self-documenting code in your first example (albeit excessively verbose to suit your argument) is better than having to put a comment above every terse statement one writes.
Programming with descriptive and meaningful variable / function names that have intrinsic meaning skips this step: 'wait, what was the cg2 again? let me waste time by looking back through the code and figuring that out again'. Particularly for others reading your code later.
But I try to fit somewhere in between the two examples you've shown, meaningful but not too long -> the benefits of longer variable / function names diminish when names get too long, as variable names tend to blend together (see the law of diminishing returns).
To me, I highly prefer the first expression you've written (well, without the exaggerated style you've given it). The second may save you some time in the short run, but when you come back to your code in a year, you think you'll know what dP means? Differential of P? Distance to P? Distance from P?
I realize you've got a comment above it, but if you really code like that then your code is very different from any I've come across where the programmers used short, non-descriptive variable names.
As my coding has improved over the years, I've gone from an imperative, abbreviated variable style to a functional, long-named variable style. Even with the long names, my code is MUCH more readable and it's still about 2x less typing than the imperative style!
(Also, I don't think all math majors use that second style).
Math notation should be terse because you have to do symbolic manipulation on the expressions. It's easy to argue for longer variable names when you are just using the ready-made results. I think it's acceptable to use long names for programming.
I guess by degree you mean order? And by finiteAbelianGroupOfDegree7 you mean cyclicGroupOfDegree7. Of course, you should always use theorems to keep your notation consistent. :)
The problem with math isn't symbology or notation per se; it's that it hasn't evolved to take advantage of modern technology. It's an artifact of pencil-and-paper being the medium of choice for expressing ideas. We can do much better.
When you are looking at an equation, you are looking at the purest distillation of an idea. Underneath that equation sits countless layers of abstraction and reasoning. Why can't I peel back these layers and see them on my computer or iPad? The equation is the iceberg tip peeking out of the water onto the paper, I want to see underneath. Let me feel around and slowly and confidently fill in the places where my understanding is fuzzy. Prevent me from moving back up to the final equation until everything underneath is fleshed out and solid in my mind. And if bits start to fade, let me quickly react to snap them back into focus.
This is what we are doing mentally anyway when we flip back and forth in a math text referencing previous proofs and equations. Math needs its "hyperlink."
Anything I say here has been said better and more convincingly by Bret Victor:
Though I will argue that, at least in his Kill Math writings, he's throwing the baby out with the bath water in a certain sense. Let's keep the symbology as an optimal way to encapsulate our knowledge for the amazing mental leverage it gives us, but give us a way to move up and down the ladder of abstraction in order to facilitate understanding.
Math is such a big field that there tends to be much re-use of symbols, and the limited number of Greek letters doesn't help (no one likes multiple-Greek-letter variable names).
To a mathematician reading the technical literature from his own field, there's no issue with math notation. Problems sometimes come up when a specialist in one field tries to read literature from a different specialty -- but that's true in computer programming too.
This may come as a surprise, but math notation is much more uniform than computer-science notation, including all the efforts to create generic program listings. Mathematicians are reluctant to invent new symbols, but it's sometimes necessary.
When I read computer program source code, I often wish the programmer had summarized the meaning of the code using math notation. I guess that puts me in a position precisely opposite that of the author.
> To a mathematician reading the technical literature from his own field, there's no issue with math notation.
Let me guess. You haven't learned much differential geometry?
Constants vary by 2 pi depending on which definition you use. Functions might come before or after their arguments. And, of course, you leave everything that is "unambiguous" out of the notation. (Proving that it truly is unambiguous is sometimes nontrivial for the novice.) Guessing what notation a particular author is using has been known to be daunting, even for other differential geometers.
In grad school I remember sitting down with homework sets and spending longer working out what the question said than I did in solving it once understood.
That said, if you tried to actually write everything out explicitly, it would be impossible to read anything. Compact notation is a necessary evil - without it your brain simply can't think certain things. With it, it is hard for anyone else to figure out what you just said.
There's something to be said for the notion of a glossary or legend... have these been considered at all? Or a notational standards body of some kind, perhaps?
For me, it's because each symbol, letter or number is given a ton of weight in mathematical equations. You can't skip over and fill in the gaps later - like reading a book, for example - you have to understand each aspect before the whole makes sense.
The real reason math is hard to read is that it's the distillation of a thousand winding ideas and failed attempts that took hundreds of hours of thinking and hundreds of pages of scribbling, all condensed into the most concise possible three pages of unmotivated line-by-line proof with no context or explanation.
In other words, math is a language for formalizing intuition which, when finally written, unfortunately removes all traces of intuition and leaves only the end result of the thinking, not the path that it took to arrive at the critical insight.
That about sums it up. It's made worse by professors who have been trained to teach things in an unmotivated fashion.
I had a graduate probability teacher who said (paraphrased) "one of the beautiful things about math is that you start with a definition, and you prove some very simple things with the definition... so simple that you feel you're just using circular reasoning. Then you find that your definition is equivalent to a useful property!"
Actually, what happened when the stuff was developed is that someone wanted the useful property, and managed to rationalize a definition which fit. (example: group theory was done for hundreds of years before we had a definition of group. Noether noticed that a lot of folks were writing the same sorts of things in different contexts, so she distilled it down to three axioms.)
For a really concrete example, consider ordered pairs. A person being coy about their intent would say that "we define an ordered pair (x,y) on a set S as a set of the form {{x},{x,y}}, where x and y are in S."
It's safe to say that no one thinks of an ordered pair in that fashion. The intent was to have two ordered pairs be equal if and only if each of the two coordinates are equal. You can prove this from the above definition, but it's far more helpful to tell the audience in advance that we really want a structure with this property.
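You can even check that property mechanically; here's a throwaway sketch with Python frozensets (my own toy encoding, nothing canonical about the test values):

    def kuratowski_pair(x, y):
        # (x, y) encoded as {{x}, {x, y}}
        return frozenset({frozenset({x}), frozenset({x, y})})

    # The property we actually care about: pairs are equal iff both coordinates are
    assert kuratowski_pair(1, 2) == kuratowski_pair(1, 2)
    assert kuratowski_pair(1, 2) != kuratowski_pair(2, 1)
    # Degenerate case: (x, x) collapses to {{x}}
    assert kuratowski_pair(1, 1) == frozenset({frozenset({1})})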
Which is a beautiful thing, no doubt. The creative process is amazing, but reading through years worth of scribbles isn't the point of the proof. Write a proof, then accompany it with a blog of sketchbook scans and a detailed walkthrough of getting there.
Math researcher here. I agree with the basic idea of this article, but not the specifics.
So: notation is a wonderful thing. If someone can improve on it, then great, but modern math notation is what makes math doable. It's true that there is poor notation and poorly used notation; these things need to be fixed.
Regardless, I agree that math is hard to read. I read a lot of math. It's hard. Part of this is because it involves abstract, complicated stuff that often must be understood exactly. But part is because we are not very good at writing it. Or perhaps we don't put as much effort into making it readable as we should.
Being someone who knows a fair amount about programming (or so I like to think) and also writes math research papers, I've found that some of the concepts used to manage complexity in software are also applicable to math.
In particular, I must agree about scope. I've made an effort in my recent writing to be careful about this issue. Another helpful idea involves how to write good comments in code. Too many mathematical proofs perform some operation without telling the reader why or giving any kind of overview. There is nothing wrong with saying, "Here's what we're going to do," and then doing it, and then saying, "Here's what we just did."
Really good point. That always annoys me with most formulas. I have to dig back several paragraphs (with no scope/indexing tool) to figure out what the individual variables mean.
And a lot of papers do not even bother to explain some variables but just assume the reader already knows them.
One of the earlier commenters made this mistake very prominently, enthusing over Maxwell's equations.
Yes, of course the formula looks better in its shorter form when you already know what it means. But that completely misses the point that it was written down to explain it to someone who doesn't already know what it means. If you already do, you're the wrong target audience.
Longer variable names would help. Or, for an online paper, a tooltip that opens the paragraph explaining what the variable means when you hover over it would also help a lot.
Or maybe some color coding to easily find it (there was a recent link here explaining the FFT which used this trick very successfully).
But I guess most Mathematicians don't bother because they only write for a small circle of colleagues anyways.
Maxwell's equations were not "written down to explain it to someone who doesn't already know what it means". They were, quite simply, written down to express the relationships between the motion of charge and the consequent behavior of electric and magnetic fields.
You have to understand the physics behind the equations (what's charge? what's an electric field?) as well as the mathematical structure (what's a vector space? what's a line integral?) in order to make use of them. They are equations used to answer questions like "What happens to the strength of this magnetic field if I increase the velocity of the charged particle responsible for generating it?" They aren't meant to be explanatory (but have the beautiful side effect of explaining how light propagates in a vacuum).
I agree with you. A book that aims to TEACH should provide the most common form of the equation and provide detailed annotations of what everything is. (Wikipedia should be like this as well since I know no one is using it as a reference).
For a REFERENCE, you can just list the equations in their most commonly used form.
It seems to me like the author approaches math looking for a concise & elegant programming language, but instead just finds...math.
The notion of scope as an integral part of mathematics is a particularly interesting suggestion. Mathematics is the study of abstract, quantitative truths (and falsehoods), and these truths, being universal, have universal scope (although you can constrain some truths as special cases of more general truths, but I doubt this is what the author had in mind). To consider this a problem indicates that the author is searching for a tool rather than a way of thinking.
"Math is such a cognitive overload of operators" because it succinctly describes a great deal of action. A program in a conventional language like C might take a hundred lines to describe what could be described in a handful, using some mathematical notation. If you want a programming example, just look at the average length of an APL program vs the average length of a Javascript or C program.
Math is inherently hard. So is programming. Not many people can do either very well. Even fewer can do both very well. The notations of math are meant to be flexible because math is a tool for discovering new ideas and analyzing their properties. Programs are meant to accomplish some real world task. Criticizing the language of one solely from the perspective of the other is not a useful activity.
> The notion of scope as an integral part of mathematics is a particularly interesting suggestion. Mathematics is the study of abstract, quantitative truths (and falsehoods), and these truths, being universal, have universal scope (although you can constrain some truths as special cases of more general truths, but I doubt this is what the author had in mind). To consider this a problem indicates that the author is searching for a tool rather than a way of thinking.
The scope issue is not just for theorems. In programming, scope is about what meanings get bound to what symbols, whether those symbols are types, operators, data, etc. Even in math, we can see a notion of "binding" vs. "bound" vs. "free" occurrences of a symbol: seeing something like "∀x∈N.x+k≥0" would make me think "Ok, I see what `x' is, but what's this `k'?" Treating every definition like it's at top-level makes for unreadable code and unreadable math. Ever had to search back through a document looking for the definition of some notation only to find it buried in an earlier proof of an unrelated theorem without so much as a "Definition: …" marker? I would much prefer not to have to do that again.
Transferred to another domain, "Why should we need scope? What good would that do?" becomes "Why should pronouns have antecedents? What good would that do?" It is important to acknowledge that mathematical notation is also a tool for explaining the ideas one has discovered with it.
Completely agree with you, but from your experience, how many fewer do both very well? I have always been interested in both and I can't imagine having one tool without the other, as they both enhance the way that I think about things. Programming makes the concepts of variables and sigma notation even easier to understand, while math makes it far easier to solve problems. Do people actually get by in programming without math?
> But are mathematicians too embroiled in some misguided quest for Huffman coding?
I think that yes, they are: for centuries, mathematics has been written by hand. It's much easier to write 'f' than 'force' on a blackboard. Moreover, when writing on paper, with ink that you have to make or buy, with a pen that wears out as you write with it, it makes sense that early mathematicians would embrace a compressed notation. Scribes even did it, so it shouldn't surprise us that mathematicians would.
The compression is based on domain knowledge, and this is only problematic for people who are not intimately familiar with the domain at hand. Physics is similar. Decoding the rocket equation is only possible when you know what m1 and m0 are, or ve, and many properties, constants, and even operations are symbolized with a single greek letter. (Gradients, for example)
This is very powerful, because (as others have mentioned) it allows you to easily write more-complex things, and often times you (as a mathematician or physicist) prefer to think at a higher level of abstraction. It also makes it tremendously arcane for those of us who don't even know the names of the greek letters, let alone have a sound backing in that particular field of math or physics.
The compressed notation is not just about saving paper and ink.
It's all about the amount of mental power and work you have to exert. If I'm filling a few pages with equations and I have to write 'force' instead of 'f' all over the place, it's going to take me about 5 times longer, and I'm going to fit only 1/5 of the content on each line and page (OK, 1/5 might be excessive, but the principle holds, I feel).
Programming languages have similar properties. Why don't we always write integer instead of int, loop_variable instead of i, and have no compact way of writing unnamed lambda functions?
So this article is just the author complaining that he has no practice in reading mathematics. Would you be surprised if someone who had never written a computer program before couldn't understand a complicated dynamic programming algorithm? Of course not, they have no practice or background!
And for the record, computer programming is much more precise than mathematics. Most people here don't do mathematics, so they don't realize how many implicit identifications we make between objects which are very, very different. In programs, correct types are paramount to the success of a program. In mathematics, we will readily identify rings with topological spaces and never think twice. Moreover, we need to be able to readily change, on the fly, the rules by which we consider two different things to be identical. It is simply a different way of thinking that programmers aren't used to.
Meyer's matrix analysis has an excellent, and relevant, quote:
The great French mathematician Pierre-Simon Laplace (1749–1827) said that, "Such is the advantage of a well-constructed language that its simplified notation often becomes the source of profound theories."
If we ignore the cosmological constant contribution, for each of the 10 non-zero components of the Ricci curvature tensor, the sum of the component of the Ricci curvature tensor minus half of the corresponding component of the metric tensor multiplied by the Ricci scalar is equal to four times the ratio of the circumference of an arbitrary circle divided by its radius, multiplied by Newton's gravitational constant divided by the fourth power of the speed of light in vacuum and multiplied by the corresponding component of the stress-energy tensor.
Would this be clearer than the original [http://en.wikipedia.org/wiki/Einstein_field_equations]? After all, I don't use any weird symbols and avoid Greek letters... Surely the author of that blog post would be satisfied by this... Perhaps I should include what I mean by the Ricci curvature tensor... but first define what I mean by a tensor, including the metric tensor, and explain other words like stress-energy tensor; it should not take more than a few lines of text for it to be completely clear, right?
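For comparison, the compact form I was paraphrasing (with the cosmological constant dropped) is just

    R_{\mu\nu} - \tfrac{1}{2} R \, g_{\mu\nu} = \frac{8 \pi G}{c^4} T_{\mu\nu}

and every clause of my paragraph above maps onto one symbol of it.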
Richard Feynman in one of his quantum physics lectures explains that he is going to abbreviate a few things because they are a pain to write out longhand on the chalkboard.
I think this is the main reason everything is so abbreviated: it comes from an era when doing lots of manual calculations by hand called for efficiency in notation over clarity.
This is an old chestnut. Mathematical notation is terse because mathematics doesn't consist entirely of notation! Look at any book or research paper; you'll find a great deal of text surrounding equations. Comparing this to code is nonsensical. Unless you're reading something in WEB, you are much more likely to encounter code than comments. There are certainly some unfortunate conventions and cruft in mathematics, but every generation of mathematicians rewrites in part what it inherited from its forebears.
The main reason it's hard to read math papers is because they contain complex, difficult ideas. Even professional mathematicians take a long time to read papers, and they usually won't be able to understand papers outside of their own specific field of math. I agree with the author that math papers could borrow some ideas from programming - explicitly defining all variables/functions, scoping, etc. - but acquainting yourself with the conventions of the field alleviates most of those problems.
No doubt about it, mathematical notation can be difficult to read. It's terse, conceptually dense, operators are commonly overloaded, etc., etc.
But that's not why it's hard to read. Math is hard to read, I think, because math is hard:
(1) Some core mathematical concepts are fundamentally difficult. Notions like infinity (and different kinds of infinity!) are entirely foreign to our daily experience.
(2) In addition, mathematics is more abstract than most programming. Tim Gowers, in his lovely little book Mathematics: A Very Short Introduction, writes:
What is the black king in chess? This is a strange question, and the most satisfactory way to deal with it seems to be to sidestep it slightly. What more can one do than point to a chessboard and explain the rules of the game, perhaps paying particular attention to the black king as one does so? What matters about the black king is not its existence, or its intrinsic nature, but the role that it plays in the game.

The abstract method in mathematics, as it is sometimes called, is what results when one takes a similar attitude to mathematical objects.
This sort of abstraction is, I think, abhorred by many programmers. But it's the reason single-letter variable names are common in mathematics. There's no way to describe what a mathematical object is with a name relating it to some real-world, concrete, familiar thing, as programmers often try to do. You can only describe what it does by writing down some rules it obeys.*
(3) In math, concepts are inherently more hierarchical than is common in programming. As programmers, we're all familiar with building higher-level things from lower-level pieces. And we try, and often succeed, in building them up in ways that let other people use them without needing to know much of anything about the lower-level pieces. In math, there's a similar accretive process, but because the concepts are so much more abstract, it's often impossible to understand how a high-level concept behaves without understanding how its lower-level constituents behave. Mathematical concepts are conceptually hierarchical, whereas pieces of programs are often structurally hierarchical, but conceptually (nearly) orthogonal.
* — (A controversial aside.) An object-oriented mindset can exacerbate the problem by demanding that we reify concepts into objects that represent what they "are". This is often impossible, hence the inscrutable object names we've all laughed and despaired at.
Phil Karlton's well known witticism comes to mind:
There are only two hard things in computer science: cache invalidation and naming things.
Naming things is hard because, just like in math, we sometimes lack a familiar semantic reference for the concepts we deal with as programmers.
Conversely, in functional languages, there's less need to think about what something "is", because the natural way of programming is to define relations (functions) on a small number of core data structures. Interestingly, single-letter variable names are also common in some functional languages, like Haskell, because it generally doesn't matter what the data is so long as you know what you can do with it. And that information is embodied in its type, expressed in its type signature, and enforced by the type-checker at compile time.
To your point on naming, the core of Hungarian Notation's thesis is that you can never get a name right, because unfounded assumptions from past experiences with that name creep in, so you're doing more damage by trying. So, create a unique 3- or 4-letter nonsense tag (and define it), and get on with your life. Once you do this, it's crystal clear where the gaps in your understanding are, since if it looks like gibberish, there's a gap. And when you understand it, believe it or not, it doesn't look like gibberish. (Hungarian as originally defined, i.e. Apps Hungarian, not the horrible Systems Hungarian that came out of Programming Windows.)
It's a powerful yet under-appreciated idea. Of course the proof of the pudding is in the eating, and so it does seem Hungarian takes it too far, given its lack of success stories. But I can't help but think that if that thread of an idea had not been squished so resolutely, we'd be in a better place than we are now with the problem of naming.
> the natural way of programming is to define relations (functions) on a small number of core data structures. ...
> ... And that information is embodied in its type
If you encode information in type, you will have a very large number of data structures.
Hungarian notation for formulas would be terrible. It would make it that much harder to memorize, as well as increase the difficulty of seeing patterns between various formulas.
The real problem here is trying to force programming notation into mathematics notation. Programming variables emphasize what you're working with while mathematics variables emphasize ideas and relationships.
Math notation wasn't so much designed as accreted, and it shows in how awkward it can get. Dijkstra indirectly makes this point (among others) in "The notational conventions I adopted, and why" at http://www.cs.utexas.edu/~EWD/ewd13xx/EWD1300.PDF
It's hard to understand any programming language if you haven't taken the time to learn its syntax.
Unlike teaching programming, the problem I see in teaching math is that the syntax is slipped into the lessons without a conscious effort to teach it.
If someone were awesome (maybe somebody like John Resig), they could come up with a math notation framework that started terse but could expand to more detailed definitions with a click. E.g., it starts with Σ (capital sigma) but expands to 'Sum(1..n)' and then expands again to 1 + 2 + 3 + ... + n. Clicking the + changes it to 'add', which maybe links off to Wikipedia or Khan Academy. Clicking the numbers might change them to a pictograph of a group of widgets, etc.
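A crude sketch of the zoom levels (purely hypothetical; sum_levels and the strings are made up for illustration):

    def sum_levels(n):
        # Three views of the same summation, from terse to fully spelled out
        return [
            "Σ_{i=1..%d} i" % n,                          # terse notation
            "Sum(1..%d)" % n,                             # expanded once
            " + ".join(str(i) for i in range(1, n + 1)),  # fully expanded
        ]

    for view in sum_levels(5):
        print(view)  # in a real UI, a click would move you down one level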
A math IDE. I think Wolfram Alpha may do just that; then again, dealing with formal maths is not my thing.
An IDE makes sense if it can encapsulate the symbols and terms you're conveying. How many programmers sit down with notepad or vi and write huge programs? Yes, some do (and I have), but it's far easier with the IDE.
I never posted before but this article made me so mad I had to jump into the comments!
I am so glad to find everyone pointing out all the reasons the article is wrong, actually there is a highly relevant link on math.stackexchange about this particular issue:
I agree. I always felt that if math theorems were presented in a programming language, they would be way more understandable (especially if good practices are respected: meaningful variable names, etc.).
Mathematicians love to talk about rigor, but when it comes down to writing their ideas, they often have little enough of it.
> I always felt that if math theorems were presented in a programming language, they would be way more understandable …
Find out empirically! Look at theorems in some formal methods/programming languages papers and their corresponding mechanizations and see which you find more understandable. (Actually, it might be better to start with something like Software Foundations to get a gentler introduction to a programming language used for this sort of thing -- http://www.cis.upenn.edu/~bcpierce/sf/ )
Hard to do, the one you will look at second will always be at an advantage. So you'd have to select two theorems of "equivalent difficulty", which is not easy. Also you need to select hard to understand theorems, else you'll understand both the representations easily anyway.
People seem to be pretty critical of this post, but can anyone give me a good reason why math notation shouldn't have some concept of scope? I can't see any downsides. It would certainly help me puzzle out my professor's notes when he has used f to mean three different things in as many lines.
There is some concept of scope in math. At a fairly early level, a student learns sums and integrals. The variable of integration/summation's scope is just within the integral/sum.
At a higher level, some things have scope equal to a subfield of mathematics. A gothic p, for example, represents a prime ideal in algebraic number theory. It represents a parabolic Lie algebra in Lie theory. (Actually this causes problems when you're doing Lie theory and number theory at the same time.)
I think for many, the problem is pacing. When you're reading text, your eyes are moving down a line at least once a second. But reading an equation takes much longer. You have to stop your eyes and ponder every piece of it. It can help to copy it down on paper.
Disagree strongly. Math is often the easiest and simplest thing that works for the class of problems it exists to solve.
Imagine trying to explain singular value decomposition without the notion of a matrix. No group theory, no well-formed concept of a linear transformation or function. You wouldn't be able to do it. No one even had such ideas before generations of mathematical machinery had been built.
I've come to the conclusion that there are lifer languages (/frameworks/technologies) and start-today languages. C, Lisp, Scala, and Haskell are lifer languages. (Now q, for those who've worked in finance, is even more of a hardcore lifer language.) You'll probably find them painful when you start out, but enjoy them a lot more once you really get them. Python, on the other hand, is much more of a start-today language. It's more modular, you can get up to speed quickly, and it's good enough the vast majority of the time. Java, with an IDE, is also a start-today language, and so is PHP. This also applies to tools. Emacs and vi (both being keyboard-driven) are lifer tools. Unix is a lifer tool. IDEs are start-today tools.
This isn't about "better" or "worse". It's about tradeoffs: with a lifer language, it takes at least a few months before you get it (you start off hating it, because it feels weird and limited) but it becomes immensely powerful once you understand it in a deep way. Start-today languages give you a lot of power to start, but then the learning curve flattens. What's better is unclear. The lifer languages have a lot of benefits and depth, but in fast-moving disciplines like web programming I'd prefer that the frameworks be start-today rather than lifer-oriented.
I think you make a really good point about the power of mathematical notation and conventions.
However, I completely agree with the article.
Do mathematical formulas have to use Greek letters rather than useful variable names like "distance" or "speed"?
Does C's syntax actually allow you to express something you can't in Python? Or is it more terse for historical reasons? (And never mind that it's a good idea to use descriptive variables and function names in either language!)
I think the keyboard-driven vs GUI tool analogy is very apt. vi is very powerful. Would the average user like it much if we made vi the interface for all text forms in browsers? I think sometimes it's desirable to present a friendlier form of a thing. (You obviously agree regarding web programming.)
So I think the complaint is this: why is math always expressed in the tersest form? Is it really necessary?
Why does math have to be restricted to the lifers!?
Mathematical formulas with real-world interpretations can use those variable names. The problem occurs when the thing you're trying to express has no analogue in the real world. What variable name should I give to a reduced Noetherian subscheme of an arbitrary scheme? If you think "reducedNoetherianSubscheme" is a better variable name than X, then you'll love writing the notation for a function on it:
> Do mathematical formulas have to use Greek letters rather than useful variable names like "distance" or "speed"?
I feel that using "useful variable names" like you describe would actually make things less useful, because they overlap with common language, and writing them out in, say, a formula would make it too long (and thus less readable).
I find that for any common topic, there's a subset of "default" variable names that are used so often that when you see one, you pretty much instantly recognize what it's meant to represent anyway. Using Greek letters is no problem as you get used to them eventually, and they make things stand out more than simply using the 26-character Latin alphabet.
Also, depending on the discipline you're working in, you can make very confident assumptions about what a random mathematical object is just from the alphabet or typeface it is set in. In the context of programming language type systems, for example, a Greek letter is almost always an ML-style type variable (think Haskell's a -> b -> a) whereas Roman type is almost always a ground type (int, bool, whatever). Vectors are boldface, groups are capitals, fields are blackboard bold, lowercase letters in a group context are almost certainly group elements... In some contexts you're actually reaching for new alphabets (e.g. the Hebrew letters used for infinite cardinals) to be even more distinctive.
> Do mathematical formulas have to use Greek letters rather than useful variable names like "distance" or "speed"?
The moment you are talking about 'distance' and 'speed', you are talking physics (a.k.a. applied mathematics), not mathematics.
The same math might see applications in economics, where the variables are better called 'piggy bank target' and 'weekly savings', or in biology, where they call them 'weight needed to survive the winter' and 'excess food intake' or elsewhere in physics, where they are called 'velocity' and 'acceleration'.
Mathematics is abstraction; that is what makes it applicable in diverse fields.
In math, you have a trade-off between entropy (in the information-theoretic sense), rigor, and the size of the document (e.g. a proof). Low entropy is the one that gets sacrificed, so you end up with something extremely terse and dense.
This is one of the hardest things about reading math. It's not the notation. You can pick that up pretty quickly. It's that every character counts: math is high in entropy. We're not used to that. In fcat, if you pmterue the inenr ltertes of tyicpal wtterin psore at rnoadm, msot pleope can read it at csloe to nmroal seepd. However, in math, one has to pay attention to every jot, because y-hat (ŷ) means something different from y.
What is x? What is a? The wonderful thing about math is that it doesn't matter. x could be length, dollars, area, volume, mass, potatoes, lines of code, chickens, or electron-volts. But in the general case, it represents a number. Why would anyone consider it an improvement to write
Given that number1 * answer^2 + number2 * answer + number3 = 0, answer = (-number2 ± sqrt(number2^2 - 4 * number1 * number3)) / (2 * number1).
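For contrast, the conventional notation says exactly the same thing: given ax^2 + bx + c = 0, x = (-b ± sqrt(b^2 - 4ac)) / (2a).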
The extra characters convey no additional semantics. Even when presented with an equation where things have specific units, you can usually mentally figure out the units of everything else, due to conventions, previous definitions, and mental dimensional analysis.
In my experience, the notation isn't so much of an issue as knowing the definitions, closely followed by being able to translate those definitions into intuition. You must be able to remember them exactly, otherwise nothing makes any sense, and you must have an intuition for what they mean, otherwise you'll never get anything done. Consider some terms from a recent talk I attended that was outside my immediate areas of expertise: "solvable Lie group", "left-invariant metric", "upstairs", "double cover". I was able to follow the main idea of the talk, but a full understanding was sunk by my not knowing the definition of left-invariance [1].
Since math is inherently abstract, it is hard, and there is no substitute for the hard work necessary to acquire an intuition for it. When doing high-level math, it is necessary to have a rigorous intuition for the subject, where you are able to intuitively see a path to a proof, and then are able to translate that intuition into a suitably rigorous argument.
[1] To illustrate my point that knowing definitions precisely is one of the keys to understanding math, here's the definition of a left-invariant metric. Let G be a Lie group with metric <,>, let L_g be left multiplication by g, and let L_g^* denote the pullback along L_g. The metric is said to be left-invariant if L_g^* <u,v> = <u,v> for all tangent vectors u and v and all g in G. It makes no sense unless you know what metrics, Lie groups, left multiplication, and pullbacks are, and you'll only shoot yourself in the foot if you can't define them precisely.