
This is a lovely (edit: having been in similar shoes, also terrifying-in-the-moment) example of a broader problem in teaching mathematics: the language we use to describe mathematical reasoning is a natural language, like English or Latin, and therefore full of the sorts of bizarre irregularities you'd find in a natural language. Mathematics is also a language about rigorously and precisely defined objects. The conceptual shear between those two things is murder for lots of students.

Just like Tacitus omits his verbs (!), when we describe fractions we often omit the implicit definition of the whole. Turns out that's a problem for many students.

It's a bit like trying to learn a context-dependent programming grammar with an inconsistent API, but worse, because it's your first "mathematical" language so you're also trying to learn what the abstract objects the language manipulates are.

Some other lovely examples:

3(5) means three times five. 3(x) means three times x. 35 means three times ten plus five. 3x means three times x. x(3) means that x is the name of a function taking, in this instance, 3 as its input.

x^{-1} means \frac{1}{x}, but f^{-1}(x) doesn't mean \frac{1}{f(x)}.

\sin{30}. Radians or degrees? Probably the writer means degrees, but there's no way to tell.

There are many more.
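For contrast, here's a quick Python sketch (the function f and its inverse are my own toy examples) of how a programming language forces each of these readings to be written differently:

```python
import math

x = 7

# Juxtaposition is never multiplication in code; the operator is explicit.
assert 3 * 5 == 15
assert 3 * x == 21

# x**-1 is the reciprocal of a number...
assert abs(x ** -1 - 1 / x) < 1e-15

# ...but the inverse of a *function* is a different object entirely.
def f(t):
    return 2 * t + 1

def f_inverse(t):
    return (t - 1) / 2

assert f_inverse(f(10)) == 10        # functional inverse: undoes f
assert f_inverse(10) != 1 / f(10)    # and it is not the reciprocal

# sin takes radians, full stop; degrees must be converted explicitly.
assert abs(math.sin(math.radians(30)) - 0.5) < 1e-12
```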



Why can't we have mathematics devoid of these ambiguities? One reason is that humans have small working memories, and novice mathematics students have even smaller ones. Ambiguous notation, while confusing at first, once mastered lets us write shorter expressions, whose meaning we resolve from context, and which become easier both to write and to understand.

A second reason is that while mathematical logic is rigorous and precise, unequal mathematical objects are similar to each other - even objects that at first glance seem nothing alike. For instance, the two types of inverses you mention, or sets and linear spaces, or groups and tetrahedrons. And because of the diffuse nature of mathematical objects, it is inevitable that the same notation will be used for unequal objects: it advances our human understanding of mathematics to use the same notation for two unequal but similar objects.

This second reason, once understood, is one threshold between the mechanical mastery of the intermediate student and the almost artistic use of mathematics by the advanced student.


As you say, the artistic use of cross-object synthesis and analogy definitely distinguishes the advanced students from the mechanical. Advanced students develop a sort of dialect that harmonizes the mathematical objects they encounter in a way that illuminates them all.

More than any of that, though, I think what you can see (even here in this thread!) is that, while we pretend that mathematics is a single cultural practice of rational humans communicating with other rational humans, it's really many smaller communities of mathematicians, all of varying skills trying to communicate with each other. My mathematical language as a teacher of 12-18 year olds is very different from my mathematical language when I did computational geometry for a living.

Because you have many communities of mathematicians producing new notation, you end up with dialects that all sort of meld together in the same way that reading Shakespeare is very different from reading Hemingway or Eco (in translation).

The closest programming analogue would be C++, where you have several mutually unintelligible dialects spoken by different communities of programmers with different concerns.


I think it's a lot less beautiful. I think it has to do more with historical accidents. We've stumbled our way forward in mathematics; there is no grand plan that unifies our notation. Just look at calculus for a prime example - df/dx and f'(x) come from two different lineages and get used interchangeably; the df/dx notation can be intensely misleading when students think that (for example) they should be able to use normal fraction rules.
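One classic place where fraction-like cancellation of the d's misleads (with partial derivatives, admittedly) is the triple product rule. A rough numerical sketch, using the plane x + y + z = 0 as my own toy example:

```python
# On the plane x + y + z = 0, each variable is a function of the other two.
# Naive "cancellation" suggests (dz/dx)(dx/dy)(dy/dz) = 1, but finite
# differences show the product is -1 (the triple product rule).
h = 1e-6
z_of = lambda x, y: -x - y   # z(x, y)
x_of = lambda y, z: -y - z   # x(y, z)
y_of = lambda x, z: -x - z   # y(x, z)

dz_dx = (z_of(1 + h, 2) - z_of(1, 2)) / h   # y held fixed
dx_dy = (x_of(2 + h, 3) - x_of(2, 3)) / h   # z held fixed
dy_dz = (y_of(1, 3 + h) - y_of(1, 3)) / h   # x held fixed

assert abs(dz_dx * dx_dy * dy_dz - (-1)) < 1e-4
```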


Of course, there is no grand plan, and yes, a lot of notation is historical accidents. But my point is that attempting to "fix" it will probably yield a better notation, but not the "perfect" notation.

As an aside, df/dx treated as a fraction is used as early as the second or third semester of university, when students learn how to solve differential equations using separation of variables, and physicists/chemists start using total derivatives for thermodynamics. I am not aware of a notation different from df/dx in which these subjects would be just as clear.
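As a sketch of why the fraction manipulation is so convenient: separating variables in dy/dx = y gives y = e^x, and a crude Euler integration of the same ODE (my own toy check) agrees with that answer:

```python
import math

# dy/dx = y with y(0) = 1. "Separating variables" treats dy/dx as a
# literal fraction: dy/y = dx, so ln(y) = x + C, i.e. y = e^x here.
# Cross-check that answer against a direct Euler integration of the ODE.
def euler(f, y0, x_end, steps=100_000):
    h = x_end / steps
    y = y0
    for _ in range(steps):
        y += h * f(y)
    return y

y_numeric = euler(lambda y: y, 1.0, 1.0)
y_separated = math.exp(1.0)  # what separation of variables predicts at x = 1
assert abs(y_numeric - y_separated) < 1e-3
```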


Another is sin^2(x) meaning (sin(x))^2, but by a more intuitive reading it should mean sin(sin(x)).

I don't know what exactly is being gained by the usual notation: surely clarity should be preferred over the time/effort saved by omitting the extra pair of parentheses, and I would prefer it always be written as (sin(x))^2.

The thing is, mathematics and mathematical pedagogy seem hardly concerned with the rampant notational confusion plaguing much of maths. Perhaps moving to some type of machine-readable notation would be better for consistency and would avoid much of the notational confusion.
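In code the two readings can't be confused; a trivial Python illustration:

```python
import math

x = 0.5
square_of_sin = math.sin(x) ** 2      # the conventional reading of sin^2(x)
sin_of_sin = math.sin(math.sin(x))    # the "iterate the function" reading

assert square_of_sin != sin_of_sin    # the notation has to pick one
```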


When actually working with trigonometric identities, using parentheses typically gets unwieldy very quickly. Same reason sin(x) often becomes sin x.


Then perhaps we should drop all parens and move to prefix notation.


Does f^2(x) in any scenario mean (f(x))^2? Usually the square of a function with a named argument is written as f(x)^2. So I'm not clear on the confusion.


Yes, in trigonometry sin^2(x) is commonly written to mean (sin(x))^2.


> x^{-1} means \frac{1}{x}, but f^{-1}(x) doesn't mean \frac{1}{f(x)}.

That notation for inverse functions is truly appalling. I don't know how the first mathematician to think of that didn't immediately discard it as nonsensical and misleading.


It's not because 'function powers' make sense, they are just iterated function application. That's how they work for the natural numbers, and when you extend that to the integers, you immediately get f^-1 for the inverse.

Notation in higher-level maths is almost always ambiguous, because many concepts are analogues of each other, and to reflect that, notation is just taken from the analogue. Within a single domain (like high-school arithmetic) you will usually not have this ambiguity. But once you move past that, it is something to get used to.
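A small Python sketch of the 'function powers as iterated application' idea (the helper fpow and the invertible f are my own, just for illustration):

```python
# f^n as n-fold application; extending n to -1 gives the inverse.
def f(x):
    return 2 * x + 1

def f_inv(x):
    return (x - 1) / 2

def fpow(func, n, inverse=None):
    """Return the n-fold iterate of func; negative n iterates the inverse."""
    if n < 0:
        func, n = inverse, -n
    def iterated(x):
        for _ in range(n):
            x = func(x)
        return x
    return iterated

assert fpow(f, 3)(0) == f(f(f(0))) == 7     # f^3 is f . f . f
assert fpow(f, -1, f_inv)(f(10)) == 10      # f^-1 undoes f
assert fpow(f, 0)(42) == 42                 # f^0 is the identity
```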


The idea that "when you extend that" is something human beings can do is an absolute revelation to young mathematicians. The idea that our notations are, at least in some sense, choices that we make that come with tradeoffs is a huge point of mathematical maturity for them, and usually causes my students to look back with either deep awe or deep suspicion when they realize that even our choice of base 10 is a choice among many and we can make other choices.

As von Neumann said, very much something to get used to.


> It's not because 'function powers' make sense, they are just iterated function application.

To be clear, I think you aren't saying "It's not because 'function powers' make sense" (which seems to imply that's not the reason it's done, and possibly that the reason isn't correct), but rather "It's not [an appalling notation], because 'function powers' make sense"—to me, that extra comma changes the meaning!


Indeed!


They’re both inverses: one with respect to multiplication, the other with respect to function composition.

In abstract algebra, we observe that there are many types of “products”: multiplication, addition, function composition, composition of rotations, matrix multiplication, etc. A common, unified “power” notation for repeatedly taking the product of a single element, or of its inverse, has some value.

There is some ambiguity, since multiplication of functions can also often make sense. This is usually resolved by context, or by being explicit where the context isn’t clear.


> one with respect to multiplication, the other with respect to function composition

And, of course, multiplication is a function so one is just a specific instance of the other.

Infix notation is another blight on the mathematical notation landscape. The amount of human effort that has been put into figuring out how to parse a+b*c is staggering. All of this confusion could have been avoided if we'd just started with s-expressions in the first place.
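Python's own parser illustrates where the hidden precedence rules live (a quick sketch using the stdlib ast module):

```python
import ast

# "a + b * c" parses as a + (b * c): the precedence rule lives in the
# parser, not in the notation itself.
tree = ast.parse("a + b * c", mode="eval").body
assert isinstance(tree, ast.BinOp) and isinstance(tree.op, ast.Add)
assert isinstance(tree.right, ast.BinOp) and isinstance(tree.right.op, ast.Mult)

# An s-expression such as (+ a (* b c)) needs no such rules:
# the nesting itself is the grouping.
```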


Multiplication is a function of numbers, and function composition is a function of functions, so neither is an instance of the other (unless you take the unusual perspective of thinking of numbers themselves as functions, but I don’t think that’s what you meant).


> I don’t think that’s what you meant

Why not?

Multiplication is a composition of addition, and addition is a composition of the successor function. The successor function is not a composition of anything, but you can define it in terms of set theory if you don't want to simply accept it as a primitive. But sets are functions too.


In both of these instances, I think that you are using composition to mean iteration. Iterating is repeated self-composition, so it is composition, but it seems likely that you meant the more specific term. (For example, "the successor function is not a composition of anything" is definitely false in the literal sense of composition. It's also false in the literal sense of iteration, but there it's clear what you meant: there's no natural operation which we iterate some number of times greater than 1 to get the successor function.)


> you meant the more specific term

No, I meant what I said. The fact that iteration is a kind of composition is true, but it's a tangent from the point I was trying to make, which is that infix notation is a Really Bad Idea. No one in their right mind would use it if they were not indoctrinated into it.

> "the successor function is not a composition of anything" is definitely false

That's news to me. I genuinely thought that successor could legitimately be considered a primitive. What is successor a composition of?


> Multiplication is a composition of addition, and addition is a composition of the successor function. The successor function is not a composition of anything, but you can define it in terms of set theory if you don't want to simply accept it as a primitive. But sets are functions too.

> the point I was trying to make, which is that infix notation is a Really Bad Idea

Not to be snarky, but, reading these two (from your two thread-successive posts https://news.ycombinator.com/item?id=23312725 and https://news.ycombinator.com/item?id=23314776) in succession, I still can't see anything about the first one that indicates the point that infix notation is a bad idea. Not that I'm disagreeing with the point, just that I can't find it in the first post. Could you clarify the connection?

> What is successor a composition of?

It's a composition of, for example, itself and the identity function, like everything else; or you could view it as a composition (-1) . (+2). Those kinds of silly solutions are why I thought you meant 'iteration'.

Iterative solutions are easy if you don't restrict yourself to natural numbers—for example, (+1) is (+(1/2)) composed with itself—but that's clearly not what you meant. (As soon as you leave the natural numbers, even the idea that addition is iterated successor becomes false.)

If you do so restrict yourself, then it becomes true that the successor is not a non-trivial iterate. (I just skated the edge of claiming the opposite in my post, but avoided error by not specifying what domain I meant. That's just luck, though; I meant a particular thing, and I was wrong. To prove it, supposing you start your natural numbers at 0 and that f is a function such that f^{\circ k} is the successor function for some k > 1, then note that f is injective (because a composition power is). If f(0) = 0, then succ(0) = f(f(0)) = f(0) = 0, which is a contradiction. Put n = f(0) and note that f^{\circ n k}(0) = succ^n(0) = n, but n k > 1.)

By the way, you quoted (https://news.ycombinator.com/item?id=23314776):

> > you meant the more specific term

Just to be clear, what I said (https://news.ycombinator.com/item?id=23314530) was "I think you meant the more specific term". And I was wrong, but I intentionally didn't just assume I knew what you meant!


> I still can't see anything about the first one that indicates the point that infix notation is a bad idea.

I didn't make a very good argument for it. I really intended that to be more of a throwaway rant than a serious critique. But since you ask...

There are two problems with infix:

1. It's hard to parse. It requires precedence rules which are not apparent in the notation. In actual practice, the precedence rules vary from context to context and this causes real problems. It's an unnecessary cognitive burden that pays very little in the way of dividends (a few saved pen strokes or keystrokes).

2. It obscures the fact that infix operators are just syntactic sugar for function applications. It leads people to think that there is something fundamentally different about a+b that distinguishes it from sum(a,b) and this in turn leads to a ton of confusion.

> that's clearly not what you meant

Indeed not. I meant the successor operator as defined in the Peano axioms.

> you quoted

Yeah, sorry about that. When I first replied, I thought you were the same person who posted the grandparent comment. My first draft response turned out to be completely inappropriate when I realized you were a different person, but some of my initial mindset apparently leaked into the revised comment. My apologies.


I didn’t think you meant that numbers are functions because you said: “multiplication is a function so one [multiplication] is just a specific instance of the other [function composition]”.

Thus I thought your point was that, since multiplication is a function, it’s a form of function composition. But that wouldn’t follow, for the reasons I said.

As for multiplication being “composed” addition, addition being “composed” succession, etc.: The multiplication function (at least of integer arguments) is composed of addition. Function composition is a function that returns a composition of two other functions. The way you’re using the term blurs the distinction between the thing that is composed and the thing that does the composing.

Function composition is a specific operation that takes two functions, f and g, and returns the function f∘g defined by the behavior (f∘g)(x) = f(g(x)). The function which does function composition is not itself a composed function, and the act of multiplication is the act of a composed function, but not an act of function composition.


> The way you’re using the term blurs the distinction between the thing that is composed and the thing that does the composing.

Yes, that is intentional. Both of these things are functions. Even numbers are functions, they just happen to be functions of zero arguments, or functions that ignore their arguments, or functions whose value is constant regardless of what the argument(s) is(are). It's all the same thing. That is the whole point.

Numbers and addition and multiplication happen to be particularly important functions, but they are not structurally different from any other functions. Giving them special notation, especially when you are first introduced to them, obscures this fact. This kind of mental damage is very hard to recover from in later life. I believe it's one of the reasons so many people think they hate math. Math can be beautiful and elegant, but the standard notation used for school-book arithmetic is arbitrary and perverse, a bizarre accident of history with no actual merit.

IMHO of course.

[UPDATE:]

> Function composition is a specific operation that takes two functions, f and g, and returns the function f∘g defined by the behavior (f∘g)(x) = f(g(x)).

Function composition is a function, no different from any other function. There is no more reason to use infix notation for it than there is for any other function. In fact, if you drop the infix notation it immediately becomes obvious how ubiquitous and non-special function composition actually is:

  compose(f,g)(x) = compose(f)(g)(x) = f(g(x))
On that view, the COMPOSE function is actually the identity function!
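A quick sketch of that curried reading in Python (the names are mine, just for illustration):

```python
# compose(f) is a function waiting for g; compose(f)(g) is f . g.
def compose(f):
    return lambda g: (lambda x: f(g(x)))

double = lambda x: 2 * x
inc = lambda x: x + 1

assert compose(double)(inc)(10) == double(inc(10)) == 22
assert compose(inc)(double)(10) == inc(double(10)) == 21  # not commutative
```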


You seem to be jumping around in terms of what position you're taking. Is times(a, b) function composition because a and b are functions, or is times(a, b) function composition because it's composition of the addition function, or is times(a, b) function composition because multiplication is a function? Those are very different assertions, with very different implications to the original discussion, and so far as I can tell you’ve made all three of them. Maybe they're all true, maybe one or two of them is true, maybe it's a matter of perspective which are true. But I don't think you're being clear or consistent in which assertion your position is based on.

And no matter what, there's still a difference between composing a with b, and composing b with itself a times, which is what I mean by the distinction between composition and iteration (sort of like the distinction between a brick and a brick wall).

At this point I feel we're going around in circles, so I'll bow out.


> Those are very different assertions

But they are not mutually exclusive. That's the whole point.


> I don't know how the first mathematician to think of that didn't immediately discard it as nonsensical and misleading.

Maybe they did.


A good example of this can be found in an incredible algorithm with a terrible name: Fast inverse square root.

It sounds like an algorithm for computing the inverse of the square root function, so one might think it's for squaring non-negative numbers, or something along those lines. Not so. It's for computing the reciprocal (the 'multiplicative inverse') of the square root of a number. [0]

Related to this: the way a superscript '2' next to a function means the function shall be applied twice (that is, composition with itself)... unless it's a trigonometric function, in which case it means the square of the result. [1] [2]

[0] https://en.wikipedia.org/wiki/Fast_inverse_square_root

[1] https://en.wikipedia.org/wiki/Function_composition#Functiona...

[2] https://en.wikipedia.org/wiki/List_of_trigonometric_identiti...
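For the curious, a rough Python transcription of the bit trick (magic constant as given in the Wikipedia article; this is a sketch, not production code):

```python
import struct

def fast_inverse_sqrt(x):
    """Approximate 1/sqrt(x) (the reciprocal, not the functional inverse)."""
    # Reinterpret the float's bits as an integer, apply the magic constant,
    # reinterpret back, then do one Newton-Raphson refinement step.
    i = struct.unpack('<I', struct.pack('<f', x))[0]
    i = 0x5f3759df - (i >> 1)
    y = struct.unpack('<I', struct.pack('<I', i))[0]
    y = struct.unpack('<f', struct.pack('<I', i))[0]
    return y * (1.5 - 0.5 * x * y * y)

assert abs(fast_inverse_sqrt(4.0) - 0.5) < 1e-2
```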


That one is clear to me, although it might be for a reason you find displeasing: if the thing being computed was an "inverse square root" in the sense of a functional inverse, then you'd be computing a square and it would make much more sense to call it that instead.


There is logic to be had. Think of "x" as being the process/operation of "multiply by x". Now you want to invert that, so you multiply instead by x^{-1}.

Raising to the power of "-1" means that we are inverting the operation in question, so inverting the application of function "f" to argument "x" will be to apply f^{-1} to x instead.

Now I'm not saying it's good notation, but in some of the fields I've worked in, the notation is not only OK, it's genuinely empowering, as a good notation should be.

I have a lot to say about notation in math, but this isn't the forum. There are a lot of crimes committed, agreed, but some of the things people pick out are only bad because they don't know the context where they redeem themselves.


> I have a lot to say about notation in math, but this isn't the forum. There are a lot of crimes committed, agreed, but some of the things people pick out are only bad because they don't know the context where they redeem themselves.

Indeed - I wish we could conceal those notations until the redemptive context became more apparent, in the same way that Latin teachers can conceal Sallust's inconcinitas until students are ready for it.


But the inverse function is the "multiplicative inverse" in the group of functions with composition as "multiplication". In that way, it makes a ton of sense. It's only a problem because you are mixing together two group operations.


To me it always seemed pretty natural. \frac{1}{f(x)} would be (f(x))^{-1}, by analogue to x^{-1}+1 vs (x+1)^{-1}. I would have been very surprised if raising just the 'f' part of the function application expression to some power were to mean raising the whole expression to that power.

I also would expect f^{2}(x) to mean f(f(x)), not f(x) * f(x) (which would be (f(x))^2).


> \sin{30}. Radians or degrees? Probably the writer means degrees, but there's no way to tell.

I once had a professor that insisted that sin(30) meant sin(30 pi) in radians, with the pi being implicit. Unsurprisingly, it was the worst class I've ever taken.


Are you sure it wasn't sin(30 pi / 180)? Because that actually makes sense.


Absolutely terrifying. How did he communicate with his peer mathematicians?


I honestly don't know - it was an intro level physics class.


Mathematicians understand how to read definitions of notation.

If all of your trig is fractions of pi, writing pi redundantly everywhere is not useful.


…but that's not a fraction of pi?


30pi is a fraction of pi.


Is that commonly done?


It is not. Though it wouldn't be too hard to adapt, I think, if it was noted at the beginning of a text. For it to take, it would have to power some influential result, sort of like how the summation convention in tensor notation works (something that infuriated me as a kid).


Allow me to introduce modern theoretical physics, where a comment like "X is obvious in context" never is, except to the person who wrote it. Don't worry though, they helpfully add that "When X isn't what it seems in context, this will be called out". Okay. "Sometimes it may not be." Wat?

Of course, every individual researcher, or at least each individual Physics department has their own conventions, and the conventions are critical to the meaning. It's like the programming paradigm where the naming of functions invokes "magic glue" instead of using strongly typed interfaces. It's unbelievably confusing to the uninitiated.

If you're trying to "cut across" a bunch of theories being worked on by different groups, you basically have no hope. Everybody ends up being super specialised not only to a specific sub-field of study, but to a specific research group.

I just watched a 1 hour lecture on extending GR by some physicist last night. He was reading the equations out loud, and at one point he was making noises like the following non-stop for about 2 minutes: "Eta mu nu, one minus one zeta nu mu, mu nu eta zeta one". It was ludicrous.


I also remember constant confusion about the assumed associations in bigger written formulas, especially ones involving functions and the practice of not writing the parentheses; and, before that, whether 3 1/5 meant three times 1/5 or 3 wholes and 1/5th.

I thought it was only at these lower levels of education that people settled on such confusing practices.

Later, at university, I discovered that it's the same at all levels. Notation outside of programming is simply very context-dependent, and it's a pity that that dependency is not made clear more often. Also, some practices come directly from historical usage, and different paths to the current notation end up at similar-looking forms with different meanings.


I think x(3) would more commonly mean x times 3. f(3) would be function application. Even more context dependence.


Right.

Of course, if you write x(3), other mathematicians should frown at you because you're making bad notational choices.

It's a bit like explaining to students that the real way to know which 3rd declension nouns are i-stems in Latin is to say the genitive plural both ways. The one that doesn't sound wrong is correct. But you have to have a lot of time in the language for that to work.


Well unnecessary parentheses are often used to indicate a substitution has happened. E.g. in a topic I just taught I would write things like ∫_{y=0}^3 x dy = [xy]_{y=0}^3 = x(3) - x(0) = 3x. In context I think it's perfectly clear and a good notational choice.


And so it is.

edit: I suppose, what I'm trying to get at, perhaps too glibly, is that audience matters terribly much in mathematical writing. In the same way that Latin students don't start with Tacitus or Sallust, famous for their idiosyncratic grammar, math students shouldn't jump into the full context-dependent mess of the notation that experienced mathematicians use.

But I think we often throw them in unintentionally because we're so used to it.


You used too little context when arguing with an audience who isn't used to that style of writing...


Also, there are actually quite reasonable rules to know which 3rd declension nouns are i-stems, so it doesn't seem quite right to just say "follow your gut".

Btw, I wouldn't abuse Latin comparisons on an American forum, I don't think it's quite in the culture ^^


x(3) is a function "x" being applied to the number 3, because "three times x" is 3x.

Of course context would help resolve this if you have a function named x or not, or a variable named x or not, and if you have both a function and a variable named x, well, you worked for your confusion and you have obtained it; congratulations. :)


I'm sure some would take it that way, as "x" isn't commonly used in mathematical notation as a function name; the norm is to use f, g, etc.

However, the notation is relatively clear notwithstanding. It almost has to be x as a function with an input of 3.


I think x(3) would more commonly imply a function of 3, since you'd virtually always otherwise just write 3x.


It depends on the context. As an intermediate step I definitely write things like x (3) meaning multiplication, as it can more clearly indicate what's just happened (see my other comment in this thread).

There's other context too, based on what is known. Up to a certain point in first year at my university, most engineering students haven't ever seen functions named x and y, and so they'd mostly interpret x(3) as multiplication. Then we show them parametric curves, and suddenly x(3) looks like the x coordinate of a point on the (x(t), y(t)) curve.


This is one of the reasons I annoy people by following Wolfram’s convention in Mathematica of using square braces to denote arguments passed to a function: f(x)=fx=f×x while f[x] means “apply the function f to the argument x”. An unusual convention it may well be, but at least it’s one devoid of ambiguity.


It's not devoid of ambiguity, because f[A] usually means the image of the set A under f.


Not at all. f[x] is clearly the xth component of the array f :-D


On the blackboard in the mathematics department it’s not.

But I get your point.

I have also heard the objection that [x] is a 1×1 matrix whose only element is ‘x’.


Square brackets can also denote the floor function (especially in old works where '⌊' and '⌋' were typographically unavailable).


An array is merely a function from N to some set.


If you meant "function", would you generally write 𝑓(3) instead of f(3)?

(I have no mathematics background past K-12)


So, has there been any serious effort to design a new syntax for maths that could help regular folk?

If we learn new syntaxes (languages) all the time, then a push for a new syntax (math) might not be that bad of an idea...


There are some gentle efforts in that direction (tau radians for example), but mathematical notation is old (except the parts that are new). And it's really hard to strike the right balance between concision and ambiguity.


I think it's important to know that this really isn't true. Maths is at all times a subjective language. Maths notation is imprecise, "intentionally confused", or just defined ad hoc all the time.

When you see something like

    f(x) = summation(x^n, n=0, 10)
We conveniently ignore that this polynomial is defined at x=0 despite 0^0 not making any sense by ad-hoc defining 0^0 = 1 in this context.
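Programming languages tend to make the same ad-hoc choice; in Python, for instance:

```python
# Python, like most languages, picks the "empty product" convention:
assert 0 ** 0 == 1

# which is exactly what makes the polynomial sum work at x = 0:
poly = lambda x: sum(x ** n for n in range(11))   # 1 + x + ... + x^10
assert poly(0) == 1
assert poly(1) == 11
```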


> We conveniently ignore that this polynomial is defined at x=0 despite 0^0 not making any sense by ad-hoc defining 0^0 = 0 in this context.

What do you think the polynomial is? I ask, because in all situations similar to this that I've encountered it made sense to define 0^0 as 1, not as 0. If you genuinely have a case where 0^0 = 0 makes consistent sense then I'd be interested in understanding it.

So, what do you think the summation actually is when expanded?


Sorry you're right, it should be 0^0 = 1.

An example where 0^0 = 0 occurs when dealing with areas. The measure of the real line in the plane is zero but it's also a rectangle with sides (0, inf) and we define the area of a rectangle to be l*w.

You usually see this written as inf × 0 = 0, but sometimes you see the interpretation 1/0 × 0 = 0^0 = 0. And you know this is an ad-hoc definition because you're not allowed to algebraically manipulate it at all.


It makes perfect sense, everyone knows:

0^0 = 0

1^0 = 1

0^1 = 1

1^1 = 0


That's just lazy, and I haven't ever read a good text use summation where any of the terms would be outright undefined if evaluated.


You've never seen a textbook present the series form of e^x?

    exp(x) = summation(x^n/n!, n=0, inf)
exp(0) = 1 is perfectly valid, but the identity above holds for x=0 only if we define 0^0 = 1. And we do!

Fair, infinite series are pretty esoteric. How about derivatives? The power rule:

    d/dx x^n = n x^(n-1)
This identity doesn't hold for n = 1 and x = 0 unless 0^0 = 1.

Eh maybe not, programmers don't use calculus that often. But surely statistics!

    (1 + x)^n = summation((n choose k)x^k, k=0, n)
Take x = 0 and n = 0

    (1 + x)^n = (1 + 0)^0 = 1, while the sum collapses to (0 choose 0) x^0 = 0^0
0^0 is undefined in general but locally we sometimes need to define it.
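A quick check of that binomial edge case in Python (the helper name is mine):

```python
from math import comb

# (1 + x)^n == sum_k C(n, k) x^k, including the edge case x = 0, n = 0,
# where the right-hand side is C(0, 0) * 0**0 and needs 0**0 == 1.
def binomial_rhs(x, n):
    return sum(comb(n, k) * x ** k for k in range(n + 1))

assert binomial_rhs(0, 0) == (1 + 0) ** 0 == 1
assert binomial_rhs(2, 5) == (1 + 2) ** 5 == 243
```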


Ok, fair, I think I have seen those. Personally I would prefer noting the special case even when using the fairly standard degenerate case of 0^0=1, but I agree that a lot of people don't.


actually 0^0 is usually chosen equal to 1... which probably still supports your argument.


Corrected! Thank you!


What isn't really true?



