Ah, an obscure point of absurdity which utterly kills my pending interest in the language. If this sort of thing exists under the hood, revealed only by a detailed analysis of the specification, what other nonsense is there? Going so far as analyzing a string to determine whether it consists entirely of numbers, for the non-sequitur purpose of then and only then converting it to what it isn't for logical evaluation, is working pretty hard to do something counter-intuitive. It might be tolerable if it actually preserved all digits, but not only does it work hard to convert a string to an integer, it then converts large integers into floating-point values: not just one, but two layers of explicitly undesired, unnecessary, and unreasonable typecasting.
I'm currently working with barcodes: numerical strings from 6 to 55 digits. In no way can I risk having one barcode be evaluated as equal to a literally different barcode just because the symbols in that string just happen to exhibit a passing resemblance to data of a different type.
Again, it's not just that it has loose typing. It's that it's taking what is OBVIOUSLY a string, converting it to an integer, THEN converting it to yet another data type which imposes data loss.
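To make the objection concrete, here is a minimal Python sketch that emulates the rule as it is described in this thread. It is not PHP's actual implementation: the names `INT64_MAX`, `to_number`, and `loose_equal` are hypothetical, the 64-bit integer range is an assumption, and the barcodes are made up.

```python
INT64_MAX = 2**63 - 1  # assumption: a build with 64-bit signed integers


def to_number(digits: str):
    """Digit string -> int, or float once it exceeds the assumed integer range."""
    n = int(digits)
    return n if n <= INT64_MAX else float(n)


def loose_equal(a: str, b: str) -> bool:
    """Hypothetical emulation: all-digit strings are compared as numbers."""
    if a.isascii() and a.isdigit() and b.isascii() and b.isdigit():
        return to_number(a) == to_number(b)
    return a == b  # anything else is compared as a plain string


# Two made-up 30-digit barcodes that differ only in the last digit.
barcode_a = "100000000000000000000000000001"
barcode_b = "100000000000000000000000000002"

print(barcode_a == barcode_b)             # False: plain string comparison
print(loose_equal(barcode_a, barcode_b))  # True: both collapse to the same double
```

A double carries only 53 bits of mantissa, so under these assumptions any two long digit strings that round to the same double would compare as equal.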
Intolerable for real-world use. A toy language. Alas, PHP, we hardly knew you...
ETA: Oh, I'd love to know the justification for the downvoting.
> Going so far as analyzing a string to determine whether it consists entirely of numbers
I was about to give an outraged reply that, if PHP is like Perl, then it doesn't scan the string afresh, just keeps a flag indicating whether or not it thinks a string is numeric. However, it turns out that's not true at all. `Perl_looks_like_number`, defined in `sv.c`, calls `Perl_grok_number`, defined beginning on l. 577 (as of v5.14.2) in `numeric.c`, which (after some book-keeping) does this:
    if (s == send) {
        return 0;
    } else if (*s == '-') {
        s++;
        numtype = IS_NUMBER_NEG;
    }
    else if (*s == '+')
        s++;
    if (s == send)
        return 0;
    if (isDIGIT(*s)) {
        UV value = *s - '0';
        if (++s < send) {
            int digit = *s - '0';
            if (digit >= 0 && digit <= 9) {
                value = value * 10 + digit;
                if (++s < send) {
                    digit = *s - '0';
                    if (digit >= 0 && digit <= 9) {
                        value = value * 10 + digit;
                        if (++s < send) {
                            digit = *s - '0';
                            if (digit >= 0 && digit <= 9) {
                                value = value * 10 + digit;
and goes on and on and on and on in the same vein. Sheesh! (I didn't forget to close that last brace; the next line is de-dented, but that seems to be a mistake.)
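For readers who don't want to trace the nesting, the quoted ladder amounts to an optional sign followed by a digit-accumulation loop. The following is a rolled-up Python sketch of just that excerpt, not Perl's code: `accumulate_digits` is a hypothetical name, and it ignores overflow and everything the real function does outside the lines quoted above.

```python
def accumulate_digits(s: str):
    """Return (is_negative, value, chars_consumed), or None if s does not
    start with an optional sign followed by at least one ASCII digit."""
    i = 0
    negative = False
    if i == len(s):
        return None
    if s[i] == "-":
        negative = True
        i += 1
    elif s[i] == "+":
        i += 1
    if i == len(s):
        return None
    if not ("0" <= s[i] <= "9"):
        return None
    value = 0
    # Same step as each rung of the unrolled ladder: value = value * 10 + digit
    while i < len(s) and "0" <= s[i] <= "9":
        value = value * 10 + (ord(s[i]) - ord("0"))
        i += 1
    return negative, value, i


print(accumulate_digits("-123abc"))  # (True, 123, 4)
```

Python integers don't overflow, so the sketch can loop freely; the C original has to worry about the width of `UV`.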
Perl does cache whether an SV contains something usable as an integer (the IOK flag) or a floating point number (the NOK flag). That's why you almost never see `looks_like_number` called on its own; it almost always follows one of the appropriate flag-checking macros.
Seems like a silly thing to say. I wouldn't write avionics software with it, but there are a billion websites demonstrating that it's pretty decent for real-world use. At least as good as any other language, I'd guess.
So you could have predicted this yesterday? Just because it's codified somewhere, it doesn't make it clear, or anything other than a whim, or a product of circumstances, at best. That's not how languages should be defined, even if PHP clearly demonstrates that they can end up that way by chance.
Rasmus' lack of foresight does not make it reasonable.
Earlier versions said the comparison converts the numbers to integers, though, which may have been incorrect, and misleading if so. Did PHP not convert float-like strings to floats in, e.g., 2009? http://web.archive.org/web/20091024233139/http://www.php.net...
The point, however, is that you shouldn't pepper your language with operations which have consequences as hard to foresee as this with no good reason, and I really don't think that saving yourself some type conversions here and there would do.
"no good reason" is entirely subjective, though. If your purpose is to make the language simpler to newcomers, implicit conversions everywhere are a great way to get things done. And the popularity of PHP (especially for new-to-programming people) heavily supports that they made the correct decision to work well for that market.
The same kind of logic is used to make `false == ""` true. Or any 'falsy' language. If you want strictly typed behavior, yes, it's stupid to do that. If you don't, then it makes some things simpler, at the expense of more edge cases that are unlikely to happen - note that this bug was reported in 2011, and people are acting like it's a new thing. Because it comes up so rarely that, while it technically exists, many people never encounter it.
> "no good reason" is entirely subjective, though. If your purpose is to make the language simpler to newcomers, implicit conversions everywhere are a great way to get things done. And the popularity of PHP (especially for new-to-programming people) heavily supports that they made the correct decision to work well for that market.
You are right, in a way. Sure, it may attract and retain more newcomers, but that's like saying that tobacco is "teenager friendly". I think it's not beginner-friendly at all if you need years of experience to avoid the innumerable pitfalls which PHP lays for you all over the place: learning, e.g., the range of integers in PHP, which determines whether a string becomes a float or an int, or that you should actually use strcmp.
In Python, Ruby, or heck, Haskell, you'd just have to do == and there would be no surprises.
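For the Python part of that claim, a quick check (Ruby and Haskell are not shown here; the digit strings are arbitrary examples):

```python
a = "9223372036854775807"
b = "9223372036854775808"

strings_equal = (a == b)          # strings are compared character by character
mixed_equal = ("123" == 123)      # a str never compares equal to an int
ints_equal = (int(a) == int(b))   # explicit conversion, arbitrary-precision ints

print(strings_equal, mixed_equal, ints_equal)  # False False False
```

Any numeric interpretation has to be requested explicitly with `int()` or `float()`, so the precision trade-off is visible at the call site.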
I agree entirely, but we're thinking like programmers. Grab someone who's never programmed at all and ask them if `123 is equal to "123"`.
This essentially breaks down to the top-down vs bottom-up education style debate. You can learn the gritty details and get caught up in minor details that may not matter in other languages, or learn how to do something, and get tripped up by the details in other languages. Similarly, we could teach kids abstract algebra, or basic +-*/ and then over-simplify when they try to divide by zero.
Neither is ideal, both have useful traits and problems, so we have to pick one. Or try to come up with something radically different.
edit: to ask it another way: if PHP is a massively-popular gateway drug to the world of programming, but it gives some people horrible flashbacks for the rest of their lives, do you want to make it illegal and close the door to a huge number of people?
Oh, and "everyone knows that PHP lights the upper-rightmost pixel in your screen purple and will crash if there's no screen" would not, in fact, justify such a thing.
Alright, I'll spell it out for you: the behaviour may be what you'd expect from floating-point comparisons, but it doesn't have to be a floating point comparison in the first place.
No, it doesn't. Language designers make lots of decisions that end up being silly. But they make them. In PHP, if it looks like a number, it will get treated like a number when being compared via ==. It's a simple, well-established, fundamental rule.
Sadly, while '===' is the quick fix, you then have to litter your code with type casting operators if you're comparing numbers, particularly those sourced from, say, a database, where everything is returned as a string. Or from GET or POST data, where everything is a string.
This tripped me up when I was trying to compare two numbers, one of which was the result of a COUNT query via PDO. Of course, that COUNT result was a string.
I suppose if you worked entirely with strings it's alright. Or it wouldn't be so bad if you could make the reasonable assumption that functions returned appropriately typed data.
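The pattern being complained about, converting at the boundary and only then comparing strictly, looks roughly like this. It is sketched in Python rather than PHP, and `parse_count` and `raw_count` are hypothetical stand-ins for whatever a database driver actually hands back:

```python
def parse_count(raw: str) -> int:
    """Convert a count delivered as a string to int, rejecting anything
    that is not made up solely of ASCII digits."""
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"not a non-negative integer: {raw!r}")
    return int(raw)


raw_count = "42"  # stand-in for a COUNT result that arrives as a string
expected = 42

print(raw_count == expected)               # False: str vs int, no coercion
print(parse_count(raw_count) == expected)  # True: converted once, at the edge
```

Doing the conversion in one place, where the data enters the program, keeps the casts from being scattered across every comparison.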
It does what it's intended to do. Use the methods you are supposed to use. You're just arguing for the sake of arguing now, or you just don't know what you are talking about at all.
> Intolerable for real-world use. A toy language. Alas, PHP, we hardly knew you...
I'm sorry, I would like that to be true, but programmers rarely are half as smart as they think they are. We have many more years of PHP and its resulting insanity ahead of us.
Downvoting because you're railing against a language without understanding it. That, and your C++ comment above, make you sound like the programmer version of internet tough guy. Whatever your real life skills may be, it certainly sounds like a lot of posturing.
All languages have their idiosyncrasies. You can pick out some obscure aspect of any language and say $LANG sucks.
Btw, PHP's behavior doesn't totally make sense to me either. But I'm willing to assume that its users and designers have thought this through and it makes sense for PHP's intended use cases, because I don't know PHP.
JavaScript (which most people on HN seem to like) also has similar issues (null vs. undefined, == and ===, etc.). It got so bad that "JavaScript: The Good Parts" had to be written to define a de facto sane subset of the language. People are actually writing in CoffeeScript (in part) to avoid JavaScript's pitfalls.
Your medical application should probably be using === for comparisons. I'm not defending PHP's language design, but I think it's pretty well known among professional PHP developers that you should almost always use === and avoid implicit type conversions.
Got me. How do old-school C programmers ensure they don't accidentally use = when they mean == ?
There are actually some decent commercial PHP IDEs, believe it or not. I wouldn't be surprised if some of them are able to warn on loose equality comparisons. I don't have much direct experience with them though.
"obscurely-documented"? That seems pretty clear to me, given that it's a fundamental feature of the language, and documented (floating point problems too) in an obvious location.
Would your medical application fail FDA approval if you used a language like C or C++ that contained strncmp? Because that function throws away data, too.
There's a difference between strncmp existing precisely so you can specify how many characters to compare, vs. throwing away trailing characters in a string just because, by sheer chance, it contains only numerics.