PHP gets Generators (php.net)
73 points by TazeTSchnitzel on Sept 1, 2012 | 80 comments



PHP is the perfect showcase of how good languages are not the sum of their features. I don't think you can fix PHP just by adding the stuff it lacks. It can be improved that way since, at least, a good programmer can pick a subset of features that makes the language less of a pain to use, but I wish PHP could start changing in a "let's redesign it" way.

Disclaimer: I'm a big fan of PHP, not because I like the language, but because I like the pragmatic approach.


> I wish PHP could start changing in a "let's redesign it" way.

I think that's foolish. Redesigning a language is making a new language -- and there are lots of other new languages to choose from. What features of PHP would you want to keep that aren't possible in other languages?

Radically changing the language also leads to heavily splitting the community like with PHP4/PHP5 or Perl5/Perl6 or Python2/Python3 and nobody wants that. Backwards compatibility doesn't have to be fully maintained but it does have to be respected.

The biggest issue right now with PHP is that there isn't a clear direction on where to go. Nobody wants to "fix" some of the old problems because the gains don't seem to outweigh the costs.


There is room for a language that generates HTML by default, compiles every library anyone could ever need into the core, and puts it all in the default namespace. All you need to make that socially acceptable is something like GHC's -XNoImplicitPrelude option. By default, you get what might be considered to be a big mess. But when someone with a clue shows up to turn your prototype into something maintainable, they can turn on "expert mode" and program normally again.

That's what PHP is missing: expert mode. As it stands now it's very difficult to turn off all the traps, so experienced programmers are forced to switch to other languages. Compare this to Perl, where it has silly behavior by default, but you can turn it off with "use strict" at the top of the file.


My PHP code looks pretty much identical to my code in other languages. Everything is in namespaces and classes. There are abstractions for database access and other common operations. I don't think PHP needs an expert mode; it just needs to continue to be a better language.

However, I agree, all languages should have the ability to turn off the backwards compatibility crud with some kind of statement. In PHP, it would be great to do that at the namespace level -- allowing different libraries to operate at different levels of support/strictness in the same project.


I've used PHP a lot, work at a name-brand startup that uses a fair bit of PHP in the stack, and have an OSS PHP project I'm proud of (https://github.com/shaneharter/PHP-Daemon). That is to say, I'm not a PHP hater. I see many flaws in the language, many many actually, but I'm not a hater.

But I wonder if maybe you're blind to the ways your PHP is different. Dozens of modules and hundreds of functions come directly with the language core. And very very esoteric stuff. Why does my programming language have a function to tell me specifically what day Easter is? Whaa? Why do I have to recompile my programming language to install new libraries? (PCNTL, for example).

If PHP is a salad bar, Python (for example) is prix fixe.


You don't. Watch:

  wget http://www.php.net/get/php-5.3.16.tar.gz/from/us2.php.net/mirror -O php-5.3.16.tar.gz
  tar zxvf php-5.3.16.tar.gz
  cd php-5.3.16/ext/pcntl
  phpize
  ./configure
  make
  sudo make install
At this point you should see a pcntl.so in your extension_dir. Add an extension=pcntl.so to your php.ini file and you are good to go.

Of course with non-bundled extensions from pecl or github you can skip the tarball step.


That's no less complicated than downloading the PHP source and recompiling with the --with-pcntl flag. Now compare that, just for your own wonderment, to how it works in other modern languages..

pip install package_name or gem install package_name

not to mention, with bundler and virtualenv you can easily, easily keep multiple versions of different packages for different projects without any juggling and hacking your way through.


Easily? I remember having quite some difficulties getting gems with binaries to work on Windows (meaning: lots of wrestling before I even got the MySql driver working). Might be different now though but I have never experienced anything like that when dealing with PHP extensions or Composer packages.


That is only because you picked an internal extension that is usually already built-in. If you install the php-cli package from any major distro it includes pcntl. For normal extensions you either do an apt-get install php-imap (for example) or a pecl install stem if your distro hasn't packaged the extension you need or you just prefer to install it directly.


You don't have to recompile PHP to install anything; PHP has loadable modules and has had them since forever. As for Easter - why do you care that there's a function for that? How does it hurt you? Somebody needed it, so it was added. Why is having a function that somebody needs and you do not a problem?


>There is room for a language that generates HTML by default, compiles every library anyone could ever need into the core, and puts it all in the default namespace.

When I have time for side projects again, I'm going to make this happen for CoffeeScript running under v8cgi or node. The code for the "HTML by default" is trivial enough that I could write it in fifteen minutes. The code for the global namespace (I'd prefer a short namespace actually) stuff is a bit more tedious, and the parts with making it run a server under node and automatically parse and cache the files are the boring details that I might get stuck on and cause the project to languish. Ideally, I'd want to make it flexible regarding frontends, so that you could get it running in any environment that accepted JS code and could wrap the built-in module behavior.

The frustrating thing is that until we see more widespread support for server-side JS on reasonably cheap hosts, or at least some well-implemented JS transpilers/interpreters for commonly installed languages like Python and PHP, making this work for Joe Everydev is going to be virtually impossible.

I'd be interested in doing something similar for Python (mod_python supports something close in PSP, but nobody seems to use mod_python anymore), but while I'm capable with Python, I've never really been interested in learning the ins and outs of the "Pythonic" style, and that makes me reluctant to put any code out where the Python community can see it.


What I mean by "redesigning" is to do what they are doing now (adding what was clearly missing), but at the same time doing what they are not doing now, that is, removing what was clearly not ok and is now kept only for backward compatibility. With a switch to allow for backward compatibility, that is disabled by default.


They can't do that though. PHP is so successful and ubiquitous because it's been backwards compatible for a really long time. Most PHP apps run very well on 5.2, which was released nearly 6 years ago.

I think the way PHP is evolving is how it has always evolved. No one's thinking 5 years into the future. But I argue that they don't have to. PHP doesn't win on its big design advantage, it wins on all of the small parts that it haphazardly releases (I don't mean that in a bad way).


For some people, there's just no pleasing them. Short of the whole PHP community deciding to scrap the whole project and coming to them for directions on how to live further, there's nothing to be done to make them happy.

This "redesign" thing you talk about would never happen. Or it happened already dozens of times, if you want. It's just called other languages, not PHP. PHP is something that it is, and it develops and gets better, but if you think the whole premise is wrong - fine, there are tons of other approaches around. That doesn't mean every topic on PHP should be accompanied by good old "yeah, that's nice, but you have to redesign it". What's the point of it?


PHP does make big, backwards-incompatible changes at major versions. They are slowly happening, but only slowly, unfortunately. But I hope to see some big changes before trunk becomes PHP 6.


I'll bite... like what?

All of PHP4's syntax is still valid in PHP5 even if it does raise a deprecated notice.

The only real thing that broke in the upgrade is if you had a LOC that was expecting an object to be passed by value. Theoretically possible but practically very very rare.

And of course config settings like register globals being changed, but that's just that: a config change.


Which is a problem in and of itself...languages shouldn't have config files.


register_globals was removed.

PHP5 changed the semantics of objects entirely. Before they were value types.

Still, yes, not enough has changed IMO.


This is certainly a debate about semantics, but I don't think you mean "semantics." They changed the implementation of objects. And they added new syntax. But they made no changes that affected the old syntax -- except the one I mentioned.


> They changed the implementation of objects

You're implying it was an implementation detail. They fundamentally changed how objects worked, the semantics, that made it completely impossible to move any of my PHP4 code over in a reasonable way.

It is possible to make code, that uses objects, work in both PHP4 and PHP5 but you have to restrict yourself to a subset of both versions.


That's interesting. What were you doing with your PHP4 classes that didn't run in PHP5?


The difference between value semantics and reference semantics is pretty significant. It's almost impossible to do any meaningful OOP without taking that into account. Getting PHP4 to work reasonably requires a lot of work (hence the change in PHP5).
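A minimal sketch of that difference, written in Python only because it is easy to run (it is not PHP). Python always hands the callee a handle to the caller's object, which matches the PHP5 behaviour discussed here; PHP4's copy-on-pass is merely emulated with `copy.copy`. `Record`, `mark_dirty_handle` and `mark_dirty_value` are hypothetical names made up for the illustration.

```python
import copy


class Record:
    """Hypothetical stand-in for a PHP object with one property."""

    def __init__(self):
        self.important_data = "clean"


def mark_dirty_handle(rec):
    # PHP5-style: the callee receives a handle to the caller's object,
    # so the mutation is visible to the caller.
    rec.important_data = "dirty"


def mark_dirty_value(rec):
    # PHP4-style (emulated): the callee works on its own copy,
    # so the caller's object is left untouched.
    rec = copy.copy(rec)
    rec.important_data = "dirty"


by_handle = Record()
mark_dirty_handle(by_handle)

by_value = Record()
mark_dirty_value(by_value)

print(by_handle.important_data)
print(by_value.important_data)
```

Code written against the second behaviour silently changes meaning when the runtime switches to the first, which is the kind of subtle breakage being described.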

None of the PHP4 frameworks moved over to PHP5 without a lot of re-writing.

My code simply assumed value assignment semantics and changing that everywhere was just too much work. At this moment, I still have sites running PHP4.


I'd love some specifics.

Were you passing around objects and mutating them while expecting the existing object to be unchanged? For example:

  $o = new MyObject();
  $o->important_data = 'clean';
  function foo($give_me_o) {
    // Mutates the parameter; in PHP4 this is a copy, in PHP5 a handle to $o
    $give_me_o->important_data = 'dirty';
  }

  foo($o);
  echo $o->important_data; // Need this to say "clean"
Because that, literally, is all that changed. Yes, the implementation changed. But PHP4 syntax (including explicit pass-by-reference and class-named constructors) still works in PHP5. That's not an accident. That was by design. Go back and read the discussions.

Frameworks chose to upgrade because you get MUCH better memory management and cleaner syntax from PHP5. But many, many people run PHP4 code in a PHP5 runtime. It works just fine. Because that example I used above -- while it is totally possible you have code like that, that's in practice pretty rare.


You make it sound like it was a small change. In PHP4, you had to be careful to use references everywhere to get "proper" behavior. To return an object out of a collection in your function, the function had better be declared as returning a reference, and you had better use a reference whenever you do an assignment, or you won't get what you expect.

The potential for subtle bugs with this change is massive.

Your example is very simple. I've got objects with other objects as properties (composition), collections of objects, hierarchies of object references, as well as methods that take objects by reference or by value. Not using references everywhere is so much easier in PHP4 that whenever possible, I did just that.


I guess so, I'm not well-versed in programming nomenclature, I haven't done a CS course.


I agree that it would be great if they actually were fixing stuff instead of only adding stuff, but as someone who works in PHP on a daily basis, I certainly appreciate the addition of modern features as well.

It seems like one way for them to fix a lot of the old stuff while maintaining backwards compatibility would be to add new string and array (and dictionary?) classes with sensibly-defined interfaces. I imagine there are a fair number of less-obvious issues around that that make it not as easy as it sounds, though.


Those already exist, they are in Spl if you want them.


This feature seems seriously contrived and they could have tried to learn a thing or two from ruby here.

The idea that the code flow is now no longer linear, and that the programmer now has to understand perfectly the odd rules by which the execution of code now just jumps around… boggles the mind.

A closure-based approach would have been much more semantically sane; if a little syntactically ugly. But then again, how hard would it have been to add a nicer closure syntax to PHP, instead of creating this mess of a feature?

Take this:

    def each_line_in_file
      f = File.open('hello.txt', 'r')
      while line = f.gets
        yield line
      end
      f.close
    end
    
    each_line_in_file do |l|
      puts l
    end
This is a real-world Ruby example of the feature. The semantics here are clear, each_line is a higher-order function which takes a closure as an argument, and calls it (yields to it) for each line in the file. Ruby allows the programmer to use the yield keyword to avoid having to declare the closure as an argument (`def each_line(&block)`).

The same can even be implemented in PHP as it stands today, although the syntax is quite unsightly:

    function each_line_in_file($callback) {
        $f = fopen('hello.txt', 'r');
        // fgets() returns false (not null) at end of file
        while (false !== ($line = fgets($f))) {
            $callback($line);
        }
        fclose($f);
    }
    
    each_line_in_file(function ($l) {
      echo $l."\n";
    });


I'm not really sure what I should do with this comment. It doesn't make sense to me.

Generators and coroutines are a well-known concept implemented in many languages, in particular Python and C#. It is also part of ECMAScript Harmony as far as I know. Other languages implement the more generalized concept of continuation passing (usually providing generators implemented on top of them).

Callbacks can be used to implement some of the use cases of generators. To be more precise they can be used if the callback is stateless (as in your above case). If the callback had to handle state, you would have to manually implement a state machine, making the code hard to understand and adding a lot of boilerplate.
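To make the state-machine point concrete, here is a rough sketch in Python (whose generators behave the way this thread says PHP's do). The task, grouping items into pairs, and the names `each_item`, `items` and `PairCollector` are all invented for the example.

```python
def each_item(callback):
    # Callback-style producer: it drives the loop and pushes items out.
    for item in ["a", "b", "c", "d"]:
        callback(item)


def items():
    # Generator-style producer: the consumer pulls items when it wants them.
    for item in ["a", "b", "c", "d"]:
        yield item


class PairCollector:
    """Callback consumer: its state has to survive between calls."""

    def __init__(self):
        self.pending = None
        self.pairs = []

    def __call__(self, item):
        if self.pending is None:
            # First half of a pair: remember it and wait for the next call.
            self.pending = item
        else:
            # Second half: emit the pair and reset the state.
            self.pairs.append((self.pending, item))
            self.pending = None


collector = PairCollector()
each_item(collector)

# Generator consumer: the same logic reads top to bottom with local variables.
pairs = []
source = items()
for first in source:
    second = next(source)
    pairs.append((first, second))

print(collector.pairs)
print(pairs)
```

Both versions produce the same pairs; the difference is only in where the "am I holding a first item?" bookkeeping has to live.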

Generally PHP does not make extensive use of callback-y programming, but does make extensive use of iteration. That's why generators integrate much better than callback based producers. The same applies to Python, that's why they too have generators.

You need to understand that Ruby has a different underlying philosophy, which is not directly transferred to other languages. What might be nice in Ruby will be considered really ugly in Python or PHP and vice-versa.


> This feature seems seriously contrived and they could have tried to learn a thing or two from ruby here.

That's way too late, PHP is an imperatively styled and statements-based language, it uses iterables and external iteration, not internal. That's what the language is.

> The semantics here are clear

There's nothing unclear about generator semantics, regardless of your refusal to understand them.

> The same can even be implemented in PHP as it stands today

Not quite, PHP doesn't have non-local returns, so you soon get into annoying issues of not being able to break away from an iteration (or of having an utterly terrible API to do so, as in [NSArray enumerateObjectsUsingBlock:]).
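A small sketch of the early-exit problem, in Python rather than PHP. The "return False to stop" convention in `each_line` is an assumption of this example, one of several workarounds a callback API could pick; `each_line`, `iter_lines` and `collect_until_stop` are hypothetical names.

```python
def each_line(lines, callback):
    # Internal iteration: the producer owns the loop. Without non-local
    # returns, stopping early needs a convention such as "return False".
    for line in lines:
        if callback(line) is False:
            break


def iter_lines(lines):
    # External iteration: the consumer owns the loop.
    for line in lines:
        yield line


LINES = ["alpha", "beta", "STOP", "gamma"]

seen_callback = []


def collect_until_stop(line):
    if line == "STOP":
        return False  # signal the producer to stop
    seen_callback.append(line)


each_line(LINES, collect_until_stop)

seen_generator = []
for line in iter_lines(LINES):
    if line == "STOP":
        break  # an ordinary break, no protocol needed
    seen_generator.append(line)

print(seen_callback)
print(seen_generator)
```

With the generator, leaving the loop is just `break`; with the callback, producer and consumer both have to know about the stop signal.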

> Ruby allows the programmer to use the yield keyword to avoid having to declare the closure as an argument (`def each_line(&block)`).

So you dislike one magic but you like another for pretty much no reason at all beyond knowing one and not the other? Consistency not your style?


> So you dislike one magic but you like another for pretty much no reason at all beyond knowing one and not the other? Consistency not your style?

The thing I love about Ruby's magic is that it's not really magic, it's an abstraction of solid fundamentals.

There's nothing really magic about ruby's yield, because it's a shortcut for explicitly declaring a closure and calling it. These are real language concepts that are well understood. PHP's yield, however, is a completely new construct. The jumping around does not follow well-understood rules (like calling a function). In ruby's case, that's what it does. It simply executes the function, and calls a closure.

In PHP now when you define a function, and then call that function; it may or may not execute the code inside that function, depending on whether later on in the function there's a `yield` keyword. That's insane!

Say you can see this in your code editor:

    public function hello() {
        // a lot of important code
        $myvar = $this->other_cool_function();
        // more code
    }
    public function other_cool_function() {
        echo "Hello, world!\n";
    . . .
You're looking at line 3, the line which calls the other cool function. Looking at this code you'd expect that to print "Hello, world!" first, and do some other stuff in the rest of the function. But wait! You didn't notice, 30 lines down the other function, there's a yield keyword, which means that the entire function doesn't ever execute until you start iterating over it…


> In PHP now when you define a function, and then call that function; it may or may not execute the code inside that function, depending on whether later on in the function there's a `yield` keyword. That's insane!

It's exactly how it works in C# and Python. When you call a function, you don't care how it does what it does -- only that it returns a result. The syntax and design works this way in all sorts of languages because it's easier to reason about. I've only used generators in C# but it's amazingly simple to use. The point of an abstraction is that it hides the details. You're overly concerned with the details that are neatly and logically hidden away.


This isn't really PHP's fault. These generators and the generators that are creeping into JavaScript are based on Python's, which have the same issue you mention:

  >>> def other_cool_function():
  ...     print "Hello, world!\n"
  ...     yield 7
  ... 
  >>> other_cool_function()
  <generator object other_cool_function at 0x10081b140>
Of course, I would prefer to have coroutines or call/cc, but generally these get shot down as terrible ideas by the people who are actually in charge of making decisions. In Python you can get around this using greenlet. I'm told HipHop has these goodies as well. No dice if you need to write JavaScript or non-Facebook PHP, though.


So how a function executes depends on the content of the function? And the return value depends on what's inside the function? Insane! Next thing they'd say you actually have to document your functions and describe how it works and what it does! That you have to comment code! Madness, utter madness! Nobody would use a language that doesn't have every function's purpose obvious from the name!

Wait, what? Numerous languages implement it in exactly the same way? Are they called "Ruby"? No? Then who cares, it's not Ruby so it must be wrong. Not just wrong, it's insane - nobody sane would implement a language that doesn't work exactly like Ruby!


Not familiar enough with Ruby, but wouldn't your Ruby sample have this same problem?


No, the ruby "generator" is actually a function that takes a function and calls it with each line from the file, like this:

  >>> the_list = [1,2,4,7]
  >>> def ruby_generator(f): 
  ...   for x in the_list: 
  ...     f(x) 
  ...     
  >>> def print_it(x): 
  ...   print x 
  ...   
  >>> ruby_generator(print_it)
  1
  2
  4
  7


That's not different than what PHP does now.

I think you might be confused. I'm referring to the GP post that has the example with the yield call in it, the ruby example.


> That's not different than what PHP does now.

You could write code like that in PHP, and the GP post with the ruby example does so, but PHP generators do not work like that.

> I think you might be confused. I'm referring to the GP post that has the example with the yield call in it, the ruby example.

The Ruby example doesn't have the problem you mentioned because Ruby doesn't have generators at all. In Python and PHP, yield means "Suspend execution of this function and return a value to the caller, who can later call this function again to resume its execution." In Ruby, yield means "Call the block that was passed to this function with the values after yield," or exactly what I've written in Python without using the yield keyword. Ruby's "yield line" is Python's "f(x)".

Edited: I think I understand what you mean after reading once more. If you mistook the function containing yield in Ruby for a function that does not contain yield you would still have a problem, but not the same problem you would have in PHP or Python. It would raise an exception, while PHP and Python's would just return an iterable.

  def each_line_in_file 
  ..   yield 1 
  .. end
  => nil
  each_line_in_file
  (eval):2: (eval):2:in `each_line_in_file': no block given (LocalJumpError)
      from (eval):3


> I think I understand what you mean after reading once more.

Thanks for taking the time to understand. My apologies for not being clear from the outset.


>The idea that the code flow is now no longer linear, and that the programmer now has to understand perfectly the odd rules by which the execution of code now just jumps around… boggles the mind.

They aren't odd rules. It's just a coroutine. It executes, yields a value to the calling function and yields back control, the calling function does stuff, and yields back control, and this repeats. It's quite simple.
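The back-and-forth can be traced with a short Python sketch (Python's generators follow the same suspend-and-resume model described here). `producer` and the `trace` list are made up for the illustration.

```python
trace = []


def producer():
    trace.append("producer: started")
    yield 1
    trace.append("producer: resumed after 1")
    yield 2
    trace.append("producer: finished")


# Calling the function only creates the generator; the body has not run yet.
gen = producer()
trace.append("caller: generator created, body not run yet")

# Each loop step resumes the body until its next yield.
for value in gen:
    trace.append(f"caller: got {value}")

for entry in trace:
    print(entry)
```

The printed trace alternates between caller and producer, starting with the caller's line, since nothing inside `producer` runs until iteration begins.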

Also, don't argue against it here. Argue with the Python and C# folks, they did it first.


Not sure if you are familiar with Enumerator class in Ruby, it does pretty much what generators in PHP are about:

  require 'enumerator'

  def each_line_from_file file_name
    Enumerator.new do |yielder|
      File.open(file_name) do |file|
        while line = file.gets
          yielder << line
          ## or, alternatively:
          # yielder.yield(line)
        end
      end
    end
  end

  each_line_from_file('test.txt').each do |l|
    puts l
  end


I'm not quite caught up on the mailing list - but it looks like the last-minute point about traversing an already-exhausted generator (see [1]) has held.

That's really the most important thing for developers to notice - whether you use the feature often or not, you must be aware that foreach() might throw, not just emit an E_WARNING.

____________________

1. https://wiki.php.net/rfc/generators#rewinding_a_generator


I guess you'll be upset then that I, personally, was for it using Exceptions, because Generators are Iterators.


It's a sensible choice and I understand the reason for it being this way. It does seem like it's exposing an implementation detail, though, and out of the rest of the core language syntax, nothing else throws an exception.

I'm happy to see the feature, but this is one of those quirks that, in another five years, will end up in the next "A Fractal Of Bad Design" blog post.


Interesting - only one vote against (final numbers were 24-1).


I find it amusing that, on the PHP internals mailing list, RFCs will be heavily "debated" (i.e., argued) about, and you'd think from reading it everyone hates something.

Then it comes to vote and >90% vote for it, usually :)


You misunderstand how the list works :) If people discuss small details of it passionately, that means they care about the big concept enough that they want to get the details right. If they hated the whole idea, most people would just say "it doesn't make sense" and move on, and there would be 2-3 people discussing something irrelevant until the thread dies off. That happens too, but arguing about a feature doesn't mean people hate it. More often than not it means people are interested in it.


Everything should be debated. It ensures that you get the best possible product. We may agree on something, but if no opposing views are given, how do we know it's the best?


I'm not opposing debate.

You'd understand what I was on about if you read the PHP internals mailing list - the same points are argued to death; instead of a good discussion, it becomes a huge argument.


Debate isn't always about arguing your side. Sometimes it's about making sure all the sides are argued so that you can know you're going the right direction.


I used to say that PHP tried to be Java. Closures, traits and now generators, linguistically it's getting ahead.


PHP is still very Java-like in some ways though. Well... PHP likes to pretend it's a dynamic language (Closures, for instance), but at its core it's a very strange dynamic/static hybrid, utilising a single-pass compiler and with type checking done with syntax definitions(!)

Which is why this doesn't work: (actual intended production code)

  self::$views[$path]();
Why? Because the syntax doesn't accommodate calling a closure that is an item in an array, which is a static member of a class.

But this works:

  $x = self::$views[$path];
  $x();


> PHP likes to pretend it's a dynamic language (Closures, for instance),

PHP is a dynamic(ally typed) language, and closures have nothing to do with static/dynamic.

> at its core it's a very strange dynamic/static hybrid

No, at its core it's completely dynamically typed, it got static features when it decided to base its OO on Java's.

> Which is why this doesn't work:

This doesn't work because PHP uses a shitty ad-hoc parser for a shitty ad-hoc syntax, much like closures it's completely orthogonal to the language being dynamic or static.

Also, this piece of code doesn't involve any closure. It looks like you don't know what a closure is.


>Also, this piece of code doesn't involve any closure. It looks like you don't know what a closure is.

Yes it does, it executes what PHP calls a "Closure": http://php.net/manual/en/class.closure.php


Actually, not really.

Your code breaks on any PHP "callable" http://php.net/manual/en/language.types.callable.php of which Closures are just one.


I'm well aware of that.

Although it seems PHP tried to treat it as a string callable IIRC.


> PHP is still very Java-like in some ways though.

PHP has much more in common with Java/C# than Python and Ruby. If you took Java and added dynamic typing and compiled it fresh on each request you'd get something very close to PHP.


Exactly.


Just like how they added someFunc()[0] syntax recently I guess they will make these work one by one where the demand is by people wanting to use them. Granted it would be great if the compiler wasn't so brittle.


I suspected a weird kind of type checking. It's so dynamic they're hitting walls whenever they want to improve things I guess.

One PhD student tried to make a PHP compiler, hair loss ensued.


I assure you, I still have all of my hair ;)


I meant mine obviously ;)

For the curious, here's the talk (1 hour) about ahead of time compilation of PHP:

http://www.youtube.com/watch?v=kKySEUrP7LA


What's the link to your research?


I was trying to be funny, insinuating your talk was so extensive that my hair fell.


There's no reason in PHP's design that wouldn't work -- they just need someone to work on improving the parser to handle that.


It's not improving the parser, it's improving the language's syntax definition, which shouldn't care about this in the first place.


PHP's parser is the language syntax definition. It's perfectly reasonable for any PHP developer to attempt that call because it should work. PHP's parser, for no good reason, just doesn't like it. I'd consider it a bug rather than a design problem.


Without having investigated it, I would guess that the parser isn't abstract and modular enough so they end up with a mess of code trying to handle all the different possible combinations of syntax.


As a point of fact, it is not a single pass compiler.


In what sense?


How many senses are there?

It parses, then does "bytecode" generation, then fiddles with the bytecodes a bit. So it does multiple passes.


Well... it parses, and after parsing each group of tokens runs a function. And yes, it replaces some bytecodes, but its lack of an AST means it can only post-process the bytecode.

And everyone I've spoken to refers to it as single-pass.


Post processing the bytecode is what makes it multi-pass. "One-pass" means that it literally makes one pass over the code. Whether it uses an AST or not isn't really relevant.
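As a toy illustration of that distinction (this is a hypothetical sketch in Python, not how PHP's engine is actually implemented): pass 1 emits bytecode straight from the tokens with no AST, and pass 2 walks the emitted bytecode again to fold constants. The opcode names and the postfix input format are invented for the example.

```python
def generate(tokens):
    # Pass 1: walk the source tokens once and emit bytecode directly, no AST.
    code = []
    for tok in tokens:
        if tok == "+":
            code.append(("ADD",))
        elif tok == "*":
            code.append(("MUL",))
        elif tok.isdigit():
            code.append(("PUSH", int(tok)))
        else:
            code.append(("LOAD", tok))
    return code


def fold_constants(code):
    # Pass 2: walk the emitted bytecode and rewrite PUSH a, PUSH b, ADD
    # into a single PUSH a+b.
    out = []
    for op in code:
        if (op == ("ADD",) and len(out) >= 2
                and out[-1][0] == "PUSH" and out[-2][0] == "PUSH"):
            b = out.pop()[1]
            a = out.pop()[1]
            out.append(("PUSH", a + b))
        else:
            out.append(op)
    return out


tokens = "2 3 + x *".split()  # postfix form of (2 + 3) * x
bytecode = fold_constants(generate(tokens))
print(bytecode)
```

The second function never looks at the source text, only at the output of the first, which is what makes the pipeline more than one pass even without an AST.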


I think probably the biggest hurdle for a lot of devs is really a fix on naming conventions and argument order. I think this could be accomplished with a large aliased namespace. maybe something like

  use \stdlib as _;

  _\array_walk(<array>, <callback>)
  _\array_map(<array>, <callback>)
  _\array_reduce(<array>, <callback>)
  ...


    the/worst/php/travesti/is/namespaces


You\mean\travesty\obviously


Backslash.

And honestly, it's really not bad.


Not getting the problem either, working with PHP namespaces daily.


They couldn't have made it a forward slash and you damn well know that.


I hated \ as well when I first heard about it, but frankly, after actually working with it, it doesn't matter one bit.



