Pyston-lite: our Python JIT as an extension module (pyston.org)
125 points by kmod on June 8, 2022 | 31 comments



I love how easy this is to try out!

I use pipenv, so I ran this:

    pipenv shell --python=python3.8
    pip install pyston_lite_autoload

And it worked: I got a small but material speed improvement from a tiny benchmark I ran against my own project: https://simonwillison.net/2022/Jun/8/pyston-lite/


My very unscientific test (just running the test suite on the Django project I happen to have up at the moment) proved to be very slightly slower with it installed: 11.2s - 11.3s without Pyston, 11.7s - 11.8s with Pyston.


A test suite is designed to call every routine in a project a limited number of times. This means the JIT compiler pays the cost of compiling each routine, only for the compiled code to run a few times and then be thrown away.

Realistic loads might have the same routines run more intensely and might see more benefit from the JIT compiler.
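
A rough way to see this effect is to time the same function in successive batches. This is a minimal sketch: the function `work` and the batch sizes are hypothetical stand-ins, not taken from any real project. On stock CPython the batches should cost about the same; under a JIT, the first batch may absorb the compilation cost and later ones may run faster.

```python
import time


def work(n):
    """Hypothetical hot function standing in for real application code."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_batches(func, arg, batches=5, calls_per_batch=200):
    """Return the average seconds per call for each successive batch."""
    averages = []
    for _ in range(batches):
        start = time.perf_counter()
        for _ in range(calls_per_batch):
            func(arg)
        elapsed = time.perf_counter() - start
        averages.append(elapsed / calls_per_batch)
    return averages


if __name__ == "__main__":
    for index, avg in enumerate(time_batches(work, 1000)):
        print(f"batch {index}: {avg * 1e6:.1f} us per call")
```

A test suite mostly resembles the first batch, which is why it is a poor proxy for a long-running service.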


Ahhh, of course. Good point, thanks.


As someone else mentioned, test suites are a bit of a tough case for JITs. That said, we really don't want to slow down any workloads, and I suspect that something might be going wrong if you are getting a measurable slowdown.

Is this project public? I'd love to investigate.


It's not, but after reading the other answer here I can totally see why it might be slightly slower. I'll definitely do some more testing with it beyond just timing my test suite runs.


See also upcoming Python 3.11 performance improvements

https://news.ycombinator.com/item?id=31642793


Pyston is substantially faster for our internal benchmarks than Py3.11

The only downside is it’s a 3.8 fork, which fortunately isn’t an issue for us.


Can someone ELI5 how this works as an extension module ? Is it possible to do a similar thing with pypy ?


I haven't read their source code, but the API extensions defined in PEP 523[1] let extension modules (written in C or C-level languages) replace the default interpreter loop[2] with a custom one. I suspect they're using this mechanism to replace the default interpreter loop with their custom one when the module is imported and the enable function is called.

[1]: https://peps.python.org/pep-0523/

[2]: https://github.com/python/cpython/blob/main/Python/ceval.c


You inspired me to take a look at their code.

The autoload module at https://github.com/pyston/pyston/blob/96d5d33186b81f96ce3d9a... just calls:

    __import__("pyston_lite").enable()

It took some digging, but it looks like that enable() method is defined in the C code here: https://github.com/pyston/pyston/blob/96d5d33186b81f96ce3d9a...

It calls jit_start() which I think is here and does more of the interesting work: https://github.com/pyston/pyston/blob/69b190003f14dfd2f6d276...
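
To opt in explicitly rather than rely on the autoload package, the same call can be wrapped in a guard. This is only a sketch: `pyston_lite.enable()` is the call quoted above, while the helper name `try_enable_jit` and the fall-back behaviour are assumptions made here for illustration, not part of the project.

```python
import importlib


def try_enable_jit(module_name="pyston_lite"):
    """Enable the JIT if the extension is installed; return True on success.

    try_enable_jit is a hypothetical helper, not part of pyston_lite.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Extension not installed (or no build for this Python): run as usual.
        return False
    enable = getattr(module, "enable", None)
    if not callable(enable):
        # The module exists but does not expose the expected entry point.
        return False
    enable()
    return True


if __name__ == "__main__":
    print("JIT enabled:", try_enable_jit())
```

Calling this once at program start-up keeps the application working on interpreters where the extension is unavailable.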


> Is it possible to do a similar thing with pypy ?

The predecessor of Pypy is called Psyco http://psyco.sourceforge.net/

To enable the JIT in Python 2.x, simply `import psyco; psyco.full()`

It was fun while it lasted.

For Pypy3, check out https://github.com/fijal/jitpy



The main difference is that Pyston-Lite is a CPython plugin, while MyPyC is an AOT compiler that emits CPython extension modules.

But if you're asking about performance and use cases, there are a lot of Python compilers and alternative implementations out there now!

Not just MyPyC, but also Nuitka, Shedskin (not sure if still in development), GraalPython, PyPy, Cinder, Pyjion, IronPython (not sure if still in development), and even Stackless Python (yes, it's still in development). Not to mention Cython and, for specific numerical tasks, Numba.

I think Microsoft also recently announced a CPython JIT extension that's analogous to Pyston-Lite.

A benchmark or comprehensive comparison suite would be pretty cool, but I'm not aware of one. Personally I think CPython plugins are the most user-friendly option, because they require the fewest changes in your deployment setup and runtime environment. PyPy isn't that bad but isn't perfect either. Ahead-of-time compilation, by contrast, is a pretty big change from the usual Python developer workflow, and it might not be easy to convince a team to adopt it.


Is there a reason for the use of system() here? https://github.com/pyston/pyston/blob/69b190003f14dfd2f6d276...

Seems easier to use the C functions to do this, rather than rely on system commands.


That is definitely unportable code that is not intended for anyone but developers to use.


With the large performance increases brought by the more recent versions of Python, it's not clear to me that installing this is faster than just upgrading Python. It is simpler though, I'll give you that. Is it also faster with more recent versions?


I believe the article answers all your questions!

(Also, IMO, it is much easier to install Pyston-lite than to upgrade a whole Python version, which is an absolute nightmare in some of my specific scenarios with somewhat embedded hardware)

They are currently stuck on 3.8, but believe they will be able to support multiple versions easily because pyston-lite is much easier to develop for than pyston (because it is an extension and not a whole CPython fork).

They also present benchmarks at the very end comparing Pyston, Pyston-lite and the latest CPython 3.11.0b3, all compared to Ubuntu's default Python 3.8.10. It shows that while CPython offers ~8-15% improvements (macro-micro benchmarks), Pyston-Lite offers 8-27% improvements and "OG" Pyston offers 35-65% improvements.

So compared to regular CPython you might get an extra 10% or so, which might or might not be worth it depending on your case. Excited to see what happens if they port Pyston to Python 3.11, though!
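
The "extra 10% or so" can be checked with quick arithmetic. This is a sketch, assuming the quoted percentages are speedups over the 3.8.10 baseline and that the low ends and the high ends of each range pair up (macro with macro, micro with micro). Both are assumptions about how to read the figures, not something the article states in this form.

```python
def relative_gain(candidate_speedup, reference_speedup):
    """Extra speedup of candidate over reference.

    Both arguments are fractions over the same baseline
    (0.27 means 27% faster than the baseline).
    """
    return (1 + candidate_speedup) / (1 + reference_speedup) - 1


# Figures quoted above, as (low end, high end) of each range.
CPYTHON_311 = (0.08, 0.15)
PYSTON_LITE = (0.08, 0.27)
PYSTON_FULL = (0.35, 0.65)

if __name__ == "__main__":
    for name, figures in (("Pyston-lite", PYSTON_LITE), ("Pyston", PYSTON_FULL)):
        low = relative_gain(figures[0], CPYTHON_311[0])
        high = relative_gain(figures[1], CPYTHON_311[1])
        print(f"{name} vs CPython 3.11: {low:.1%} to {high:.1%}")
```

Under those assumptions, the upper end for Pyston-lite comes out at roughly 10% over 3.11, which is consistent with the estimate above.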


Pyston is neat, but I'm a bit unclear. Why should we invest time and effort into more proprietary shenanigans?


Is it proprietary? According to their blog, Pyston was open sourced with version 2.2 [0], and the LICENSE file in the repo [1] appears to be the same as upstream CPython.

Not a rhetorical question BTW. A pluggable JIT for Python could be a boon for some projects at my day job, but if it's proprietary that would put a bit of a damper on things.

[0]: https://blog.pyston.org/2021/05/05/pyston-v2-2-faster-and-op...

[1]: https://github.com/pyston/pyston/blob/pyston_main/LICENSE


This is a drop-in replacement for the CPython bytecode interpreter, right?

Didn't Microsoft also recently announce something similar? I could have sworn I saw a thread about it here recently, but I couldn't find it again. Or was that the Facebook project Cinder?

There are so many "high performance Python" projects now (and in the past) that it's hard to keep track!


Thanks, I didn't know Pyston had been de-proprietarized and open sourced.


My guess is that all these major tech companies are seeing Microsoft, Google, and Facebook gain a lot of recruiting clout and general industry goodwill through their other successful open-source projects.

So now there is legitimate incentive to develop and get adoption for your own "faster Python" project, which (if successful) could end up being used by millions of developers around the world. Imagine the clout and name recognition that would come with being the company that finally made Python fast after 20+ years of failed attempts. Not to mention all the free labor (bug reports and PRs) from the open-source community, all while optimizing the tool to serve their own internal technical needs above all else.

That's my only explanation for why these companies are all DIYing their own project and not contributing to existing efforts like PyPy and HPy. I think ultimately this work is all good for the Python ecosystem, but clearly I'm a bit cynical and skeptical of large tech companies' motivations in general.


Maintaining internal forks is time consuming and expensive. For big tech companies they might end up saving money on infrastructure but small companies can end up sinking a lot of human resources into the project (especially if you're playing whack-a-mole with upstream changes moving in different directions and breaking things)

Open sourcing things increases the likelihood it'll be upstreamed (the use case is more obvious) and increases the likelihood you'll get support (although you now have a community to support). In addition, it's easier to plead your case if upstream breaks something (here's my source vs. vague statements about an internal thing at company X).


Because it’s been decades, and the Python team has repeatedly made it clear that interpreter source simplicity is more important than performance. They’ve admitted it many times.


It might finally be changing, though! We'll have to see how the following releases go


Yes, it might be, but until then, some of us have to use the options that exist.


Also single-threaded performance over parallelism.


Why not just write the entire project in Cython?


Why not just write the entire project in (pick a language)?

Sometimes you have a Python project already and you aren't planning to write a new one. In fact, there's a ton of non-Cython Python in existence. Every time I use a new library, your solution is to rewrite it in Cython?

Lots of ways to speed up python. It's great to have this option, which will work for some without having to change much of anything. Cython, Numba, C (etc.) extensions, ... All are good, too, where they fit into a project and development cycle.


Cython's error messages are terrible. That's one reason it's a great tool, but not a great language to use on its own.



