Hacker News
Package reproducibility in Python notebooks using uv isolated environments (marimo.io)
37 points by mscolnick on Sept 16, 2024 | 14 comments


Stuff like this is so convenient and intuitive that it makes you wonder why we didn't start here in the first place!

One thing I wonder is how many data scientists will use this feature, given that it is not enabled by default (which is understandable; it would be messy for every notebook to have a venv) and is only available via command-line arguments.

I guess this is easily remedied by helpful beginner tooltip UX ("This notebook has several dependencies. Would you like to build it in a sandbox?").


We would like to make this configurable via a setting so you don't need to pass it through the CLI every time. And I agree, maybe a tooltip/notification to nudge them in that direction.

We would also like to auto-detect a pyproject.toml, and use that when desired.


Seems like an admirable goal, and the use of uv is great! I'm wondering if there's any possibility of integrating this sort of package management into a project like Quarto, which already has a strong ecosystem of plugins and outputs for its scientific notebooks. Quarto notebooks are also Markdown-based and can use Jupyter as a back end for notebook rendering.


Nice!

I haven't tried this yet, but I love that the functionality of Jupytext is also incorporated, so I guess you get to reproduce the whole end product and all dependencies just from a plain-text script that you can track in Git.


How've you found `uv` as a package manager? I've generally been a fan of Astral's tools in the Python ecosystem and I'm considering making the switch from `poetry`.


It's an absolute beast.

They still need to cover a couple of minor but painful paths that people commonly rely on package managers for (private indexes, binding packages to third-party indexes...), but if you don't need fancy corner cases, it's the best thing ever.


For any commercial company, those are not corner cases!


Also any machine learning use cases! There’s a big caveat with uv where they merge your Pipfile and pip installations. For example, if you do uv add flask (this is added to the Pipfile) but then you need to add PyTorch with a custom index (because most ML libraries distribute driver-specific versions) and you do uv pip install PyTorch --index-url, what you end up with is a Pipfile.lock that merges the two. uv won’t automatically add your dependency to the Pipfile.

https://pytorch.org/get-started/locally/


There's an issue about pytorch (I think you might already have seen it since you encountered this). This is still due to lack of support for pinning a package to a third party index (see https://github.com/astral-sh/uv/issues/171).

I am not sure I totally understand what you mean: yes, the lockfile might be messed up, but is the environment working? I'm a bit blessed because I don't have CUDA on my machine so I save myself a lot of hassle.

I am trying to submit an improvement to the docs about this. https://github.com/astral-sh/uv/pull/6523 I'd love to hear your feedback.


I think the docs should be clearer that the uv pip command is an escape hatch, and that the moment you use it you basically lose the ability to have the Pipfile as a source of truth. Most devs won't be looking too closely at the lockfile unless it's to debug issues; they usually only refer to the Pipfile. It would also be nice to have a pip to uv/rye command converter like those curl command generators.


I wouldn't say it's an escape hatch: pip and poetry/pdm/uv serve different purposes. I am not sure what you are referring to with `Pipfile`. As for the "command converter": do you mean something like from `uv pip` to `uv add`?

BTW, there is a proposal to standardise the lockfile format in Python: https://discuss.python.org/t/pep-751-lock-files-again/59173/...


Yes, a command converter as you described (that supports converting all the pip flags and command parameters) would be very useful. At the end of the day, you need a (readable) single source of truth somewhere (i.e. not the lockfile, the lockfile is fine for deployment and operations but not for developers). As a developer, I don't want the `uv pip` commands to mess up my environment, but at the same time, 90% of the Python world assumes dependency management is done with pip, and when I am installing dependencies, I don't really have time to chase down the exact flags that I should be using to add it to the Pipfile and to ensure uv compatibility.


You are correct - I should've been more specific. Binding packages to a third-party index is currently unsupported (see issue: https://github.com/astral-sh/uv/issues/171), as it is something that pip itself does not support. PDM/Poetry implemented this on their own - though it's an important feature, and as far as I can tell it's coming.


It is absolutely blazingly fast, and surprisingly so. There are a few small pain points, but the speed increase makes it absolutely worth it for many cases.



