I have a feeling that we're making things for "human augmented" AI developer agents! Observable has existing AI integration; to me it seems like this could just be a more easily composable wrapper for it to make use of! Your strategy assessment didn't sit right without AI. Thanks for the nice write up.
Thanks for your writeup on Observable Framework. I had bookmarked both Observable and Observable Framework before, but had not started looking into the details.
Today I was beginning to look at how to host a static Jupyter Notebook, or host it interactively with WASM.
But actually I think that for most of my purposes Observable Framework will be a better fit.
My issue with Observable is that it appears to be the examples-resource for d3 [0], but you can't just copy-paste the code because it is designed to run in that framework.
And it's not like d3 is so easy to use that you can use it without examples, especially considering that changes between versions are often incompatible.
But apart from this, there's a lot of incredible graphics to find on the site.
100%. Was never able to get past the fact that it's slightly just not quite actual JS. Close enough to the base language that it seems they could have easily used JS, probably with some additional APIs for showing graphics.
Some community members have made available resources for converting Observable flavoured JS to vanilla (it mostly involves rewriting top-level cell definitions):
With Alex Garcia's dataflow it has been possible for some time to self-host observablehq notebooks without proprietary software: https://github.com/asg017/dataflow
This does not seem to be an issue with Observable (more an issue with d3 that it has not chosen to have copy-pastable examples elsewhere), but in any case this comment does not seem relevant to this post, as this post is precisely about how the new Observable Framework removes some of the earlier problems with Observable notebooks (“It’s all just standard JavaScript now—no custom syntax” etc).
Like if Microsoft for some reason decided to put all their Azure examples in files that opened in MS Paint, imagine a “My issue with Paint is that…” comment like the above. You should judge MS Paint by how good it is for its stated purpose, not by how it gets in the way when you're not trying to use it. It makes no sense to say “My issue with X is that Y uses it and...” (that would be an issue with Y), especially on a post that's about Z and how it's different from X.
In this case, Observable Framework is an open-source static-site generator that runs on your machine and uses standard JavaScript, so I really don't understand what the connection is to specific hosted Observable notebooks.
(But I guess it's of some interest/disappointment to the Observable folks that at least some people are encountering their Observable project mainly in the context of looking for d3 examples.)
I have a setup with poetry that runs the python data loaders in the poetry-managed virtualenv.
I just created a python project, and then instead of running `yarn run dev` to start the dev server, I run `poetry run yarn run dev` so the python is executed within the virtualenv.
This setup also lets you use a custom python package to define reusable and unit-testable code for the dataloaders that you can import into the *.json.py files to keep those really simple.
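A minimal sketch of that split, assuming a hypothetical helper `summarize` that would normally live in your own package (it is inlined here so the example runs by itself) and a hypothetical loader file name `data/summary.json.py`. Data loaders hand their result to Framework by writing to standard output, so the loader itself only has to print JSON:

```python
# data/summary.json.py (hypothetical file name)
import json
import sys
from collections import Counter


def summarize(rows):
    """Count rows per category.

    In a real project this would be imported from your own package
    so it can be unit tested separately from the loader.
    """
    counts = Counter(row["category"] for row in rows)
    return [{"category": name, "count": n} for name, n in sorted(counts.items())]


def main():
    # Placeholder input; a real loader would query an API, database, or file.
    rows = [
        {"category": "a"},
        {"category": "b"},
        {"category": "a"},
    ]
    # The loader's stdout becomes the contents of data/summary.json.
    json.dump(summarize(rows), sys.stdout)


if __name__ == "__main__":
    main()
```

With the dev server started through `poetry run`, the loader should pick up the virtualenv's interpreter and whatever packages are installed there, and the unit tests only need to exercise `summarize`.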
It’s totally possible to decouple these if your python outputs plain JSON/csv into the data/ directory that you commit into the repo or generate just before build time. Then you can import that raw json data into an Observable .md file.
But if you want dynamically generated data at build time and want to use Observable’s automatic execution of data loaders such as data/*.json.py, while still maintaining a custom virtualenv for the project rather than your system python, you’ll need some way to specify that virtualenv’s interpreter while observable executes the build for the dev server or the full dist/ output.
So for both options it’s largely a matter of taste. I personally like using the poetry virtualenv because it’s simple to manage dependencies and the venvs in one tool, while letting me use observable’s dataloaders with third-party or custom python packages. It sounded like the parent comment wanted to use this type of approach so I focused on that scenario specifically. I like the simplicity of the single command to generate the data and build the site.
Couldn't you solve that issue with nodeenv in python? This is how I usually add JS to my projects. It will keep node & npm/yarn/whatever else and your JS in the venv as well.
You can. But then the only time it realizes that the code has been updated is when you update the script or touch it. It’s a minor annoyance but it adds up when making lots of changes. Periodically deleting the cache works too but is also annoying.
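One way to take some of the annoyance out of this, sketched under the assumption that a loader is re-run once its file's modification time is newer than the cached output (which is what the "touch it" workaround relies on): a small script that touches every Python loader after you edit shared package code. The `data` directory and the `*.py` pattern are assumptions about the project layout, and `touch_loaders` is a made-up helper name.

```python
import sys
from pathlib import Path


def touch_loaders(data_dir, pattern="*.py"):
    """Bump the modification time of every loader script under data_dir.

    Returns the touched paths in sorted order; returns an empty list if
    the directory does not exist.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return []
    touched = []
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            path.touch()  # updates mtime without changing the contents
            touched.append(path)
    return touched


if __name__ == "__main__":
    # Default to ./data, or pass another directory as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "data"
    for path in touch_loaders(target):
        print(f"touched {path}")
```

It is still a manual step, but it is one command instead of hunting for the right loader or clearing the whole cache.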
I recently finished my first "in anger" project with an Observable notebook. This involved learning Observable Plot, Arquero, relearning bits of Javascript, and integrating it with a Rust based simulator that's my data generation process.
It's honestly been really wonderful. Learning all of those tools has taken some significant energy, and I'm missing some functionality I'd love around parameterizing my data generator, but the final notebook is beautiful and functional.
Using markdown and reactivity makes notebooks like this actually feel usable. Jupyter's custom format made version control a giant pain and without reactivity your iteratively designed notebook easily becomes a write-only, stateful mess. I've also tried making this work using Quarto and their Observable integration and it was hacky and piecemeal.
Genuinely, this was the first time I've been pleasantly surprised and excited to write a notebook and share it with others. I'm sure there will be more sharp edges, but it's become my first choice notebook tool after this project.
For those looking for an alternative to Quarto, check out the recently released Living Papers for authoring reactive/static documents from a single source:
If you want to quickly try and tinker with Framework in your browser, I've set up some Codespace devcontainers that automatically configure Node and Python environments here:
I think the question is whether you are likely to be more productive with python and its attendant ecosystem or with js and the packages it has available (most saliently D3 etc.)
:D I mean personally I prefer Observable because my interest is primarily in data presentation. I imagine if you're more on the analysis side of things (tidyverse etc) Jupyter would still be the way to go.
> Everything in a code block with the js content hint will be executed in the users browser immediately. If you want to show the code you have to hint 'js echo'
Am I the only one thinking that it would have been better for backwards compatibility if it were the other way around? I.e., having an opt-in code hint like 'js exec' that runs code in a user's browser and leaving the widely used 'js' hint alone? The way this is currently set up, you cannot integrate that renderer in an existing app without having to manage where it is allowed to run.
We’re planning to allow changing the default options for blocks (either per-page in front matter or across an entire project using the project config); you could then make `js run=false` the default and `js run` to opt-in to live code as you wish. But we chose to make live code the default since that’s our primary use case.
Yes, it has the same issue as e.g. automatically rendering Mermaid diagrams on GitHub: now you can't just show a block of Mermaid code, without dropping the language annotation.
I spent a night going down a rabbit hole with observable framework and it was terrific! It more or less just got out of the way and I was able to visualize and explore my google maps history in detail. Some of the data loader environment stuff wasn't especially clear but running in a poetry env did the trick for python.
I love kotlin and tried creating a data loader for a kotlin script but that had some rough edges. Kotlin expects script files to be named foo.main.kts but observable expects executable shebang loaders to have a foo.exe extension. So I created a proxy exe script to call the kotlin script, but it then doesn't trigger auto reloads of the data.
A bit of friction compared to marimo or jupyter is using variables between data loaders and the notebook. For example, I want to use the date picker view component to change the range of data fetched by my loader. It's not clear how to do that, so exploratory analysis is slowed down a little. I'm aware this goes against the paradigm but just wanted to point it out. It ends up with you potentially moving a lot of the data munging to the notebook as you explore, which isn't ideal from a performance perspective.
One last thing is I wish you could define dataloaders inline. I'm a big fan of single files, so being able to just add a python code block and let Framework extract that as a file would be a nice little QoL improvement.
Still the early days, but Framework seems promising! I'd love to have all my markdown notes running through it to get a sort of org-mode type situation without going full emacs.
Thanks for the feedback. We have a PR open to make it easier to register new interpreters (without needing to fall back to .sh or .exe); it’ll let you specify the interpreter associated with a given file extension (e.g., .kts for Kotlin). https://github.com/observablehq/framework/pull/935
As for inputs-driving-data-loaders, that does go against the grain a bit since Framework favors static data snapshots so that the built site is self-contained and performant. But a technique that works well is to generate Parquet files in data loaders representing the superset of data that you want to interact with, and then use DuckDB/SQL in the client to extract the subset you want to visualize. This tends to perform well, though obviously it’s dependent on the size of the superset you want to interact with.
But I didn't try the new Observable Framework - interesting to see similar examples where it queries a database live. I hope that preloading and caching all the data is not the only option because these types of apps should be interactive. Ideally, it should expose SQL for live editing.
I just feel like they are limiting their user base by only supporting Javascript.
It's of course the de-facto language for interactive display in browsers. The use case for dashboards and data visualisation is clear. But it's an awful language for data science and data analysis, compared to Python or R.
One of the neat new features of Observable Framework is you can drop in a build script to gather the data that's written in any language you like. https://observablehq.com/framework/loaders
So you absolutely can do the data processing step in R or Python and have that output JSON or CSV which is then visualized at the end using JavaScript.
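As a rough sketch of what such a loader could look like in Python (the file name `data/daily.csv.py`, the sample records, and the column names are all made up for illustration): do the aggregation in Python, then write CSV to standard output for the page to chart with JavaScript.

```python
# data/daily.csv.py (hypothetical file name)
import csv
import io
import sys
from collections import defaultdict


def daily_totals(records):
    """Sum the 'value' field per 'date' and return rows sorted by date."""
    totals = defaultdict(float)
    for record in records:
        totals[record["date"]] += float(record["value"])
    return [{"date": day, "total": totals[day]} for day in sorted(totals)]


def to_csv(rows, fieldnames=("date", "total")):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder data; a real loader would do the heavy analysis here.
    sample = [
        {"date": "2024-01-01", "value": "1.5"},
        {"date": "2024-01-02", "value": "2"},
        {"date": "2024-01-01", "value": "0.5"},
    ]
    sys.stdout.write(to_csv(daily_totals(sample)))
```

The same shape works with pandas or an R script; the only contract the example relies on is that the loader prints the finished file to stdout.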
Not a small feature, but I bet it would be possible to use WebAssembly to add support for Markdown blocks that get executed in other languages as well, using Pyodide for Python for example.
> It's of course the de-facto language for interactive display in browsers
This is it, more or less.
It is far, far easier to build an app like this where you want a plethora of users as a web application than a native one, for instance.
For anything JavaScript is missing as a runtime or language, WASM can fill the gap as well. For math and data science, WASM is a natural choice for any missing pieces.
I used Pyodide (https://pyodide.org/en/stable/) as a python execution environment in browser. That was pretty successful. There are also python libraries that let you generate a config which is later made responsive via a JS library. Pyodide runs in wasm.
In Observable Framework, data work can be done beforehand with Python or anything else really. JavaScript enjoys the best integration with the browser, which is hard to deal with from the server side.
I myself like the self-contained aspect of it, since I can publish static files. Also, D3 is the pioneering library for data viz on the web. Especially with maps, which is what I've used it for back in the day. Time to refresh those skills.
Take a look at my dashboard example, it should help show why this is useful. You can get a lot done with very little combined Markdown and JavaScript - building the same thing in HTML plus JavaScript would have taken a bunch more code.
This looks fantastic! I’ve been waiting for something like this.
My only gripe is that data loaders don’t seem to support Parquet files, which is really annoying.
There’s an interesting possibility here where you can have large datasets in Parquet, exposed via HTTP, whilst being generated at build time with all the benefits that gives you (being able to read only specific columns, filtering via row group statistics etc). Not dissimilar to the “SQLite over http” WASM demo I guess.
Because right now I need to take my large, nicely compressed dataset and export it as either a CSV or a zip file? And then the browser needs the entire thing, even if I’m just viewing a subset of the data? Which is much bigger and much slower than it needs to be.
If you want to run a data loader that outputs Parquet, there are plenty of ways to do that - I would suggest a Bash or Python script that wraps DuckDB.
Just wondering, can one embed an Observable Framework page into another site? Or does it have to be a separate static site, as demoed on the website?
That's an unhelpful articulation that runs the risk of having the opposite effect from the one you want.
That would be a pity, because Quarto is really good. I haven't tried Observable yet, but in outline they have some similarities:
1. Documents written in Markdown
2. Ability to embed code blocks in the Markdown, with code executed when the document is rendered.
3. Ability to embed output of the code blocks in the rendered result (e.g. tables, charts).
4. Ability to render to multiple formats (pdf, static site, ...).
Quarto supports Python and R as languages in the code blocks (maybe more, not sure). I personally prefer it to Jupyter notebooks because the source is plain text so (1) there's a choice of editor and (2) moving between text and code blocks is seamless.
I can't say Quarto is better than Observable but it is good. It has depth from its history in RMarkdown (like rendering mathematical equations, naming & cross-referencing).
It's certainly worth consideration for anyone looking for a "code notebook" solution.
I don't know much about observable, but it seems like they might be hijacking d3 a bit too much for my taste - it makes me a bit nervous about the future of d3.
Observable is from the same creator as D3 (Mike Bostock) and D3 has been a core component of the Observable platform since they first launched their notebook product back in 2018.
I don't see Framework changing things there - if anything the ISC license should make it a better partner for D3.
It brings together d3, Observable, Observable Plot, HTL and layers on a bunch of new ideas as well.