Interesting ideas in Observable Framework

simonw · on March 3, 2024

In a way, Observable Framework is the Avengers: Endgame of the Mike Bostock Cinematic Universe.

It brings together d3, Observable, Observable Plot, HTL and layers on a bunch of new ideas as well.

drewda · on March 4, 2024

For what it's worth, Polymaps is probably still my favorite of his creations...

NelsonMinar · on March 4, 2024

Not to forget: Polly-B-Gone https://cs.stanford.edu/people/mbostock/polly/

luke-stanley · on March 4, 2024

I have a feeling that we're making things for "human augmented" AI developer agents! Observable has existing AI integration; to me it seems like this could just be an easier-to-compose wrapper for it to make use of! Your strategy assessment didn't sit right without AI. Thanks for the nice write-up.

codetrotter · on March 3, 2024

Thanks for your writeup on Observable Framework. I had bookmarked both Observable and Observable Framework before, but had not started looking into the details.

Today I was beginning to look at how to host a static Jupyter Notebook, or host it interactively with WASM.

But actually I think that for most of my purposes Observable Framework will be a better fit.

ak39 · on March 4, 2024

Thanks for the easily understandable TLDR!

qwertox · on March 3, 2024

My issue with Observable is that it appears to be the examples-resource for d3 [0], but you can't just copy-paste the code because it is designed to run in that framework.

And it's not like d3 is so easy to use that you can use it without examples, especially considering that changes between versions are often incompatible.

But apart from this, there's a lot of incredible graphics to find on the site.

[0] https://observablehq.com/@d3/gallery

iansinnott · on March 4, 2024

100%. Was never able to get past the fact that it's just slightly not actual JS. It's close enough to the base language that it seems they could have easily used JS, probably with some additional APIs for showing graphics.

simonw · on March 4, 2024

That's one of the biggest features of Observable Framework compared to regular Observable: it's just vanilla JavaScript now.

iansinnott · on March 4, 2024

Fair point, I hadn't made the distinction in my mind. Perhaps that resolves my previous qualms.

dleeftink · on March 4, 2024

Some community members have made available resources for converting Observable flavoured JS to vanilla (it mostly involves rewriting top-level cell definitions):

[0]: https://observablehq.com/@bumbeishvili/convert-observable-co...

btbuildem · on March 3, 2024

Yup, I find that incredibly frustrating -- it's a platform lock-in that any corp would be proud of.

I've had this gripe more with ObservableHQ notebooks -- great examples and a pointless resource all at the same time.

This framework effort seems to be a bit more open though (at least you can self-host), so I'm keeping an eye on it.

kragen · on March 4, 2024

with alex garcia's dataflow it has been possible for some time to self-host observablehq notebooks without proprietary software: https://github.com/asg017/dataflow

i, embarrassingly, haven't tried it

svat · on March 4, 2024

This does not seem to be an issue with Observable (more an issue with d3 that it has not chosen to have copy-pastable examples elsewhere), but in any case this comment does not seem relevant to this post, as this post is precisely about how the new Observable Framework removes some of the earlier problems with Observable notebooks (“It’s all just standard JavaScript now—no custom syntax” etc).

svat · on March 5, 2024

Like if Microsoft for some reason decided to put all their Azure examples in files that opened in MS Paint, imagine a “My issue with Paint is that…” comment like the above. You should judge MS Paint by how good it is for its stated purpose, not by how it gets in the way when you're not trying to use it. It makes no sense to say “My issue with X is that Y uses it and...” (that would be an issue with Y), especially on a post that's about Z and how it's different from X.

In this case, Observable Framework is an open-source static-site generator that runs on your machine and uses standard JavaScript, so I really don't understand what the connection is to specific hosted Observable notebooks.

(But I guess it's of some interest/disappointment to the Observable folks that at least some people are encountering their Observable project mainly in the context of looking for d3 examples.)

llimllib · on March 3, 2024

Framework is also super easy to publish to a github site, I wrote up a note with the steps and a sample github action: https://notes.billmill.org/programming/observable_framework/...

willmeyers · on March 3, 2024

The author’s spot on about Framework.

I tried out Observable Framework and built a little interactive plot (https://github.com/willmeyers/observable-ssta). It was incredibly easy to set up and get data plotted.

My only gripe is that I wish you could configure Python data loaders to use virtualenvs.

learned · on March 3, 2024

I have a setup with poetry that runs the python data loaders in the poetry-managed virtualenv.

I just created a python project and then instead of `yarn run dev` to start the dev server, just run `poetry run yarn run dev` so the python is executed within the virtualenv.

This setup also lets you use a custom python package to define reusable and unit-testable code for the dataloaders that you can import into the *.json.py files to keep those really simple.
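
A minimal sketch of what such a loader might look like, assuming it is saved under a name like `data/summary.json.py` and writes its JSON to standard output. The file name, the sample rows, and `build_summary` are all hypothetical; in the setup described above, the helper would live in your own package and be imported rather than defined inline.

```python
"""Sketch of a data loader, e.g. saved as data/summary.json.py (hypothetical name)."""
import json
import sys


def build_summary(rows):
    """Stand-in for logic that would live in your own importable, unit-tested package."""
    total = sum(row["value"] for row in rows)
    return {"count": len(rows), "total": total}


def main(out=sys.stdout):
    # Hypothetical sample rows; a real loader would read an API, database or file here.
    rows = [{"name": "a", "value": 1}, {"name": "b", "value": 2}]
    # The loader emits its result as JSON on the given stream (stdout by default).
    json.dump(build_summary(rows), out)


if __name__ == "__main__":
    main()
```

Keeping the loader this thin means the real logic can be tested with ordinary unit tests, independent of the site build.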

eacapeisfutuile · on March 4, 2024

Why do you need to bundle these, is it to simplify iterating on frontend and data loaders simultaneously? Why not run them separately?

learned · on March 4, 2024

It’s totally possible to decouple these if your python outputs plain JSON/csv into the data/ directory that you commit into the repo or generate just before build time. Then you can import that raw json data into an Observable .md file.

But if you want dynamically generated data at build time and want to make use of Observable’s dataloader automatic execution of data/*.json.py, for instance, while still maintaining a custom virtualenv for the project rather than your system python, you’ll need some way to specify that virtualenv’s interpreter while observable executes the build for the dev server or the full dist/ output.

So for both options it’s largely a matter of taste. I personally like using the poetry virtualenv because it’s simple to manage dependencies and the venvs in one tool, while letting me use observable’s dataloaders with third-party or custom python packages. It sounded like the parent comment wanted to use this type of approach so I focused it to that scenario specifically. I like the simplicity of the single command to generate the data and build the site.

eacapeisfutuile · on March 4, 2024

Thank you for the thorough response, that makes sense

sberder · on March 4, 2024

Couldn't you solve that issue with nodeenv in Python? This is how I usually add JS to my projects. It will keep node & npm/yarn/whatever else and your JS in the venv as well.

simonw · on March 3, 2024

Can you put a shebang line in a .sh data loader that points to the full path to bin/python within the virtual environment directory?

timmattison · on March 3, 2024

You can. But then the only time it realizes that the code has been updated is when you update the script or touch it. It’s a minor annoyance but it adds up when making lots of changes. Periodically deleting the cache works too but also annoying.
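
If the cache really is refreshed only when the loader file itself changes, as described above, one workaround is to bump the loader's modification time from whatever script edits the underlying code. This is only a sketch: `touch_loader` is a made-up helper and the path in the usage note is hypothetical.

```python
from pathlib import Path


def touch_loader(path):
    """Update the file's modification time, like the shell `touch` command."""
    loader = Path(path)
    loader.touch(exist_ok=True)
    # Return the new modification time so callers can log or compare it.
    return loader.stat().st_mtime
```

For example, `touch_loader("data/example.json.sh")` could be called after editing the Python module that the shell loader wraps.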

tel · on March 4, 2024

I recently finished my first "in anger" project with an Observable notebook. This involved learning Observable Plot, Arquero, relearning bits of Javascript, and integrating it with a Rust based simulator that's my data generation process.

It's honestly been really wonderful. Learning all of those tools has taken some significant energy, and I'm missing some functionality I'd love around parameterizing my data generator, but the final notebook is beautiful and functional.

Using markdown and reactivity makes notebooks like this actually feel usable. Jupyter's custom format made version control a giant pain and without reactivity your iteratively designed notebook easily becomes a write-only, stateful mess. I've also tried making this work using Quarto and their Observable integration and it was hacky and piecemeal.

Genuinely, this was the first time I've been pleasantly surprised and excited to write a notebook and share it with others. I'm sure there will be more sharp edges, but it's become my first choice notebook tool after this project.

dleeftink · on March 4, 2024

For those looking for an alternative to Quarto, check out the recently released Living Papers for authoring reactive/static documents from a single source:

[0]: https://living-papers.vercel.app/

dleeftink · on March 4, 2024

If you want to quickly try and tinker with Framework in your browser, I've set up some Codespace devcontainers that automatically configure Node and Python environments here:

[0]: https://github.com/dleeftink/observable-codespace

wodenokoto · on March 4, 2024

Should I move from jupyter notebooks to Observable? Or is that the wrong dichotomy?

tomgp · on March 4, 2024

I think the question is whether you are likely to be more productive with Python and its attendant ecosystem or with JS and the packages it has available (most saliently D3 etc.).

wodenokoto · on March 5, 2024

I was hoping for some more opinionated responses, but I guess the truth is always “it depends”

Asking between Python and R tends to get people to throw around opinions.

tomgp · on March 5, 2024

:D I mean personally I prefer Observable because my interest is primarily in data presentation. I imagine if you're more on the analysis side of things (tidyverse etc) Jupyter would still be the way to go.

theK · on March 4, 2024

Paraphrased from the article:

> Everything in a code block with the js content hint will be executed in the user's browser immediately. If you want to show the code you have to hint 'js echo'

Am I the only one thinking that it would have been better for backwards compatibility if it were the other way around? I.e., having an opt-in code hint like 'js exec' that runs code in a user's browser and leaving the widely used 'js' hint alone? The way this is currently set up, you cannot integrate that renderer in an existing app without having to manage where it is allowed to run.

mbostock · on March 4, 2024

We’re planning to allow changing the default options for blocks (either per-page in front matter or across an entire project using the project config); you could then make `js run=false` the default and `js run` to opt-in to live code as you wish. But we chose to make live code the default since that’s our primary use case.

crabmusket · on March 4, 2024

This is great to hear, though your decision does make sense! I'm really keen to play with Framework.

crabmusket · on March 4, 2024

Yes, it has the same issue as e.g. automatically rendering Mermaid diagrams on GitHub: now you can't just show a block of Mermaid code, without dropping the language annotation.

simonw · on March 4, 2024

There's a trick you can use there (which works with GFM and with Framework too): wrap a block in four backticks like this:

    Here is a mermaid example:
    ````
    ```mermaid
    Code here
    ```
    ````
    Copy that into a Markdown file to try it
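
The same trick can be automated when generating Markdown: pick an outer fence that is one backtick longer than the longest run of backticks inside the block. A rough sketch (the function name `wrap_in_fence` is made up for illustration):

```python
import re


def wrap_in_fence(markdown_block):
    """Wrap text in a backtick fence longer than any backtick run it contains."""
    runs = re.findall(r"`+", markdown_block)
    longest = max((len(run) for run in runs), default=0)
    # A fence needs at least three backticks, and more than any run inside it.
    fence = "`" * max(3, longest + 1)
    return f"{fence}\n{markdown_block}\n{fence}"


# Build the inner fenced block without typing a literal fence in this example.
tick3 = "`" * 3
inner = f"{tick3}mermaid\nCode here\n{tick3}"
print(wrap_in_fence(inner))
```

For the mermaid block above this yields a four-backtick outer fence, matching the hand-written example.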

mbostock · on March 4, 2024

You can write ```mermaid run=false for that.

jsnelgro · on March 5, 2024

I spent a night going down a rabbit hole with observable framework and it was terrific! It more or less just got out of the way and I was able to visualize and explore my google maps history in detail. Some of the data loader environment stuff wasn't especially clear but running in a poetry env did the trick for python.

I love kotlin and tried creating a data loader for a kotlin script but that had some rough edges. Kotlin expects script files to be named foo.main.kts but observable expects executable shebang loaders to have a foo.exe extension. So I created a proxy exe script to call the kotlin script, but it then doesn't trigger auto reloads of the data.

A bit of friction compared to marimo or jupyter is using variables between data loaders and the notebook. For example, I want to use the date picker view component to change the range of data fetched by my loader. It's not clear how to do that, so exploratory analysis is slowed down a little. I'm aware this goes against the paradigm but just wanted to point it out. It ends up with you potentially moving a lot of the data munging to the notebook as you explore, which isn't ideal from a performance perspective.

One last thing is I wish you could define dataloaders inline. I'm a big fan of single files, so being able to just add a python code block and let Framework extract that as a file would be a nice little QoL improvement.

Still early days, but Framework seems promising! I'd love to have all my markdown notes running through it to get a sort of org-mode type situation without going full emacs.

mbostock · on March 5, 2024

Thanks for the feedback. We have a PR open to make it easier to register new interpreters (without needing to fall back to .sh or .exe); it’ll let you specify the interpreter associated with a given file extension (e.g., .kts for Kotlin). https://github.com/observablehq/framework/pull/935

As for inputs-driving-data-loaders, that does go against the grain a bit since Framework favors static data snapshots so that the built site is self-contained and performant. But a technique that works well is to generate Parquet files in data loaders representing the superset of data that you want to interact with, and then using DuckDB/SQL in the client to extract the subset you want to visualize. This tends to perform well, though obviously it’s dependent on the size of the superset you want to interact with.

zX41ZdbW · on March 3, 2024

Observable integrates really well with ClickHouse using its REST API, like in this example: https://observablehq.com/@stas-sl/github-issues-survival-ana...

But I didn't try the new Observable Framework - interesting to see similar examples where it queries a database live. I hope that preloading and caching all the data is not the only option because these types of apps should be interactive. Ideally, it should expose SQL for live editing.

simonw · on March 3, 2024

Fetching data live still works - the static data loader piece is optional. My demo here uses fetch() to load data at runtime: https://simonw.github.io/observable-framework-experiments/pa...

floodle · on March 3, 2024

I just feel like they are limiting their user base by only supporting Javascript.

It's of course the de-facto language for interactive display in browsers. The use case for dashboards and data visualisation is clear. But it's an awful language for data science and data analysis, compared to Python or R.

simonw · on March 3, 2024

One of the neat new features of Observable Framework is you can drop in a build script to gather the data that's written in any language you like. https://observablehq.com/framework/loaders

So you absolutely can do the data processing step in R or Python and have that output JSON or CSV which is then visualized at the end using JavaScript.
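
As a rough illustration of that processing step, here is a small Python script that aggregates some records and prints CSV to standard output, which a JavaScript chart could then consume. The field names `category` and `value` and the sample records are invented for the example.

```python
import csv
import sys
from collections import defaultdict


def totals_by_category(records):
    """Sum the "value" field for each "category" (both names are hypothetical)."""
    totals = defaultdict(float)
    for record in records:
        totals[record["category"]] += record["value"]
    return [{"category": key, "total": totals[key]} for key in sorted(totals)]


def write_csv(rows, out):
    """Write the aggregated rows as CSV with a header line."""
    writer = csv.DictWriter(out, fieldnames=["category", "total"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # Made-up sample records standing in for a real data source.
    sample = [
        {"category": "a", "value": 1.5},
        {"category": "b", "value": 2.0},
        {"category": "a", "value": 0.5},
    ]
    write_csv(totals_by_category(sample), sys.stdout)
```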

Not a small feature, but I bet it would be possible to use WebAssembly to add support for Markdown blocks that get executed in other languages as well, using Pyodide for Python for example.

no_wizard · on March 3, 2024

>It's of course the de-facto language for interactive display in browsers

This is it, more or less.

It is far, far easier to build an app like this where you want a plethora of users as a web application than a native one, for instance.

For anything that JavaScript as a runtime or language is missing, WASM can help as well. For math and data science, WASM is a natural choice for any missing pieces.

jamra · on March 4, 2024

I used Pyodide (https://pyodide.org/en/stable/) as a python execution environment in browser. That was pretty successful. There are also python libraries that let you generate a config which is later made responsive via a JS library. Pyodide runs in wasm.

FarhadG · on March 3, 2024

Can you elaborate further on what you have in mind here?

jarpineh · on March 4, 2024

In Observable Framework, data work can be done beforehand with Python or anything else really. JavaScript enjoys the best integration with the browser, which is hard to deal with from the server side.

I myself like the self-contained aspect of it, since I can publish static files. Also, D3 is the pioneering library for data viz on the web. Especially with maps, which is what I've used it for back in the day. Time to refresh those skills.

I wonder if one could combine this with a reactive, Python-based Jupyter notebook alternative: https://docs.marimo.io/guides/wasm.html

Perhaps with web component packaging it should be doable. Web component attributes might allow tying reactive events from one side to the other.

lakomen · on March 4, 2024

So, instead of HTML you have to use Markdown and special tags if you want to use JS.

I don't see the advantage

simonw · on March 4, 2024

Take a look at my dashboard example, it should help show why this is useful. You can get a lot done with very little combined Markdown and JavaScript - building the same thing in HTML plus JavaScript would have taken a bunch more code.

Demo: https://simonw.github.io/observable-framework-experiments/pa...

Source code: https://github.com/simonw/observable-framework-experiments/b...

orf · on March 4, 2024

This looks fantastic! I’ve been waiting for something like this.

My only gripe is that data loaders don’t seem to support Parquet files, which is really annoying.

There’s an interesting possibility here where you can have large datasets in Parquet, exposed via HTTP, whilst being generated at build time with all the benefits that gives you (being able to read only specific columns, filtering via row group statistics etc). Not dissimilar to the “SQLite over http” WASM demo I guess.

Because right now I need to take my large, nicely compressed dataset and export it as either a CSV or a zip file? And then the browser needs the entire thing, even if I’m just viewing a subset of the data? Which is much bigger and much slower than it needs to be.

simonw · on March 4, 2024

There's a parquet example here:

https://github.com/observablehq/framework/blob/main/examples...

Using data from here: https://github.com/observablehq/framework/tree/main/examples...

Rendered version here: https://observablehq.com/framework/examples/api/

If you want to run a Data loader that outputs to parquet there are plenty of ways to do that - I would suggest a Bash or Python script that wraps DuckDB.

mbostock · on March 4, 2024

We do support it. (And use it!) Please see: https://observablehq.com/framework/lib/arrow#apache-parquet

orf · on March 4, 2024

Ahh, amazing! That’s not entirely clear from the data loading docs[1], which when I read it seemed only focused around CSV and JSON.

1. https://observablehq.com/framework/loaders

johnnunn · on March 4, 2024

Just wondering, can one embed an Observable Framework page into another site? Or does it have to be a separate static site, as demoed on the website?

lioeters · on March 4, 2024

From what I read when Framework was released, currently it is a static site generator and can't be embedded into another site like a library.

johnnunn · on March 4, 2024

Thank you.

skadamat · on March 4, 2024

Simon -- have you played with Evidence.dev much?

nojito · on March 4, 2024

Observable still pales in comparison to Quarto.

https://quarto.org/

eacapeisfutuile · on March 4, 2024

In what way? Isn’t a big part of observable the community provided content, which does not seem to be what this tool provides?

spinningslate · on March 4, 2024

that's an unhelpful articulation that runs the risk of having the opposite effect from the one you want.

That would be a pity, because Quarto is really good. I haven't tried Observable yet, but in outline they have some similarities:

1. Documents written in Markdown

2. Ability to embed code blocks in the Markdown, with code executed when the document is rendered.

3. Ability to embed output of the code blocks in the rendered result (e.g. tables, charts).

4. Ability to render to multiple formats (pdf, static site, ...).

Quarto supports Python and R as languages in the code blocks (maybe more, not sure). I personally prefer it to Jupyter notebooks because the source is plain text so (1) there's a choice of editor and (2) moving between text and code blocks is seamless.

I can't say Quarto is better than Observable but it is good. It has depth from its history in RMarkdown (like rendering mathematical equations, naming & cross-referencing).

It's certainly worth consideration for anyone looking for a "code notebook" solution.

kragen · on March 4, 2024

klysm · on March 4, 2024

I don't know much about observable, but it seems like they might be hijacking d3 a bit too much for my taste - it makes me a bit nervous about the future of d3.

simonw · on March 4, 2024

Observable is from the same creator as D3 (Mike Bostock) and D3 has been a core component of the Observable platform since they first launched their notebook product back in 2018.

I don't see Framework changing things there - if anything the ISC license should make it a better partner for D3.