Automatic Generation of Visualizations and Infographics with LLMs (microsoft.github.io)
175 points by monkeydust on Aug 29, 2023 | 53 comments



Last week I helped someone organize and analyze their data in Excel. Since I use Excel only once every couple of years, I had to rewatch the wonderful "You Suck at Excel with Joel Spolsky" to be productive again. Seeing this announcement page now, I was immediately reminded of the mini-rant towards the end of the video [0]:

> On average, once every three months, there's a startup that makes a thing that they say is going to be amazing, and it's just PivotTables. They're like, "It works with Excel, and it does this amazing consolidation, and slicing and dicing of all your data, and it's amazing, and we're going to make a startup. I'm going to sell this for four hundred ninety-five dollars." And that happens at least once every three months. The trouble is, the VCs usually know about PivotTables.

Of course this product goes a little further, using an LLM to suggest which columns to analyze and chart. But it's quite funny to me that this Microsoft Research product is reinventing the PivotTable (+ PivotChart) with Python and pandas.
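For the curious, the part being reinvented really is a one-liner in pandas. A minimal sketch with made-up data (the chart call assumes matplotlib is installed):

    import pandas as pd

    df = pd.DataFrame({
        "region":  ["East", "East", "West", "West"],
        "product": ["A", "B", "A", "B"],
        "sales":   [100, 150, 200, 250],
    })

    # The PivotTable: rows = region, columns = product, values = sum of sales
    pivot = df.pivot_table(index="region", columns="product",
                           values="sales", aggfunc="sum")
    pivot.plot.bar()  # and the PivotChart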

[0]: https://youtu.be/0nbkaYsR94c?si=kkfFHZ_fyGmG3Lnj&t=2988


It's impossible to use Excel for big data; the application has soft limits because responsiveness degrades.


But nobody today is going to read, let alone promote, a blog post about pivot tables. Sprinkle in LLM references, and the fad-wave riders will sing its praises.


Even Microsoft has to know Excel is shit software for large datasets. I can’t even get it to do a VLOOKUP correctly half the time.


To nitpick here: it's not reinventing if it's new.

And the focus of this research was probably not to invent PivotTables but to build an interface to them through LLMs.


> an interface to them through LLMs

No-code pivot tables?


Pivollama


I took this a step further, turning charts into infographics using Stable Diffusion (SDXL):

https://karimjedda.com/beautiful-data-visualizations-powered...


It's chartjunk and should be used sparingly (e.g., on covers) rather than as the actual content. In the hands of an undisciplined person it would be used frivolously.

https://en.wikipedia.org/wiki/Chartjunk


That's a really nice idea. Have you thought about making it a product?


Thank you for the kind words. I wouldn't even know where to start. There's a lot I can code and do, but making something into a product is something I have no experience with.

What's your recommended approach? Even if open-source, curious to learn.


Continue as an open source project, building features and refining the product. Once you have enough users you can start offering premium features and support to organizations. Maybe even apply to an accelerator? Good luck!


Thanks for the positivity, I'll give it a shot.


I would pay $5 in the occasional chart-heavy month to upload my graphs, stripped of proprietary detail, and get a few images to reroll through.


If the product got good enough you’d have a shot at selling to Canva.


The technology for this type of generation (ControlNet) is already open source, and it's relatively straightforward to reproduce the charts demoed in that post without shenanigans.

There's no moat.
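To back that up, here's a minimal sketch with diffusers, using the public SDXL and canny-ControlNet checkpoints (chart.png is a placeholder for any rendered chart):

    import cv2
    import numpy as np
    import torch
    from PIL import Image
    from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
    from diffusers.utils import load_image

    # Edge-detect the rendered chart so ControlNet preserves its layout
    chart = load_image("chart.png")
    edges = cv2.Canny(np.array(chart), 100, 200)
    edges = Image.fromarray(np.stack([edges] * 3, axis=-1))

    controlnet = ControlNetModel.from_pretrained(
        "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
    )
    pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        controlnet=controlnet, torch_dtype=torch.float16,
    ).to("cuda")

    image = pipe(
        "hand-drawn infographic, watercolor, clean labels",
        image=edges,                        # chart edges constrain the layout
        controlnet_conditioning_scale=0.8,  # how strictly to follow them
    ).images[0]
    image.save("infographic.png")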


Excellent technical work, but subject to the same major question marks around the morality and legality of LLM business models. From the discussion section:

> Low Resource Grammars: ... LIDA depends on the underlying LLMs having some knowledge of visualization grammars as represented in text and code in its training dataset (e.g., examples of Altair, Vega, Vega-Lite, GGPlot, Matplotlib, represented in Github, Stackoverflow, etc.). For visualization grammars not well represented in these datasets (e.g., tools like Tableau, PowerBI, etc., that have graphical user interfaces as opposed to code representations), the performance of LIDA may be limited without additional model fine-tuning or translation.

In other words, open source programmatic visualizations are required to feed the LLM, which can then be licensed to corporates to accelerate various internal exploratory data analyses. A win-win for corporates and LLM providers.

Spot the loser.


And what in particular is novel or unique about the general 'issue' you mention?

Most companies use open source in one way or another.

Besides, a company like MS has probably already built visualizers on a purely commercial basis (see Excel) and/or is absolutely able to write them itself.


I am not sure what you are talking about.

If I release a novel visualization library on GitHub under some open source license, I want it to be attributed to me. I don't want some specialized LLM lifting and offering the same visualization concepts to unnamed corporates for a hefty fee, without me ever knowing about it, while those corporates pretend they don't know where the concept came from.

It is your choice whether you think that is a problem and how "novel" it is. Theft, after all, has a very long history.


Everything you described has been possible since the dawn of intellectual property. Just replace LLM with "person".

Furthermore, it isn't theft to learn from others' work and reproduce similar qualities.


Possible is not the same as admissible.

Good to know that the prevailing commercial tech culture now sees plagiarism and stealing ideas without attribution as the modern way of doing business and hopes that dressing things up under some algorithmic veil will hide the act.

I guess the pit of moral decline has no bottom. The consolation is that theft has never been the road to wealth. Once the plundering is over the only thing that is left is a wasteland.

It seems that Microsoft has finally found a way to kill the open source "cancer".


I'm afraid I'm just unclear on exactly what part of this you argue is crossing a moral line.

I.e. what is being stolen without attribution? I'm genuinely not getting what you mean in this specific case.


Limited visualization grammar means that any non-trivial visualization request will be lifting a particular solution, more or less verbatim.


I don't see how it's possible to show that a solution is lifted by the LLM as opposed to arrived at by the LLM.

It seems to me that such solutions will soon be within the set an LLM can construct on its own.


As they say, people are unwilling to understand something if their monetary gain depends on not understanding it.

Let me break it down for you. If I ask for a visualization that squares the circle and there is one repo that has an example of squaring the circle, the LLM will "arrive" at a way of squaring the circle.


That's not really answering my question.

If (1) an LLM is able to arrive at solutions in the same class of difficulty as the solution for the target problem and (2) it's not possible to establish the provenance of the solution actually offered by the LLM, then what's the argument for assuming that the solution is based on IP rather than constructive reasoning?


That's too many ifs.

Retrain the LLM without access to the repo data. Ask for the same solution. Enjoy the hallucination. Provenance established.


By the way - your haste in ascribing bad motives to those disagreeing with you rather turned me off continuing this conversation.


Super cool.

Here are the viz-related prompts (generation, editing, etc.), for those interested: https://github.com/microsoft/lida/tree/main/lida/components/...

I built a tool that lets you use GPT to analyze data and build interactive graphs in the browser (https://deepsheet.dylancastillo.co/). I may try to adapt it to use LIDA or a similar approach.


Weird landing page - I expected to see infographic examples, but the images are of people looking at screens and smiling.


No, absolutely not. How can you trust the output from such a black box system? Who is to say that the LLM won't add or remove data points to make the chart "look good"? Heaven help us if decision makers start taking this output seriously. But of course they will, because the charts will look professional and plausible, because that's what the prompt requires.


> Who is to say that the LLM won't add or remove data points to make the chart "look good"?

I don't think you're thinking creatively enough here. A good system that makes use of these concepts (because it's a research project, not a product!) will likely ensure that actions the LLM takes are non-destructive and inherently undoable. For example, if the underlying data was changed by the LLM, you can verify that automatically and show a warning, emit an error, or ... something else entirely!
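A minimal sketch of that verification idea (the dataframe and generated code are stand-ins; this catches in-place mutation, not a plot drawn from a doctored copy):

    import hashlib
    import pandas as pd

    df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})  # stand-in data
    generated_plot_code = "df.plot(x='x', y='y')"        # stand-in for LLM output

    def fingerprint(frame: pd.DataFrame) -> str:
        # Stable hash of the dataframe's values and index
        return hashlib.sha256(
            pd.util.hash_pandas_object(frame, index=True).values.tobytes()
        ).hexdigest()

    before = fingerprint(df)
    exec(generated_plot_code, {"df": df})  # run the generated plotting code
    if fingerprint(df) != before:
        raise RuntimeError("generated code mutated the source data - rejecting chart")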


Agreed. Our customers on the regulated side cannot use an unexplainable UI like that by law.

We take a middle ground with louie.ai, showing the database queries, data transforms, chart config, and any other decision or generation. It's nice being able to watch and check each step and then describe in natural language what you want changed, so it ends up feeling more like the easier side of pair programming than a black box.


"You are a helpful assistant highly skilled in writing PERFECT code for visualizations. Given some code template, you complete the template to generate a visualization given the dataset and the goal described. The code you write MUST FOLLOW VISUALIZATION BEST PRACTICES ie. meet the specified goal, apply the right transformation, use the right visualization type, use the right data encoding, and use the right aesthetics (e.g., ensure axis are legible). The transformations you apply MUST be correct and the fields you use MUST be correct. The visualization CODE MUST BE CORRECT and MUST NOT CONTAIN ANY SYNTAX OR LOGIC ERRORS. You MUST first generate a brief plan for how you would solve the task e.g. what transformations you would apply e.g. if you need to construct a new column, what fields you would use, what visualization type you would use, what aesthetics you would use, etc. YOU MUST ALWAYS return code using the provided code template. DO NOT add notes or explanations." (https://github.com/microsoft/lida/blob/main/lida/components/...)

They demand in the prompt that things MUST be correct, and the tool reports any transformations it applies to your data, which might give you enough insight into its logic to check it against the data yourself.


Telling the LLM that it must do something is not a guarantee that it'll follow through.


True. This is an open area of research. Tools like guidance (or other implementations of constrained decoding with LLMs [1,2]) will likely help with this problem.

[1] A guidance language for controlling large language models. https://github.com/guidance-ai/guidance

[2] Knowledge Infused Decoding https://arxiv.org/abs/2204.03084
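In the meantime, a cheap guard is to verify the output before executing it and feed failures back to the model. A minimal sketch, where generate_plot_code is a hypothetical stand-in for whatever LLM call you use:

    import ast

    def check_syntax(code: str):
        # Return an error message if the generated code won't even parse
        try:
            ast.parse(code)
            return None
        except SyntaxError as e:
            return f"SyntaxError on line {e.lineno}: {e.msg}"

    prompt = "Plot sales by region as a bar chart."
    for _ in range(3):
        code = generate_plot_code(prompt)  # hypothetical LLM call
        error = check_syntax(code)
        if error is None:
            break
        # Don't trust "MUST BE CORRECT" - report the failure and retry
        prompt += f"\nYour previous code failed: {error}. Return corrected code only."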


How do you trust matplotlib? Same way: if you need to audit plots, audit the generated source code.


So instead of auditing MPL once (or never, because MPL doesn't have a habit of producing broken output), I should audit the output of this LLM for every query, because it does have a habit of hallucinating?


I was playing with the library this morning; the interesting part to me was the 'goal explorer', which generates the questions to ask of the data.

Keen to see more research into this part, especially making the questions more specific to the dataset in question and overlaying real-world situations.
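The core of the idea fits in a short function. A minimal sketch, not LIDA's actual prompt (the real ones are linked upthread), assuming the 2023-era openai client:

    import json
    import openai
    import pandas as pd

    def explore_goals(df: pd.DataFrame, n: int = 5) -> list:
        # Ask the model for analysis questions tailored to this dataset's schema
        schema = {col: str(dtype) for col, dtype in df.dtypes.items()}
        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{
                "role": "user",
                "content": (
                    f"Dataset schema: {json.dumps(schema)}\n"
                    f"Propose {n} specific analysis questions this data could answer. "
                    "Reply with a JSON list of objects with keys "
                    "'question', 'rationale' and 'visualization'."
                ),
            }],
        )
        return json.loads(resp.choices[0].message.content)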


Honestly, that's the more interesting and more difficult part. Anyone with basic training can be coaxed into slicing and dicing schemas and configs until pretty graphs come out. LLMs might not even be the best tool for that.

But knowing _what_ to look for in the data given a problem statement - that's valuable, and hard to teach. LLMs have such a broad base of "knowledge", they can be reasonably good at this in just about any domain.


Right, isn't knowing what to look for a must have on the path to AGI?


I would agree -- that's why (to me at least) the recent wave of LLMs is such a big deal. They make semantic contexts accessible for interaction with code logic.


Ironically, I've found GPT to be pretty terrible with plotting libraries like plotly/dash or even matplotlib, compared to just about anything else in Python.


I wrote a simple wrapper around Matplotlib and ChatGPT-3.5-turbo. The LLM responds with Python code that is executed to produce the charts. It works very nicely. Here is the repo: https://github.com/mljar/plotai - you will find two videos in the readme. Maybe you should work on your prompts?
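The whole pattern is small enough to sketch. This is not plotai's actual code, just the general shape, again assuming the 2023-era openai client:

    import openai
    import pandas as pd

    def plot(df: pd.DataFrame, request: str) -> None:
        prompt = (
            f"The pandas dataframe `df` has columns {list(df.columns)}.\n"
            f"Write matplotlib code that plots: {request}\n"
            "Return only code, no explanations."
        )
        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
        # Executes model output verbatim - see the verification ideas upthread
        exec(resp.choices[0].message.content, {"df": df})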


Huh, neat. It never bothered me enough to spend time on it specifically, since I could just hit a button to send the errors back for fixing, but it's good to see the extra passes aren't strictly needed.


I find the quality of the code really questionable:

system_prompt = """ You are a an experienced data analyst that can annotate datasets. Your instructions are as follows: i) ALWAYS generate the name of the dataset and the dataset_description ii) ALWAYS generate a field description. iv.) ALWAYS generate a semantic_type (a single word) for each field given its values e.g. company, city, number, supplier, location, gender, longitude, latitude, url, ip address, zip code, email, etc You return an updated JSON dictionary without any preamble or explanation. """

Some spelling errors, and where is number 3?


All you need is attention?


They should get rid of the infographer module as it really undermines the rest of the work.


I sort of know what I'm doing with data, so I don't want LLMs building any models for me, but I do like the concept of making my lame visualizations look more professional and slicker.


In the horsepower versus mpg example, every generated result creates new data on the plot that isn't there in the original plot. This is terrible.


This would really be something if you could just give it voice commands. Typing is absurd amidst all this wonderful automation!


Uncanny fingers.


Am I missing something here? From the video and examples, this looks like it's helping you make Excel charts with less suck (slightly stylized), not really building what I would consider "infographics" in the traditional marketing sense. I guess it counts as visualizations, but it's not what I was expecting.



