Murex shell now supports mixing SQL and shell pipelines

bob1029 · on April 30, 2021

You could also go entirely into the matrix and write custom functions + wrappers for SQLite that are able to parse & pass-through shell commands. For instance:

  ./customsqlite "SELECT 1, 3, 5 FROM MyLogParser(Shell('tail -n 50 mylog.txt'))" | grep 'something really important'

This stuff is dangerously powerful. User/Application-defined functions in SQLite do not have to be free of side-effects. You could also do something like:

  tail -n 50 mylog.txt | customsqlite "SELECT Shell('ping ' || ip.IpAddress) FROM ParseAsMyIpData(STDIN()) ip"

  Where the following are custom UDFs:
  - MyLogParser
  - Shell
  - ParseAsMyIpData
  - STDIN

See: https://www.sqlite.org/appfunc.html

patrickdevivo · on May 1, 2021

This sounds similar to a proof-of-concept project I've been putting together for simila ruse cases: https://github.com/augmentable-dev/flite. Basically UDFs + virtual tables (table-valued functions) that are smaller helpers for using SQLite to slice and dice shell output. Still totally WIP, but I'm curious to play it out and see where it goes

hnlmorg · on May 2, 2021

That's the same as what I've built except I've integrated it into the shell whereas you've integrated it into sqlite3.

Take your README examples:

  -- flite
  SELECT * FROM lines("/path/to/some/file.ndjson")

  # murex
  open /path/to/some/file.ndjson | select *

.

  -- flite
  SELECT readfile("/path/to/file.json")

  # murex
  open /path/to/file.json | select *

.

  -- flite
  SELECT yaml_to_json("hello: world")

  # murex
  tout yaml "hello: world" | format json

.

  -- flite
  SELECT json_to_yaml('{"hello":"world"}')

  # murex
  tout json {"hello":"world"} | format yaml

What's happening here is the pipes are typed (rather than dumb byte streams) so subsequent commands, if they support _murex_ pipes as opposed to dumb POSIX pipes, are able to take advantage of a multitude of different marshallers without each command needing to worry the underlying file format of the content.

`open`, as you might expect, opens the file and pushes the content to STDOUT. It's a little bit like `cat` except it pushes data type information down the pipe. It also differs a little from cat in that it can render different output depending on whether STDOUT is a TTY or a pipe. eg JSON can be formatted for human readability if it's a TTY or minimised for bandwidth if it's a pipe. The shell gets a little bit more clever here though in that images are rendered to the terminal if STDOUT is a TTY but pushed as a byte stream if it's a pipe. https://murex.rocks/docs/commands/open.html

`tout` is `typed output`. So you're saying "echo this output and set the data type to ..." https://murex.rocks/docs/commands/tout.html

`select` is the sqlite3 connector https://murex.rocks/docs/optional/select.html

`format`, as you might have guessed, is for changing the data type and data format (you can change the data type without the format by using `cast`). https://murex.rocks/docs/commands/format.html

---

Looking through your code, it's pretty cool what you've written too. Also loving the clean readable code. Good work there.

hnlmorg · on May 2, 2021

That's basically the same as what I'm showing off here. Except that the log parsing happens prior to the data being imported into sqlite3 but aside from that all the field names in the sqlite3 table are the field names in the log files.

Because the log files are parsed prior to importing them, it also means the SQL you write in is cleaner and more readable.

Not taking anything away from what you've exampled there though. That does look really cool too.

kbenson · on April 30, 2021

There are projects like fsql[1] that support this and make it easier in many respects too. SQL is a powerful syntax for tabular data.

This is also one of the selling points of Microsoft's PowerShell (which I think you can also install on Linux?).

1: https://metacpan.org/pod/distribution/App-fsql/script/fsql

rstarast · on April 30, 2021

That's pretty cool. Feature docs including an example: https://murex.rocks/docs/optional/select.html

So apparently it runs the query against an in-memory sqlite instance. I'm curious what the rough time overhead for that is (compared to, say, the overhead of calling out to grep). Does it feel noticeably slow to run a query on ps output?

And the other thing I'm wondering, can this drop you into a sqlite command line? I.e.,

» ps aux -> select -i

sql> SELECT ...

hnlmorg · on April 30, 2021

> I'm curious what the rough time overhead for that is (compared to, say, the overhead of calling out to grep). Does it feel noticeably slow to run a query on ps output?

It took 0.2 seconds to run the command on my relatively modest 4 yr old MBP.

Running a similar pipeline (below) took 0.4 seconds. But there are optimisations that can be made to the command line:

    time {ps aux -> awk {print $1} -> sort -> uniq -c -> sort}

This is just loading 510 rows of data though. While sqlite3 is very well optimised for loading in millions of records, the interface that sits between sqlite3 and the shell isn't (yet). Currently it reads in the entire dataset then imports it into the database. The reason it does this is so that it can work with any tabulated data format that is supported by the shell eg jsonlines that might look like the following:

    ["Name", "Session", "Score", "Completed"]
    ["Gilbert", "2013", 24, true]
    ["Alexa", "2013", 29, true]
    ["May", "2012B", 14, false]
    ["Deloise", "2012A", 19, true]

However since UNIX pipes run concurrently it would mean if you were to load in millions of lines, the `sort`/`uniq` version would be better optimised I would imagine. Grep certainly would be.

The long term plan for this feature I've written is to make it streamable. In fact I do already have some draft code written to do that. But I figured I'd get an early draft of this out there to gauge peoples thoughts.

> And the other thing I'm wondering, can this drop you into a sqlite command line?

Currently no. And it's not something I'm looking to add support for because there's already a lot of decent SQL tools out there. But I was thinking of adding an option to write the database to disk as a standard .db file and you can then use your favourite interactive sqlite3 tool to query the data.

spockz · on April 30, 2021

> However since UNIX pipes run concurrently it would mean if you were to load in millions of lines, the `sort`/`uniq` version would be better optimised I would imagine.

Iirc, sort still needs to load the input in memory to be able to sort correctly, I presume. Uniq indeed works on the stream directly.

jhgb · on April 30, 2021

Also a comparison with http://www.strozzi.it/cgi-bin/CSA/tw7/I/en_US/NoSQL/ or similar might be in order if we're talking about shell pipelines and relational data.

shawkinaw · on April 30, 2021

Boy do I hate blog posts with no link to the page for the actual project. I’m tapping “Murex” in the banner, that should take me to the homepage so I can see what the heck it is!

hnlmorg · on April 30, 2021

I'm guessing you're on a phone? The desktop experience is better because you get links on the left but I haven't yet worked out how to replicate that on mobile yet.

Apologies for the frustration there. The project about page is https://murex.rocks and the content there mirrors the README on the git repo which can be found at https://github.com/lmorg/murex

eb0la · on April 30, 2021

Murex is also a risk management software for financial institutions. I saw the title and said WTF?!

nchi3 · on April 30, 2021

Yeah, I was surprised to see Murex mentioned on HN as well until I realized it's not the Murex I was thinking of!

jl6 · on April 30, 2021

My first thought too. I could find no mention of why this project chose this particular name. It feels like it’s only a matter of time before lawyers descend.

hnlmorg · on April 30, 2021

> I could find no mention of why this project chose this particular name.

There's so much information you need to pack into a landing page that for independent projects which would automatically have less eyeball time anyways, information like "why did we choose this name" is just wasted space which could discuss a more compelling feature. Maybe I'll write a blog post on it one day but I don't want to take attention away from the features.

But in answer to your question: This is a shell, like Bash but less POSIX compliant. So I wanted a name that sounded like a shell. Murex is a type of shell (a genus of see snail). Murex also sounds a lot like Unix, Linux, Minix, etc which sounded kind of apt given this is inspired by UNIX shells and chiefly designed for UNIX and UNIX-like platforms.

Frankly name clashes happen all the time. Half the "Show HN" threads have comments akin to "poor choice of name". I spent probably longer than I should admit to looking for a name with zero collisions and this was the best I could find since at least the other murex is in a totally different domain. However if I do get a complaint from the lawyers of the other murex then I'll consider a name change. Suffice to say, naming things is hard.

chubot · on April 30, 2021

Hm I'm surprised that sqlite seemed to introduce some sort of cross compilation issue? I would have thought it's very easy to cross compile.

I'd be interested to see some examples of the 'select' builtin. (I also think it's nicer if SQL queries start with "from", but sqlite doesn't seem to support that, unlike LINQ and I think a few other SQL engines)

hnlmorg · on April 30, 2021

> Hm I'm surprised that sqlite seemed to introduce some sort of cross compilation issue? I would have thought it's very easy to cross compile.

To be fair, it might not be a problem / or might be very easy to resolve. I do vaguely recall problems cross compiling on Windows but that was a while ago. However given this is quite an experimental feature I think it's still fair not bundling it with the default build.

> I'd be interested to see some examples of the 'select' builtin. (I also think it's nicer if SQL queries start with "from", but sqlite doesn't seem to support that, unlike LINQ and I think a few other SQL engines)

I actually drop the FROM clause because that felt superfluous to needs for one table queries. An example:

    ps aux | select count(*), user GROUP BY user ORDER BY 1

https://murex.rocks/docs/optional/select.html

vluft · on April 30, 2021

Go is extremely easy to cross-compile, but adding C libraries will mean you also need a C cross-compiler (and the various go headers & libs for CGO built for that platform)

Blahah · on May 1, 2021

Given the fun name I was expecting to see the website have some Tyrian purple elements :)

throwaway823882 · on April 30, 2021

Very hard to read this website.

hnlmorg · on April 30, 2021

Why? I've spent a lot of time purposely to make it readable and, being dyslexic myself, I've found this final style sheet more readable than 99% of the other sites I visit. But if there are specific problems you're finding then I'd love to hear it

mixmastamyk · on April 30, 2021

I was curious so took a look. IMHO:

- The body font is odd, I deactivated it in the dev tools and preferred the default.

- There is a letter spacing of 0.08em, which makes things very wide. I first deactivated it, but then it felt cramped. So I modified it to .02 and it looked about right.

- Text shadow on titles makes them look blurry. Looked a little better when I reduced the blur, but probably would look better to remove it.

hnlmorg · on April 30, 2021

> The body font is odd, I deactivated it in the dev tools and preferred the default.

Is it actually hard to read though or just a preference? I don't mind changing things if they're objectively worse. But if it's just personal preference then that's a different matter.

The reason I say that is I do actually find that font easier to read than your typical sans-serif. Which is why I picked it. But maybe I'm not a good data point being dyslexic -- perhaps I'm just wired differently when it comes to what's readable?

> There is a letter spacing of 0.08em, which makes things very wide. I first deactivated it, but then it felt cramped. So I modified it to .02 and it looked about right.

Cool. I'll have a play there. Thank you

> Text shadow on titles makes them look blurry. Looked a little better when I reduced the blur, but probably would look better to remove it.

There's text shadow all over the place. I'll remove all of it shortly and see if that makes a difference. Thank you

hnlmorg · on April 30, 2021

Changes made. Is that any better?

mixmastamyk · on April 30, 2021

Yes, does look better, although looking on an iOS device, which may also have an effect. For example, the font.