You could also go entirely into the matrix and write custom functions + wrappers for SQLite that are able to parse & pass-through shell commands. For instance:
This stuff is dangerously powerful. User/Application-defined functions in SQLite do not have to be free of side-effects. You could also do something like:
tail -n 50 mylog.txt | customsqlite "SELECT Shell('ping ' || ip.IpAddress) FROM ParseAsMyIpData(STDIN()) ip"
Where the following are custom UDFs:
- MyLogParser
- Shell
- ParseAsMyIpData
- STDIN
This sounds similar to a proof-of-concept project I've been putting together for simila ruse cases: https://github.com/augmentable-dev/flite. Basically UDFs + virtual tables (table-valued functions) that are smaller helpers for using SQLite to slice and dice shell output. Still totally WIP, but I'm curious to play it out and see where it goes
-- flite
SELECT yaml_to_json("hello: world")
# murex
tout yaml "hello: world" | format json
.
-- flite
SELECT json_to_yaml('{"hello":"world"}')
# murex
tout json {"hello":"world"} | format yaml
What's happening here is the pipes are typed (rather than dumb byte streams) so subsequent commands, if they support _murex_ pipes as opposed to dumb POSIX pipes, are able to take advantage of a multitude of different marshallers without each command needing to worry the underlying file format of the content.
`open`, as you might expect, opens the file and pushes the content to STDOUT. It's a little bit like `cat` except it pushes data type information down the pipe. It also differs a little from cat in that it can render different output depending on whether STDOUT is a TTY or a pipe. eg JSON can be formatted for human readability if it's a TTY or minimised for bandwidth if it's a pipe. The shell gets a little bit more clever here though in that images are rendered to the terminal if STDOUT is a TTY but pushed as a byte stream if it's a pipe. https://murex.rocks/docs/commands/open.html
`format`, as you might have guessed, is for changing the data type and data format (you can change the data type without the format by using `cast`). https://murex.rocks/docs/commands/format.html
---
Looking through your code, it's pretty cool what you've written too. Also loving the clean readable code. Good work there.
That's basically the same as what I'm showing off here. Except that the log parsing happens prior to the data being imported into sqlite3 but aside from that all the field names in the sqlite3 table are the field names in the log files.
Because the log files are parsed prior to importing them, it also means the SQL you write in is cleaner and more readable.
Not taking anything away from what you've exampled there though. That does look really cool too.
So apparently it runs the query against an in-memory sqlite instance. I'm curious what the rough time overhead for that is (compared to, say, the overhead of calling out to grep). Does it feel noticeably slow to run a query on ps output?
And the other thing I'm wondering, can this drop you into a sqlite command line? I.e.,
> I'm curious what the rough time overhead for that is (compared to, say, the overhead of calling out to grep). Does it feel noticeably slow to run a query on ps output?
It took 0.2 seconds to run the command on my relatively modest 4 yr old MBP.
Running a similar pipeline (below) took 0.4 seconds. But there are optimisations that can be made to the command line:
time {ps aux -> awk {print $1} -> sort -> uniq -c -> sort}
This is just loading 510 rows of data though. While sqlite3 is very well optimised for loading in millions of records, the interface that sits between sqlite3 and the shell isn't (yet). Currently it reads in the entire dataset then imports it into the database. The reason it does this is so that it can work with any tabulated data format that is supported by the shell eg jsonlines that might look like the following:
However since UNIX pipes run concurrently it would mean if you were to load in millions of lines, the `sort`/`uniq` version would be better optimised I would imagine. Grep certainly would be.
The long term plan for this feature I've written is to make it streamable. In fact I do already have some draft code written to do that. But I figured I'd get an early draft of this out there to gauge peoples thoughts.
> And the other thing I'm wondering, can this drop you into a sqlite command line?
Currently no. And it's not something I'm looking to add support for because there's already a lot of decent SQL tools out there. But I was thinking of adding an option to write the database to disk as a standard .db file and you can then use your favourite interactive sqlite3 tool to query the data.
> However since UNIX pipes run concurrently it would mean if you were to load in millions of lines, the `sort`/`uniq` version would be better optimised I would imagine.
Iirc, sort still needs to load the input in memory to be able to sort correctly, I presume. Uniq indeed works on the stream directly.
Boy do I hate blog posts with no link to the page for the actual project. I’m tapping “Murex” in the banner, that should take me to the homepage so I can see what the heck it is!
I'm guessing you're on a phone? The desktop experience is better because you get links on the left but I haven't yet worked out how to replicate that on mobile yet.
My first thought too. I could find no mention of why this project chose this particular name. It feels like it’s only a matter of time before lawyers descend.
> I could find no mention of why this project chose this particular name.
There's so much information you need to pack into a landing page that for independent projects which would automatically have less eyeball time anyways, information like "why did we choose this name" is just wasted space which could discuss a more compelling feature. Maybe I'll write a blog post on it one day but I don't want to take attention away from the features.
But in answer to your question: This is a shell, like Bash but less POSIX compliant. So I wanted a name that sounded like a shell. Murex is a type of shell (a genus of see snail). Murex also sounds a lot like Unix, Linux, Minix, etc which sounded kind of apt given this is inspired by UNIX shells and chiefly designed for UNIX and UNIX-like platforms.
Frankly name clashes happen all the time. Half the "Show HN" threads have comments akin to "poor choice of name". I spent probably longer than I should admit to looking for a name with zero collisions and this was the best I could find since at least the other murex is in a totally different domain. However if I do get a complaint from the lawyers of the other murex then I'll consider a name change. Suffice to say, naming things is hard.
Hm I'm surprised that sqlite seemed to introduce some sort of cross compilation issue? I would have thought it's very easy to cross compile.
I'd be interested to see some examples of the 'select' builtin. (I also think it's nicer if SQL queries start with "from", but sqlite doesn't seem to support that, unlike LINQ and I think a few other SQL engines)
> Hm I'm surprised that sqlite seemed to introduce some sort of cross compilation issue? I would have thought it's very easy to cross compile.
To be fair, it might not be a problem / or might be very easy to resolve. I do vaguely recall problems cross compiling on Windows but that was a while ago. However given this is quite an experimental feature I think it's still fair not bundling it with the default build.
> I'd be interested to see some examples of the 'select' builtin. (I also think it's nicer if SQL queries start with "from", but sqlite doesn't seem to support that, unlike LINQ and I think a few other SQL engines)
I actually drop the FROM clause because that felt superfluous to needs for one table queries. An example:
ps aux | select count(*), user GROUP BY user ORDER BY 1
Go is extremely easy to cross-compile, but adding C libraries will mean you also need a C cross-compiler (and the various go headers & libs for CGO built for that platform)
Why? I've spent a lot of time purposely to make it readable and, being dyslexic myself, I've found this final style sheet more readable than 99% of the other sites I visit. But if there are specific problems you're finding then I'd love to hear it
- The body font is odd, I deactivated it in the dev tools and preferred the default.
- There is a letter spacing of 0.08em, which makes things very wide. I first deactivated it, but then it felt cramped. So I modified it to .02 and it looked about right.
- Text shadow on titles makes them look blurry. Looked a little better when I reduced the blur, but probably would look better to remove it.
> The body font is odd, I deactivated it in the dev tools and preferred the default.
Is it actually hard to read though or just a preference? I don't mind changing things if they're objectively worse. But if it's just personal preference then that's a different matter.
The reason I say that is I do actually find that font easier to read than your typical sans-serif. Which is why I picked it. But maybe I'm not a good data point being dyslexic -- perhaps I'm just wired differently when it comes to what's readable?
> There is a letter spacing of 0.08em, which makes things very wide. I first deactivated it, but then it felt cramped. So I modified it to .02 and it looked about right.
Cool. I'll have a play there. Thank you
> Text shadow on titles makes them look blurry. Looked a little better when I reduced the blur, but probably would look better to remove it.
There's text shadow all over the place. I'll remove all of it shortly and see if that makes a difference. Thank you