I used to be a submariner and now work in an unrelated acoustic space (acoustic analysis of the electric grid), but I'd love to learn more about the DAS world — my email is in my profile.
This reminds me of _why's final writing, where he discusses leaving his public persona in programming. He specifically cites the preservability of code as a reason for his disillusionment.
Code as art or political movement always baffled me.
I guess anything can be a statement about the world, and we all want our lives to have meaning, so something you poured most of your waking hours into must mean something.
But as someone who has sat at a computer for the better part of 30 years, I just can't find beauty in it, just sadness for the human condition. Old hardware and software do not make me nostalgic, or make me feel like we lost anything of value. They give me the same feeling those abandoned listening posts in Chernobyl gave me: decay, the fleeting nature of existence.
You can't fight the fact that the world will keep on keeping on, with or without you. Be a person that matters to those around you, it's the best you are going to get.
Galois theory investigates the way a subobject sits inside an object. This paper examines the way different programs sit inside the same algorithm, and the way different algorithms (mergesort, quicksort, etc.) sit inside the same class (sorting algorithm).
What are some alternatives that you'd recommend people look at?
I'm keen for any potential successor, but a lot of things come back to what is effectively text, and anything we could build on top of it. Binary object formats seem like an alternative for faster parsing of structured data; that ability has always been there, it's a matter of people actually sticking to it. Maybe some coordination between the program and the shell?
One alternative is a structured data interface, a bit like PowerShell, where not everything needs to be serialized, deserialized, and parsed by every program in a pipeline.
Another approach is an OS that is basically a complete environment where everything is code. You can see the idea in Smalltalk environments, where you can theoretically interact with every object by sending messages. Lisp machines come to mind as well, and one could even consider early personal computers that booted into BASIC as a version of this (though in BASIC, instead of everything being a file, everything is memory).
Generally I think people are happy piping structured data around; the problem is that it ties you into a particular data structure. JSON seems to be winning there, and I think people would be very happy if more commands had a JSON environment variable, or even if it was possible to output data on a /dev/stdjson pipe.
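For tools that already speak JSON, the pipeline experience is already pretty decent today. A small sketch, assuming iproute2's -j flag and jq are installed:

ip -j addr show | jq -r '.[].ifname'   # iproute2 emits JSON, jq queries it with no ad-hoc parsing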
Anyone know how to add a new stdout-like interface to unix-like OSes?
> The functions defined in libxo are used to generate a choice of TEXT, XML, JSON, or HTML output. A common set of functions are used, with command line switches passed to the library to control the details of the output.
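As a concrete illustration, many FreeBSD base utilities are built against libxo and take a --libxo switch to pick the serialization at the command line. A rough sketch, assuming a libxo-enabled wc (the switch syntax may differ slightly between releases):

wc --libxo=json /etc/hosts | jq .   # same tool, JSON output
wc --libxo=xml /etc/hosts           # same tool, XML output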
JSON is a good start, but Powershell is great in that a date is actually a date object, which means that I can do operations on that without faffing around with parsing and worrying about whether what comes in the JSON might depend on the locale of some system, or that somebody didn't take into account that I might want microseconds and truncated the timestamp.
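For contrast, here is the kind of faffing the JSON route involves; a sketch using jq's fromdate builtin, which itself only accepts one fixed ISO 8601 shape (no fractional seconds), illustrating exactly the truncation problem:

echo '{"ts":"2024-01-01T12:00:00Z"}' | jq '.ts | fromdate'   # => 1704110400, after re-parsing the string downstream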
Right, but at that point you're tying it to a universal type system.
Personally I like YAML, since you can add type-hints to data which you can use to turn simple text types into more complex types, but I can see why that standard wouldn't take off.
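For example, YAML 1.1's tag syntax lets the data itself hint at richer types, though how faithfully a given parser honours these tags varies:

created: !!timestamp 2024-01-01T12:00:00Z
count:   !!int "42"
payload: !!binary "aGVsbG8="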
Most of the unix-philosophy people I know are interested in stuff like that, it just has to be implemented in a thoughtful way.
What value would a new standard pipe for JSON bring? It's already serialized text that can be sent to stdout. The real rub is getting programs to speak JSON natively.
>> The real rub is getting programs to speak JSON natively.
You mean getting programs to ingest arbitrary data structures? OK, JSON is not arbitrary - it is limited to a certain overall format. But it can be used to serialize arbitrarily complex data structures.
Meaning that most Unix command-line tools are line-oriented: input and output are expected to be a series of text records, each on its own line. Stray too far from this lingua franca and you'll need to get creative to get piped IO to work.
JSON doesn't care so much about lines and doesn't necessarily represent an array of records, so it doesn't fit into this box.
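One common compromise is newline-delimited JSON, which keeps the one-record-per-line convention so classic line-oriented tools and JSON-aware ones can coexist. A small sketch with jq, whose -c flag emits one compact object per line:

printf '%s\n' '{"n":1}' '{"n":2}' '{"n":3}' | jq -c 'select(.n > 1)'   # grep/head/wc still work on the output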
I think it would be cool if stdout worked something like the clipboard, where you have different representations, and when an application copies something from the clipboard it selects the representation it wants. I'm not sure how to avoid making it terribly wasteful though, so that the producing application doesn't have to write in every format possible.
Imagine a magical `ls` that can emit text/plain, application/json, what have you:
ls -f text
ls -f json
ls -f csv
ls -f msgpack
...
Now instead of specifying formats on both ends:
ls | jq # jq accepts application/json => ls outputs as json
ls | fq # fq accepts a ton of stuff => "best fit" would be picked
ls | fq -d msgpack # fq accepts only msgpack here => msgpack
ls # stdout is a tty; on the other end is the terminal, which says what it accepts => human readable output
Essentially upon opening a pipe a program would be able to say what they can accept on the read end and what they can produce on the write end. If they can agree on a common in-memory binary format they can literally throw structs over the fence - even across languages, FFI style - no serialisation required, possibly zero-copy.
And really, the last one we already do: tons of programs check whether stdin and/or stdout is a tty via an ioctl and change their processing based on that.
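The same check is even exposed to shell scripts; a minimal sketch:

if [ -t 1 ]; then                     # is file descriptor 1 (stdout) a terminal?
  echo "stdout is a tty: emit human-readable output"
else
  echo "stdout is a pipe or file: emit machine-readable output"
fi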
It'd allow a bunch of interesting stuff, like:
- `cat` would write application/octet-stream and the terminal would be aware that raw binary is being cat'd to its tty and thus ignore escape codes, while a program aiming to control the tty would declare writing application/tty or something.
- progressive enhancement: negotiation-unaware programs (our status quo) would default to text/plain (which isn't any more broken than the current thing) or some application/unix-pipe or something.
- when both ends fail to negotiate, it would SIGPIPE and yell at you. Same for encoding: no more oops, utf8 on one end, latin1 on the other.
Interesting post! When prototyping what would end up being fq, I did quite a lot of tinkering with how it could work by splitting it up into multiple tools, used like: <decode tool> | <query tool> | <display tool>. I never got it to feel very neat, and the problem is: what would be piped? I tried JSON so that <query tool> could be jq, but whatever it would be would have to be quite verbose for <display tool> to be able to show hexdumps, value mappings, descriptions, etc. So in the end I arrived at more or less the same design, but in jq, where the values piped between filters are kind of "polymorphic" JSON values. Those behave like JSON values but can also be queried for which bit range and buffer they originate from, whether the value is a symbolic representation, a description, etc.
Maybe an interesting note about fq is that for formats like msgpack it can decode in two modes: by default it decodes msgpack into a sort of "meta" mode where things like integer encoding, sizes, etc. can be seen, and then there is another "representation" mode that is the JSON value of what it represents.
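A rough sketch of what that looks like in practice (from memory, so the exact flags and function names may be slightly off):

fq -d msgpack . file.msgpack        # "meta" view: the decode tree with encodings, sizes, bit ranges
fq -d msgpack torepr file.msgpack   # "representation" view: the plain JSON value the msgpack encodes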
One alternative is Apache Arrow. It's an efficient in-memory format used to represent flat or hierarchical data. Multiple databases and programs like pandas support this, so they become compatible with each other.
> What are some alternatives that you'd recommend people look at?
For me, this would mean an OS that supports both static and runtime reflection, and therefore code-generation and type generation. Strong typing should become a fundamental part of OS and system design. This probably means having an OS-wide thin runtime (something like .NET?) that all programs are written against.
UNIX's composability was a good idea, but is an outdated implementation, which is built on string-typing everything, including command-line options, environment variables, program input and output, and even memory and data itself ('a file is a bag of bytes').
The same thing has happened to C (and therefore C++) itself. The way to use a library in C (and C++) is to literally textually include function and type declarations, and then compile disparate translation units together.
If we want runtime library loading, we have to faff with dlopen/dlsym/LoadLibrary/GetProcAddress, and we have to know the names of symbols beforehand. A better implementation would use full-fledged reflection to import new types, methods, and free functions into the namespace of the currently-executing program.
James Mickens in The Night Watch[1] says 'you can’t just place a LISP book on top of an x86 chip and hope that the hardware learns about lambda calculus by osmosis'. Fair point. But the next-best thing would have been to define useful types around pointers as close to the hardware as possible, rather than just saying 'here, this is the address for the memory you want from HeapAlloc; go do what you want with it and I will only complain at runtime when you break it'. Pointers are a badly leaky abstraction that don't really map to the real thing anyway, given MMUs, memory mapping, etc.
There are so many ways we could make things a bit more disciplined in system and OS design, and drastically improve life for everyone using it.
Of the major vendors, I'd say only Windows has taken even half-hearted steps towards an OS-wide 'type system'. I say half-hearted because, while PowerShell is fantastic and handles are significantly more user-friendly than UNIX's file descriptors, thread pointers, and pipes, it is still a small oasis in a desert of native programs that have to mess with string-typing.
> The presence of the negative signs in (1) may seem surprising at first, but this is due to the fact that (1) is describing the effect of a passive change of units rather than an active change of the object {x}.
This is where the limits of my brain were reached. Is there a translation of this into category theory terms? Is this where category theory could help formalize units in physics?
However, his paragraph after that is pretty interesting, which I read as sort of treating units as formal variables, since you can't combine them, and he only has length, mass, and time in these examples. But then there's an exponent piece? Okay, now I'm lost again.
Maybe this is obvious and not what you are asking, but he's just saying that if you increase your unit of measurement by a factor of X, then the number of such units that comprise your object decreases by the same factor.
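In concrete numbers (using the length example), the numerical value scales by the inverse of the unit rescaling, raised to the exponent of that dimension, which is where the minus signs come from:

2 m = 200 cm          (the unit gets 100x smaller, so the numerical value gets 100x larger)
3 m^2 = 30000 cm^2    (a quantity of dimension length^2 picks up a factor of 100^2)
5 m/s = 500 cm/s      (dimension length^1 time^-1, so the length factor is 100^1)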
(Also, this quote is from the Terry Tao blog post that dang links below, not the OP, right?)