
Not sure I actually understand why K is pushed here almost daily. Is it just the esoterics? Alright, APL and K are expressive for array computation. Do many people here deal with array computation? K is also supposedly fast—not once have I seen an explanation of what makes it fast and why those techniques aren't looted for open-source languages.

I'm told that the interpreter is so small that it fits in the cache. Yeah, so how much time does a non-K program normally spend loading the code? How does this free K from loading and writing data?

Is K in fact a thin layer on top of assembler (specifically SIMD and other extensions), for number crunching in the vein of shaders/CUDA?



Not sure I actually understand why K is pushed here almost daily.

Most programmers find k interesting. k is fast. k is small. k & APL are great as a tool of thought. It has a "lost art" feel. It's an intellectually satisfying endeavor.

Do many people here deal with array computation?

Like three Dyalog employees occasionally contribute to discussion, fintech employees make up maybe 10% of the people regularly writing comments on the site, and array programming is incredibly popular when it's in disguise (and made less efficient) as Python with numpy; it's pretty much essential for multiple subfields of computer science, and so forth.

K is also supposedly fast—not once have I seen an explanation of what makes it fast

People talk about it all the time here, though. 'geocar mentions why in almost every thread, and he should know: he helped write an operating system in the language.

and why those techniques aren't looted for open-source languages.

Because they don't want to believe it's true! When told you're messing up something simple, you have to question your assumptions a bit. People hate doing this, especially when they've invested time in seeing those assumptions through.

Is K in fact a thin layer on top of assembler (specifically SIMD and other extensions), for number crunching in the vein of shaders/CUDA?

k is higher-level than C, Haskell, Rust, etc.


As a long-time programmer, q is the only language where I can start hacking at something, look up what feels like 10 minutes later, and find that 2 hours have gone.

Frustrating. Obtuse. Overall: fun.


There are open source implementations of K, e.g.

https://github.com/kevinlawler/kona

I do not know how good/performant they are.


I think everyone in this thread probably knows about kona. It's neat, but it's also trying to be an implementation for a very old version of k.

Here's a list of a few different k implementations. Most aren't aiming for performance.

https://ngn.bitbucket.io/k.html


>> Is it just the esoterics?

Kinda, yeah. It's more interesting--along some axes--than 'how we went from 100 customers to 101 customers' (which somehow required a re-write in Java?)

But also I feel like a lot of people look at their C/Python/SQL/JS/HTML/CSS stacks and their thousand-person company and hundred thousand lines of code that kinda sorta just directs people at a Stripe widget and think: We Clearly Are Doing Something Wrong.

Not that K/APL or Prolog or Smalltalk or whatever would actually magically make people do-the-right-thing or do-a-thing-better, but they're sort of different worlds, so they inspire a bit more wonder and reflection than 'how we moved from python to go to rust'.


> Not sure I actually understand why K is pushed here almost daily. Is it just the esoterics? Alright, APL and K are expressive for array computation. Do many people here deal with array computation?

People like k for different reasons. For me, it's the right blend of fast to develop and runs fast enough. Someone recently made a database engine called okon, designed to process HIBP (Have I Been Pwned) data; looking at their GitHub, it took 20+ days of C++ to write. They managed to get lookups under 49µsec and patted themselves on the back for it being so good.

But my k(q) solution was written in five minutes and runs in under 5µsec -- almost 10x faster. I used binary search instead of a b-tree (an approach the okon author considered but ultimately rejected in favour of the b-tree), but it's otherwise the same basic strategy.

So for me, the decision is, 5 minutes, or 20 days, and that's a no-brainer.

> How does this free K from loading and writing data?

Your CPU has a shockingly small amount of directly-attached memory (around 64kb). That's been true since the 1970s, and is likely to be true for another fifty years.

This used to be called ram, but now we call anything directly addressable by the CPU 'ram', and we call this thing 'cache' (or L1 or whatever), because that's basically how most people use it.

Now, what your CPU does all day is look at a memory address, and if it's not in the cache, it sends a message to a memory controller and waits for that memory to show up (somewhat of a simplification). The instructions show up, now your CPU can execute them, and those instructions require more memory, so again the CPU sends a message to the memory controller and waits for that memory to show up. All day long we alternate between these two wait states: we wait a bit for code, we wait a bit for data, and then we repeat.

Because K is small -- because the interpreter, the application source code, and all your state variables fit into that 64kb -- you can see that the memory waiting drops by roughly half.

> Is K in fact a thin layer on top of assembler (specifically SIMD and other extensions), for number crunching in the vein of shaders/CUDA?

No. It's a very simple interpreted language that (in most implementations) dispatches directly on the source code (or, in some cases, on the AST).

> why those techniques aren't looted for open-source languages.

If you really understand the above, the answer to this question is obvious (hint: try to think about values, not features). Until then -- until you really understand the question you're asking -- you're not going to find the answer very useful.


> But my k(q) solution was written in five minutes and runs in under 5µsec -- almost 10x faster. I used binary search instead of a b-tree (an approach the okon author considered but ultimately rejected in favour of the b-tree), but it's otherwise the same basic strategy.

Out of curiosity I did the same in python (with numpy), simply memmap-ing the sorted binary hash list and doing a binary search (again using numpy) on that. I got around 40us lookup time with that solution. Not too bad for about 5 minutes of coding. I also tried using leveldb as the datastore and got around 6us with that (and the solution is also only a couple of lines of code).


I’d be interested in seeing your five minute python solution and your benchmarks.


The leveldb solution is quite straightforward (running in ipython3 for easy benchmarking with timeit):

  import leveldb,hashlib
  db = leveldb.LevelDB('db')
  def lookup(pw):
    m = hashlib.sha1()
    m.update(pw)
    try: # the leveldb api is a bit minimalist.. 
      db.Get(m.digest())
      return True
    except:
      return False
  %timeit lookup(b'password')
Creating the database was just calling db.Put(x) for every entry.
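Presumably the import looked roughly like this (my sketch, not the parent's code; the filename is made up, and I'm assuming the HIBP dump is the SHA-1 "ordered by hash" text file with HEXHASH:COUNT lines):

  import leveldb, binascii

  db = leveldb.LevelDB('db')
  with open('pwned-passwords-sha1.txt') as f:   # hypothetical filename
    for line in f:
      # each line looks like "HEXHASH:COUNT"; the raw 20-byte digest is the key, the value is unused
      db.Put(binascii.unhexlify(line.split(':', 1)[0]), b'')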

The numpy solution is also quite easy (db3 is essentially your ..|cut|xxd .. file, I think):

  import numpy as np, hashlib
  db = np.memmap('db3', dtype=np.dtype('S20'), mode='r')
  def lookup(pw):
    m = hashlib.sha1()
    m.update(pw)
    idx = np.searchsorted(db, m.digest())
    return idx < len(db) and db[idx] == m.digest()
  %timeit lookup(b'password')
And I was running this on a system where not even db3 (which is just the binary hashes concatenated) would fit into my RAM, so for "real world" applications the results would be much worse, since it's limited more by disk IO than by what's running on top. I just reran both and got 3us for the leveldb solution and 13us for the binary search/numpy solution.


Cool. What cpu/ram did you make your timings on?

I think the author of the okon benchmarks was using an i7-6700 CPU @ 3.40GHz with 16GB RAM.

How long did the import take you?


> Not sure I actually understand why K is pushed here almost daily. Is it just the esoterics?

Kind of - they have best practices that are almost the opposite of the standard software dev best practices - and they still produce high-quality, maintainable software.

It’s interesting to see that a lot of “truths” in software development are at the very least context dependent if not outright pseudoscience.

I don’t think you can simply “loot” their best practices - although some devs do write c code the way they would write k code.

Edit: To expand on this: even if you want to use short identifiers, put multiple statements on one line, and avoid abstraction, in the Java world you will be fighting the entire ecosystem to do it.


although some devs do write c code the way they would write k code.

Including Arthur Whitney himself; this is one famous example of that, and is worth close inspection if you're curious about the whole concept of array languages in general:

https://code.jsoftware.com/wiki/Essays/Incunabulum

Observe how quickly it "climbs the abstraction ladder", making heavy use of macros --- if you try to read it like "normal" C and just jump into the middle it's definitely not going to be readable, but if you start at the top and "consume" the definitions first, it becomes surprisingly easy to understand. One thing that stands out is how it makes almost all the loops implicit, via the DO macro.


I am not a professional programmer; I use programming as a tool, and most of my stuff is written in high-level languages, though I have some C in there. Having said that, I feel like I am a competent enough programmer that if I look at someone's low-level code I get the gist of it almost immediately. I looked at this and it was in no way clear to me at all. I don't see the reason to shorten stuff like printf and main. Does reducing keystrokes add something?

> if you try to read it like "normal" C and just jump into the middle it's definitely not going to be readable, but if you start at the top and "consume" the definitions first, it becomes surprisingly easy to understand.

I will take your word for this, as you might know more about programming than me, but I feel like at least half of my colleagues need to get the gist of what is happening, not exactly what is happening down to the word. I make simulators for very complex machines. If we went around "climbing the abstraction ladder" every time something needed to be modified or added, I don't think I would personally get any work done.


> I don't see the reason to shorten stuff like printf and main. Reducing key strokes adds something?

Bugs hide in the extra chars.

On a more serious note, there are several (debatable) benefits to brevity (in programming & in general communication):

- less code to keep in your head
- less chance of stupid typos (e.g. prinft instead of printf)
- more "expressive" in that each char has more meaning
- to someone who is fluent in the "shorthand", it is quicker to both write & read


All these benefits don't seem like benefits to me. I don't want to learn a new shorthand language when I already learned an extra language, English, in order to learn programming. Also, I don't see expressiveness in terms of keystrokes; you don't store bytes in your head, you store tokens. I can have very different words for a character (char, "", etc.) but in my head they take exactly the same amount of space.

Are simple typos that annoying? Writing a wrong function call will lead to either a compile-time error or wrong behavior, and the latter only happens when the function names are not chosen smartly. Also, we have IDEs now; they are pretty good at handling typos.

I don't understand how it is quicker to remember printf("hello world") than P("hello world").


> I can have very different words for a character (char, "", etc.) but in my head they take exactly the same amount of space.

how sure are you? given how little we know about the brain, i'd be cautious with asserting something like that as a fact. while i don't think we store bytes in our head, i'm not certain that the length of a "symbol" doesn't matter either.


Maybe different people do it differently; after all, some have photographic memory, and I don't. But I certainly don't think about all the letters involved in a variable, but rather the concept of the variable, or what that variable represents. It doesn't matter if it's called "numberOfBurgersPerCustomer" or "nb" as such. But in case I forget, "numberOfBurgersPerCustomer" immediately tells me what is going on compared to just "nb".

I guess the point is that I don't want to wade through definitions just to get the gist of the code. I want to jump in there and see immediately what's going on and where the problem might be.


pritnf is an awesome typo! When you make it, your compiler can catch it. Typing "n" instead of "m" is worse.


Another thing about that code that I think irks some folks, but also reminds me of reading math, is that it's much easier if you are aware of some common conventions and notation. For instance, the lines:

    #define V1(f) A f(w)A w;
    #define V2(f) A f(a,w)A a,w;
are much more quickly recognizable if you know that in APL the left and right arguments to a function are alpha (α) and omega (ω), so you'd immediately recognize these macros as defining 1-argument and 2-argument functions.

For folks in the know, conventions like this allow quicker communication and expression, but they also raise the barrier somewhat to the uninitiated.

Side note for folks used to "modern" (post 1989) C code - this is old K&R style function definition. This SO post (https://stackoverflow.com/a/3092074) shows a comparison.


There's definitely been a marketing push; I believe Kx Systems is aggressively trying to diversify outside of finance. They recently scored a contract with a Canadian electric company to store electricity usage data for 5 million customers. At a data science meetup I went to, they presented, and I thought it'd be about machine learning. It was just a generic presentation on kdb.


They are marketing aggressively. For example, they were pitching to a bunch of people at AWS re:Invent, and very definitely marketing along the lines of it being a generic language for data + a database.

When I told their marketing folks that I had used kdb/q a bit, didn't think it a good fit for our use case, and knew from a brief sojourn in hedge-fund land how hard it was to hire kdb programmers, their response was along the lines of "It's not hard to program in, and if your guys find it hard they can just use our python wrapper".

Honestly it is an interesting thing but the extent to which it's being pushed here on a daily basis at the moment feels "inauthentic" and somewhat tiresome.


Honestly it is an interesting thing but the extent to which it's being pushed here on a daily basis at the moment feels "inauthentic" and somewhat tiresome.

Try posting something else? The front page is populated with links that others find interesting. I started posting on HN mainly because I thought the front page was declining; it's worked out for me, really: APL is on the front page every day now. If you have any good links hanging around, people would be glad to see them, and it will probably tilt the front page a little more toward things you like.

There are also 29 other spots on the front page, and some of those spots contain interesting links too! No one has time to read all of the comments on HN every day, so only reading the comments on posts you find interesting will probably be more enjoyable than reading the ones you find tiresome.


That Canadian electric company scenario happened a while back, and I don't think the system has seen much use.


Damn, really? Has it been kind of a waste then? Seemed like a good way to handle smart grid data, for an old school utility that can't feasibly roll its own large time series storage solution.


Not 100% sure, but my limited understanding is that they have everything but don't do anything with it at the moment. However, when they need it, it'll just be there, which is much better than needing to build something and then gather all that data.


Have you tried programming in it? I would call K a general-purpose language that can do pretty much anything. I can only speak personally, but learning k and q was a transformative experience that changed how I think in a fundamental way (in a similar way that lisp and haskell did when I learned them in the past).


> not once have I seen an explanation of what makes it fast and why those techniques aren't looted for open-source languages

The language design is such that the programs you write can make optimal use of cache and SIMD; the interpreter can only take advantage of that if the language lets you express your program that way. It's like saying "not once have I seen an explanation of what makes C so fast and why those techniques aren't applied to python and javascript."

(Note also that the reference J interpreter is open source and very competitive, performance-wise.)


This should apply to any high-level glue language that's wrapping primitives implemented in a "fast" language, right? I'm thinking in particular of something like python with numpy. As long as most of the work is done within the primitives, I would think they'd be pretty comparable.

Is there something about the primitives that APL-derivatives provide that means that more of your program can happen inside the primitives so the interpreter is doing less work?

I'm also curious how the performance compares against something like Julia, which is JIT compiled so you avoid the distinction between the high-level and low-level code. Seems like a rather different region of the high-performance design space.

edit: also, how does this relate to the "the whole interpreter fits in the cache" explanation that's also widely invoked to explain K's speed? It seems like if you're spending most of your time in the primitives, then the interpreter time doesn't matter. Maybe it's a combination of both: K and friends encourage you to organize your program in ways that support burning through data quickly, and additionally, when the interpreter is operating, it's a particularly fast interpreter?


> Is there something about the primitives that APL-derivatives provide that means that more of your program can happen inside the primitives so the interpreter is doing less work?

I'm not an {APL,K,J} expert by any definition, but FWIW I think that's the "array-oriented" part of it: syntactically, the operators don't specify the "rank" of their arguments, so the interpreter is free to select the most efficient low-level implementation. Something like that.
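A loose numpy analogy (my own sketch, not the parent's; first_plus_last is a made-up helper): the same expression applies at any rank, and the library, not the caller, picks the concrete loop.

  import numpy as np

  def first_plus_last(x):
    # rank-agnostic: on a vector this adds two scalars,
    # on a matrix it adds the first and last rows elementwise
    return x[0] + x[-1]

  first_plus_last(np.array([1, 2, 3]))          # -> 4
  first_plus_last(np.arange(6).reshape(3, 2))   # -> array([4, 6])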


Is there something about the primitives that APL-derivatives provide that means that more of your program can happen inside the primitives so the interpreter is doing less work?

Yes. Well, maybe not the primitives themselves, so much as how the language pushes you to use them.

Last I looked, execution of the J interpreter, especially on large data, tends to be a handful of very tight loops inside the built-in verbs' implementations, and those loops are glued together with a small amount of the higher-overhead dispatch code normally associated with interpretation. Implicit looping incurs a once-per-array-op cost, but the cost of the computation done on each array element dominates the total cost. The interpreter is not especially clever, but the language encourages the programmer to write code that's easy for the interpreter to handle.

Performance is a bit fragile though. If the operation you're lifting over a whole array includes array-lifted operations internally, now you're doing the heavy interpretive dispatch inside the loop. As a simple example, {.+{: means "add the first item to the last item." If you apply it to a vector, it adds the first and last scalars. If you apply it to a matrix, it adds the first and last rows (and so on). Applying it to a very wide matrix is cheap because {. and {: just have to select the right rows, and then you have a simple loop over those rows' elements. It's about 166ms on my laptop with ten million row elements:

       wide =. i. 2 10000000
       6!:2 '({.+{:) wide'
    0.166085
With the data oriented the other way, there's no inherent reason the computation should be any slower, but it is. ({.+{:)"1 means, "apply {.+{: to each individual vector." So we've changed from selecting the top and bottom of each column to selecting the left and right of each row. The way the J interpreter is structured, we have to figure out the loop for ({.+{:)"1, but the loop body isn't just grabbing two numbers and adding them. The loop body jumps back into the interpreter to figure out how {.+{: should work on a 2-element vector. That takes about as long as figuring out how it should work on a 2-row matrix, but we do it for all ten million vectors.

       tall =. i. 10000000 2
       6!:2 '({.+{:)"1 tall'
    2.99855
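For readers more at home in numpy, here's a rough analogue of the same fragility (my example, not the parent's; sizes are smaller than the J example so the slow path finishes quickly): a whole-array operation stays in one tight compiled loop, while mapping a Python-level function over every row pays dispatch overhead once per row.

  import numpy as np

  wide = np.arange(2_000_000).reshape(2, 1_000_000)
  tall = np.arange(2_000_000).reshape(1_000_000, 2)

  fast = wide[0] + wide[-1]        # one vectorized add across the whole row

  # naive per-row version: re-enters the Python interpreter for every 2-element row,
  # much like the J loop body re-dispatching {.+{: for each vector
  slow = np.apply_along_axis(lambda row: row[0] + row[-1], 1, tall)

  also_fast = tall[:, 0] + tall[:, -1]   # the vectorized way to express the per-row version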


I think it's the dreams of possible job heaven. With an esoteric language that can be used for finance, you can not only grow your mental gonads ("it's even smarter than Haskell!!") but also possibly get a six-figure job that will be very secure.


>not once have I seen an explanation of what makes it fast and why those techniques aren't looted for open-source languages.

It's fast (for an interpreter) because, instead of being, say, a bytecode interpreter where each instruction operates on a single value, APL/J/k interpreters execute operators over whole arrays at once with optimized C code.
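The same effect is easy to see from Python (my toy comparison, not from the parent comment): one interpreter round-trip per element versus one dispatch followed by a tight compiled loop.

  import time
  import numpy as np

  xs = np.random.rand(10_000_000)

  t0 = time.perf_counter()
  total = 0.0
  for x in xs:                 # one interpreter round-trip per element
    total += x
  t1 = time.perf_counter()

  total2 = xs.sum()            # one dispatch, then a single compiled loop over the array
  t2 = time.perf_counter()

  print(f"python loop: {t1 - t0:.2f}s   array op: {t2 - t1:.4f}s")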


And the set of operators has been designed so they work with each other and cover the basic concrete needs (i.e. what the CPU actually needs to do, rather than how you would describe it in words).

,// is a complete implementation of “flatten”.

|/0(0|+)\ is a complete implementation of an efficient maximum-subarray-sum.

In both cases, each character does one very well-defined thing implemented in tight C (sometimes SIMD) code, and the K interpreter just orchestrates how they interact.

(Parentheses here group as a pair, but everything else is an independent operation per character)
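For comparison, here is what those two one-liners compute, spelled out in Python (my reading of the k idioms, so treat the mapping as approximate):

  def flatten(x):
    # ,// : join one level at a time until the result stops changing
    while any(isinstance(i, list) for i in x):
      x = [j for i in x for j in (i if isinstance(i, list) else [i])]
    return x

  def max_subarray_sum(xs):
    # 0(0|+)\ : running sum clamped below at 0;  |/ : maximum over that scan
    best = running = 0
    for x in xs:
      running = max(0, running + x)
      best = max(best, running)
    return best

  flatten([1, [2, [3, 4]], 5])                # -> [1, 2, 3, 4, 5]
  max_subarray_sum([1, -3, 4, -1, 2, -5, 3])  # -> 5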


This is also the reason that numpy is fast.



