Python may be terrible, but this poster doesn't know Python at all. Here's how t...

oisdk · on Oct 3, 2019

I really wasn't trying to compare Python to Haskell, rather I was trying to show a few example features in Haskell with the Python code as a reference for the "standard" way to do a binary tree type thing. Other than the (admittedly awful) `__dict__` stuff, the rest of it is pretty standard. In contrast, the code you've written here is non-mutating, and uses tuples to represent a tree. If you were to google, say, "BST in Python" I'd wager almost none of the implementations would follow that style. If I was to write a skew heap in Python (that I intended to use), I would likely do it in a non-mutating way (although I certainly wouldn't use tuples and `leaf = object()`).

The point of the post was really to argue that simple features like pattern matching, ADTs, and so on, should be in languages like Python and Go. Also I wanted to make the point that functional non-mutating APIs could be simple and tend to compose well: the `unfoldr` example was all about that. In that vein, it was important that I compare the Haskell code to an imperative version.

For instance, with your `reduce` improvement: I agree that the `reduce` version is better! It's simpler, cleaner, and easier to read. But Python these days is moving away from that sort of thing: `reduce` has been removed from the top-level available functions, and you're discouraged from using it as much as possible. The point I was making is that I think that move is a bad one.

Finally, while the Python code here is shorter, you still don't get any of the benefits of pattern-matching and ADTs.

* You can only deal with 2 cases cleanly (what if you wanted a separate case for the singleton tree?).

* You are not prevented from accessing unavailable fields.

* You don't get any exhaustiveness checking.

lidHanteyk · on Oct 3, 2019

Python has some basic pattern matching. ADTs are alright, but if you notice that MLs implement them by tagged unions, then really this is a request for syntax and ergonomics, not semantics.

Python is untyped. This fundamental separation between Python and Haskell is non-trivial, and can't be papered over. Your complaints about exhaustiveness, field existence, and case analysis are all ultimately about the fact that Python's type system is open for modification, while Haskell's is closed; in Haskell, we can put our foot down and insist that whatever we see is an instance of something that we've heard of, but in Python, this is simply not possible.

I agree, when it comes to Python's moves. I am about ready to leave Python 2, but I'm not going to Python 3.

oisdk · on Oct 4, 2019

While I am all for stronger type systems, I don't agree that you need it to do sum types. We can already do one half of ADTs (classes ~= product types), I just want the other half!

In my mind, the syntax would be something like this:

    sum_class Tree:
        case Leaf:
            pass
        case Node:
            data: Any
            left: Tree
            right: Tree

    def size(tree):
        case(Tree) tree of:
            Leaf:
                return 0
            Node(_, left, right):
                return 1 + size(left) + size(right)

A combination of data classes and pattern matching.

newen · on Oct 3, 2019

So `tree` is either a `tuple` or an `object`? Seems pretty terrible in terms of typing.

lidHanteyk · on Oct 3, 2019

The original was not better. More importantly, recall that Haskell's type system is unsound, so it's not like you can trust your Haskell code either. Just because it type-checks does not mean that it works. In either language, you'll want to write some tests. I mentioned Hypothesis for Python; for Haskell, there's also QuickCheck.

newen · on Oct 4, 2019

Haskell's type system is unsound? I don't know about this and if it's true then it is some rarely used case. Don't throw away types altogether because of some rarely used case... Types are very very useful.

lidHanteyk · on Oct 5, 2019

Pick literally any type. The value `undefined` proves it, even if it shouldn't be provable. Magic~

olooney · on Oct 3, 2019

I agree with your criticisms, the Python version presented in the article is rather baroque. In particular, the use __dict__ seems to have little justification and serves only to "uglify" the Python.

Here's my version of skew heap in Python 3, edited down to just the essential operations. (Link to slightly fuller version: https://gist.github.com/olooney/97643d07d69d22015ae5bb70c121...). It seems about as clean as the Haskell version presented, mainly because I used None to represent leaves (which is quite Pythonic.) Relative to the code presented in the article, it also benefits from a clear partition of trees of immutable nodes on the one hand, and a class to represent the mutable heap interface on the other.

    from typing import NamedTuple, Any, Optional

    class Node(NamedTuple):
        """A single Node in a binary tree."""
        value: Any
        left: Node
        right: Node

    def merge(p: Optional[Node], q: Optional[Node]) -> Node:
        """
        Implements the critical "merge" operation which is
        used by all operations on a SkewHeap. The merge operation
        does not mutate either tree but returns a new tree which
        contains the least item at the root and is in heap order.
        The resulting tree is not necessarily balanced.
        """
        if p is None: return q
        if q is None: return p

        if q.value < p.value:
            p, q = q, p

        return Node(p.value, merge(p.right, q), p.left)


    class SkewHeap:
        """
        A SkewHeap is a heap data structure which uses an unbalanced binary tree to
        store items. Although no attempt is made to balance the tree, it can be
        shown to have amortized O(log n) time complexity for all operations under
        the assumption that the items inserted are in random order.
        """   
        def __init__(self, items=tuple()):
            """
            SkewHeap() -> new, empty, skew heap.
            SkewHeap(iterable) -> new skew heap initialized from the iterable.
            """
            self.root = None
            for item in items:
                self.push(item)

        def push(self, value: Any):
            """Add an item to this heap."""
            node = Node(value, None, None)
            self.root = merge(self.root, node)

        def pop(self):
            """Remove the least item in this heap and return it."""
            if self.root is None:
                raise ValueError("Cannot pop empty SkewHeap")
            else:
                value = self.root.value
                self.root = merge(self.root.left, self.root.right)
                return value

        def union(self, other: 'SkewHeap') -> 'SkewHeap':
            """Return a new heap which contains all the items of this and another heap combined."""
            ret = SkewHeap()
            ret.root = merge(self.root, other.root)
            return ret

        def __bool__(self) -> bool:
            """Return true iff the heap is non-empty."""
            return self.root is not None

    def test():
        h1 = SkewHeap()
        for item in [42, 13, 50, 11, 14, 50, 91, 72, 91]:
            h1.push(item)
        h2 = SkewHeap([63, 15, 1, 22, 91, 11, 92, 99, 93])
        h = h1.union(h2)

        while h:
            print(h.pop())

Maybe someday I'll read an article comparing two languages where the author's greater familiarity with one language over the other isn't the dominating factor, but it won't be this day.

Scarbutt · on Oct 4, 2019

Your recursive merge function will blow the stack for large inputs and it will be slow.