I had a bit of a derp here at first, I was like what, python already has a functional (as in working) standard lib, then I read the docs and was like ohhhhh that functional.
I think python made some half measures in regards to functional programming, not a bad thing since python tends to blend the different styles decently but at least for me it would be nice to have a good library to extend the functional side a little more, hopefully this scratches the itch.
Toolz, like seemingly everything Matt Rocklin is a major contributor to, is something of a model library: cleanly designed and coded, with strong documentation.
Although Python is not going to match a full Lisp, Haskell or ML in all their strengths, using a functional style can be useful and expressive. The toolz docs give some relevant background at https://toolz.readthedocs.io/en/latest/heritage.html .
I've been writing functional code for a year or so now at work with the help of ramda in nodejs, there is also a port for python.
My colleagues don't like it because it's so different from what they are used to, but I am putting out code faster and with fewer bugs, so I'm not going to stop.
Why use a library for it? Can it not be done with just Python? And if using the library is much different from normal Python, does it mitigate Python's problems with functional programming? (For example one expression only lambdas and no TCO.)
I also do use some functional concepts in my Python work, but do not use a library for it. Only procedures or functions. No additional dependencies.
> Most programmers have written code exactly like this over and over again, just like they may have repeated the map control pattern. When we identify code as a groupby operation we mentally collapse the detailed manipulation into a single concept.
If you don't use a library, then you have to re-write something like groupby many times, I would expect. Or WORSE, you don't even use the pattern, writing "code exactly like this over and over again".
You probably know this, but for the other readers, I'd like to note that "groupby" specifically is part of Python's standard library (in "itertools" module)
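For anyone comparing the two, here is a minimal sketch using only the standard library. `group_unsorted` is a hypothetical helper written for illustration (it is not toolz's implementation); it collects items into a dict of lists regardless of input order. Next to it is `itertools.groupby`, which only merges adjacent items with equal keys.

```python
from collections import defaultdict
from itertools import groupby


def group_unsorted(key, seq):
    """Hypothetical helper: collect items into a dict of lists, any input order."""
    groups = defaultdict(list)
    for item in seq:
        groups[key(item)].append(item)
    return dict(groups)


names = ["Alice", "Bob", "Charlie", "Dan", "Edith", "Frank"]

# Works on unsorted input and returns a plain dict.
by_length = group_unsorted(len, names)

# itertools.groupby only merges *adjacent* items with equal keys,
# so unsorted input yields the same key more than once.
runs = [(k, list(g)) for k, g in groupby(names, key=len)]

# Sorting by the same key first gives exactly one run per key.
sorted_runs = [(k, list(g)) for k, g in groupby(sorted(names, key=len), key=len)]
```

So the standard library function is lazy and order-sensitive, while the dict-based version has to consume the whole input before it can return anything.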
Huh, interesting. This does not remind me of the database groupby operation, but rather of partition, like in SRFI-1. I mean in natural language it is clear to me, why they'd name it groupby, but in programming terms I think partition is more appropriate, as groupby is already "blocked" by the database operation.
Often one only needs one of the partitions though, which is when filter is sufficient. Otherwise I guess one can easily write partition oneself and then use that function over and over again, without resorting to a library.
But perhaps it is a good example, so that you do not have to write partition in every project and if the additional dependency is OK to have, why not, if it is indeed a good one.
> This does not remind me of the database groupby operation, but rather of partition, like in SRFI-1. I mean in natural language it is clear to me, why they’d name it groupby, but in programming terms I think partition is more appropriate, as groupby is already “blocked” by the database operation.
But…this is exactly what a database GROUP BY does. (You’ll always have aggregations in the SELECT clause which work on the data in the groupings, but the GROUP BY itself just specifies splitting the dataset up into this kind of groupings.)
> Otherwise I guess one can easily write partition oneself and then use that function over and over again, without resorting to a library.
Yeah, literally all a library is, is a way to avoid rewriting code that someone has already written once.
Ah I see. Perhaps the need to always have an aggregation confused me, which is specific to relational databases (all? most of? some?). Of course the aggregation has to work on something and that might be the same as a partitioning. Thanks for clearing that up!
The downside of a library is often that it comes with its own dependencies and a lot of things you might not need. In general, you should not adopt a library just because one is available that, among other things, offers the one procedure you need. The decision to use a library deserves a little more thought.
In some databases like Postgres, you can do something like “SELECT x, array_agg(y) GROUP BY x” to get the exact same effect of this groupby operation in toolz.
The source code for that function (groupby in toolz) is bizarre! Creating a defaultdict where the entries are append-to-list functions, calling them, then going over the dictionary again to extract the underlying list objects. Does anyone know what this pattern is for, and why one wouldn’t just create a defaultdict(list)?
> why one wouldn’t just create a defaultdict(list)?
One shouldn't even do that. groupby is supposed to assume that the input is already sorted by the given key, so it can be implemented as a generator. What's more, it should be implemented as a generator (that's the way the Python stdlib's itertools.groupby does it), to avoid having to realize the entire iterable at once.
The Toolz version has a different purpose than the one from Python's stdlib (itertools). One is unsorted input, while the other one is for sorted input. The Toolz version is not a replacement and the documentation states
The idea is that referencing the `append` method in the for grouping loop, i.e. doing
d[key(item)].append(item)
takes more time than rebuilding the _rv_ groups dictionary with lists instead of `append` methods.
Of course, a benchmark should be run and see how much longer the input sequence needs to be than the resulting groups, for this to happen.
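A rough sketch of how one might run that benchmark. Both functions are written from the descriptions in this thread, not copied from toolz, and the names `groupby_list`, `groupby_bound_append` and `is_even` are made up for illustration. No timings are claimed here; results will vary by interpreter version and input shape.

```python
import timeit
from collections import defaultdict


def groupby_list(key, seq):
    """Straightforward version: look up .append on every iteration."""
    d = defaultdict(list)
    for item in seq:
        d[key(item)].append(item)
    return dict(d)


def groupby_bound_append(key, seq):
    """Store each list's bound append method, then recover the lists."""
    d = defaultdict(lambda: [].append)
    for item in seq:
        d[key(item)](item)  # call the stored bound method directly
    # A bound method keeps a reference to its list in __self__.
    return {k: append.__self__ for k, append in d.items()}


def is_even(n):
    return n % 2 == 0


data = list(range(10_000))

# Both variants must agree before timing them means anything.
assert groupby_list(is_even, data) == groupby_bound_append(is_even, data)

if __name__ == "__main__":
    for fn in (groupby_list, groupby_bound_append):
        seconds = timeit.timeit(lambda: fn(is_even, data), number=50)
        print(f"{fn.__name__}: {seconds:.3f}s")
```

Varying the number of distinct keys relative to the input length should show where the second pass over the dictionary stops paying for itself.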
Hell yeah me too! Some of my coworkers like it, but there are a few out there who basically don't want to learn something new. I have been using Ramda professionally for about two years now and I love the heck out of it. My org is using more and more TypeScript, which makes using Ramda a little harder, but it still works pretty well in this context too. I have been working on a test data inventory project with Ramda recently and it has made the implementation so much nicer than it would have been otherwise. I started using pipeP (by defining it with pipeWith) and omg it's so great for dealing with a mix of async and non-async calls.
We have started using types too. I've noticed some of the types in @types/ramda are wrong: they only accept arrays, but functions like R.startsWith work with strings too.
I might do a PR and update a few when I have time.
I've dealt with a Python beginner's codebase that had many deep side effects spread over multiple files (~2000 LOC total). Rewritten with FP idioms, the overall logic fit on a single page, bugproof and easy to extend.
This does play into my code: we use types, so I fully type my code, write comments, and give functions fitting names.
I find classes much harder to follow than functional code. Should they change their style because it makes me slower?
I think we all need to play to our strengths and as a team make it so it's as easy as possible to follow code even if it's in a style you don't fully understand.
Fredrik Lundh once suggested the following set of rules for refactoring uses of lambda:
1. Write a lambda function.
2. Write a comment explaining what the heck that lambda does.
3. Study the comment for a while, and think of a name that captures the essence of the comment.
4. Convert the lambda to a def statement, using that name.
5. Remove the comment.
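Applied to a toy example (the data and the name `negated_birth_year` are invented here purely for illustration), the steps might look like this:

```python
people = [("Ada", 1815), ("Grace", 1906), ("Alan", 1912)]

# Steps 1-2: a lambda, plus the comment it seems to need.
# Sort key: the birth year, negated so the most recent comes first.
newest_first = sorted(people, key=lambda p: -p[1])


# Steps 3-5: the comment turns into the name, then the comment goes away.
def negated_birth_year(person):
    return -person[1]


also_newest_first = sorted(people, key=negated_birth_year)
```

The named version also gets a readable entry in tracebacks, where the lambda would only show up as `<lambda>`.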
Yeah, Python would really benefit from multi-line anonymous functions.
There was a limitation in Python where spacing and indentation gets ignored between parentheses, which makes it impossible to pass a multi-line lambda as an argument to a method or function. However, given the new parser, that limitation might be able to be mitigated.
I would argue that multi-line anonymous functions are unpythonic. Exhibit A is this line from PEP 20:
Readability counts.
I'm currently about a month into learning a legacy codebase that was written in a functional language. If I could single out one thoroughly egregious practice that has made this code far, far more difficult to read and understand than it should have been, it's multi-line anonymous functions. In general, if a function is doing something complicated enough to need multiple lines of code, it's doing something complicated enough to merit an explicit name.
> if a function is doing something complicated enough to need multiple lines of code, it's doing something complicated enough to merit an explicit name.
def foo_and_bar(x):
    foo(x)
    bar(x)
whew! good thing i named that
IME this limitation just leads to throwaway names like
process(x)
go(x)
do_foo(x) # in foo(x)
cb/callback/fn
because a lot of things just don't have sensible names! just like how it'd suck to have to come up with a name for every loop body
(although there was some book advocating for "replace every loop body with a named function" so some people enjoy that i guess...)
I see you chopping the first two words off that quote. ;)
In this specific case, that single instance of a single pattern is such a throw-away that it doesn't deserve a name, but the pattern itself is easy enough to name. So I'd skip the single-purpose function and create a combinator.
def do_each(*args):
    def helper(x):
        for fn in args:
            fn(x)
    return helper
and then, when I need to do both foo and bar, I don't even need a lambda.
map(do_each(foo, bar), some_sequence)
That's a fairly specific case, though. Moving back to the general, I would say that a function that does more than one thing, but can't easily be named, is a code smell.
Of course, every general rule has its exceptions. But I'm not so keen on the idea of optimizing one's coding style for the exceptional cases. Going back to PEP 20, "Special cases aren't special enough to break the rules."
(I realize mapping a function that returns nothing is terrible, but I'm feeling too lazy to think of a better example.)
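One practical caveat for readers trying this in Python 3: `map` is lazy, so mapping a side-effecting function does nothing until the result is consumed. A small runnable sketch, where `foo`, `bar` and `log` are hypothetical stand-ins for the functions discussed above:

```python
def do_each(*fns):
    """Return a function that calls every fn on its argument, for side effects."""
    def helper(x):
        for fn in fns:
            fn(x)
    return helper


log = []


def foo(x):  # hypothetical stand-in
    log.append(("foo", x))


def bar(x):  # hypothetical stand-in
    log.append(("bar", x))


# map() only builds an iterator; nothing has been called yet.
lazy = map(do_each(foo, bar), [1, 2])
calls_before_consuming = len(log)

# A plain loop runs the side effects immediately and says so explicitly.
both = do_each(foo, bar)
for item in [1, 2]:
    both(item)
```

The combinator itself still earns its keep; it is only the `map` call that is the wrong tool for side effects.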
i like a nice combinator as much as the next person! but consider this: if python already had multiline lambdas, would you be arguing for using a narrow-purpose combinator instead of `lambda x: foo(x); bar(x)`?
[this kind of reminds me of Go's generics mess, where workarounds for lack of generics are "just how you write Go and that's the language's philosophy"... until generics land and suddenly they won't be]
Probably. I think the former is more readable and, at the use site, more concise.
Regardless, I don't think the hypothetical is super useful, because its unstated major premise is, "But what if we, for the sake of argument, ignore all the other good reasons why Python doesn't have them?"
My favorite programming language is functional, and has significant whitespace and multiline anonymous functions. While it is my favorite, I do have to concede that the Python language maintainers' worries about the syntactic implications of multiline lambdas in a whitespace language are accurate.
(I could quote the zen of python some more here, too. Lines 5 and 6.)
Assignment expressions were "unpythonic" based on PEP 20 until they weren't. Modern languages have multi-line anonymous functions, and developer expectations have changed in the 17 years since PEP 20 was published.
Why not? Many other languages have closures and even Python has lambdas. Multi-line closures work well for these languages including in map/reduce/etc or comprehensions where applicable. Seems like your preference is overfit to Python’s limitations.
Because you can do it in a function. It's not a limitation when you can do it. Having to define a function that's easier to read, and testable as an independent unit, is not a problem.
I don’t think this limitation is likely to change - the only ways to allow whitespace-sensitive statements inside an argument list are all really ugly.
Agreed. I suspect that most uses of python decorators become moot with proper multi-line anonymous functions. I assume some would argue that decorators create more readable code but they seemed like a syntax hack to me.
This has really not been true since python grew a ternary operator, syntactically you can write functional code perfectly fine in python -- the only significant syntactic shortcoming is lack of pattern matching.
What actually hampers functional expression is semantics and mostly two-fold:
1. functions are terribly slow and always grow stack, so cannot replace iterative constructs.
2. Although python actually comes with a fair amount of functional data structures out of the box (str, bytes, tuple, frozenset, namedtuple, ...) none of them can be "updated" efficiently.
There are some other things (exception as opposed to sum-type based error handling), but fixing the above would be enough to write functional code pretty unhampered, I think.
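To make point 1 concrete, a small sketch of the stack growth in CPython. The function names are invented for illustration. The recursive version is a tail call in form, yet it still fails once the depth passes the interpreter's recursion limit.

```python
import sys


def count_down_recursive(n):
    # A tail call in form, but each call still gets a new frame.
    return 0 if n == 0 else count_down_recursive(n - 1)


def count_down_iterative(n):
    # The loop does the same work in constant stack space.
    while n:
        n -= 1
    return n


depth = sys.getrecursionlimit() + 100

try:
    count_down_recursive(depth)
    overflowed = False
except RecursionError:
    overflowed = True

iterative_result = count_down_iterative(depth)
```

Raising the limit with `sys.setrecursionlimit` only moves the ceiling; it does not make recursion a substitute for a loop.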
> Although python actually comes with a fair amount of functional data structures out of the box (str, bytes, tuple, frozenset, namedtuple, …) none of them can be “updated” efficiently.
If they can’t be updated efficiently, they are just immutable rather than particularly functional.
Of course I agree that calling such data structures immutable is more precise, but it's still in contrast to both imperative programming and imperative data structures. In particular every single functional language (including the purest of the pure) fairly heavily uses immutable array-based structures in one way or another. So if functional programming requires only using data structures with structural sharing, there are no functional programming languages.
Python's lambda does pretty much what you would expect from Scheme. It creates a callable that binds arguments to parameters in a lexically scoped namespace and then evaluates the body of the lambda in that namespace. And with an open parenthesis, you can write multiline lambdas and indent them however you want.
All the usual wizardry is possible:
>>> (
...     lambda n: (lambda fact: fact(n, fact))(
...         lambda n, inner: 1 if n == 0 else (n * inner(n - 1, inner))
...     )
... )(5)
120
Also, Python has rough equivalents to some special forms in Scheme:
(if testexp posexp negexp) ⟶ (posexp if testexp else negexp)
(cond (p1 e1) (p2 e2) (else e3)) ⟶ (e1 if p1 else e2 if p2 else e3)
(begin e1 e2 e3) ⟶ [e1, e2, e3][-1]
(and e1 e2 e3) ⟶ (e1 and e2 and e3)
(or e1 e2 e3) ⟶ (e1 or e2 or e3)
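A quick runnable check of these equivalences. The names `classify` and `trace` are made up for the example. One difference worth keeping in mind: Python treats 0, empty strings and empty containers as false, whereas Scheme treats only #f as false.

```python
def classify(n):
    # cond: chained conditional expressions, tested left to right
    return "negative" if n < 0 else "zero" if n == 0 else "positive"


trace = []

# begin: evaluate every element in order, keep the last value
last = [trace.append("e1"), trace.append("e2"), "e3"][-1]

# and / or short-circuit and return one of their operands
first_falsy = 1 and 0 and trace.append("never reached")
first_truthy = 0 or "fallback" or trace.append("never reached")
```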
Some statements do have a functional form but aren't well known:
You have map(), filter(), partial(), and reduce(). The operator module provides function equivalents for most operators. Also, the itertools were directly based on their equivalents in functional languages or array manipulation languages.
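A short sketch of those standard library pieces working together (the variable names are only for illustration):

```python
from functools import partial, reduce
from itertools import accumulate, takewhile
import operator

numbers = [1, 2, 3, 4, 5, 6]

# map and filter return lazy iterators, so wrap them in list() to see values.
squares_of_evens = list(map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers)))

# operator.mul is the function form of the * operator.
product = reduce(operator.mul, numbers, 1)

# partial fixes the first argument of operator.add.
add_ten = partial(operator.add, 10)

# itertools: running totals, then stop at the first total that reaches 10.
running_totals = list(accumulate(numbers, operator.add))
small_totals = list(takewhile(lambda t: t < 10, running_totals))
```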
That said, Python does lack some essential tooling that you would really miss:
- There is no way to create new special forms.
- Some important special forms are missing: let, let*, and letrec
- The language spec precludes tail call optimization.
- Some Python statements lack functional equivalents: try/except, with-statement, and assert-statement
- python scoping is not really lexical (and it captures variables, not the values they contain)
- python is statement based, so lambdas are less powerful than in scheme.
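Two of the points in this list are easy to demonstrate. The sketch below shows closures capturing the variable rather than its value, the usual default-argument workaround, and an immediately applied lambda as a rough stand-in for `let` (a common idiom, not a language feature).

```python
# Each lambda closes over the variable i, not the value it had at creation,
# so all of them see the final value once the loop has finished.
late = [lambda: i for i in range(3)]
late_results = [f() for f in late]

# A default argument is evaluated when the lambda is created,
# which freezes the current value.
early = [lambda i=i: i for i in range(3)]
early_results = [f() for f in early]

# Rough equivalent of (let ((x 6) (y 7)) (* x y)).
let_result = (lambda x, y: x * y)(6, 7)
```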
For those, like me, who get excited by this note that there are restrictions on the Unicode categories that are allowed, see the supported characters¹ and gory details². It is often enough to write math-y code in a usable way, but occasionally you'll find you can't use the character you want.
Also by the lack of persistent data structures with structural sharing. That forces you to take an expensive copy whenever you want a modified version of some data, or bash it in place.
I could definitely be wrong but I think most functional data structures aren’t fully copied. Instead, they’ll utilize something like a pointer to the data that stayed the same so only the part that changes is “copied”.
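A minimal illustration of that sharing, using a hand-rolled immutable linked list. `Cons` and `to_list` are hypothetical names for this sketch; real persistent vectors and maps use trees, but the idea is the same.

```python
from typing import NamedTuple, Optional


class Cons(NamedTuple):
    """One immutable list cell: a value plus a reference to the rest."""
    head: int
    tail: Optional["Cons"]


def to_list(node):
    """Walk the cells and collect the values into a regular list."""
    out = []
    while node is not None:
        out.append(node.head)
        node = node.tail
    return out


shared = Cons(2, Cons(3, None))
a = Cons(1, shared)    # 1 -> 2 -> 3
b = Cons(10, shared)   # 10 -> 2 -> 3, reusing the very same tail cells

# With a built-in tuple, the "updated" version copies the remaining elements.
t = (1, 2, 3)
t2 = (10,) + t[1:]
```

Building `b` allocates one new cell no matter how long the tail is, while building `t2` copies every remaining element.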
Functional programming doesn’t have an inherent performance downside, but it depends on specific optimizations in compilers and runtimes to make it fast. Python doesn’t really optimize anything from what I know.
Exactly. Kind of an own-goal too, from none other than ex-BDFL. While not totally obvious, it's not impossible to widen the syntax to allow multi-line lambdas, you just need to ditch the stack-based lexer-integrated whitespace sensitivity behaviour.
> Can only use expressions, not statements. E.g. `print`s, loops, conditionals are out.
print is a function (and thus can be used in expressions)
python has conditional expressions (<true-val> if <cond> else <false-val>)
loops are a limitation, though comprehensions, map(), functools.reduce(), and the itertools module can allow lots of looping functionality in an expression.
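A few single-expression lambdas showing each point. The names are invented for the example, and binding lambdas to names like this is only for demonstration.

```python
import functools

# print is an ordinary function, so it can appear inside a lambda.
shout = lambda msg: print(msg.upper())

# A conditional expression stands in for an if/else statement.
sign = lambda n: -1 if n < 0 else (1 if n > 0 else 0)

# A comprehension or functools.reduce stands in for a simple loop.
squares_up_to = lambda n: [i * i for i in range(n)]
factorial = lambda n: functools.reduce(lambda acc, i: acc * i, range(1, n + 1), 1)
```

What still has no expression form is anything that needs try/except, with, or assignment to an outer variable.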
Python’s `lambda` can only contain a single expression.
There’s no good way to add support for full anonymous functions to Python’s grammar. One of the rules that makes significant-whitespace work elegantly is that statements can contain expressions, but never vice-versa.
Plenty of languages with significant whitespace have multiline anonymous functions, like Haskell, Standard ML, OCaml, etc. Maybe it's not possible in Python for a different reason, but the reason is not that the syntax is based on indentation.
The syntaxes of those languages are fundamentally different from Python’s. They don’t have an “expressions may not contain statements” rule - they don’t even have statements.
It sounds like you agree with me: the reason Python does not have multiline lambdas is not that it has significant-whitespace, but other decisions including the “expressions may not contain statements” rule.
Since lambda is mostly for simple, short anonymous functions, why do I need to type the whole word each time? Can they also do what JavaScript does (or something similar):
The use of lambda is becoming somewhat of a smell in Python in general. PSF's own black code formatter will complain about using it and pretty much always says to just use a def instead
I prefer functional style whenever I can. I noticed that this style slows Python programs down. It seems function calls are rather expensive. Does anybody have a similar experience?
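One way to check this on your own machine is a small `timeit` comparison like the sketch below. The function names are made up and no numbers are claimed here; the point is only that one version makes a Python-level function call per element and the other does not.

```python
import timeit

data = list(range(10_000))


def double(n):
    return n * 2


def with_map():
    # One Python-level call to double() per element.
    return list(map(double, data))


def with_inline_expression():
    # Same result, but the multiplication is inlined in the comprehension.
    return [n * 2 for n in data]


# Both versions must produce the same result before comparing their speed.
assert with_map() == with_inline_expression()

if __name__ == "__main__":
    for fn in (with_map, with_inline_expression):
        print(fn.__name__, timeit.timeit(fn, number=200))
```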
Yes, I suspect it is a pitfall of a multi-paradigm language that cannot assume so much about code in order to optimize. Opinionated functional languages (ie clojure, haskell) can have a lot more guarantees about what is going on in order to optimize all those function calls, lexical bindings, etc.
It always amazes me the lengths people will go to to avoid learning & using a different language.
Perhaps if you want statically-typed functional programming, you shouldn’t be using Python? It’s not the best choice for that - in fact, it’s almost the worst choice.
Looks like a very nice more functional-focused alternative to pydash[1] (which is a Python port of lodash.js, which in turn is a superset of another library called underscore.js - whew).
Perhaps a custom compose function can help with these use cases? This series has a few examples of composing computation in Python that might be useful.
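For reference, a hand-written compose is only a few lines. This is a minimal sketch; `compose` and `pipe` here are local helpers defined for the example, not imports from any library.

```python
from functools import reduce


def compose(*fns):
    """Right-to-left composition: compose(f, g)(x) == f(g(x))."""
    def composed(x):
        return reduce(lambda acc, fn: fn(acc), reversed(fns), x)
    return composed


def pipe(value, *fns):
    """Left-to-right: thread a value through each function in turn."""
    return reduce(lambda acc, fn: fn(acc), fns, value)


def inc(n):
    return n + 1


def double(n):
    return n * 2


inc_after_double = compose(inc, double)
```

With no functions at all, `compose()` returns the identity function, which is a handy base case when the list of steps is built dynamically.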