IncPy: Automatic memoization for Python (stanford.edu)
51 points by Goosey on July 1, 2010 | hide | past | favorite | 19 comments


There are a few responses here that seem not to have read the article or watched the video (note I didn't read the paper).

This is pretty amazing, and totally different from a memoizing decorator, and from anything that could be done at the application level (unless it did deep, deep magic of the kind you can't really do in Python).

Firstly, it is fully automatic. You don't need to annotate anything.

Secondly, it works on impure functions! (As opposed to pure functions like fib.) It tracks data dependencies through the program, including global variables, external files(!), etc. If they do not change, it reuses the cached value.

Assuming that it works as he explains it in the video, this is properly amazing (and I've a PhD in compilers).

In comparison, a memoization decorator:

- requires you to annotate functions,

- can be accidentally put on functions which are not pure, which would lead to bugs and unsafe behaviour,

- does not (safely) support functions whose results depend on data that can change outside the arguments (such as global variables or external files). Instead, it would only check the arguments.
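To make that last pitfall concrete, here's a hypothetical sketch (the decorator and names are made up for illustration) of an argument-keyed memoizer silently returning stale results once a function depends on outside state:

```python
import functools

def naive_memoize(f):
    """Made-up example: caches results keyed only on positional args."""
    cache = {}
    @functools.wraps(f)
    def wrapper(*args):
        if args not in cache:
            cache[args] = f(*args)
        return cache[args]
    return wrapper

MULTIPLIER = 2  # global state the function secretly depends on

@naive_memoize
def scaled(x):
    return x * MULTIPLIER

first = scaled(10)   # 20, computed and cached
MULTIPLIER = 3
second = scaled(10)  # still 20: the cache never noticed the global changed
```

IncPy's claim is precisely that it tracks such hidden dependencies, so the second call would be recomputed.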

Anyway, I'm impressed.


thanks for such a great summary and 'sales pitch' of my project ;) i couldn't have put it any better.

if you have any further suggestions, please email me personally.


Thanks greatly for your post.

I had the misfortune of reading the earlier comments and writing this off as a trivial accomplishment that could be disregarded, but now I know it is actually a useful tool to have in my arsenal.


hi everyone ... i'm the creator of IncPy and pleasantly surprised that it ended up on HN :)

anyways, this is still an early-stage research project where i'm the only developer, so i can't make many guarantees about its robustness (i can only test on my own machine with my own research scripts).

that said, though, i am very interested in getting new users and feedback. please email me at the email address listed on the IncPy website if you have any questions. i am more than happy to answer them!


I just tried to run Mercurial's test suite with it and it segfaults all over the place, unfortunately. It's too bad -- it seems like it could speed up some Mercurial operations quite a bit.


hi stevelosh,

sorry that it didn't work out of the box; as you can tell, it's still a research prototype :)

i'll add the Mercurial test suite to my TODO list of programs to try. in your experience, what kind of Mercurial operations do you think can be sped up by memoization?


Like http://www.erikyyy.de/compilercache but for any Python program. Very cool.


http://ccache.samba.org/ is a much more widely used version of the same thing.


Erm, not really. A compiler cache saves compiled versions of files. When building a large project, it notices files that have been built before and looks up the compiled code in a central cache. If you have to do a "make clean" (e.g. some scripts for making Debian packages start this way), you still get fast rebuilds.

The linked article is about run-time memoization: rather than caching the compiled version of a function's code, you cache just the output of the function for particular inputs.


"rather than caching the compiled version of a function's code, you cache just the output of the function for particular inputs."

Think of the function as the compiler and the inputs as your C files. The kinds of things the two are doing are very similar, but IncPy's job is much harder.


You could just use a small memoize decorator like this one

  class Memoize:
      def __init__(self, f):
          self.f = f
          self.mem = {}
      def __call__(self, *args, **kwargs):
          key = (args, str(kwargs))
          if key in self.mem:
              return self.mem[key]
          tmp = self.f(*args, **kwargs)
          self.mem[key] = tmp
          return tmp


Yes, you can cheat if you use your own memoizer too, if you "know" what your function does. Mine, for instance, memoizes data fetched over the network even though, as a function, that is wildly impure; it can either age entries out or check the last modification time. So I have one decorator for pure computations performed locally and another for impure ones performed remotely. Obviously this can only be done if the same person/team is maintaining both, or it'll all go horribly wrong...
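The "age it out" variant might look roughly like this (a sketch with made-up names; the parent's actual code isn't shown):

```python
import time
import functools

def memoize_with_ttl(max_age_seconds):
    """Hypothetical decorator: cache results, but expire entries by age."""
    def decorator(f):
        cache = {}  # args -> (timestamp, value)
        @functools.wraps(f)
        def wrapper(*args):
            now = time.time()
            hit = cache.get(args)
            if hit is not None and now - hit[0] < max_age_seconds:
                return hit[1]
            value = f(*args)
            cache[args] = (now, value)
            return value
        return wrapper
    return decorator

calls = []

@memoize_with_ttl(max_age_seconds=60)
def fetch(url):
    calls.append(url)  # stand-in for a real network fetch
    return "data for " + url

fetch("http://example.com")
fetch("http://example.com")  # within 60s: served from cache, no second fetch
```

The point is that this only stays correct because the author decided, out of band, how stale the remote data is allowed to be.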


Not the same at all: IncPy stores the results to disk, so you can rerun your script and it will still have correct cached results for long-running functions.


great point; the persistence of the IncPy cache is one of the biggest differentiators from an ordinary memoize decorator
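For a feel of what disk persistence buys you, here's a minimal sketch of the idea using the stdlib shelve module (this is not how IncPy implements it; the decorator and names are invented for illustration):

```python
import os
import shelve
import tempfile
import functools

def disk_memoize(cache_path):
    """Hypothetical decorator: persist results so they survive across runs."""
    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args):
            key = repr((f.__name__, args))  # shelve keys must be strings
            with shelve.open(cache_path) as db:
                if key in db:
                    return db[key]
                value = f(*args)
                db[key] = value
                return value
        return wrapper
    return decorator

@disk_memoize(os.path.join(tempfile.gettempdir(), "demo_cache"))
def slow_square(n):
    return n * n  # imagine a long-running computation here

print(slow_square(12))  # computed once; later runs read it back from disk
```

An in-memory decorator loses everything when the interpreter exits; a persistent cache is what makes rerunning a whole analysis script cheap.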


You should str(args) too, not just kwargs, to be able to use parameters like lists or dicts as keys for self.mem.
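Concretely, applying that fix might look like this (a sketch; repr-based keys assume the arguments have stable, value-like reprs):

```python
class Memoize:
    def __init__(self, f):
        self.f = f
        self.mem = {}
    def __call__(self, *args, **kwargs):
        # Stringify both args and kwargs so unhashable values work as keys
        key = (repr(args), repr(kwargs))
        if key not in self.mem:
            self.mem[key] = self.f(*args, **kwargs)
        return self.mem[key]

@Memoize
def total(xs):
    return sum(xs)

print(total([1, 2, 3]))  # 6 -- a list argument no longer raises TypeError
```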


Why not optimize such scenarios at the application level?

It is possible to implement a trivial and transparent cache as a decorator and apply it to particular bottleneck functions.

0 development effort, 0 running overhead, 0 maintainability issues.


How could it be less development effort than an automatically memoizing interpreter?

It's hardly transparent if it's a decorator.

Maintenance issue: applying it to a function which isn't pure. Avoiding that adds development overhead.

There are other advantages I outlined in my comment.


What about just making a new def-macro for Python? A very simple Common Lisp implementation would be something like this:

  (defmacro defmemoize (fn val &body body)
    ;; with-gensyms is non-standard CL (see On Lisp or Alexandria)
    (with-gensyms (lst hash-code)
      `(let ((,lst (make-hash-table :test #'equal)))
         (defun ,fn ,val
           (let ((,hash-code (list ,@val)))
             (or (gethash ,hash-code ,lst)
                 (setf (gethash ,hash-code ,lst)
                       (progn ,@body))))))))
With this, the standard Fibonacci example would be:

  (defmemoize fib (n)
      (if (<= n 2) ; first two = 1
          1
          (+ (fib (1- n))
             (fib (- n 2)))))
Obviously, it would be something more advanced than this, but it shouldn't have to be much more advanced.
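For comparison, Python's standard library already ships roughly this facility as a decorator rather than a macro:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache, keyed on the arguments
def fib(n):
    # first two = 1, mirroring the Lisp example above
    return 1 if n <= 2 else fib(n - 1) + fib(n - 2)

print(fib(40))  # 102334155, effectively instant thanks to the cache
```

Like the macro version, this only works for pure functions with hashable arguments; it does none of IncPy's dependency tracking.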


I like the atom construct in Clojure; see a functional implementation of memoization there, instead of one based on macrology: http://clojure.org/atoms



