Julia for Numerical Computation in MIT Courses (github.com/stevengj)
80 points by sharmi on Nov 26, 2013 | 50 comments


> Traditionally, these sorts of courses at MIT have used Matlab

This is an incredibly good sign for hobbyists in a number of fields where Matlab completely dominates the research end of the spectrum. In the DSP realm the amount of experimental software that relies on Matlab and the number of research publications that pack in Matlab implementations is enough to be a pain in the side for anyone outside of an institution where Matlab licenses are taken for granted. Hopefully it will become a trend.


Agreed - it's always a sinking feeling to see code attached to a paper and discover that it's MATLAB or, even worse, MATLAB plus some toolkit/addon. Of course, that's still better than a paper without ANY code...


Python+Numpy+Scipy+Spyder/IPython have made headway around these parts in displacing MATLAB/Simulink.

Simple single installers (Python(x,y), etc) have certainly helped.


Octave?


Octave is close enough to Matlab to have inherited most of its flaws, yet not close enough to be a complete drop-in replacement. Julia, while obviously inspired by Matlab, is not attempting any sort of compatibility, and thus can fix those flaws and be a better language.

Octave, in my opinion, should be relegated to (trying to) run legacy Matlab code, while Julia or Numpy should be the first choice for any new code.


> Octave, in my opinion, should be relegated to (trying to) run legacy Matlab code, while Julia or Numpy should be the first choice for any new code.

Octave dev here. I agree.

Every time I have to implement yet another idiocy of the Matlab language, I die a little inside. It's a dirty job, but someone has to do it.


Come work on a modern functional language!


You mean mathics? Perhaps. I don't know how important it is to liberate mathematica code. It doesn't seem to have the deployment that Matlab has.


And it's greatly appreciated.


I tried to get into Julia, but I was put off when it seemed that the language doesn't support vectorization the way that Matlab does, but rather encourages writing your loops explicitly (which is fast thanks to the LLVM JIT compiler). Yuck.

I am an Octave/Matlab fan myself, and think that Julia/Python/R/etc. don't quite hit the sweet spot that Matlab does for matrix-driven algorithms and 100-1000 line programs. Call me old fashioned...


Julia does support vectorization just like MATLAB, but I think what you may be referring to are some of the speed traps that currently exist, where undesirable/unnecessary copies occur. In practice this is often the reason that writing devectorized code is better in Julia right now.

But this issue is well identified in the Julia dev community, and is on the roadmap for 0.3: https://github.com/JuliaLang/julia/issues/4853
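To illustrate the copy problem in plain Python terms (a rough analogy only, not Julia itself): a vectorized expression like `x = a + b + c` first materializes a temporary array for `a + b` before adding `c`, while a "devectorized" loop touches each element exactly once with no intermediate allocation.

```python
# Rough analogy of vectorization's hidden temporaries, in plain Python.
n = 100_000
a, b, c = [1.0] * n, [2.0] * n, [3.0] * n

# "Vectorized" style: the intermediate a+b is fully materialized.
tmp = [ai + bi for ai, bi in zip(a, b)]          # a whole temporary array
x_vec = [ti + ci for ti, ci in zip(tmp, c)]

# "Devectorized" style: one pass over the data, no intermediate array.
x_loop = [ai + bi + ci for ai, bi, ci in zip(a, b, c)]

assert x_vec == x_loop  # same result, one fewer allocation
```

The results are identical; the difference is purely in how many full-size temporaries get allocated along the way, which is exactly the trap the linked issue is about.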


> I was put off when it seemed that the language doesn't support vectorization the way that Matlab does

Yea, coming from Matlab it just feels so wrong to write out loops by hand. Fortunately it seems some people are working on solving that here: https://github.com/lindahua/Devectorize.jl


This seems like a strange choice to me. The only reason for choosing Julia seems to be speed, but in courses like these I can't see speed being much of an issue; what matters more is the ability to leverage existing code, and the help available online and offline, while learning the core mechanics of numerical programming.

In my opinion Julia is great but still a bit immature.


It may indeed be a strange choice, but I think there are many reasons one might try Julia beyond the speed aspect. The main one for me (and I think the language designers also state something to this effect) is the language-wide paradigm of multiple dispatch on parametric types. Coming originally from a C/Fortran old-school numerical computing background, it seems to me that much of the best numerical software (your LAPACKs, FFTWs, and whatnot) eventually ends up at the same point: a flat collection of foo functions with some protocol over the function suffixes and signatures, so that a uniform top-level foo interface can reach an optimized foo_* at the bottom level, at the granularity of basic fooness as understood by end users. Julia formalizes this structure, which I think is part of how it achieves speed, but more importantly it allows people to focus on programming to the essential abstraction and to add methods that optimize or specialize after the fact.
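A hand-rolled sketch of that foo/foo_* pattern, in plain Python with made-up names (Python lacks multiple dispatch, so the routing table is explicit here; in Julia the compiler does this selection from the method signatures):

```python
# Specialized "bottom level" implementations with a naming protocol.
def norm_dense(x):
    """Euclidean norm of a dense sequence of numbers."""
    return sum(v * v for v in x) ** 0.5

def norm_sparse(x):
    """Euclidean norm of a sparse vector stored as {index: value}."""
    return sum(v * v for v in x.values()) ** 0.5

# The uniform "top level" foo that end users call; it dispatches on the
# argument's type, the way Julia picks a method from the call signature.
_DISPATCH = {list: norm_dense, tuple: norm_dense, dict: norm_sparse}

def norm(x):
    return _DISPATCH[type(x)](x)

print(norm([3.0, 4.0]))        # 5.0
print(norm({0: 3.0, 9: 4.0}))  # 5.0
```

The point of the comment above is that Julia bakes this dispatch into the language, so the table never has to be maintained by hand and new specializations can be added after the fact.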


I love this post – you really get it :-)


Learning the speed of algorithms might well be part of the course - and in julia there is no sudden speed ramp between "done in fortran" and "done in matlab scripts". So you can write the naive, and then the progressively smarter implementation, and see how they speed up.


I don't think speed is the only reason for using it - from my perspective, in this context it's more like a free (in both senses) drop-in replacement for MATLAB, but with less "weirdness" than MATLAB (and Octave), that is also fast.

I also think any "immaturity" is really only in the package ecosystem, especially after the release of 0.2. Even then, the package ecosystem is strong for the things people really need (naturally), and of course you can also call any Python package from Julia if you want.


We try not to implement some of Matlab's biggest weirdness whenever possible:

http://wiki.octave.org/FAQ#How_is_Octave_different_from_Matl...

Sadly, it frequently isn't possible.


Must be frustrating! Octave provides a valuable service to the community.


A question: are you getting good performance from Julia? I've seen posts that say so, but my code that does Monte Carlo simulation (both vectorized and devectorized) seems to be very, very slow compared to Matlab or Numpy.


I've been testing Julia on and off for the past few months (rewriting various bits of real-world numpy code I've written for work) and I've seen everything from a 2x slowdown to a 20x speedup compared to numpy. Most of the time I'm not seeing much more than a 2x speedup.


We implemented a "serious" version of the simplex algorithm for linear programming (sparse-matrix/sparse-vector operations that can't be handled by standard libs), and we got performance no worse than half that of the equivalent C++ code with bounds checking. Slides here: http://www.mit.edu/~mlubin/informs2013.pdf


Giving an example of something that can go wrong would be a good starting point for understanding why.


Hehe. Get them while they're young.

I'll have to take a look through this and write out the analogous code with my Numerical Haskell libs after the holiday season to compare.


Please do! I've been keeping an eye out for a post from you on the haskell subreddit announcing some work.

I have some ideas for some numerical code in Haskell, but I keep coming back to numpy/julia/matlab for my day-to-day.


You could also subscribe to the mailing list I have set up on my website at WellPosed.com, though I will cross-post to reddit and Haskell-cafe once it's released (it will be pre-alpha quality, mind you). I'm aiming for sometime after Xmas or around New Year's.


So is Julia intended to be a general purpose language? I keep hearing about it only in the context of numerical computation.


It started as a language targeted for numerical computation.

But as usual, many developers resist "the best tool for the job" in favor of "one language to rule them all", and Julia is now starting to grow libs for everything.

Not that that's bad - quite the contrary. The language is quite nice and already achieves C-like speed for many of its features.

So I can easily see it getting a place in the mainstream.


it's surprisingly good. it feels much more like a general language than matlab does. i have been using it privately as a "faster python" and have been very happy with it (but for work that is arguably still mathematical in nature - generating procedural art and looking at block ciphers).

in fact, i didn't realise it was intended to be a matlab replacement at first. i just looked at the specs and thought it looked interesting (i still haven't used the interactive dohickey (ijulia)). if you come to it from that pov, the only thing that seems odd is the 1-based indexing (and if you're old enough to have used fortran then even that doesn't feel so strange).


I think it's fair to say that it shines for numerical computation, but you can perform other tasks with Julia with ease.

For example, the idea of writing some number-crunching as a webservice with MATLAB or Octave is not an option, but with Julia it's totally possible (and has been done more than once already) - and it builds on strong foundations (e.g. libuv). Here's an example (disclaimer: my own site): http://iaindunning.com/2013/sudoku-as-a-service.html
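The shape of such a service is simple enough to sketch in a few lines. Here's a toy version in Python's stdlib `http.server` (the linked post does the real thing in Julia; the endpoint, path format, and "sum of squares" computation here are all made up for illustration):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Toy number-crunching endpoint: GET /1,2,3 returns {"sum_of_squares": 14.0}.
class CrunchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        nums = [float(x) for x in self.path.lstrip("/").split(",")]
        body = json.dumps({"sum_of_squares": sum(n * n for n in nums)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

# Bind to port 0 so the OS picks a free port, serve in a background thread.
server = HTTPServer(("127.0.0.1", 0), CrunchHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

resp = json.loads(urlopen(f"http://127.0.0.1:{port}/1,2,3").read())
print(resp["sum_of_squares"])  # 14.0
server.shutdown()
```

Nothing Julia-specific about the pattern itself; the comment's point is that Julia can play in this space at all, which MATLAB/Octave realistically can't.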


Why is the IPython and scientific package installation always such a nightmare?


I am very impressed with Anaconda's free edition [1]. It works well on Linux and has more packages [2] than Enthought's free edition. Anaconda is the commercial offering from Travis Oliphant, the creator of Numpy.

[1]: https://store.continuum.io/cshop/anaconda/

[2]: Packages: http://docs.continuum.io/anaconda/pkgs.html


Because some people like to use operating systems that don't have software installation and maintenance figured out.

    apt-get install ipython
Done.


That doesn't solve the problem. Here's another version:

Why is installing a specific version of a package you fancy (e.g. the latest, or some previous one) a nightmare?

And there, apt-get doesn't help you much.


    pip install mypackage==1.2.3
I believe the general formula is apt-get for python and python-pip, then use pip for everything past that point.

And in general, virtualenvs are the right way to go for anything other than throwaway scripts. I've got a global numpy/scipy/requests install so that I can just do `ipython -pylab` and start getting work done. But real project code happens in a virtualenv.

On a mac anaconda works pretty well. I sent a link to a marketer with minimal python experience and she was up and running in under an hour. (Admittedly, she's rather more adept than a normal marketer...)
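The virtualenv workflow mentioned above can be sketched like this (shown with the stdlib `venv` module; the third-party virtualenv tool works much the same way, and the directory name is made up):

```shell
# Create an isolated, per-project Python environment.
python3 -m venv myproj-env

# Activate it: python/pip now resolve inside the environment.
. myproj-env/bin/activate
python -c "import sys; print(sys.prefix)"   # prints a path inside myproj-env

# pip installs now land in myproj-env, not the system site-packages.
# pip install numpy   (network required, so left commented here)

deactivate
```

The payoff is exactly what the comment describes: project dependencies stay pinned per project, while the global install stays clean for quick `ipython -pylab` sessions.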


First of all, you can use APT to install older versions. Second, you can always come up with a case where the packages in the repos don't do 100% of what you desire (maybe you want different build options, or dependency versions, etc.). The packages in the repos are designed to cover most cases, and to work together with the other packages on your system. So I have to ask: are you SURE you need something the packages don't give you? If so, then you have agreed to take on the extra work to get what you want, and it's understood that it may or may not be trivial. This applies to any software.

Even so, APT does help you quite a bit with build dependencies, so whatever non-standard thing you've decided to do, you can still do it with (hopefully) not too much work. Feel free to ask me about any specific thing you're having trouble with.


If you have trouble with it, you might prefer to try Enthought Canopy. I've used it a bit, and install is a breeze. I've not figured out how to install outside packages yet, but the actual base install is very simple and includes most of the scipy packages.

Just to note also: I'm completely unaffiliated with Enthought except that I have a registered account with them for occasional feedback.


Anaconda is another option. It comes with almost everything I need, and you can use pip to install the few things it doesn't.


My goto recommendation for a fresh installation: http://fonnesbeck.github.io/ScipySuperpack/


For the same reason any set of non-trivial software is: flexibility and the number of options mean many things can go wrong.

Solutions like Enthought Canopy, superpack or Anaconda exist to solve that exact problem (disclaimer: I work for Enthought).


Anaconda and Python(x,y) on windows help out a ton. Similarly for Homebrew on OSX. On Linux, if you have a package manager you're probably good to go.


Because you are not using Anaconda...


Making some overgeneralizations here: software written by scientists isn't going to be as good as software written by software engineers. Their usual purpose (and their training/background) is to publish papers, and writing code is just a means to an end.


Hm... why teach Julia, if everything (90%) in the numerical/scientific community is Fortran/C(++)?


Far from 90% of code written in the numerical world is Fortran/C. And even code that ends up being written in Fortran almost always starts its life in something like MATLAB or Python. Only after all the kinks have been worked out and all the bottlenecks identified might some of the code be ported to Fortran or C. And even in those cases it often only makes sense to rewrite the most critical 10%-20% of the code.


Agree with the above.

In addition, all of these higher level languages popular in science seem to have pretty good support for calling existing C/Fortran libs when necessary. Julia is no exception, which makes it possible to take advantage of all the work that's been done developing optimized numerical codes.

That said, I would like to see something more automatic. In my limited experience, Julia's ccall works as advertised, but f2py in Python is much simpler to use.
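For comparison, here is what that kind of foreign call looks like from Python with the stdlib ctypes module (a minimal sketch of the general idea; the library lookup assumes a Unix-like system where the C math library is findable, and may differ on Windows):

```python
import ctypes
import ctypes.util

# Load the C math library (libm) and describe the C signature of cos:
#     double cos(double x);
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

print(libm.cos(0.0))  # 1.0
```

Julia's ccall expresses the same thing (library, symbol, return type, argument types) in a single call site, which is part of why wrapping C libraries there tends to be only a few lines.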


it's trivial to call c from julia. the first project i used it on, the julia wrapper around cairo (the svg library) didn't support something i wanted. just reading the library code i figured out how to extend the wrapper in about half an hour. it was three lines of julia code. i was very surprised / impressed.


I think it's meant to compete with the likes of MATLAB, Python/Scipy and R which are probably the most used languages in scientific research. As it stands now most prototyping is done in one of those high level languages and only if you really need the speedup do you port it to C/C++. Julia wants to be a high level dynamic language where you get the speedup for free.


Gotcha. I see people trying to use numpy for CPU-intensive computations (e.g. FFT back and forth), and then they ask why it is SO slow?!..


I think the idea is that you can move the threshold for when to reimplement in Fortran/C(++), since you only have a 1x-5x slowdown for most common applications, compared to the 20x-5000x slowdown with R/Python/Matlab/...



