Smile – Statistical Machine Intelligence and Learning Engine (haifengl.github.io)
120 points by haifeng on March 29, 2017 | 27 comments


"Matlab ©"

Hehe, if you're going to be defending the Mathworks' trademarks, the proper symbol to use is ®, not ©. But who can keep copyright and trademark laws apart, right? It's all the same as long as it's someone else telling you that some intangible thing isn't yours. /s

https://www.mathworks.com/company/aboutus/policies_statement...

But seriously, folks, it's not your job to defend the Mathworks' trademarks. It's their job. You can be deferential and do their job for them, but it's not legally required. It seems like kowtowing to me, giving Matlab more respect than it deserves.

This actually is a pet peeve of mine, because I see so many non-Mathworks employees cargo-culting those ® and ™ symbols around without knowing why, because they see the Mathworks themselves using them so much. As a GNU Octave developer, it feels a bit odd to see how effective the Mathworks is at convincing their followers to display the proper respect and protocol towards their products.


While we're at it, the trademarked name is in all caps - MATLAB


Interesting project! As a Scala developer, I am always curious when I see a project that's mostly Java with a dash of Scala. Seems like one meta-pattern these days is to use Scala to create a DSL around a Java project. I'd just go full Scala myself :D but it's nice to see Scala co-mingling happily with Java in a large, important, useful OSS project.


How does adding

    Array(1.0, 2.0, 3.0, 4.0)
and

    Array(4.0, 3.0, 2.0, 1.0)
result in

    Array(1.7302967433402214, 2.547722557505166, 3.3651483716701107, 4.182574185835056)


Floats are weird and unpredictable like that. You can save yourself an RNG call if you simply add two floats. ;-)

More seriously, the result here really is x + y/norm(y). I would be really interested in knowing what the bug is here. Probably just a C&P error that forgot to update the result, since y is unitized (in-place?) in a call further below.
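To check that arithmetic, here is a minimal sketch in plain Scala with no Smile dependency. The names `UnitizedSum`, `norm` and `addUnitized` are made up for illustration and are not Smile's API; the sketch only assumes that "norm" means the Euclidean norm.

```scala
object UnitizedSum {
  // Euclidean (L2) norm of a vector (assumed meaning of "norm" here).
  def norm(v: Array[Double]): Double = math.sqrt(v.map(a => a * a).sum)

  // Element-wise x + y / ||y||, the expression suggested above.
  // Hypothetical helper, not part of Smile.
  def addUnitized(x: Array[Double], y: Array[Double]): Array[Double] = {
    require(x.length == y.length, "vectors must have the same length")
    val n = norm(y)
    x.zip(y).map { case (a, b) => a + b / n }
  }

  def main(args: Array[String]): Unit = {
    val x = Array(1.0, 2.0, 3.0, 4.0)
    val y = Array(4.0, 3.0, 2.0, 1.0)
    // Should print values close to the ones quoted from the webpage.
    println(addUnitized(x, y).mkString("Array(", ", ", ")"))
  }
}
```

With norm(y) = sqrt(30) ≈ 5.477, the first element comes out as 1 + 4/5.477 ≈ 1.730, which lines up with the quoted output, so the numbers are at least consistent with y having been unitized before the addition.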


Is there something wrong in your code? I don't see the effect.

    smile> val x = Array(1.0, 2.0, 3.0, 4.0)
    x: Array[Double] = Array(1.0, 2.0, 3.0, 4.0)

    smile> val y = Array(4.0, 3.0, 2.0, 1.0)
    y: Array[Double] = Array(4.0, 3.0, 2.0, 1.0)

    smile> x + y
    res2: smile.math.VectorAddVector = Array(5.0, 5.0, 5.0, 5.0)


I just quoted the example in the webpage (section "Vector Operations"), I didn't run it myself.


Good eyes :) It should be a c&p error.


Ok, they have fixed the website now.


That does seem very strange.


Came across this yesterday. Can someone (maybe the poster?) talk about when you would use this versus something like scikit-learn or any number of R libraries? Is the goal simply to have all machine learning in Java so it can be productionized more easily?


The project homepage says "Data scientists and developers can speak the same language now!". So it is surely easier to productionize an ML project without rewriting the algorithms after the data scientists work out the model with R or Matlab.


There are more Python developers than Scala developers. There are more Python data scientists than Scala data scientists. I like the project, though.


There are more Java developers than Python developers :)


I don't know that that's necessarily true. The most recent StackOverflow survey[1] shows a difference of 8%, which is not an overwhelming majority. Granted, that's not an unbiased sample, but I think the OP above is correct... more data scientists use Python than Java.

So anyone wanting to use this library would have to think about tradeoffs: Are the efficiencies lost in data scientists learning to use Java for modeling worth the efficiencies gained in putting a model in production? For some, the answer may be yes, for some no.

[1] https://stackoverflow.com/insights/survey/2017#technology-pr...


Looks useful...BTW the shell prompt for "Hello World" example is misspelled: smlie> "Hello world"


How does this compare to the implementations of Scala in Jupyter notebooks?

I was assuming that it would be something more like Matlab's gui environment or maybe RStudio.

It could be helpful to add an introductory paragraph, especially since "hello world" and "2+3" follow right after the heading "Linear Algebra".


Smile is an awesome library. If you use it in Java, Tablesaw is a data-frame-like data-munging framework that works well with it. https://github.com/lwhite1/tablesaw


It's not THAT R-like. Looking at the front page, their bar chart of performance of their machine learning algorithms is done in Excel. I like to think that no R user would post R benchmarks using an Excel bar chart.


running stuff from the console is R-like; clearly Java doesn't have a ggplot2


  ... clearly Java doesn't have a ggplot2

JFreeChart[0] is likely what many would reach for in the JVM ecosphere to perform ggplot2-type functionality, though Scala devs might want to use something like scala-chart[1] or similar.

0 - http://www.jfree.org/jfreechart/samples.html

1 - https://github.com/wookietreiber/scala-chart


This software has its own visualization package. See it at

http://haifengl.github.io/smile/visualization.html


Would be nice if there was a simple way to use this from Clojure.


With Incanter you can do all of the examples on the welcome page.

It certainly looks a bit dated but it comprises LOTS of statistics and related functionality.

I've been using it for my job(s) for years now and I am still amazed by its depth.

And it's native Clojure.

http://incanter.org/


This looks awesome.


Well done


Looks very nice.



