I think you're missing the point. The jimmies are getting rustled because someone provided false information about the performance to make his own argument seem better. That's something anyone should be against.
the writer makes an unconvincing claim that the original post was wrong. the data presented shows only that if you try really hard and get lucky enough, you can probably do as well as a simple regression in this case.
the author himself admits that deep learning is probably misapplied here, and that training with such small data is difficult, at best. which again brings us back to the important question (i.e. the point being made by the original post): why would you ever do this?
>if you try really hard and get lucky enough, you can probably do as well as a simple regression in this case.
Maybe you aren't familiar with deep learning, but this isn't "trying really hard." This is basic stuff that anyone using deep learning probably knows.
And deep learning doesn't just "do as well" as the simpler model. It does meaningfully better at all sample sizes.
Define "meaningfully better." Perhaps you mean statistically significantly better? It may have better accuracy, but it has significantly less interpretability. What does it capture that regression couldn't capture? At least with regression you can interpret the relationship between all of the variables and their relative importance by looking at the coefficients of the regression. With deep learning, the best approaches for explanation are to train another model at the same time that you use for explanation. Additionally, it was proven that a perceptron can learn any function, so in some senses the "deep" part of deep learning is because people are being lazy because at least you could get a better interpretation of the perceptron. I don't mean to imply that there's not a place for deep learning, but I think this isn't a great refutation of the argument that fitting a deep model is somewhat inappropriate for a small dataset.
Not sure what kind of argument that is. If something overfits, it will show lower error on the data at hand; does that make it better? It may generalize far worse when run on more data. Whether or not something is meaningful depends on what you take the meaning to be.
It doesn't matter that it's on the holdout; he's partitioning an already small dataset into 5 folds and talking about the accuracy of using 80 points to predict 20 points. The whole argument usually leans on the law of large numbers to claim a statistically significant difference in accuracy, and when you're predicting 20 points each with 5 (potentially different) models, you likely don't have enough to talk about statistical significance.
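Rough numbers behind that point, using the normal approximation for the uncertainty of an accuracy estimate (the 95% accuracy level is just illustrative):

    # Back-of-the-envelope: the standard error of an accuracy estimate on n test
    # points is roughly sqrt(p * (1 - p) / n). Differences of a few percent are
    # invisible at n = 20 and resolvable at n = 1000.
    import math

    def accuracy_se(p, n):
        return math.sqrt(p * (1 - p) / n)

    for n in (20, 100, 1000):
        half_width = 1.96 * accuracy_se(0.95, n)
        print("n = %4d  ->  95%% CI roughly +/- %.1f points" % (n, 100 * half_width))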
We tried to mirror the original analysis as closely as possible - we did 5-fold cross validation but used the standard MNIST test set for evaluation (about 2,000 validation samples for 0s and 1s). We split the test set into 2 pieces. The first half was used to assess convergence of the training procedure while the second half was used to measure out of sample predictive accuracy.
Predictive accuracy is measured on 1000 samples, not 20.
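For what it's worth, here is a minimal sketch of that protocol under stated assumptions: sklearn's small digits dataset stands in for MNIST 0s and 1s, an MLPClassifier stands in for the original deep model, and the sizes and hyperparameters are made up.

    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.model_selection import KFold, train_test_split
    from sklearn.neural_network import MLPClassifier

    # Stand-in data: the 8x8 digits dataset, filtered to 0s and 1s.
    X, y = load_digits(return_X_y=True)
    X, y = X[y < 2] / 16.0, y[y < 2]

    # A tiny training set (the point of the original argument) and a larger external test set.
    small_X, test_X, small_y, test_y = train_test_split(
        X, y, train_size=100, random_state=0, stratify=y)

    # Split the external test set in two: one half to monitor convergence,
    # the other half to measure out-of-sample predictive accuracy.
    conv_X, eval_X, conv_y, eval_y = train_test_split(
        test_X, test_y, test_size=0.5, random_state=0, stratify=test_y)

    fold_scores = []
    for train_idx, _ in KFold(n_splits=5, shuffle=True, random_state=0).split(small_X):
        clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1,
                            warm_start=True, random_state=0)
        best, stalls = 0.0, 0
        for epoch in range(200):                        # manual early stopping
            clf.fit(small_X[train_idx], small_y[train_idx])
            acc = clf.score(conv_X, conv_y)             # convergence check on half 1
            best, stalls = (acc, 0) if acc > best else (best, stalls + 1)
            if stalls >= 10:
                break
        fold_scores.append(clf.score(eval_X, eval_y))   # accuracy on half 2

    print("out-of-sample accuracy: %.3f +/- %.3f"
          % (np.mean(fold_scores), np.std(fold_scores)))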
Honest question: who cares about interpretability if you're optimizing for predictive power?
Also, DL can be interpretable in different domains, much like any non-linear classifier (are you hating on random forests too for the same reason?). It just takes more work than looking at linear coefficients.
This is an area that fades in and out of focus through venues such as the Workshop on Human Interpretability in Machine Learning (WHI) [1]. It's becoming increasingly important for auditability and for understanding what is actually learned by the algorithm: preventing classifiers from learning to discriminate based on age, race, etc. [2], or working in domains like medicine where it's important to know what the algorithm is doing. Work on understanding DL doesn't make it truly interpretable in any domain; typically you train another (simpler, less accurate) model and use that to explain what the original model is doing, or use perturbation analysis to try to tease out what it has learned. If all you care about is getting the right answer and not why you get that answer, maybe it doesn't matter.
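A minimal sketch of both strategies, assuming an sklearn-style setup; the black-box model, the surrogate depth, and the dataset are placeholders, not anything from the posts:

    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.inspection import permutation_importance
    from sklearn.neural_network import MLPClassifier
    from sklearn.tree import DecisionTreeClassifier, export_text

    data = load_breast_cancer()
    X, y, names = data.data, data.target, list(data.feature_names)

    # The "black box" we want to explain.
    black_box = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                              random_state=0).fit(X, y)

    # (1) Global surrogate: a shallow, readable tree trained to mimic the black box's predictions.
    surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
    surrogate.fit(X, black_box.predict(X))
    print(export_text(surrogate, feature_names=names))

    # (2) Perturbation analysis: shuffle one feature at a time and watch the score drop.
    imp = permutation_importance(black_box, X, y, n_repeats=5, random_state=0)
    for i in np.argsort(-imp.importances_mean)[:5]:
        print("%-25s %.3f" % (names[i], imp.importances_mean[i]))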
I wouldn't say I'm hating on DL, or on random forests, ensembles, etc., but when you have very little data, fitting an uninterpretable, high-dimensional model might not be the right answer, in my opinion; see [3].
maybe you aren't familiar with reading graphs, but no, it really doesn't. one graph with mostly overlapping error bars does not inspire great confidence.
also, it isn't at all clear to me that the cross-validation method employed is sufficient to rule out overtraining. nor is it clear that the differences claimed are replicable or generally applicable enough to make a counterargument to the original post.
Really? I have the exact opposite experience; I feel much freer to study and work on whatever I want now that I've graduated. I would always feel bad for working on my own stuff instead of just working on the class project or homework.
I don't have the source anymore, but it was an article in which someone at Google HR mentioned that a particular team rarely approved anyone for hiring. To prove the point, they anonymized the team members' own resumes and asked the team to rate them.
I saw a lecture (about tech interviews) by a former Google engineer who at the time was working at Etsy. He mentioned that the experiment was performed on him and his team. The hiring committee he was a part of had acquired a reputation for being particularly difficult to please, and they ended up rejecting all of their own hiring packets. I was skeptical when I heard the story: either Google does a fantastic job anonymizing the hiring packets that get reviewed, or maybe they had been at Google long enough that they forgot their own process and didn't clue in. However, this engineer did mention it himself, so it doesn't seem to be an urban legend to me. With some clever searching I'm sure you can find the video.
For me the problem is the number of sites I'd like to support to some extent. I'd love to donate to The Guardian, but I also frequent The Atlantic, the BBC, and many other sites, and actually supporting all of them would take time and effort. I took a look at some Chrome extensions that try to make this easier (namely tipsy), but they seem to have very few publishers signed up.
https://blendle.com/ lets you pay a small amount per article, pulls from a variety of sources, and offers a no-questions-asked refund if you decide it wasn't worth it.
One day someone will come up with "paypal for articles online" - one button to comfortably and securely click to pay a few cents for an article you want to read.
And of course get enough traction for it to get popular and be used by enough people to be sustainable.
Until then, your problem will be mentioned on HN again and again...
A better way would be a service with a browser extension that tracks the number of visits to distinct URLs on the sites you tell it to track. At the end of the month, it applies a multiplier to the visit counts and either (a rough sketch of that settlement step follows the list):
1. Autopays using PayPal or something of the sort, or
2. Opens the payment links with the relevant fields autofilled.
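Something like this for the end-of-month step (purely hypothetical; the per-visit rate, monthly cap, and site list are made up, and a real version would live in the extension itself):

    # Hypothetical end-of-month settlement: distinct article visits per tracked
    # site, times a per-visit rate, scaled down if the total exceeds a monthly cap.
    from collections import Counter
    from urllib.parse import urlparse

    PER_VISIT_RATE = 0.02   # dollars per distinct article visited (illustrative)
    MONTHLY_BUDGET = 10.00  # never pay out more than this in a month

    def monthly_payouts(visited_urls, tracked_sites):
        counts = Counter()
        for url in set(visited_urls):               # distinct URLs only
            host = urlparse(url).netloc
            if host in tracked_sites:
                counts[host] += 1
        raw = {site: n * PER_VISIT_RATE for site, n in counts.items()}
        total = sum(raw.values())
        scale = min(1.0, MONTHLY_BUDGET / total) if total else 0.0
        return {site: round(amount * scale, 2) for site, amount in raw.items()}

    history = ["https://www.theguardian.com/a", "https://www.theguardian.com/b",
               "https://www.theatlantic.com/x"]
    print(monthly_payouts(history, {"www.theguardian.com", "www.theatlantic.com"}))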
Check out Brave (the browser) with built-in payments. Of course, this doesn't solve your problem of who's signed up, but Brave has a protocol for that scenario: they reach out to sites and offer them the money before returning it to you.
I currently have a ThinkPad with Ubuntu on it (not preinstalled, should that matter), and unfortunately I'm considering switching the other way around. There are just too many things in this combo that don't work the way they should: the webcam and microphone not working, public wifi redirect pages not redirecting, etc. I'm completely fine paying some 50% premium to get the support and customer care of a major corporation.
Microsoft never had an ecosystem of applications. Samsung would, if it offered an Android runtime and made developers resubmit their Android apps to its own store. Samsung has enough market share to force them to do so.
LA County has almost 10M people living in it; SF has only 800K. If housing prices were equal, you'd expect LA's housing stock to be more than ten times the size of SF's (assuming an equal number of people per dwelling).
Yes, it's difficult to make a fair comparison across counties of such vastly different areas and populations. LA County includes far-off places like Palmdale, which bring the average housing price way down. If you only looked at more central neighborhoods like Hollywood, Beverly Hills, Santa Monica, etc., you'd see distortion more akin to SF's.
A lot of the people commenting on the article seem to be pretty adamant that something like this matters. Personally, I've never been confused by a period (or lack thereof) at the end of a commit message. And if such a time comes, I'll give serious thought to whether it's worth anyone's time to teach the team to use (or not use) periods, or whether I'm better off investing that time teaching them something more useful.
>Finland has a cultural problem: working for a big company is considered good, working for a small one is considered being unsuccessful and failure at founding a startup is considered a shame.
Maybe 10-15 years ago. Now, with the rise of successful startups and the national interest in them, working at a startup is considered good and trendy; if anything, most CS students avoid the large companies. Even my mom was proud and supportive when I cofounded a startup, and she's always been the one to advocate for a good, stable job.