I didn't see any classification measures besides accuracy...
Could you provide the confusion matrix? It's easy to derive all the important values from it (FPR, TPR, precision, and so on) and get a much better idea of the quality.
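Something along these lines, with made-up counts just to illustrate how everything falls out of the four cells (Python):

    # Hypothetical confusion matrix for a binary success/fail classifier
    tp, fn = 400, 100   # actual successes: predicted success / predicted fail
    fp, tn = 150, 350   # actual failures:  predicted success / predicted fail

    accuracy  = (tp + tn) / (tp + tn + fp + fn)   # 0.75
    tpr       = tp / (tp + fn)                    # recall / sensitivity: 0.80
    fpr       = fp / (fp + tn)                    # 0.30
    precision = tp / (tp + fp)                    # ~0.73
    print(accuracy, tpr, fpr, precision)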
Since the outcome is binary (success or not), a coin-flip classifier yields an accuracy of 50%. It would be worth noting that in the paper.
It's true that I should have mentioned it in the paper. The baseline here is always predicting failure, which gives about 52% accuracy. I should definitely provide the confusion matrix; it's on my TODO list. I will add a performance section to the website where I show this kind of information.
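In the meantime, the majority-class baseline itself is trivial to reproduce; here is a minimal sketch using scikit-learn's DummyClassifier (the class proportions below are placeholders, not our actual dataset):

    import numpy as np
    from sklearn.dummy import DummyClassifier

    # Placeholder labels: ~52% of campaigns fail (0 = failed, 1 = successful)
    y = np.array([0] * 52 + [1] * 48)
    X = np.zeros((len(y), 1))  # features are irrelevant to the dummy baseline

    # Always predicts the most frequent class, i.e. "failed"
    baseline = DummyClassifier(strategy="most_frequent").fit(X, y)
    print(baseline.score(X, y))  # 0.52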
I didn't mean to put down the article. Quite the opposite: it's truly an amazing application and a well-thought-through execution of the machine learning methods.
I have been wondering for some time what supervised learning could be useful for that would benefit a broader audience (beyond computer vision, gaming, medicine, and all the other popular, oft-repeated applications). You nailed that perfectly.
Yes, I just have to see whether I can gather the same kind of information from those other websites, but I definitely want to compare the results I have with other crowdfunding platforms.
Let's say that the way the project starts is important. How long, exactly, depends on the campaign duration. After 4 hours, the accuracy is 76% on average, meaning that you can tell quite accurately fairly early. We focus on early predictions because this is when they are helpful, i.e., when the creator still has time to react. We have very high accuracy near the end, but that does not help so much...
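To give an idea of what "accuracy as a function of time" means, here is a toy sketch of the evaluation (synthetic data, not our actual pipeline: the signal-to-noise ratio of the observed pledges grows as the campaign progresses):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n = 2000
    quality = rng.normal(0, 1, n)            # latent campaign quality
    y = (quality > 0).astype(int)            # success iff quality is positive

    # Train/evaluate using only what is observable at each point in time:
    # the signal accumulates with elapsed time, the noise does not
    for elapsed in (0.05, 0.25, 0.5, 1.0):
        X = (quality * elapsed + rng.normal(0, 0.3, n)).reshape(-1, 1)
        acc = cross_val_score(RandomForestClassifier(), X, y, cv=5).mean()
        print(f"{elapsed:.0%} of duration elapsed: accuracy = {acc:.2f}")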
One possible explanation: the initial boost is largely a reflection of the pre-launch marketing efforts, which are in turn a measure of the effectiveness of the marketing effort generally. So you could read this as saying that marketing is important, rather than that there is something special in particular about the large "pop".
Sure, many factors can explain the initial "surge" in pledges: pre-launch marketing, being featured among recently launched projects, novelty, the first wave of sharing on social media, etc. I did not really try to explain the causes of the different phases, but merely to use the raw data to predict the state at the end (successful or failed).
From your findings, is it possible to create a sentence formulated like this: "If, after X hours, your KS has $X and X links, your chance of success is X in X"? If not, can you create a comparable "headline fact"?
I haven't done it yet, but I could. It also depends on what your goal is and how long the campaign lasts, but it is quite easy to extract such probabilities of success. For instance, this plot shows the estimated probability of success as a function of time and the proportion of the goal pledged: http://imgur.com/hoTzBEW. From it, you can see that you need to be at about 25% or more of your goal by the middle of your campaign if you want a better chance of success than failure. Similar plots could be made by including more factors.
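The plot itself is just empirical frequencies; conceptually it boils down to something like this, assuming a pandas DataFrame of per-campaign snapshots (the column names here are hypothetical):

    import pandas as pd

    # Assumed columns: 'time_frac' (fraction of duration elapsed),
    # 'pledged_frac' (fraction of goal pledged so far),
    # 'succeeded' (final outcome, 0 or 1)
    df['time_bin'] = pd.cut(df['time_frac'], bins=10)
    df['pledge_bin'] = pd.cut(df['pledged_frac'].clip(upper=2.0), bins=10)

    # Estimated P(success) per cell: the share of campaigns in that
    # (time, pledged) bin that eventually succeeded
    p_success = (df.groupby(['time_bin', 'pledge_bin'], observed=True)
                   ['succeeded'].mean())
    print(p_success.unstack('pledge_bin'))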
What I found curious was the difference in goals between successful and unsuccessful projects. It seemed that lower goals had a higher chance of succeeding (all things considered). Would that be the right conclusion to draw?
That is correct: on average, failed projects ask for more money than successful ones, except in the Video Games category, where it is the opposite. I'll add a "Stats" page to the website as soon as possible to show these results.
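Once the scraped data is in a table, that comparison is essentially a one-liner; a sketch with hypothetical column names:

    import pandas as pd

    # Assumed columns: 'category', 'goal' (in USD), 'succeeded' (bool)
    avg_goal = df.groupby(['category', 'succeeded'])['goal'].mean().unstack()
    # True where failed projects ask for more on average; Video Games is
    # the exception
    print(avg_goal[False] > avg_goal[True])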