Thanks for adding that to the discussion. I'd like to point out a couple of things:
(1) That adding features can create problems is well known among good ML practitioners (I daresay, especially those with a fair amount of exposure to non-deep-learning techniques). With deep learning you can afford to worry less, since with enough data and compute the network can figure out what to ignore, which is convenient. Throwing out uninformative features may still have a practical benefit, though: fewer features -> smaller dataset -> faster training.
(2) This is probably a minor nitpick: adding more features can fail to improve things not only because of the curse of dimensionality, but sometimes simply because the feature has no bearing on the label; that is to say, you might not be adding noise, but you're not adding information either. (A rough sketch of both points is below.)
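If it helps, here's a minimal sketch of what I mean, assuming scikit-learn is available. The toy dataset, the mutual-information threshold, and the choice of model are all just illustrative assumptions on my part, not anything from the thread:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# 10 features, but only 4 actually relate to the label; the rest are pure noise.
X, y = make_classification(n_samples=2000, n_features=10,
                           n_informative=4, n_redundant=0, random_state=0)

# Estimate the mutual information between each feature and the label;
# columns with no bearing on the label score near zero.
mi = mutual_info_classif(X, y, random_state=0)
keep = mi > 0.01  # crude threshold, just for illustration

clf = LogisticRegression(max_iter=1000)
print("all features: ", cross_val_score(clf, X, y, cv=5).mean())
print("kept features:", cross_val_score(clf, X[:, keep], y, cv=5).mean())
# Accuracy typically comes out about the same, which is point (2): the dropped
# columns weren't adding information. But the kept-features matrix is smaller,
# which is point (1): fewer features -> smaller dataset -> faster training.
```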