I would expect an entry-level candidate to be able to answer basic questions verbally, such as:
- Explain why a neural network (NN) with n nodes spread across multiple hidden layers can model a more complex structure than an NN with the same n nodes in a single hidden layer.
- When using an NN, when is a cross-entropy cost function appropriate, and when is it not?
- Why can a single perceptron not represent the XOR operation?
- Why is NN training data divided into three sets: training, generalisation (test), and validation? What is the purpose of each? Must the three sets be mutually exclusive?
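
To give a feel for the level of answer I have in mind for the XOR question, here is a minimal sketch in Python. It is only an illustration, not a proof: the helper names `perceptron` and `separable` and the search grid are my own assumptions. It brute-forces the weights and bias of a single threshold unit and checks whether any combination reproduces a given truth table.

```python
from itertools import product

# Truth tables: inputs (x1, x2) mapped to the target output.
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def perceptron(w1, w2, b, x1, x2):
    """Single threshold unit: outputs 1 when the weighted sum is positive."""
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

def separable(table, grid):
    """Return True if some (w1, w2, b) drawn from the grid reproduces the table."""
    for w1, w2, b in product(grid, repeat=3):
        if all(perceptron(w1, w2, b, *x) == y for x, y in table.items()):
            return True
    return False

# Hypothetical search grid: weights and bias from -2.0 to 2.0 in steps of 0.25.
grid = [i / 4 for i in range(-8, 9)]

and_ok = separable(AND, grid)
xor_ok = separable(XOR, grid)
print("AND representable:", and_ok)
print("XOR representable:", xor_ok)
```

The search finds weights for AND (for example w1 = w2 = 1, b = -1.5) but none for XOR. A good verbal answer explains why no grid would ever help: a single perceptron draws one straight decision line, and the XOR constraints b <= 0, w1 + b > 0, w2 + b > 0, and w1 + w2 + b <= 0 contradict each other.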