While training the network, keep in mind that the goal is for it to generalize, which means reducing overfitting. Generalization is the ability to learn from one set of data and correctly apply that knowledge to data the network has not seen. Several factors control the balance between overfitting and generalization:
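- Number of parameters: adjust the size of the network according to the gap between the training score and the test score (a training score much higher than the test score suggests overfitting), while also taking into account the complexity (non-linearity) of the data.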
- Dropout: randomly deactivating a fraction of the neurons during training to reduce overfitting.
- Regularization: adding an L1 or L2 penalty on the weights to the loss, which discourages large weights.
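The three controls in the list above can be sketched in plain NumPy. This is a minimal illustration, not the implementation of any particular framework: the helper names (`generalization_gap`, `dropout`, `l1_penalty`, `l2_penalty`), the dropout rate, the penalty strength `lam`, and the placeholder scores and loss values are all assumptions made for the example.

```python
import numpy as np

def generalization_gap(train_score, test_score):
    """Difference between training and test score; a large positive gap hints at overfitting."""
    return train_score - test_score

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a random fraction `rate` of units and rescale the survivors."""
    if not training or rate == 0.0:
        return activations  # dropout is disabled at prediction time
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

def l1_penalty(weights, lam):
    """L1 regularization term: lam times the sum of absolute weights."""
    return lam * np.sum(np.abs(weights))

def l2_penalty(weights, lam):
    """L2 regularization term: lam times the sum of squared weights."""
    return lam * np.sum(weights ** 2)

rng = np.random.default_rng(0)

# Hypothetical scores, only to show how the gap is read.
gap = generalization_gap(train_score=0.99, test_score=0.90)

# Hypothetical hidden-layer activations (batch of 4, 8 units).
hidden = np.ones((4, 8))
dropped = dropout(hidden, rate=0.5, rng=rng)

# Hypothetical weight matrix and data loss; the penalty is added to the loss.
weights = np.array([[1.0, -2.0], [0.5, 0.0]])
data_loss = 0.3
total_loss = data_loss + l2_penalty(weights, lam=0.01)
```

In practice a deep learning framework provides dropout layers and weight penalties directly; the sketch only shows what those options compute.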
After you have trained the network, you can reuse it to make predictions on another handwritten-digit dataset. Reusing a trained network on a new but related dataset, often after fine-tuning some of its layers on the new data, is called transfer learning.
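One common form of transfer learning keeps the trained hidden layer fixed and retrains only the output layer on the new dataset. The sketch below illustrates that idea in NumPy under stated assumptions: the layer sizes, the learning rate, and the function names are hypothetical, and the "pretrained" weights and the new dataset are random placeholders standing in for a real trained network and real digit images.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def features(x, w_hidden):
    """Frozen hidden layer (ReLU); these weights are never updated here."""
    return np.maximum(0.0, x @ w_hidden)

def cross_entropy(x, y, w_hidden, w_out):
    """Mean cross-entropy loss of the two-layer network on (x, y)."""
    probs = softmax(features(x, w_hidden) @ w_out)
    return -np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12))

def fine_tune_output(x, y, w_hidden, w_out, lr=0.1, epochs=200):
    """Retrain only the output layer on the new dataset with gradient descent."""
    h = features(x, w_hidden)
    onehot = np.eye(w_out.shape[1])[y]
    for _ in range(epochs):
        probs = softmax(h @ w_out)
        grad = h.T @ (probs - onehot) / len(x)  # gradient of the loss w.r.t. w_out
        w_out = w_out - lr * grad
    return w_out

rng = np.random.default_rng(0)

# Placeholders for weights learned on the original digits dataset.
w_hidden = rng.normal(size=(64, 32)) * 0.1
w_out = rng.normal(size=(32, 10)) * 0.1

# Placeholder for the new dataset: 100 flattened 8x8 images with labels 0-9.
x_new = rng.random((100, 64))
y_new = rng.integers(0, 10, size=100)

w_hidden_before = w_hidden.copy()
loss_before = cross_entropy(x_new, y_new, w_hidden, w_out)
w_out_tuned = fine_tune_output(x_new, y_new, w_hidden, w_out)
loss_after = cross_entropy(x_new, y_new, w_hidden, w_out_tuned)
```

Freezing the hidden layer keeps the features learned on the original data, so only a small number of weights have to adapt to the new dataset.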