Well, all of us feel good when we do well at what we set out to do, and this competition let me experience that on Kaggle.
This article summarizes what this competition on Digit Recognition using CNNs is all about and how to submit your predictions to Kaggle as a CSV file.
Loading and understanding data
The data files contain gray-scale images of hand-drawn digits, from zero through nine.
As seen in the output, the training data set has 785 columns. The first column, called “label”, is the digit that was drawn by the user. The rest of the columns contain the pixel-values of the associated image.
These values are integers between 0 and 255 inclusive and they indicate the lightness or darkness of that pixel, with higher numbers meaning darker.
The test data set is the same as the training set, except that it does not contain the “label” column.
Importing necessary packages and libraries
Splitting the training data into input and labels
We created two new data frames with the desired columns such that the labels are stored separately as Y_train.
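A minimal sketch of that split (using a tiny randomly generated frame as a stand-in for the real train.csv, which has the same layout):

```python
import numpy as np
import pandas as pd

# Stand-in for pd.read_csv("train.csv"): a "label" column
# followed by 784 pixel columns, as in the competition data.
rng = np.random.default_rng(0)
train = pd.DataFrame(rng.integers(0, 256, size=(8, 784)),
                     columns=[f"pixel{i}" for i in range(784)])
train.insert(0, "label", rng.integers(0, 10, size=8))

# Separate the labels from the pixel values.
Y_train = train["label"]
X_train = train.drop(columns=["label"])

print(X_train.shape, Y_train.shape)  # (8, 784) (8,)
```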
I used a plot to visualize the distribution of the different labels, since a skewed distribution can affect how well the model trains; here the dataset doesn't appear to have too many or too few examples of any particular label.
Reshaping and normalizing input data
Each image is 28 pixels high and 28 pixels wide, for a total of 784 pixels, and the images are grayscale, so they use only one channel. We therefore reshape each one into a 28 x 28 x 1 3D matrix.
We normalize the data by dividing all values by 255 to change the values of numeric columns in the dataset to a common scale, without distorting differences in the ranges of values.
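Both steps can be sketched in a few lines of NumPy (again using random stand-in data in place of the real pixel columns):

```python
import numpy as np

# Stand-in for the pixel values of 8 images (values 0-255).
rng = np.random.default_rng(0)
X_train = rng.integers(0, 256, size=(8, 784)).astype("float32")

# Reshape each flat 784-vector into a 28 x 28 x 1 image
# (one channel, since the images are grayscale).
X_train = X_train.reshape(-1, 28, 28, 1)

# Normalize pixel values from [0, 255] to [0, 1].
X_train = X_train / 255.0

print(X_train.shape, X_train.min(), X_train.max())
```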
This encoding approach is very simple: each label in the column is represented as a number.
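The digit labels are already numbers 0 through 9; in Keras pipelines that train with categorical cross-entropy, they are commonly one-hot encoded on top of that. A NumPy-only sketch of that step (it is an assumption about the intended encoding; `keras.utils.to_categorical` produces the same result):

```python
import numpy as np

labels = np.array([3, 0, 9, 3])  # example digit labels

# One-hot encode: each label becomes a length-10 vector with a 1
# at the index of the digit, zeros elsewhere.
one_hot = np.eye(10)[labels]

print(one_hot.shape)  # (4, 10)
print(one_hot[0])     # 1.0 at index 3, zeros elsewhere
```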
Splitting the data into the train and validation data set
The training set is split into two parts: 10% becomes the validation set and the rest is used to train the model.
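A sketch of that 90/10 split using scikit-learn's `train_test_split` (the stand-in data and the `random_state` value are assumptions for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 100 images (28 x 28 x 1) with digit labels.
rng = np.random.default_rng(0)
X = rng.random((100, 28, 28, 1)).astype("float32")
Y = rng.integers(0, 10, size=100)

# Hold out 10% of the training data for validation.
X_train, X_val, Y_train, Y_val = train_test_split(
    X, Y, test_size=0.1, random_state=2)

print(X_train.shape, X_val.shape)  # (90, 28, 28, 1) (10, 28, 28, 1)
```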
Data augmentation
Data augmentation covers techniques that increase the amount of data by adding slightly modified copies of existing examples, or synthetic examples newly created from them. This mainly helps reduce overfitting (the model learning the training data too well and failing to generalize to other datasets).
As shown in the code, the ImageDataGenerator from Keras is used to perform image augmentation. Please take a look at the documentation for the full list of transformations it can perform.
The original digits were transformed to reproduce variations that could occur when someone is writing a digit:
- Random rotation of up to 10 degrees
- Random zoom of up to 10%
- Random horizontal shift of up to 10% of the image width
- Random vertical shift of up to 10% of the image height
The convolutional neural network (CNN)
Features included in the model to enhance performance:
- Batch normalization: a technique for training deep neural networks that standardizes the inputs to a layer. This stabilizes the learning process and reduces the number of training epochs required. It also provides some regularization, which helps reduce generalization error.
- Dropout: during training, neurons are randomly switched off with the specified probability, which forces the remaining neurons to compensate and prevents the network from relying too heavily on any single neuron, reducing overfitting.
- Adam optimizer: an extension of stochastic gradient descent that is popular because it achieves good results quickly, as it computes individual adaptive learning rates for different parameters.
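The three features above can be combined in a small Keras model. This architecture is a sketch of my own, not the notebook's exact layout:

```python
from tensorflow.keras import layers, models

# A small CNN using batch normalization, dropout, and Adam.
model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.BatchNormalization(),           # standardize layer inputs
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.4),                   # randomly drop 40% of units
    layers.Dense(10, activation="softmax"),  # one probability per digit
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

print(model.output_shape)  # (None, 10)
```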
Submitting your results
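The competition expects a CSV with an ImageId column and a Label column. A sketch of building it from the model's predictions (random probabilities stand in for `model.predict(test)` here):

```python
import numpy as np
import pandas as pd

# Stand-in for model.predict(test): one probability row per test image.
rng = np.random.default_rng(0)
probs = rng.random((5, 10))

# Take the most probable digit for each image.
predictions = probs.argmax(axis=1)

# Kaggle expects ImageId (starting at 1) and Label columns.
submission = pd.DataFrame({
    "ImageId": np.arange(1, len(predictions) + 1),
    "Label": predictions,
})
submission.to_csv("submission.csv", index=False)
print(submission.shape)  # (5, 2)
```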
For more information, you can refer to this notebook, and if you found it useful, please do upvote it.
Hope this article helps you rank higher on Kaggle ;)