CNN with an accuracy of 99%

Abinaya Jayaprakash
4 min read · Nov 5, 2020

Digit recognition

Everyone likes to do well at what they attempt, and this Kaggle competition gave me exactly that experience.

This article summarizes what the Digit Recognition competition using CNNs is all about and how to submit your predictions to Kaggle as a CSV file.

Loading and understanding data

The data files contain gray-scale images of hand-drawn digits, from zero through nine.

As seen in the output, the training data set has 785 columns. The first column, called “label”, is the digit that was drawn by the user. The rest of the columns contain the pixel-values of the associated image.

These values are integers between 0 and 255 inclusive and they indicate the lightness or darkness of that pixel, with higher numbers meaning darker.

The test data set is the same as the training set, except that it does not contain the “label” column.
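To make this layout concrete, here is a minimal sketch that builds a tiny stand-in for the two tables instead of reading the real files. The random pixel values, the five-row size and the `pixel0` … `pixel783` column names are assumptions for illustration; on Kaggle you would load the competition CSVs with `pd.read_csv` instead.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the competition data (assumption: random pixels,
# 5 rows). The real tables would come from pd.read_csv on the Kaggle files.
rng = np.random.default_rng(0)
pixel_columns = [f"pixel{i}" for i in range(784)]  # 784 pixel columns + 1 label = 785

train = pd.DataFrame(rng.integers(0, 256, size=(5, 784)), columns=pixel_columns)
train.insert(0, "label", rng.integers(0, 10, size=5))  # the label is the first column

# The test table has the same pixel columns but no label.
test = train.drop(columns="label")

print(train.shape)  # (5, 785)
print(test.shape)   # (5, 784)
```

Checking `shape` like this right after loading is a quick way to confirm that the label column is where you expect it to be.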

Importing necessary packages and libraries

Splitting the training data into input and labels

We create two new data frames from the training set: one holding the pixel columns as input, and one holding the labels, stored separately as Y_train.
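A minimal sketch of this split could look as follows. The miniature table is made up, and `X_train` is an assumed name for the input frame (only `Y_train` is named in the text); here the labels end up in a pandas Series rather than a full data frame.

```python
import pandas as pd

# Hypothetical miniature training table: a "label" column plus two pixel columns.
train = pd.DataFrame(
    {"label": [3, 7, 1], "pixel0": [0, 12, 255], "pixel1": [8, 0, 30]}
)

Y_train = train["label"]                 # the digits to predict
X_train = train.drop(columns=["label"])  # pixel values only

print(X_train.shape)  # (3, 2)
print(Y_train.tolist())  # [3, 7, 1]
```

Dropping the column by name rather than by position keeps the split correct even if the column order changes.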

I plotted the distribution of the labels, since a strong class imbalance can affect how well the model trains. No digit appears to be noticeably over- or under-represented.
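The same check can be done numerically. The sketch below counts how often each label occurs in a made-up label series (the values are illustrative, not the competition data) and converts the counts to shares.

```python
import pandas as pd

# Made-up labels for illustration; in practice this would be Y_train.
Y_train = pd.Series([0, 1, 1, 2, 2, 2, 3, 3], name="label")

counts = Y_train.value_counts().sort_index()  # occurrences per digit, ordered by digit
share = counts / counts.sum()                 # fraction of the data per digit

print(counts)
print(share.round(3))
```

If matplotlib is installed, `counts.plot(kind="bar")` draws these counts as a bar chart.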

Reshaping and normalizing input data
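The exact code for this step is not shown here, so the following is only a sketch under stated assumptions: each row of 784 pixel values is treated as a 28 × 28 image with a single gray-scale channel, values are scaled from 0–255 down to the range 0–1, and the channels-last layout is chosen because Keras convolution layers use it by default. The array `X_flat` is a random stand-in for the pixel table.

```python
import numpy as np

# Stand-in for the pixel table: 4 flattened images with 784 values in [0, 255].
rng = np.random.default_rng(0)
X_flat = rng.integers(0, 256, size=(4, 784))

# Normalize: convert to float and scale the pixel values to [0, 1].
X = X_flat.astype("float32") / 255.0

# Reshape: (samples, 784) -> (samples, height, width, channels).
X = X.reshape(-1, 28, 28, 1)

print(X.shape)  # (4, 28, 28, 1)
```

With a pandas data frame as input, calling `.values` (or `.to_numpy()`) first gives the array that is reshaped here. The same two steps need to be applied to the test data so that the model sees inputs in the same format.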
