Digit recognition

Well, we all enjoy scoring well at what we do, and this competition let me experience that on Kaggle.

This article summarizes what the Digit Recognition competition using CNNs is all about and how to submit your predictions to Kaggle as a CSV file.

Loading and understanding data

Did the movie get a positive or negative review?

Sentiment classification uses natural language processing and machine learning to interpret the emotions expressed in the input data. It is a text analysis technique that detects polarity, that is, whether a text is positive or negative.

This article explains the steps for building a sentiment classification model using the “IMDB dataset of 50k movie reviews”.

Understanding the dataset

A few techniques that could come in handy :]

This article briefly explains SQL queries, statistical tests, and visualization methods using the New York subway weather data.

These are covered in detail in the course “Intro to Data Science” by Udacity, which I highly recommend.

Wrangling subway data

  • Using SQL queries: Pandasql makes accessing, reading, and interpreting data stored in data frames much easier, especially for someone who is new to Python or pandas. It helps us choose the columns/data we need to make predictions or arrive at conclusions. …
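
To give a feel for the kind of query meant here, this is a minimal sketch that uses Python's built-in sqlite3 module in place of Pandasql, so it runs without extra installs. The table name, the column names, and the four rows are hypothetical stand-ins, not values from the subway weather data.

```python
import sqlite3

# Hypothetical rows standing in for the subway weather data:
# (station unit, hourly entries, rain flag). These are not real values.
rows = [
    ("R001", 120, 0),
    ("R001", 340, 1),
    ("R002", 80, 0),
    ("R002", 200, 1),
]


def average_entries_by_rain(rows):
    """Return {rain_flag: average entries} computed with a plain SQL query."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(
            "CREATE TABLE weather_data (unit TEXT, entries INTEGER, rain INTEGER)"
        )
        conn.executemany("INSERT INTO weather_data VALUES (?, ?, ?)", rows)
        # Pick only the columns we need and aggregate them per rain flag.
        query = """
            SELECT rain, AVG(entries)
            FROM weather_data
            GROUP BY rain
            ORDER BY rain
        """
        return dict(conn.execute(query).fetchall())
    finally:
        conn.close()


averages = average_entries_by_rain(rows)
print(averages)
```

With Pandasql, the same query text would be run against a data frame rather than a database table; the SQL itself is the part that carries over.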

Why not try out machine learning along with data analysis??

Machine learning is a branch of artificial intelligence where we construct models/systems that learn and study data to make predictions.

This is what I tried out as well, using the analysis I did on the Titanic data set: a model that can predict the chance of survival for any passenger based on data about them.

Let us begin creating our model. What would the first step be?

Understanding your data set

The data set I chose is “Titanic: Machine Learning from Disaster” from Kaggle, which contains separate train and test data files.

Check out these must-do projects :]

How can I learn to code better using python?

I guess this is a question everyone who codes asks themselves frequently. Well, one of the best ways is to create your own projects and to try modifying snippets of code to make them more elegant and effective.

And in particular, I found some super cool mini-projects at the end of the “Learn Python for free” course provided by Scrimba.com. The course comprises around 59 interactive screencasts that help beginners learn from scratch and help experienced programmers refresh their knowledge and maybe look at different ways of implementing code.

Let’s look at one of the projects in detail and the steps…

From a beginner’s perspective !!

My first data analysis project as a newbie in data science: identifying the different factors that affected survival rates among passengers who were aboard the Titanic.

This project mainly focuses on the survival rates of passengers depending on their sex, age, socio-economic status and a few other factors. I mainly used the pandas library and also integrated a few SQL techniques. I also included various coding practices I came across along the way.

Data Wrangling

Understanding the variables in the dataset. What does each column represent?

Abinaya Jayaprakash

Sri Lankan living in Berlin. Mathematics master's student at Freie Universität. Interested in data science & machine learning.
