Investigating Titanic Data
From a beginner’s perspective!
My first data analysis project as a newbie in data science: identifying the factors that affected survival rates among passengers aboard the Titanic.
Introduction
This project focuses mainly on how passenger survival rates varied with sex, age, socio-economic status and a few other factors. I mostly used the pandas library, with a few SQL techniques mixed in. I have also included various coding practices I came across along the way.
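To give a feel for the kind of question asked here, below is a minimal sketch of a survival rate by group in pandas. The rows are toy values invented purely for illustration, not taken from the Titanic data, and the names `toy` and `rate_by_sex` are placeholders of my own. With the real dataset, the DataFrame would typically come from `pd.read_csv` on whatever file holds the data.

```python
import pandas as pd

# Toy rows invented for illustration only; NOT real Titanic records.
toy = pd.DataFrame({
    "Survived": [1, 0, 1, 0, 1, 0],
    "Sex": ["female", "male", "female", "male", "male", "female"],
})

# Survived is coded 0/1, so the mean within each group is that group's
# survival rate. This is roughly the pandas counterpart of the SQL query
#   SELECT Sex, AVG(Survived) FROM passengers GROUP BY Sex;
rate_by_sex = toy.groupby("Sex")["Survived"].mean()

print(rate_by_sex)
```

The same pattern should carry over to other columns, such as grouping by `Pclass` instead of `Sex`.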
Data Wrangling
Understanding the variables in the dataset. What does each column represent?
- PassengerId = the unique number that identifies each passenger
- Survived = “1” if the passenger survived, “0” otherwise
- Pclass = Passenger class (1 = 1st class, 2 = 2nd, 3 = 3rd)
- Name = Name of passenger
- Sex = Sex of Passenger
- Age = Age of Passenger
- SibSp = Number of siblings/spouses the passenger had aboard