Thursday, August 6, 2015

Looking for a Job as a Data Scientist? (Checklist included)

Interested in landing a job as a data scientist? You're in the right place: "data scientist" has been called the sexiest job of the 21st century in IT (1). Here is a brief description of the data scientist position, along with some useful tips and links to sources you should study to build up your data skills. Don't worry, you don't need to be a genius in calculus. There are many ways to develop the specific data skills that match the job you want. Let's start.

Who is a data scientist?

A data scientist is a person who uses data to analyze a situation and to answer questions about a specific area. In his book about Nikola Tesla (2), Sean Patrick writes that the truth about intelligence and success in life is:

"You have to be smart enough to fulfill the intellectual requirements for success."

The well-known data science Venn diagram describes different skills that intersect in the middle to form 'data science'. The 'Hacking Skills' part involves computer programming to access and analyze data, but also the ability to go out and find answers, because most of the questions aren't covered in textbooks. It also means being able to write code from scratch to solve problems. Admittedly, you also need some math and statistics skills. Substantive expertise is essentially domain knowledge: knowledge of the specific facts and relationships in a certain subject, not just of a technical process. It would let you use a background in biology, for example, to apply your skills to finding diseases in DNA sequences. So this is what you need to become a data scientist. Let's look at the more practical part.

What do data scientists do?

According to Udacity's blog, there are four types of data science jobs.

  • A data scientist is a data analyst who lives in San Francisco
    • you pull data out of MySQL databases, master Excel pivot tables and produce basic data visualizations (bar and line charts)
    • you analyze the results of an A/B test (see an article about A/B testing)
    • and probably take the lead on the company's Google Analytics account
  • "Data Engineer" and "Data Scientist" jobs
    • positions at big companies that have a lot of traffic and an increasingly large amount of data, and need someone to set up a lot of data infrastructure
    • usually senior positions
  • We are data, data is us
    • companies for whom their data is their product
    • ideal for someone with formal math, statistics or physics background
    • more academic path
  • Non-data companies that are data-driven
    • such companies already have a team and are looking for a person to fill a specific niche where they feel their team is lacking, such as data visualization or machine learning
    • requires more general knowledge and familiarity with tools designed for "big data", e.g. Hive, Pig ...
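
To make the A/B test item from the first job type a bit more concrete, here is a minimal sketch of one common way to analyze such a test: a two-proportion z-test on conversion rates. This is only an illustration, not the method from Udacity's blog. The function name and the visitor and conversion counts are made up for this example.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / conv_b: number of conversions in variants A and B
    n_a / n_b: number of visitors in variants A and B
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    # Standard error of the difference between the two rates.
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers, invented for illustration only.
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests that the difference between the two variants is unlikely to be pure chance. In a real job you would more likely reach for a statistics library than write this by hand, but the idea is the same.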

What skills should you develop, and how?

Here you can download Udacity's Checklist for your first data analyst job, enjoy =)
Edit (April 2016): the PDF has been removed, so I have added another link for downloading the checklist. If you still cannot find the checklist, you can probably find it on Google ;)

Promised links...

- free IDE for R
- documentation
- version control
shell / bash script 
R Programming (link)
Programming for Everybody in Python (link)
Introduction to Interactive Programming in Python 1 (link)
Introduction to Interactive Programming in Python 2 (link)
Programming Languages (link)
Machine Learning MIT (link)
Many more courses about machine learning online (link)