Our team has collected several posts on learning data science. These posts cover courses, books, and YouTube videos. We hope they help you on your journey.
25 OF THE BEST DATA SCIENCE COURSES ONLINE
Bootcamps and Specializations
1. Introduction to Probability and Data
This course introduces you to sampling and exploring data, as well as basic probability theory and Bayes’ rule. You will examine various types of sampling methods, and discuss how such methods can impact the scope of inference. A variety of exploratory data analysis techniques will be covered, including numeric summary statistics and basic data visualization. You will be guided through installing and using R and RStudio (free statistical software), and will use this software for lab exercises and a final project. The concepts and techniques in this course will serve as building blocks for the inference and modeling courses in the Specialization.
Take The Course
2. Full Statistics Courses
In this Specialization, you will learn to analyze and visualize data in R and create reproducible data analysis reports, demonstrate a conceptual understanding of the unified nature of statistical inference, perform frequentist and Bayesian statistical inference and modeling to understand natural phenomena and make data-based decisions, communicate statistical results correctly, effectively, and in context without relying on statistical jargon, critique data-based claims and evaluate data-based decisions, and wrangle and visualize data with R packages for data analysis.
Take The Courses
3. The Data Scientist’s Toolbox
In this course you will learn how to program in R and how to use R for effective data analysis. You will learn how to install and configure software necessary for a statistical programming environment and describe generic programming language concepts as they are implemented in a high-level statistical language. The course covers practical issues in statistical computing, including programming in R, reading data into R, accessing R packages, writing R functions, debugging, profiling R code, and organizing and commenting R code. Topics in statistical data analysis will provide working examples.
Take The Courses
Read More Here
LEARNING DATA SCIENCE: OUR FAVORITE DATA SCIENCE BOOKS
In data science, there are many topics to cover, so we wanted to focus on several specific ones. This post will cover books on Python, R programming, big data, SQL, and some generally good reads for data scientists.
Data Science Books
As a data scientist, you have a very important role. Your goal is to provide your company with insights into improving the company's bottom or top line. The problem is, we can make data say anything we want. It can be very easy to manipulate data to prove that our feature was effective, and it can be tempting if the company incentivizes that type of behavior.
Thus, a great general read for data scientists (and really anyone in our modern world) is Naked Statistics.
This is kind of like the much older book How To Lie With Statistics, which you can read for free.
We do prefer Naked Statistics because it is a little more modern and covers much more complex statistical debauchery than its much older counterpart. It just goes to show you that numbers are at your whim and you have a lot of responsibility to make sure your numbers are right. If something seems amiss with your data, it probably is. Rather than reporting it out right away, think about how you might unknowingly be misrepresenting the facts.
Another similar book is Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are.
LEARNING DATA SCIENCE: OUR FAVORITE RESOURCES FROM FREE TO NOT
Data science has many facets: statistics, data cleansing, programming, system design, and really almost anything else data related, depending on how large the company is.
This post will discuss our favorite resources for these topics. Now, most of these courses and books are primers for topics like statistics, Python, and data science in general. They will really only provide the base knowledge. At the end of the day, real practical experience is one of the few things that will really train your data science knowledge. You should learn as much as you can from these resources, then apply for as many internships and entry-level positions as possible and study for interviews.
You will learn much more and gain more than just technical knowledge. You will also gain a lot of business experience.
Free Statistics Courses
Let’s start with learning or reviewing basic statistical concepts. Many of you have probably taken a statistics course or two in college. But you might not remember everything clearly, so it’s a good idea to review from the beginning.
It can be tempting to start by taking on complex statistical concepts and models. But most algorithms and models require some sort of accuracy and hypothesis testing. This means you actually need to understand concepts like p-values vs t-values, z-statistics vs t-statistics, ROC vs AUC, random variables, etc.
These all seem like basic concepts, and maybe you kind of remember these words. However, we find they often get forgotten as many of us focus more on learning how to implement models in Python and R than on basic statistics. The two skills do not necessarily rely on each other, so you can start to assume you understand what the p-value means when you run models in either language without fully grasping its importance.
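To make the p-value less abstract, here is a minimal sketch of a one-sample z-test using only the Python standard library. The function name `one_sample_z_test` and the page load numbers are made up for illustration, and the sketch assumes the population standard deviation is known (when it is estimated from a small sample, a t-statistic is the usual choice instead).

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, population_mean, population_sd):
    """Return (z_statistic, two_sided_p_value) for a one-sample z-test.

    Assumes the population standard deviation is known.
    """
    n = len(sample)
    standard_error = population_sd / sqrt(n)
    z = (mean(sample) - population_mean) / standard_error
    # Two-sided p-value: the probability of a result at least this extreme
    # if the null hypothesis (true mean == population_mean) were true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: page load times in seconds after a change.
load_times = [2.9, 3.1, 2.7, 3.0, 2.8, 2.6, 2.9, 3.0, 2.7]
z, p = one_sample_z_test(load_times, population_mean=3.0, population_sd=0.3)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value says the observed data would be unusual if the null hypothesis were true. It does not say how likely the hypothesis itself is, which is the part that tends to get forgotten.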
1. Khan Academy
This is why we recommend at least going back and walking through the Khan Academy statistics section. They cover concepts like hypothesis testing, t-statistics vs z-statistics, confidence intervals, etc.
Khan is always a good place to start because the videos are a great combination of visual and audio examples.
Personally, there aren’t too many books we like when it comes to pure statistics. In the R programming section of this resource list, we will reference our favorite R + Statistics book.
2. Duke University On Coursera
For a full course that is free, you can try Duke University's statistics course. This is actually several courses that cover multiple types of statistics, like classical vs Bayesian. These are two different methods that are worth looking into.
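As a rough illustration of how the two approaches differ (not taken from the course itself), the sketch below estimates a rate both ways. The function names and the sign-up numbers are hypothetical, and the Bayesian version assumes a uniform Beta(1, 1) prior.

```python
def frequentist_estimate(successes, trials):
    """Classical point estimate: the observed proportion."""
    return successes / trials


def bayesian_posterior_mean(successes, trials, prior_alpha=1.0, prior_beta=1.0):
    """Posterior mean of a rate under a Beta prior (Beta(1, 1) is uniform).

    With a Beta(a, b) prior and binomial data, the posterior is
    Beta(a + successes, b + failures).
    """
    alpha = prior_alpha + successes
    beta = prior_beta + (trials - successes)
    return alpha / (alpha + beta)


# Hypothetical data: 3 sign-ups out of 10 visitors.
print(frequentist_estimate(3, 10))
print(bayesian_posterior_mean(3, 10))
```

With little data the prior pulls the Bayesian estimate toward 0.5, and as the number of trials grows the two estimates converge.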
Python Videos, Books And Courses
Python is an interesting topic because the language has so many plausible sub-sections. For instance, when we prepare for interviews, we always like to clarify which type of Python questions we will be dealing with. Will the questions focus on operational concepts, analytics, optimization, algorithms and data structures, or possibly data science algorithms? All of these are different topics with different styles of interview questions. Getting a question on how to traverse a binary tree is very different from having to implement a decision tree algorithm.
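For a sense of what the algorithms and data structures style of question looks like, here is a minimal sketch of an in-order binary tree traversal. The `Node` class and the sample tree are our own illustration, not taken from any particular interview.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def in_order(node: Optional[Node]) -> Iterator[int]:
    """Yield values from the left subtree, then the node, then the right subtree."""
    if node is None:
        return
    yield from in_order(node.left)
    yield node.value
    yield from in_order(node.right)


# A small hand-built binary search tree:
#       4
#      / \
#     2   6
#    / \
#   1   3
root = Node(4, Node(2, Node(1), Node(3)), Node(6))
print(list(in_order(root)))  # [1, 2, 3, 4, 6]
```

A decision tree question, by contrast, is mostly about choosing splits from data, so the tree structure is only a small part of the answer.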
As a data scientist, you will typically benefit from the analytical and operational aspects of Python. The operational portion will give you the ability to automate the boring stuff (as the cliche book is titled).
This book is great for really any data-focused person. Data scientists, business analysts, business intelligence engineers, and database developers can all benefit from automation. Now, you don’t need to use Python: if you’re in a Windows environment there is PowerShell, and Linux has Bash. Learning some form of scripting language helps improve your workflow and design thinking.
OUR FAVORITE PYTHON BOOKS, COURSES AND YOUTUBE VIDEOS
Python is a common language that is used by both data engineers and data scientists. This is because it can automate the operational work that data engineers need to do and has the algorithms, analytics, and data visualization libraries required by data scientists.
In both roles, the need to manage, automate, and analyze data is made easier by only a few lines of code. So much so that one of the books we have read and seen in many data-focused practitioners' libraries is Automate The Boring Stuff With Python.
The book covers python basics and some simple automation tips. This is especially good for business analysts who work heavily in Excel.
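As an example of the kind of small automation this refers to, here is a minimal sketch that stacks a folder of CSV exports into one file using only the standard library. It is our own illustration rather than code from the book. The function name `combine_csv_files` is hypothetical, and the sketch assumes every file shares the same header row.

```python
import csv
from pathlib import Path


def combine_csv_files(folder, output_path):
    """Stack every CSV in `folder` into one file and return the data row count.

    Assumes all files share the same header row; the header is written once.
    """
    output_path = Path(output_path)
    rows_written = 0
    header = None
    with output_path.open("w", newline="") as out_file:
        writer = csv.writer(out_file)
        for csv_path in sorted(Path(folder).glob("*.csv")):
            if csv_path.resolve() == output_path.resolve():
                continue  # skip the output file if it lives in the same folder
            with csv_path.open(newline="") as in_file:
                reader = csv.reader(in_file)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Calling `combine_csv_files("reports", "reports/all_months.csv")` (both paths are placeholders) would replace the copy and paste routine of merging monthly exports by hand.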
There are also books by O’Reilly that give a great overview of the basics.
Read More Here
Are You Interested In Learning About Data Science Or Tech?
Learning Data Science: Our Favorite Data Science Books
What Is Data Science Really As Told By An Ex-FAANG Data Scientist
How Algorithms Can Become Unethical and Biased
How To Load Multiple Files With SQL
How To Develop Robust Algorithms
Dynamically Bulk Inserting CSV Data Into A SQL Server
4 Must Have Skills For Data Scientists
SQL Best Practices — Designing An ETL Video
We are a team of data scientists and network engineers who want to help your functional teams reach their full potential!