Beginning Data Science
Data science brings together statistics, data analysis, machine learning, and related methods to understand and analyze real-world phenomena through data. It draws on techniques and theories from mathematics, statistics, information science, and computer science.
In this track, we'll be exploring the tools and techniques to get you started on your journey.
You'll pick up the basic building blocks of how to analyze and communicate data findings.
The first course you'll take is Data Analysis Basics, where you'll establish some language and definitions as well as how to think about data. Next, we'll cover some Python topics, as it's one of the languages data scientists use most. You'll establish a firm foundation in Python lists, dictionaries, sequences, tuples, and more.
Next, we'll cover how to install and use Anaconda and Jupyter Notebooks, two useful tools for your Python work. You'll also start creating charts with matplotlib, an industry-standard Python data visualization library. Matplotlib provides a way to generate a wide variety of plots and charts in a few lines of Python code.
You'll get a basic introduction to NumPy, the fundamental package for scientific computing, and then pandas, which provides fast, flexible, and expressive data structures for your Python data work.
We'll then cover some best practices for cleaning and preparing data, data visualization, and an introduction to scraping data from the web. To wrap up this track, you'll take our Introduction to Big Data course and then our Machine Learning Basics course.
Ready to take the next step in your Data Science career? Let's get started!
An entry-level salary for the technologies covered in this track is about $95,000 / yr on average.
Some companies that use these technologies regularly include Google, Microsoft, Apple, and Airbnb.
Data Analysis Basics
Learn how to make better decisions with data in this course on data analysis. We'll start by looking at what data analysis is, and then we'll see how we can use data analysis to create better outcomes.
Python Basics
Learn the building blocks of the wonderful general purpose programming language Python.
Introducing Lists
Lists are a powerful data type that allows you to store multiple ordered values in a single container. You are gonna love them.
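As a minimal sketch of what "multiple ordered values in a single container" looks like in practice, the snippet below builds a list and applies a few built-in operations. The variable names and sample values are made up for illustration.

```python
# Hypothetical sample data: daily temperatures in degrees Celsius.
temperatures = [21.5, 19.0, 23.2]

# Lists keep insertion order and can grow after creation.
temperatures.append(20.1)

# Indexing starts at 0; negative indexes count from the end.
first_reading = temperatures[0]
last_reading = temperatures[-1]

# Slicing returns a new list holding the selected items.
first_two = temperatures[:2]

print(first_reading, last_reading, first_two, len(temperatures))
```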
Introducing Tuples
Learn about a Python data structure that's similar to lists but with one key difference!
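The description above does not name the difference. The one usually meant is that tuples are immutable while lists are not, and the sketch below assumes that reading. The names and values are illustrative only.

```python
# Hypothetical example values: the same point as a list and as a tuple.
point_list = [3, 4]
point_tuple = (3, 4)

# A list can be changed in place.
point_list[0] = 10

# A tuple cannot: item assignment raises TypeError.
try:
    point_tuple[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(point_list, point_tuple, tuple_changed)
```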
Functions, Packing, and Unpacking
Learn the ins and outs of Python functions, how to send and receive values to functions, and all about Python packing and unpacking.
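To give a flavor of packing and unpacking, here is a small sketch. The function `summarize` and its `precision` option are hypothetical names invented for this example, not part of the course material.

```python
def summarize(*values, **options):
    """Pack positional arguments into a tuple and keyword arguments into a dict."""
    precision = options.get("precision", 2)
    mean = sum(values) / len(values)
    # Returning several values packs them into a single tuple.
    return min(values), max(values), round(mean, precision)

# Unpack the returned tuple into three names.
lowest, highest, average = summarize(4, 8, 15, 16, 23, precision=1)

# Unpack an existing list into positional arguments with *.
readings = [1, 2, 3]
low, high, avg = summarize(*readings)

# Extended unpacking: the starred name collects the remaining items.
head, *rest = readings

print(lowest, highest, average, head, rest)
```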
Python Sequences
Discover several types of Python sequences, many ways of sequence iterations, and all of the common sequence operations.
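The sketch below illustrates a few sequence types, iteration helpers, and shared operations using only built-ins. The sample values are assumptions chosen for the example.

```python
# Hypothetical sample sequences of three different types.
letters = "abc"          # str
numbers = [10, 20, 30]   # list
span = range(3)          # range

# enumerate() yields (index, item) pairs.
indexed = list(enumerate(letters))

# zip() walks several sequences in parallel.
paired = list(zip(letters, numbers))

# reversed() iterates from the last item to the first.
backwards = list(reversed(numbers))

# Operations shared by sequences: membership, slicing, and length.
has_twenty = 20 in numbers
middle = letters[1:2]
total_items = len(letters) + len(numbers) + len(span)

print(indexed, paired, backwards, has_twenty, middle, total_items)
```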
Introducing Dictionaries
Another useful Python data structure is the dictionary. Learn how to write one and use one in your day-to-day Python code.
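For a rough idea of how a dictionary is written and used, consider this sketch. The record and its keys are hypothetical.

```python
# Hypothetical record: keys are field names, values are the data.
student = {"name": "Ada", "track": "Beginning Data Science"}

# Add or update an entry by assigning to a key.
student["courses_done"] = 3

# get() returns a default instead of raising KeyError for a missing key.
email = student.get("email", "unknown")

# items() yields (key, value) pairs in insertion order.
summary = [f"{key}={value}" for key, value in student.items()]

print(email, summary)
```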
Object-Oriented Python
Sometimes simple scripts with functions in them just aren't enough. Eventually you'll need logical models of your work, and that'll lead you to creating custom classes in Python. Object-oriented programming is a large topic. It provides us with some amazing tools, though, so it's one of the most beneficial things to learn about in Python. First, you'll learn how to build basic custom classes. Then, you'll expand them through inheritance. And for some extra power, you'll also learn how to take control of Python's built-in classes to make your own more powerful while doing less work. Finally, we'll put everything together into a fun game utility.
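The three steps named above (a basic custom class, inheritance, and extending a built-in class) might look roughly like the sketch below. The classes `Die`, `D20`, and `Hand` are hypothetical examples and are not taken from the course itself.

```python
import random


class Die:
    """A die with a configurable number of sides (hypothetical example class)."""

    def __init__(self, sides=6, value=None):
        if sides < 1:
            raise ValueError("A die needs at least one side")
        self.sides = sides
        # Roll randomly unless a fixed value is supplied.
        self.value = value if value is not None else random.randint(1, sides)

    def __int__(self):
        return self.value


class D20(Die):
    """Inheritance: reuse Die but fix the number of sides at 20."""

    def __init__(self, value=None):
        super().__init__(sides=20, value=value)


class Hand(list):
    """Subclass the built-in list to get append, len, and iteration for free."""

    @property
    def total(self):
        return sum(int(die) for die in self)


hand = Hand([D20(value=5), D20(value=17), Die(value=3)])
print(hand.total, len(hand))
```

Subclassing `list` here is a design choice made for brevity; wrapping a list in a plain class would work as well.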
1 min Instruction
Learning SQL
We recommend you learn SQL...
15 min Workshop
Introduction to Anaconda
Learn why you want to use Anaconda, and then learn how.
15 min Workshop
Jupyter Notebooks
The Jupyter project has an amazing tool for Python, Julia, R, and other languages. Learn how to install Jupyter Notebooks, use them, and install kernels for other languages.
Introduction to NumPy
NumPy is short for Numerical Python. It is the fundamental package for scientific computing. You will see it at play just about everywhere Python needs to deal with data. This course gives a gentle introduction to this powerful library.
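As a small taste of NumPy, the sketch below creates an array and applies element-wise arithmetic, a boolean mask, and aggregations. It assumes NumPy is installed; the sample numbers are made up.

```python
import numpy as np

# Hypothetical sample data: study minutes for four days.
minutes = np.array([30, 45, 60, 15])

# Arithmetic applies element by element, with no explicit loop.
hours = minutes / 60

# Boolean masks select the items that satisfy a condition.
long_sessions = minutes[minutes >= 45]

# Aggregations summarize the whole array.
total_minutes = int(minutes.sum())
mean_minutes = float(minutes.mean())

print(hours, long_sessions, total_minutes, mean_minutes)
```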
Introduction to pandas
Pandas provides fast, flexible, and expressive data structures that have been designed to make working with relational or “labeled” data not only easy, but also intuitive. It’s the fundamental high-level building block for doing practical and real-world data analysis in Python.
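To make "labeled" data a bit more concrete, here is a minimal sketch that assumes pandas is installed. The course lengths come from this track's course list; the variable names are illustrative.

```python
import pandas as pd

# Course lengths in minutes, taken from this track's course list.
courses = pd.DataFrame(
    {
        "course": [
            "Introducing Tuples",
            "Introducing Dictionaries",
            "Machine Learning Basics",
        ],
        "minutes": [13, 36, 58],
    }
)

# Filter rows by a condition on a named column.
over_30 = courses[courses["minutes"] > 30]

# A labeled index lets you look values up by name instead of position.
by_course = courses.set_index("course")["minutes"]
tuples_minutes = int(by_course.loc["Introducing Tuples"])

print(over_30)
print(tuples_minutes)
```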
Preparing Data for Analysis
Learn how to clean and prep data for analysis using spreadsheet tools and Python's pandas library.
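A typical cleaning pass in pandas might look something like the sketch below. The raw table and the specific steps (dropping incomplete rows, normalizing text, converting types) are assumptions for illustration, not the course's own workflow.

```python
import pandas as pd

# Hypothetical messy input: stray whitespace, inconsistent case,
# missing values, and numbers stored as text.
raw = pd.DataFrame(
    {
        "city": [" Portland", "boise ", None, "Boise"],
        "sales": ["100", "250", "75", None],
    }
)

# Drop rows that have any missing value.
cleaned = raw.dropna().copy()

# Normalize text: trim whitespace and use title case.
cleaned["city"] = cleaned["city"].str.strip().str.title()

# Convert the numeric column from text to integers.
cleaned["sales"] = cleaned["sales"].astype(int)

print(cleaned)
```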
Introduction to Data Visualization with Matplotlib
Get started creating charts with matplotlib, an industry-standard Python data visualization library. Matplotlib provides a way to generate a wide variety of plots and charts in a few lines of Python code. It is an open-source project that can be integrated into Python scripts, Jupyter notebooks, web application servers, and multiple GUI toolkits. Whether you are exploring sample data available on the internet or your own business data, learning matplotlib is a great place to start your data visualization journey.
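The "few lines of Python code" claim can be illustrated with a short bar chart sketch. It assumes matplotlib is installed and selects the off-screen `Agg` backend so no display is needed. The course lengths come from this track's course list; writing to an in-memory buffer is a choice made for this example.

```python
import io

import matplotlib

matplotlib.use("Agg")  # Assumption: render off-screen so no display is required.
import matplotlib.pyplot as plt

# Course lengths in minutes, taken from this track's course list.
courses = ["Tuples", "Dictionaries", "Lists"]
minutes = [13, 36, 105]

fig, ax = plt.subplots()
bars = ax.bar(courses, minutes)
ax.set_xlabel("Course")
ax.set_ylabel("Length (minutes)")
ax.set_title("Course lengths")

# Save the chart as PNG data; pass a file name instead to write to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```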
1 min Instruction
More Visualization
Learn more data visualization libraries...
Scraping Data From the Web
Almost any information you want is available on the Internet. Web scraping is a key data mining tool for exploring web pages and collecting that information for a variety of reporting needs. The tools and techniques used in this course let you collect data that would not be easily accessible without automation.
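The core idea of scraping, pulling structured data out of HTML, can be sketched with the standard library alone. The course may well use other tools; the `LinkCollector` class and the sample page below are hypothetical, and the HTML is an inline string rather than a fetched page.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag (hypothetical helper class)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Stand-in for a downloaded page.
page = (
    '<html><body><a href="/courses">Courses</a>'
    "<p>No link here</p>"
    '<a href="/tracks">Tracks</a></body></html>'
)

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```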
33 min Workshop
Data from APIs
Use Python to gather data from an API and save it to a CSV file.
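The API-to-CSV flow might be sketched as follows. To stay runnable without network access, the example parses a hard-coded JSON string standing in for an API response and writes CSV to an in-memory buffer; the payload and field names are assumptions.

```python
import csv
import io
import json

# Stand-in for the body of an API response (hypothetical payload).
response_body = (
    '[{"id": 1, "title": "Python Basics"},'
    ' {"id": 2, "title": "Introducing Lists"}]'
)
records = json.loads(response_body)

# Write to an in-memory buffer. For a real file, use
# open("courses.csv", "w", newline="") instead (hypothetical file name).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "title"])
writer.writeheader()
writer.writerows(records)

csv_text = buffer.getvalue()
print(csv_text)
```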
Introduction to Big Data
Big data represents an entire ecosystem of data sets, tools, and applications. This course is intended to get you familiar with the concepts, problem spaces, and overall ecosystem of Big Data.
Machine Learning Basics
Machine learning encompasses many different ideas, programming languages, frameworks, and approaches to the subject, so the term "machine learning" is difficult to define in just a sentence or two. But essentially, machine learning is giving a computer the ability to write its own rules and learn about new things on its own. In this course, we'll explore some of the big ideas, and toward the end, we'll even write a little bit of code in Python that can make some intelligent predictions.
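One of the simplest ways a program can learn a rule from data is to fit a straight line and use it to predict. The sketch below does this with ordinary least squares in plain Python. It is not the course's own code; the function `fit_line` and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: hours studied and quiz scores.
hours = [1, 2, 3, 4]
scores = [52, 54, 56, 58]

# "Learn" the rule from the examples.
slope, intercept = fit_line(hours, scores)

# Predict the score for an unseen input.
predicted = slope * 5 + intercept
print(slope, intercept, predicted)
```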
Track Completion
This track includes:
- Data Analysis Basics 75 min
- Python Basics 3 hours
- Introducing Lists 105 min
- Introducing Tuples 13 min
- Functions, Packing, and Unpacking 65 min
- Python Sequences 65 min
- Introducing Dictionaries 36 min
- Object-Oriented Python 3 hours
- Learning SQL 1 min
- Introduction to Anaconda 15 min
- Jupyter Notebooks 15 min
- Introduction to NumPy 2 hours
- Introduction to pandas 3 hours
- Preparing Data for Analysis 79 min
- Introduction to Data Visualization with Matplotlib 75 min
- More Visualization 1 min
- Scraping Data From the Web 70 min
- Data from APIs 33 min
- Introduction to Big Data 51 min
- Machine Learning Basics 58 min