Module 6 Unit 2 - Getting Started with Data Analysis
Unit Learning Objectives
By the end of this unit, you will be able to:
Identify the steps involved in data analysis
The Data Analysis Process
Many of the basic steps of data analysis overlap with the computational thinking process. Generally speaking, data analysis involves the following steps, which can be iterated and repeated as needed:
Data Analysis Process
Step | Description | Details |
---|---|---|
1 | Identifying why you need or want to do data analysis. | Start by asking/writing down questions or identifying needs. For example: What is the problem you want to study? What are your needs? What is your hypothesis? |
2 | Finding a suitable data source to ask the questions against. | You may already be collecting data that can be used to help address your questions or needs, or you can leverage commercially or publicly (Open Data) available data. |
3 | Cleaning (or scrubbing) the data. This is often referred to as the Extract-Transform-Load (ETL) portion of the data analysis process. | The cleaning process involves things such as importing, tidying and standardizing the data so that we end up with nicely formatted data (e.g. consistent date-time format) that can be used easily for further analysis. |
4 | Exploring the Data. | This involves visualizing or graphing the data to look for trends and patterns. This also involves the development of statistical models to examine these relationships. |
5 | Interpreting the Results. | Critically examining the results of the data exploration for solutions to the questions being asked (Step 1). This is often written in a non- |
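
To make Steps 3 and 4 a little more concrete, here is a minimal sketch in Python that uses only the standard library. The records, the field names (`date`, `temp_c`), the list of date formats, and the helper functions `parse_date` and `clean` are all made up for illustration; they do not come from a real data source. The sketch standardizes mixed date formats into one consistent format, drops incomplete rows, and then computes a simple summary as a first look at the data.

```python
from datetime import datetime
from statistics import mean

# Hypothetical raw records (invented for illustration only).
# Dates arrive in mixed formats and one reading is missing.
raw_rows = [
    {"date": "2021-03-01", "temp_c": "4.5"},
    {"date": "02/03/2021", "temp_c": "6.0"},    # assumed to be day/month/year
    {"date": "March 3, 2021", "temp_c": ""},    # missing reading
    {"date": "2021-03-04", "temp_c": "7.5"},
]

# Assumed set of date formats present in the raw data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y")


def parse_date(text):
    """Try each known format; return a date, or None if nothing matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def clean(rows):
    """Step 3: standardize dates to ISO 8601 and drop incomplete rows."""
    cleaned = []
    for row in rows:
        day = parse_date(row["date"])
        if day is None or row["temp_c"].strip() == "":
            continue  # skip rows we cannot interpret or that lack a value
        cleaned.append({"date": day.isoformat(), "temp_c": float(row["temp_c"])})
    return cleaned


cleaned_rows = clean(raw_rows)

# Step 4: a simple numeric summary as a starting point for exploration.
temps = [row["temp_c"] for row in cleaned_rows]
summary = {
    "count": len(temps),
    "min": min(temps),
    "max": max(temps),
    "mean": mean(temps),
}

print(cleaned_rows)
print(summary)
```

Dropping incomplete rows is only one possible cleaning choice. Depending on the question from Step 1, it may be more appropriate to keep those rows and flag them instead.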
Now that we understand the process of analyzing and representing data, let's put it into action using open data sources and examples.