Module 6 Unit 2 - Getting Started with Data Analysis

Unit Learning Objectives

By the end of this unit, you will be able to:

  • Identify the steps involved in data analysis

The Data Analysis Process

Many of the basic steps of data analysis overlap with the computational thinking process. Generally speaking, data analysis involves the following steps, which can be iterated and repeated as needed:


Data Analysis Process

Step 1
  Description: Identifying why you need or want to do data analysis.
  Details: Start by asking/writing down questions or identifying needs. For example: What is the problem you want to study? What are your needs? What is your hypothesis?

Step 2
  Description: Finding a suitable data source to ask the questions against.
  Details: You may already be collecting data that can be used to help address your questions or needs, or you can leverage commercially or publicly (Open Data) available data.

Step 3
  Description: Cleaning (or scrubbing) the data. This is often referred to as the Extract-Transform-Load (ETL) portion of the data analysis process.
  Details: The cleaning process involves things such as importing, tidying and standardizing the data so that we end up with nicely formatted data (e.g. consistent date-time format) that can be used easily for further analysis.

Step 4
  Description: Exploring the Data.
  Details: This involves visualizing or graphing the data to look for trends and patterns. This also involves the development of statistical models to examine these relationships.

Step 5
  Description: Interpreting the Results.
  Details: Critically examining the results of the data exploration for solutions to the questions being asked (Step 1). This is often written in a non-
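
As a rough illustration of Steps 3 and 4, the sketch below cleans a tiny, made-up set of temperature readings and then summarizes them. The records, the two date formats, and the helper names (parse_date, clean, explore) are all assumptions invented for this example; they are not part of the unit's own materials, and real data will usually need more careful handling.

```python
from datetime import datetime
from statistics import mean

# Hypothetical raw records: mixed date formats and one missing reading.
raw_records = [
    {"date": "2021-03-01", "temperature_c": "4.5"},
    {"date": "02/03/2021", "temperature_c": "6.0"},  # day/month/year
    {"date": "2021-03-03", "temperature_c": ""},     # missing value
    {"date": "04/03/2021", "temperature_c": "7.5"},
]

# Assumed set of date formats present in the raw data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Try each known format; return a date, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def clean(records):
    """Step 3: standardize dates to ISO format and drop unusable rows."""
    cleaned = []
    for row in records:
        day = parse_date(row["date"])
        reading = row["temperature_c"].strip()
        if day is None or reading == "":
            continue  # skip rows with an unknown date or no reading
        cleaned.append({"date": day.isoformat(), "temperature_c": float(reading)})
    return cleaned


def explore(cleaned):
    """Step 4: compute simple summary statistics to look for patterns."""
    temps = [row["temperature_c"] for row in cleaned]
    return {"count": len(temps), "mean": mean(temps), "min": min(temps), "max": max(temps)}


cleaned_records = clean(raw_records)
summary = explore(cleaned_records)
print(cleaned_records)
print(summary)
```

Dropping incomplete rows is only one possible cleaning choice; depending on the question asked in Step 1, it may be more appropriate to fill in or flag missing values instead.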


Now that we understand the process of analyzing and representing data, let’s put it into action using open data sources and examples.
