Overview

This module covers how to read from and write to external files that follow common data formats, such as CSV and JSON. CSV and JSON files play an important role in data science because many datasets are distributed in these formats. We will learn how CSV and JSON files are structured and organized. We will learn how to read in external data files, parse the data, and load it into Pandas DataFrames, which are common first steps in data science projects that use big data. We will also learn additional DataFrame methods that help bring data together from different sources, as well as two additional Seaborn visualizations: strip plots and box plots.

Learning Objectives

By the end of the course, you will be able to:

CO1: Identify industry-standard approaches to organization, storage, manipulation, analysis, and visualization of big data
CO2: Wrangle raw data from different sources and formats
CO3: Write data science programs in Python through an object-oriented approach using common data science packages
CO4: Analyze data and results through statistical and visual analysis
CO5: Investigate data science problems involving big data

By the end of this module, you will be able to:

MO 4.1: Process text-based files in Python (supports CO1, CO2, CO3, CO5)
MO 4.2: Load CSV files into a Pandas DataFrame (supports CO2, CO3, CO5)
MO 4.3: Manipulate data in a Pandas DataFrame (supports CO2, CO3, CO5)
MO 4.4: Visualize data in a Pandas DataFrame (supports CO3, CO4, CO5)
MO 4.5: Use serialized JSON objects in Python programming (supports CO1, CO2, CO3)
MO 4.6: Handle exceptions in Python programming (supports CO3)
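
As a preview of the kind of code these objectives lead to, here is a minimal sketch that parses CSV text, deserializes a JSON object, and handles a parsing exception. It uses only the Python standard library so that it runs without extra packages. The sample data and the helper names `read_csv_rows` and `parse_json` are hypothetical, chosen for illustration, and are not taken from the course materials.

```python
import csv
import io
import json

# Hypothetical sample data standing in for external files.
CSV_TEXT = "name,score\nAda,91\nLin,78\n"
JSON_TEXT = '{"course": "demo", "scores": [91, 78]}'


def read_csv_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_json(text):
    """Deserialize JSON text; return None when the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Malformed input is reported as None instead of crashing the program.
        return None


rows = read_csv_rows(CSV_TEXT)         # CSV values are read as strings
record = parse_json(JSON_TEXT)         # JSON numbers become Python ints
bad = parse_json("{not valid json")    # triggers the exception handler
```

Note that the `csv` module returns every field as a string, while `json.loads` converts JSON numbers to Python numbers. When pandas is installed, `pandas.read_csv` performs the CSV step in a single call and returns a DataFrame.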