This course runs for 5 days.
The class will run daily from 10 AM ET to 6 PM ET.
Class Location: Virtual Live Classroom (live, instructor-led).
This hands-on Data Engineering Bootcamp teaches attendees the foundations of data engineering using Python and Spark SQL. Students learn how to build production-ready data-driven solutions and gain a comprehensive understanding of data engineering.
Skills Gained
Data Availability and Consistency
A/B Testing Data Engineering Tasks Project
Learning the Databricks Community Cloud Lab Environment
Python Variables
Dates and Times
The if, for, and try Constructs
Dictionaries
Sets, Tuples
Functions, Functional Programming
Understanding NumPy and pandas
PySpark
Audience
This Data Engineering Bootcamp is intended for data engineers.
Outline
Big Data Concepts and Systems Overview for Data Engineers
Defining Data Engineering
Data Processing Phases
Python 3 Introduction
Python Variables and Types
Control Statements and Data Collections
Functions and Modules
File I/O and Useful Modules
Practical Introduction to NumPy
Practical Introduction to pandas
Data Grouping and Aggregation with pandas
Repairing and Normalizing Data
Data Visualization in Python
Python as a Cloud Scripting Language
Introduction to Apache Spark
The Spark Shell
Spark RDDs
Parallel Data Processing with Spark
Introduction to Spark SQL
Lab Exercises
Prerequisites
Some working experience in any programming language is required; students will be introduced to programming in Python during the course. Attendees should also have a basic understanding of SQL and data processing concepts, including data grouping and aggregation.