KT-0661 Data Science at Scale using Spark and Hadoop Training
Knowledge Transfer is a Microsoft Certified Silver Learning Partner

Data Science at Scale using Spark and Hadoop


Description: 

Data scientists build information platforms to provide deep insight and answer previously unimaginable questions. Spark and Hadoop are transforming how data scientists work by allowing interactive and iterative data analysis at scale. Learn how Spark and Hadoop enable data scientists to help companies reduce costs, increase profits, improve products, retain customers, and identify new opportunities. Cloudera University’s three-day course helps participants understand what data scientists do, the problems they solve, and the tools and techniques they use. Through in-class simulations, participants apply data science methods to real-world challenges in different industries and, ultimately, prepare for data scientist roles in the field.

Skills Gained

Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem and develop concrete skills such as:

  • How to identify potential business use cases where data science can provide impactful results
  • How to obtain, clean, and combine disparate data sources to create a coherent picture for analysis
  • What statistical methods to leverage for data exploration that will provide critical insight into your data
  • Where and when to leverage Hadoop streaming and Apache Spark for data science pipelines
  • What machine learning technique to use for a particular data science project
  • How to implement and manage recommenders using Spark’s MLlib, and how to set up and evaluate data experiments
  • What pitfalls to watch for when deploying new analytics projects to production at scale

 

Who Can Benefit

This course is suitable for developers, data analysts, and statisticians with basic knowledge of Apache Hadoop (HDFS, MapReduce, Hadoop Streaming, and Apache Hive), as well as experience working in Linux environments.

 
     
Prerequisite: 

None

 
   
     
Duration: 
3 Days  
     
     


 

Course Schedule
  Start Date    City    Price
  8/22/2017             $2595
  9/19/2017             $2595
  10/17/2017            $2595

To Inquire About Future Classes

Request a class date if one is not scheduled.