Industry-Abridged Joint Certification Programme


About Course

This foundational course introduces participants to the core principles of data engineering, focusing on Hadoop and Spark technologies. Students will grasp the fundamentals of distributed computing, Hadoop's role in handling large-scale data, and Spark's capabilities for efficient data processing. From data storage to advanced analytics, this course provides hands-on experience in building robust data pipelines. By the end, learners will possess the essential skills to design, implement, and optimize scalable data solutions using Hadoop and Spark in real-world scenarios.

Data Engineering Foundation with Hadoop and Spark

Pre-requisites

Basic programming skills, familiarity with data concepts, and an understanding of Linux commands.


Course duration, assessment, and expert sessions

20 hours of self-paced interactive learning, including a summative assessment and live expert interactions.


Key Topics:

  • Introduction to Hadoop
  • HDFS and MapReduce
  • Programming with Pig
  • Programming with Spark (a short illustrative sketch follows this list)
  • Relational Data Stores with Hadoop
  • Non-Relational Data Stores with Hadoop
  • Querying Data Interactively
  • Mid-Term Project
  • Managing a Cluster
  • Feeding Data to a Cluster
  • Analyzing Streams of Data
  • Designing Real-World Systems
  • Introduction to Apache Spark
  • RDDs
  • Spark Architecture and Components
  • Pair RDDs
  • Advanced Spark Topics
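
As a taste of the RDD-based programming covered in the Spark topics above, here is a minimal, illustrative PySpark sketch of a word count built from an RDD and a pair RDD. It is not course material; the application name and input path are hypothetical placeholders.

    from pyspark import SparkContext

    # Create a Spark context (the application name is a placeholder).
    sc = SparkContext(appName="WordCountSketch")

    # Load a text file into an RDD of lines (the HDFS path is a placeholder).
    lines = sc.textFile("hdfs:///data/sample.txt")

    # Build a pair RDD of (word, 1) tuples and sum the counts per word.
    counts = (lines.flatMap(lambda line: line.split())
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))

    # Bring a small sample back to the driver for inspection.
    for word, count in counts.take(10):
        print(word, count)

    sc.stop()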

Learning Outcomes:

At the end of the course, the student will be able to:

  • Demonstrate a proficient understanding of the Hadoop ecosystem for big data processing
  • Program with Pig and Spark for data analytics
  • Store and query data effectively in relational and non-relational stores
  • Manage and optimize Hadoop clusters for efficiency
  • Apply the concepts learned to design real-world data engineering systems
  • Work with advanced Spark topics, including Spark SQL and cluster deployment (a brief Spark SQL sketch follows)
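
The outcome on Spark SQL above can be illustrated with a minimal PySpark sketch that registers a DataFrame as a temporary view and queries it interactively; the file path, view name, and column names are hypothetical placeholders rather than course material.

    from pyspark.sql import SparkSession

    # Start a Spark session (the application name is a placeholder).
    spark = SparkSession.builder.appName("SparkSQLSketch").getOrCreate()

    # Read a CSV file into a DataFrame (path, header, and schema inference assumed).
    orders = spark.read.csv("hdfs:///data/orders.csv", header=True, inferSchema=True)

    # Expose the DataFrame to SQL as a temporary view and run an aggregate query.
    orders.createOrReplaceTempView("orders")
    top_customers = spark.sql("""
        SELECT customer_id, SUM(amount) AS total_spent
        FROM orders
        GROUP BY customer_id
        ORDER BY total_spent DESC
        LIMIT 10
    """)
    top_customers.show()

    spark.stop()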


Key Job Roles:

  • Big Data Engineer
  • Hadoop Developer
  • Spark Developer
  • Data Engineer
  • Data Analyst with Hadoop and Spark
  • Data Science Engineer
  • Cloud Data Engineer
  • Data Architect - Hadoop and Spark / Business Intelligence Developer
  • Analytics Engineer



