About the Program
Data Engineering Tools encompass a suite of technologies and frameworks essential for the effective management, processing, and analysis of large datasets. Data engineers play a pivotal role in this domain: they design and maintain complex data pipelines and ensure that critical information remains reliable and accessible. Their expertise underpins data-driven decision-making and supports a wide range of data-intensive applications, from business intelligence platforms to machine learning systems. This specialized skill set is in high demand in today's data-centric landscape, making data engineers invaluable assets in the IT industry.
Courses
Course | Credits | Semester
- Hadoop Eco System with HDFS and MapReduce | 3 | III
- Data Processing with Hive and Pig Latin | 3 | IV
- Complete Python for Data Engineers | 3 | V
- PySpark for Data Engineers | 3 | V
- Cloud Data Engineering with AWS, GCP, and Azure | 3 | VI
- Real-Time Data Engineering with Streaming Tools | 3 | VII
- Project Work - The Data Services Capstone: Exploring Big Data Tools | 3 | VIII

Course | Credits | Semester
- IT World Essentials: Your Digital Entrypoint | 3 | I
- Critical Thinking, Design Thinking, Leadership and Teamwork | 3 | II
- Project Work - The Data Services Capstone: Exploring Big Data Tools | 3 | VIII

Course | Credits | Semester
- Critical Thinking, Design Thinking, Leadership and Teamwork | 3 | II
- Career Readiness in Digital Era | 3 | VI
Mode of Delivery
- Self-paced learning – 10 hours
- VILT sessions – 28 hours
- Project work – 7 hours
- Face-to-face instructor-led sessions / VILT sessions (including project work) – 45 hours
- Self-paced learning + Expert session – 30 hours
- Project work – 15 hours
Job Roles
- Data Engineer
- Data Integration Engineer
- Big Data Analyst
Software Tools
- Python
- Scala
- Presto
- Hive
- Pig
- Flink
- Zeppelin
- Oozie
- Kafka
- PySpark
- Databricks
- MySQL
- Cassandra
- MongoDB
- Hadoop
- Airflow
- Spark
- Ambari
- ZooKeeper
- Flume
- Sqoop
Skills
- Designing efficient and scalable data structures for modeling and analysis.
- Working with NoSQL databases like MongoDB and Cassandra for unstructured data.
- Implementing centralized data-warehousing solutions for storing large datasets.
- Utilizing Hadoop and Spark for big data processing and distributed computing.
- Integrating data from multiple sources into a unified system.
- Managing large datasets effectively with the Hadoop and Spark ecosystems for performance.
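The Hadoop and MapReduce skills listed above rest on a simple map–shuffle–reduce pattern. As a minimal illustration only (a pure-Python sketch of the idea, not Hadoop itself; all function names and sample data here are invented), a word count might look like:

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (key, 1) pair for every word in the input.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all values by key, as Hadoop does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate the grouped values per key.
    return {key: sum(values) for key, values in groups.items()}

counts = reduce_phase(shuffle(map_phase(["big data", "Big Data tools"])))
print(counts)  # {'big': 2, 'data': 2, 'tools': 1}
```

In a real cluster, the map and reduce phases run in parallel across many machines and the shuffle moves data over the network; the program's logic, however, stays this small.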
