Big Data Engineering

Continuing Education Center

Course Overview:
- Big Data Engineering is critical for data-driven decision-making in industries such as finance, healthcare, and e-commerce. With the explosion of data, companies need experts who can build scalable pipelines that ensure efficient storage, processing, and analysis. Certified professionals can secure roles such as Data Engineer, Cloud Architect, or Analytics Specialist with competitive salaries.
Learning Outcomes:
- Big Data Tools: Students will gain expertise in technologies like HDFS, Spark, Hive, and Flink for both real-time and batch data processing.
- Data Processing Frameworks: Learners will work with tools like MapReduce, Elasticsearch, HBase, and Kafka for distributed data management.
- Big Data Strategy: The course prepares trainees to store, analyze, and manage large datasets, implementing effective big data strategies for organizations.
Prerequisites:
- Network Essentials.
Target Certification:
- Participants who pass the course exam will obtain a certificate from MIU.
- This level grants one certificate:
- Huawei HCIP Big Data (H13-723) certificate.
Who Should Attend:
- Graduates or students in Computer Science, Communications, Electronics, or Computer Engineering fields looking to enter the big data industry.
Estimated Time to Completion: 80 hours
Hours/day | No. of days/week | Total no. of days |
6 hours/day | 4 days/week | 14 days |
Contents:
Huawei HCIP Big Data (H13-723) (80hrs)
- Big Data Application Development Overall Guide
- Introduction to Big Data
- Mainstream big data technologies
- Big data scenario-based solution
- Big data application development
- Experience with big data processing frameworks
- Offline batch processing solution
- HDFS: Hadoop Distributed File System
- MapReduce: distributed offline batch processing, and YARN (Yet Another Resource Negotiator) for resource management
- ZooKeeper: distributed cluster coordination service
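As a brief taste of the MapReduce topic listed above, the sketch below shows a word count written in the Hadoop Streaming style, where the mapper and reducer read and write plain text on standard input/output. The file name, the map/reduce switch, and the local `sort` standing in for Hadoop's shuffle phase are illustrative assumptions, not course material.

```python
#!/usr/bin/env python3
"""Word-count sketch in the Hadoop Streaming style (illustrative only).

Run the same file as mapper or reducer, e.g.:
    ./wordcount.py map < input.txt | sort | ./wordcount.py reduce
(The local `sort` stands in for Hadoop's shuffle-and-sort phase.)
"""
import sys


def mapper(stream):
    # Emit one "<word>\t1" line per word.
    for line in stream:
        for word in line.strip().lower().split():
            print(f"{word}\t1")


def reducer(stream):
    # Input arrives grouped by key (already sorted), so counts can be summed run by run.
    current, total = None, 0
    for line in stream:
        word, count = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = word, 0
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")


if __name__ == "__main__":
    mapper(sys.stdin) if sys.argv[1] == "map" else reducer(sys.stdin)
```

On a real cluster the same two roles would be handed to the Hadoop Streaming jar as the mapper and reducer, with input and output paths on HDFS.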
- Non-relational databases
- HBase: distributed NoSQL database
- HBase Overview and Data Models
- HBase Architecture
- HBase Performance Tuning
- Common Shell Commands of HBase
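To illustrate the HBase data model and basic read/write paths covered above, here is a minimal sketch using the third-party happybase client against an HBase Thrift server; the host, table name, and 'info' column family are assumptions for illustration only.

```python
import happybase

# Connect to an HBase Thrift server (host and port are assumptions).
connection = happybase.Connection("hbase-thrift-host", port=9090)
table = connection.table("user_profiles")  # table and 'info' column family assumed to exist

# Write a row keyed by user id; every cell lives under a column family.
table.put(b"user-001", {b"info:name": b"Alice", b"info:city": b"Cairo"})

# Point read by row key, then a short prefix scan.
print(table.row(b"user-001"))
for key, data in table.scan(row_prefix=b"user-"):
    print(key, data)

connection.close()
```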
- Distributed search engine: Elasticsearch
- System Architecture
- Basic functions and concepts of Elasticsearch
- Key Features
- Concepts of Shard and Replica
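The indexing and search concepts above can be explored with a few calls to the official Python client; the sketch below assumes an 8.x-style client and a single local node, and the index name and documents are made up for illustration.

```python
from elasticsearch import Elasticsearch

# Single local node; the index name and documents are illustrative.
es = Elasticsearch("http://localhost:9200")

es.index(index="courses", id="1", document={"title": "Big Data Engineering", "hours": 80})
es.index(index="courses", id="2", document={"title": "Network Essentials", "hours": 40})
es.indices.refresh(index="courses")  # make the new documents searchable immediately

resp = es.search(index="courses", query={"match": {"title": "big data"}})
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"])
```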
- Compute Engines
- Spark2x: in-memory distributed computing engine
- Spark Overview
- Spark Data Structure
- Spark Principles and Architecture
- Spark SQL: offline analysis tool
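As a small preview of the Spark and Spark SQL topics above, this sketch builds a tiny DataFrame and runs the same aggregation through a SQL view and through the DataFrame API; the local[*] master and the sample data are assumptions for experimentation, not part of the courseware.

```python
from pyspark.sql import SparkSession, functions as F

# Local session for experimentation; on a cluster the master is set by the launcher.
spark = SparkSession.builder.appName("spark_sql_sketch").master("local[*]").getOrCreate()

# Tiny in-memory DataFrame standing in for a real dataset.
df = spark.createDataFrame(
    [("finance", 120.0), ("health", 80.5), ("finance", 45.0)],
    ["sector", "amount"],
)

# The same aggregation via a SQL view and via the DataFrame API.
df.createOrReplaceTempView("transactions")
spark.sql("SELECT sector, SUM(amount) AS total FROM transactions GROUP BY sector").show()
df.groupBy("sector").agg(F.sum("amount").alias("total")).show()

spark.stop()
```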
- Flink: stream processing and batch processing platform
- Flink Principles and Architecture
- Flink Time and Window
- Flink Watermark
- Flink Fault Tolerance Mechanism
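The Flink time, window, and watermark topics above boil down to the idea sketched below: group out-of-order events into event-time windows and emit a window only once the watermark (the highest event time seen minus an allowed lateness) has passed its end. This is a plain-Python conceptual sketch, not the Flink API; the window size, lateness, and sample events are made up.

```python
from collections import defaultdict

WINDOW_SIZE = 10          # tumbling event-time windows of 10 time units
MAX_OUT_OF_ORDERNESS = 3  # watermark trails the highest event time seen by 3 units

# (event_time, key, value) records arriving slightly out of order.
events = [(1, "a", 1), (4, "b", 2), (3, "a", 1), (12, "a", 5), (9, "b", 1), (15, "b", 2)]

windows = defaultdict(int)      # (window_start, key) -> running sum
max_event_time = float("-inf")
emitted = set()

for event_time, key, value in events:
    # Assign the event to its tumbling window and update the aggregate.
    window_start = (event_time // WINDOW_SIZE) * WINDOW_SIZE
    windows[(window_start, key)] += value

    # Advance the watermark; windows ending at or before it are considered complete.
    max_event_time = max(max_event_time, event_time)
    watermark = max_event_time - MAX_OUT_OF_ORDERNESS
    for (start, k), total in sorted(windows.items()):
        if start + WINDOW_SIZE <= watermark and (start, k) not in emitted:
            print(f"window [{start}, {start + WINDOW_SIZE}) key={k} sum={total}")
            emitted.add((start, k))
```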
- Extract Transform Load (ETL) tools
- Sqoop: data transfer between relational databases and Hadoop
- Flume: massive log aggregation
- Apache NiFi
- Apache Airflow
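To show how the orchestration side of ETL (Apache Airflow, listed above) typically looks, here is a minimal DAG sketch assuming Airflow 2.x; the DAG id, schedule, and echo commands are placeholders for real extract/transform/load steps.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# A minimal daily pipeline with three placeholder shell steps.
with DAG(
    dag_id="daily_etl_sketch",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extracting source data")
    transform = BashOperator(task_id="transform", bash_command="echo transforming records")
    load = BashOperator(task_id="load", bash_command="echo loading into the warehouse")

    extract >> transform >> load
```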
- Converged Data Warehouse
- Introduction to DWS
- Hive: distributed data warehouse
- Kafka: distributed publish-subscribe messaging system
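Finally, the publish-subscribe model behind Kafka can be demonstrated with the third-party kafka-python client; the broker address, topic name, and messages below are assumptions for a local test setup.

```python
from kafka import KafkaProducer, KafkaConsumer

BROKER = "localhost:9092"  # assumed broker address
TOPIC = "clickstream"      # assumed topic name

# Produce a few messages.
producer = KafkaProducer(bootstrap_servers=BROKER)
for i in range(3):
    producer.send(TOPIC, key=str(i).encode(), value=f"event-{i}".encode())
producer.flush()

# Consume them from the beginning of the topic.
consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=BROKER,
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop iterating if no new messages arrive
)
for record in consumer:
    print(record.partition, record.offset, record.key, record.value)
```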