Big Data is changing how organizations make decisions, enabling analyses that traditional decision-making methods cannot support. This course equips participants with the skills to understand and design Big Data architectures that support data analytics solutions within their organizations.
Through practical exercises and workshops, participants will apply key techniques for managing large-scale data projects, with a focus on machine learning algorithms and AI use cases. Each participant will finish with a practical action plan that includes a Big Data architecture ready to implement within their organization.
By the end of the course, participants will be able to:
- Design plans for implementing big data projects and develop strategies for data-driven solutions.
- Understand the challenges that big data poses for traditional tools such as Excel.
- Discuss the benefits and challenges of using Hadoop and other distributed data architectures.
- Review big data storage and processing technologies such as PostgreSQL and MongoDB.
- Learn the most common machine learning algorithms and understand the importance of ethics in data analysis and AI.
- Create architectural diagrams for analytics-focused use cases.
Core Concepts
- Definition of Big Data and its key elements (the five “V”s: volume, velocity, variety, veracity, and value).
- Relationship between Big Data and data analytics.
- Impact of Big Data on modern technologies and the open-source revolution.
- Types of data: text, audio, images.
- Professional roles related to Big Data.
Examples and Applications
- Leading companies: Netflix, LinkedIn, Facebook, Google, Orbitz, Dell.
- Best practices for designing Big Data projects and assessing the current state of an organization.
Storage and Engineering
- Big Data engineering and fundamental models.
- Hadoop environment: overview, HDFS (Hadoop Distributed File System), and MPP (massively parallel processing) vs. in-memory distributed applications.
- Databases: SQL vs. NoSQL (PostgreSQL, MongoDB, Cassandra).
- Streaming data, data warehouses, and data lakes.
- Role of cloud computing, networking, and data transfer risks.
- ETL (extract, transform, load) processes and Big Data computation technologies: MapReduce, Spark, Storm, Kafka.
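As a taste of the computation models listed above, the MapReduce idea can be sketched in a few lines of plain Python. This is only an illustrative, single-machine sketch and not how Hadoop itself is implemented: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample documents are hypothetical, chosen for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key.

    In a real cluster the framework does this across machines;
    here it is an in-memory dictionary (an assumption of this sketch).
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single total."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input for illustration only.
documents = ["big data big ideas", "data lakes and data warehouses"]
counts = word_count(documents)
print(counts)
```

The point of the split is that the map and reduce steps are independent per document and per key, which is what lets a distributed framework run them in parallel on many machines.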
Data Analytics
- Fundamentals, objectives, and team roles in data analytics.
- Key math and statistics concepts; supervised and unsupervised learning.
- Techniques for extracting value from data (Data Science 5 “Pens”).
- Importance of ethics and programming literacy.
Big Data Solution Design
- Identifying analytical opportunities and organizational challenges.
- Describing data impact and using data to solve problems.
- Determining data sources and planning storage and computation strategies.
- Conducting brainstorming sessions to plan and implement analytics solutions.
Target Audience
- Data analysts, data engineers, and data scientists.
- Administrative and technical professionals seeking to understand big data strategies, technologies, and use cases.
Prerequisites
- Participants are expected to have basic programming skills and experience with data analysis using Python, a foundational understanding of database technologies, and awareness of analytics-driven business initiatives.