Skip to content

Sahadat97/Data_Engineering_Learning

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 

Repository files navigation

🏔️ Data Engineering Learning Journey

A structured, hands-on learning repo documenting my path through modern Data Engineering — starting with Snowflake, expanding into the full DE stack.

Learning Status Started Notion Notes


👋 Why This Repo Exists

I'm building real, hands-on experience in Data Engineering by learning in public. This repo contains my SQL scripts, Python experiments, mini-projects, and reflections as I work through each tool and concept — starting with Snowflake.

📓 Deep-dive notes, diagrams & reflectionsNotion Knowledge Base


🗺️ Learning Roadmap

Phase 1 — Snowflake (Current)

📓 All Snowflake notes in one placeSnowflake Notion Page

Week Topic Status
1 Architecture, Virtual Warehouses, Databases & Schemas ✅ Done
2 Data Loading — COPY INTO, Stages, File Formats ✅ Done
3 Semi-structured Data — VARIANT, FLATTEN, JSON 🔄 In Progress
4 Time Travel, Fail-safe, Cloning ⬜ Planned
5 Performance — Clustering, Result Cache, Query Profiling ⬜ Planned
6 Snowpipe, Tasks, Streams (CDC) ⬜ Planned
7 Snowpark (Python in Snowflake) ⬜ Planned
8 RBAC, Data Sharing, Security Best Practices ⬜ Planned

Phase 2 — Orchestration (Upcoming)

  • Apache Airflow — DAGs, Operators, XComs
  • dbt (data build tool) — Models, Tests, Documentation

Phase 3 — Data Movement & Storage (Planned)

  • Apache Kafka — Producers, Consumers, Topics
  • Delta Lake / Iceberg — Table formats
  • AWS S3 / Azure Blob — Cloud storage patterns

Phase 4 — Capstone Project (Planned)

  • End-to-end pipeline: ingestion → transformation → serving layer

📁 Repo Structure

data-engineering-journey/
│
├── snowflake/
│   ├── 01-basics/                  # Warehouses, databases, schemas
│   ├── 02-data-loading/            # Stages, COPY INTO, file formats
│   ├── 03-semi-structured/         # JSON, VARIANT, FLATTEN
│   ├── 04-time-travel/             # Time Travel & Fail-safe
│   ├── 05-performance/             # Clustering, caching, query profiling
│   ├── 06-streams-tasks/           # Snowpipe, Tasks, Streams
│   ├── 07-snowpark/                # Python in Snowflake
│   └── projects/                   # Mini end-to-end projects
│
├── airflow/                        # (Coming soon)
├── dbt/                            # (Coming soon)
├── kafka/                          # (Coming soon)
│
└── resources.md                    # Curated links, docs, courses

❄️ Snowflake — Key Concepts Covered

Architecture

  • Snowflake's three-layer architecture: Storage / Compute / Cloud Services
  • Virtual Warehouses — sizing, auto-suspend, auto-resume
  • Multi-cluster warehouses for concurrency scaling

Data Loading

  • Internal & External Stages
  • COPY INTO with CSV, JSON, Parquet
  • File Format objects and options

More concepts added weekly. Follow along ⭐


🛠️ Tools & Stack

Tool Purpose Level
Snowflake Cloud Data Warehouse 🔄 Learning
SQL Query & transformation language ✅ Comfortable
Python Scripting & Snowpark ✅ Comfortable
dbt Data transformation framework ⬜ Planned
Apache Airflow Orchestration ⬜ Planned
Apache Kafka Streaming ingestion ⬜ Planned
Git / GitHub Version control & learning in public ✅ Using
Docker Containerisation & local dev environments 🔄 Learning
GitHub Actions CI/CD pipelines & workflow automation 🔄 Learning
Terraform Infrastructure as Code (IaC) ⬜ Planned
Linux / Bash Scripting & server operations ⬜ Planned
Power BI Dashboard building ✅ Comfortable

📓 Notion Knowledge Base

I maintain detailed notes alongside this repo — including:

  • Concept explanations in my own words
  • Diagrams & architecture sketches
  • "Gotchas" and lessons learned
  • Topic-by-topic summaries

👉 Open Notion Notes


📌 How I Learn (My Process)

  1. Read official docs / watch a focused tutorial
  2. Write notes in Notion — explain it like I'd teach it
  3. Code it — reproduce the concept from scratch in this repo
  4. Reflect — commit with a meaningful message describing what I learned
  5. Review — revisit tricky concepts a week later

📬 Connect

If you're on a similar learning path, I'd love to connect.


Updated regularly as I progress. Last updated: 29.04. 2026.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors