Data Engineer Path: resources and roadmap


If you’re passionate about technology, code, and data, this is the place for you.

Data lifecycle

In this first stage, you need to understand the data lifecycle: where data comes from, how it’s stored, and all the ways it can be used.

Learn what each data role does and how the roles work together.

  • Data Engineer
  • Data Architect
  • Data Analyst
  • Data Scientist
  • BI Engineer
  • ML Engineer
  • AI Engineer

Resources:

What does a Data Engineer do?

Understand the Data Engineer role.

Resources:

Core skills to start as a Data Engineer

Fundamentals

Learn core foundations, technologies, architectures, and design patterns.

  • Data Warehouse
  • Data Lake - Lakehouse
  • Medallion Architecture
  • What data ingestion is
  • What .parquet files are
  • Where data is really stored
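To make the Medallion Architecture and data ingestion items above more concrete, here is a minimal sketch in plain Python. It is only an illustration under assumptions: the sample data, column names, and the function names (ingest_bronze, build_silver, build_gold) are hypothetical, and a real pipeline would typically land each layer as files (for example .parquet) in a data lake rather than keep it in memory.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw export; in practice this would come from an API,
# a file drop, or a source database.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,not_a_number
1,alice,10.50
3,alice,4.50
"""


def ingest_bronze(raw_text):
    """Bronze: land the data as-is, every field kept as a string."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def build_silver(bronze_rows):
    """Silver: enforce types, drop invalid rows, deduplicate by order_id."""
    seen = set()
    silver = []
    for row in bronze_rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount is not numeric
        order_id = int(row["order_id"])
        if order_id in seen:
            continue  # skip duplicate orders
        seen.add(order_id)
        silver.append(
            {"order_id": order_id, "customer": row["customer"], "amount": amount}
        )
    return silver


def build_gold(silver_rows):
    """Gold: aggregate into a business-ready table (total spend per customer)."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


bronze = ingest_bronze(RAW_CSV)
silver = build_silver(bronze)
gold = build_gold(silver)
print(gold)
```

The point of the three layers is that the raw data is never modified: if a cleaning rule in the silver step turns out to be wrong, the silver and gold tables can be rebuilt from bronze.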

Resources:

Choose your data stack

There are multiple learning paths, each with its own data stack, for example:

  • GCP: BigQuery, Cloud Storage, Cloud Functions, Airflow
  • AWS: Snowflake, S3, Airflow, dbt
  • Azure: Databricks, Data Factory
  • And others…

Resources by stack:

Azure + Databricks

GCP

In my opinion, this is the best path.

Airflow

dbt

Kafka