Data Engineer Path: resources and roadmap
If you’re passionate about technology, code, and data, this is the place for you.
YouTube Map of Computer Science A high-level map to locate disciplines, fundamentals, and possible learning paths in computing. https://www.youtube.com/watch?v=SzJ46YA_RaA
Data lifecycle
In this first stage, you need to understand the data lifecycle: where data comes from, how it’s stored, and all the ways it can be used.
Learn and understand the role of different data positions and how they work together.
- Data Engineer
- Data Architect
- Data Analyst
- Data Scientist
- BI Engineer
- ML Engineer
- AI Engineer
Resources:
YouTube The Data Movie | Data Literacy Explained Visually A visual summary of data literacy, why it matters, and how it shows up in business. https://www.youtube.com/watch?v=J2rQTJby8XM
What does a Data Engineer do?
Understand the Data Engineer role.
Resources:
YouTube Fundamentos de Ingeniería de Datos | E01 Defines the role, key responsibilities, and how it adds value to data teams. https://www.youtube.com/watch?v=dNvGBMzhf2A
YouTube Ingeniería de Datos es la profesión MÁS DEMANDADA en 2025 Market context, role growth, and career opportunities. https://www.youtube.com/watch?v=UaNc08OY_YY
YouTube How I would learn Data Engineering in 2025 Step-by-step path with study priorities, projects, and recommended tools. https://www.youtube.com/watch?v=odZHmYgebbw
Core skills to start as a Data Engineer
YouTube ¿Qué necesito para ser Data Engineer en el 2025? Clear checklist of fundamentals, base stack, and starter projects. https://www.youtube.com/watch?v=rZmNLUQYBic
Fundamentals
Learn core foundations, technologies, architectures, and design patterns.
- Data Warehouse
- Data Lake and Lakehouse
- Medallion Architecture
- What data ingestion is
- What .parquet files are
- Where data is really stored
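To make the Medallion Architecture item more concrete, here is a minimal sketch in plain Python, assuming three in-memory layers. The sample records, field names, and the to_silver and to_gold helpers are hypothetical and exist only for illustration; a real pipeline would typically run on an engine such as Spark or a warehouse rather than on Python lists.

```python
# Bronze: hypothetical raw events, stored as they arrived (untyped, unvalidated).
bronze = [
    {"order_id": "1", "amount": "10.50", "country": "es"},
    {"order_id": "2", "amount": "bad", "country": "MX"},
    {"order_id": "1", "amount": "10.50", "country": "es"},  # duplicate record
    {"order_id": "3", "amount": "4.00", "country": "mx"},
]


def to_silver(rows):
    """Validate, type, and deduplicate Bronze rows."""
    seen = set()
    silver = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount cannot be parsed
        if row["order_id"] in seen:
            continue  # drop duplicates by order_id
        seen.add(row["order_id"])
        silver.append(
            {
                "order_id": int(row["order_id"]),
                "amount": amount,
                "country": row["country"].upper(),
            }
        )
    return silver


def to_gold(rows):
    """Aggregate Silver rows into a business-level table: revenue per country."""
    totals = {}
    for row in rows:
        totals[row["country"]] = totals.get(row["country"], 0.0) + row["amount"]
    return totals


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'ES': 10.5, 'MX': 4.0}
```

Bronze keeps everything as it landed, Silver holds cleaned and typed records, and Gold holds aggregates ready for reporting.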
Resources:
Spotify Introducción a Lakehouse Architecture — Capítulo 1 Introduction to the Lakehouse architecture. https://open.spotify.com/episode/4YXzuQGihWck7V71mGinkl?si=O7gqqiM8S-aL2wyRRrBY-A
YouTube Data Engineer Bootcamp Express — Capítulo 1 Hands-on introduction to the stack and core concepts. https://www.youtube.com/live/_GqpkwZqyAg?si=3fs15NQItzMQPJBU
YouTube Data Engineer Bootcamp Express — Capítulo 2 Continues with architecture, pipelines, and design criteria. https://www.youtube.com/watch?v=d_vG7XV9Vvg
O'Reilly Fundamentals of Data Engineering Comprehensive book on architecture, the data lifecycle, and best practices. https://www.oreilly.com/library/view/fundamentals-of-data/9781098108298/
YouTube SQL Data Warehouse from Scratch End-to-end project for understanding modeling, ETL, and reporting. https://www.youtube.com/watch?v=9GVqKuTVANE
YouTube Data Engineering Course for Beginners Introductory course covering the tools and typical workflows of a Data Engineer. https://www.youtube.com/watch?v=PHsC_t0j1dU
O'Reilly Building Medallion Architectures Guide to designing Bronze/Silver/Gold layers with practical criteria. https://www.oreilly.com/library/view/building-medallion-architectures/9781098178826/
O'Reilly Data Pipelines Pocket Reference Quick reference for pipeline patterns, components, and design decisions. https://www.oreilly.com/library/view/data-pipelines-pocket/9781492087823/
Choose your data stack
There are multiple learning paths, each with its own data stack, for example:
- GCP: BigQuery, Cloud Storage, Cloud Functions, Airflow
- AWS: Snowflake, S3, Airflow, dbt
- Azure: Databricks, Data Factory
- And others…
Resources by stack:
Azure + Databricks
YouTube Azure Data Engineer Full Course For Beginners Complete path covering storage, Data Factory, and engineering practices. https://www.youtube.com/watch?v=bSx3DbJNQk4
YouTube Databricks Tutorial 2025 Playlist covering notebooks, Spark, and typical Databricks workflows. https://www.youtube.com/watch?v=XOSuR8g2SfQ&list=PL2IsFZBGM_IGiAvVZWAEKX8gg1ItnxEEb
O'Reilly Spark: The Definitive Guide Reference book for mastering Spark and its performance tuning. https://www.oreilly.com/library/view/spark-the-definitive/9781491912201/
O'Reilly Learning Spark, 2nd Edition Practical guide to building jobs and pipelines in Spark. https://www.oreilly.com/library/view/learning-spark-2nd/9781492050032/
YouTube Databricks Data Engineer Associate Certification Focused preparation for the certification, with examples. https://www.youtube.com/watch?v=0Hd5vYqin7w
GCP
In my opinion, this is the best path.
Coursera Data Engineering, Big Data, and Machine Learning on GCP Specialization with hands-on practice in BigQuery, Dataflow, and ML on GCP. https://www.coursera.org/specializations/gcp-data-machine-learning
Airflow
YouTube What is Apache Airflow? Explains DAGs, operators, and scheduling concepts. https://www.youtube.com/watch?v=CGxxVj13sOs
Astronomer Airflow 101 Guided course on building pipelines and orchestration. https://academy.astronomer.io/
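The core idea behind a DAG (directed acyclic graph) of tasks can be sketched without Airflow itself. The example below is not Airflow's API; it is a rough illustration that uses Python's standard-library graphlib module to work out a valid execution order for three hypothetical tasks (extract, transform, load), each declared with the tasks it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def run_order(deps):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)  # ['extract', 'transform', 'load']
```

An orchestrator builds on this ordering by adding scheduling, retries, and monitoring, which is the part the resources above cover.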
dbt
dbt dbt Learn Official course on analytics modeling and testing in dbt. https://learn.getdbt.com/courses/dbt-fundamentals
Kafka
YouTube Nunca te explicaron Kafka así Clear explanation of topics, partitions, and real-world use cases. https://www.youtube.com/watch?v=wiIt2ZGu6Qc&t=144s
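As a rough illustration of how keyed messages map to partitions, the sketch below assigns each message to a partition by hashing its key. It does not use a Kafka client. The partition_for helper, the sample events, and the choice of MD5 are assumptions made for this example; Kafka's Java client uses a different hash (murmur2) in its default partitioner. The point is only that the same key always lands in the same partition, which is what preserves per-key ordering.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition with a stable hash (illustrative only)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# Hypothetical (key, value) events for a topic with 3 partitions.
events = [("user-1", "login"), ("user-2", "login"), ("user-1", "purchase")]

partitions = {p: [] for p in range(3)}
for key, value in events:
    partitions[partition_for(key, 3)].append((key, value))

for partition, messages in partitions.items():
    print(partition, messages)
```

Both user-1 events end up in the same partition, in the order they were produced.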