Data Engineering Weekly

Data Engineering Weekly #56

Weekly Data Engineering Newsletter

Ananth Packkildurai
Sep 20, 2021

Data Engineering Weekly - Brought to You by RudderStack - the Customer Data Platform for Developers

RudderStack provides data pipelines that make it easy to collect data from every application, website, and SaaS platform, then activate it in your warehouse and business tools.


Event: Join Impact 2021 on November 3, 2021: The First-Ever Data Observability Summit. Join Today's Leading Data Pioneers

Hear from data leaders pioneering the technologies and processes shaping data engineering. Featuring the first Chief Data Scientist of the U.S., the founder of the Data Mesh, and many more!

Click To Get Your Free Ticket For All Data Engineering Weekly Readers


Benn Stancil: The Data OS

Y Combinator, an incubator of both startups and the Silicon Valley zeitgeist, funded 15 analytics, data engineering, and AI and ML companies. In 2021, they funded 100. Does the modern data stack bring too many tools to the table to solve the data problem? Benn Stancil discusses the idea of a data OS.

https://benn.substack.com/p/the-data-os


Data Engineering - UC Berkeley, Spring 2021

UC Berkeley published its spring 2021 data engineering course slides and resources. It is excellent learning material for data engineering practitioners.

https://cal-data-eng.github.io/


Airbnb: Automating Data Protection at Scale

Data protection and privacy monitoring are critical aspects of a data management platform. They are also the most challenging, since data can travel through multiple data stores, which makes it hard to track manually. Airbnb writes about Madoka, a metadata system for data protection that maintains the security and privacy-related metadata for all data assets on the Airbnb platform.

https://medium.com/airbnb-engineering/automating-data-protection-at-scale-part-1-c74909328e08


Uber: YAML Generator for Funnel YAML Files: Streamlining the Mobile Data Workflow Process

Funnel analysis is a critical analytical feature built on click-tracking events. Uber writes an exciting blog about its YAML generator, followed by a simple UI workflow engine for developing funnel analyses. It raises an interesting data pipeline debate: no-code or code-only pipelines? IMO, the answer is to know your audience and their workflow so you can make them productive.

https://eng.uber.com/streamlining-mobile-data-workflow-process/
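For readers new to funnel analysis, here is a minimal sketch of the underlying computation, not Uber's implementation. The event tuples, step names, and the `funnel_counts` helper are all hypothetical, and the sketch assumes distinct step names and that a user advances only when events arrive in funnel order.

```python
from collections import defaultdict


def funnel_counts(events, steps):
    """Count how many users reached each funnel step, in order.

    events: iterable of (user_id, timestamp, event_name) tuples (hypothetical shape).
    steps:  ordered list of distinct event names that define the funnel.
    """
    # Group each user's events so they can be replayed in time order.
    by_user = defaultdict(list)
    for user_id, timestamp, name in events:
        by_user[user_id].append((timestamp, name))

    reached = [0] * len(steps)
    for user_events in by_user.values():
        next_step = 0  # index of the step this user must hit next
        for _, name in sorted(user_events):
            if next_step < len(steps) and name == steps[next_step]:
                reached[next_step] += 1
                next_step += 1
    return dict(zip(steps, reached))


# Hypothetical click-tracking events for three users.
EVENTS = [
    ("u1", 1, "open_app"), ("u1", 2, "request_ride"), ("u1", 3, "complete_trip"),
    ("u2", 1, "open_app"), ("u2", 2, "request_ride"),
    ("u3", 1, "request_ride"), ("u3", 2, "open_app"),
]
STEPS = ["open_app", "request_ride", "complete_trip"]

if __name__ == "__main__":
    print(funnel_counts(EVENTS, STEPS))
```

A funnel definition file, in YAML or otherwise, would essentially carry the `STEPS` list plus filters; generating that file from a UI is what removes the hand-editing step.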


Intuit: A Paved Road for Data Pipelines

Intuit gives a general overview of its data infrastructure, emphasizing that a lack of standardization can lead to fragmentation and islands of computing. The blog describes Intuit's developer portal and its UI-driven pipeline lifecycle management platform.

https://medium.com/intuit-engineering/a-paved-road-for-data-pipelines-779004143e41


Sponsored: RudderStack - Churn Prediction With BigQueryML to Increase Mobile Game Revenue

Here’s an interesting case study on how machine learning can directly impact the bottom line. RudderStack outlines how the app developer Torpedo Labs uses BigQuery ML to identify high-value mobile game players who are dangerously close to churning.

https://rudderstack.com/blog/churn-prediction-with-bigqueryml


Pinterest: Faster Flink adoption with self-service diagnosis tool at Pinterest

Self-service diagnostic tooling is a vital part of a data platform for democratizing adoption. Pinterest writes about Dr. Squirrel, a Flink log aggregator that performs job health checks, flags unhealthy jobs explicitly, and provides root cause analysis and actionable steps to help fix the issues.

https://medium.com/pinterest-engineering/faster-flink-adoption-with-self-service-diagnosis-tool-at-pinterest-50a07143f444


Cloudera: Operating Apache Kafka with Cruise Control

Cruise Control is one of my favorite tools for operating Apache Kafka at scale. Cloudera writes an exciting blog giving an overview of Cruise Control and its use cases.

https://blog.cloudera.com/operating-apache-kafka-with-cruise-control/


AutoTrader: Auto-generating an Airflow DAG using the dbt manifest

It is always challenging to integrate Airflow, a task-dependency system, with dbt, a model-dependency system. AutoTrader writes an exciting blog about its DbtTaskGenerator, which auto-generates Airflow DAGs from dbt's manifest files.

https://engineering.autotrader.co.uk/2021/09/15/auto-generated-airflow-dag-for-dbt.html
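To make the idea concrete, here is a minimal sketch of how model-to-model dependencies can be read out of a dbt manifest. It is not AutoTrader's DbtTaskGenerator: the helper names and the sample project are hypothetical, and the only parts taken from dbt are the manifest's `nodes` mapping and each node's `resource_type` and `depends_on.nodes` fields. The sketch stops at a dependency map and a run order, and leaves creating the Airflow tasks to the reader.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def model_dependencies(manifest: dict) -> dict[str, list[str]]:
    """Map each dbt model's unique_id to the model unique_ids it depends on."""
    nodes = manifest.get("nodes", {})
    models = {
        uid for uid, node in nodes.items()
        if node.get("resource_type") == "model"
    }
    deps = {}
    for uid in sorted(models):
        upstream = nodes[uid].get("depends_on", {}).get("nodes", [])
        # Keep only model-to-model edges; sources, seeds, and tests are ignored here.
        deps[uid] = sorted(u for u in upstream if u in models)
    return deps


def run_order(deps: dict[str, list[str]]) -> list[str]:
    """Return the models in an order where every upstream model comes first."""
    return list(TopologicalSorter(deps).static_order())


# Hypothetical, heavily trimmed manifest for illustration only.
SAMPLE_MANIFEST = {
    "nodes": {
        "model.shop.stg_orders": {
            "resource_type": "model",
            "depends_on": {"nodes": ["source.shop.raw.orders"]},
        },
        "model.shop.stg_customers": {
            "resource_type": "model",
            "depends_on": {"nodes": []},
        },
        "model.shop.customer_orders": {
            "resource_type": "model",
            "depends_on": {
                "nodes": ["model.shop.stg_orders", "model.shop.stg_customers"]
            },
        },
        "test.shop.not_null_orders_id": {
            "resource_type": "test",
            "depends_on": {"nodes": ["model.shop.stg_orders"]},
        },
    }
}

if __name__ == "__main__":
    deps = model_dependencies(SAMPLE_MANIFEST)
    for model in run_order(deps):
        print(model, "<-", deps[model])
```

In a real project the manifest would be loaded with `json.load` from dbt's `target/manifest.json`, and each edge in the map would become an `upstream_task >> downstream_task` relationship between the Airflow tasks that run the corresponding models.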


Links are provided for informational purposes and do not imply endorsement. All views expressed in this newsletter are my own and do not represent current, former, or future employers' opinions.



