The Data Platform Fundamentals Guide
A comprehensive guide for data platform owners looking to build a stable and scalable data platform, starting with the fundamentals and wrapping up with real-world examples illustrating how teams have built in-house data platforms for their businesses.
Philipp Schmid: The New Skill in AI is Not Prompting, It's Context Engineering
Building powerful and reliable AI Agents is becoming less about finding a magic prompt or model updates. It is about engineering context and providing the right information and tools in the right format at the right time.
The author explains why context engineering is crucial in the development of AI agents.
https://www.philschmid.de/context-engineering
Piethein Strengholt: Unstructured Data Management at Scale
Unstructured data management will be the next significant challenge in big data management as we continually enhance our ability to parse and understand various forms of data. The author highlights the processing of unstructured data in alignment with the Medallion architecture and discusses Tensor Lake and LlamaParse.
https://piethein.medium.com/unstructured-data-management-at-scale-4c612f822f70
Médéric Hurier (Fmind): The Great Data Divergence: Why Generative AI Demands a New Approach Beyond the Data Lake
The article offers a counterpoint to the previous one, questioning whether the data lake is a valid approach for emerging Gen AI use cases. The author argues that freshness, context, and low-latency access are the keys to successful Gen AI applications, and questions whether the data lake's medallion architecture can deliver them.
Sponsored: Rapidly developing ELT pipelines with dltHub and Dagster
On July 8th, join us for a technical deep dive co-hosted by Dagster Labs and our partners at dltHub, where we explore how to rapidly build and scale ELT pipelines using the power of open-source tooling. Whether you're exploring your first data ingestion project or scaling existing pipelines, this session will equip you with the tools and best practices to iterate faster, ship confidently, and operate reliably in production.
DataDog: How we built reliable log delivery to thousands of unpredictable endpoints
DataDog writes about building a log delivery system for external endpoints, drawing inspiration from package delivery networks. The design, which writes a small microbatch per destination and wraps it in an envelope, is an interesting case study for fan-out writes.
https://www.datadoghq.com/blog/engineering/reliable-log-delivery/
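To make the per-destination microbatch idea concrete, here is a minimal sketch. It is not DataDog's implementation: the `MicroBatcher` and `Envelope` names, the envelope fields, and the size-based flush rule are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Envelope:
    """Hypothetical wrapper around one microbatch for one destination."""
    destination: str
    sequence: int   # per-destination batch counter (assumed field)
    events: list


class MicroBatcher:
    """Buffers events per destination and seals a batch once it is full."""

    def __init__(self, max_batch_size=3):
        self.max_batch_size = max_batch_size
        self._pending = defaultdict(list)   # destination -> buffered events
        self._sequence = defaultdict(int)   # destination -> next sequence number

    def add(self, destination, event):
        """Buffer one event; return an Envelope if the batch filled up, else None."""
        batch = self._pending[destination]
        batch.append(event)
        if len(batch) >= self.max_batch_size:
            return self._seal(destination)
        return None

    def flush(self):
        """Seal every non-empty batch, for example on a timer tick."""
        return [self._seal(d) for d in list(self._pending) if self._pending[d]]

    def _seal(self, destination):
        events = self._pending.pop(destination)
        seq = self._sequence[destination]
        self._sequence[destination] += 1
        return Envelope(destination, seq, events)
```

Because each destination has its own buffer and sequence counter, a slow or failing endpoint only delays its own envelopes and does not block the others.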
Gojek: Introducing xkafka — Kafka, but Simpler (for Go)
What if we could make using Kafka in Go feel more like writing a simple HTTP service?
Gojek details its Kafka SDK abstraction, following the ‘all batteries included’ pattern.
https://medium.com/gojekengineering/introducing-xkafka-kafka-but-simpler-for-go-91f4ce3edade
Uber: Reinforcement Learning for Modeling Marketplace Balance
Uber shares insights on leveraging reinforcement learning (RL) to enhance driver-rider matching by modeling it as an infinite-horizon Markov Decision Process (MDP) and applying a DQN-inspired value iteration method with temporal difference learning. Innovations include utilizing negative signals from driver idle states in reward modeling, employing contrastive loss for smoother geospatial embeddings, and validating models through a custom Monte Carlo-based evaluation pipeline. The global deployment resulted in a 0.52% increase in driver earnings and a 2.2% decrease in rider cancellations.
https://www.uber.com/en-IN/blog/reinforcement-learning-for-modeling-marketplace-balance/
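For readers new to temporal difference learning, the core update can be sketched in a few lines. This is a simplified tabular TD(0) value update, not Uber's DQN-inspired model; the state names, reward, and hyperparameters below are hypothetical.

```python
def td_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one TD(0) step: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))."""
    target = reward + gamma * values[next_state]   # bootstrapped estimate of return
    td_error = target - values[state]              # how far off the current estimate is
    values[state] += alpha * td_error
    return td_error


# Hypothetical example: a driver moves from zone "a" to zone "b" and earns 1.0.
values = {"a": 0.0, "b": 1.0}
error = td_update(values, "a", reward=1.0, next_state="b", alpha=0.5, gamma=0.9)
```

The update nudges the value of the current state toward the observed reward plus the discounted value of the next state, which is what lets the estimate improve without waiting for a full episode to finish.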
Wix: Advancing Enterprise AI: How Wix is Democratizing RAG Evaluation
Wix open-sources WixQA, a realistic benchmark suite derived from customer support interactions, and RAGXplain, an evaluation framework turning metrics into human-readable insights for enterprise Retrieval-Augmented Generation (RAG) systems. WixQA includes expert-written, simulated, and synthetic datasets paired with a synchronized knowledge base, while RAGXplain provides clear, actionable recommendations based on six performance metrics.
https://www.wix.engineering/post/advancing-enterprise-ai-how-wix-is-democratizing-rag-evaluation
Deliveroo: Deliveroo's Machine Learning Platform: Powering the Future of ML
Deliveroo shares insights from building their centralized Machine Learning Platform, designed to standardize workflows, boost engineer productivity by 2-3x, and accelerate model deployment. Combining open-source technologies like Kubernetes, Argo, and Metaflow with custom-built tools, such as Inferoo (a real-time inference service handling over 1 billion daily requests) and a dedicated Feature Store, the platform emphasizes automation, cohesion, and self-service. The team is now enhancing the platform with an internal ML portal for unified model management, monitoring, and deployment.
https://deliveroo.engineering/2025/07/02/deliveroo-ml-platform.html
Henry Ko: TPU Deep Dive
The author delves into the architecture and design of Google's Tensor Processing Units (TPUs), specifically TPUv4, highlighting how systolic arrays, pipelining, and ahead-of-time compilation with XLA enable high throughput and energy efficiency in AI workloads. The article covers the TPU hierarchy, from single-chip TensorCores and memory buffers (CMEM, VMEM) to multi-chip trays, racks, pods, and multislice setups, emphasizing flexible interconnects such as Inter-Chip Interconnect (ICI) and Optical Circuit Switching (OCS). It also explains how XLA abstracts complex distributed topologies (such as 3D torus and twisted torus) to optimize various parallelism strategies.
https://henryhmko.github.io/about/about.html
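As a rough intuition for how a systolic array computes a matrix product, the sketch below models the timing of a generic output-stationary array in plain Python. It is a textbook-style toy, not a description of the TPU's actual dataflow or hardware; the function name and the cycle model are assumptions.

```python
def systolic_matmul(a, b):
    """Multiply a (n x m) by b (m x p) using a cycle-by-cycle systolic timing model.

    Processing element (i, j) holds one output accumulator. Inputs are skewed so
    that at cycle t it sees a[i][k] and b[k][j] with k = t - i - j.
    """
    n, m, p = len(a), len(b), len(b[0])
    acc = [[0] * p for _ in range(n)]
    total_cycles = n + m + p - 2   # last operand pair reaches PE (n-1, p-1) at cycle n+m+p-3
    for t in range(total_cycles):
        for i in range(n):
            for j in range(p):
                k = t - i - j          # which operand pair arrives at this PE now
                if 0 <= k < m:
                    acc[i][j] += a[i][k] * b[k][j]
    return acc


result = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
# result == [[19, 22], [43, 50]]
```

Each element only performs one multiply-accumulate per cycle on data handed over by its neighbors, which is why the design keeps memory traffic low while many elements stay busy at once.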
All rights reserved, ProtoGrowth Inc., India. I have provided links for informational purposes and do not suggest endorsement. All views expressed in this newsletter are my own and do not represent current, former, or future employers’ opinions.