Discussion about this post

Jon

Do you think the Iceberg spec was intended to solve all those problems?

I think of specs and protocols like HTTP or gRPC. They define contracts for systems to communicate with one another, but there's no built-in definition for how long a request/response should take. That varies wildly across use cases.

So yeah, I agree that a semantic spec isn't sufficient for Iceberg as a technology. But that's the case with everything, right? I can't look at some service and say "oh, they provide RESTful API endpoints, that's all I need to know that it will work for my use case!" But it's sure nice that most web services use standard protocols, rather than having bespoke binary protocols for every integration under the sun.

Neural Foundry

Sharp analysis on the operational vs semantic gap in REST specs. The ListTables sync pathology you describe is exactly what breaks at scale, except most teams blame the catalog implementation rather than the underspecified protocol. I've debugged similar issues where query planners stall because there's no bounded expectation for metadata latency, so they can't distinguish between a slow catalog and a broken one. The commit contention problem is worse than people realize because aggressive retry clients don't just starve others, they amplify load during exactly the moments when the system is already stressed.
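
To make the retry-amplification point concrete, here is a minimal sketch of what a less aggressive client could do: capped exponential backoff with full jitter, plus an overall deadline so the caller can tell "slow" from "give up". Nothing here comes from the Iceberg spec or any real catalog client; `CommitConflict`, `try_commit`, and every parameter value are hypothetical names and numbers chosen for illustration.

```python
import random
import time


class CommitConflict(Exception):
    """Hypothetical error: the catalog rejected a commit because of contention."""


def commit_with_backoff(try_commit, attempts=6, base=0.1, cap=5.0,
                        deadline_s=10.0, sleep=time.sleep,
                        clock=time.monotonic, rng=random.random):
    """Retry try_commit with capped exponential backoff and full jitter.

    try_commit is an assumed zero-argument callable that returns a result
    on success and raises CommitConflict on contention.
    """
    start = clock()
    for attempt in range(attempts):
        try:
            return try_commit()
        except CommitConflict:
            pass  # fall through to the backoff logic below

        # Full jitter: pick a random delay in [0, min(cap, base * 2^attempt)).
        # Randomizing spreads out clients that all failed at the same moment.
        delay = rng() * min(cap, base * 2 ** attempt)

        # Stop if this was the last attempt or the sleep would blow the deadline.
        if attempt == attempts - 1 or clock() - start + delay > deadline_s:
            break
        sleep(delay)

    raise TimeoutError("commit did not succeed within the retry budget")
```

The jitter is what keeps a crowd of failed clients from retrying in lockstep, and the deadline gives the caller a bounded answer instead of an open-ended stall. Whether bounds like these belong in the protocol or in each client is the open question in the thread.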
