We talk about the “tin layer” in Delta Lake: The Definitive Guide. It’s for landing-zone, pre-bronze data: essentially, where things land before being converted to Delta (or really Iceberg or any future table format).
Isn't this a feature store?
Not all features are real-time in nature. Real-time serving is one part, but real-time ingestion as a platform is what distinguishes the platinum layer from a feature store.
Ananth, your description of the Platinum layer — especially its focus on data driving actions, embedded intelligence, and serving applications/ML models — strongly resonates with the concept of Reverse ETL.
Was this similarity an intentional design philosophy, or do you see key distinctions between the two (Reverse ETL vs. Platinum layer)?
Reverse ETL often focuses on activating a specific business function, while the platinum layer focuses on the platform aspect of real-time data.
Great article, just not sure why the emphasis on "aggregates" in the Gold layer. I presume it means that data is integrated from multiple sources (bronze & silver datasets), not that it needs to be aggregated. The best reporting datasets are often ones with row-level detail, in my view.
Also - I'd say the core strength and purpose of Gold is that it's aligned and modelled to fit a specific business process or use case (e.g. 2 Gold layers will serve different use cases whilst re-using 80% of the Silver datasets).
The granularity of the facts can vary based on the use cases, but they are largely aggregated and refined.
Nice, Ananth. Indeed, we need a platinum layer, and it can be designed based on the specific use case.
Great Post! Thanks for sharing.
Awesome post
Good article. How do you orchestrate the platinum layer?
In the last diagram, you have mentioned some platforms for storage in the gold layer which are not correct, I think.
Real time analytics cannot sit on top of layers that are run on low frequency cronjobs which are typically used for business analytics. Medallion is typically realised with batch jobs. Forcing streaming at the end of a batch sequence can be an anti-pattern.