The best conversations happen at home, especially when the two of you have different domains of expertise. Different perspectives and life experiences always bring innovative thoughts and help refine your ideas. I captured a lunch conversation with my wife about data contracts and hope it will be helpful for you. She is a cancer biologist, hence the references to brain, kidney, and cancer research. Stay with me; it has some interesting relevance.
Neha:
What is this data contract that you keep talking about?
Ananth:
A data contract is an agreement between the producer and the consumers of data that abstractly describes the data to be exchanged, i.e., an explicit description of expectations around the data model, data quality, and data availability.
Neha:
I don’t understand what you’re trying to say. It is all jargon to me.
Ananth:
Let’s take a scenario in your lab. You’re running an experiment on mice, and you want to share your data about certain drugs on the mice with your colleague. How will you share the data?
Neha:
We usually share it in an Excel sheet.
Ananth:
How will your colleagues know what data you’re collecting to ensure it is helpful for their research?
Neha:
We exchange notes now and then; sometimes, we implicitly make some assumptions about the structure of the data.
Ananth:
Hmm, isn’t it dangerous? What happens if the assumption goes wrong?
Neha:
Oh yes, that happens all the time. We were working on an experiment to record a drug reaction in a mouse’s kidney and brain. Measuring the weight of the kidney is an essential and obvious parameter for the experiment. We thought the person doing the kidney study knew the obvious, and we were under the impression that the measurements were being collected. When we exchanged the measurements at the end of the experiment, we found no data for the weight of the kidneys. We had to wait another six months to re-run the experiment. 😢
Ananth:
Excellent! Oh, I mean, well, that is not good. Who did it? 🤕 Well, you see, human errors are common. It’s no one’s fault, and this is where the data contract comes into play.
If you had written down which measurements you were collecting, how the experiments were interconnected, who was responsible for each measurement, and what the data quality expectations were, you could have prevented it. We call this practice data contracts.
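As an aside for readers who think in code, the written-down agreement Ananth describes could be sketched roughly as below. This is only an illustration under stated assumptions: the `Measurement` class, the `check_contract` function, the column names, and the owner labels are all hypothetical and are not taken from the lab or from any real tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Measurement:
    """One line of the (hypothetical) contract: what is collected and by whom."""
    name: str                          # column expected in the shared sheet
    unit: str                          # unit the value is recorded in
    owner: str                         # who is responsible for collecting it
    required: bool = True              # must every row carry this value?
    min_value: Optional[float] = None  # inclusive lower bound, if any


# Hypothetical contract for the mouse experiment in the conversation.
CONTRACT = [
    Measurement("mouse_id", "id", "experiment lead"),
    Measurement("kidney_weight", "g", "kidney study lead", min_value=0.0),
    Measurement("brain_weight", "g", "brain study lead", min_value=0.0),
]


def check_contract(rows, contract):
    """Return readable violations; an empty list means the rows meet the contract."""
    violations = []
    for i, row in enumerate(rows):
        for m in contract:
            value = row.get(m.name)
            if value is None:
                if m.required:
                    violations.append(
                        f"row {i}: missing {m.name} (owner: {m.owner})"
                    )
                continue
            if m.min_value is not None and value < m.min_value:
                violations.append(
                    f"row {i}: {m.name}={value} is below {m.min_value} {m.unit}"
                )
    return violations


if __name__ == "__main__":
    # A row shaped like the incident: the kidney weight was never recorded.
    shared_rows = [{"mouse_id": "m1", "brain_weight": 0.42}]
    for problem in check_contract(shared_rows, CONTRACT):
        print(problem)
```

Run against the first batch of shared data rather than at the end of the experiment, a check like this would flag the missing kidney weight, and name who owns it, long before six months have passed.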
Neha:
Got it. It makes sense. But won’t it create psychological fear? My colleagues may fear that others will blame them if something goes wrong. It will create destructive team dynamics.
Ananth:
It is a valid concern. The software industry went through a similar transformation when it adopted the DevOps culture for software development. Many companies successfully adopted this model. We hope data contracts can drive a similar cultural change.
Ananth:
I’m going back to your “missing-the-weight-of-the-kidney” measurement incident. Even if a single measurement is missing, you could still arrive at a decision. I’m playing devil’s advocate here to argue that a data contract is not such a big deal. Can you still make a directionally correct decision using partial measurements?
Neha:
You can, but you must apply your prior human knowledge as one variable to make the decision. The outcome then depends on your domain expertise, which can swing the result in any direction. At this point, you are not making a data-driven decision from the experiment; you’re simply looking at the data and making assumptions. It might work in your business context. Maybe the cost of a mistake is low there, but certainly not in cancer research, where people’s lives are at risk.
Ananth:
Yes, that makes sense.
Neha:
Data contracts sound fantastic, and I can imagine how helpful they are. But what is this Schemata that you’ve been talking about recently?
Ananth:
Schemata is for the data practitioners in an organization. Developers may roll out a change that breaks the data pipeline and leads to a chaotic experience for data analysts and data scientists. Schemata enables a dynamic and rapidly evolving data contract framework that provides a collaborative experience for data producers and consumers to define ownership, set expectations, and live happily ever after.
Neha:
That sounds interesting. Good luck with that.
All rights reserved Pixel Impex Inc, India. Links are provided for informational purposes and do not imply endorsement. All views expressed in this newsletter are my own and do not represent current, former, or future employers’ opinions.