#26 - 3 principles of Data Observability
Data Observability may be one of the newest additions to the Modern Data Stack, but the concept is not new. Originating in the software engineering world, Data Observability refers to a set of practices aimed at increasing the ability to infer what happened to data and at reducing data downtime.
There are many choices you will need to make when implementing a Data Observability capability for your stack (e.g. what to monitor, what tool to use, etc.). The following guiding principles will help simplify the thought process and help you make better decisions:
Synchronised observability: Collect data metrics at the exact moment key events happen to data (extracted, loaded, transformed)
Context observability: Gather information about the context around the data (which environment it has been in, which parameters are set for that environment, such as timezone and currency, etc.)
Continuous validation: Continuously test data in each stage of the life cycle
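To make the three principles a little more concrete, here is a minimal Python sketch of how they might fit together in a toy pipeline. Everything in it is an assumption made for illustration: the `observe` and `run_pipeline` helpers, the `Observation` record, the `amount` field and the two checks are hypothetical and do not come from any particular observability tool.

```python
import time
from dataclasses import dataclass


@dataclass
class Observation:
    """One record of what was seen when a pipeline stage finished (hypothetical)."""
    stage: str
    metrics: dict
    context: dict
    failed_checks: list


def observe(stage, rows, checks, context):
    # Synchronised observability: metrics are computed right when the stage ends.
    metrics = {
        "row_count": len(rows),
        "null_amounts": sum(1 for r in rows if r.get("amount") is None),
        "observed_at": time.time(),
    }
    # Continuous validation: the same checks run at every stage.
    failed = [name for name, check in checks.items() if not check(rows)]
    # Context observability: environment parameters are stored with the metrics.
    return Observation(stage, metrics, dict(context), failed)


def run_pipeline(raw_rows, context):
    checks = {
        "non_empty": lambda rows: len(rows) > 0,
        "no_negative_amounts": lambda rows: all(
            r.get("amount") is None or r["amount"] >= 0 for r in rows
        ),
    }
    log = []

    # Extract stage (here just a copy of the input rows).
    extracted = list(raw_rows)
    log.append(observe("extracted", extracted, checks, context))

    # Transform stage: drop rows with a missing amount.
    transformed = [r for r in extracted if r.get("amount") is not None]
    log.append(observe("transformed", transformed, checks, context))

    return transformed, log


rows = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": -5.0},
]
context = {"environment": "staging", "timezone": "UTC", "currency": "EUR"}

_, log = run_pipeline(rows, context)
for obs in log:
    print(obs.stage, obs.metrics["row_count"], obs.failed_checks, obs.context)
```

Because each observation carries its stage name, metrics, context and failed checks together, a problem such as the negative amount above can be traced to the first stage where it appeared. A real setup would send these records to a monitoring store rather than print them.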
Interestingly, these same principles have been applied to solve very similar problems in a different industry. To help us understand data observability better, Andy Petrella offered a thought-provoking analogy in this Kensu blog post: data observability works much like a food tracing system.
If in the data world the goal is to reduce data downtime, then in the food industry the goal is to identify the source of an outbreak as fast as possible.
If in the data world we put systems in place to understand data lineage and what happens to data at each stage, then in the food processing world there is also a thorough system to collect metrics on food processing steps and environments.
If in the data world unit tests are put in place to validate that data rules are met, then in the food industry each component or ingredient of a food product is tested at each processing stage against nutritional standards.
The benefits of data observability and food traceability are similar: faster troubleshooting, more accountability and a more reliable end product.
To understand more about how the same three principles are applied in both the data and the food industry, check out this visually engaging video explaining the rules of food traceability:
That’s it for this week! If you enjoyed or were puzzled by the content, please leave a comment so we can continue the discussion. Throw in a like or a share as well if you know someone who may enjoy this newsletter. Thanks!