DataOps in 3 thoughts: Part 1
As more companies aspire to become data-driven and turn data into a competitive advantage, the pressure on data-delivery teams is at an all-time high. They encounter many challenges along the way: unclear and ever-changing business requirements, data silos that need to be integrated and optimised for reporting or analytics, low-quality data that ruins results, and so on.
These challenges are often tackled in feats of individual or team heroics, but it is clear that hard work or a new tool will only go so far. Short cycle times between proposing an idea and realising it are the exception rather than the rule. Solving this problem requires a more structural change.
A new approach?
In search of a way to improve the processes of data-delivery teams while keeping them sane, the people at DataKitchen developed a set of technical practices, workflows, cultural norms, and architectural patterns that came to be known as DataOps. These ideas are inspired by established practices and methodologies like Agile, DevOps and Lean Manufacturing.
The ideas from the agile methodology focus on continuously reassessing priorities and being able to support new innovations in a matter of days or weeks. This responsiveness is possible through working in short increments, called sprints. The agile way of working is well-established in the software development community.
The principles of DevOps are all about streamlining the build lifecycle using automation. By focusing on continuous integration and continuous deployment, deployment times are significantly reduced and software quality can be boosted (provided there are sufficient tests, of course).
The Lean Manufacturing methodology originated in the Japanese manufacturing industry, where it was used to improve productivity whilst moving to small batch production. It focuses on eliminating waste in the assembly line. We could look at data pipelines as assembly lines as well: they refine, transform and orchestrate continuous streams of data. Applying statistical process control to this data factory, that is, monitoring real-time process measurements and flagging the ones that fall outside their expected range, brings improvements in efficiency and transparency.
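To make the statistical process control idea a bit more concrete, here is a minimal sketch in Python. It is our own illustration, not something prescribed by DataKitchen or the manifesto. It assumes the measurement being monitored is the row count of each pipeline run, and that limits of three standard deviations around the baseline mean are a reasonable threshold. The function names and the sample numbers are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) control limits derived from baseline measurements.

    Assumption: the baseline comes from runs that are known to be healthy.
    """
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center + sigmas * spread


def out_of_control(measurements, limits):
    """Return (index, value) pairs for measurements outside the limits."""
    lower, upper = limits
    return [(i, x) for i, x in enumerate(measurements) if not lower <= x <= upper]


# Hypothetical row counts from earlier, healthy pipeline runs.
baseline_row_counts = [1000, 1020, 980, 1010, 990, 1005, 995]
limits = control_limits(baseline_row_counts)

# Hypothetical row counts from new runs; the second one dropped sharply.
new_runs = [1002, 640, 1015]
flagged = out_of_control(new_runs, limits)
print(flagged)  # the run with 640 rows is flagged
```

In a real pipeline such a check would typically run after every load and raise an alert instead of printing, and the baseline would be refreshed as the data evolves.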
Wait, there’s more!
Similar to the Agile Manifesto, the people at DataKitchen have published a DataOps manifesto, along with an open invitation for you to sign and endorse it. You can find the principles and some clarification on the DataOps Manifesto website. We encourage you to go and have a read.
We’ve read through the manifesto (specifically the “Principles”) and found that we share a lot of these principles, even though we didn’t give them such a catchy name. We'd love to share our interpretation, based on our experiences at kuori. We’ve grouped the principles into three thoughts, which will be the subjects of the next three posts in this series.
Enjoyed this blog? Looking forward to reading the next parts? Let us know!