DevOps: Bridging Development and Operations
Accelerating Delivery, Enhancing Collaboration, and Ensuring Reliability in the development and enhancement of Data Products
DataOps, a fusion of "Data" and "Operations," addresses the challenges of developing data products by combining principles from Agile, DevOps, and Lean Manufacturing. It emphasizes collaboration, automation, and efficiency in handling data pipelines, enabling teams to deliver data products faster and with higher quality (Atwal, 2020, p.xxiii). If you are interested in the definition of DataOps, take a look at the following blog post: What is DataOps? Introduction to Streamlining Data Product Development
The exponential growth of data in recent years has created both incredible opportunities and formidable challenges for organizations. The ability to harness data efficiently and translate it into actionable insights has become a cornerstone of competitive advantage. However, traditional methodologies, such as those used to develop software products, often fail to keep up with the speed and complexity of data product development.
DataOps inherits components from (DataKitchen, 2018):
Lean Manufacturing: Focusing on value-adding processes and eliminating waste leads to more efficient use of resources, higher quality, and lower costs.
Agile: Building the right product for the right people by increasing “the ability to react to unforeseen or volatile requirements regarding the functionality or the content” of Data Products (Zimmer et al., 2015).
DevOps: A shared commitment towards the Data Product reduces information barriers (culture of collaboration), while automation (e.g., CI/CD pipelines), Infrastructure as Code, and automated tests enable fast, reliable, high-quality deployment of code into production (Macarthy & Bass, 2020b).
Let's take a deep dive into the DevOps component:
DevOps
The performance of software development has increased significantly over the last 25 years. Besides the adoption of agile software development, a key factor was the widespread adoption of the DevOps philosophy. The software development process is similarly complex to the manufacturing of physical goods. Instead of physically transforming raw material into a finished good, software developers create code (e.g., a new feature for an existing product). This code needs to be tested (for quality and security) and deployed to production by IT operations. The process has several dimensions of complexity, for example code creation, organizational framework, tools, and methods.
Traditionally, new releases of a software product were “high-stress affairs involving outages, firefighting, rollbacks, and occasionally much worse” and took place infrequently, every few years (Atwal, 2020). There was no mature, holistic approach to deal with the complexity involved. As a result, friction arose in the development process, because software developers and IT operators had different objectives. Software developers were eager to develop and test new products to meet customer demands. IT operators, on the other hand, focused on the stability of the production system. New releases frequently jeopardized that stability, which gave the operators a negative attitude towards new releases (König & Kugel, 2019). The technical debt of dependent legacy systems (e.g., a monolithic architecture) increased the workload for IT operations, reducing the ability to test new features and increasing the risk of instability (Atwal, 2020, p.162). Figure 2 illustrates the flow of code in traditional software development:
Traditional flow of code in software development (illustrative)
Software development often followed the waterfall model, which means the work was executed sequentially. The software developers started work on the new release. Once the whole code was complete, it was passed, with some instructions, to the quality and security assurance colleagues in another department. After the code had been tested and approved, the package was passed to IT operations, which deployed the whole release at once. Infrastructure was provisioned manually without significant automation, and an extensive amount of time was needed to resolve configurations and dependencies by hand.
The user requirements were specified in a very first phase. Changing requirements or testing an MVP (minimum viable product) was rarely possible. Furthermore, the current project progress was difficult to assess, since development, testing, and deployment were sequential tasks. The skill-centric silo organization required extensive alignment and central planning without end-to-end visibility, resulting in long cycle times and delays. The teams were often committed to the functional organization rather than to the software product, resulting in reluctant information sharing and poor collaboration (Katal et al., 2019b).
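The effect of handing a complete release from one department to the next can be made concrete with a little arithmetic. The following is a minimal sketch, not a model of any real project: the function names, the three stages (develop, test, deploy), and the assumption that every work item takes the same time in every stage without rework are all illustrative. It compares the time until each work item reaches production when the whole release moves as one batch with the time when items move through the stages one by one.

```python
def batch_lead_times(items: int, stages: int, time_per_item: float) -> list[float]:
    """Lead time per item when the whole release is handed over as one batch.

    Assumption: each stage finishes all items before passing the batch on,
    and the release is deployed at once, so every item reaches production
    only when the last stage has processed the complete batch.
    """
    finish = stages * items * time_per_item
    return [finish] * items


def one_piece_lead_times(items: int, stages: int, time_per_item: float) -> list[float]:
    """Lead time per item when items flow through the stages one at a time.

    Assumption: stages work in parallel on different items, so item i
    (counting from 0) leaves the last stage after (stages + i) time units.
    """
    return [(stages + i) * time_per_item for i in range(items)]


if __name__ == "__main__":
    # Hypothetical release: 10 work items, 3 stages, 1 day per item and stage.
    batch = batch_lead_times(items=10, stages=3, time_per_item=1.0)
    flow = one_piece_lead_times(items=10, stages=3, time_per_item=1.0)

    print("batch, first item in production after", batch[0], "days")  # 30.0
    print("flow,  first item in production after", flow[0], "days")   # 3.0
    print("flow,  last item in production after", flow[-1], "days")   # 12.0
    print("flow,  mean lead time", sum(flow) / len(flow), "days")     # 7.5
```

Under these simplified assumptions the first piece of work reaches production after 3 days instead of 30, which is also when the first feedback from production becomes possible.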
From a lean perspective, the traditional approach resembles the mass production of Ford and Taylor, and several lean principles are not met. From a flow perspective, code piles up after each process step, resulting in long waiting times. The optimum is a one-piece flow, in which small chunks of code pass through the process chain frequently. Small chunks of code can be tested faster, and errors are detected when they are made (right first time). In the traditional approach, by contrast, errors are detected late in the process, which causes an extensive amount of rework because dependent errors need to be resolved as well. Rework is one type of waste (Muda). The huge amount of work in progress (code waiting between processes) and unbalanced teams with idle times add further types of waste. It is also difficult to manage feedback from production efficiently, since information is shared reluctantly between IT operations and the teams, and likewise with the customer of the software product. This creates the risk of building a software product that does not meet customer demands. In short, the traditional approach is not compatible with lean ideas, which results in poor performance.
Based on these challenges, the DevOps philosophy was developed. “DevOps is described as a software engineering culture and philosophy that utilizes cross-functional teams [Developer, IT-Operations, QA, Security] to build, test, and release software faster and more reliably through automation” (Macarthy & Bass, 2020b). Companies applying DevOps frequently outperform those using the traditional approach on key performance indicators such as deployment frequency, lead time for changes, change failure rate, and time to restore service (Portman, 2020).
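The four indicators named above can be derived from a simple deployment log. The sketch below is one possible way to compute them and makes several assumptions: the `Deployment` record, its field names, and the sample timestamps are hypothetical, lead time is measured from commit to production, and time to restore is measured from the failed deployment to the recorded restoration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical log entry for one deployment to production."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a failure in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def delivery_metrics(deployments: list[Deployment], period_days: int) -> dict[str, float]:
    """Compute the four indicators for a non-empty log covering `period_days`."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    total = len(deployments)
    lead = [hours(d.deployed_at - d.committed_at) for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore = [hours(d.restored_at - d.deployed_at)
               for d in failures if d.restored_at is not None]
    return {
        "deployments_per_day": total / period_days,
        "mean_lead_time_hours": sum(lead) / total,
        "change_failure_rate": len(failures) / total,
        # 0.0 when no failure has been restored within the period
        "mean_time_to_restore_hours": sum(restore) / len(restore) if restore else 0.0,
    }


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 9, 0)  # made-up sample data
    day, hour = timedelta(days=1), timedelta(hours=1)
    log = [
        Deployment(t0, t0 + 2 * hour),
        Deployment(t0 + day, t0 + day + 4 * hour, failed=True,
                   restored_at=t0 + day + 5 * hour),
        Deployment(t0 + 2 * day, t0 + 2 * day + 3 * hour),
        Deployment(t0 + 3 * day, t0 + 3 * day + 3 * hour),
    ]
    for name, value in delivery_metrics(log, period_days=4).items():
        print(f"{name}: {value}")
```

For the sample log this gives one deployment per day, a mean lead time of 3 hours, a change failure rate of 0.25, and a mean time to restore of 1 hour.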
The main idea is to facilitate rapid software development and deployment by constantly bringing small releases into a production-ready status (Lwakatare et al., 2016). There is no conclusive definition of DevOps. In general, there are four pillars of DevOps (CAMS): Culture, Automation, Measurement, and Sharing (Katal et al., 2019a). DevOps embraces a culture that bridges the conflicting interests of developers and IT operators. Small, self-organized, and autonomous cross-functional teams (developers, IT operators, and quality/security assurance) are built with a shared commitment towards the product rather than the function. Thus, the time needed to share information is reduced significantly, end-to-end visibility is given, and collaboration is strengthened. Learning and innovation cycles are improved through continuous feedback.
An illustrative example of releasing and deploying code is depicted in Figure 4. Through continuous integration and continuous delivery/deployment, also known as a CI/CD pipeline, automation brings a commit into a production-ready status quickly, reliably, and in high quality. Thus, small pieces of code can be shipped into production fast, which significantly reduces the work in progress and the cycle times (“One-Piece-Flow”). Failures are detected right after they emerge and can be solved immediately, which reduces rework and thus eliminates waste (“Right First Time”). This procedure makes it possible to test a feature quickly and to adapt to customer needs. An MVP (minimum viable product) can be presented in an early phase of the project. Agile methods, such as working in sprints with Scrum, can be applied. Working in sprints is a key enabler for refining the customer requirements during the development phase in order to build the right product for the right people. For the purpose of illustration, the CI/CD pipeline is broken down into process steps (in reality it is a continuous process).
One sprint follows several phases: plan, code, build (building a deployable package such as a Docker image), test, release to a repository, deploy, operate, and monitor. It is very important that the production system is monitored and that errors or warnings are continuously fed back. Continuous feedback ensures that the necessary information reaches the developers so that they can react quickly and preventively to possible failures (Continuous Improvement, Right First Time).
The processes for software development are highly automated, including quality testing and the provisioning of infrastructure. An illustrative example of automated infrastructure provisioning (infrastructure as code) is shown in the figure below:
Containerization (e.g., with Docker), a microservice architecture, and tools to configure and provision the infrastructure enable infrastructure as code. Automated and replicable infrastructure significantly reduces technical debt.
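The core idea behind infrastructure as code can be sketched in a few lines: the desired infrastructure is written down as data, and a tool works out which changes are needed to get there. The example below is a simplified illustration under stated assumptions, not the behavior of any particular tool. The functions `plan` and `apply`, the resource names, and the image tags are hypothetical, and `apply` only returns the new state instead of calling a real provider.

```python
def plan(desired: dict[str, dict], current: dict[str, dict]) -> list[tuple[str, str]]:
    """Return the actions needed to move the current state to the desired state."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != spec:
            actions.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(desired: dict[str, dict]) -> dict[str, dict]:
    """Pretend to provision the infrastructure and return the resulting state.

    A real tool would call a cloud or container platform here.
    """
    return {name: dict(spec) for name, spec in desired.items()}


if __name__ == "__main__":
    # Desired infrastructure, kept under version control like any other code.
    desired = {
        "web": {"image": "shop-web:1.4", "replicas": 3},
        "db": {"image": "postgres:16", "replicas": 1},
    }
    # What is currently running (made-up example).
    current = {
        "web": {"image": "shop-web:1.3", "replicas": 3},
        "cache": {"image": "redis:7", "replicas": 1},
    }

    print(plan(desired, current))
    # [('create', 'db'), ('update', 'web'), ('delete', 'cache')]

    current = apply(desired)
    print(plan(desired, current))  # [] because nothing is left to change
```

Running the same definition a second time produces no further actions, which is what makes the infrastructure replicable: the same description yields the same environment every time.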