For most organisations, application releases are tense, high-pressure events in which risk mitigation and tight deadlines dominate. Matters are made worse by internal silos and the resulting lack of cohesion, not just within the microcosm of IT infrastructure teams but also across the broader departments of development, QA and operations. Now, with application and business unit stakeholders increasingly demanding that new releases be deployed quickly and successfully, the interdependence of software development and IT operations is being seen as integral to the successful delivery of IT services. Consequently, businesses are recognizing that this can’t be achieved unless the traditional methodologies and silos are readdressed or changed. Cue the emergence of a new methodology that’s simply called DevOps.
The advancement and agility of web and mobile applications is one of the key factors that has led many to question the validity, or even the practicality, of the traditional waterfall methodology of software development. The waterfall’s rigorous sequence of conception, initiation, analysis, design, construction, testing, production/implementation and maintenance can seem almost archaic in an age when the industry demands “agility”. No one can dispute the waterfall methodology’s relevance, certainly not companies such as Sony, who suffered the embarrassment of the rootkit bug. But with web and mobile app releases needing to be deployed rapidly and regularly, can companies really continue to proceed down such a long, sequential process?
Much of the problem stems not from the methodology itself but from legacy IT cultures in which each individual is responsible solely for their own role, within their specific field, within their particular department. Consequently, within the same company the development team, with its constant drive for change to meet user demand for frequent delivery of new features, is often seen as the antithesis of operations. In stark contrast, operations is focused on predictability, availability and stability, factors that are nearly always put at risk whenever development requests that a “change” be introduced.
This disengagement is further exacerbated when development teams deliver code with little or no involvement from their operations teams. Additionally, to support their rapid deployment requirements, development teams will use tools that emphasize flexibility and consequently bear little or no resemblance to the rigid performance and availability-based toolsets of operations. In fact, it would be rare to find either operations or development teams being aware of their counterparts’ toolsets, let alone taking any interest in potentially sharing or integrating them.
On the other side is the operations team, which will do everything it can to stall changes and new features proposed for the production environment in an attempt to mitigate unwanted risk. When development teams are eventually allowed to have their software release picked up by operations, it’s usually after operations has gone through a laborious process of script creation and config file editing to accommodate the deployment on a production runtime environment that is significantly different from the one used by development.
Indeed, it’s commonplace to see inconsistencies between the runtime environment the development teams have used to run their code (typically low-resourced desktops) and the high-resource, server OS based environments utilized by operations. With development having tested and successfully run everything on a Windows 7 desktop, it’s no surprise that once operations deploys it on a Unix-based server with different Java versions, software load balancers and completely different properties files, failure and chaos ensue during a “Go Live”. What follows is the internal blame game: operations will point to an application that isn’t secure, needs restarting and isn’t easy to deploy, while development will claim that it worked perfectly well on their workstations and hence operations should be capable of seamlessly scaling it and making it work on production server systems.
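To make that mismatch concrete, here is a minimal Python sketch of the kind of check that could flag such differences before a “Go Live”. The fact names and values (DEV_FACTS, PROD_FACTS, the particular Java versions and the properties file name) are hypothetical illustrations, not taken from any real system or tool.

```python
# Hypothetical environment "facts" for illustration only; a real check
# would gather these from the machines themselves.
DEV_FACTS = {
    "os": "Windows 7",
    "java_version": "1.6",
    "load_balancer": "none",
}
PROD_FACTS = {
    "os": "Unix",
    "java_version": "1.7",
    "load_balancer": "software",
    "properties_file": "prod.properties",
}


def diff_environments(dev, prod):
    """Return {fact: (dev_value, prod_value)} for every fact that differs.

    A fact present on only one side is reported with "<missing>" on the other.
    """
    mismatches = {}
    for key in sorted(set(dev) | set(prod)):
        dev_value = dev.get(key, "<missing>")
        prod_value = prod.get(key, "<missing>")
        if dev_value != prod_value:
            mismatches[key] = (dev_value, prod_value)
    return mismatches


if __name__ == "__main__":
    for fact, (dev_value, prod_value) in diff_environments(DEV_FACTS, PROD_FACTS).items():
        print(f"{fact}: development={dev_value!r} production={prod_value!r}")
```

Run as part of a release checklist, a report like this turns “it worked on my workstation” into a specific list of differences that both teams can look at together.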
This is the problem that DevOps, often touted as a panacea, was established to address. From its outset, DevOps pushes for collaboration and communication between the development, operations and quality assurance teams. Based on the core concept of unifying processes into a comprehensive “development to operations” lifecycle, the aim is to inculcate an end-to-end sense of ownership and responsibility across all departments. While the QA, development and operations teams each have their own methods and aims, they all work towards a single goal within one overarching methodology. This entails giving the development team more control over environments while ensuring operations has a better understanding of the application and its infrastructure requirements. It even involves operations taking part in, and consequently having co-ownership of, the development of applications that they can in turn monitor throughout the development-to-deployment lifecycle.
The result is the elimination of the blame culture, especially when application issues arise, because both software development and operational maintenance are co-owned. Instead of operations blaming development for flaky code and development blaming operations for an unstable infrastructure, the trivial and time-consuming finger pointing is replaced with a traceable root cause analysis carried out by all departments as a single team. Consequently, application deployment becomes more reliable, more predictable and better able to scale with the demands of the business.
Additionally, DevOps calls for a unified and automated tooling process. The evolution of web applications and big data has meant that infrastructure needs to scale and grow considerably more quickly, so the traditional model of fire fighting, reactive patching and ad hoc scripting is no longer a viable option. Automation and unified tools, whether for deployment, workflows, monitoring or configuration, are a must, not just to meet time constraints but also to safeguard against configuration discrepancies and errors. Hence the growing awareness of DevOps has been accompanied by the emergence of open source software that deals with this very challenge, spanning job automation, environment provisioning and configuration management, with tools such as Rundeck, Vagrant, Puppet and Chef. While these tools are familiar to development teams, the aim is to also make them the concern and interest of operations.
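The idea behind guarding against configuration discrepancies can be sketched in a few lines of Python. This is not how Puppet or Chef are implemented; it is a simplified illustration of the declarative “desired state” approach, in which you describe what a system should look like and let a tool work out and apply only the changes needed. The function names and settings below are hypothetical.

```python
def plan_changes(desired, actual):
    """List the (action, key, value) steps needed to make actual match desired.

    Keys that appear only in `actual` are treated as unmanaged and left alone.
    """
    changes = []
    for key, value in sorted(desired.items()):
        if key not in actual:
            changes.append(("add", key, value))
        elif actual[key] != value:
            changes.append(("update", key, value))
    return changes


def converge(desired, actual):
    """Apply the plan to a copy of `actual`; return (new_state, changes)."""
    changes = plan_changes(desired, actual)
    new_state = dict(actual)
    for _action, key, value in changes:
        new_state[key] = value
    return new_state, changes


if __name__ == "__main__":
    # Hypothetical settings, for illustration only.
    desired = {"java_version": "1.7", "max_heap_mb": 2048}
    server = {"java_version": "1.6", "log_level": "debug"}

    server, first_run = converge(desired, server)
    print("first run:", first_run)

    # Running again changes nothing: the operation is idempotent.
    server, second_run = converge(desired, server)
    print("second run:", second_run)
```

Because a second run produces no changes, the same description can be applied safely to a developer’s workstation and to a production server, which is what removes the hand-edited scripts and config files that cause discrepancies in the first place.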
The DevOps methodology is a straightforward and obvious initiative to cater for the changing face of application development and deployment. Despite this, its greatest challenge lies with people and their willingness to change. Both development and operations teams need to shift from their short-term, silo-focused objectives to the broader long-term goals of the business. That requires a concerted and unified effort from both teams to have applications deployed in minimum time with minimum risk. I’ve often worked with operations staff who have little or no idea of how the applications they’re supporting relate to the products and services their companies deliver, and how these in turn generate revenue and provide value to the end user. I’ve also worked with development teams outsourced from another country where communication was non-existent, and not just because of the language barrier. As the demands of the business on IT rapidly increase and change, so too must the silo mindset. DevOps aims to initiate an inevitable change; those who resist may find that they themselves get changed. As for those who embrace it, they may just find application releases a lot less painful.