Human error remains the single biggest risk in a data centre environment. It still accounts for a large proportion of outages across the globe, the majority of which are preventable. A recent survey from Uptime Institute shows that outages are becoming more frequent and more expensive, not to mention brand damaging. To a degree, there are always going to be mistakes when humans are involved. We aren’t robots after all (and robots can still have their own failures). The key point is that many of the outages and mistakes that occur are known to be preventable, and this is where action can be taken. Plans and processes around continual professional development really need to be the norm. Organisations can benefit hugely by learning from any outages they experience and can work towards mitigating the inevitable risk of human error.
There are many procedures and processes in place across the data centre environment to monitor and test mission critical equipment throughout its lifecycle, to ensure it is meeting the demands placed on it. The same thinking needs to be applied to the technical teams working within data centres. When people have been doing the same job for a long time, their confidence can take over. This can cause individuals to overlook details and specific processes, which in turn can cause catastrophic failures. They could be confidently doing things wrong.
Investing in staff and in ongoing education, training and personal development can pay significant dividends. Surely, if organisations get better at spotting the knowledge, competency and skills gaps in their teams and invest to fill these gaps, whilst ensuring that processes and procedures are kept up to date, the picture could be significantly different. That investment will increase the team’s ability to recognise that a problem may occur and therefore to resolve the issue before it escalates.
The Uptime Institute Global Data Centre Survey 2020* reveals what IT and data centre managers around the world are thinking, doing and planning in the areas of efficiency, resiliency, workload placement, staffing and new technology adoption. 78% of organisations stated that they have had an IT-related outage in the last three years, and 75% said that their most recent outage could have been prevented with better management, which makes a large proportion of outages a result of human error. This figure has increased by 15% since 2019, when the same question was asked. A frightening fact.
Industry-supported education programmes that award official certifications and qualifications are readily available. So are advances in individual and team analytical tools, backed by science and psychological methodology, that identify exactly where knowledge, competency and even confidence levels are lacking. Together, these give organisations many opportunities to take important steps towards human risk mitigation.
One of the biggest challenges that organisations face is that continual professional development budgets are usually limited, or cut to boost other areas of the organisation. There is also a common misconception about how education and training should be allocated: professional development activities are often used as a reward for the most loyal or highest-performing staff, rather than being given to those who actually need them the most. As a result, the employees gain very little from the development activities, as they are already good, and the organisation itself sees little or no benefit or ROI. It’s crazy when you think of the massive risk data centre operators are taking by not investing in their people. An outage could cost them thousands per minute, and the statistics continue to show that a large portion of these outages are caused by human error and are avoidable.
The Uptime Institute survey also states that, with more investment in management, process and training, outage frequency would almost certainly fall significantly. Hopefully, this will ring alarm bells across the rest of the industry and turn attention to these areas. The pandemic has highlighted the critical importance of the digital infrastructure industry, and demand is only going to increase. Set alongside an increasing skills shortage and an ageing workforce, this is a stark warning: if organisations do not start putting people development first, the situation is not going to improve.
Guest blog by Sarah Parks, Marketing and Communications Director, CNet Training