Are you relying on your IT heroes too heavily? Watch the webinar replay here.
It’s 3:45 on a Friday afternoon.
The financial system, your organization’s most critical application, just went down. The server it was running on needed to be replaced two years ago, but funding was tight and the replacement was pushed out. The IT team is on it in minutes and determines that the outage was caused by an underlying hardware failure. Your IT Systems Administrator decides to remove the failed server and restore last night’s backup onto a spare server in the environment. At 2:30 in the morning on Saturday, she’s still working, making decisions on how much data needs to be restored. Those decisions, important under the best of circumstances, are that much more crucial on a tight deadline with little sleep and another Service Impact update email due to go out within the hour.
Your Information Technology staff is talented. They were likely hired because of their technical background and their ability to troubleshoot problems. The IT staff quietly works through the day-to-day of maintaining in-house and off-site systems, making sure the storage arrays, servers, and network equipment stay up-to-date with the latest software patches to ensure reliable operation and to protect against ever-escalating security threats. They respond to emergencies and problems without complaining, and they often work odd hours to make sure it all gets done without interrupting business.
When unplanned outages occur, they work non-stop until the problem is solved, which can easily mean 24 hours of work or more as problems are diagnosed and systems are recovered, tested, and validated.
Is there anything wrong with this approach? What risks does this dependence on the effectiveness of the IT staff and their heroic efforts pose for the business?
Today there are dozens of solutions for any specific IT problem. IT teams typically present multiple solutions to the leadership team that vary in location (cloud versus on-site), self-healing capabilities, and, of course, cost. When leadership teams choose a solution that depends on heroic IT effort, they often choose the cheap way rather than the fast or the good way, knowing that the IT staff is willing to bridge the gap with diligent work. This approach often leads to technical debt, where saving money today can increase risks dramatically down the road.
- What if your key IT staff leaves?
- Did they document the procedures they use to keep the systems up and running?
- Was heroic effort chosen as a replacement for keeping hardware and software maintenance contracts up to date, eliminating the ability to get timely outside help?
Studies show that carrying technical debt, rather than paying it down, is not only more costly over time but also increases the risk of large-scale outages and résumé-generating events.
Organizations are now more dependent on IT systems than ever before. The increased application count, the number of cloud platforms, and the ever-increasing complexity are leading to burnout among many IT professionals. Too much complexity leads to too much work, divided attention, and decreased effectiveness. Yesterday’s heroic effort is today’s neglect of the bottom half of the project list. And the demand for effective IT professionals may lead your best IT staffers to look for less work and more money at their next opportunity.
Hobbyists and résumé building
Your IT staff may be heroic. But are they choosing applications, data center hardware, and cloud platforms that benefit the business, or are they choosing them to learn new technologies and to build their résumés? The wrong choice of cloud for a mission-critical legacy application can cost a business dearly for the life of the contract. Has your IT team done due diligence on cloud platforms to prove out cost models and application speeds, or are they willing to trade heroic effort for a chance to learn a new platform?
Suboptimal emergency decision making
The leadership team decided against the backup or disaster-recovery-as-a-service solution. It felt like buying an insurance policy that they would seldom use. But now the systems are down and your team is busy restoring multiple systems across multiple cloud and on-site locations, probably using multiple data recovery tools. Do you expect your IT staff to make good decisions on the fly at the 24-hour mark? How about the next night at 2 AM when they’re cold, tired, and hungry? Do you expect them to fly back from their vacation to save the day?
Heroics leave no time for strategy and improvement
If your IT staff is busy maintaining all the existing systems along with researching, learning, selecting, and implementing new systems, are they finding the time to continuously validate that projects still align with changing business requirements? Is the IT staff considering new approaches that not only reduce heroic effort, but also allow the business to build new revenue streams, reduce costs, or open new lines of business?
How do I reduce dependency on heroic effort?
What’s the quickest way for an organization to reduce heroic effort requirements? By reducing the tactical, keep-the-lights-on work needed to maintain an IT steady state, and by using the reclaimed time to work hand-in-hand with the business side to identify opportunities for revenue, profitability, and cost control. The fastest way to make that jump is by migrating mission-critical legacy applications to an Enterprise Cloud provider like Expedient to reduce the heroic effort required to maintain the business.
Back to that Friday afternoon… What could it have looked like?
It’s 3:45 on a Friday afternoon. Because Expedient is managing your workloads, you’ve had the time to migrate the financial system, your organization’s most critical application, to a new, more reliable platform. Even under the weight of everyone completing their timecards before they leave for the weekend, the fully managed and patched server hasn’t reported any issues. If it does, it’s protected by weeks of fully managed backups and a disaster recovery solution that you can fail over with the push of a button. Accidents happen and emergencies happen, but they probably won’t this weekend. Or the next, or the one after that.