Starting from a database emergency: how you can get your server strategy back on track.


William Durkin


Technical


October 22, 2020

We understand that being without a DBA—or not having enough DBA support—can become a big issue for growing companies. Without thoughtful planning and expertise, neglecting database management can have a real, lasting impact on your business.

And for an IT manager or Systems Administrator? It means that you become very familiar with a particular mix of hope and dread when you come into work each day. The database isn’t your job—or if it is, you might not have the skills or bandwidth to deal with it in a meaningful way. So, things break. And when they do, there are already people who can’t work or customers who can’t make purchases by the time you hear about it.

You know it’s disastrous to lose that kind of productivity—every second counts. But while you recognise that a more proactive approach could save you a lot of grief in the long run, immediate problems are so critical and the anxiety so overwhelming that getting there feels impossible.

This is a very typical technical journey. But we want you to know that transforming from a reactive “fear and hope” approach to a proactive data professional is possible. And, by doing so you’ll not only improve your database systems, but also rediscover the joy of working with data—here’s how:

Prepare

Start by painting a picture of your entire database infrastructure and then determine a roadmap to stability and performance based on your business goals. We recommend starting with a Health Check. It’s worth investing in a report that prioritizes the issues you’re specifically facing. A good Health Check will highlight all the issues that need your attention, and then prioritize them so you can effectively dedicating resources to problems that matter. The Health Check report then becomes your database plan—your to-do list—that you can work through until your systems are performing the way you want.

A good Health Check report should include:

OS Configuration
Cluster Configuration
SQL Server Configuration
Maintenance
Performance
Security
High Availability / Disaster Recovery
Monitoring
Any Other Observations

Start Monitoring

With the right prioritization, paired with the right people and matched with the appropriate technology, you can monitor and tune your own environment. To do this you’ll want to onboard a SQL Server monitoring solution that gives you detailed performance metrics, fixes high-impact queries, analyses tempdb health, and resolves deadlocks and blocking. We generally recommend Redgate SQL Monitor for those who have deeper SQL Server knowledge, and don’t need a high level of detail. However, there are many tools out there that can accomplish the same things, according to your specific needs.

Monitoring is critical to your organisation. You need to know when your data systems fail before your users are screaming at you to fix the problem. It will help diagnose critical issues, and if all else fails, it will give you time to remediate before the phone rings.

Knowledge is Power

Just accomplishing a Health Check and Monitoring solution will have a huge impact on your database health. With a Health Check in place to diagnose your SQL Server problems and a monitoring solution configured to alert you when things go wrong, you will no longer be surprised when things break.

To reach a state of stability and availability, you may need to do some tuning to get to a place of better database performance. Addressing these performance issues are a great place to start. This is by no means an exhaustive list, but they are performance issues we see fairly often. If you have technical skills on your team to do performance and availability checks, start here:

1. Check your SQL Server configuration and clustering

This is a wide topic, but if you’re having critical downtime or errors, optimizing the configuration of your SQL Servers can save you ton of headaches. With the right SQL Server clustering, you can solve hardware issues, apply security patches, and help you troubleshoot problems more effectively. Want to learn more about SQL Server clustering? Brent Ozar has a great introduction

2. Look at your Wait Statistics

SQL Server does a lot of tracking the queries and performance of those queries, and often the best place to start looking at a performance issues is in the SQL Server “wait stats.” It will give you the resources that the query is waiting for and what SQL Server is tracking. If you want a deep dive on Wait Stats, no one is better than Paul Randal’s Pluralsite course on this exact topic.

3. Identify / tune problematic queries

This takes a bit of skill, but identifying queries with high CPU / IO usage or with other problems is important to getting the most out of your applications. You can run a set of scripts to help you identify the problems, or use a monitoring software that will show you their execution plans. Then you get to tuning.

4. Managing Indexes

Dealing with indexes are part of the DBA’s role. To troubleshoot issues in your servers, you’ll have to identify any missing, unused or fragmented indexes from tables in your database.

There are many other ways to improve overall performance, like taking a closer look at your Top SQL or your Query Plan. But the great thing about starting with a Health Check is that you will know where to look.

You need peace of mind when it comes to managing your database architecture. Our goal ultimate goal is for your architecture to be so mature that you’re sipping lattes while everything is automated, but if you need to dive into your database issues yourself, this is a great place to get started.

 If you’re ready to get started on a Health Check, Data Masterminds offers a comprehensive assessment. An initial conversation is always free—just schedule a call with our team, and we’ll talk you through the best plan for your needs.