database_support

As a DBA one of the jobs is making sure all the databases and database servers are working properly and you have a controlled environment.

Once in a while, you have a skeleton in the closet where suddenly a database server comes around and you’re scratching your head where it came from. Your manager (and maybe your colleagues) expect you to support it because you’re the DBA.

You don’t know the server, your colleague doesn’t (really) know the server. So now what. Are you going to take responsibility for everything on the database server of the database itself when you don’t know anything about it?

I have had this so many times that I decided not to support a database (or entire database server) until I did a thorough scan of the environment to see if the database server is up-to-date, the settings in the server are set properly, etc. I use this checklist to make sure I cover all the aspects of the server:

Database server

  • Is the server virtual or physical?
  • What version of Windows (hopefully Server edition) is being used?
  • What version and edition of SQL Server are being used?
  • When was the last time the Windows OS was updated?
  • When was the last time the SQL server instance was updated?
  • How much memory is in the system
  • How is the storage configured?
    • What disks are being used?
    • What kind of disks are being used?
  • How many cores are present?
  • What services of SQL server are running?
  • What accounts are being used for the services?
  • What logins are present in SQL Server and what are their permission?
  • And maybe the most important one, is there an SLA on this server?

Database level

  • What’s the size of the database?
  • When was the last backup taken?
  • When was the last time an integrity check was executed
  • How are the files set up?
  • What compatibility setting is being used?
  • What are the options in the database, i.e. auto close, updates of statistics, page verification, etc?
  • What users are present in the database and what are their permissions?

resized_diagnostic-house-meme-generator-it-s-not-the-database-it-s-never-the-database-but-it-might-be-lupus-d2af08

This is just a small list of the things I’d like to know about a database or database server.

Based on that information I can make a decision either to support the current state of the database or server or not to support it (support it best effort) until I’ve gotten the chance to change settings I don’t approve.

Normally you’d get a discussion about why you’re not fully supporting the database or database server. People know you’re the DBA and that it’s your job to support the database servers. Unfortunately, we’re not living in a perfect world and as soon as shit hits the fan, and you’ve taken full responsibility for a database that’s not up to your standards, then you have to explain yourself and don’t say the database is not up to your standards because you should have said that from the beginning.

In these situations, I would put all the findings in writing and inform the IT manager of the “new” database or database server and what to do to support it. This may sound political but you’ve got to protect yourself.

I’ve seen people get lower performance reviews because they supported items that were not up to standards and had gone haywire.

resized_jesus-says-meme-generator-jesus-says-sort-out-the-fucking-database-675a7d

One of the situations I had was a database server installed by a system administrator working at the software vendor.

No documentation about the installation, configuration, databases, logins, etc. It was all installed during a project and now the project was finished, maintenance was the responsibility of the DBAs.

Logins are used as service accounts for the application running under sysadmin privileges, no min or max memory set, SQL Server service running with the local system account, etc, etc. I wasn’t going to fully support this server and mentioned this to the IT manager in writing and personally. It wasn’t that the server was buggy but it wasn’t up to standards.

After a while, we had an issue with the server. Turned out that the service account of the application had destroyed the system database because a consultant forgot to include the

"USE [database]" 

the clause in a script and because it had sysadmin privileges, the default database was “master”, it dropped and truncated a lot of tables.

Besides the fact that the consultant did all that DDL with a dynamic query, if the privileges were set correctly, the login would never be able to corrupt the system databases. It took us many hours to get everything back up and I was called into an office with some managers who wanted to know why the server went down the way it did. The managers were angry because this incident cost the business a lot of money, a full department couldn’t work and foremost customers couldn’t be helped.

All the findings were put in changes, the changes were scheduled as a low priority by the business because the chance of things going down was pretty slim.

I mentioned the findings and the correspondence about the low-priority changes for the findings. If I hadn’t done that I would probably have been fired because of the impact the incident had on the company. It’s not about playing the blame game but if it’s not in your power to change things that should be changed then you have to protect yourself.

At the end of the day, you want to do the best job you can do and have a good night’s sleep without the thought in the back of your head that hell could break loose at any moment. If you have any stories like this please share them in the comments.

I’d like to hear if anybody had the same experience I had and maybe it helps someone else to avoid such a situation.