Failure is always an option

A while back, I had the pleasure of sitting down to talk with Bert Wagner (blog|twitter), Erin Stellato (blog|twitter), and an anonymous friend about some of our #SQLFail moments.

Everyone screws up

Everyone has their failures. Nobody is perfect. But failure can be a good thing. We can learn from our failures, and learn not to make the same mistakes again in the future. Failure only becomes a problem when we make the same mistake repeatedly. Bloggers spend a lot of time writing about our successes, but we’re human and we make mistakes, too. This post is a chance to share one of my memorable failures.

Story time…

I feel like I should frame this story like Grampa telling us a story about The War. Just like my Grampa, I love telling stories. “Back when I was your age, I didn’t have Management Studio or Ops Studio. I had to use Enterprise Manager if I wanted a GUI…” (Cue dream sequence music.)


I loved Query Analyzer.

Background

My very first tech job was at a startup that made a software suite for ambulatory health clinics. It was the first job where I used (and fell in love with) SQL Server. As with any startup software company, we all wore many hats. I helped with installing and configuring our software, training, tech support, and picking restaurants for the team to eat at. I was still learning my way through SQL Server, one query at a time.

When we were on-site at a customer’s location for training and go-live, we would often train on the production system, then clean up the test data before go-live. In the early 2000s, none of our customers were running in virtual environments, and having an extra system for training wasn’t in our customers’ budgets.

The fateful night

I still have vivid memories of that night. I’d ordered pizza so that I could stay back at my hotel room and finish my punch list of things before go-live the next day. It was after 2am, and I was sitting at the kitchen counter of the Residence Inn in Kalamazoo, MI, the pizza box still open next to me as I worked my way through a large pepperoni.

I got to the item on my punch list for “delete all test appointments.” The logic here was pretty simple: All the test appointments were for the same imaginary test patient. Just find all of that person’s appointments, and delete them. I decided I would do this one doctor at a time to make sure I didn’t mess it up too badly.

I wasn’t comfortable writing the SQL code for this, so I used Enterprise Manager’s “query” GUI to join Appointments to Patients and Doctors. I filtered for one doctor’s name and got back a list of all the appointments. If you aren’t familiar with this feature in Enterprise Manager, below is a similar example using the Northwind demo database (sticking to the SQL Server 2000 theme!). Here, I’m joining dbo.Orders (appointments) to dbo.Customers (patients) and dbo.Employees (doctors). Our database didn’t have foreign keys defined, so I’ve dropped them from Northwind, too. Enterprise Manager still figures out the join criteria based on column names!

An uncompressed backup of the Northwind demo database is just over 2MB.

Next, I selected all the rows for the orders (appointments) related to the test customer (patient) and hit delete. That was easy!

This employee in the Northwind database is named Davolio. My boss at the time was named Davo.

Except when I clicked Yes, I was immediately greeted with an error:

Huh, who else is doing stuff in the database at 2am?

I re-ran my query in Enterprise Manager, and nothing came back. I guess it worked after all? I looked in the application and… The doctor was gone. Not only was her schedule wiped out, but I also couldn’t find her anywhere in the system!

In my Northwind example, you can see that the Employee is completely gone.

Oops. That wasn’t supposed to happen.

What happened?

When I used the GUI to join and query the three tables, I’d done a SELECT *, returning rows across all three tables.

SELECT * has been my nemesis for 15 years.

When I selected rows and deleted them, Enterprise Manager performed the DELETE across all three tables. To prove this, I ran Profiler and stepped through it again on Northwind. Sure enough, you can see that it performs the DELETE on all three tables. It tries to ROLLBACK, but there’s no BEGIN TRANSACTION, so there’s nothing to rollback.

Oh, did I mention the database didn’t have any foreign keys?

The test patient had also been deleted, but I didn’t care about that. I needed to get this doctor back into the software. Go-live was the next day. All her future appointments were already in the system, and at 8am this had to be ready to go. I had just over 5 hours to fix this problem. My adrenaline was pumping and I was wide awake.

How’d I fix it?

Because it was 2:30am, I didn’t want to call my boss and wake him up. I figured I could fix it myself. I tried doing an INSERT to recreate the doctor, but DoctorID was an identity, and I needed to create this doctor with a specific DoctorID value so that it matched up with all the existing appointments. I’d seen someone else do this, so I knew it was possible; I just didn’t know how. After searching the internet (Google wasn’t a verb, yet), I scoured the help files (the local .chm files on my laptop), and I found my savior: IDENTITY_INSERT. I wasted over an hour on this, but in the end, I got that one doctor’s row back into the schedule, and all was good.

I went to bed that night having caused and solved a major issue. Most important, I learned a lesson that night: It was the last time I deleted anything from the Enteprise Manager GUI. That night convinced me I needed to learn more T-SQL and be more confident in writing code. I’d also read about these things called “foreign keys” that sounded like they could have been handy that night. Heck, if I could learn how to use IDENTITY_INSERT, I could learn anything.

1 Trackback / Pingback

  1. Pobody’s Nerfect: The Andy Mallon Story – Curated SQL

Comments are closed.