Redgate versus Aliens from Planet SQL
Data refreshes are one of the hardest things to accomplish in an Agile environment: The downtime between sprints is miniscule and most modern systems have quite a bit of metadata that must be contained within specific environment such that ‘unwanted’ interactions do not occur. “Like what?” I can here you say. How about sending account cancellation to active members during a test. Yeah…I can hear the gasps of horror from here.
This is the Redgate Tools Team, a small software company out of Cambridge, England. How small? Here’s the entire group…..
Ahh, the English, their Easter Egg and their dry sense of humor. Anyway, Redgate builds the best SQL management tools I’ve ever seen: Fast, reliable, and easy to use. They have quite a tool selection, but my favorites are SQLCompare and SQLDataCompare.
SQLCompare is a schema level comparison tool that can either report on the differences between two databases or it can alter the target database to match the sourcedatabase…VERY POWERFUL. There are a few gotcha’s here and there (like circular relationships…has a hard time with those), but when you are in a hurry it’s a great time saver.
While SQLCompare is centered on structure, SQLDataCompare is all about what is in those structures, namely data. Given two databases, it can compare every row on every table with a corresponding row on a corresponding table. Quick…real quick. The configuration can be intimidating, but once you have a handle on it you can get a lot of work done quickly and reliably.
And the best part of Redgate products are that once a project is created and configured it can be run from the command line, which in turn is easily integrated into a schedule, SSIS, or FinalBuilder automation.
So here’s the scenario:
- Four environments must have refreshed data from PROD: DEV, TEST, QA, and PROOF.
- Each environment has 5 major databases with a footprint of about 500GB.
- > 100 tables hold environmentally sensitive data spread out over 5 databases.
- There can be NO ENVIRONMENTALLY SENSITIVE DATA LEAKAGE! (e.g., Data from PROD can’t exist in TEST if it could cause unwanted results or damage.)
- Any one environment can be down no longer than 24 hours.
- All the refreshes have to be done in a calendar week.
So…Here’s the game plan:
We created a new environment (What’s that? Number 7?) that will be used as a staging area where the data refresh preparations will be conducted. This environment will hold an extra database that will gather and hold all environmentally sensitive data from all the environments using SQLDataCompare.
- Each Saturday a FinalBuilder project is scheduled to gather all the environmentally sensitive data from each environment and store it in the hold area.
- Each Sunday a SQL job restores a production copy of all databases down onto this refresh preparation environment.
-
A FinalBuilder project is scheduled for 8AM each day:
- It recovered non-essential space from the databases in the form of audit, deprecated, and staging tables.
-
Environmental sensitive data was pulled from the holding area and overlaid on the PROD data depending on the day:
- Monday = DEV
- Tuesday = TEST
- Wednesday = QA
- Thursday = PROOF
- The data is scrubbed of personal/sensitive data.
On Monday, which is the first time the space recovery and cleansing routines run, it takes about 6 to 7 hours. Each subsequent day takes about 5 minutes.
- The Production DBA’s backup the refresh environment and restore it to the target environment.
- The Production DBA’s resync the security on the target environment. (So that it works the way it did before the refresh.)
- Conduct any necessary code promotions (either though automation or a request, depending on your process).
There…All done. A data refresh used to take all week without the ability to do anything else. Now…we could do all environments in a single week and still have time for additional tasks. Gotta love Redgate.