What is SQL database CI?
Continuous integration (CI), in the context of databases, refers to the practice of isolated database changes to be integrated, as soon as they are made and pushed to a source control repository. In the early days of CI, daily integration was a rule of thumb to follow. However, today, it is more common, than not, that database changes are integrated several times a day. Each change needs to pass an established testing plan that executes automatically on detected changes committed to the repository and if everything is ok, changes will be automatically merged using the build script. with the remaining code, into a new build that will be pushed to other environments (i.e. production, QA).
What does “continuous” mean
By “continuous” it may be assumed that the process, once started, never ends. However, this is not generally the case. The only thing that is “continuous” in the entire process is monitoring the source control repository for changes. The rest of the process is repeatable and occurs only when a change is detected. With this being said, the rest of the process (build, load test data, test, document, deploy) is more “continual” than “continuous”.
What does “Integration” mean?
In CI, “Integration” means that once a database change is committed to the repository, the CI server is triggered to pull the latest state of the repository and to integrate committed database changes with the production (or any other environment). This would create a process to build a database, load it with some test data and perform various types of testing. As soon as all tests are passed, the process of “integration” moves to the next step, which is building the database that will be deployed.
What are the benefits of CI?
Agility. Instead of waiting the QA team to finish testing right before the release, CI provides feedback early and often. This enables developers to be agile in their response to problems and avoids situations where teams become mired in days, weeks, or even months’ worth of accumulated defects that can slow development to a stop
Efficiency. By automating the repeatable part of the process (building, loading test data, running unit test, deployment), manual effort and human error are largely eliminated from the build process
Quality. Having CI implemented allows you to have a database that passed all tests, otherwise it may not even be deployed to QA. This gives a certain confidence to developers as they know the database that is deployed to QA will, in theory, be free of critical and obvious errors.
Requirements for CI
Source control management system (SCM system). There are many available including Team Foundation Server (TFS), Git, Subversion (SVN), Mercurial and Perforce. By placing a database under source control, you will be able to maintain SQL code changes inside the team and to establish a systematic development methodology. Having a full history of changes made against the database, enables developers to revert any changeset in case there is a need for change rollback. Having a SQL database under source control is a pre-requisite to implementing CI.
Continuous integration server (CI server). This is where the entire processes of automatic testing, building and deployment is managed. A CI server is connected with the source control repository watching and waiting for any committed changes. When a change is detected, the CI server builds a database from the latest version on the repository, put some test data in the database and runs automated tests if needed informing the user (developer) about the test results at the end or during the test process.
Reliable build and deploy process. When all automated tests are finished and passed, a build system is triggered to use the tested code for building a database and deployment to QA, for example
Automation. From the first step of monitoring changes on the source control repository, to testing, generating test reports and sending feedback back to the team, up to the final automated build and deployment. CI, should not require any manual intervention and should occur on each change made against the database and committed to the repository.
Building can be scheduled in other ways, like daily, or several times a day, at set periods of time, but the best practice, in general, is to set it to run on each committed change.
A “poor man’s” CI workflow?
To create scheduled or “On demand” CI, the first thing is to identify processes that requires automation. These can be in various areas such as building, testing or deployment. For example, the first requirement is to create a delta script of recent database schema changes, to apply to QA. Next, is to run unit tests on all new and changed objects e.g. stored procedures. Another requirement is to replicate all static production data, like State names, Zip and country codes, etc. Finally, synthetic production data aka test, is needed to test the database fully in QA
The next step is to create a process that makes these processes repeatable and allows running them based on a schedule or particular event. Various tools like ApexSQL Diff (schema change scripting), ApexSQL Script (schema and/or data scripting), and ApexSQL Generate (synthetic database scripting) can be used to create these scripts.
The last step is to create a job to combine and schedule all the processes to run in the correct order, ideally with one click and/or via a scheduled/automated process and ideally return codes for success or failure
Once complete, this “poor man’s” CI process will be able to run on demand, with the click of a batch file, or be scheduled to run nightly
Changing this process, to one that is automatically triggered based on checked in code requires deeper integration of tools involved in the development process, like ApexSQL Source Control
A “rich man’s” CI work flow
The following are steps that show the usual workflow involved database continuous integration at the next level. Note that these steps may vary depending on the project/environment:
- A developer makes a change in the database and creates a build to perform local testing. If everything goes well, changes are committed to the source control repository. Anything that happens before the commit, does not invoke any action on the CI server side. This means that developers can work locally, even to have a local repository where changes will be committed prior to synchronizing with the remote one.
- The CI server is constantly monitoring the remote repository for possible changes (e.g. every few minutes, hours etc.). As soon as committed changes are detected, the CI server triggers a build script that pulls the latest state from the repository (including the newly committed change) and builds a database.
- The new database is populated with some static data
- Unit tests are executed automatically, as soon as the database is built and populated with the static and data. Additionally, any rule that must be followed inside the team can be set and checked in order to keep the consistency of SQL code and internal coding standards.
- CI server sends the feedback to the development team in form of test results. Those can be sent to developers once the test is finished (if all tests passed), or on the first failed test.
- Additionally, each version of a built SQL database can be documented in a user friendly and readable format so the team can have an information about the database development process at any point and to be able to inspect some specifics, like object dependencies.
- Developers are informed about possible errors not long after they committed changes that triggered the entire process. This helps to review the changes, investigate and fix the problems quickly before moving forward to the next task. This ensures that changes that can break the database cannot pass to test, because the database will not be even built until all tests are passed.
- As soon as all tests are passed, the CI server invokes the deployment mechanism that uses the tested database and deploy it directly to QA
Some CI best practices:
Frequent committing of changes. Committing is recommended after each change, or after a task is done (assuming that a single task is atomic enough and it can be finished in a few hours). Worst case, code should be committed nightly with all logical work units reduced to tasks under one day’s worth of work.
Less/no committing of broken code. The best way to prevent bad code from getting into a build is to stop it at the commit. Developers should run tests on their local database to ensure changes haven’t broken the database. They should also carefully review the new code, objects to ensure standards have been followed, format is acceptable, and quality, in general, is acceptable. This can be done manually, or in some cases with automated processes that review an object when a developer checks it in, and prevents the check-in if problems are detected
Quick bug fixing turnaround. If an issue is detected on QA, it should be resolved or reverted quickly, ideally not longer than 24 hours
Automated test coverage. Each new database object, whether it is newly introduced or improved from the previous version, should be covered with unit tests and the acceptance criteria, in terms of pass rate of percentage of passed tests, should be sufficiently high to ensure good downstream quality. Also, automated tests that validate code consistency, redundancy and other rules should be included in the automated process.