An SQL text by Erland Sommarskog, SQL Server MVP. Last revision: 2025-02-26.
Copyright applies to this text. See here for font conventions used in this article.
One of the most challenging tasks you can undertake as an SQL Server DBA is to change the server and/or database collation on an SQL Server instance. To be precise, if you only need to change the server collation to match a database on the instance, this is not too horrible. It is when you need to change the collation of a database including the string columns in the database that your real troubles begin. Exactly how horrible it can be depends on several things. How "wild" is your collation change? How complex is your database? The more SQL Server features you are using, the more challenges you are likely to face. And not least: how big is your database? A collation change often requires a number of size-of-data operations, so for a large database it could take a long time.
In this article, I will give some tips to help you to plan and implement a collation change. This includes two scripts that I hope you will find useful. They may not cover exactly all features you are using, but they should give you a head start.
I have a second article about collations, All You Wanted to Know about Collations, which is a long article that indeed covers everything you wanted to know about collations (and probably a bit beyond that), except exactly the task of changing database and server collations. I decided to cover these operations in a separate article, since quite likely there are readers who have been tasked to implement a collation change, but do not care for all the gory details on collations in the longer article. That is, the other article is by no means a prerequisite for this article.
This introduction concludes with a quick recap on collation basics. The next chapter discusses the risks involved with a collation change. I then proceed to discuss the importance of having a schema-compare tool at hand when you work with a collation-change project. The following chapter discusses two ways to change the server collation, one documented and one undocumented. This chapter leads to a brief discussion of an undocumented feature in SQL Server to change the database collation, mainly to discourage you from using it, in case you would read about it elsewhere on the internet.
The chapters so far are more or less a preamble to the main chapter, which in detail covers the normal way to do a collation change, that is, in-place. It is in this chapter I present the scripts mentioned above and how they can help you. The last chapter discusses an alternate strategy: Create a new database with the new collation and copy data over; a route you may take, when the problems with an in-place change are starting to get insurmountable.
Table of Contents
The Risks with Collation Changes
The Necessity of Schema Compare Tools
Using -q to Change the Database and Column Collations
The Obstacles with ALTER DATABASE COLLATE
An Interlude about Azure SQL Database
Preparing for ALTER DATABASE COLLATE
The Complications with ALTER TABLE ALTER COLUMN
The Change Column Collation Script
Finalising the Collation Change
Creating a Faithful Schema Copy
A collation in SQL Server is a set of rules, based on a specific language or a group of languages. To this come general preferences like case- and accent-sensitivity. A collation determines how data is sorted, compared and grouped, the result of functions like upper/lower, and a few more things. The collation also controls which code page is used for the char, varchar and text data types, and thus the character repertoire for these data types. Changing between two collations with different code pages typically requires extra considerations.
There are four levels where you can set the collation in SQL Server:
The server level, which is the default for new databases and which applies to master and the other system databases.
The database level, which is the default for new columns and which applies to identifiers in the database.
The column level, which applies to the data in the column.
The expression level, where you can override the collation in an individual expression with the COLLATE clause.
Note: For the rest of this article, I will rarely mention the text and ntext data types, since they are deprecated. However, if I say "char and varchar", this implies text as well. And in some places, I may in the interest of brevity say only varchar, but that will also imply char and text. The same applies for the Unicode data types.
When you set off to change the collation of a database and all its string columns, there are a number of things that can go wrong. These risks apply to metadata as well as to data.
To be able to perform a collation change, you will need to drop and recreate many objects. If you go straight to it and try ALTER DATABASE COLLATE to change the database collation, it is very likely it will spew out a long list of error messages, because there are several types of objects that block ALTER DATABASE COLLATE from running. Thus, you will have to drop these objects, and to be able to do that, you may have to drop yet other objects and so on. And once you have been able to change the database collation, you need to restore all the objects you have dropped, so you need to track what you drop. You also need to consider that there might be things that are dropped silently, for instance permissions, that you also need to restore.
ALTER DATABASE COLLATE only changes the database collation and the collation of the system tables, but it does not touch user tables. So you will need to run lots of ALTER TABLE ALTER COLUMN statements to change the column collations. And again, you may find that to change the collation of a column, you need to drop other objects that depend on that column, for instance indexes and key constraints. Which in its turn can cascade to more objects, for instance foreign keys, that you need to drop – and later restore.
If you are using table types, their string columns will retain the old collation unless you change the types. However, you cannot alter table types, but you need to drop and recreate them. And to be able to drop them, you first need to drop all stored procedures and functions that have table-valued parameters of these types. (A stored procedure that only declares a local variable of a table type does not need to be dropped.) When you have recreated the types, you need to restore these procedures. Furthermore, for both types and procedures, there may also be permissions you need to restore.
Views and table-valued functions are also a concern, although this is easier. In many cases, it is sufficient to run sp_refreshsqlmodule to update the collation information in sys.columns.
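As an illustration, here is a hypothetical sketch (the object-type filter is my assumption; adapt it to your database) that generates one sp_refreshsqlmodule call per non-schema-bound view or table-valued function, so that you can inspect the batch before you run it:

```sql
-- Generate sp_refreshsqlmodule calls for all non-schema-bound views and
-- table-valued functions. Schema-bound modules cannot be refreshed this way.
DECLARE @sql nvarchar(MAX)
SELECT @sql = string_agg(
          convert(nvarchar(MAX), N'EXEC sp_refreshsqlmodule ''') +
          quotename(object_schema_name(m.object_id)) + '.' +
          quotename(object_name(m.object_id)) + '''',
          nchar(13) + nchar(10))
FROM   sys.sql_modules AS m
JOIN   sys.objects     AS o ON o.object_id = m.object_id
WHERE  o.type IN ('V', 'IF', 'TF')       -- views, inline and multi-statement TVFs
  AND  m.is_schema_bound = 0
-- PRINT @sql     -- inspect the generated batch first
EXEC(@sql)
```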
It should be quite obvious what the risks are here: when you recreate objects and other things, you may accidentally restore them differently from what they were before. Or you may be oblivious and forget to restore some items entirely. This can lead to various consequences: Forgetting an index or recreating it in the wrong way, can lead to poor performance. Forgetting permissions can lead to users not being able to do their work. Forgetting a trigger which is supposed to perform a cascading update can lead to database inconsistencies.
A risk which is particularly noteworthy is that when you run ALTER TABLE ALTER COLUMN, you need to repeat the column definition exactly, or else you may lose an important property of the column, for instance data masking.
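To make this concrete, here is a made-up example with dynamic data masking (the table, column and mask are all invented for the illustration). Whether the mask survives the ALTER on your version is something you should verify against sys.masked_columns; if it was lost, you can put it back:

```sql
-- Hypothetical table with a masked column.
CREATE TABLE dbo.maskdemo(email varchar(100) COLLATE Latin1_General_CI_AS
                          MASKED WITH (FUNCTION = 'email()') NULL)

-- Change the collation of the column.
ALTER TABLE dbo.maskdemo ALTER COLUMN email varchar(100) COLLATE Polish_CI_AS NULL

-- Check sys.masked_columns; if the mask was lost, restore it:
ALTER TABLE dbo.maskdemo ALTER COLUMN email ADD MASKED WITH (FUNCTION = 'email()')
```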
A risk of a different matter is that under the new collation, object names that were distinct no longer are. This mainly happens when you change from a case- or accent-sensitive collation to an insensitive one. Say that in a case-sensitive (CS) database there is both dbo.MyTable and dbo.mytable. When you try to change to a case-insensitive (CI) collation, this clash results in a violation of a unique index in sys.objects (or more precisely in its underlying table) and this causes ALTER DATABASE COLLATE to fail.
Not only can metadata start to clash under the new collation, but it is even more likely to happen with the data itself, I would say. For instance, you have a table where the primary key is a char(3) column with a CS collation. Both abc and ABC appear as key values, and when you change to a CI collation, the town is no longer big enough for the both of them.
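This scenario is easy to reproduce. In the sketch below (table and constraint names are made up), the primary key must be dropped before the column can be altered, and it is when you try to restore the PK that the clash surfaces:

```sql
-- Two keys that are distinct under a CS collation...
DROP TABLE IF EXISTS dbo.clashdemo
CREATE TABLE dbo.clashdemo(id char(3) COLLATE Latin1_General_CS_AS NOT NULL,
                           CONSTRAINT pk_clashdemo PRIMARY KEY (id))
INSERT dbo.clashdemo(id) VALUES('abc'), ('ABC')

-- The PK blocks ALTER COLUMN, so it has to go first.
ALTER TABLE dbo.clashdemo DROP CONSTRAINT pk_clashdemo
ALTER TABLE dbo.clashdemo ALTER COLUMN id char(3) COLLATE Latin1_General_CI_AS NOT NULL

-- ...but under the CI collation they collide, so this fails with a
-- duplicate-key error:
ALTER TABLE dbo.clashdemo ADD CONSTRAINT pk_clashdemo PRIMARY KEY (id)
```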
You may also get data errors when you change from a something-insensitive to something-sensitive collation. Say that in the main table, the key value is ABC, but in a referencing table the foreign-key column has the value abc. When you change to a case-sensitive collation, the FK is no longer valid.
If you change collation from one language to another, there is also a risk for constraint violations, because what is distinct in one language is not in the other or vice versa. One of the more striking examples is a change to a CI collation for Turkish or Azeri (or the other way round). In these languages I and i are different letters, the first matched by a dotless lowercase ı, and the second by a dotted uppercase İ.
It goes without saying that CHECK constraints can present problems as well, since under the new collation some data may violate the conditions of the CHECK constraint.
In this article, I will not provide any particular help for how to tackle these problems. I only want to make you aware of the risks. If you run into data errors during testing, you should probably go to the target database and make sure that the data is corrected before you run the collation change for real. When it comes to CHECK constraints, it may be that the constraint itself needs to be adjusted.
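One way to hunt for violations once the collation has been changed is to have SQL Server re-verify the constraints in the database and report offending rows:

```sql
-- Reports all rows that violate a CHECK or foreign-key constraint.
-- ALL_CONSTRAINTS includes constraints that are disabled or untrusted.
DBCC CHECKCONSTRAINTS WITH ALL_CONSTRAINTS
```

If you recreated constraints WITH NOCHECK during the change, they are marked as untrusted; running ALTER TABLE with the WITH CHECK CHECK CONSTRAINT ALL clause on a table revalidates them.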
As the reader at this point may feel a little scared, I should point out that some collation changes are less prone to data errors than others. For instance, I suspect that it is not uncommon to change between SQL_Latin1_General_CP1_CI_AS and Latin1_General_CI_AS in either direction. If you only have nvarchar columns in your database, you are perfectly safe, as these two collations are entirely equivalent for Unicode data types. If there are varchar columns, you are exposed to a risk, but it is small as the main differences for these two collations concern the letters Œ, Š and Ž.
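To get a feel for your exposure, you can take a quick inventory of how many columns are of the code-page-sensitive types versus the Unicode types. This is a sketch of my own, not from the scripts presented later:

```sql
-- Count string columns in user tables by type. char/varchar (and text) are
-- the ones exposed to code-page effects; nchar/nvarchar are not.
SELECT type_name(c.system_type_id) AS typename, count(*) AS no_of_cols
FROM   sys.columns AS c
JOIN   sys.tables  AS t ON t.object_id = c.object_id
WHERE  c.collation_name IS NOT NULL
GROUP  BY type_name(c.system_type_id)
```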
This is something you can only run into if you are changing between collations with different code pages. Furthermore, it can only occur with varchar columns. To see which code pages the collations involved are using, you can run queries like these:
SELECT collationproperty('Latin1_General_CI_AS', 'Codepage')
SELECT collationproperty('Polish_CI_AS', 'Codepage')
In this particular example, the values returned are different, and if you are changing the collation from Latin1_General_CI_AS to Polish_CI_AS, data could be distorted. For instance, say that in a varchar column in your database there is the value señor. The character ñ is not available in the code page for the Polish collation. Therefore, it is replaced by what is known as a fallback character. The fallback is often a lookalike character, and in this particular example you will get senor instead. However, sometimes there is no lookalike character, or the translation table does for some reason not consider the lookalike. In this case you will get the general fallback character which is the question mark, resulting in se?or.
In contrast to the data errors in the previous section that occur with a bang as you are running your collation change, data distortion happens silently, and may not be discovered until later. There is all reason to keep a backup of the database so that you can restore distorted data weeks or months later. In terms of preparation, there is not a whole lot you can do, but you can at least investigate what happens with characters that you know are important to your users by doing a test like this:
DROP TABLE IF EXISTS testtbl
CREATE TABLE testtbl (a varchar(10) COLLATE Latin1_General_CI_AS)
INSERT testtbl(a) VALUES(N'señor')
ALTER TABLE testtbl ALTER COLUMN a varchar(10) COLLATE Polish_CI_AS
SELECT a FROM testtbl
If changing the data type to nvarchar is not an option, you have no choice but to accept things like senor. What you may want to prevent is the question marks. (Try the above with Thai_CI_AS instead of Polish_CI_AS to see this happen.)
This sort of data distortion can only occur if the new collation is a non-UTF8 collation. With a UTF8 collation, you have the full Unicode range also for varchar available, and therefore there is no need for fallback characters. However, when changing to a UTF8 collation, you may face truncation. Consider this example:
DROP TABLE IF EXISTS testtbl
CREATE TABLE testtbl (a varchar(10) COLLATE Latin1_General_CI_AS)
INSERT testtbl(a) VALUES(N'räksmörgås')
ALTER TABLE testtbl ALTER COLUMN a varchar(10) COLLATE Latin1_General_100_CI_AS_SC_UTF8
SELECT a FROM testtbl
This outputs räksmörg, that is, the last two characters are lost. This happens because 10 in varchar(10) is ten bytes, not ten characters, and encoded in UTF‑8, räksmörgås (Swedish for shrimp sandwich) requires 13 bytes.
If you are planning to move to a UTF8 collation, you should take this in consideration and enlarge the columns that are too short to fit the current data after a conversion to UTF‑8. Here is a script that you can run on your database to find all char and varchar columns for which the current max length is insufficient:
DROP TABLE IF EXISTS #lengthcheck
CREATE TABLE #lengthcheck(schema_name sysname  NOT NULL,
                          table_name  sysname  NOT NULL,
                          colname     sysname  NOT NULL,
                          maxlen      smallint NOT NULL,
                          maxdatalen  smallint NULL,
                          PRIMARY KEY (schema_name, table_name, colname)
)
DECLARE @sql nvarchar(MAX)
SELECT @sql =
   string_agg(convert(nvarchar(MAX), N'INSERT #lengthcheck SELECT ') +
              quotename(s.name, '''') + ', ' + quotename(t.name, '''') + ', ' +
              quotename(c.name, '''') + ', ' + convert(nvarchar(10), c.max_length) +
              ', MAX(datalength(cast(' + quotename(c.name) +
              ' AS varchar(MAX)) COLLATE Thai_100_CS_AS_SC_UTF8)) FROM ' +
              quotename(s.name) + '.' + quotename(t.name),
              nchar(13) + nchar(10))
FROM  sys.schemas AS s
JOIN  sys.tables  AS t ON t.schema_id = s.schema_id
JOIN  sys.columns AS c ON c.object_id = t.object_id
WHERE type_name(c.system_type_id) IN ('char', 'varchar')
  AND c.max_length > 0
-- PRINT @sql
EXEC(@sql)
SELECT * FROM #lengthcheck WHERE maxdatalen > maxlen
You may be looking at this thinking What on Earth is this doing? (Answer: The power of dynamic SQL where you can easily write something you can't read the next day.) The query with string_agg generates a number of queries like this (here with some line breaks for a more readable format):
INSERT #lengthcheck
   SELECT 'dbo', 'sometable', 'somecol', 30,
          MAX(datalength(cast(somecol AS varchar(MAX)) COLLATE Thai_100_CS_AS_SC_UTF8))
   FROM   [dbo].[sometable]
The key is that the datalength function returns the length in bytes. We cast the column values to a UTF8 collation. Which one doesn't matter. I picked Thai, because the name is short. We need the convert to varchar(MAX), because if we cast a varchar(30) to a different collation, the data type is still varchar(30), so the values would be truncated before datalength sees them. Thus, to be sure that we get the actual length in bytes we need to cast to a longer string type.
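Here is a small illustration of the point (assuming a database whose default collation uses code page 1252, so that the accented characters exist in varchar):

```sql
-- Without the cast to varchar(MAX), the re-encoded value would be truncated
-- to the declared length of ten bytes before datalength gets to see it.
DECLARE @v varchar(10) = 'räksmörgås'
SELECT datalength(@v) AS bytes_cp1252,        -- 10: one byte per character
       datalength(cast(@v AS varchar(MAX))
                  COLLATE Thai_100_CS_AS_SC_UTF8) AS bytes_utf8   -- 13: ä, ö, å take two bytes each
```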
The final query against the temp table returns the columns which have data that will be truncated unless you adjust the data type before you change to a UTF8 collation. It's probably not a bad idea to also enlarge the columns that are close to the max, as they may be accidents waiting to happen.
Observe that the generated queries may have to scan large tables, with one scan per char or varchar column. So it goes without saying that this is not something you should run on your production database during peak hours. Ideally, you should run on a copy of production in a quiet environment. For a large database, it could take hours to get the result.
Running a collation change can take a lot of time and consume a lot of disk space.
As discussed above, you will have to drop and restore indexes and constraints, and the latter part can take considerable time for large tables. As can, for that matter, dropping the clustered index of a table (because it is affected by the collation change), while keeping the non-clustered indexes (because they are not).
Altering the column collation with ALTER TABLE ALTER COLUMN itself is a metadata operation with one important exception: When you change to a collation with a different code page, all char, varchar and text columns must be rewritten to perform character-set conversions. Since you can only alter one column at a time, this means one scan per each such column in a table.
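As a flavour of what the real script does, here is a deliberately minimal sketch that generates the ALTER TABLE ALTER COLUMN statements for the plain cases. It is an illustration only: it ignores text/ntext, computed columns, constraints, indexes and all the other complications discussed in this article, and the target collation is a placeholder:

```sql
-- Generate one ALTER TABLE ALTER COLUMN per string column that does not
-- already have the target collation. Simplified sketch; do not run as-is.
SELECT 'ALTER TABLE ' + quotename(s.name) + '.' + quotename(t.name) +
       ' ALTER COLUMN ' + quotename(c.name) + ' ' +
       type_name(c.system_type_id) +
       CASE WHEN c.max_length = -1 THEN '(MAX)'
            WHEN type_name(c.system_type_id) LIKE 'n%'
                 THEN '(' + convert(varchar(10), c.max_length / 2) + ')'
            ELSE '(' + convert(varchar(10), c.max_length) + ')'
       END +
       ' COLLATE Latin1_General_100_CI_AS ' +       -- placeholder target
       CASE c.is_nullable WHEN 1 THEN 'NULL' ELSE 'NOT NULL' END
FROM   sys.schemas AS s
JOIN   sys.tables  AS t ON t.schema_id = s.schema_id
JOIN   sys.columns AS c ON c.object_id = t.object_id
WHERE  type_name(c.system_type_id) IN ('char', 'varchar', 'nchar', 'nvarchar')
  AND  c.is_computed = 0
  AND  c.collation_name <> 'Latin1_General_100_CI_AS'
```

Note that sys.columns.max_length is in bytes, which is why the sketch divides by two for the Unicode types.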
The scripts I will present later in this article are not very much focused on performance; only to get the task done at all. (That is challenging enough!) I will give some brief tips on performance enhancements, but they are just that: brief.
Disk space can also be an issue. You will produce a lot of transaction log when you rebuild all those indexes, and even more so if you change the code page with ALTER TABLE ALTER COLUMN. This is certainly something you need to get a grip on when you do your testing. If your production database is in an availability group, you should try to make a test run in an AG with the same setup, so that you know that the secondaries can keep up with the flood of log records the collation change produces.
In the previous chapter, I discussed the risk that in the flood of objects you need to drop and restore you may forget one or two, or restore objects in a different way from what they were before.
To prevent this from happening, it is essential to use a schema-compare tool. When you have completed your change-collation operation, you need to compare your database with the new collation to a reference copy with the old collation to verify that there are no undesired differences.
There may be more tools on the market, but these are the two prime contenders:
Schema Compare in SQL Server Data Tools (SSDT), which ships with Visual Studio.
SQL Compare from Redgate.
At this point the reader may think, Yeah, we have Visual Studio, but we don't use SSDT, DACPACs and that for our database deployment. No problem, you are in the same boat as me. Schema Compare is the only thing I use in SSDT, and I must say that I am quite impressed with it. And while Schema Compare can be used to deploy changes from one database to another, I only use it to compare schemas. The same applies to SQL Compare. While you can use SQL Compare to deploy changes, you can use it as a compare tool and nothing more.
Getting started with any of them is not very difficult. For Visual Studio/SSDT, find the SQL Server Object Explorer in the View menu. Add your SQL Server instance, if it is not there. Then expand it, find your reference database and right-click it and select Schema Compare. Next, find the other database and just drag it to the right side of the Schema Compare window. Some tips:
In the context of collation changes, here is something which is interesting. Consider this script:
CREATE DATABASE DB1 COLLATE Latin1_General_CI_AS
CREATE DATABASE DB2 COLLATE French_CI_AS
go
USE DB1
go
CREATE TABLE T1 (a nvarchar(20))
CREATE TABLE T2 (a nvarchar(20))
go
USE DB2
go
CREATE TABLE T1 (a nvarchar(20))
CREATE TABLE T2 (a nvarchar(20) COLLATE Latin1_General_CI_AS)
T2 in both databases is entirely the same, as there is no difference in collation. The two T1 tables differ in collation, since the databases have different collations. However, Schema Compare flags neither of the tables as different. It sees that the two T1.a columns use the default collation for their respective databases, and therefore Schema Compare considers them to be equal. This is great in the context of comparing before and after a collation change, since otherwise you would see very many false positives.
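If you want to see for yourself which columns have an explicit collation that deviates from the database default (like T2.a in DB2 above), you can run a query like this:

```sql
-- List columns in user tables whose collation differs from the database default.
SELECT object_schema_name(c.object_id) AS schema_name,
       object_name(c.object_id)        AS table_name,
       c.name, c.collation_name
FROM   sys.columns AS c
JOIN   sys.tables  AS t ON t.object_id = c.object_id
WHERE  c.collation_name <>
       convert(sysname, databasepropertyex(db_name(), 'Collation'))
```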
Keep in mind that schema comparison is a complex operation. While both tools have a good reputation, there can still be feature gaps. Or bugs. If you are using a high-end SQL Server feature that is important to you, you may want to run a test to ensure that your schema-compare tool actually detects differences in this particular feature. I would particularly like to call out Replication and Change Data Capture as two features that may not be covered. (I don't work much with either of them, so I have not tested this myself.)
If all you want to do is to change the server collation (for instance, to match a database with a different collation on the instance), that is a relatively simple operation, at least in comparison with changing database and column collations.
There are two ways to go. The official and documented way is described in the Microsoft Learn topic Rebuild System Databases. If you read it, you will find that it is as close to a re-installation as you can get. And specifically, you are not really rebuilding the current system databases. You are replacing them with new databases created from template versions in an install folder. This means that all the current information in master and msdb is lost: logins, jobs, configuration settings, you name it. You will also need to re-attach the user databases.
The topic says that after a rebuild, you can restore backups of master, model and msdb. But then there is an Important note:
If you have changed the server collation, do not restore the system databases. Doing so will replace the new collation with the previous collation setting.
At this point, the reader may sigh and exclaim That is a whole lot of work! Yes, but compared to changing the database collation, it is a breeze.
Nevertheless, there is an alternative route that preserves logins, jobs etc. Just beware that you will be entering an undocumented and unsupported path. Before I go on, I should say that what follows is adapted from a blog post by Jason Brimhall, Change SQL Server Collation – Back to Basics.
The first step is to detach all user databases. Once you have done this, open SQL Server Configuration Manager and stop SQL Server. Then open a command-line window with Run as Administrator and move to the Binn folder of your SQL Server instance. For example:
cd \Program Files\Microsoft SQL Server\MSSQL16.MYINSTANCE\MSSQL\BINN
Then run a command like this:
sqlservr -m -T4022 -T3659 -sMYINSTANCE -q"Latin1_General_100_CS_AS"
-m is for single-user mode. -T is for enabling two trace flags: 4022 prevents startup procedures from running, and 3659 serves to direct the output also to the SQL Server errorlog. -s is to specify which instance to start. Finally, -q is the star of the show: here you specify the new server collation.
When SQL Server starts, it will spew out a lot of output in the window. This includes a long list of indexes it rebuilds according to the new collation. (All this output is also in the ERRORLOG file.) When you see the message The default collation was successfully changed, followed by Recovery is complete. This is an informational message only. No user action is required, you can press Ctrl/C to stop SQL Server and then start it in the normal way. The last step is to re-attach the user databases.
I strongly recommend that you have a backup around when you do this, because the operation could fail with a duplicate index key, due to names that were distinct under the old collation no longer being so under the new one. If your SQL Server instance runs on a virtual machine, take a snapshot of the VM, or clone it and test on the clone first.
Now that the news is out, we might as well discuss this option right here and now.
In the previous chapter, I advised you to detach all user databases before you used the -q option to change the server collation. You may ask what happens if you don't, and you have probably already guessed it: Yes, if the user databases are there, SQL Server will move on and change the collation of those databases and all their string columns, one database at a time, until it has completed all of them – or until it runs into a failure with one database and stops, leaving the remaining databases untouched.
And as we shall see in a second, the latter is by no means unlikely. But let's first get an understanding of why -q is there. This option serves a very important and well-defined purpose, and everyone who has ever installed SQL Server has made use of it. To wit, at one point during the installation, Setup starts SQL Server with a number of trace flags and options, and one of the options is -q, to change the collation of the system databases from the install media to the one you have chosen. This is the exact scope for which -q exists and has been tested. Everything beyond that is at your own risk.
I'm inclined to say that when it comes to changing the server collation, -q is relatively safe. The risk of an index violation or an invalid foreign key in the system databases should not be ignored, but it is still fairly small. It is a different matter when it comes to the user databases.
Here are some observations that I have made when I have played with -q:
If you want to know more about this option, Solomon Rutzky has a blog post Changing the Collation of the SQL Server Instance, the Databases, and All Columns in All User Databases: What Could Possibly Go Wrong? with the subtitle Demystifying What "sqlservr -q" Actually Does. In this post he discusses the option in detail: when it works and when it does not.
But despite the enthusiasm of Solomon's blog post, my recommendation is entirely against the usage of this option for user databases.
This is the normal way to implement a change of database and column collations. The basic idea is that you run ALTER DATABASE COLLATE to change the database collation, and then you run ALTER TABLE ALTER COLUMN on all string columns to change their collation. But as I briefly discussed in the Risks chapter, it is quite likely you will face obstacles with both, and in this chapter I will present two scripts, AlterDatabaseCollate.sql and ChangeColumnCollation.sql to help you to overcome these obstacles. These scripts generate the commands needed to make the collation change. They don't run the commands but leave that to you, so there is a possibility to make manual adjustments. Which may be needed, since they cover far from all features in SQL Server.
Before we go into details, here is an overview of the steps to perform a collation change with the help of these scripts.
As you run through these steps, you may get errors. This could be errors because you are using something the scripts do not consider. There could also be errors with data or metadata not complying with unique indexes or constraints under the new collation. Typically, when you run into an error, you will need to start over by restoring a backup. If your database is of any size, this can be quite tedious. Add to this that some of the commands produced by the scripts can take considerable time to execute on a large database.
Therefore, I recommend that you start your testing with a small database with the same schema as your target database. Use this small database to iron out all issues with metadata or features not supported by the scripts. When you start working with your big database, you "only" have the data errors to handle.
You may already have a small test database that you can use for this purpose. You can also easily create a small schema-only copy of your target database with DBCC CLONEDATABASE. For this particular purpose, I recommend that you create the clone this way:
DBCC CLONEDATABASE(MyDB, MyDB_clone) WITH VERIFY_CLONEDB
The option VERIFY_CLONEDB instructs DBCC CLONEDATABASE to verify the consistency of the database clone. This option also disables the inclusion of statistics and Query Store data, which you have no need for in this particular context. (An important use case of DBCC CLONEDATABASE is to create a copy for performance troubleshooting.) I should add that DBCC CLONEDATABASE does not support exactly all features in SQL Server, but for this purpose – initial testing of your collation change – it should be good enough. Beware that when you have created the clone it is marked as read-only, so you will have to change it to be read-write.
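Making the clone writable is a one-liner (the database name follows the example above):

```sql
-- DBCC CLONEDATABASE marks the clone as read-only; undo that before testing.
ALTER DATABASE MyDB_clone SET READ_WRITE
```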
To change the collation of a database you say something like:
ALTER DATABASE MyDB COLLATE Latin1_General_100_CI_AS_SC_UTF8
This command sets the default for new columns, and it changes the collation of the system tables, and thereby the rules for identifiers in the database. However, it does not change the collation of columns in user tables.
To run this command, you need exclusive access to the database. It does not need to be set to single-user mode explicitly, but there cannot be other users in the database when you run this command. (So you may prefer to go the single-user route for simplicity anyway.)
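If you go the single-user route, the sequence could look like this (assuming you are prepared to kick out any connected users):

```sql
-- Take exclusive access, change the collation, then let users back in.
ALTER DATABASE MyDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE
ALTER DATABASE MyDB COLLATE Latin1_General_100_CI_AS_SC_UTF8
ALTER DATABASE MyDB SET MULTI_USER
```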
Unless this is a database you just started working on, or it is just some small toy thing, the chances that this command succeeds on the first attempt are about nil. This is an abbreviated version of the output I get when I try it on my demo database Northgale, an inflated and slightly modified version of Microsoft's old demo database Northwind:
Msg 5075, Level 16, State 1, Line 677
The object 'CK_Birthdate' is dependent on database collation. The database collation cannot be changed if a schema-bound object depends on it. Remove the dependencies on the database collation and then retry the operation.
Msg 5075, Level 16, State 2, Line 677
The statistics 'Customers.VATno_ix' is dependent on database collation. The database collation cannot be changed if a schema-bound object depends on it. Remove the dependencies on the database collation and then retry the operation.
Msg 5075, Level 16, State 2, Line 677
The statistics 'Orders.status_ix' is dependent on database collation. The database collation cannot be changed if a schema-bound object depends on it. Remove the dependencies on the database collation and then retry the operation.
...
Msg 5075, Level 16, State 1, Line 677
The object 'ckt_Discounts' is dependent on database collation. The database collation cannot be changed if a schema-bound object depends on it. Remove the dependencies on the database collation and then retry the operation.
Msg 5072, Level 16, State 1, Line 677
ALTER DATABASE failed. The default collation of database 'Northgale' cannot be set to Danish_Norwegian_CS_AS.
In total, I get 32 lines of error messages for a database with 16 tables. When I try the command on a more complex database with over 1800 tables, the output extends over 2000 lines!
These are the features I know of that block ALTER DATABASE COLLATE:
CHECK constraints.
Computed columns.
Filtered indexes.
Filtered statistics.
Views, functions and triggers created WITH SCHEMABINDING.
Multi-statement table-valued functions, with or without SCHEMABINDING.
You may not have all of these, but if your database has zero CHECK constraints, I want to have a serious discussion with you about data integrity.
Why do these features block ALTER DATABASE COLLATE? For all but the last, the issue lies in the SQL code that is part of the definition and which is stored with the constraint/column/etc in the system tables. In this SQL code there are names of columns or objects, which may no longer be valid after the collation change. Consider this silly database:
CREATE DATABASE SillyDB COLLATE Latin1_General_CI_AS
go
USE SillyDB
go
CREATE TABLE SillyTable(a int NOT NULL CHECK (A > 0),
                        b int NULL,
                        c AS a + B,
                        d int)
CREATE INDEX normal_ix ON SillyTable (A)
CREATE INDEX filter_ix ON SillyTable (b) WHERE b > 0
CREATE STATISTICS filter_stat ON SillyTable (d) WHERE (D > 0)
go
CREATE VIEW SillyView AS
   SELECT a, b, c FROM dbo.sillytable
go
CREATE VIEW SillyView2 WITH SCHEMABINDING AS
   SELECT a, b, c FROM dbo.sillytable
The database has a case-insensitive (CI) collation, and the script frivolously varies the case. With a case-sensitive (CS) collation, there is not a single definition that is valid. In the system catalogue, SQL Server has stored the expression of the CHECK constraint, the computed column and the index and statistics filters in various columns in the system tables. It has normalised them with regards to syntax, but column names are left as they were entered. When it comes to the key column in the index normal_ix, it is a different matter. Since this is a column name and nothing else, this relation is stored by column id in the system tables. But that is more difficult to do with the expressions in the CHECK constraint etc.
When it comes to the two views, SQL Server keeps the text stored for both views. However, for the view without SCHEMABINDING, the fact that a change to a CS collation makes the view invalid is no different from making the view invalid by dropping or renaming the table or a column. The onus for keeping the view valid is on you. When you add WITH SCHEMABINDING, you tell SQL Server to protect the view with its life and object to any change that would make the view invalid. Including collation changes.
As you see, changing to a CS collation would pose a lot of problems for SillyDB, and we may even be glad that SQL Server blocks the change. But what if we were to change to another CI collation, under which all definitions would remain valid? Or what if we correct the script to be consistent in case usage, before we create the database and then try to change to a CS collation? The answer is that it doesn't matter. SQL Server does not care. It sees that there could be problems, and it has absolutely zero interest in figuring out if there are problems for real. The mere fact that there could be problems is enough for SQL Server to wash its hands of the collation change as long as those objects are there, and it tells us: Get these things out of my sight!
The last item on the list, multi-statement functions, is one I have not fully understood myself. I guess it is related to the fact that the collation of string columns in the return table has to change if you change the database collation. But SQL Server is not consistent here. What is the difference between these two:
CREATE FUNCTION dbo.fun1() RETURNS @T TABLE (S nvarchar(20)) AS
BEGIN
   RETURN
END
go
CREATE FUNCTION dbo.fun2() RETURNS @T TABLE
       (S nvarchar(20) COLLATE DATABASE_DEFAULT) AS
BEGIN
   RETURN
END
Both return tables depend on the database collation in the sense that the collation of the column S should be changed as we change the database collation. But if you add both fun1 and fun2 to SillyDB and try ALTER DATABASE COLLATE, you will find that only fun1 is flagged. I am not going to try to find an explanation for this inconsistency, but I will simply assume that someone had a lapse at some point. To perform a collation change, you will have no choice but to drop fun1 and recreate it. Nevertheless, you also need to handle fun2, and we will come back to this in the section Finalising the Collation Change.
There is a general way to see if a module would block ALTER DATABASE COLLATE: In sys.sql_modules there is a column uses_database_collation. (sys.computed_columns and sys.check_constraints also have such a column.) My assumption is that ALTER DATABASE COLLATE checks this column for blocking items, and thus does not check the definition of the objects themselves.
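You can query this column yourself to get a preview of the blockers before you even attempt ALTER DATABASE COLLATE. The query below is only a sketch of the idea; depending on your version of SQL Server, there may be further catalogue views you want to include.

```sql
-- List modules, computed columns and CHECK constraints that are
-- flagged as depending on the database collation.
SELECT object_schema_name(m.object_id) AS "schema",
       object_name(m.object_id)        AS "object",
       'module'                        AS reason
FROM   sys.sql_modules m
WHERE  m.uses_database_collation = 1
UNION ALL
SELECT object_schema_name(c.object_id),
       object_name(c.object_id) + '.' + c.name,
       'computed column'
FROM   sys.computed_columns c
WHERE  c.uses_database_collation = 1
UNION ALL
SELECT object_schema_name(k.parent_object_id),
       k.name,
       'CHECK constraint'
FROM   sys.check_constraints k
WHERE  k.uses_database_collation = 1
```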
Before I move on, I would like to briefly discuss Azure SQL Database, where things should be different but are not.
By default, when you create a new Azure SQL Database, you get a fixed catalogue collation, SQL_Latin1_General_CP1_CI_AS. (You can override this, but it is not exposed in the portal.) That is, no matter what database collation you select, the system tables will use the aforementioned collation, so that your table and column names are always case-insensitive, even if the database collation is case-sensitive. Furthermore, if you change the database collation, the system catalogue is not affected. This means that of the points I listed in the previous section, only the last about table-valued parameters would have to block ALTER DATABASE COLLATE. (And as I discussed, this is also dubious.)
But, no, there is no difference. Azure SQL Database will be just as adamant that you should drop your CHECK constraints, computed columns etc. as on-prem SQL Server. I can't but consider this to be a bug and I have filed a bug report. Votes are welcome!
The same issue applies to partially contained databases, which also have a fixed catalogue collation (in this case Latin1_General_100_CI_AS_KS_WS). So again, your CHECK constraints etc. should not have to block ALTER DATABASE COLLATE, but they do. Admittedly, contained databases are quite a bit of a niche feature. But Azure SQL Database certainly is not.
We have now seen that many objects can block ALTER DATABASE COLLATE, and you will need to drop these objects to be able to change the database collation. And it does not stop there. Your computed column or schema-bound view may be indexed, so those indexes also need to be dropped before you can drop the view/column. Once you have changed the collation, you need to restore these objects, including things like permissions that were dropped implicitly, when you dropped a view or table-valued function.
To make this simpler, I've written a script, AlterDatabaseCollate.sql, which produces commands to drop and recreate the objects, and saves these commands into a table.
As posted, the script only runs on SQL 2019 and later. If you are on older versions, you will get a few errors about missing columns in catalogue views, relating to features not present in your version of SQL Server. Just comment out these columns and the pieces of syntax that go with them. I have permitted myself to use convenience syntax like DROP IF EXISTS and CREATE OR ALTER, added in SQL 2016, so if you are on SQL 2014 or earlier, you will need to make some simple edits. But to not leave readers on older versions completely stranded, I use FOR XML PATH for all string aggregation, and stay away from the more modern and more convenient string_agg.
The script first creates a schema ChangeCollation, and in this schema it creates the table AffectedObjects:
CREATE TABLE ChangeCollation.AffectedObjects
      (drop_order   int           NOT NULL,
       create_order int           NULL,
       schema_name  sysname       NOT NULL,
       object_name  sysname       NOT NULL,
       minor_name   sysname       NOT NULL,
       reason       varchar(30)   NOT NULL,
       drop_cmd     nvarchar(MAX) NOT NULL,
       create_cmd   nvarchar(MAX) NOT NULL,
       CONSTRAINT pk_AffectedObjects PRIMARY KEY (drop_order),
       CONSTRAINT u_AffectedObjects UNIQUE (schema_name, object_name, minor_name)
)
The columns schema_name, object_name, minor_name and reason are for informational purposes only. That is, they permit you to easily see which objects are affected and why. The column minor_name can be the name of a constraint, a column, an index etc. When not applicable, it is blank.
The column drop_order specifies in which order the objects are to be dropped. The script fills the table with an increment of 100 in drop_order, to permit you to rearrange the order. This can sometimes be needed, as discussed below. When recreating the objects, the default is to recreate them in the reverse order of drop_order, but you can override this with the column create_order. We will come back to this in more detail when we discuss the finalisation of the collation change. The columns drop_cmd and create_cmd hold the commands to drop and create the objects respectively. For some entries, drop_cmd is blank. This is for things that only need to be handled at finalisation, and again, this is something we will return to later.
The script fills the table with various items to be dropped and recreated. At the end of the script, there are two simple stored procedures in the ChangeCollation schema, RunDropCmds and RunCreateCmds. Both loop the AffectedObjects table to execute the stored commands. In this section, we will only work with RunDropCmds which loops the table in ascending order of drop_order.
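To give you an idea of the mechanics, here is a minimal sketch of the pattern that RunDropCmds follows. This is not the actual code in AlterDatabaseCollate.sql, only an illustration of the loop:

```sql
-- Sketch of the RunDropCmds pattern: loop the AffectedObjects table
-- in drop_order and execute each stored drop command.
CREATE OR ALTER PROCEDURE ChangeCollation.RunDropCmds AS
BEGIN
   DECLARE @cmd nvarchar(MAX)
   DECLARE cur CURSOR STATIC LOCAL FOR
      SELECT drop_cmd
      FROM   ChangeCollation.AffectedObjects
      WHERE  drop_cmd <> ''
      ORDER  BY drop_order
   OPEN cur
   WHILE 1 = 1
   BEGIN
      FETCH cur INTO @cmd
      IF @@fetch_status <> 0 BREAK
      EXEC(@cmd)
   END
   DEALLOCATE cur
END
```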
So what do you get in this table? In order, the script fills in these DROP commands:
Some remarks:
When you have been able to run RunDropCmds without errors, it is time to try to change the database collation for real:
EXEC ChangeCollation.RunDropCmds
go
ALTER DATABASE CURRENT SET SINGLE_USER WITH ROLLBACK IMMEDIATE
ALTER DATABASE CURRENT COLLATE <your collation here>
ALTER DATABASE CURRENT SET MULTI_USER
As noted earlier, SINGLE_USER is not needed, but you need to have the database to yourself, so you may just as well use it.
With some luck, ALTER DATABASE COLLATE succeeds. But you could be using a blocking feature that I have overlooked. In that case, I suggest that you add DROP and CREATE commands for those objects to AffectedObjects. Or, if you have plenty of these, augment the script and send me the improved version, so that other people can benefit from your changes. As noted earlier, ALTER DATABASE COLLATE can also fail because names that previously were distinct no longer are in the new collation.
Once you have been able to change the database collation, you still have the job of changing the collation of all string columns with ALTER TABLE ALTER COLUMN. Before we look at a script to do this, let's first study this rather unfriendly command. There are two complications.
The first is that if there is any object tied to the column, the command will block. This can be one of:
Thus, you will have no choice but to drop all of the above. And recreate when you are done. Does this seem familiar?
The other complication is that ALTER TABLE ALTER COLUMN requires you to specify the column exactly according to its current definition, save for the detail you want to change. If the column was defined as NOT NULL, and you leave out NOT NULL, the column will now permit NULL values. Or consider this. We create a table with a masked column, and we create a test user to verify that the masking works:
CREATE TABLE MaskedTable (email nvarchar(60) MASKED WITH (function = 'email()'))
INSERT MaskedTable (email) VALUES('someone@somewhere.com')
CREATE USER Testuser WITHOUT LOGIN
GRANT SELECT ON MaskedTable TO Testuser
EXECUTE AS USER = 'Testuser'
go
SELECT email FROM MaskedTable
go
REVERT
The output from the script is something like sXXX@XXXX.com, exactly what we wanted. But say now that we want to change the collation, and we run this, which seems pretty normal:
ALTER TABLE MaskedTable ALTER COLUMN email nvarchar(60) COLLATE Polish_CS_AI
But when we now run this with our test user, things have changed:
EXECUTE AS USER = 'Testuser'
go
SELECT email FROM MaskedTable
go
REVERT
The output is now someone@somewhere.com. Since we did not specify the masking option in the ALTER TABLE command, it was dropped. Do you see now why you need a schema-compare tool when you run a collation-change project?
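If your schema compare reveals that a mask has been lost this way, it is possible, as far as I can tell, to add it back afterwards with ADD MASKED. (MaskedTable and the email() function come from the example above.)

```sql
-- Restore the mask that the plain ALTER COLUMN silently dropped.
ALTER TABLE MaskedTable ALTER COLUMN email
   ADD MASKED WITH (FUNCTION = 'email()')
```

The better option is of course to include all column properties when you generate the ALTER TABLE ALTER COLUMN statement in the first place.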
It's now time to look at the second script, ChangeColumnCollation.sql. This script has a different background than AlterDatabaseCollate.sql that I presented earlier. That script is, so far, entirely my own work. ChangeColumnCollation.sql, on the other hand, is the work of many. A couple of years ago, Data Platform MVP Hugo Kornelis had a client which needed help with changing column collations. Hugo found a script by Philip C to which he made numerous corrections and additions. When he was done, he shared his work in a blog post, Collation change script. Other people found his blog post, and in the comments field they added several enhancements. Eventually, Bas Roovers took one of the improved versions and put it up on his GitHub site.
When I set off to write this article, I took the script from Bas's GitHub and tried it on a few databases I had around. I had to fix a few bugs. And because these databases were using features that none of the others had reason to consider, I also had to make a couple of amendments, including adding support for table types. Still, there is a lot missing. More about that later in the section Feature Coverage.
To be able to use the script, you first need to configure it. When you open the script in SSMS, you should look for the declaration of the variables @FromCollation and @ToCollation on top of the script. Here you specify the collations you want to change from and to. The script as posted by Hugo and Bas just changes all string columns to the database collation. But I said to myself: what if there are columns that deviate from the old database collation? For instance, you may be using a binary collation for code columns. @ToCollation defaults to the current database collation, so unless you are changing the collation of your deviating columns, you can leave it as it is. But you need to set @FromCollation.
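For instance, the top of your edited copy of the script could look something like this. The collation names here are only examples, and the exact declarations in your copy of the script may differ.

```sql
-- Example configuration (your collation names will differ).
DECLARE @FromCollation sysname = 'Latin1_General_CI_AS',
        @ToCollation   sysname = NULL   -- NULL => use the current database collation.
```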
In contrast to AlterDatabaseCollate.sql, ChangeColumnCollation.sql does not store the result in a permanent table, but just returns a one-column result set with all the commands that you can copy and paste into a query window.
As posted, the script runs on SQL 2016 and up. On older versions, there are a few lines relating to newer features you need to comment out. As with AlterDatabaseCollate.sql, FOR XML PATH is used throughout for string aggregation rather than string_agg to support older versions.
The resulting script has a number of sections (some of which may be empty, depending on what you have in your database). In order, they are:
Some remarks:
When you run the generated statements, you could very well get one or more error messages. Just double-click on a message to see the failing statement. Possible failure reasons include:
For the first point, you will need to clean up your source data. For the other two, you may have to make manual tweaks to the generated script or AffectedObjects. Or modify one of AlterDatabaseCollate.sql and ChangeColumnCollation.sql to handle the situation.
Once you have ironed out the errors you got when running the output of ChangeColumnCollation.sql – or simply decided to defer them for later – it is time to run ChangeCollation.RunCreateCmds to restore the objects previously dropped by RunDropCmds. The reason that you should not run this procedure until now is that some of the objects you dropped could also block ALTER TABLE ALTER COLUMN, so it is better to have them out of the way.
RunCreateCmds is slightly more sophisticated than RunDropCmds, but only slightly. The ORDER BY clause that sets up the cursor looks like this:
ORDER BY isnull(create_order, drop_order) DESC
You may recall that create_order is nullable, and AlterDatabaseCollate.sql always leaves it NULL. Thus, the default is to recreate objects in the reverse of the order in which they were dropped. If you run into issues, you can set an explicit create_order to override the order implied by drop_order. You can also set create_order to 0 or a negative value, in which case RunCreateCmds will skip the object entirely.
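Such overrides are plain UPDATE statements against the table. The object names below are made up, just to illustrate the idea:

```sql
-- Move a view earlier in the recreation order. (Recall that RunCreateCmds
-- orders by isnull(create_order, drop_order) DESC, so a higher value
-- means the object is recreated earlier.)
UPDATE ChangeCollation.AffectedObjects
SET    create_order = 950
WHERE  schema_name = 'dbo' AND object_name = 'SomeView' AND minor_name = ''

-- Skip recreating an object entirely.
UPDATE ChangeCollation.AffectedObjects
SET    create_order = 0
WHERE  schema_name = 'dbo' AND object_name = 'SomeTable' AND minor_name = 'obsolete_ix'
```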
As I said previously, there can be items in ChangeCollation.AffectedObjects that only have creation commands. Here is a full list of the actions that can be performed by RunCreateCmds:
Remarks:
Once you have been able to run RunCreateCmds without errors (or decided that you are done anyway), run this query, to see if you still have any objects with the old database collation:
SELECT s.name AS "schema", o.name AS "table", c.name AS "column"
FROM   sys.schemas s
JOIN   sys.objects o ON s.schema_id = o.schema_id
JOIN   sys.columns c ON c.object_id = o.object_id
WHERE  c.collation_name = '<your old collation>'
ORDER  BY s.name, o.name, c.name
And then use your schema-compare tool to verify that all objects have been restored faithfully.
Before you call it a success, don't forget to test the application. You can still have errors in stored procedures whose identifiers no longer match the names of tables, columns and other objects after the collation change. If there are cross-database or cross-server queries, collation errors may pop up. You can find some of these errors by running sp_refreshsqlmodule on all modules not covered by RunCreateCmds. Due to deferred name resolution, you will not detect if a table name is written in a way which no longer matches the definition in the new collation. Likewise, errors related to temp tables will not show up. Such errors you can only find by running the code. Also keep in mind that the collation change can lead to unexpected changes in query results etc.
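To run sp_refreshsqlmodule on all modules, you can generate the calls with a query along these lines, and then copy the result into a query window and execute it. Schema-bound modules are excluded, since sp_refreshsqlmodule cannot refresh them.

```sql
-- Generate refresh calls for all non-schema-bound modules. Errors
-- when you run the output point to modules that no longer compile.
SELECT 'EXEC sp_refreshsqlmodule ''' +
       quotename(object_schema_name(m.object_id)) + '.' +
       quotename(object_name(m.object_id)) + ''''
FROM   sys.sql_modules m
WHERE  m.is_schema_bound = 0
```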
Note: When doing your initial testing, it may be a good idea to make a dry run of RunCreateCmds directly after a successful run of RunDropCmds without changing the database collation. This is particularly true if you have made manual tweaks to the table or changed the script to accommodate your needs. This permits you to trap errors earlier. You should also make a schema compare after this step to ensure that there are no surprises.
As I have mentioned, these two scripts by no means aspire to handle or consider every feature in SQL Server. They handle the basics, but they consider very few of the "fancy" features in SQL Server. And that's "fancy" in a very broad sense. In this section, I will try to briefly discuss what works and what does not. Since the two scripts have different origins, their coverage is not exactly the same. In the interest of brevity, I will refer to AlterDatabaseCollate.sql as ADC and ChangeColumnCollation.sql as CCC in this section.
If a specific feature does not appear in the list below it may be because I consider that it would be too far-fetched that it would be affected by the collation-change process, but I've been wrong before. It could also be that I simply overlooked the feature.
I would like to end with some thoughts about error handling and performance.
I am not really happy with any of the scripts with regards to error handling. With both of them, there is a risk that a failing command aborts the entire batch, all depending on what the error is and your setting of XACT_ABORT. With RunDropCmds and RunCreateCmds, this means that no more commands in the table will be executed. The output created by CCC is somewhat better in this regard since there are multiple batches. But still most sections consist of a single batch, so if one ALTER TABLE ALTER COLUMN fails, a lot more could be skipped. I think that in most cases you want to see all errors, so that you can quite directly get an idea of how bad your situation is.
It would probably be better in both cases to have a script with a GO after each command to ensure that execution continues. Then you can easily double-click on an error message to see where it occurred. Or the commands could be executed by a simple client-side script or program that runs one statement at a time and logs any error together with the statement that produced it.
One reason I did not pursue this option is that I was pretty exhausted after having managed to use these scripts to change the collation on my three test databases at all. (All three required manual tweaks for things the scripts do not handle.)
There is also the aspect of performance. Those who worked on CCC before me seem to have made some improvements to the original work to prevent actions being performed more than once. Still there is room for more improvements, for instance with the dropping and re-creating of indexes. ADC drops non-clustered indexes before dropping the clustered index on a table/view and creates them in reverse order. CCC, which is likely to drop more indexes than ADC, contents itself with dropping the PK constraints last of all the indexes and recreating them first, on the assumption that the PK is the clustered index in most tables. But on one database I tested, this is not true for the largest table, and only the dropping section generated by CCC took over an hour to execute. This is because when you drop a clustered index with non-clustered indexes present, the latter have to be rebuilt to use the so-called RID as the row locator instead of the CI keys. And when you recreate the clustered indexes, the non-clustered indexes need to be rebuilt a second time. From this perspective, once the clustered index is added to the drop-and-recreate list, it would be better to add all non-clustered indexes, including those that do not relate to any string columns.
But these considerations about index rebuilds are still only a matter of fiddling with details. If you have a multi-terabyte database, can you really get through with the collation change if you run serial scripts with one statement at a time? You can probably cut down the execution time quite a bit by running things in parallel. Setting up that parallel execution and finding out what can run in parallel independently – that's a major task.
As you have seen, running an in-place change means that you have to go through a lot of hoops. My hope is that the scripts I've presented should help you on the way. Nevertheless, you may be using features which these scripts do not handle correctly, and you can't figure out how to adjust the scripts to these features. Nor are you keen on manually adjusting the sets of generated commands because of the sheer volume.
This is when you may look into the alternative to create a new empty database where all columns have the correct collation and copy all data from the old database to the new database. If you have issues with data errors or data distortion on a larger scale, you could adapt the copy-script to modify the data while you are copying it. And, who knows, maybe there is a lot of old data in the database that is best purged. If so, you simply don't copy that old data, killing two birds with one stone.
It goes without saying that the bigger your database is, the less appealing this method becomes. However, there can be cases where copying to a new database can be faster than an in-place change. One particular situation is when you are changing between collations with different code pages, for instance from a legacy code page to a UTF8 collation. If you have large tables with many varchar columns, an in-place change requires you to run one ALTER TABLE ALTER COLUMN for each column. And each of these ALTER TABLE ALTER COLUMN will perform an update of all rows in the table to implement the code-page change. To add insult to injury, the query plan for this operation is likely to be serial. If you have several such large tables, you can run ALTER TABLE statements on multiple tables in parallel. But for the work in a single table, all these ALTER TABLE ALTER COLUMN must run one at a time sequentially, since they lock the table exclusively. Thus, copying the data to empty tables may be faster.
In this chapter I give an outline of the procedure, and I provide some helper queries to generate statements for the important steps. I also briefly discuss possible performance improvements.
The first step is to create a faithful copy of the database. This alone is a challenge, and as with an in-place change, there is a risk that you lose features or configuration if you are not careful. We will discuss a couple of ways to do it.
You may be in the fortunate position that you already have a method of building an empty database from scripts you have in source control. In this case, this is a fairly trivial operation. You only have to create an empty database with the desired collation and then run your creation scripts.
...although, your creation scripts may not include users, certificates and similar items, so you may have to take care of these manually. Don't forget to run your schema-compare tool – it will remind you of many of these things.
There might be quite a few of these out there, but I would expect that the most commonly used is the option in SQL Server Management Studio. You reach it by finding the database in Object Explorer. Then from the context menu, you select Tasks, and then Generate Scripts from that very long submenu. When you use this option, you need to be careful, because by default some objects are not scripted. When you come to the page Set Scripting Options, click the Advanced button to inspect the settings. You probably want to activate most of them.
When it comes to the setting Script collation, this is a case of "it depends". If you want to change the collation of exactly all columns, just leave it false. But if you know that you have columns with deviating collations, and which should remain deviating (for instance some columns may be accent-insensitive, although the database collation is accent-sensitive), enable this option. When you have the final script, run a find/replace to change the collation(s) you want to replace.
Keep in mind that no matter the setting, you need to review the CREATE DATABASE statement on top to make sure that there is a COLLATE clause with the desired collation. You will also need to adjust the file paths.
Once you have executed the resulting script without errors, run your schema-compare tool to verify that the schema is indeed as identical with the source database as it can be.
In the previous section I suggested that you should start testing out an in-place change with a small version of your target database to keep down the turnaround time. Once you have been able to get the in-place change going in the small database, you may prefer to copy the data over to this small database, rather than running the in-place change in the big database. (For one or more of the reasons I discussed above.) Obviously, for this to work out, it does not only have to be a small database – it must be an empty database. If you worked from a test database, you may already have routines for clearing all data.
Then again, you might as well work from a copy of your target database, which is as faithful as a schema copy can get. This is how you can do it. Restore a copy of the target database next to the source database. Never mind the disk space – in the end you will need the size to fit two full-size databases anyway. Then generate code to drop and recreate all foreign-key constraints; you can work from the code in ChangeColumnCollation.sql to produce these statements. Drop the foreign keys. Next run TRUNCATE TABLE on all tables and then restore the foreign keys. At this point, you could preserve space by shrinking the database, but that's pointless – your aim is to fill it up again. Run the in-place change of the collation in this empty database, and then perform the data copy as I will outline in the coming sections.
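For the TRUNCATE step, you can generate the statements in the same style as the other helper queries in this article. (Run the output only after the foreign keys have been dropped, since TRUNCATE TABLE is not permitted on a table referenced by a foreign key.)

```sql
-- Generate TRUNCATE TABLE statements for all user tables.
SELECT 'TRUNCATE TABLE ' + quotename(s.name) + '.' + quotename(t.name)
FROM   sys.tables t
JOIN   sys.schemas s ON s.schema_id = t.schema_id
```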
If you worked with a clone created with DBCC CLONEDATABASE, you may think that you could continue working with that clone. You could, but you need to be aware of a few things. First of all, if you take this route, it is essential that you have created the clone with the VERIFY_CLONEDB option. The reason for this is found in the topic for the command: This option is required if the cloned database is intended for production use. If you are uncertain how you created the clone, you can use this query to verify:
SELECT DATABASEPROPERTYEX('YourClone', 'IsClone'),
       DATABASEPROPERTYEX('YourClone', 'IsVerifiedClone');
Both calls should return 1. If you get 1 and 0 back, you need to drop your clone and create a new one, this time using VERIFY_CLONEDB.
It is also very important that you read the topic for DBCC CLONEDATABASE closely. The command does not copy exactly everything, and the topic includes a detailed discussion of what it supports and what it does not. Don't forget to make a schema compare!
Before you start copying you need to perform some preparatory steps. Run these queries:
SELECT 'ALTER TABLE ' + quotename(s.name) + '.' + quotename(t.name) + ' DISABLE TRIGGER ALL'
FROM   sys.tables t
JOIN   sys.schemas s ON s.schema_id = t.schema_id

SELECT 'ALTER TABLE ' + quotename(s.name) + '.' + quotename(t.name) + ' NOCHECK CONSTRAINT ALL'
FROM   sys.tables t
JOIN   sys.schemas s ON s.schema_id = t.schema_id

SELECT 'ALTER INDEX ' + quotename(i.name) + ' ON ' + quotename(s.name) + '.' + quotename(t.name) + ' DISABLE'
FROM   sys.indexes i
JOIN   sys.tables t ON t.object_id = i.object_id
JOIN   sys.schemas s ON s.schema_id = t.schema_id
WHERE  i.index_id >= 2
Then copy the results into a query window and execute them. This will disable all triggers, all CHECK and FOREIGN KEY constraints and all non-clustered indexes.
There are several reasons why you should do this:
Final check before you start loading: Ensure that the database is in simple recovery. There is little reason to run the data copy with full recovery. If it fails or is interrupted, you are likely to start over. Also, simple recovery permits minimally logged operations which can help things to run faster.
And, oh, you might just as well grow the data file(s) to the same size as the source database before you start, so that this is already taken care of.
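Growing a file up front is a plain ALTER DATABASE command. The database name, logical file name and size below are placeholders; you find the logical file names of your database in sys.database_files.

```sql
-- Grow a data file up front to avoid auto-grow during the load.
ALTER DATABASE YourNewDB
   MODIFY FILE (NAME = 'YourNewDB_data', SIZE = 500GB)
```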
You are now ready to copy the data. You can use this SELECT statement below to generate INSERT statements for all tables:
SELECT CASE WHEN objectproperty(t.object_id, 'TableHasIdentity') = 1
            THEN 'SET IDENTITY_INSERT ' + quotename(s.name) + '.' +
                 quotename(t.name) + ' ON ' + nchar(13) + nchar(10)
            ELSE ''
       END +
       'INSERT ' + quotename(s.name) + '.' + quotename(t.name) + '( ' +
       substring(c.collist, 1, len(collist)-1) + ') SELECT ' +
       substring(c.collist, 1, len(collist)-1) + ' FROM YourSourceDB.' +
       quotename(s.name) + '.' + quotename(t.name) + nchar(13) + nchar(10) +
       CASE WHEN objectproperty(t.object_id, 'TableHasIdentity') = 1
            THEN 'SET IDENTITY_INSERT ' + quotename(s.name) + '.' +
                 quotename(t.name) + ' OFF ' + nchar(13) + nchar(10)
            ELSE ''
       END + 'GO'
FROM   sys.tables t
JOIN   sys.schemas s ON s.schema_id = t.schema_id
CROSS  APPLY (SELECT (SELECT quotename(c.name) + ', '
                      FROM   sys.columns c
                      WHERE  c.object_id = t.object_id
                        AND  NOT (c.is_computed = 1 OR
                                  type_name(c.system_type_id) = 'timestamp')
                      ORDER  BY c.column_id
                      FOR XML PATH(''), TYPE).value('.', 'nvarchar(MAX)')) AS c(collist)
Note that you need to replace YourSourceDB with the actual name of your source database.
There are quite a few caveats here. I've exempted timestamp (a.k.a rowversion) columns and computed columns from the column lists, and I've added SET IDENTITY_INSERT ON and OFF for tables with IDENTITY columns. But there may be other column features I have overlooked. See the above as a starting point.
For tables with tens or hundreds of millions of rows, you may not want to copy all rows in a single statement. Even if you have the database in simple recovery, and the TABLOCK hint may help to make the INSERT minimally logged, you may still exceed resource limits. Therefore, it may be better to insert the data in chunks of, say, five million rows at a time. How to do this is not something I cover here, but if you want inspiration, you can watch a recording of my session Don't Bite Off More than You Can Chew – Take it in Chunks. Keep in mind that when settling on a chunk size, you cannot only look at the number of rows, but you also need to consider the number of bytes. If a table has a large LOB column, you may need to cut down on the number of rows per chunk quite a bit.
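As a starting point, here is a simple-minded sketch of chunking on an integer key. BigTable and its columns are made-up names, and the sketch assumes that id is the clustered key, so that each range scan is cheap. For keys with large gaps, or composite keys, you would need something more elaborate.

```sql
-- Copy a big table in chunks of five million key values at a time.
-- (Illustration only; table and column names are invented.)
DECLARE @minid     int = 0,
        @maxid     int,
        @chunksize int = 5000000
SELECT @minid = MIN(id), @maxid = MAX(id) FROM YourSourceDB.dbo.BigTable

WHILE @minid <= @maxid
BEGIN
   INSERT dbo.BigTable WITH (TABLOCK) (id, col1, col2)
      SELECT id, col1, col2
      FROM   YourSourceDB.dbo.BigTable
      WHERE  id >= @minid AND id < @minid + @chunksize
   SET @minid += @chunksize
END
```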
Moreover, you can speed things up by running things in parallel. A simple-minded approach is to split up the generated script in parts that you run from different query windows. A more sophisticated method would be to write a multi-threaded program where each thread reads statements from a queue table and runs them. If you are well versed in SQL Server Integration Services, you may know how to best use SSIS for the task. (Myself, I've never gotten around to learning SSIS.)
Once you have copied all tables, you may want to check that all rows were actually copied, particularly if you engaged in any fancy chunking scheme. You can of course run a SELECT COUNT(*) on all tables, but you can also get a row count of a table this way:
SELECT SUM(p.rows)
FROM   sys.partitions p
WHERE  p.object_id = object_id('MySchema.MyTable')
  AND  p.index_id IN (0, 1)
The filter on index_id serves to exclude non-clustered indexes to avoid multiplication of the row count.
How to extend this query to all tables in the database and how to use it to compare two databases is left as an exercise for the reader.
Next, it's time to re-enable indexes, triggers and constraints. Here are the SELECT statements to generate the statements for these operations:
SELECT 'ALTER INDEX ' + quotename(i.name) + ' ON ' + quotename(s.name) + '.' + quotename(t.name) + ' REBUILD'
FROM   sys.indexes i
JOIN   sys.tables t ON t.object_id = i.object_id
JOIN   sys.schemas s ON s.schema_id = t.schema_id
WHERE  i.index_id >= 2

SELECT 'ALTER TABLE ' + quotename(s.name) + '.' + quotename(t.name) + ' ENABLE TRIGGER ALL'
FROM   sys.tables t
JOIN   sys.schemas s ON s.schema_id = t.schema_id

SELECT 'ALTER TABLE ' + quotename(s.name) + '.' + quotename(t.name) + ' WITH CHECK CHECK CONSTRAINT ALL'
FROM   sys.tables t
JOIN   sys.schemas s ON s.schema_id = t.schema_id
They are very similar to the queries above. A few notes:
Rebuilding the indexes and enabling the constraints are both operations that can take considerable time. Again, you can win time by running operations in parallel. Here you need to be careful and make sure that you have all ALTER INDEX statements for a table in the same thread. You should probably let all index rebuilds complete before you start validating constraints.
Once all steps have completed successfully, run a schema compare to make sure that nothing is missing or has been left disabled. As with an in-place change – don't forget to test the application!
Don't forget to put the database into full recovery, if it is intended to be a production database.
I'd like to extend a big thanks to Jayakumar R, Philip C, Hugo Kornelis, Walter Charrière, Damon Clark and Bas Roovers who worked with ChangeColumnCollation.sql before me.
If you use the script and you make changes that you think could be of general interest, please send me your updated version, and I will update the file(s) that are attached to this article. Of course, I will give you credit for the changes! My email address is esquel@sommarskog.se.
If you are working with a collation-changing project and you run into problems, you can ask in a public forum or drop me a line (email address above), and I will respond as my free time permits. I may not have the time to write queries to support feature X, but maybe I can give you some tips to get going.
And, as always with my articles, if you see any spelling or grammar errors, don't hesitate to tell me.