Find columns with special characters
Table of Contents
Problem⌗
Imagine a database where you want to know how many times a certain column contains a special character. The reason I had to do this was that I needed to know the impact of an export interface that wasn’t working properly with special characters.
But as lazy as I am when it comes to doing repetitive tasks, I had to find a way could easily get all columns with special characters in a data set without a lot of hassle.
The solution⌗
Fortunately for me, I can use the internal INFORMATION views of a database in SQL Server to get all the columns with their table, schema, and type of data.
Because this database only used varchar columns for saving text data I only had to search for this data type. You can easily extend the selection by adding the necessary types like TEXT, NVARCHAR, etc.
To create the script I had to accomplish the following things:
- Retrieve all the columns that could contain text
- Set up a pattern to check the columns with
- Loop through the columns
- In every loop save the results
- Make up a data set of the data retrieved
I eventually got to this script:
-- Replace the mydatabase with the name of your database
USE [mydatabase]
-- Delete possible present temporary tables
IF OBJECT_ID('tempdb..#columns') IS NOT NULL
DROP TABLE #columns
IF OBJECT_ID('tempdb..#results') IS NOT NULL
DROP TABLE #results
IF OBJECT_ID('tempdb..#rows') IS NOT NULL
DROP TABLE #rows
-- Create a table for holding all the columns
CREATE TABLE #columns
(
id INT NOT NULL PRIMARY KEY IDENTITY(1, 1),
database_name VARCHAR(100),
schema_name VARCHAR(50),
table_name VARCHAR(200),
column_name VARCHAR(200),
position INT
)
-- Create a table for holding the results
CREATE TABLE #results
(
database_name VARCHAR(100),
schema_name VARCHAR(50),
table_name VARCHAR(200),
column_name VARCHAR(200),
count INT
)
-- Declare the variables needed
DECLARE @max INT,
@cnt INT
DECLARE @sql NVARCHAR(max),
@pattern VARCHAR(200)
-- Get all the columns
INSERT INTO #columns
SELECT table_catalog,
table_schema,
table_name,
column_name,
ordinal_position
FROM information_schema.columns
WHERE data_type = 'varchar'
ORDER BY table_catalog,
table_name,
ordinal_position
-- Get the minimum id and max that are needed for the loop
SELECT @cnt = Min(id)
FROM #columns
SELECT @max = Max(id)
FROM #columns
--Setup the pattern
SELECT @pattern = 'ëïöüäéíóáèìòàËÏÖÄÉÍÓÁÈÌÒÀ'
-- Make sure the @sql variable is empty
SELECT @sql = ''
-- Start loop
WHILE( @cnt < ��@max )
BEGIN
-- Setup the sql statement
SELECT @sql = 'INSERT INTO #results SELECT ''' + database_name + ''', '''
+ schema_name + ''', ''' + table_name + ''', '''
+ column_name + ''', COUNT(1) AS count FROM [' + database_name + '].['
+ schema_name + '].[' + table_name + '] WHERE ['
+ column_name
+ '] COLLATE Latin1_General_BIN2 LIKE ''%['
+ @pattern + ']%'' ESCAPE ''E'' '
FROM #columns
WHERE id = @cnt
-- Execute the sql statement
EXEC sp_executesql
@sql
-- Increase the counter
SELECT @cnt += 1
END
-- Select the data
SELECT results.database_name,
results.schema_name,
results.table_name,
results.column_name,
results.count AS 'count_results'
FROM #results results
WHERE results.count >= 1
This might take a while when you need to search through a lot of columns and records.
When the script you will get a result like the below:
The pattern can contain any character that you want. I only searched for some of the most common special characters but any character will do. I hope you find this post helpful and I appreciate any comments.