Dynamic Data Masking (DDM), a valuable feature added in SQL Server 2016. Hiding sensitive data from unauthorized users is achievable using DDM. Most importantly, Masking of data and encryption are two different ways of securing data. Dynamic Data Masking uses different strategies to hide data – benefiting by not modifying stored data as Encryption feature in SQL Server.
Dynamic Data Masking can be extensively used in Healthcare, Banking domain with stringent measures to maintain data confidentiality from unauthorized entities. DDM enables us to secure existing stored data without modifying any application code and queries.
As a case study, a developer should not be exposed to production data which has customer’s health & banking records like his identity, contacts, credit card number and financial history. Neither a third party sales representative should know customer’s insurance history that can be used for marketing their own products.
Therefore, Dynamic Data Masking can help us to limit exposing confidential data in plain text format to any unauthorized users even though they have a read access.
Create New User To Test Masking Functionality
Before proceeding with Dynamic data masking queries. We shall create a new user named MaskedTestUser having only SELECT permission on tables created under dbo user. We shall create tables under dbo user using masked columns functionality and query them using MaskedTestUser to view masked data. Assuming MaskedTestUser is an unauthorized user who should not have access to actual data.
DROP USER IF EXISTS MaskedTestUser; CREATE USER MaskedTestUser WITHOUT LOGIN;
Masking Functions & Supported Data Types In SQL Server
SQL Server has incorporated four different functions to mask sensitive data. Designed to work with all possible data types like string, numbers and date along with some special data types like XML, varbinary and hierarchyid.
Let’s go through each of them to understand their functionality and use.
1. Default
Full masking of data is achievable using default masking function. Works with all possible data types in SQL Server. Masked data is represented as per the masked column’s data type. String data types are represented using XXXX characters like IdentityNumber and Phone column. Numeric data types are shown as zero number. A date is displayed with default date ‘1900-01-01’ like DateOfBirth column.
Let’s populate sample data with masking columns for demonstration purpose.
DROP TABLE IF EXISTS DefaultMaskTest; CREATE TABLE DefaultMaskTest ( ID INT IDENTITY (1,1) PRIMARY KEY NOT NULL ,DefaultMask_Varchar VARCHAR(255) MASKED WITH (FUNCTION = 'default()') NULL ,DefaultMask_Char CHAR(1) MASKED WITH (FUNCTION = 'default()') NOT NULL ,DefaultMask_Bit BIT MASKED WITH (FUNCTION = 'default()') NOT NULL ,DefaultMask_Date DATE MASKED WITH (FUNCTION = 'default()') NOT NULL ,DefaultMask_DateTime DATETIME MASKED WITH (FUNCTION = 'default()') NOT NULL ,DefaultMask_Time TIME MASKED WITH (FUNCTION = 'default()') NOT NULL ,DefaultMask_Integer BIGINT MASKED WITH (FUNCTION = 'default()') NOT NULL ,DefaultMask_Decimal DECIMAL(9,2) MASKED WITH (FUNCTION = 'default()') NOT NULL ,DefaultMask_XML XML MASKED WITH (FUNCTION = 'default()') NOT NULL ); GO INSERT INTO DefaultMaskTest ( DefaultMask_Varchar, DefaultMask_Char, DefaultMask_Bit, DefaultMask_Date, DefaultMask_DateTime, DefaultMask_Time , DefaultMask_Integer, DefaultMask_Decimal, DefaultMask_XML ) VALUES ( 'Chetan Sharma', 'M', 1, '2020-06-12', '2021-06-12 12:23:32:543', '08:12:46:342' , 5282991, 45628.39,'<root>Tech-Recipes</root>' );
Following queries will show us how masked data will be displayed to MaskedTestUser and dbo user.
--Drop & Create User - MaskedTestUser DROP USER IF EXISTS MaskedTestUser; CREATE USER MaskedTestUser WITHOUT LOGIN; --Query table using dbo user SELECT * FROM DefaultMaskTest; --Grant SELECT permission to MaskedTestUser GRANT SELECT ON DefaultMaskTest TO MaskedTestUser; --Query table using MaskedTestUser EXECUTE AS USER = 'MaskedTestUser'; SELECT * FROM DefaultMaskTest; --Revert user impersonation to dbo user REVERT;
2.Partial
Partial function works with only string data types. Partial masking of data shows few characters either at the beginning or end of the string. A custom string can be embedded in between instead of using default ‘XXXX’ characters. In the above example, CreditCard column was masked using partial(2, “XXXXXXX”, 0) method thus showing the first two characters only and adding X character which is customizable.
Let’s populate sample data with masking columns for demonstration purpose.
DROP TABLE IF EXISTS PartialMaskTest; CREATE TABLE PartialMaskTest ( ID INT IDENTITY(1,1) PRIMARY KEY NOT NULL ,PartialMask_Varchar VARCHAR(255) MASKED WITH (FUNCTION = 'partial(1, "XXXX",1)') NOT NULL ,PartialMask_Nvarchar NVARCHAR(255)MASKED WITH (FUNCTION = 'partial(2, "ABCDEFG",3)') NOT NULL ); GO INSERT INTO PartialMaskTest ( PartialMask_Varchar ,PartialMask_Nvarchar ) VALUES ( 'I am on tech-recipes.com' ,'SQL Server keep rocking' );
Following queries will show us how masked data will be displayed to MaskedTestUser and dbo user.
--Drop & Create User - MaskedTestUser DROP USER IF EXISTS MaskedTestUser; CREATE USER MaskedTestUser WITHOUT LOGIN; --Query table using dbo (owner) user SELECT * FROM PartialMaskTest; --Grant SELECT permission to MaskedTestUser GRANT SELECT ON PartialMaskTest TO MaskedTestUser; --Query table using MaskedTestUser EXECUTE AS USER = 'MaskedTestUser'; SELECT * FROM PartialMaskTest; --Revert user impersonation to dbo user REVERT;
3.Email
Email, a specific function included masking email ids, considering important sensitive data. Email function works with string data types only. Masking email id and domain name to ‘[email protected]’ keeping the first character intact and changing any domain names to .COM. EmailId column in Customer table was masked from [email protected] to [email protected]. Even partial function can replicate the use of Email function as partial(1,’[email protected]’,0).
Let’s populate sample data with masking columns to see Email function in action.
DROP TABLE IF EXISTS EmailMaskTest; CREATE TABLE EmailMaskTest ( ID INT IDENTITY (1,1) PRIMARY KEY NOT NULL ,EmailMask VARCHAR(255) MASKED WITH (FUNCTION = 'email()') NOT NULL ); GO INSERT INTO EmailMaskTest ( EmailMask ) VALUES ('[email protected]'), ('[email protected]'), ('[email protected]');
Following queries will show us how masked data will be displayed to MaskedTestUser and dbo user.
--Drop & Create User - MaskedTestUser DROP USER IF EXISTS MaskedTestUser; CREATE USER MaskedTestUser WITHOUT LOGIN; --Query table using dbo (owner) user SELECT * FROM EmailMaskTest; --Grant SELECT permission to MaskedTestUser GRANT SELECT ON EmailMaskTest TO MaskedTestUser; --Query table using MaskedTestUser EXECUTE AS USER = 'MaskedTestUser'; SELECT * FROM EmailMaskTest; --Revert user impersonation to dbo user REVERT;
4.Random
Masking numeric data type columns using random numbers against original values. Allowed to generate random numbers between a defined range. Salary column in Customer table is masked using random(1,10) – generating random numbers between 1 to 10 only. We can define a decimal range using random(0.1,0.75).
DROP TABLE IF EXISTS RandomMaskTest; CREATE TABLE RandomMaskTest ( ID INT IDENTITY (1,1) PRIMARY KEY NOT NULL ,RandomMask_INT INT MASKED WITH (FUNCTION = 'random(1,999)') NOT NULL ,RandomMask_BIGINT BIGINT MASKED WITH (FUNCTION = 'random(1000,2000)') NOT NULL ,RandomMask_DECIMAL DECIMAL(9,2) MASKED WITH (FUNCTION = 'random(1.1,10.5)') NOT NULL ); GO INSERT INTO RandomMaskTest ( RandomMask_INT ,RandomMask_BIGINT ,RandomMask_DECIMAL ) VALUES (33405691, 401204193524, 311531.56);
Following queries will show us how masked data will be displayed to MaskedTestUser and dbo user.
--Drop & Create User - MaskedTestUser DROP USER IF EXISTS MaskedTestUser; CREATE USER MaskedTestUser WITHOUT LOGIN; --Query table using dbo (owner) user SELECT * FROM RandomMaskTest; --Grant SELECT permission to MaskedTestUser GRANT SELECT ON RandomMaskTest TO MaskedTestUser; --Query table using MaskedTestUser EXECUTE AS USER = 'MaskedTestUser'; SELECT * FROM RandomMaskTest; --Revert user impersonation to dbo user REVERT;
Masking Function & Data Type Compatibility
Masking functions work with their supported data types only. While trying to use any masking function with an incompatible data type, SQL Server generates the following error. When we tried to use random function with a character data type. Random function is only compatible with numeric data types.
Msg 16003, Level 16, State 0, Line 21
The data type of column ‘IdentityNumber’ does not support data masking function ‘random’.
Querying Masked Columns In Database
Added sys.masked_columns view includes all masked columns in a database. We can use it to query masked column name, masking function and associated table name.
SELECT OBJECT_NAME(mc.object_id) as TableName ,mc.name as ColumnName ,TYPE_NAME(system_type_id) as DataType ,mc.is_masked as IsMasked ,mc.masking_function as MaskingFunction FROM sys.masked_columns as mc WHERE mc.is_masked = 1;
Adding Multiple Masking Function to Create Table Statement
We’ve already gone through each masking function in detail with helpful examples. Let’s create a table with all masking function as per requirement and see the output.
Understanding user roles and permission is important. MaskedTestUser doesn’t have permission to view actual data thus masked data is shown to him whereas dbo who is authorized user and owner of Customer table can see all the available data in plain text.
Use tempdb; DROP TABLE IF EXISTS Customer; CREATE TABLE Customer ( Id INT IDENTITY(1,1) ,DateOfBirth DATE MASKED WITH (FUNCTION = 'default()') NOT NULL ,EmailId VARCHAR(255) MASKED WITH (FUNCTION = 'email()') NOT NULL ,IdentityNumber VARCHAR(11) MASKED WITH (FUNCTION = 'default()') NOT NULL ,Phone VARCHAR(11) MASKED WITH (FUNCTION = 'default()') NOT NULL ,Salary INT MASKED WITH (FUNCTION = 'random(1,9)') NOT NULL ,CreditCard VARCHAR(20) MASKED WITH (FUNCTION = 'partial(2,"XXXXXX",0)') NOT NULL ); INSERT INTO Customer (DateOfBirth, EmailId, IdentityNumber, Phone, Salary, CreditCard) VALUES ('1985-10-28', '[email protected]', 'BZVLPE1258Q', '98564533213', 85000, '9764-3451-0916-1047');
Following queries will show us how masked data will be displayed to MaskedTestUser and dbo user.
--Drop and Create MaskedTestUser DROP USER IF EXISTS MaskedTestUser; CREATE USER MaskedTestUser WITHOUT LOGIN; --Query table using dbo user SELECT * FROM Customer; --Grant SELECT permission to MaskedTestUser GRANT SELECT ON Customer TO MaskedTestUser; --Query table using MaskedTestUser EXECUTE AS USER = 'MaskedTestUser'; SELECT * FROM Customer; --Revert user impersonation to dbo user REVERT;
Adding Masking Function to Existing Column In Table
ALTER TABLE Customer DROP COLUMN IF EXISTS AccountNumber; ALTER TABLE Customer ADD AccountNumber INT; ALTER TABLE Customer ALTER COLUMN AccountNumber ADD MASKED WITH (FUNCTION = 'random(1000,5000)');
Summary
Dynamic data masking is useful to hide sensitive data from unauthorized users. We can use this in cases where data confidentiality is important to business.