Posts for: #PII

Scanning for PII with dbatools

Recently a brand new command was released that can help you scan for PII (Personally Identifiable Information) in your databases.

What Is Personally Identifiable Information (PII)?

Personally identifiable information (PII) is, as the name implies, data that can be used to identify a person. It is typically actively collected, meaning the information is provided directly by the individual. Here are a few identifiers that qualify as PII:

  • Name
  • Email address
  • Postal address
  • Phone number
  • Personal ID numbers (e.g., social security, passport, driver’s license, bank account)

Why was this command developed?

The idea came from a line of commands in dbatools that mask data. Although these commands are great, going through all of the tables and looking through the data was a bit tedious for me. When you’re dealing with databases that have hundreds to thousands of tables, you easily run into thousands to tens of thousands of columns. That’s how I came up with a command to scan for PII. It’s called Invoke-DbaDbPiiScan and is present in dbatools from version 0.9.819. The command returns all the columns that potentially contain PII. I say potentially because the results still need to be assessed to confirm that a column does indeed contain PII. Still, it eliminates the majority of the columns, saving you a lot of time. This information is very valuable when you have to deal with the GDPR, but also with regulations like HIPAA.
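
To make the idea of flagging columns that potentially contain PII more concrete, here is a minimal Python sketch that matches column names against a handful of patterns. It is not how Invoke-DbaDbPiiScan is implemented. The pattern list, the category labels, and the `scan_columns` helper are all assumptions made up for this illustration.

```python
import re

# Hypothetical name patterns, one per PII category from the list above.
# These are illustrative only and are NOT the patterns shipped with dbatools.
NAME_PATTERNS = {
    "Name": re.compile(r"(first|last|sur|full)[_ ]?name", re.IGNORECASE),
    "Email address": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "Postal address": re.compile(r"address|street|zip|postal", re.IGNORECASE),
    "Phone number": re.compile(r"phone|mobile", re.IGNORECASE),
    "Personal ID number": re.compile(r"ssn|passport|licen[cs]e|iban", re.IGNORECASE),
}


def scan_columns(columns):
    """Return (table, column, category) for every column whose name looks like PII.

    `columns` is an iterable of (table, column) tuples. Only the first
    matching category is reported for each column.
    """
    findings = []
    for table, column in columns:
        for category, pattern in NAME_PATTERNS.items():
            if pattern.search(column):
                findings.append((table, column, category))
                break
    return findings


# Made-up schema used only for this example.
columns = [
    ("Customer", "FirstName"),
    ("Customer", "EmailAddress"),
    ("Customer", "CreatedOn"),
    ("Order", "ShipStreet"),
    ("Order", "Quantity"),
]

for table, column, category in scan_columns(columns):
    print(f"{table}.{column}: possible {category}")
```

A name-based check like this will produce false positives (a column called IpAddress would be flagged as a postal address), which is exactly why the results still have to be assessed by a person.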

SQL Tuesday #110 Deterministic masking with dbatools

The dbatools module recently got a couple of new commands to mask data in databases. One feature that had not yet been added to the masking commands was deterministic masking.

What is deterministic masking

Deterministic masking is the process of replacing a value in a column with the exact same replacement value everywhere it occurs, across tables.

For example, a database has multiple tables with a column that holds first names. With deterministic masking, a given first name will always be replaced with the same masked value, no matter which table it appears in.
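
The first-name example can be sketched in a few lines of Python. This is only an illustration of the concept and not the way the dbatools masking commands work. It uses a keyed hash (HMAC-SHA256) to pick a replacement from a fixed list, and the replacement list, the secret, and the `mask_first_name` helper are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical pool of replacement names, made up for this example.
FAKE_FIRST_NAMES = ["Alex", "Billie", "Casey", "Dana", "Eli", "Frankie", "Gray", "Harper"]


def mask_first_name(value, secret):
    """Map a first name to a fake one; the same input and secret always give the same output."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    # Turn the first 8 bytes of the digest into an index into the replacement pool.
    index = int.from_bytes(digest[:8], "big") % len(FAKE_FIRST_NAMES)
    return FAKE_FIRST_NAMES[index]


secret = b"example-secret"  # assumption: kept private and reused for every table

# Two made-up tables that both contain first names.
customers = ["Anna", "Bob", "Anna"]
employees = ["Bob", "Anna"]

masked_customers = [mask_first_name(name, secret) for name in customers]
masked_employees = [mask_first_name(name, secret) for name in employees]

print(masked_customers)
print(masked_employees)
```

Every "Anna" gets the same replacement in both tables, so joins and comparisons on the masked data still line up. With a pool this small, two different original names can end up with the same fake name; a larger pool reduces that, and a stored lookup table would be another way to get the same determinism.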

Data masking with dbatools

Recently I developed a few PowerShell commands that enable data masking for databases.

The commands were originally written for the module PSDatabaseClone to enable users to automatically mask the data for a database image.

The commands were created because the cloning process would otherwise expose production data to other users, which is undesirable.

The commands were released and picked up by Chrissy LeMaire, who implemented them in dbatools and even improved them.
