Scanning for PII with dbatools

Recently a brand new command was released that can help you scan for PII (Personally Identifiable Information) in your databases.

What Is Personally Identifiable Information (PII)?

Personally identifiable information (PII) is, as the name implies, data that can be used to identify a person. It is typically actively collected, meaning the information is provided directly by the individual. Here are some identifiers that qualify as PII:

  • Name
  • Email address
  • Postal address
  • Phone number
  • Personal ID numbers (e.g., social security, passport, driver’s license, bank account)

Why was this command developed?

The idea came from a line of commands in dbatools for masking data. Although these commands are great, going through all of the tables and looking through the data was a bit tedious for me. When you’re dealing with databases that have hundreds to thousands of tables, you easily run into thousands to tens of thousands of columns. So that’s how I came up with a command to scan for PII. It’s called Invoke-DbaDbPiiScan and is available in dbatools from version 0.9.819 onward.

The command returns all the columns that potentially contain PII. I say potentially because the results still need to be assessed to confirm that the columns do indeed contain PII. But it eliminates the majority of the columns, saving you a lot of time. This information is very valuable when you have to deal with the GDPR, but also with regulations like HIPAA.
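To make the idea of flagging candidate columns concrete, here is a minimal conceptual sketch in Python. It is not how Invoke-DbaDbPiiScan is implemented: the patterns, the `scan_columns` function, and the sample table and column names are all hypothetical, and the sketch only looks at column names, not at the data itself.

```python
import re

# Hypothetical name patterns for illustration only; these are NOT the
# rules that dbatools ships with.
PII_PATTERNS = {
    "Name": re.compile(r"(first|last|full|sur)[_ ]?name", re.IGNORECASE),
    "Email address": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "Phone number": re.compile(r"phone|mobile", re.IGNORECASE),
    "Personal ID number": re.compile(r"ssn|passport|licen[cs]e|iban", re.IGNORECASE),
}


def scan_columns(columns):
    """Return (table, column, pii_type) for each column whose name matches a pattern."""
    hits = []
    for table, column in columns:
        for pii_type, pattern in PII_PATTERNS.items():
            if pattern.search(column):
                hits.append((table, column, pii_type))
                break  # one match is enough to flag the column for review
    return hits


# Hypothetical schema: (table name, column name) pairs.
columns = [
    ("Customer", "FirstName"),
    ("Customer", "EmailAddress"),
    ("Customer", "CreatedOn"),
    ("Orders", "OrderTotal"),
    ("Employee", "PassportNumber"),
]

results = scan_columns(columns)
for hit in results:
    print(hit)
```

Every hit is only a candidate. A column called `FirstName` may well hold test data, so the flagged columns still need a human review.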


The double-edged sword of open-source

I got involved in a discussion about open-source software. A maintainer of an open-source project handed over the reins to another person, who then changed the software to include a coin-mining exploit. This got me thinking about the double-edged sword of open source.

Where did open-source originate from?

A little history lesson about open-source projects. Open-source came to be in 1998, when it was developed after Netscape’s announcement that the source code for Navigator was going to be publicly released. The term gained more momentum during the Freeware Summit organized by Tim O’Reilly in April 1998. The highlight was the Open Source Summit, which is known as the birth of open-source.


T-SQL Tuesday #89: The times they are a-changing


This month’s T-SQL Tuesday is inspired by the blog post Will the Cloud Eat My DBA Job? by Kendra Little.

Technology has changed a lot in the past years, especially with cloud/globalization/automation. What impact has this had on your job? Do you feel endangered? Over the years I’ve seen a lot of things change with SQL Server.

I remember that somebody told me when I started with SQL Server 2000 that T-SQL was never going to be a big thing. How wrong was that guy, right?!


Deterministic masking with dbatools

The dbatools module recently got a couple of new commands that let users mask data in their databases. One feature that had not yet been added to the masking commands was deterministic masking.

What is deterministic masking

Deterministic masking is the process of replacing a value in a column with the exact same replacement value across tables. For example, a database has multiple tables with a column that holds first names. With deterministic masking, a given first name will always be replaced with the same value, wherever it appears.
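The principle can be illustrated with a small Python sketch. This is a conceptual example under stated assumptions, not the way the dbatools masking commands work: the replacement pool, the salt, and the `mask_first_name` function are hypothetical, and hash-based selection is just one simple way to get a stable mapping.

```python
import hashlib

# Hypothetical pool of replacement values for illustration.
REPLACEMENTS = ["Alex", "Sam", "Robin", "Jamie", "Morgan", "Casey"]


def mask_first_name(value, salt="demo-salt"):
    """Map a first name to a replacement; the same input always gives the same output."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).digest()
    # Use the first four bytes of the hash to pick a slot in the pool.
    index = int.from_bytes(digest[:4], "big") % len(REPLACEMENTS)
    return REPLACEMENTS[index]


# Two hypothetical tables that both contain a first-name column.
customers = ["Anna", "Peter", "Anna"]
employees = ["Peter", "Anna"]

masked_customers = [mask_first_name(name) for name in customers]
masked_employees = [mask_first_name(name) for name in employees]

print(masked_customers)
print(masked_employees)
```

Every occurrence of "Anna" ends up with the same replacement in both lists, which is the property that keeps masked data consistent across tables. With a pool this small, two different names can land on the same replacement, so a real setup would need a much larger pool or a stored lookup table.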


T-SQL Tuesday #110: Deterministic masking with dbatools

The dbatools module recently got a couple of new commands that let users mask data in their databases. One feature that had not yet been added to the masking commands was deterministic masking.

What is deterministic masking

Deterministic masking is the process of replacing a value in a column with the exact same replacement value across tables.

For example, a database has multiple tables with a column that holds first names. With deterministic masking, a given first name will always be replaced with the same value, wherever it appears.
