In a recent independent poll of people attending a data privacy seminar, almost 50% of the respondents claimed that their company had experienced a data breach in the past 2 years. This is quite alarming, don’t you think? It’s no secret that the concern for data privacy has grown exponentially in the last few years.
This growing concern has spawned a number of privacy regulations, including but not limited to HIPAA, GLBA, the FTC Act, Safe Harbor, PIPEDA (in Canada), and the Australian Privacy Act. In fact, the FTC (Federal Trade Commission) has recently pursued a number of companies because of their insufficient practices around protecting sensitive consumer data (e.g., credit card numbers). In almost all cases, when the dust settles, these companies have involuntarily signed themselves up for a 10- to 20-year sentence of mandatory biennial security audits!
Another byproduct of the growing data privacy concern is the creation of the Generally Accepted Privacy Principles (GAPP – sometimes pronounced “gap-p”). You may already be familiar with the Generally Accepted Accounting Principles (GAAP – sometimes pronounced “gap”) that accountants use when building and auditing financial records. Although the new Privacy Principles come out of the same group (the American Institute of Certified Public Accountants, or AICPA), they should not be confused with the more familiar accounting-based rules. These principles were created specifically to help companies keep their data privacy practices under control.
GAPP is quickly becoming the de facto standard framework for privacy control, so I’d like to use it to discuss how we might architect data system solutions that support compliance with the framework and reduce your company’s risk of data privacy violations.
GAPP has a total of 66 criteria split across 10 principles:
- Management
- Notice
- Choice and Consent
- Collection
- Use and Retention
- Access
- Disclosure to Third Parties
- Security for Privacy
- Quality
- Monitoring and Enforcement
Of particular interest to me today is Principle 8 – Security for Privacy. According to GAPP, “
- What makes a piece of data sensitive?
- How do I know who’s authorized to see the data?
- How do I know if the data has been viewed?
It’s important to avoid getting ahead of yourself in the solutioning here. Stay focused on the definition of a violation, even though you know there’s more thought work right around the corner.
For now, we’ll say that a piece of data is sensitive when it has a SENSITIVE_FLAG set. Of course, this implies that you have a master data dictionary available in your database that catalogs all of the data points. This is a good idea anyway, and will serve you well in your data governance efforts.
So, imagine a metadata table that has the following columns:
- DATA_POINTS
  - DATA_POINT_ID
  - DATA_POINT
  - DATA_TYPE
  - DATA_LENGTH
  - DESCRIPTION
  - SENSITIVE_FLAG
As noted above, our SENSITIVE_FLAG was added to aid our data privacy solution, so if this is set to ‘Y’ then we know we need to monitor it for violations. Here’s a sample row in that table:
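As a minimal sketch of that metadata table, the following uses Python’s built-in sqlite3 module. The column names come from the list above; the column types, the CHECK constraint, and both rows are assumptions made up for illustration, not values from a real system.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Column names follow the article; the types and constraint are assumptions.
conn.execute(
    """
    CREATE TABLE DATA_POINTS (
        DATA_POINT_ID  INTEGER PRIMARY KEY,
        DATA_POINT     TEXT NOT NULL,
        DATA_TYPE      TEXT NOT NULL,
        DATA_LENGTH    INTEGER,
        DESCRIPTION    TEXT,
        SENSITIVE_FLAG TEXT NOT NULL DEFAULT 'N'
            CHECK (SENSITIVE_FLAG IN ('Y', 'N'))
    )
    """
)

# Hypothetical rows, invented for illustration only.
conn.executemany(
    "INSERT INTO DATA_POINTS VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "CREDIT_CARD_NUMBER", "VARCHAR", 16, "Customer credit card number", "Y"),
        (2, "ORDER_STATUS", "VARCHAR", 10, "Current status of an order", "N"),
    ],
)

# The data points that need to be monitored for violations.
sensitive_points = [
    row[0]
    for row in conn.execute(
        "SELECT DATA_POINT FROM DATA_POINTS "
        "WHERE SENSITIVE_FLAG = 'Y' ORDER BY DATA_POINT_ID"
    )
]
print(sensitive_points)
```

Restricting the flag to ‘Y’ or ‘N’ is a design choice in this sketch: it keeps a typo from silently hiding a sensitive data point from monitoring.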
Now that we have that solved, what should we do about authorized viewing? In my mind, authorization is determined by the person viewing it. Not just anybody can view sensitive data, so there must be rules around who can see a sensitive piece of data. Although the inclination might be to use roles, I wouldn’t rely solely on them. I would tag the individual in the viewing event, then make a second pass to understand if that individual was authorized to view the data. You can always use role as a justifier, but keep the grain at the individual level.
Which leads us to the viewing event. You will need to capture the fact that this data was viewed. This may be easy, or this may be hard; of course, it depends on your transactional system’s architecture. The thought process is pretty straightforward, however. Once again, there needs to be a catalog of where each data point lives on each screen that can be viewed, and there needs to be a mechanism that records when each screen is viewed. The catalog can live outside of the transactional system, but the mechanism that records a screen viewing must be embedded in there somehow. If your system does this already, that’s great; just downstream the data from there. If not, you may have to use a logging or triggering mechanism to store this information in your database. While you’re there, keep in mind that, at a minimum, you need to know the screen that was viewed, who was viewing it, and the date and time of the viewing.
Once we have this data, I think we’re in the clear. Although the transactional system is capturing every screen viewing, I would only downstream screen views where sensitive data was involved. Of course, you can do this by leveraging your new SENSITIVE_FLAG column in your metadata table. Here’s the table our first transformation produces:
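One way to picture the catalog and the recording mechanism is the sketch below, again using Python’s sqlite3 module. The table names SCREEN_DATA_POINTS and SCREEN_VIEWS, the function record_screen_view, and the screen names and employee ID are all hypothetical; a real transactional system would hook this into its own screen-rendering or logging layer.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical catalog: which data points appear on which screen.
    CREATE TABLE SCREEN_DATA_POINTS (
        SCREEN        TEXT NOT NULL,
        DATA_POINT_ID INTEGER NOT NULL,
        PRIMARY KEY (SCREEN, DATA_POINT_ID)
    );
    -- Hypothetical log: one row per screen viewing.
    CREATE TABLE SCREEN_VIEWS (
        VIEWING_ID   INTEGER PRIMARY KEY AUTOINCREMENT,
        VIEWING_DATE TEXT NOT NULL,
        SCREEN       TEXT NOT NULL,
        EMPLOYEE_ID  INTEGER NOT NULL
    );
    """
)


def record_screen_view(conn, screen, employee_id, viewed_at=None):
    """Store the minimum facts about a viewing: screen, viewer, date and time."""
    viewed_at = viewed_at or datetime.now(timezone.utc)
    cur = conn.execute(
        "INSERT INTO SCREEN_VIEWS (VIEWING_DATE, SCREEN, EMPLOYEE_ID) "
        "VALUES (?, ?, ?)",
        (viewed_at.isoformat(), screen, employee_id),
    )
    return cur.lastrowid


# Made-up catalog entries and one made-up viewing event.
conn.executemany(
    "INSERT INTO SCREEN_DATA_POINTS VALUES (?, ?)",
    [("BILLING_DETAIL", 1), ("ORDER_SUMMARY", 2)],
)
viewing_id = record_screen_view(conn, "BILLING_DETAIL", employee_id=1001)
logged = conn.execute(
    "SELECT SCREEN, EMPLOYEE_ID FROM SCREEN_VIEWS WHERE VIEWING_ID = ?",
    (viewing_id,),
).fetchone()
```

Note that the log records the individual employee rather than a role, which keeps the grain at the individual level as described earlier.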
- SENSITIVE_DATA_VIEWS
  - VIEWING_ID
  - VIEWING_DATE
  - SCREEN
  - EMPLOYEE_ID
  - DATA_POINT_ID
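A sketch of that first transformation might look like the following. It assumes a screen-view log (SCREEN_VIEWS) and a screen catalog (SCREEN_DATA_POINTS), both hypothetical names, and joins them to DATA_POINTS so that only views involving a data point flagged ‘Y’ flow downstream. All rows are invented for illustration, and DATA_POINTS is trimmed to the columns the join needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Trimmed-down source tables; names other than DATA_POINTS are hypothetical.
    CREATE TABLE DATA_POINTS (
        DATA_POINT_ID INTEGER PRIMARY KEY, DATA_POINT TEXT, SENSITIVE_FLAG TEXT);
    CREATE TABLE SCREEN_DATA_POINTS (SCREEN TEXT, DATA_POINT_ID INTEGER);
    CREATE TABLE SCREEN_VIEWS (
        VIEWING_ID INTEGER PRIMARY KEY, VIEWING_DATE TEXT,
        SCREEN TEXT, EMPLOYEE_ID INTEGER);

    -- Made-up sample data.
    INSERT INTO DATA_POINTS VALUES
        (1, 'CREDIT_CARD_NUMBER', 'Y'),
        (2, 'ORDER_STATUS', 'N');
    INSERT INTO SCREEN_DATA_POINTS VALUES
        ('BILLING_DETAIL', 1),
        ('BILLING_DETAIL', 2),
        ('ORDER_SUMMARY', 2);
    INSERT INTO SCREEN_VIEWS VALUES
        (1, '2024-01-15T09:30:00', 'BILLING_DETAIL', 1001),
        (2, '2024-01-15T09:45:00', 'ORDER_SUMMARY', 1002);

    -- Keep only viewings of screens that show a sensitive data point.
    CREATE TABLE SENSITIVE_DATA_VIEWS AS
    SELECT v.VIEWING_ID, v.VIEWING_DATE, v.SCREEN, v.EMPLOYEE_ID, d.DATA_POINT_ID
    FROM SCREEN_VIEWS v
    JOIN SCREEN_DATA_POINTS s ON s.SCREEN = v.SCREEN
    JOIN DATA_POINTS d ON d.DATA_POINT_ID = s.DATA_POINT_ID
    WHERE d.SENSITIVE_FLAG = 'Y';
    """
)

sensitive_views = conn.execute(
    "SELECT * FROM SENSITIVE_DATA_VIEWS ORDER BY VIEWING_ID"
).fetchall()
print(sensitive_views)
```

In this sketch a single viewing yields one row per sensitive data point on the screen, so VIEWING_ID alone is not unique in SENSITIVE_DATA_VIEWS; the pair of VIEWING_ID and DATA_POINT_ID is.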
Then, using our authorization formulas which key on EMPLOYEE_ID, we can build a violations table:
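As a rough sketch of that second pass, the code below stands in for the authorization formulas with a hypothetical AUTHORIZATIONS lookup table keyed on EMPLOYEE_ID and DATA_POINT_ID, with the role recorded only as a justifier. The VIOLATIONS table name, its columns, and all rows are assumptions for illustration; real authorization rules may be considerably richer than a lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE SENSITIVE_DATA_VIEWS (
        VIEWING_ID INTEGER, VIEWING_DATE TEXT, SCREEN TEXT,
        EMPLOYEE_ID INTEGER, DATA_POINT_ID INTEGER);

    -- Hypothetical stand-in for the authorization formulas:
    -- grain is the individual employee; the role only justifies the grant.
    CREATE TABLE AUTHORIZATIONS (
        EMPLOYEE_ID INTEGER, DATA_POINT_ID INTEGER, JUSTIFYING_ROLE TEXT,
        PRIMARY KEY (EMPLOYEE_ID, DATA_POINT_ID));

    -- Made-up sample data: employee 1001 is authorized, employee 1003 is not.
    INSERT INTO SENSITIVE_DATA_VIEWS VALUES
        (1, '2024-01-15T09:30:00', 'BILLING_DETAIL', 1001, 1),
        (3, '2024-01-15T10:05:00', 'BILLING_DETAIL', 1003, 1);
    INSERT INTO AUTHORIZATIONS VALUES (1001, 1, 'BILLING_CLERK');

    -- A violation is a sensitive view with no matching authorization.
    CREATE TABLE VIOLATIONS AS
    SELECT v.VIEWING_ID, v.VIEWING_DATE, v.SCREEN, v.EMPLOYEE_ID, v.DATA_POINT_ID
    FROM SENSITIVE_DATA_VIEWS v
    LEFT JOIN AUTHORIZATIONS a
           ON a.EMPLOYEE_ID = v.EMPLOYEE_ID
          AND a.DATA_POINT_ID = v.DATA_POINT_ID
    WHERE a.EMPLOYEE_ID IS NULL;
    """
)

violations = conn.execute(
    "SELECT VIEWING_ID, EMPLOYEE_ID, DATA_POINT_ID FROM VIOLATIONS"
).fetchall()
print(violations)
```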
This would highlight any violations caused by unauthorized viewing of sensitive data.
I would continue down this line of solutioning until my privacy solution was complete. Of course, this only protects you from application-level risks. It does nothing to protect you against a savvy technical person who can simply bypass the application layer. I’ll save that discussion for a later date. That said, it’s still very much worth pursuing, as it will reduce your risk and demonstrate that you are committed to taking measures to control data privacy.
I know this only scratches the surface of data privacy, but I hope it gives you a platform to build from, and highlights the importance of proving your innocence when it comes to data privacy auditing. Data privacy is a huge concern, and if your company isn’t talking about it yet, please urge them to start.