It’s Time to Organize: How to Inventorize Your Mess of Data
When was the last time you tried to inventorize your closet?
Inventorizing (or organizing, in plain speak) your closet can be a frightening proposition. Lots of clothes, all in the wrong places, some in boxes with labels, some in boxes without labels, some labeled incorrectly altogether.
Oh, and don't forget about the clothes that made their way into your kid’s closet. How did they get there? And that poncho — which doesn't even resemble something a human could wear — tucked away in your linen closet. Let's not even mention the clothes that are, for some unknowable reason, bundled up in the trunk of your car. When you need to find something specific in this mess, you’re in real trouble.
Taking Inventory of Your Data
When was the last time you tried to inventorize your data?
Though a whole lot of brainpower has, of late, been devoted to a little issue called the coronavirus, it’s thankfully winding down. That means it’s time to return to our regularly scheduled program, “Oh Man, It's Almost Time for CCPA.”
Yes, you might have forgotten about the California Consumer Privacy Act (CCPA), which went into effect on Jan. 1 and will be enforced starting July 1. You need to make sure you're CCPA compliant if you serve residents of the Golden State, regardless of where your company is located, and your business meets any one of these thresholds: annual revenue of more than $25 million, data held on 50,000 or more consumers, or 50% or more of annual revenue earned from selling data.
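For a quick self-check, the applicability test can be sketched in a few lines of Python. This is a minimal sketch, not a legal determination: it assumes the thresholds are alternatives (meeting any one is enough) and counts the data-sale share from 50% upward. The `BusinessProfile` class, its field names and the `ccpa_likely_applies` function are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class BusinessProfile:
    """Hypothetical summary of the figures the thresholds refer to."""
    annual_revenue_usd: float
    consumers_with_data: int
    revenue_share_from_selling_data: float  # fraction between 0.0 and 1.0
    serves_california_residents: bool


def ccpa_likely_applies(profile: BusinessProfile) -> bool:
    """Return True if the profile meets at least one threshold.

    Assumption: the thresholds are alternatives, and none of them matter
    unless the business serves California residents.
    """
    if not profile.serves_california_residents:
        return False
    return (
        profile.annual_revenue_usd > 25_000_000
        or profile.consumers_with_data >= 50_000
        or profile.revenue_share_from_selling_data >= 0.5
    )
```

Under these assumptions, a small company with modest revenue that holds records on 60,000 consumers would still be flagged, because a single threshold is enough.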
Being compliant means that when Sandy from San Diego submits a Subject Rights Request (SRR), asking you to ...
- Tell her exactly what you've collected about her.
- Delete everything you hold on her.
- Prevent her data from being sold.
- Provide her with copies of her records.
... you are able to comply with her requests within 45 days of receipt.
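To make the 45-day window concrete, here is a minimal sketch of the date arithmetic. It assumes the clock starts on the day the request is received and counts calendar days; the function names are hypothetical and chosen only for this example.

```python
from datetime import date, timedelta

# Response window described above: 45 days from receipt.
RESPONSE_WINDOW_DAYS = 45


def response_due_date(received_on: date) -> date:
    """Return the last day to respond, counting calendar days from receipt."""
    return received_on + timedelta(days=RESPONSE_WINDOW_DAYS)


def days_remaining(received_on: date, today: date) -> int:
    """Return how many days are left; a negative value means overdue."""
    return (response_due_date(received_on) - today).days


# Example: a request received on July 1, 2020 is due on August 15, 2020.
print(response_due_date(date(2020, 7, 1)))
```

Tracking `days_remaining` for every open request is one simple way to see which ones are at risk of running past the deadline.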
But in order to fulfill her request, you have to know where all that data resides. This means all data: structured and unstructured, known and unknown, in motion and at rest. And this is where things get tricky if you haven't established a robust data inventory process. According to Gartner, 80% of data is unstructured and thus is not where you’d expect it to be. Data is likely to be stored in odd places like legacy databases, unstructured text files, structured data files, archive files, audio files, video files, etc. So in this mess, how can you possibly find what you're looking for, let alone create a semblance of structure for the future?
Dynamic Data Is Messy
Most compliance solutions protect areas that are known to contain sensitive data. But data usage is dynamic and it evolves over time. For example, last week, all you had on file for Sandy was her name and email address. Today, she made a purchase, so now you hold new sensitive data on her. This new data needs to be discovered, mapped, tracked and aggregated. With each consumer interaction, the personal information you hold changes, and yesterday's network isn't the same as today’s.
What this all adds up to is that it's impossible to account for all the various places your data might have been scattered to — and yet, in order to comply with CCPA, and to fulfill Sandy’s request in a timely manner, you need to do exactly that.
Finding It All When the Subject Rights Requests Come Knocking
Remember our closet, desperately in need of organizing?
There you are, surrounded by a sea of shirts, pants, socks and other semi-square pieces of cloth. Order is long overdue, but it's overwhelmingly difficult to put into practice when you don’t know which boxes are marked properly, or if the one that says “winter shirts” might, in actuality, contain ancient baby clothes. You need something to help you clean up the mess by automatically finding all the bags, boxes and piles, even if they are in the least likely of places. Even if they aren't stored in the right format. And even if they're items you forgot you had, like that holiday sweater from Aunt Tildy.
Now back to that Subject Rights Request: To fulfill Sandy’s SRR, you must be able to locate all the data you hold on her, whether structured or unstructured, known or unknown, in motion or at rest, regardless of where it's located and even if you didn't know you were holding it.
To do this effectively, you need to automatically discover, map, track and aggregate sensitive data and its flow. By constantly monitoring and analyzing traffic, you can determine how sensitive data is processed and build a complete picture of how your data is stored, processed and shared in real time. This will help you maintain compliance and identify and reduce risks before they become real problems.
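As a toy illustration of the discover-and-aggregate idea, the sketch below scans a handful of text sources for email addresses and records where each one turns up. It is a minimal sketch under loud assumptions: the sources are small in-memory strings with made-up names, an email address stands in for a person's identity, and a simple regular expression stands in for real discovery tooling, which would also need to handle databases, archives, media files and data in motion.

```python
import re
from collections import defaultdict

# Deliberately simple email pattern; an assumption for this sketch,
# not a complete address validator.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def build_inventory(sources):
    """Map each discovered email address to the set of locations holding it."""
    inventory = defaultdict(set)
    for location, text in sources.items():
        for address in EMAIL_PATTERN.findall(text):
            inventory[address.lower()].add(location)
    return inventory


def locations_for(inventory, email):
    """Return a sorted list of every location known to hold this address."""
    return sorted(inventory.get(email.lower(), set()))


# Hypothetical sources: one structured export, two unstructured files.
sources = {
    "crm_export.csv": "name,email\nSandy,sandy@example.com\n",
    "support_notes.txt": "Called Sandy (SANDY@example.com) about her order.",
    "legacy_orders.log": "order=1182 customer=pat@example.com",
}

inventory = build_inventory(sources)
print(locations_for(inventory, "sandy@example.com"))
```

Rebuilding or updating the inventory whenever a source changes is what keeps the picture current; a one-time scan would miss the purchase Sandy makes tomorrow.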
Your End Goal: A Dynamic and Accurate View of Data
In compliance and in closet-keeping, continuously maintaining a dynamic and accurate view of what you have is critical to extracting what you need in a reasonable window of time. In compliance, it will also help you avoid the fines and potential lawsuits that come along with failing to fulfill requests in time (let’s hope no one’s planning on suing you for failing to find their favorite dress shirt in a timely fashion). The right tools not only contain the mess, they make it possible to create a data program in which you can effortlessly keep track of all the personal information (PI) you hold on your customers.
About the Author
Itzhak Assaraf is CTO and cofounder of 1touch.io and has more than 20 years of experience in all aspects of technology, software, network, security and hardware. Prior to joining 1touch.io, he spent a decade running a successful software house that specialized in startup innovation and solving enterprise challenges using technology.