Unstructured data is any data that exists beyond the scope of an organization’s applications or databases. It includes Word documents, audio files, videos, photos, webpages, presentations, and so on. The amount of unstructured data that companies store has exploded in recent years due to the rapid increase in storage capabilities. Many companies still do not practice any form of unstructured data governance, largely because they are already struggling to govern their structured data, but more of them are at least starting to recognize the value of this type of data.
As companies grapple with a barrage of complex and varied types of data, they are seeking new and improved methods to help them get their house in order. A number of tools are available to help them process and store vast amounts of unstructured data. Such tools typically provide dashboards, data mining features, and searching and indexing features. There are also hardware solutions available.
Discovery and Classification
The main problem companies have in keeping track of unstructured data is that they don’t know where to start: they often don’t know what they are looking for or how to accurately assess the value of the data they hold. It is not necessary, however, to govern every piece of unstructured data that exists, only the most important types. It is therefore very important that you classify your data based on a set of pre-defined categories, which typically include public, internal and restricted data. Of course, you will first need to discover your sensitive data, and then implement a data classification policy that outlines the objectives, workflows, categories and data owners, and details how the data should be handled.
A number of commercial tools are available to help you discover your sensitive data. Such tools can discover data in network file shares, SharePoint, Dropbox, OneDrive and so on. They often include features that enable you to identify the value of the data, classify the data based on its value, and apply protection measures to the sensitive data (PCI, PII, PHI) based on certain properties. They typically let you quarantine or flag files that are stored in a manner which may present a security threat, and they usually come with built-in analytics and reports. Due to the sheer amount of unstructured data that typically resides on a corporate network, you will also need a capable suite of auditing tools. In order to sufficiently protect your sensitive data, there are certain questions you will need to ask, which include:
- Who has access to what files, and what privileges do they have?
- Who has been viewing, modifying and deleting these files?
- Why and when were these files accessed, modified or deleted?
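To make the questions above concrete, here is a minimal Python sketch of how the "who" and "when" questions might be answered from a list of file access events. The `AccessEvent` record, its field names, and the sample data are all hypothetical and do not reflect the output of any particular auditing product. Note that a log of this kind cannot answer "why" a file was accessed; that usually requires follow-up with the user or the data owner.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccessEvent:
    """Hypothetical audit record; real audit logs will have different fields."""
    user: str
    path: str
    action: str  # "read", "modify" or "delete"
    when: datetime


def users_who(events, path, actions):
    """Return the sorted names of users who performed any of `actions` on `path`."""
    return sorted({e.user for e in events if e.path == path and e.action in actions})


def activity_by_file(events):
    """Group events by file path, oldest first, as (when, user, action) tuples."""
    grouped = defaultdict(list)
    for e in sorted(events, key=lambda e: e.when):
        grouped[e.path].append((e.when, e.user, e.action))
    return dict(grouped)


# Made-up sample data for illustration only.
SAMPLE_EVENTS = [
    AccessEvent("alice", "hr/payroll.xlsx", "read", datetime(2024, 1, 5, 9, 30)),
    AccessEvent("bob", "hr/payroll.xlsx", "modify", datetime(2024, 1, 5, 11, 0)),
    AccessEvent("bob", "hr/payroll.xlsx", "delete", datetime(2024, 1, 6, 8, 15)),
    AccessEvent("carol", "sales/leads.docx", "read", datetime(2024, 1, 4, 14, 0)),
]

if __name__ == "__main__":
    print(users_who(SAMPLE_EVENTS, "hr/payroll.xlsx", {"modify", "delete"}))
    for path, timeline in activity_by_file(SAMPLE_EVENTS).items():
        print(path, [(when.isoformat(), user, action) for when, user, action in timeline])
```

The "who has access" question is a permissions question rather than an activity question, so it would need a separate source of data (for example, an export of file and folder permissions) that this sketch does not model.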
So, Where Do You Start?
You might think that implementing a data governance program will be expensive, time-consuming and resource-intensive, and there are many vendors out there that would have you believe this is the case. Fortunately, you probably already have the tools to get started. The File Classification Infrastructure in File Server Resource Manager (FSRM) is terribly underutilized, and actually provides a pretty powerful method of discovering, tagging and classifying your sensitive data. Simply enter a set of regular expressions related to all manner of PII (you can find a list here), and FSRM will continually scan your files and let you generate reports listing your sensitive data and its relative criticality.
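To illustrate the general idea of regex-based discovery, here is a minimal Python sketch that counts PII-like matches in a piece of text and maps the result to a category. It is not FSRM's implementation or configuration format. The three patterns, the function names, and the rule "any match means restricted, otherwise internal" are assumptions made for the example. Real rule sets are far more extensive and need tuning to avoid false positives.

```python
import re

# Illustrative patterns only (assumptions, not a vetted PII rule set).
PII_PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "card_number": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
}


def count_pii(text):
    """Return {pattern_name: match_count} for every pattern that matches at least once."""
    counts = {}
    for name, pattern in PII_PATTERNS.items():
        hits = len(pattern.findall(text))
        if hits:
            counts[name] = hits
    return counts


def classify(text):
    """Hypothetical rule: any PII-like match makes the text 'restricted', else 'internal'."""
    return "restricted" if count_pii(text) else "internal"


if __name__ == "__main__":
    sample = "Contact bob@example.com, SSN 123-45-6789"
    print(count_pii(sample))
    print(classify(sample))
```

Defaulting to "internal" rather than "public" when nothing matches is a deliberately cautious choice: a pattern scan can suggest that a file is sensitive, but it cannot confirm that a file is safe to publish.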
This is a fantastic place to start. Now you will know where your most sensitive data is. However, it’s only the first step and unfortunately, until Windows builds the functionality in, you will have to look to third parties to make sense of this data. Many vendors will try to charge you extortionate prices to implement data governance solutions, because they believe the discovery and classification element to be very valuable. If you already have this in place through FSRM, then you can look for vendors that integrate with this functionality.
This is where Lepide comes in. Lepide’s File Server Auditing solution has a built-in integration with FSRM that enables you to run reports and set alerts on changes occurring to critical, at-risk data. This means that if a file containing vast amounts of PII is accessed, moved, deleted or modified in any way, you’ll know. You can also see who has permissions to the critical files and folders, and when these permissions change. Best of all, the integration with FSRM comes for free with LepideAuditor, making it a very competitive choice for a Data Access Governance solution. Take a look for yourself!