Last Updated on June 17, 2020 by Ajit Singh
These days, organizations store vast amounts of data. In fact, 65% of companies are collecting too much data and are unable to find the time or the resources to analyze it. Not surprisingly, 54% of organizations don’t know where all of their sensitive data is located. If businesses are unable to identify exactly what data they have and how it is being used, how are they supposed to protect it? The answer is that they can’t.
Data classification is not an area of data security that gets a lot of attention. Yet having a deep understanding of what data we store, where it is located, and who has access to it is paramount to keeping it secure.
Data classification is the process by which we categorize data based on pre-defined levels of sensitivity. It gives us the visibility we need to set up access controls to protect our most critical assets and enables us to keep track of how those assets are being used. For example, financial institutions store large amounts of sensitive information relating to their customers, such as information about income, loans and mortgage applications, as well as less sensitive information such as names, dates of birth and home addresses. Naturally, different types of information will have their own level of sensitivity, and thus different protocols are required to protect them.
While every organization should define categories that relate to their specific needs, most will conform to a common structure. Common categories include protected, sensitive, confidential, and public. Having a classification schema that reflects the type of data we store is crucial to determining how we manage that information.
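As a rough illustration of how such a schema might drive handling rules, here is a minimal Python sketch. The four level names come from the categories listed above; their ordering, the `HandlingPolicy` fields, the role names and the `can_access` helper are all hypothetical choices made for this example, not a prescribed standard.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    """Classification levels; the ordering here is an assumption."""
    PUBLIC = 0
    CONFIDENTIAL = 1
    SENSITIVE = 2
    PROTECTED = 3


@dataclass(frozen=True)
class HandlingPolicy:
    """Hypothetical handling rules attached to a classification level."""
    encrypt_at_rest: bool
    allowed_roles: frozenset


# Illustrative mapping only; real policies depend on the organization.
POLICIES = {
    Sensitivity.PUBLIC: HandlingPolicy(False, frozenset({"anyone"})),
    Sensitivity.CONFIDENTIAL: HandlingPolicy(True, frozenset({"employee", "analyst", "compliance"})),
    Sensitivity.SENSITIVE: HandlingPolicy(True, frozenset({"analyst", "compliance"})),
    Sensitivity.PROTECTED: HandlingPolicy(True, frozenset({"compliance"})),
}


def can_access(role: str, level: Sensitivity) -> bool:
    """Return True if the role may read data classified at this level."""
    policy = POLICIES[level]
    return "anyone" in policy.allowed_roles or role in policy.allowed_roles
```

With this mapping, `can_access("employee", Sensitivity.PROTECTED)` is refused while `can_access("employee", Sensitivity.CONFIDENTIAL)` is allowed, which shows how the classification label, rather than the individual file, determines the access decision.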
There are also legal implications associated with failing to adequately classify one’s data. For example, under the GDPR, EU citizens have elevated rights, which include the right to access, modify, move or delete their personal data. If an organization is unable to fulfil a Subject Access Request (SAR), it may be subject to fines or some other form of disciplinary action. Data belonging to EU citizens could be classified in a way that enables it to be located quickly and efficiently.
Data protection regulations, such as the GDPR, are primarily focused on data privacy. Data privacy relates more to how data is used, as opposed to how the data is secured. For example, under the GDPR, EU citizens have a right to be informed about automated decision making and profiling, and have a right to object if they feel that their privacy is at stake. In this case, data classification can be used to shield certain categories of data from the algorithms used for profiling and decision making.
When it comes to data security, it’s never a good idea to store data that we don’t really need. Yet, as mentioned above, many organizations do. According to the Global Databerg Report published by Veritas, “85% of stored data is either dark or redundant, obsolete, or trivial (ROT)”. When you store vast amounts of unstructured data, and you’re not entirely sure what the data is, who it belongs to, or why it was collected, it’s understandable that organizations are hesitant to hit the delete key. However, with an automated data classification tool, the process of removing redundant data can be greatly simplified. These tools can scan a wide range of file types, such as Word documents and Excel spreadsheets, and classify a wide range of data types, such as social security numbers, protected health information (PHI), and payment card information (PCI). Additionally, after the initial scan, files containing sensitive data can be classified at the point of creation.
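To make the idea of automated scanning more concrete, the following Python sketch shows one simple way text could be checked for two of the data types mentioned above. It is an illustration under stated assumptions, not how any particular product works: the patterns assume US-style social security numbers written as NNN-NN-NNNN and card numbers of 13 to 16 digits, the label names and the `classify_text` function are hypothetical, and commercial tools typically use far more robust detection and parse document formats rather than plain strings.

```python
import re

# Simplified patterns, assumed for illustration only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Check a digit string against the Luhn checksum used by payment cards."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify_text(text: str) -> set:
    """Return the set of (hypothetical) sensitivity labels found in the text."""
    labels = set()
    if SSN_PATTERN.search(text):
        labels.add("SSN")
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):  # the checksum filters out random digit runs
            labels.add("PAYMENT_CARD")
            break
    return labels
```

A file whose extracted text returns an empty set from `classify_text` is a candidate for review as redundant data, while any non-empty result can be used to tag the file for stricter handling. The Luhn check is included because pattern matching alone tends to flag arbitrary long numbers as card data.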
There are many ways in which data classification can help to streamline data security operations, enabling security teams to focus on what is important and saving both time and money. To find out how the Lepide Data Security Platform can help you discover and classify data, protect data, and detect and react to threats, schedule a demo today.