The metaphor of “capture the flag” is ubiquitous in cybersecurity. It epitomizes the adversarial notion of having something of value that an aggressive opponent seeks to take for themselves. There are only a few underlying motives for mounting a cyberattack: disruption/damage/warfare, espionage, and financial gain. Each of these motives involves theft of data to some degree.
Data security is loosely defined as protecting digital data from data breaches and cyberattacks. A breach is unauthorized access to data with the goal of copying or reading the information, while a cyberattack is a more aggressive scenario that causes damage or interruptions, often as a by-product of attempting a data breach.
In this primer we’ll look at various methods for securing and protecting data, as well as technologies available that can make the job easier.
Data Encryption
One of the most common and widely used security controls is data encryption. It’s a foundational technique that ensures data is only readable by the intended audience.
When data is encrypted, it can only be decrypted using a key. Encrypted data is stored as ciphertext, appearing unreadable or scrambled to anyone attempting to read it without first decrypting it.
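The relationship between plaintext, key, and ciphertext can be illustrated with a minimal sketch. It uses a one-time pad (XOR with a random key as long as the message) only because that needs nothing beyond Python's standard library; it is not what production systems use, which would typically be a vetted cipher such as AES from a maintained cryptography library. The function name `xor_bytes` and the sample message are illustrative assumptions.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (one-time pad)."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = b"account=12345; balance=900"       # illustrative message
key = secrets.token_bytes(len(plaintext))        # random key, to be used only once

ciphertext = xor_bytes(plaintext, key)           # scrambled bytes without the key
recovered = xor_bytes(ciphertext, key)           # applying the same key reverses it
```

Without `key`, the bytes in `ciphertext` carry no readable structure; with it, `recovered` matches the original message.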
Encryption of data transmitted over the Internet is the norm, especially for web traffic such as authentication form submissions or any other communication that carries sensitive data like passwords or financial information. If a site uses the HTTPS protocol, it is sending HTTP over an encryption layer: TLS, the successor to the now-deprecated SSL.
The intent is to protect data from being intercepted by other listeners on a network, or a “man in the middle” sniffing traffic between the origin and the destination.
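As a rough illustration of what a client does to resist interception, the sketch below builds a TLS context with Python's standard `ssl` module. It only configures the context and makes no network connection. The choice of TLS 1.2 as the minimum version is an assumption for the example, not a requirement stated in this primer.

```python
import ssl

# Client-side context with the library's secure defaults:
# certificates are validated and the hostname must match.
context = ssl.create_default_context()

# Assumption for this sketch: refuse protocol versions older than TLS 1.2.
context.minimum_version = ssl.TLSVersion.TLSv1_2

certificate_required = context.verify_mode == ssl.CERT_REQUIRED
hostname_checked = context.check_hostname
```

A context like this can then be handed to a client call such as `urllib.request.urlopen(url, context=context)`, so that a server presenting an invalid certificate is rejected rather than silently trusted.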
Full Disk Encryption
Data encryption is also used in storage of data. Full disk encryption (FDE) applies encryption to an entire storage volume or device, requiring a passkey for read or write. This can include data, the operating system, files, and applications.
Since some cybercrimes involve physical access to devices, restricting storage volume reads at the lowest level is a compulsory step. In the event a device is stolen, FDE prevents access to the storage without credentials, and therefore to any data contained therein.
Network storage volumes can also use encryption, as in the case of cloud computing providers such as AWS or Microsoft Azure where logical/virtual storage volumes exist on the same physical hardware.
Before diving in with device encryption, it’s best to establish an encryption policy that will define which data types and devices require encryption.
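One way to picture such a policy is as a simple lookup that answers "does this combination need encryption?". The sketch below is hypothetical throughout: the data types, device types, and the fail-closed default for unknown entries are assumptions chosen for illustration, not a recommended policy.

```python
# Hypothetical encryption policy: which data types and device types require encryption.
ENCRYPTION_POLICY = {
    "data_types": {"customer_pii": True, "financial": True, "public_marketing": False},
    "device_types": {"laptop": True, "usb_drive": True, "kiosk_display": False},
}

def requires_encryption(data_type: str, device_type: str) -> bool:
    """Encrypt when either the data type or the device type calls for it.

    Unknown types default to True, so anything unclassified is encrypted.
    """
    data_rule = ENCRYPTION_POLICY["data_types"].get(data_type, True)
    device_rule = ENCRYPTION_POLICY["device_types"].get(device_type, True)
    return data_rule or device_rule
```

Defaulting unknown types to "encrypt" is a design choice: it means a gap in the policy errs on the side of protection.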
Strategies for handling data
Modern organizations generate and manage massive amounts of data, requiring a measured approach to controlling how data is stored, read, tracked, and in some cases, intentionally destroyed.
This requires a methodology for determining which data should be controlled and how. It begins with identifying the different types of data and classifying them.
It should be said that data security is largely an institutional effort. It requires employees to use processes and tools above and beyond normal data handling when interacting with files, saving new data, and sharing data with internal or external parties. Employees also need to understand the motivation for protecting data so that they become willing participants rather than viewing data security measures as an impediment to be circumvented.
Data Classification
Data classification is the process of analyzing data in order to categorize its contents to determine the degree to which it will be secured or handled.
There are three primary types of data classification that most businesses see as industry standards:
- Content-based
- Context-based
- User-based
In order to identify and control data in an organization, it’s necessary to apply some rules of classification.
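The three classification types listed above can be sketched as a small function. Everything here is an illustrative assumption: the label names, the SSN-like pattern, the application names, and the order in which the rules are applied are not taken from any standard.

```python
import re
from typing import Optional

# Illustrative content rule: a US Social Security number-like pattern.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Illustrative context rule: applications assumed to produce sensitive data.
SENSITIVE_APPS = {"payroll", "hr_portal"}

def classify(text: str, source_app: str, user_label: Optional[str] = None) -> str:
    # User-based: a label chosen by the person handling the data takes priority.
    if user_label:
        return user_label
    # Content-based: inspect the data itself for sensitive patterns.
    if SSN_PATTERN.search(text):
        return "restricted"
    # Context-based: infer sensitivity from where the data came from.
    if source_app in SENSITIVE_APPS:
        return "confidential"
    return "internal"
```

For instance, `classify("SSN 123-45-6789", "email")` is decided by content, while `classify("monthly summary", "payroll")` is decided by context.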
Structured and Unstructured Data
Unchecked data growth and disorganization can dramatically increase cyber risk, including ransomware, breaches, and compliance violations. Data can be categorized as one of two types: structured or unstructured, and each poses specific challenges for defining, categorizing, and storing it.
Structured Data
Structured data is more easily processed and categorized since it is alphanumeric. It is usually table-based, displayed in defined columns and rows and stored in relational databases, where it is accessible to search and analysis tooling. It is easy to search using SQL or other query languages, and trivial to filter and modify based on parameters.
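To make the "easy to search, filter, and modify" point concrete, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The `customers` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT, tier TEXT)"
)
conn.executemany(
    "INSERT INTO customers (name, region, tier) VALUES (?, ?, ?)",
    [("Ada", "EU", "gold"), ("Grace", "US", "silver"), ("Linus", "EU", "silver")],
)

# Filter rows by a parameter.
eu_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customers WHERE region = ? ORDER BY name", ("EU",)
    )
]

# Modify rows by a parameter.
conn.execute("UPDATE customers SET tier = ? WHERE region = ?", ("gold", "EU"))
gold_count = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE tier = 'gold'"
).fetchone()[0]
```

Because every row follows the same columns, a single parameterized query is enough to select or update exactly the records of interest.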
Business organizations have traditionally relied on structured data in the decision-making processes. The collection and analysis of structured data to support business decisions is an ongoing endeavor for the modern enterprise.
Unstructured Data and the Squishy Middle
Unstructured data doesn’t adhere to predefined data models. It can be difficult for an enterprise to parse and digest.
Data of this type generally includes emails, photos, text files, videos, call transcripts, and messaging app conversations, all of which can create a large volume of data. This data is usually stored in its native form, which can be vague, disparate, and not easily managed. This can cause cybersecurity and business continuity challenges for organizations that don’t get it under control.
Having the tools that can analyze the squishy middle and reveal its secrets leads to resources for customer analytics and marketing intelligence that are highly compelling. It also comes in handy for uncovering compliance issues earlier when emails and chatbot conversations are analyzed.
Systems that can store, analyze, and report on data from many different sources are invaluable to business stakeholders.
Data in motion
Data in motion, sometimes also referred to as data in flight or data in transit, is digital information being transported between locations, either within or between computer systems.
Another use of the terms refers to data within a computer’s RAM that is ready to be accessed, read, updated, or processed.
Data is generally categorized as at rest, in motion, or in use:
- Data at rest: corporate files, backup data storage, USB drive data, cloud storage, and file archives.
- Data in motion: email attachments, FTP sites, Wi-Fi, mobile networks, and files being downloaded, synced, or transferred.
- Data in use: files in Office apps (Word, Excel, PowerPoint), PDFs, database apps, CPU data, and memory.
Data sent between devices can be stolen, intercepted, or leaked if not properly secured for transmission.
Another security risk for data in motion is the man-in-the-middle attack. For this reason, data is often encrypted so that anything intercepted cannot be read. The process isn’t perfect but adds a layer of security to data in transit.
There are several methods for encrypting data in motion, including asymmetric encryption, TLS (and its predecessor SSL), HTTPS, and IPsec. TLS is commonly used for data transport between email servers, while HTTPS, an encrypted form of the common HTTP protocol, secures transport between a web server and a client browser. The most common and important scenario for HTTPS is authentication form submissions, where it prevents user credentials from being transported in clear text on a network.
The Role of Data Loss Prevention (DLP)
Security teams work continually to avoid the loss of data in an environment. The collection of technologies used in this effort is referred to as DLP. When implemented, DLP protects data in a few places:
- Data in use by authorized personnel
- Data in motion (being transferred over the network)
- Data at rest
Data loss prevention tools can prevent users from attempting to move or copy data to a location outside of a business’s network.
Central to DLP software and tools is the inspection of content. As data moves around the network, DLP tools evaluate the files that contain it to catch policy violations, determining where the data should be and whether the intended usage is correct. There are several methods to accomplish this:
1 Rule-based expressions trigger additional actions when they match. For example, if a rule has been set that blocks emailing credit card numbers that include the CVV code and expiration date, DLP software will prevent the email from being sent or automatically encrypt it.
2 Exact file matching, also known as data fingerprinting, flags content that exactly matches an already indexed file, whether at rest or in motion depending on the usage.
3 Conceptual or lexicon analysis uses a compilation of lists or dictionaries to identify undesirable behavior, such as specific types of internet queries or sharing confidential trade information with outside parties.
4 Statistical analysis techniques use machine learning to keep specific information protected. As the model learns the pattern of what the data looks like, it continually searches for anomalous data that doesn’t match the pattern.
Sensitive data can be put at risk by either accidental exposure or malicious activity, which makes DLP security critical to protecting an organization’s data assets.
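The first three inspection methods described above can be sketched in a few lines. This is a simplified illustration, not how any particular DLP product works: the card-number pattern with a Luhn checksum, the use of SHA-256 as the file fingerprint, the sample indexed file, and the lexicon terms are all assumptions. CVV and expiration checks, and the statistical method, are left out.

```python
import hashlib
import re
from typing import List

# Rule-based expression: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")

def luhn_valid(candidate: str) -> bool:
    """Luhn checksum, used to discard digit runs that are not card numbers."""
    digits = [int(c) for c in candidate if c.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:          # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

def fingerprint(content: bytes) -> str:
    """Exact file matching: identical bytes always yield the same SHA-256 digest."""
    return hashlib.sha256(content).hexdigest()

# Hypothetical index of protected files and hypothetical lexicon of flagged terms.
INDEXED_FINGERPRINTS = {fingerprint(b"Q3 acquisition plan\n")}
LEXICON = ("trade secret", "internal only", "do not distribute")

def inspect(text: str, attachment: bytes = b"") -> List[str]:
    """Return the names of the policy checks that the content trips."""
    violations = []
    if any(luhn_valid(m.group()) for m in CARD_PATTERN.finditer(text)):
        violations.append("card_number")
    if attachment and fingerprint(attachment) in INDEXED_FINGERPRINTS:
        violations.append("indexed_file")
    lowered = text.lower()
    if any(term in lowered for term in LEXICON):
        violations.append("lexicon")
    return violations
```

A DLP tool would act on the returned list, for example by blocking the message or encrypting it, according to policy. Note that exact file matching only catches byte-for-byte copies; changing a single character produces a different fingerprint.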
Egress Points
Data egress is the process of data leaving a network as it is transferred outside the boundaries of the environment. It can expose private data to unintended or unauthorized recipients.
Egress points can include outbound emails, files going to cloud storage, or messaging apps. When data enters the network from an external source, it is called ingress.
Cybercriminals often use egress traffic to steal data, relying on backdoor trojans or social engineering and disguising the transfer as regular network traffic. Egress filtering helps prevent criminals from exfiltrating data surreptitiously.
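At its simplest, egress filtering is an allowlist check on outbound connections. The sketch below uses Python's standard `ipaddress` module; the networks (taken from the reserved documentation address ranges) and ports are placeholders, and real filtering is normally enforced on firewalls or proxies rather than in application code.

```python
import ipaddress

# Hypothetical allowlist of approved outbound destinations: (network, port).
ALLOWED_DESTINATIONS = [
    (ipaddress.ip_network("203.0.113.0/24"), 443),   # stand-in for an approved cloud service
    (ipaddress.ip_network("198.51.100.10/32"), 25),  # stand-in for an outbound mail relay
]

def egress_allowed(dest_ip: str, dest_port: int) -> bool:
    """Permit outbound traffic only to destinations on the allowlist."""
    address = ipaddress.ip_address(dest_ip)
    return any(
        address in network and dest_port == port
        for network, port in ALLOWED_DESTINATIONS
    )
```

Anything not explicitly approved is denied, which is what makes it harder to hide exfiltration inside ordinary-looking outbound traffic.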
DLP Security Policies
Defining how a business can protect and share data is part of a robust data loss prevention policy. It sets out guidelines for how data will be used in decision making while ensuring only authorized parties have access to it.
A data loss prevention policy helps businesses protect themselves and block unauthorized access risks. While no measures are bulletproof, there are some best practices that can help ensure your DLP policy is successful.
- Identify the data that needs to be protected. Often this is done by classifying based on risk and vulnerability factors.
- Set up guidelines that will be used to evaluate DLP vendors if the business will be using them.
- Clearly define the roles of staff who will be involved in the DLP process. Establishing responsibilities early on can help prevent misuse.
- Start with simplicity in mind. Choose which data or risk will be the focus of the policy. The goal should be securing the most critical data.
- Ensure that each department head has a role in shaping the DLP policy.
- Educate everyone in the business environment about why and how the DLP policy has been established.
- Document the DLP process carefully with a written policy that focuses on the data being protected.
Many data privacy laws are already established, and further legal requirements are pending. Some of these require a policy to state where it will be enforced, the conditions or parameters under which it applies, and clearly defined actions that will be taken to prevent data loss.
Encryption Keys
To encrypt and decrypt data, an encryption key is used. A key is a random string of bits whose length depends on the algorithm: symmetric ciphers such as AES typically use 128 or 256 bits, while asymmetric RSA keys are most commonly 2048 bits and range up to 4096 (1024-bit RSA keys still exist but are no longer considered adequate). Cryptographic algorithms are used to create them, making each key difficult to crack due to its unpredictability. It also means an organization’s encryption keys are high-value items.
A widely recommended way to secure them is to host your own encryption keys. Top reasons to consider this include:
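As a small illustration of a key being "a random string of bits", the sketch below draws one from the operating system's secure random source using Python's `secrets` module. The 256-bit length is an assumption for the example; it is a size commonly used with symmetric ciphers such as AES.

```python
import secrets

KEY_BITS = 256                                # assumed key size for this sketch
key = secrets.token_bytes(KEY_BITS // 8)      # 32 bytes of cryptographically secure randomness

key_length_bits = len(key) * 8
key_hex = key.hex()                           # hex view of the key, different on every run
```

The general-purpose `random` module is not suitable for this job, since its output is predictable; `secrets` exists precisely for values like keys and tokens.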
- Better control over access
- Easier compliance with data regulations and legislation
Proper encryption key management is critical in ensuring this linchpin of an organization’s security is available and safe from compromise at all times.
Why are HSMs Critical?
Businesses that provision encryption keys rely on a hardware security module (HSM) appliance to provide an extra layer of security. For example, on a firewall, the master key can be encrypted using an encryption key that is stored on an HSM. The firewall then asks the HSM to decrypt the master key whenever it needs that key to decrypt a password or private key held on the firewall.
Other use cases include keeping transactions and identities separate from other functions on a network. Another example is restricting access to trade secrets by leveraging cryptographic key transfers to ensure only authorized people have access.
HSMs also help to manage all phases of a key’s lifecycle. This includes these steps:
- Provisioning
- Backup and storage
- Deployment
- Management
- Archiving
- Disposal
In essence the hardware security module manages the encryption and decryption process to ensure the keys are protected.
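The lifecycle phases listed above can be modeled as a simple state machine. This is only a conceptual sketch: treating the phases as a strict one-way sequence is an assumption made for illustration, the `ManagedKey` class is hypothetical, and a real HSM exposes its own vendor-specific interface rather than anything like this.

```python
from enum import Enum

class KeyPhase(Enum):
    PROVISIONING = 1
    BACKUP_AND_STORAGE = 2
    DEPLOYMENT = 3
    MANAGEMENT = 4
    ARCHIVING = 5
    DISPOSAL = 6

class ManagedKey:
    """Tracks which lifecycle phase a key is in (hypothetical helper)."""

    def __init__(self, key_id: str) -> None:
        self.key_id = key_id
        self.phase = KeyPhase.PROVISIONING
        self.history = [self.phase]          # audit trail of phases visited

    def advance(self) -> KeyPhase:
        """Move to the next phase; a disposed key can never be revived."""
        if self.phase is KeyPhase.DISPOSAL:
            raise ValueError(f"key {self.key_id} has already been disposed of")
        self.phase = KeyPhase(self.phase.value + 1)
        self.history.append(self.phase)
        return self.phase
```

Recording every transition in `history` reflects the point that key management is as much about being able to account for a key's past as about protecting it in the present.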
Assets and Inventory in Data Security
Organizations don’t need to look far to uncover the value in asset and inventory management and how it relates to data security. It helps to empower the security team and provides them with the visibility needed to mitigate threats proactively.
The primary benefits include:
- Traceable endpoint and user account correlation
- Proactive responses
- Security visibility
Assets and inventory put organizations in a stronger position to identify and respond to security risks. While it’s only one component of a robust security strategy, without it proactive security operations are impossible.
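The benefit of correlating endpoints with user accounts can be shown with a toy example. The inventory records, alert records, and field names below are all invented for illustration; a real deployment would pull them from an asset management system and a security monitoring tool.

```python
from typing import Dict, List, Tuple

# Hypothetical asset inventory and security alerts.
inventory = [
    {"hostname": "lt-0042", "owner": "avega", "os": "Windows 11"},
    {"hostname": "srv-db-01", "owner": "dba-team", "os": "Ubuntu 22.04"},
]
alerts = [
    {"hostname": "lt-0042", "event": "malware_detected"},
    {"hostname": "unknown-9", "event": "port_scan"},
]

def correlate(
    alerts: List[Dict[str, str]], inventory: List[Dict[str, str]]
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Split alerts into those traceable to a known asset and those that are not."""
    by_host = {asset["hostname"]: asset for asset in inventory}
    traceable, untracked = [], []
    for alert in alerts:
        asset = by_host.get(alert["hostname"])
        if asset is not None:
            traceable.append({**alert, "owner": asset["owner"]})
        else:
            untracked.append(alert)
    return traceable, untracked

traceable, untracked = correlate(alerts, inventory)
```

Alerts that land in `untracked` are a finding in their own right: they point to devices on the network that the inventory does not know about.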
Cloud Data Security
The cloud has transformed many business processes. While there are many advantages related to the cloud, there are also many risks. It is critical for organizations to analyze these risks before leveraging the cloud for data distribution.
Data security in the cloud includes controls, processes, and technologies that ensure a company’s cloud-based infrastructure, systems, and data are protected.
Why is Cloud Data Security Critical?
Cloud technology use is soaring due to the increase in remote work, but security processes haven’t changed with the on-prem to cloud shift. To improve security in the cloud, organizations need to ensure there is visibility over access to data in the cloud.
It is critical to establish a baseline on these concepts:
- Who’s accessing the data
- What they’re doing with the data
- Where they are accessing the data from
- When they are accessing the data
- Which servers they are using
Organizations should adopt a culture of visibility and ensure that access controls are in place.
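The baseline questions above map naturally onto the fields of an access log record. The sketch below is a hypothetical illustration: the `AccessEvent` fields, the sample events, and the rule that any never-before-seen combination of user, source, and server counts as unusual are all assumptions, and real cloud providers expose their own audit log formats.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable

@dataclass(frozen=True)
class AccessEvent:
    user: str            # who is accessing the data
    action: str          # what they are doing with it
    source_ip: str       # where they are accessing it from
    timestamp: datetime  # when they are accessing it
    server: str          # which server they are using

def build_baseline(events: Iterable[AccessEvent]) -> Counter:
    """Count how often each (user, source, server) combination has been seen."""
    return Counter((e.user, e.source_ip, e.server) for e in events)

def is_unusual(event: AccessEvent, baseline: Counter) -> bool:
    """Flag combinations that never appeared in the baseline period."""
    return baseline[(event.user, event.source_ip, event.server)] == 0

history = [
    AccessEvent("avega", "read", "198.51.100.4", datetime(2024, 1, 8, 9, 0), "files-eu-1"),
    AccessEvent("avega", "write", "198.51.100.4", datetime(2024, 1, 9, 9, 5), "files-eu-1"),
]
baseline = build_baseline(history)
new_event = AccessEvent("avega", "download", "203.0.113.50", datetime(2024, 1, 10, 3, 0), "files-eu-1")
```

Here `is_unusual(new_event, baseline)` flags the download because the same user has never been seen from that source address, which is the kind of visibility the baseline is meant to provide.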
Data Security Conclusions
Keep data as secure as possible with the creation of risk-based data security processes. It is critical to identify, classify, and understand your data to mitigate risks and implement controls in order to maintain an effective data security strategy.