Cybersecurity Practice #4: Data Protection and Loss Prevention (medium/large)

Modified on Wed, 14 Jun, 2023 at 2:30 PM

All organizations within the HPH sector access, process, and transmit sensitive information, such as health information or PII. The fundamental data used in operations are highly sensitive, representing a unique challenge to the HPH sector. Most of the health care workforce must leverage these data to carry out their respective missions.

In that context, healthcare faces a growing challenge of understanding where data assets exist, how they are used, and how they are transmitted. PHI is discussed, processed, and transmitted between information systems daily. Protecting these data requires robust policies, processes, and technologies.²¹

As your organization starts shoring up its data protection and loss prevention controls, it is best to begin by understanding the types of data that exist in the organization, setting a classification schema for these data, and then determining how the data are processed. Establish a set of policies and procedures for normal data use and then build in “guardrail” systems to guide your user base toward these business processes.

Cybersecurity Practice 4: Data Protection and Loss Prevention

Data that may be affected	Passwords, PHI
Medium Sub-Practices	4.M.A Classification of Data
	4.M.B Data Use Procedures
	4.M.C Data Security
	4.M.D Backup Strategies
	4.M.E Data Loss Prevention
Large Sub-Practices	4.L.A Advanced Data Loss Prevention
	4.L.B Mapping of Data Flows
Key Mitigated Risks	Ransomware attacks
	Loss or theft of equipment or data
	Insider, accidental or intentional data loss

Sub-Practices for Medium-Sized Organizations

4.M.A

Classification of Data

NIST FRAMEWORK REF:

ID.AM-5

There is a vast proliferation of data in health care environments. Data can range from records (including treatment information, Social Security numbers, insurance numbers, and billing information) to research information. Health care data also include nonobvious, but still important, information such as business strategies and development plans, business finances, employee records, and corporate board materials.

Before establishing policies describing how these varied data types should be used and disclosed, it is best to classify them into high-level categories that provide a consistent framework when developing policies and procedures. Table 3 provides a sample classification schema, with examples of the types of documents that the classification comprises.

Table 3. Example of a Data Classification Schema

Classification	Description	Examples
Highly Sensitive Data	Data that could easily be used for financial fraud, or could cause significant reputational damage.	SSN, credit card number, mental health information, substance abuse information, sexually transmitted infections.
Sensitive Data	Regulated data, or data that could cause embarrassment to patients or organizations.	Health information, clinical research data, insurance information, human/employee data, board materials.
Internal Data	Data that are not considered sensitive, but should not be exposed publicly.	Policies and procedures, contracts, business plans, corporate strategy and business development plans, internal business communications.
Public Data	All data that have been sanitized and approved for distribution to the public with no restrictions on use.	Materials published on websites, presentations, and research publications.

4.M.B

Data Use Procedures

NIST FRAMEWORK REF:

ID.GV-1

After data have been classified, procedures can be written that describe how to use these data based on their classification. Such procedures describe the processes of setting usage expectations and of labeling the information properly. These two functions are described further in the following paragraphs.

Usage and disclosure: Based on the classification type, data use should be limited appropriately and disclosed using specific methods. Consider the procedures in Table 4.

Table 4. Suggested Procedures for Data Disclosure

Classification	Use	Disclosure
Highly Sensitive	Must be restricted to only individuals who have a need to know. Must use extreme caution when handling data.	Only share information internally and only when expressly permitted and when directed by the data owner.
Sensitive	Must be restricted to only individuals who have a need to know.	Only share information internally and only when expressly permitted.
Internal Use	Data can generally be used, but care should be taken in their use.	Only share information internally within the organization.
Public	No restrictions.	Share freely with no restrictions.
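The disclosure rules in Table 4 can be expressed as a small decision function. The following Python sketch is illustrative only: the `Classification` enum, the `may_disclose` function, and its parameter names are hypothetical, and the sketch does not model exceptions such as a patient's own request for their PHI.

```python
from enum import Enum


class Classification(Enum):
    """Classification levels, named after the rows of Table 4."""
    HIGHLY_SENSITIVE = "Highly Sensitive"
    SENSITIVE = "Sensitive"
    INTERNAL_USE = "Internal Use"
    PUBLIC = "Public"


def may_disclose(level, recipient_is_internal,
                 expressly_permitted=False, owner_directed=False):
    """Return True if Table 4 would allow sharing data at this level."""
    if level is Classification.PUBLIC:
        return True  # share freely with no restrictions
    if not recipient_is_internal:
        return False  # every non-public level is internal only
    if level is Classification.INTERNAL_USE:
        return True  # any internal recipient
    if level is Classification.SENSITIVE:
        return expressly_permitted
    # Highly sensitive: express permission and data owner direction
    return expressly_permitted and owner_directed
```

For example, `may_disclose(Classification.SENSITIVE, recipient_is_internal=True)` returns `False` until express permission is recorded.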

Be careful when sending information through e-mail. Ensure that sending PHI via e-mail is consistent with ONC guidance. Do not send unencrypted PHI through regular e-mail or text message. However, patients can request and receive access to their PHI via unencrypted electronic communications, provided that they are first given a brief warning that unencrypted communications could be accessed by a third party in transit and that they confirm they still want to receive the unencrypted communication.

Labeling: It is important to label information properly to facilitate implementation of restrictions related to its usage and disclosure. Labeling helps keep data secure in two ways. First, users will understand how to handle information that is properly labeled. Second, specialized security tools, such as data loss prevention (DLP) systems, can be configured to discover and control information when it is properly labeled. At minimum, the labeling process should ensure that labels are readily apparent when users view information. Use techniques like placing the classification in the footer of the document. Collaborate with your marketing and communication departments to create document templates based on data classification levels. Organization-wide document templates enable specialized tokens or signatures to be embedded in the documents and tracked by DLP systems.
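As a rough illustration of footer labeling, the Python sketch below appends a classification line to a plain-text document and reads it back, which is the kind of marker a DLP rule could be configured to look for. The `Classification: <level>` footer format and both function names are assumptions made for this example, not a format required by any DLP product.

```python
import re

# Label values taken from Table 4; the footer format itself is hypothetical.
LABELS = ("Highly Sensitive", "Sensitive", "Internal Use", "Public")
LABEL_PATTERN = re.compile(
    r"^Classification: (" + "|".join(re.escape(label) for label in LABELS) + r")$",
    re.MULTILINE,
)


def add_footer_label(text, classification):
    """Append a classification footer line to a plain-text document."""
    if classification not in LABELS:
        raise ValueError(f"unknown classification: {classification!r}")
    return f"{text.rstrip()}\n\nClassification: {classification}\n"


def find_label(text):
    """Return the first classification label found, or None if unlabeled."""
    match = LABEL_PATTERN.search(text)
    return match.group(1) if match else None
```

Unlabeled documents return `None`, which a reviewer or tool can treat as a prompt to classify the document before it is shared.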

4.M.C

Data Security

NIST FRAMEWORK REF:

PR.DS, PR.DS-1, PR.DS-2, PR.IP-6, PR.DS-5

After policies and procedures have been defined, you can establish additional data security methods. Consider the security methods described in Table 5.

Table 5. Security Methods to Protect Data

Security Method	Description	Considerations
Encrypt data at rest	Ensure data are encrypted when resident on file systems.	When using cloud-based services, enable native encryption capabilities to prevent exposures if the cloud provider is hacked. Ensure that full disk encryption is enabled on all workstations and laptops.
Encrypt data in transit	Ensure that secure transport methods are used for both internal and external movement.	Ensure that websites containing sensitive data use encrypted transport methods, such as Hypertext Transfer Protocol Secure (HTTPS). Enable internal encryption methods when moving data in the organization. Never send unencrypted sensitive data outside of the organization.
Data retention and destruction	Ensure that retention policies are set. Contractually bind third parties to destroy data when terminating contracts.	Use standard destruction forms and require vendors to attest that data have been destroyed pursuant to those forms. Set retention policies and quotas on e-mail systems to reduce the amount of data that can be exposed. Ensure that legal retention requirements are met. Establish a purge strategy that includes purge mechanisms.
Scrub production data from test and development environments	Ensure that identifiable information is removed when replicating production environments for testing.	Leverage specialized tools to deidentify data elements within large systems (such as EMRs). Regularly audit data elements within test and production environments to ensure that they are clean.
Mask sensitive data within applications	Restrict users from accessing highly sensitive information, such as SSNs, by masking it unless authorized.	Permit SSN access only to members who require it (e.g., registration desks, admitting desks, payor processing).
Limit the ability to print, save, or export data based on function	Restrict the workforce’s ability to export data out of systems that contain sensitive data, unless they have proper authorization.	Encourage users to work within applications. Minimize data exporting by providing the required capabilities to manipulate data within the application. Implement restrictions on data exports, especially in reporting or database systems that can query and return large datasets. Consider removing the ability to print and copy/paste from EMR applications or web mail accessed from home.
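The masking method in Table 5 can be sketched in a few lines of Python. This is an illustrative example under stated assumptions: the `mask_ssn` function and its `authorized` flag are hypothetical, and in a real application the authorization decision would come from the application's own role or access control system.

```python
def mask_ssn(ssn, authorized=False):
    """Return an SSN for display, masked unless the viewer is authorized.

    Accepts input with or without dashes and always returns NNN-NN-NNNN form.
    """
    digits = "".join(ch for ch in ssn if ch.isdigit())
    if len(digits) != 9:
        raise ValueError("expected a nine-digit SSN")
    if authorized:
        return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"
    # Unauthorized viewers see only the last four digits.
    return f"***-**-{digits[5:]}"
```

Showing only the last four digits is one common masking choice; an organization may decide to hide the value entirely.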

4.M.D

Backup Strategies

NIST FRAMEWORK REF:

PR.IP-4

A robust backup strategy for enterprise assets is critical to daily IT operations. It is equally important to have such a backup strategy in the event of cybersecurity incidents. There will be events that cause an asset, or multiple assets, to be thoroughly compromised. During these events, routine backups can be the only way to ensure proper execution of the recovery phase of your IR process. Fully decommissioning affected assets and restoring them to a time before the compromise occurred is the best method to neutralize the compromise.

At minimum, each mission-critical asset in your environment should have a backup plan. Backups can be executed using a variety of methods, the most common being disk-to-tape, disk-to-disk, or disk-to-cloud backups. The integrity of these backups is paramount; these copies are your last line of defense, and you want to make sure they are complete and accurate when you need them.

No matter what backup strategy you choose, it is very important to make sure these backup locations are not accessible from the general network or from the general user population. These backups can be the last line of defense against a ransomware attack; as such, access to them should be severely limited. This includes access from the servers and systems that are being backed up; consider letting systems only write new data rather than overwrite existing data. This can thwart encryption attacks against the backup files.

Disk-to-tape: This method makes backups by accessing designated systems and files and writing all content to a tape drive, or a tape library. Specialized software, hardware, and inventory controls are required. To conduct backups efficiently, you will need the tape robots and a tape library appropriate to the number and size of systems being backed up. These backups can be very large. Configure the tapes to use a “write once and read many” option. It is of utmost importance that encryption is enabled in writing to these tapes. If a tape is lost or stolen, unencrypted data could be breached.

There are great advantages to maintaining offline backups. You can rely on these copies to be available when you need them, and tape backups prevent attacks against the backup medium itself, because they are offline.

Disk-to-disk: This method involves taking backup copies from a disk and replicating them to a separate disk or storage array that is dedicated to maintaining backup copies. This option generally costs less than disk-to-tape strategies, and disk-to-disk backups usually execute more quickly than disk-to-tape. It is important to use encryption on backup files, in case the files are copied outside of the organization.

It is important to consider controlling access to the disk storage system as part of a disk-to-disk approach. With cyberattacks like ransomware, attackers intend to disrupt both production and backup files. Attackers that launch ransomware attacks are aware that an organization’s first response will be to contain the ransomware and then restore the uncorrupted files from a backup source. If they can compromise the backup and production files, there is a much higher likelihood that the organization will pay a ransom to get its files back. Access control mechanisms should prevent the system being backed up from accessing the disk array, except via required access channels. Do not permit other access to the array from other accounts, including administrative accounts. Remember, everyone is a potential target of a ransomware attack, especially administrators.

Disk-to-cloud: This method is very similar to disk-to-disk backup. Cloud backup offers multiple added values, however. With a disk-to-cloud backup, you get the resiliency and flexibility of the cloud environment, as well as the benefits of investments made by cloud providers to maintain high data availability. Rather than the single-point-of-failure model seen in disk-to-disk and disk-to-tape backups, cloud providers replicate data backups, leveraging cloud infrastructure with multi-fault-tolerant capabilities.

As with the disk-to-disk model, it is important to limit access to cloud-based backup storage to only the systems and disks that are backed up and the data repository. Never implement a drive that maps to the backup repository. That mapped drive could be the vehicle that delivers the ransomware encryption. Always encrypt backup files to protect your organization if the cloud provider is breached.

Lastly, whatever method of backup is used, it is important to test the recovery of these backups on a periodic basis to ensure data availability. Again, your backup process is the last line of defense and must be demonstrated to be trustworthy in a time of need.
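One way to make a periodic recovery test concrete is to record checksums when a backup is taken and compare them after a test restore. The Python sketch below assumes file-level backups restored to a scratch directory; the function names (`sha256_of`, `build_manifest`, `verify_restore`) are hypothetical, and commercial backup products typically offer their own verification features that should be preferred where available.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backup files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_restore(manifest, restored_root):
    """Return a list of (relative_path, problem) for files that fail the check."""
    restored_root = Path(restored_root)
    problems = []
    for relative_path, expected in manifest.items():
        candidate = restored_root / relative_path
        if not candidate.is_file():
            problems.append((relative_path, "missing"))
        elif sha256_of(candidate) != expected:
            problems.append((relative_path, "checksum mismatch"))
    return problems
```

An empty result from `verify_restore` means every file in the manifest was restored with matching content. The manifest itself should be stored where the backed-up systems cannot modify it.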

4.M.E

Data Loss Prevention

NIST FRAMEWORK REF:

PR.DS-5

Once standard data policies and procedures are established and the workforce is trained to use them, DLP systems should be implemented to ensure that sensitive data are used in compliance with these policies.

Multiple DLP solutions exist and can be applicable depending on the types of data access channels that need to be monitored. Traditionally, DLP systems monitor e-mail, file storage, endpoint usage, web usage, and network transmission. All these channels should be considered.

A challenge with DLP systems is to determine which methods will be used to positively identify sensitive information. Within a health care environment, that can be tricky. Generally, there are two approaches, and both have limitations:

Identify sensitive data based on dictionary words that may trigger the inclusion of sensitive data. These dictionaries include robust language repositories that identify health information. The challenge with this technique is related to the terminology. Medical terms are often used in the regular course of business, outside the context of sensitive information. This can lead to a high rate of false positives, forcing the workforce to apply prevention practices that are not necessary.
Identify sensitive data based on identifiers that are known to be sensitive, a process known as matching. There are two popular methods of matching: (a) leveraging tokens embedded in documents classified as sensitive (document matching) and (b) leveraging actual patient identifiers from your EMR (exact data matching). Document matching dramatically reduces the number of false positives. However, the workforce must be trained on proper data classification. With exact data matching, the false positive rate will be lower than with the dictionary approach, since it involves positive confirmation. Exact data matching requires regularly extracting information from the EMR to load these identifiers into the system. Extra precautions must be taken so that the resulting large datasets are not exposed.
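The difference between the dictionary approach and exact data matching can be illustrated with a short Python sketch. Everything specific here is an assumption for illustration: the term list, the identifier format (three letters followed by six digits), and the function names. The sketch stores keyed HMAC fingerprints rather than raw identifiers, which is one possible precaution for limiting exposure of the extracted dataset, not a description of how any particular DLP product works.

```python
import hashlib
import hmac
import re

# Hypothetical dictionary of medical terms.
MEDICAL_TERMS = {"diagnosis", "oncology", "prescription"}
# Hypothetical patient identifier format, e.g. "MRN123456".
IDENTIFIER_PATTERN = re.compile(r"\b[A-Z]{3}\d{6}\b")


def dictionary_match(text):
    """Flag text that contains any dictionary term (prone to false positives)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return bool(words & MEDICAL_TERMS)


def fingerprint(identifier, key):
    """Keyed hash of an identifier, so the index holds no raw values."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(identifiers, key):
    """Build the lookup set from identifiers extracted from the EMR."""
    return {fingerprint(identifier, key) for identifier in identifiers}


def exact_data_match(text, index, key):
    """Flag text only if it contains an identifier known to the index."""
    candidates = IDENTIFIER_PATTERN.findall(text)
    return any(fingerprint(candidate, key) in index for candidate in candidates)
```

With an index built from `["MRN123456"]`, the sentence "The oncology department lunch menu" triggers `dictionary_match` but not `exact_data_match`, while "Patient MRN123456 discharged" triggers the exact match.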

Once your identification methodology is established, DLP systems can be configured to monitor data access channels of interest and make policy decisions based on the data types and the access channels. It is best to provide direct feedback to users when the data policy has been violated, to avoid recurrent violations. Real-time feedback helps users adjust their data usage behaviors. Data channels are presented in Table 6 for your consideration.

Table 6. Data Channels for Enforcing Data Policies

Data Channel	Implementation Specification	Considerations
E-mail	Implement inline through SMTP routing for e-mail messages delivered outside the organization.	Define thresholds of risky behavior. Implement a DLP block for these thresholds (e.g., > 100 records of PHI in the e-mail). For other defined thresholds, implement a DLP encrypt action, forcing the message to be encrypted before delivery.
Endpoint	Install DLP agents on managed endpoints that can apply data policies.	Standardize and deploy encrypted thumb drives to users who require mobile storage options. Prevent the copying of data to unencrypted thumb drives, or force encryption when copying data. Control the use of noncontrolled peripherals and/or storage devices (e.g., backups of iPhones on devices). Permit only when specifically authorized. Conduct data discovery scans of data residing on endpoints, exposing data on the endpoint so the user can make data destruction decisions.
Network	Implement through Switched Port Analyzer (SPAN) ports from egress network points or through Internet Content Adaptation Protocol (ICAP) on web proxies.	If inline, prevent the leakage of unencrypted sensitive data based upon predefined thresholds (e.g., files that contain > 100 records of PHI). If out of band, activate IR procedures to contain data leakages that occur through the network.
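The threshold-based e-mail policy in Table 6 can be sketched as a simple decision function. This Python example is a simplification under stated assumptions: it treats each distinct SSN-formatted value as one "record", blocks above the 100-record example threshold from the table, and encrypts when any record is present, which is an assumed lower threshold. The function names and return values are hypothetical.

```python
import re

# Simplifying assumption: one SSN-formatted value stands for one PHI record.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def count_records(body):
    """Count distinct SSN-formatted values in a message body."""
    return len(set(SSN_PATTERN.findall(body)))


def email_action(body, external, block_threshold=100):
    """Return 'allow', 'encrypt', or 'block' for an outbound message."""
    if not external:
        return "allow"  # this sketch only polices mail leaving the organization
    records = count_records(body)
    if records > block_threshold:
        return "block"
    if records > 0:
        return "encrypt"  # assumed threshold: any record forces encryption
    return "allow"
```

Returning the action as a value makes it straightforward to pair each decision with the real-time user feedback described above.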

Sub-Practices for Large Organizations

4.L.A

Advanced Data Loss Prevention

NIST FRAMEWORK REF:

PR.DS-5

After implementing basic DLP controls, you should consider expanding your DLP capabilities to monitor other common data access channels. Table 7 recommends methods for your consideration.

Table 7. Expanding DLP to Other Data Channels

Data Channel	Implementation Specification	Considerations
Cloud storage	Use cloud access security broker systems to monitor data flows into cloud systems.	Label data identified as sensitive. Implement digital rights and encryption to limit access to sensitive data. Ensure that cloud-based file storage and sharing systems do not expose sensitive data in an “open sharing” construct without authentication (i.e., do not permit the use of sharing data through a simple URL link).
Onsite file storage	Point discovery scanning systems at known file servers or other large data repositories.	Conduct regular DLP scans against the file systems to scan and identify sensitive data. Query security access permissions for each file that contains sensitive data. Define thresholds for excessive access and set alerts if these are crossed. Forward alerts to the SOC for response, as described in Cybersecurity Practice #8: Security Operations Center and Incident Response. Determine staleness of records with sensitive data. Consider executing data destruction practices for records that have not been opened or viewed for an extended duration. Determine data ownership of sensitive files identified in file storage systems, leveraging automated tools. Establish workflow options that allow data owners to provide input into access permission reviews of their sensitive files.
Web-based scanning	Configure DLP systems to crawl known public websites for sensitive information.	Conduct a “spidering” exercise, which is similar to methods deployed by search engines. Compare files and results posted on websites against DLP matching policies and respond quickly to any sensitive data that are exposed. Conduct manual searching activities on a periodic basis over exposed websites. Look for files that may contain large amounts of sensitive data (e.g., xls(x), csv, txt, and pdf).
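The staleness check suggested for onsite file storage in Table 7 can be approximated with file access times. The Python sketch below is illustrative only: the `stale_files` function is hypothetical, and last-access timestamps are unreliable on file systems mounted with options such as `noatime` or `relatime`, so results should be treated as candidates for review rather than as grounds for automatic destruction.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def stale_files(root, max_age_days, now=None):
    """List files under root whose last access is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    stale = [
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_atime < cutoff
    ]
    return sorted(stale)
```

In practice this list would be intersected with the files that DLP scans have already identified as containing sensitive data.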

4.L.B

Mapping of Data Flows

NIST FRAMEWORK REF:

ID.AM-3, DE.AE-1

After data business practices are defined, it is advisable to describe these processes in a data map. Data maps should include the following components:

Applications that house sensitive data
Standard direction of data movement
Users of applications and data
Methods used to store and transmit data

Conducting this type of mapping, and potentially adding it to a larger enterprise architecture reference, enables an organization to identify data protection and monitoring requirements.
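A data map can start as a simple structured inventory before it is moved into an enterprise architecture tool. The Python sketch below records the components listed above for each flow and derives one example protection requirement from it. The `DataFlow` fields, the sample system names, and the `unprotected_flows` check are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    """One directed movement of data between two applications."""
    source: str                 # application that sends the data
    destination: str            # application that receives the data
    classification: str         # e.g. "Sensitive", per the classification schema
    transport: str              # method used to transmit the data
    encrypted_in_transit: bool
    users: tuple = ()           # roles that use the applications and data


def unprotected_flows(flows):
    """Return sensitive flows that are not encrypted in transit."""
    sensitive = {"Highly Sensitive", "Sensitive"}
    return [
        flow for flow in flows
        if flow.classification in sensitive and not flow.encrypted_in_transit
    ]


# Hypothetical example entries
flows = [
    DataFlow("EMR", "Billing", "Sensitive", "HL7 over TLS", True, ("billing staff",)),
    DataFlow("EMR", "Research DB", "Highly Sensitive", "nightly FTP export", False),
    DataFlow("Intranet", "Public website", "Public", "HTTPS", True),
]
```

Here `unprotected_flows(flows)` returns only the EMR to Research DB export, pointing to a flow that needs encryption or redesign.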

Threats Mitigated

Ransomware attacks
Loss or theft of equipment or data
Insider, accidental or intentional data loss

Suggested Metrics

Number of encrypted e-mail messages, trended by week. The goal is to establish a baseline of encrypted messages sent. Be on the lookout for spikes of encryption (which could indicate data exfiltration) and no encryption (which could indicate that encryption is not working properly).
Number of blocked e-mail messages, trended by week. The goal is to detect large numbers of blocked messages, which could indicate potential malicious data exfiltration or a need for user training.
Number of files with excessive access on the file systems, trended by week. The goal is to limit access to sensitive data on the file storage systems by creating tickets and delivering them to access management.
Number of unencrypted devices with access attempts, trended by week. The goal is to use this information to educate the workforce on the risks of removable media.
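The weekly trending described in these metrics can be sketched as a comparison of each week against a rolling baseline. In the Python example below, the `flag_weeks` function, the four-week baseline, and the three-standard-deviation spike rule are all assumptions chosen for illustration; an organization would tune them to its own message volumes.

```python
from statistics import mean, stdev


def flag_weeks(weekly_counts, baseline_weeks=4, sigma=3.0):
    """Flag weeks that spike above, or drop to zero against, a rolling baseline.

    Returns a list of (week_index, reason) tuples.
    """
    flags = []
    for index in range(baseline_weeks, len(weekly_counts)):
        baseline = weekly_counts[index - baseline_weeks:index]
        average = mean(baseline)
        # Floor the spread at 1.0 so a perfectly flat baseline still works.
        spread = max(stdev(baseline), 1.0)
        value = weekly_counts[index]
        if value == 0 and average > 0:
            flags.append((index, "no activity"))
        elif value > average + sigma * spread:
            flags.append((index, "spike"))
    return flags
```

For the encrypted e-mail metric, a "spike" flag would prompt a check for possible exfiltration, and a "no activity" flag would prompt a check that encryption is still working.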

21. Erika McCallister, Tim Grance, and Karen Scarfone, Guide to Protecting the Confidentiality of Personally Identifiable Information (PII), (NIST Special Publication 800-122, April, 2010, Gaithersburg, MD), https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-122.pdf.