Identify sensitive data based on dictionary words that may indicate the presence of sensitive data. These dictionaries include robust language repositories that identify health information. The challenge with this technique is terminology: medical terms are often used in the regular course of business, outside the context of sensitive information. This can lead to a high rate of false positives, forcing the workforce to apply prevention practices that are not necessary.
Identify sensitive data based on identifiers that are known to be sensitive, a process known as matching. There are two popular methods of matching: (a) leveraging tokens embedded in documents classified as sensitive (document matching) and (b) leveraging actual patient identifiers from your EMR (exact data matching). Document matching dramatically reduces the number of false positives. However, the workforce must be trained on proper data classification. With exact data matching, the false positive rate will be lower than with the dictionary approach, since it involves positive confirmation. Exact data matching requires regularly extracting information from the EMR to load these identifiers into the system. Extra precautions must be taken so that the resulting large datasets are not exposed.
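The difference between the dictionary approach and exact data matching can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the term list, the sample medical record numbers, the key, and the choice to store keyed hashes instead of raw identifiers (one possible precaution for the extracted dataset, not a method prescribed by this document).

```python
import hashlib
import hmac
import re

# Hypothetical dictionary of health-related terms (illustrative only).
MEDICAL_TERMS = {"diagnosis", "oncology", "prescription", "diabetes"}

# Hypothetical patient identifiers standing in for an EMR extract.
EMR_IDENTIFIERS = ["MRN-0012345", "MRN-0067890"]

# Assumed secret key; a real deployment would keep this in a key store.
SECRET_KEY = b"example-key-not-for-production"

TOKEN_RE = re.compile(r"[A-Za-z0-9-]+")


def fingerprint(identifier: str) -> str:
    """Return a keyed hash so raw identifiers need not sit in the index."""
    data = identifier.upper().encode("utf-8")
    return hmac.new(SECRET_KEY, data, hashlib.sha256).hexdigest()


# Index of hashed identifiers loaded from the (hypothetical) EMR extract.
EMR_INDEX = {fingerprint(i) for i in EMR_IDENTIFIERS}


def dictionary_match(text: str) -> bool:
    """Flag text that contains any dictionary term."""
    tokens = {t.lower() for t in TOKEN_RE.findall(text)}
    return bool(tokens & MEDICAL_TERMS)


def exact_data_match(text: str) -> bool:
    """Flag text only if a token matches a known patient identifier."""
    return any(fingerprint(t) in EMR_INDEX for t in TOKEN_RE.findall(text))


newsletter = "Join the oncology department open house on Friday."
referral = "Please review the chart for MRN-0012345 before the visit."

print(dictionary_match(newsletter), exact_data_match(newsletter))
print(dictionary_match(referral), exact_data_match(referral))
```

In this sketch the newsletter trips the dictionary check although it contains no patient data, while the referral is caught only by the identifier lookup. Document matching is not shown.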
Once your identification methodology is established, DLP systems can be configured to monitor the data access channels of interest and make policy decisions based on the data types and the access channels. It is best to provide direct feedback to users when a data policy has been violated, to avoid recurrent violations. Real-time feedback helps users adjust their data usage behaviors. Data channels are presented in Table 6 for your consideration.

Table 6. Data Channels for Enforcing Data Policies

Data Channel: E-mail
Implementation Specification: Implement inline through SMTP routing for e-mail messages delivered outside the organization.
Considerations:
Define thresholds of risky behavior. Implement a DLP block for these thresholds (e.g., > 100 records of PHI in the e-mail).
Define thresholds of risky behavior. Implement a DLP encrypt action for these thresholds, forcing the message to be encrypted before it is delivered.

Data Channel: Endpoint
Implementation Specification: Install DLP agents on managed endpoints that can apply data policies.
Considerations:
Standardize and deploy encrypted thumb drives to users who require mobile storage options.
Prevent the copying of data to unencrypted thumb drives, or force encryption when copying data.
Control the use of noncontrolled peripherals and/or storage devices (e.g., backups of iPhones on devices). Permit only when specifically authorized.
Conduct data discovery scans of data residing on endpoints, exposing data on the endpoint so the user can make data destruction decisions.
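The threshold-based considerations in Table 6 can be sketched as a small policy function. This is a hedged illustration, not the configuration of any particular DLP product: the `Event` fields, the channel names, and the encrypt threshold are hypothetical, and only the block example of more than 100 PHI records in an e-mail comes from the table.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    ENCRYPT = "encrypt"
    BLOCK = "block"


# Block threshold taken from the table's example (> 100 records of PHI).
EMAIL_BLOCK_THRESHOLD = 100
# Assumed value: any PHI record in an outbound e-mail forces encryption.
EMAIL_ENCRYPT_THRESHOLD = 1


@dataclass(frozen=True)
class Event:
    channel: str                      # hypothetical labels: "email", "endpoint"
    phi_records: int                  # PHI records detected in the content
    destination_encrypted: bool = False  # endpoint copies only


def decide(event: Event) -> Action:
    """Map a detected event to a policy action for its data channel."""
    if event.phi_records <= 0:
        return Action.ALLOW
    if event.channel == "email":
        if event.phi_records > EMAIL_BLOCK_THRESHOLD:
            return Action.BLOCK
        if event.phi_records >= EMAIL_ENCRYPT_THRESHOLD:
            return Action.ENCRYPT
        return Action.ALLOW
    if event.channel == "endpoint":
        # Copies to unencrypted removable storage are forced to encrypt.
        return Action.ALLOW if event.destination_encrypted else Action.ENCRYPT
    raise ValueError(f"unknown channel: {event.channel}")


def feedback(action: Action) -> str:
    """Build the real-time message shown to the user for an action."""
    messages = {
        Action.ALLOW: "No data policy applies to this transfer.",
        Action.ENCRYPT: "PHI detected: the content will be encrypted.",
        Action.BLOCK: "PHI volume exceeds the allowed threshold: transfer blocked.",
    }
    return messages[action]


print(feedback(decide(Event("email", 250))))
```

Keeping the thresholds as named constants makes it easier to tune what counts as risky behavior without touching the decision logic.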