The Data Loss Prevention API provides programmatic access to a powerful detection engine for personally identifiable information and other privacy-sensitive data in unstructured data streams.
- A Google Cloud project with billing enabled
- Enable the DLP API.
- (Local testing) Create a service account and set the GOOGLE_APPLICATION_CREDENTIALS environment variable pointing to the downloaded credentials file.
- (Local testing) Set the DLP_DEID_WRAPPED_KEY environment variable to an AES-256 key encrypted ('wrapped') with a Cloud Key Management Service (KMS) key.
- (Local testing) Set the DLP_DEID_KEY_NAME environment variable to the path-name of the Cloud KMS key you wrapped DLP_DEID_WRAPPED_KEY with.
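A quick pre-flight check can confirm that the three environment variables listed above are set before running anything locally. The following is a minimal sketch and is not part of the samples; the class name `EnvCheck` and its `missing` helper are hypothetical, and it only tests for presence, not for valid credentials or keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the DLP samples.
class EnvCheck {
  // Variable names taken from the prerequisites list above.
  static final List<String> REQUIRED = List.of(
      "GOOGLE_APPLICATION_CREDENTIALS",
      "DLP_DEID_WRAPPED_KEY",
      "DLP_DEID_KEY_NAME");

  // Returns the required names that are absent or blank in the given environment.
  static List<String> missing(Map<String, String> env) {
    List<String> result = new ArrayList<>();
    for (String name : REQUIRED) {
      String value = env.get(name);
      if (value == null || value.trim().isEmpty()) {
        result.add(name);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> unset = missing(System.getenv());
    if (unset.isEmpty()) {
      System.out.println("All required environment variables are set.");
    } else {
      System.out.println("Missing environment variables: " + unset);
    }
  }
}
```

Taking the environment as a `Map` argument rather than calling `System.getenv()` inside the helper keeps the check easy to exercise with made-up values.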
This project uses the Assembly Plugin to build an uber jar. Run:
mvn clean package -DskipTests
An InfoType identifier represents an element of sensitive data.
InfoTypes are updated periodically. Use the API to retrieve the most current InfoTypes.
java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Metadata
The Quickstart demonstrates using the DLP API to identify an InfoType in a given string.
java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.QuickStart
Use the DLP API to inspect strings, local files, files on Google Cloud Storage, Cloud Datastore kinds, and BigQuery tables.
Note: image scanning is not currently supported on Google Cloud Storage. For more information, refer to the API documentation. Optional flags are explained in this resource.
usage: com.example.dlp.Inspect
-bq,--Google BigQuery inspect BigQuery table
-bucketName <arg>
-customDictionaries <arg>
-customRegexes <arg>
-datasetId <arg>
-ds,--Google Datastore inspect Datastore kind
-f,--file path <arg> inspect input file path
-fileName <arg>
-gcs,--Google Cloud Storage inspect GCS file
-includeQuote <arg>
-infoTypes <arg>
-kind <arg>
-maxFindings <arg>
-minLikelihood <arg>
-namespace <arg>
-projectId <arg>
-s,--string <arg> inspect string
-subscriptionId <arg>
-tableId <arg>
-topicId <arg>
- Inspect a string:
java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -s "My phone number is (123) 456-7890 and my email address is [email protected]" -infoTypes PHONE_NUMBER EMAIL_ADDRESS
java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -s "My phone number is (123) 456-7890 and my email address is [email protected]" -customDictionaries [email protected] -customRegexes "\(\d{3}\) \d{3}-\d{4}"
- Inspect a local file (text / image):
java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -f src/test/resources/test.txt -infoTypes PHONE_NUMBER EMAIL_ADDRESS
java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -f src/test/resources/test.png -infoTypes PHONE_NUMBER EMAIL_ADDRESS
- Inspect a file on Google Cloud Storage:
java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -gcs -bucketName my-bucket -fileName my-file.txt -infoTypes PHONE_NUMBER EMAIL_ADDRESS
- Inspect a Google Cloud Datastore kind:
java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -ds -kind my-kind -infoTypes PHONE_NUMBER EMAIL_ADDRESS
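Before passing a pattern to -customRegexes, it can help to preview what it matches. The sketch below runs the phone-number pattern from the string example above through `java.util.regex` locally. It is an illustration only: the class name `CustomRegexPreview` is hypothetical, no DLP call is made, and the DLP service's regex dialect may differ from Java's in some constructs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical local preview; does not call the DLP API.
class CustomRegexPreview {
  // Collects every non-overlapping match of the regex in the text, in order.
  static List<String> findAll(String regex, String text) {
    List<String> matches = new ArrayList<>();
    Matcher matcher = Pattern.compile(regex).matcher(text);
    while (matcher.find()) {
      matches.add(matcher.group());
    }
    return matches;
  }

  public static void main(String[] args) {
    // Same pattern as the -customRegexes example, with backslashes escaped for a Java literal.
    String regex = "\\(\\d{3}\\) \\d{3}-\\d{4}";
    String text = "My phone number is (123) 456-7890";
    System.out.println(findAll(regex, text)); // prints [(123) 456-7890]
  }
}
```

On the command line the pattern is written with single backslashes inside double quotes, as in the example above; the doubled backslashes here are only Java string escaping.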
Automatic redaction produces an output image with sensitive data matches removed.
Commands:
-f <string> Source image file
-o <string> Destination image file
Options:
--help Show help
-minLikelihood [choices: "LIKELIHOOD_UNSPECIFIED", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]
[default: "LIKELIHOOD_UNSPECIFIED"]
Specifies the minimum reporting likelihood threshold.
-infoTypes Set of infoTypes to search for [e.g. PHONE_NUMBER US_PASSPORT]
- Redact phone numbers and email addresses from test.png:
java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Redact -f src/test/resources/test.png -o test-redacted.png -infoTypes PHONE_NUMBER EMAIL_ADDRESS
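The -minLikelihood option sets a minimum reporting threshold. One way to picture its effect is as a filter over findings, sketched below under two assumptions: that the choices are ordered from least to most likely as listed in the help output, and that the threshold is inclusive. The `LikelihoodFilter` class and its local `Likelihood` enum are hypothetical stand-ins, not the client library's types, and the real filtering happens inside the DLP service.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of threshold semantics; not the DLP client library.
class LikelihoodFilter {
  // Declared in the order shown in the help output; compareTo follows declaration order.
  enum Likelihood {
    LIKELIHOOD_UNSPECIFIED, VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, VERY_LIKELY
  }

  // Keeps only the findings whose likelihood is at or above the minimum (assumed inclusive).
  static List<Likelihood> atLeast(Likelihood min, List<Likelihood> findings) {
    List<Likelihood> kept = new ArrayList<>();
    for (Likelihood likelihood : findings) {
      if (likelihood.compareTo(min) >= 0) {
        kept.add(likelihood);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Likelihood> findings =
        List.of(Likelihood.UNLIKELY, Likelihood.POSSIBLE, Likelihood.VERY_LIKELY);
    System.out.println(atLeast(Likelihood.LIKELY, findings)); // prints [VERY_LIKELY]
  }
}
```

Under these assumptions, the default of LIKELIHOOD_UNSPECIFIED sits below every other value, so this sketch would keep all findings.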
- Create a Google Cloud Storage bucket and upload test.txt.
- Create a Google Cloud Datastore kind and add an entity with properties:
property1: [email protected]
property2: 343-343-3435
- Update the Google Cloud Storage path and Datastore kind in InspectIT.java.
- Ensure that GOOGLE_APPLICATION_CREDENTIALS points to an authorized service account credentials file.
Run all tests:
mvn clean verify