Skip to content

Latest commit

 

History

History
 
 

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 

README.md

Cloud Data Loss Prevention (DLP) API Samples

Open in Cloud Shell

The Data Loss Prevention API provides programmatic access to a powerful detection engine for personally identifiable information and other privacy-sensitive data in unstructured data streams.

Setup

  • A Google Cloud project with billing enabled
  • Enable the DLP API.
  • (Local testing) Create a service account and set the GOOGLE_APPLICATION_CREDENTIALS environment variable pointing to the downloaded credentials file.
  • (Local testing) Set the DLP_DEID_WRAPPED_KEY environment variable to an AES-256 key encrypted ('wrapped') with a Cloud Key Management Service (KMS) key.
  • (Local testing) Set the DLP_DEID_KEY_NAME environment variable to the path-name of the Cloud KMS key you wrapped DLP_DEID_WRAPPED_KEY with.

Build

This project uses the Assembly Plugin to build an uber jar. Run:

   mvn clean package -DskipTests

Retrieve InfoTypes

An InfoType identifier represents an element of sensitive data.

InfoTypes are updated periodically. Use the API to retrieve the most current InfoTypes.

  java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Metadata

Run the quickstart

The Quickstart demonstrates using the DLP API to identify an InfoType in a given string.

   java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.QuickStart

Inspect data for sensitive elements

Inspect strings, files locally and on Google Cloud Storage, Cloud Datastore, and BigQuery with the DLP API.

Note: image scanning is not currently supported on Google Cloud Storage. For more information, refer to the API documentation. Optional flags are explained in this resource.

Commands:
  -s <string>                   Inspect a string using the Data Loss Prevention API.
  -f <filepath>                 Inspects a local text, PNG, or JPEG file using the Data Loss Prevention API.
  -gcs -bucketName <bucketName> -fileName <fileName>  Inspects a text file stored on Google Cloud Storage using the Data Loss
                                          Prevention API.
  -ds -projectId [projectId] -namespace [namespace] - kind <kind> Inspect a Datastore instance using the Data Loss Prevention API.

Options:
  --help               Show help 
  -minLikelihood       [string] [choices: "LIKELIHOOD_UNSPECIFIED", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]
                       [default: "LIKELIHOOD_UNSPECIFIED"]
                       specifies the minimum reporting likelihood threshold.
  -f, --maxFindings    [number] [default: 0]
                       maximum number of results to retrieve
  -q, --includeQuote   [boolean] [default: true] include matching string in results
  -t, --infoTypes      set of infoTypes to search for [eg. PHONE_NUMBER US_PASSPORT]
  -customDictionaries  set of comma-separated dictionary words to search for as customInfoTypes
  -customRegexes       set of regex patterns to search for as customInfoTypes

Examples

  • Inspect a string:
    java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -s "My phone number is (123) 456-7890 and my email address is [email protected]" --infoTypes PHONE_NUMBER EMAIL_ADDRESS
    java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -s "My phone number is (123) 456-7890 and my email address is [email protected]" -customDictionaries [email protected] -customRegexes "\(\d{3}\) \d{3}-\d{4}"
    
  • Inspect a local file (text / image):
      java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -f src/test/resources/test.txt --infoTypes PHONE_NUMBER EMAIL_ADDRESS
      java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -f src/test/resources/test.png --infoTypes PHONE_NUMBER EMAIL_ADDRESS
    
  • Inspect a file on Google Cloud Storage:
      java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -gcs -bucketName my-bucket -fileName my-file.txt --infoTypes PHONE_NUMBER EMAIL_ADDRESS
    
  • Inspect a Google Cloud Datastore kind:
      java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -ds -kind my-kind --infoTypes PHONE_NUMBER EMAIL_ADDRESS
    

Automatic redaction of sensitive data from images

Automatic redaction produces an output image with sensitive data matches removed.

Commands:
  -f <string>                   Source image file
  -o <string>                   Destination image file
 Options:
  --help               Show help
  -minLikelihood       choices: "LIKELIHOOD_UNSPECIFIED", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]
                       [default: "LIKELIHOOD_UNSPECIFIED"]
                       specifies the minimum reporting likelihood threshold.
  
  -infoTypes      set of infoTypes to search for [eg. PHONE_NUMBER US_PASSPORT]

Example

  • Redact phone numbers and email addresses from test.png:
      java -cp dlp/target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Redact -f src/test/resources/test.png -o test-redacted.png -infoTypes PHONE_NUMBER EMAIL_ADDRESS
    

Integration tests

Setup

Run

Run all tests:

   mvn clean verify