
Hash or mask sensitive data in your logs (Obfuscation)

After your logs are shipped to New Relic, obfuscation rules can hide any sensitive information they contain before the logs are stored in NRDB.

Sensitive information might include personally-identifiable information such as credit card numbers or other data that you may be required by regulation to protect when stored.

You can define regular expressions matching your sensitive information, and then create rules to obfuscate that data. You can choose either to have sensitive information masked or hashed.

Definitions

  • Obfuscation rules define what logs to apply obfuscation actions to.
  • Obfuscation rule actions define what attributes to look at, what text to obfuscate, and how to obfuscate (either by masking or hashing).
  • Obfuscation expressions are named regular expressions that identify what text to obfuscate.
  • Masking completely removes information, replacing it with "X" characters. You cannot search for specific values once this is done.
  • Hashing hides information. You can use the hashing tool to get the hash of a sensitive value, and then search for logs containing that hash.
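
To make the difference between the two methods concrete, here is a minimal Python sketch. It assumes that hashing means the SHA-256 hex digest of the value's UTF-8 bytes and that masking emits one "X" per original character; the function names mask and hash_sha256 are hypothetical, and New Relic's actual implementation may differ in these details.

```python
import hashlib


def mask(value: str) -> str:
    """Replace every character with "X" (assumption: one X per character)."""
    return "X" * len(value)


def hash_sha256(value: str) -> str:
    """Return the SHA-256 hex digest of the value's UTF-8 bytes (assumed encoding)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(mask("01/02/2003"))          # the original characters are gone
    print(hash_sha256("123-12-1234"))  # same input always gives the same digest
```

Because a hash is deterministic, hashing a known value again yields the same digest, which is what makes searching possible. A masked value carries no such handle.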

How obfuscation works

The JSON objects displayed in the following example are simplifications of the payloads used in the Obfuscation API, in order to help you better correlate the different API operations with their UI equivalent counterparts.

Example: log record before obfuscation

Imagine you have the following log record:

{
"message": "The credit card number 4321-5678-9876-2345 belongs to user user@email.com (born on 01/02/2003) with SSN 123-12-1234",
"creditCardNumber": "4321-5678-9876-2345",
"ssn": "123-12-1234",
"department": "sales",
"serviceName": "loginService"
}

This log record contains several pieces of sensitive data. Ideally, you would like your log to end up looking like this:

{
"message": "The credit card number 9aa9bc1528859aee1b1df75795f1ebd54beb2f0d26c8a1d4580a71a07189cdd5 belongs to user user@email.com (born on XXXXXXXXXX) with SSN 30e6897f76dc102e32ee1d781c43417d259e586eac15c963d75ab8b5187769da",
"creditCardNumber": "9aa9bc1528859aee1b1df75795f1ebd54beb2f0d26c8a1d4580a71a07189cdd5",
"ssn": "30e6897f76dc102e32ee1d781c43417d259e586eac15c963d75ab8b5187769da",
"department": "sales",
"serviceName": "loginService"
}

1. What actions to apply

You decide you want to apply the following obfuscation actions to all the logs coming from that service:

  • HASH the credit card number present in the message and creditCardNumber attributes.
  • MASK the birth date present in the message attribute.
  • HASH the social security number present in the message and ssn attributes.

2. How expressions will capture sensitive data

The first thing you need to do is to define some obfuscation expressions that allow you to capture this sensitive information:

Obfuscation expression

Definition

Credit card number

We need to capture 4 groups of 4 digits separated by hyphens:

{
"name": "Credit Card Number",
"regex": "(\\d{4}-\\d{4}-\\d{4}-\\d{4})"
}

Social security number

We need to capture 3 groups of 3, 2, and 4 digits separated by hyphens:

{
"name": "Social Security Number",
"regex": "(\\d{3}-\\d{2}-\\d{4})"
}

Born date (loginService specific)

In this example, the birth date format is specific to the Login service. We identify the portion to obfuscate by the text that surrounds it, "(born on 01/02/2003)":

{
"name": "Born date - loginService specific",
"regex": "born on (.*)\\)"
}

Each obfuscation expression defines how to capture some sensitive information out of a string (using a regex) and associates it with some friendly name so that you can easily identify it later.

Obfuscation expressions can be reusable: they are totally agnostic to how the log attribute containing the sensitive data is named. For instance, the Social Security expression defined above could be applied to a log attribute named either ssn, socialSecurityNumber, or socSecNum.

You can still create non-reusable obfuscation expressions (like the Born date (loginService specific) one) that are tightly coupled to the format of a particular log attribute; for example, capturing whatever comes after "born on " and before the closing ")".
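
Before saving an expression, it can help to check what its capture group actually picks up. The sketch below runs the three expressions against the sample message using Python's re module, with \d denoting a digit. This is only an approximation: the regex engine New Relic uses may differ from Python's, and the EXPRESSIONS dictionary and captures function are hypothetical names for this illustration.

```python
import re

# The three expressions from this example, written as Python raw strings.
EXPRESSIONS = {
    "Credit Card Number": r"(\d{4}-\d{4}-\d{4}-\d{4})",
    "Social Security Number": r"(\d{3}-\d{2}-\d{4})",
    "Born date - loginService specific": r"born on (.*)\)",
}

MESSAGE = (
    "The credit card number 4321-5678-9876-2345 belongs to user "
    "user@email.com (born on 01/02/2003) with SSN 123-12-1234"
)


def captures(text: str) -> dict[str, list[str]]:
    """Return the text captured by group 1 of each expression."""
    return {name: re.findall(rx, text) for name, rx in EXPRESSIONS.items()}


if __name__ == "__main__":
    for name, found in captures(MESSAGE).items():
        print(f"{name}: {found}")
```

Note that the greedy .* in the birth date expression extends to the last ")" in the text, so a message with a later closing parenthesis would capture more than the date. A narrower pattern that only accepts digits and slashes may be safer for real logs.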

3. Which logs will use your rule

Now that we have defined how to capture our sensitive data, we need to specify which logs to obfuscate (those from the Login Service) and how (with the obfuscation actions we chose in step 1). To achieve this, we define an obfuscation rule.

{
"name": "Obfuscate Login Service Logs",
"filter": "serviceName = 'loginService' AND department = 'sales'",
"actions": [
{
"attributes": ["message", "creditCardNumber"],
"expression": { "name": "Credit Card Number" },
"method": "HASH_SHA256"
},
{
"attributes": ["message"],
"expression": { "name": "Born date - loginService specific" },
"method": "MASK"
},
{
"attributes": ["message", "ssn"],
"expression": { "name": "Social Security Number" },
"method": "HASH_SHA256"
}
]
}

This rule contains three main components:

  • Name: The name helps you identify what the rule does. In this example, the rule defines how to obfuscate the different attributes of the logs coming from the Login Service.
  • Filter: The filter uses NRQL format to tell our system how to identify the target logs coming from the Login Service. This example selects logs where serviceName = loginService and department = sales.
  • Actions: The rule defines the set of obfuscation actions to apply to the logs matching the filter. Each action names a previously created obfuscation expression to extract the sensitive information from a set of attributes, and the obfuscation method (HASH_SHA256 or MASK) to apply to it.

Note that when defining obfuscation rules via the GraphQL API, you will need to specify the id of the obfuscation expressions instead of their names. To make the previous example more readable, we used the obfuscation expression names instead.
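
The way a rule's filter and actions combine can be sketched locally in Python. This is an illustration under several assumptions, not New Relic's implementation: the NRQL filter is replaced by a plain Python predicate, only capture group 1 of each match is obfuscated (which is what the expected output above suggests), hashing is the SHA-256 hex digest of UTF-8 bytes, and masking writes one "X" per character. The names apply_rule and obfuscate_text are hypothetical.

```python
import hashlib
import re

EXPRESSIONS = {
    "Credit Card Number": r"(\d{4}-\d{4}-\d{4}-\d{4})",
    "Social Security Number": r"(\d{3}-\d{2}-\d{4})",
    "Born date - loginService specific": r"born on (.*)\)",
}

METHODS = {
    "HASH_SHA256": lambda s: hashlib.sha256(s.encode("utf-8")).hexdigest(),
    "MASK": lambda s: "X" * len(s),
}

RULE = {
    "name": "Obfuscate Login Service Logs",
    # Stand-in for the NRQL filter: a plain Python predicate (assumption).
    "filter": lambda log: log.get("serviceName") == "loginService"
    and log.get("department") == "sales",
    "actions": [
        {"attributes": ["message", "creditCardNumber"],
         "expression": "Credit Card Number", "method": "HASH_SHA256"},
        {"attributes": ["message"],
         "expression": "Born date - loginService specific", "method": "MASK"},
        {"attributes": ["message", "ssn"],
         "expression": "Social Security Number", "method": "HASH_SHA256"},
    ],
}


def obfuscate_text(text, regex, method):
    """Replace only capture group 1 of each match, keeping the surrounding text."""
    def repl(m):
        whole, offset = m.group(0), m.start()
        start, end = m.start(1) - offset, m.end(1) - offset
        return whole[:start] + method(m.group(1)) + whole[end:]
    return re.sub(regex, repl, text)


def apply_rule(rule, log):
    """Return an obfuscated copy of the log, or an unchanged copy if the filter does not match."""
    result = dict(log)
    if not rule["filter"](log):
        return result
    for action in rule["actions"]:
        regex = EXPRESSIONS[action["expression"]]
        method = METHODS[action["method"]]
        for attr in action["attributes"]:
            if isinstance(result.get(attr), str):
                result[attr] = obfuscate_text(result[attr], regex, method)
    return result
```

Calling apply_rule(RULE, log) on the sample record leaves department and serviceName untouched, hashes the credit card and social security numbers wherever the listed attributes contain them, and masks the birth date inside message. A record from any other service comes back unchanged.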

4. Reusing expressions in other rules

As a final example, imagine we also needed to obfuscate logs coming from another service named "Checkout Service" that have an attribute serviceName = checkoutService as well as a ccn attribute that contains credit card information:

{
"message": "Order completed",
"ccn": "4321-5678-9876-2345",
"department": "sales",
"serviceName": "checkoutService"
}

To obfuscate the logs from this service, we would only have to define another obfuscation rule targeting these specific logs, and we would simply reuse the previously created Credit card number obfuscation expression:

{
"name": "Obfuscate Checkout Service Logs",
"filter": "serviceName = 'checkoutService' AND department = 'sales'",
"actions": [
{
"attributes": ["ccn"],
"expression": { "name": "Credit Card Number" },
"method": "HASH_SHA256"
}
]
}
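
One practical consequence of reusing an expression with the same method is sketched below in Python: the same card number produces the same digest no matter which attribute holds it. This assumes a plain, unsalted SHA-256 hex digest of the UTF-8 bytes, which the source does not confirm; hash_matches is a hypothetical helper.

```python
import hashlib
import re

CREDIT_CARD = re.compile(r"(\d{4}-\d{4}-\d{4}-\d{4})")


def hash_matches(text: str) -> str:
    """Replace each credit card number with its SHA-256 hex digest (assumed unsalted)."""
    return CREDIT_CARD.sub(
        lambda m: hashlib.sha256(m.group(1).encode("utf-8")).hexdigest(), text
    )


login_log = {"creditCardNumber": "4321-5678-9876-2345", "serviceName": "loginService"}
checkout_log = {"ccn": "4321-5678-9876-2345", "serviceName": "checkoutService"}

login_hash = hash_matches(login_log["creditCardNumber"])
checkout_hash = hash_matches(checkout_log["ccn"])

# The attribute names differ, but the expression and method are the same,
# so both services end up with the same digest for the same card number.
print(login_hash == checkout_hash)  # True
```

Under that assumption, a single hash value could be used to find related logs from both services.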

Checklist: steps to obfuscate logs

To obfuscate your logs:

  1. Study the shape of your logs:
  • Do all your logs contain sensitive information? Or can you be more specific (only the logs from service A or region B)?
  • What sensitive information do they contain: credit card numbers, driver's license numbers, biometrics, other values?
  2. Create obfuscation expressions to identify how to extract sensitive data.
  3. Define obfuscation rules for each set of logs:
  • Define how you will capture them using NRQL.
  • Define which obfuscation actions need to be applied to each of them. Ask yourself: Will I need to query my logs using this sensitive information later (consider using HASH), or do I need to remove this information entirely from my logs (consider using MASK)?

Obfuscation expressions

You can read, create, update, or delete obfuscation expressions by using the New Relic One UI or by using NerdGraph, our GraphQL Explorer.

Obfuscation rules

You can read, create, update, or delete obfuscation rules by using the New Relic One UI or by using NerdGraph, our GraphQL Explorer.

Copyright © 2022 New Relic Inc.