With obfuscation rules, you can obfuscate sensitive information in your logs after they have been shipped to New Relic but before they are stored in NRDB.
Sensitive information might include personally-identifiable information such as credit card numbers or other data that you may be required by regulation to protect when stored.
You can define regular expressions matching your sensitive information, and then create rules to obfuscate that data. You can choose either to have sensitive information masked or hashed.
Definitions
- Obfuscation rules define what logs to apply obfuscation actions to.
- Obfuscation rule actions define what attributes to look at, what text to obfuscate, and how to obfuscate (either by masking or hashing).
- Obfuscation expressions are named regular expressions that identify what text to obfuscate.
- Masking completely removes information, replacing it with "X" characters. You cannot search for specific values once this is done.
- Hashing hides information. You can use the hashing tool to get the hash of a sensitive value, and then search for logs containing that hash.
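
The difference between the two methods can be illustrated with a minimal Python sketch. The helper names `mask` and `hash_sha256` are hypothetical, and the details are assumptions: one "X" per original character, and an unsalted SHA-256 digest of the UTF-8 encoded value. The hashes New Relic actually stores may be computed differently, so use the hashing tool rather than this sketch to obtain searchable values.

```python
import hashlib


def mask(value: str) -> str:
    """Replace every character with "X" (assumed: one X per character).

    The original value cannot be recovered or searched for afterwards.
    """
    return "X" * len(value)


def hash_sha256(value: str) -> str:
    """Return a SHA-256 hex digest (assumed: unsalted, UTF-8 encoded input).

    The same input always yields the same digest, which is what makes
    hashed values searchable.
    """
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


masked = mask("01/02/2003")
hashed = hash_sha256("123-12-1234")
print(masked)       # ten "X" characters
print(len(hashed))  # a SHA-256 hex digest is always 64 characters long
```

Because hashing is deterministic, two logs containing the same sensitive value end up with the same hash, while masked values are indistinguishable from one another.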
How obfuscation works
The JSON objects in the following example are simplified versions of the payloads used in the Obfuscation API, to help you correlate the different API operations with their UI equivalents.
Example: log record before obfuscation
Imagine you have the following log record:
{ "message": "The credit card number 4321-5678-9876-2345 belongs to user user@email.com (born on 01/02/2003) with SSN 123-12-1234", "creditCardNumber": "4321-5678-9876-2345", "ssn": "123-12-1234", "department": "sales", "serviceName": "loginService"}
This log record contains several pieces of sensitive data. Ideally, you would like your log to end up looking like this:
{ "message": "The credit card number 9aa9bc1528859aee1b1df75795f1ebd54beb2f0d26c8a1d4580a71a07189cdd5 belongs to user user@email.com (born on XXXXXXXXXX) with SSN 30e6897f76dc102e32ee1d781c43417d259e586eac15c963d75ab8b5187769da", "creditCardNumber": "9aa9bc1528859aee1b1df75795f1ebd54beb2f0d26c8a1d4580a71a07189cdd5", "ssn": "30e6897f76dc102e32ee1d781c43417d259e586eac15c963d75ab8b5187769da", "department": "sales", "serviceName": "loginService"}
1. What actions to apply
You decide you want to apply the following obfuscation actions to all the logs coming from that service:
- HASH the credit card number present in the message and creditCardNumber attributes.
- MASK the birth date present in the message attribute.
- HASH the social security number present in the message and ssn attributes.
2. How expressions will capture sensitive data
The first thing you need to do is to define some obfuscation expressions that allow you to capture this sensitive information:
Obfuscation expression | Definition |
---|---|
Credit card number | We need to capture 4 groups of 4 digits separated by hyphens. |
Social security number | We need to capture 3 groups of 3, 2, and 4 digits separated by hyphens. |
Born date (loginService specific) | In this example, the born date is part of the Login service. We define the portion to obfuscate based on the date information in the surrounding words. |
Each obfuscation expression defines how to capture some sensitive information out of a string (using a regex) and associates it with some friendly name so that you can easily identify it later.
Obfuscation expressions can be reusable: they are totally agnostic to how the log attribute containing the sensitive data is named. For instance, the Social Security expression defined above could be applied to a log attribute named either ssn, socialSecurityNumber, or socSecNum.
You can still create non-reusable obfuscation expressions (like the Born date (loginService specific) one) that are tightly coupled to the log attribute's format; for example, an expression that captures whatever comes after "born on" and before the closing parenthesis.
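
As an illustration, the three expressions described above could be sketched with Python's `re` module. The regular expressions below are hypothetical examples written for this sample log, not definitions taken from New Relic, and the convention that the capture group marks the text to obfuscate is an assumption of this sketch.

```python
import re

# Hypothetical regexes; in this sketch the capture group marks the sensitive text.
EXPRESSIONS = {
    # 4 groups of 4 digits separated by hyphens
    "Credit card number": re.compile(r"\b(\d{4}-\d{4}-\d{4}-\d{4})\b"),
    # 3 groups of 3, 2, and 4 digits separated by hyphens
    "Social security number": re.compile(r"\b(\d{3}-\d{2}-\d{4})\b"),
    # Whatever sits between "born on " and the closing parenthesis
    "Born date (loginService specific)": re.compile(r"born on (.*?)\)"),
}

message = (
    "The credit card number 4321-5678-9876-2345 belongs to user user@email.com "
    "(born on 01/02/2003) with SSN 123-12-1234"
)

# Extract what each expression would capture from the message attribute.
captured = {name: rx.search(message).group(1) for name, rx in EXPRESSIONS.items()}
for name, value in captured.items():
    print(f"{name}: {value}")

# The reusable expressions do not depend on the attribute name: the same
# regex also matches the value of an attribute called ssn or socSecNum.
ssn_match = EXPRESSIONS["Social security number"].search("123-12-1234")
```

The first two patterns depend only on the shape of the value, which is what makes them reusable; the third depends on the surrounding words of one specific message format.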
3. Which logs will use your rule
Now that we have defined how to capture our sensitive data, we need to specify which logs need to be obfuscated (the ones of the Login Service) and how (with the obfuscation actions we defined). To achieve this, we define an obfuscation rule.
{ "name": "Obfuscate Login Service Logs", "filter": "serviceName = 'loginService' AND department = 'sales'", "actions": [ { "attributes": ["message", "creditCardNumber"], "expression": { "name": "Credit Card Number" }, "method": "HASH_SHA256" }, { "attributes": ["message"], "expression": { "name": "Born date - loginService specific" }, "method": "MASK" }, { "attributes": ["message", "ssn"], "expression": { "name": "Social Security Number" }, "method": "HASH_SHA256" } ]}
This rule contains three main components:
Obfuscation rule component | Description |
---|---|
Name | It helps to easily identify what the rule does. In this example, this rule defines how to obfuscate the different attributes of the logs coming from the Login Service. |
Filter | The filter uses NRQL format to tell our system how to identify the target logs coming from the Login Service. This example queries for logs where serviceName = 'loginService' AND department = 'sales'. |
Actions | Finally, this rule defines the set of obfuscation actions to apply to the logs matching the filter. Each action defines which previously created obfuscation expression to use to extract the sensitive information from each set of attributes, as well as the obfuscation method (HASH_SHA256 or MASK) to apply. Note that when defining obfuscation rules via the GraphQL API, you will need to specify the |
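
To make the interplay of filter, actions, and expressions concrete, here is a minimal Python sketch that applies the rule above to the sample log record. It is an illustration under stated assumptions, not New Relic's implementation: the regexes are hypothetical, the NRQL filter is replaced by a plain Python predicate, hashing is assumed to be unsalted SHA-256, and `obfuscate_text` and `apply_rule` are made-up helper names.

```python
import hashlib
import re

# Hypothetical regexes; the capture group marks the text to obfuscate.
EXPRESSIONS = {
    "Credit Card Number": re.compile(r"\b(\d{4}-\d{4}-\d{4}-\d{4})\b"),
    "Born date - loginService specific": re.compile(r"born on (.*?)\)"),
    "Social Security Number": re.compile(r"\b(\d{3}-\d{2}-\d{4})\b"),
}

RULE = {
    "name": "Obfuscate Login Service Logs",
    # Stand-in for the NRQL filter:
    # serviceName = 'loginService' AND department = 'sales'
    "matches": lambda log: log.get("serviceName") == "loginService"
    and log.get("department") == "sales",
    "actions": [
        {"attributes": ["message", "creditCardNumber"],
         "expression": "Credit Card Number", "method": "HASH_SHA256"},
        {"attributes": ["message"],
         "expression": "Born date - loginService specific", "method": "MASK"},
        {"attributes": ["message", "ssn"],
         "expression": "Social Security Number", "method": "HASH_SHA256"},
    ],
}


def obfuscate_text(text, regex, method):
    """Replace only the captured group of each match, keeping surrounding text."""
    def replace(match):
        value = match.group(1)
        if method == "MASK":
            hidden = "X" * len(value)
        else:  # HASH_SHA256 (assumed unsalted)
            hidden = hashlib.sha256(value.encode("utf-8")).hexdigest()
        whole, offset = match.group(0), match.start()
        start, end = match.start(1) - offset, match.end(1) - offset
        return whole[:start] + hidden + whole[end:]
    return regex.sub(replace, text)


def apply_rule(rule, log):
    """Return an obfuscated copy of the log, or the log unchanged if the filter fails."""
    if not rule["matches"](log):
        return log
    result = dict(log)
    for action in rule["actions"]:
        regex = EXPRESSIONS[action["expression"]]
        for attribute in action["attributes"]:
            if isinstance(result.get(attribute), str):
                result[attribute] = obfuscate_text(result[attribute], regex, action["method"])
    return result


log = {
    "message": "The credit card number 4321-5678-9876-2345 belongs to user "
               "user@email.com (born on 01/02/2003) with SSN 123-12-1234",
    "creditCardNumber": "4321-5678-9876-2345",
    "ssn": "123-12-1234",
    "department": "sales",
    "serviceName": "loginService",
}
result = apply_rule(RULE, log)
print(result["message"])
```

Attributes that no action mentions, such as department and serviceName, pass through untouched, and logs that do not match the filter are returned as they are.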
4. Reusing expressions in other rules
As a final example, imagine we also needed to obfuscate logs coming from another service named "Checkout Service" that have an attribute serviceName = checkoutService as well as a ccn attribute that contains credit card information:
{ "message": "Order completed", "ccn": "4321-5678-9876-2345", "department": "sales", "serviceName": "checkoutService"}
To obfuscate the logs from this service, we would only have to define another obfuscation rule targeting these specific logs, and we would simply reuse the previously created Credit card number obfuscation expression:
{ "name": "Obfuscate Checkout Service Logs", "filter": "serviceName = 'checkoutService' AND department = 'sales'", "actions": [ { "attributes": ["ccn"], "expression": { "name": "Credit Card Number" }, "method": "HASH_SHA256" } ]}
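
The reuse can be sketched in Python as two rules that look up the same named expression. As before, this is only an illustration: the regex is hypothetical, the rule dictionaries are simplified (the filter is reduced to a `service` key plus a fixed department check), hashing is assumed to be unsalted SHA-256, and `hash_matches` and `obfuscate` are made-up helper names.

```python
import hashlib
import re

# One shared, named expression (hypothetical regex).
EXPRESSIONS = {"Credit Card Number": re.compile(r"\b(\d{4}-\d{4}-\d{4}-\d{4})\b")}

# Two simplified rules that reference the same expression by name.
RULES = [
    {"name": "Obfuscate Login Service Logs", "service": "loginService",
     "attributes": ["message", "creditCardNumber"], "expression": "Credit Card Number"},
    {"name": "Obfuscate Checkout Service Logs", "service": "checkoutService",
     "attributes": ["ccn"], "expression": "Credit Card Number"},
]


def hash_matches(text, regex):
    """Replace every match with its SHA-256 hex digest (assumed unsalted)."""
    return regex.sub(
        lambda m: hashlib.sha256(m.group(1).encode("utf-8")).hexdigest(), text
    )


def obfuscate(log):
    """Apply every rule whose simplified filter matches the log."""
    result = dict(log)
    for rule in RULES:
        if log.get("serviceName") != rule["service"] or log.get("department") != "sales":
            continue
        regex = EXPRESSIONS[rule["expression"]]
        for attribute in rule["attributes"]:
            if isinstance(result.get(attribute), str):
                result[attribute] = hash_matches(result[attribute], regex)
    return result


checkout_log = {
    "message": "Order completed",
    "ccn": "4321-5678-9876-2345",
    "department": "sales",
    "serviceName": "checkoutService",
}
result = obfuscate(checkout_log)
print(result["ccn"])
```

Under the unsalted-hash assumption, the same card number produces the same digest in both services, so a single hashed value can be used to search across them.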
Checklist: steps to obfuscate logs
To obfuscate your logs:
- Study the shape of your logs:
  - Do all your logs contain sensitive information? Or can you be more specific (only the logs from service A or region B)?
  - What sensitive information do they contain: credit card numbers, driver's license numbers, biometrics, other values?
- Create obfuscation expressions to identify how to extract sensitive data.
- Define obfuscation rules for each set of logs:
  - Define how you will capture them using NRQL.
  - Define which obfuscation actions need to be applied to each of them. Ask yourself: Will I need to query my logs using this sensitive information later (consider using HASH), or do I need to remove this information entirely from my logs (consider using MASK)?
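
For the first step of the checklist, one possible way to study the shape of your logs is to run candidate patterns over a sample of exported log records and count which service and attribute combinations match. The sketch below is a hypothetical offline helper, not a New Relic feature; the regexes, the `audit` function, and the sample records are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical candidate patterns for sensitive values.
PATTERNS = {
    "credit card": re.compile(r"\b\d{4}-\d{4}-\d{4}-\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def audit(logs):
    """Count matches per (serviceName, attribute, pattern name)."""
    hits = Counter()
    for log in logs:
        for attribute, value in log.items():
            if not isinstance(value, str):
                continue
            for name, regex in PATTERNS.items():
                if regex.search(value):
                    hits[(log.get("serviceName"), attribute, name)] += 1
    return hits


sample_logs = [
    {"message": "The credit card number 4321-5678-9876-2345 belongs to user "
                "user@email.com (born on 01/02/2003) with SSN 123-12-1234",
     "creditCardNumber": "4321-5678-9876-2345", "ssn": "123-12-1234",
     "department": "sales", "serviceName": "loginService"},
    {"message": "Order completed", "ccn": "4321-5678-9876-2345",
     "department": "sales", "serviceName": "checkoutService"},
]

hits = audit(sample_logs)
for (service, attribute, kind), count in sorted(hits.items()):
    print(f"{service}.{attribute}: {kind} x{count}")
```

The resulting counts suggest which services need a rule, which attributes each action should list, and which expressions are worth defining.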
Obfuscation expressions
You can read, create, update, or delete obfuscation expressions by using the New Relic One UI or by using NerdGraph, our GraphQL Explorer.
Obfuscation rules
You can read, create, update, or delete obfuscation rules by using the New Relic One UI or by using NerdGraph, our GraphQL Explorer.