Parsing log data

Parsing is the process of splitting unstructured log data into attributes (key/value pairs). You can use these attributes to facet or filter logs in useful ways. This in turn helps you build better charts and alerts.

To get started with parsing, watch the following video tutorial on YouTube (approx. 4-1/2 minutes).

New Relic parses log data according to rules. This document describes how logs parsing works, how to use built-in rules, and how to create custom rules.

You can also create, query, and manage your log parsing rules by using NerdGraph, our GraphQL API, at api.newrelic.com/graphiql. For more information, see our NerdGraph tutorial for parsing.

Parsing example

A good example is a default NGINX access log containing unstructured text. It is useful for searching but not much else. Here's an example of a typical line:

127.180.71.3 - - [10/May/1997:08:05:32 +0000] "GET /downloads/product_1 HTTP/1.1" 304 0 "-" "Debian APT-HTTP/1.3 (0.8.16~exp12ubuntu10.21)"

In an unparsed format, you would need to do a full text search to answer most questions. After parsing, the log is organized into attributes, like response code and request URL:

{
  "remote_addr":"93.180.71.3",
  "time":"1586514731",
  "method":"GET",
  "path":"/downloads/product_1",
  "version":"HTTP/1.1",
  "response":"304",
  "bytesSent": 0,
  "user_agent": "Debian APT-HTTP/1.3 (0.8.16~exp12ubuntu10.21)"
}

Parsing makes it easier to create custom queries that facet on those values. This helps you understand the distribution of response codes per request URL and quickly find problematic pages.

How log parsing works

Here's an overview of how New Relic implements parsing of logs:

Log parsing	How it works
What	All parsing takes place against the `message` field; no other fields can be parsed. Each parsing rule is created with matching criteria that determines which logs the rule will attempt to parse. To simplify the matching process, we recommend adding a `logtype` attribute to your logs. However, you are not limited to using `logtype`; any attribute can be used as matching criteria.
When	Parsing will only be applied once to each log message. If multiple parsing rules match the log, only the first that succeeds will be applied. Parsing takes place during log ingestion, before data is written to NRDB. Once data has been written to storage, it can no longer be parsed. Parsing occurs in the pipeline before data enrichments take place. Be careful when defining the matching criteria for a parsing rule. If the criteria is based on an attribute that doesn't exist untail after parsing or enrichment take place, that data won't be present in the logs when matching occurs. As a result, no parsing will happen.
How	Rules can be written in Grok, regex, or a mixture of the two. Grok is a collection of patterns that abstract away complicated regular expressions. If the content of the message field is JSON, it will be parsed automatically.

Parse attributes using Grok

Parsing patterns are specified using Grok, an industry standard for parsing log messages. Any incoming log with a logtype field will be checked against our built-in patterns, and if possible, the associated Grok pattern is applied to the log.

Grok is a superset of regular expressions that adds built-in named patterns to be used in place of literal complex regular expressions. For instance, instead of having to remember that an integer can be matched with the regular expression (?:[+-]?(?:[0-9]+)), you can just write %{INT} to use the Grok pattern INT, which represents the same regular expression.

You can always use a mix of regular expressions and Grok pattern names in your matching string. For more information, see our list of Grok syntax and supported types.

A log record could look something like this:

{
  "message": "54.3.120.2 2048 0"
}

This information is accurate, but it's not exactly intuitive what it means. Grok patterns help you extract and understand the telemetry data you want. For example, a log record like this is much easier to use:

{
  "host_ip": "43.3.120.2",
  "bytes_received": 2048,
  "bytes_sent": 0
}

To do this, create a Grok pattern that extracts these three fields; for example:

"%{IP:host_ip} %{INT:bytes_received} %{INT:bytes_sent}"

After processing, your log record will include the fields host_ip, bytes_received, and bytes_sent. Now you can use these fields in New Relic to filter, facet, and perform statistical operations on your log data. For more details about how to parse logs with Grok patterns in New Relic, see our blog post.

If you have the correct permissions, you can create parsing rules in our UI to create, test, and enable Grok parsing. For example, to get a specific type of error message for one of your microservices called Inventory Services, you would create a Grok parsing rule that looks for a specific error message and product. To do this:

Give the rule a name; for example, Inventory Services error parsing.
Identify the attribute/value pair that acts as a pre-filter for the incoming logs; for example, entity.name and the value Inventory Service. This pre-filter narrows down the number of logs that need to be processed by your rule, removing unnecessary processing.

Add the Grok parsing rule; for example:

Inventory error: %{DATA:error_message} for product %{INT:product_id}

Where:

Inventory error: Your parsing rule's name
error_message: The error message you want to select
product_id: The product ID for Inventory Service

Test the Grok pattern to see if any incoming logs match.
Enable and save the parsing rule.
Soon you will see that your Inventory Service logs are enriched with two new fields: error_message and product_id. From here, you can query on these fields, create charts and dashboards, set alerts, etc.
For complete details, see our documentation to add custom parsing rules in the UI.

Grok patterns have the syntax:

%{PATTERN_NAME[:OPTIONAL_EXTRACTED_ATTRIBUTE_NAME[:OPTIONAL_TYPE]]}

Where:

PATTERN_NAME is one of the Grok patterns. Click grok-patterns to see the most commonly-used patterns. The pattern name is just a user-friendly name representing a regular expression. They are exactly equal to the corresponding regular expression.
OPTIONAL_EXTRACTED_ATTRIBUTE_NAME, if provided, is the name of the attribute that will be added to your log message with the value matched by the pattern name. It's equivalent to using a named capture group using regular expressions. If this is not provided, then the parsing rule will just match a region of your string, but not extract an attribute with its value.
OPTIONAL_TYPE specifies the type of attribute value to extract. If omitted, values are extracted as strings. For instance, to extract the value 123 from "File Size: 123" as a number into attribute file_size, use value: %{INT:file_size:int}.
Supported types are:
Type specified in Grok
Type stored in the New Relic database
boolean
boolean
byte short int integer
integer
long
long
float
float
double
double
string (default) text
string
date datetime
ISO 8601 time as a long

Type specified in Grok	Type stored in the New Relic database
`boolean`	`boolean`
`byte` `short` `int` `integer`	`integer`
`long`	`long`
`float`	`float`
`double`	`double`
`string` (default) `text`	`string`
`date` `datetime`	ISO 8601 time as a `long`

Organizing by logtype

New Relic's log ingestion pipeline can parse data by matching a log event to a rule that describes how the log should be parsed. There are two ways log events can be parsed:

Use a built-in rule.
Define a custom rule.

Rules are a combination of matching logic and parsing logic. Matching is done by defining a query match on an attribute of the logs. Rules are not applied retroactively. Logs collected before a rule is created are not parsed by that rule.

The simplest way to organize your logs and how they are parsed is to include the logtype field in your log event. This tells New Relic what built-in rule to apply to the logs.

重要

Once a parsing rule is active, data parsed by the rule is permanently changed. This cannot be reverted.

From the left nav in the Logs UI, select Parsing, then create your own custom parsing rule with an attribute, value, and Grok pattern.

Limits

Parsing is computationally expensive, which introduces risk. Parsing is done for custom rules defined in an account and for matching patterns to a log. A large number of patterns or poorly defined custom rules will consume a huge amount of memory and CPU resources while also taking a very long time to complete.

In order to prevent problems, we apply two parsing limits: per-message-per-rule and per-account.

Limit	Description
Per-message-per-rule	The per-message-per-rule limit prevents the time spent parsing any single message from being greater than 100 ms. If that limit is reached, the system will cease attempting to parse the log message with that rule. The ingestion pipeline will attempt to run any other applicable on that message, and the message will still be passed through the ingestion pipeline and stored in NRDB. The log message will be in its original, unparsed format.
Per-account	The per-account limit exists to prevent accounts from using more than their fair share of resources. The limit considers the total time spent processing all log messages for an account per-minute. The limit is not a fixed value; it scales up or down proportionally to the volume of data stored daily by the account and the environment size that is subsequently allocated to support that customer.

Limit

Description

Per-message-per-rule

The per-message-per-rule limit prevents the time spent parsing any single message from being greater than 100 ms. If that limit is reached, the system will cease attempting to parse the log message with that rule.

The ingestion pipeline will attempt to run any other applicable on that message, and the message will still be passed through the ingestion pipeline and stored in NRDB. The log message will be in its original, unparsed format.

Per-account

The per-account limit exists to prevent accounts from using more than their fair share of resources. The limit considers the total time spent processing all log messages for an account per-minute.

The limit is not a fixed value; it scales up or down proportionally to the volume of data stored daily by the account and the environment size that is subsequently allocated to support that customer.

ヒント

To easily check if your rate limits have been reached, go to your system Limits page in the New Relic UI.

Built-in parsing rules

Common log formats have well-established parsing rules already created for them. To get the benefit of built-in parsing rules, add the logtype attribute when forwarding logs. Set the value to something listed in the following table, and the rules for that type of log will be applied automatically.

List of built-in rules

The following logtype attribute values map to a predefined parsing rule. For example, to query the Application Load Balancer:

From the New Relic UI, use the format logtype:"alb".
From NerdGraph, use the format logtype = 'alb'.

To learn what fields are parsed for each rule, see our documentation about built-in parsing rules.

`logtype`	Log source	Example matching query
`apache`	Apache Access logs	`logtype:"apache"`
`alb`	Application Load Balancer logs	`logtype:"alb"`
`cloudfront-web`	CloudFront Web logs	`logtype:"cloudfront-web"`
`elb`	Elastic Load Balancer logs	`logtype:"elb"`
`haproxy_http`	HAProxy logs	`logtype:"haproxy_http"`
`ktranslate-health`	KTranslate Container Health logs	`logtype:"ktranslate-health"`
`iis_w3c`	Microsoft IIS server logs - W3C format	`logtype:"iis_w3c"`
`monit`	Monit logs	`logtype:"monit"`
`mysql-error`	MySQL Error logs	`logtype:"mysql-error"`
`nginx`	NGINX access logs	`logtype:"nginx"`
`nginx-error`	NGINX error logs	`logtype:"nginx-error"`
`route-53`	Route 53 logs	`logtype:"route-53"`
`syslog-rfc5424`	Syslogs with RFC5424 format	`logtype:"syslog-rfc5424"`

Add the `logtype` attribute

When aggregating logs, it's important to provide metadata that makes it easy to organize, search, and parse those logs. One simple way of doing this is to add the attribute logtype to the log messages when they are shipped. Built-in parsing rules are applied by default to certain logtype values.

ヒント

The fields logType, logtype, and LOGTYPE are all supported for built-in rules. For ease of searching, we recommend that you align on a single syntax in your organization.

Here are some examples of how to add logtype to logs sent by some of our supported shipping methods.

Add logtype as an attribute. You must set the logtype for each named source.

logs:
  - name: file-simple
    file: /path/to/file
    attributes:
      logtype: fileRaw  
  - name: nginx-example
    file: /var/log/nginx.log
    attributes:
      logtype: nginx

Add a filter block to the .conf file, which uses a record_transformer to add a new field. In this example we use a logtype of nginx to trigger the build-in NGINX parsing rule. Check out other Fluentd examples.

<filter containers>
  @type record_transformer
  enable_ruby true
  <record>
    #Add logtype to trigger a built-in parsing rule for nginx access logs
    logtype nginx
    #Set timestamp from the value contained in the field "time"
    timestamp record["time"]
    #Add hostname and tag fields to all records
    hostname "#{Socket.gethostname}"
    tag ${tag}
  </record>
</filter>

Add a filter block to the .conf file that uses a record_modifier to add a new field. In this example we use a logtype of nginx to trigger the build-in NGINX parsing rule. Check out other Fluent Bit examples.

[FILTER]
    Name record_modifier
    Match *
    Record logtype nginx
    Record hostname ${HOSTNAME}
    Record service_name Sample-App-Name

Add a filter block to the Logstash configuration which uses an add_field mutate filter to add a new field. In this example we use a logtype of nginx to trigger the build-in NGINX parsing rule. Check out other Logstash examples.

filter {
  mutate {
    add_field => {
      "logtype" => "nginx"
      "service_name" => "myservicename"
      "hostname" => "%{host}"
    }
  }
}

You can add attributes to the JSON request sent to New Relic. In this example we add a logtype attribute of value nginx to trigger the built-in NGINX parsing rule. Learn more about using the Logs API.

POST /log/v1 HTTP/1.1
Host: log-api.newrelic.com
Content-Type: application/json
X-License-Key: YOUR_LICENSE_KEY
Accept: */*
Content-Length: 133
{
  "timestamp": TIMESTAMP_IN_UNIX_EPOCH,
  "message": "User 'xyz' logged in",
  "logtype": "accesslogs",
  "service": "login-service",
  "hostname": "login.example.com"
}

Create custom parsing rules

Many logs are formatted or structured in a unique way. In order to parse them, custom logic must be built and applied.

From the left nav in the Logs UI, select Parsing, then create your own custom parsing rule with an attribute, value, and Grok pattern.

To create and manage your own, custom parsing rules:

Go to one.newrelic.com > Logs.
From Manage Data on the left nav of the Logs UI, click Parsing, then click Create parsing rule.
Enter the parsing rule's name.
Choose an attribute and value to match on.
Write your Grok pattern and test the rule. To learn about Grok and custom parsing rules, read our blog post about how to parse logs with Grok patterns.
Enable and save the custom parsing rule.

To view the list of custom parsing rules: From Manage Data on the left nav of the Logs UI, click Parsing.

To view existing parsing rules:

Go to one.newrelic.com > Logs.
From Manage Data on the left nav of the Logs UI, click Parsing.

Troubleshooting

If parsing is not working the way you intended, it may be due to:

Logic: The parsing rule matching logic does not match the logs you want.
Timing: If your parsing matching rule targets a value that doesn't exist yet, it will fail. This can occur if the value is added later in the pipeline as part of the enrichment process.
Limits: There is a fixed amount of time available every minute to process logs via parsing, patterns, drop filters, etc. If the maximum amount of time has been spent, parsing will be skipped for additional log event records.

To resolve these problems, create or adjust your custom parsing rules.

Parsing example

How log parsing works

Parse attributes using Grok

Grok example: Getting useful data out of your logs

UI example: Creating a Grok parsing rule

Grok syntax and supported types

Organizing by logtype

重要

Limits

ヒント

Built-in parsing rules

List of built-in rules

Add the `logtype` attribute

ヒント

New Relic infrastructure agent example

Fluentd example

Fluent Bit example

Logstash example

Logs API example

Create custom parsing rules

Troubleshooting

Parsing log data

Parsing example

How log parsing works

Parse attributes using Grok

UI example: Creating a Grok parsing rule

Grok syntax and supported types

Organizing by logtype

重要

Limits

ヒント

Built-in parsing rules

List of built-in rules

Add the logtype attribute

ヒント

New Relic infrastructure agent example

Fluentd example

Fluent Bit example

Logstash example

Logs API example

Create custom parsing rules

Troubleshooting

Add the `logtype` attribute