Forecasting | New Relic Documentation

To begin forecasting our future data ingest, we must develop an understanding of the kinds of growth drivers that could affect some or all of our data sources. The following sections describe what we call general growth drivers. Finally, we introduce the concept of a growth driver worksheet that can be used in your ingest budgeting process.

Understanding Growth Drivers

Seasonal and Business Cycle Growth

It's critical to understand the sources of telemetry growth that will occur through the year and over the years. Some of these are generally anticipated, others may be unexpected, and others are completely anomalous. These concepts are important when coming up with baseline budgets and growth targets, and they can also help during ad hoc resolution of unexpected telemetry growth.

Growth in the number of users is often welcome but can also seem overwhelming if we have no data governance plan in place. Our business is growing, we are bringing in new users, and the activity of each of those users causes additional Browser, APM, and Log data to be emitted. The need to scale K8s clusters, load balancers, and supporting platforms like Kafka also causes an increase in emitted telemetry.

Another type of growth is caused by an increase in business transactions without an obvious increase in the number of users. For example, a website that sold one type of product (shoes) has now broadened its inventory to offer hats and gloves. This results in more business transactions per user, causing a cascade in telemetry similar to that from an increase in users.

Code Refactor

There are some scenarios where a change in application code will cause a sudden increase in telemetry volume without any additional users or business transactions. Some examples:

A developer adds additional JavaScript code that interacts with the backend every time a user visits a page.
A developer adds new logging code to some application methods that are called very frequently.
A new database schema requires multiple database calls where previously one was needed.
A monolithic application is broken into five microservices, with APM and distributed trace data now being emitted for each.

Instrumentation Misconfiguration

Field Example: An organization previously sampled core operating system metrics every 30 seconds on about 2000 hosts. A misconfiguration or unauthorized change to a 10-second interval tripled the OS metrics telemetry captured from those hosts.

Later on we will discuss telemetry standards. One of the things often governed by a telemetry standard is the sampling rate for various monitoring activities.
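The arithmetic behind the field example can be sketched as follows. This is a minimal illustration, assuming each host emits one sample per interval; the function names are hypothetical and not part of any New Relic API.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(interval_seconds: int, hosts: int) -> int:
    """Total samples emitted per day, assuming one sample per host per interval."""
    return hosts * (SECONDS_PER_DAY // interval_seconds)


def ingest_multiplier(old_interval_seconds: int, new_interval_seconds: int) -> float:
    """Factor by which sample volume changes when the interval changes."""
    return old_interval_seconds / new_interval_seconds


# Values from the field example: about 2000 hosts, 30s changed to 10s.
before = samples_per_day(30, 2000)
after = samples_per_day(10, 2000)
multiplier = ingest_multiplier(30, 10)

print(f"before: {before} samples/day")
print(f"after:  {after} samples/day")
print(f"multiplier: {multiplier}x")
```

Because the multiplier depends only on the ratio of the two intervals, even a small-looking configuration change can have a large effect across a fleet.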

Increasing Breadth of Observability

Field Example: Expanding observability is part of the continuous improvement process. An organization that had been monitoring Kafka broker health for nearly two years decided to start monitoring Kafka topic offsets. Not realizing how verbose topic monitoring data is, they were surprised when the telemetry footprint from Kafka monitoring increased fivefold.

Caution

Mergers and acquisitions are a common way in which telemetry growth "sneaks up" on an organization. We suggest you incorporate observability consolidation as a formal action item in the integration process. This is no different than adding cloud compute consolidation to your overall integration plans. This is also an area where you should lean on your New Relic account team, since these events are sometimes unexpected.

Unexpected Third Party Change

Field Example: A JMX integration is designed to collect any metric exposed by a third-party application with the prefix "Transaction". With version 1.0 of the application, that yields 10 events per sample. The team maintaining the third-party application then adds 10 additional metrics with the "Transaction" prefix. When our team installs the new version, we are surprised to find that JMX events have doubled.
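The effect of prefix-based collection can be illustrated with a small sketch. The metric names below are invented for illustration, and the matching function is a simplification rather than the actual JMX integration logic.

```python
def matched_metrics(exposed: list[str], prefix: str) -> list[str]:
    """Return every exposed metric name that starts with the given prefix."""
    return [name for name in exposed if name.startswith(prefix)]


# Hypothetical metric names; the real application's names are not known.
v1_metrics = [f"Transaction.metric{i}" for i in range(10)] + ["Memory.heapUsed"]
v2_metrics = v1_metrics + [f"Transaction.extra{i}" for i in range(10)]

v1_count = len(matched_metrics(v1_metrics, "Transaction"))
v2_count = len(matched_metrics(v2_metrics, "Transaction"))

print(f"version 1.0: {v1_count} events per sample")
print(f"new version: {v2_count} events per sample")
print(f"increase: {v2_count / v1_count}x")
```

The collection rule never changed, yet the volume doubled because the set of names matching the prefix grew. An explicit list of metric names would not pick up new metrics automatically, at the cost of needing manual updates.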

Growth in User Counts or in Overall User Activity

[TBD]

Growth Driver Worksheet

Understanding the growth drivers of your telemetry is as important as understanding the value drivers. Growth drivers are often highly correlated with one another, but in general we should choose the main and most direct driver. The following growth drivers can be used to augment the telemetry backlog with information that helps explain quarterly and yearly growth characteristics.

Increase in Active Users
Increase in Number of Products
Infrastructure Scaling Initiative
Innovation Initiatives
Efficiency Efforts (Reduction of Telemetry Ingest)
Major Code Refactor

If your business operation is growing, it makes sense to factor in growth. It wouldn't make sense to deploy a system and expect cloud compute costs to stay flat despite growing the user base by 20% or offering twice as many products within a year. Attention to telemetry growth drivers is no different.

As part of the data governance practice we recommend you generate a sheet like this for each sub-account. These driver sheets will be used in the planning framework to generate forecasts of telemetry growth.

Team	Growth Driver
Streaming Video	This team is refactoring some K8s infrastructure. Currently they have on-prem K8s clusters managed in their data center. They expect to spin up some new clusters in their AWS VPC, and for much of the quarter they may have redundant infrastructure. The K8s Telemetry SME helped them arrive at a growth rate of just under 5% for the quarter. In the following quarter they may have a flat or negative rate as they bring down the on-prem clusters.
Cloud Platform Team	This team plans to reduce log volume substantially by getting rid of some excessively chatty, low-value logs from some of their cloud services. Through a deep-dive analysis using `bytecountestimate()` they came up with a plan to reduce ingest by 5% over the quarter, so they should see a negative growth rate over 90 days.
International Services	This team plans to add support for two additional countries. Working with the APM, K8s, and Mobile SMEs, they came up with an estimate of 7.5% growth, mostly from increased Mobile events. Since they have good forecasts of expected user activity, they were able to build a relatively good model based on current ingest from the 5 countries they currently support.
Shipping & Receiving	This team plans to add application logs this quarter. Using estimates derived from the number of logs currently recorded to disk, plus some factors to account for the additional logs-in-context tags that will be added, this team expects 12.6% growth this quarter. The Logging SME has given them excellent guidance on using New Relic drop rules as well as on streamlining the data in Fluent Bit, so they are confident they will be able to keep to this estimate.
Marketing Technology	This team is refactoring a Java monolith into 3 or 4 separate microservices. Based on code analysis from other refactors and a careful audit of the telemetry behavior of the monolith, this team has forecast a 26.7% growth rate. This is relatively large. However, this is the kind of refactor that should leave the code base relatively stable for another 3 to 5 years.

As part of this analysis, we also recommend you generate an overall account growth factor based on your analysis of the baseline reports and the growth driver sheet. This growth factor is a single number that will be used in the final telemetry budget sheet.
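One possible way to collapse the per-team rates into a single number is an ingest-weighted average, sketched below. The growth rates come from the worksheet; the baseline ingest figures are hypothetical placeholders, and weighting by baseline ingest is an assumption rather than a prescribed New Relic method.

```python
# Quarterly growth rates from the growth driver worksheet.
# "Just under 5%" is rounded up to 0.05 here as a conservative assumption.
growth_rates = {
    "Streaming Video": 0.05,
    "Cloud Platform Team": -0.05,
    "International Services": 0.075,
    "Shipping & Receiving": 0.126,
    "Marketing Technology": 0.267,
}

# Hypothetical baseline ingest per team in GB per quarter (not from the source).
baseline_gb = {
    "Streaming Video": 1000,
    "Cloud Platform Team": 2000,
    "International Services": 800,
    "Shipping & Receiving": 500,
    "Marketing Technology": 700,
}


def account_growth_factor(baseline: dict[str, float], rates: dict[str, float]) -> float:
    """Ingest-weighted average growth rate across all teams."""
    total_baseline = sum(baseline.values())
    projected_delta = sum(baseline[team] * rates[team] for team in baseline)
    return projected_delta / total_baseline


factor = account_growth_factor(baseline_gb, growth_rates)
print(f"overall account growth factor: {factor:.2%}")
```

Weighting by baseline keeps a large percentage from a small team from dominating the account-level figure. With these placeholder baselines, the 26.7% forecast for Marketing Technology is largely offset by the planned reduction from the much larger Cloud Platform Team.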

Additional Technical Resources

Manage Incoming Data

Data Management Hub

Drop Data Using Nerdgraph

Alert on Data Ingest Anomalies

Automating Telemetry Workflows

Metrics Aggregation and Events to Metrics