The New Relic Databricks integration can collect telemetry from Spark running on Databricks as well as from any Spark deployment that is not running on Databricks.
By default, the integration automatically connects to and collects telemetry from Spark deployments in all clusters created through the UI or API in the specified workspace. This integration supports the Collect Spark telemetry capability.
Set up the integration
This integration uses a standalone tool from the New Relic experimental repository. It can run on a host, or locally for testing, and supports these host platforms:
- Linux amd64
- Windows amd64
Tip
For more information, refer to the GitHub README for this integration.
Deploy on-host
To deploy this integration on a host (example: EC2), follow these steps:
Download the appropriate archive for your platform from the latest release.
Extract the archive to a new or existing directory.
Create a directory named configs in the same directory.
Create a file named config.yml in the configs directory and copy the contents of the configs/config.template.yml file in this repository into it.
Edit the config.yml file to configure the integration appropriately for your environment.
From the directory where the archive was extracted, execute the integration binary using the following command, adding any command line options as necessary:
$ # Linux
$ ./newrelic-databricks-integration

$ # Windows
$ .\newrelic-databricks-integration.exe
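For example, on a Linux host the full sequence might look like the following sketch. The archive name, repository path, and directories are placeholders; substitute the archive you downloaded from the latest release and your own paths.

$ # Illustrative Linux example; archive name and paths are placeholders.
$ mkdir -p ~/newrelic-databricks-integration && cd ~/newrelic-databricks-integration
$ tar -xzf ~/Downloads/newrelic-databricks-integration_Linux_amd64.tar.gz
$
$ # Create the configs directory and seed config.yml from the repository template.
$ mkdir -p configs
$ cp /path/to/repository/configs/config.template.yml configs/config.yml
$
$ # Edit configs/config.yml for your environment, then start the integration.
$ ./newrelic-databricks-integration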
Deploy on a Databricks cluster
The New Relic Databricks integration can be deployed on the driver node of a Databricks cluster using a cluster-scoped init script. The init script uses custom environment variables to specify the configuration parameters required by the integration.
To install the init script, follow these steps:
Log in to your Databricks account and navigate to the desired workspace.
Follow the recommendations for init scripts to store the cluster_init_integration.sh script within your workspace. For example, if your workspace is enabled for Unity Catalog, store the init script in a Unity Catalog volume.
Go to the Compute tab and select the desired all-purpose or job compute to open the compute details UI.
Click the Edit button to edit the compute's configuration.
Follow the steps to use the UI to configure a cluster-scoped init script and point to the location where you stored the init script in step 2 above.
If your cluster is not running, click the Confirm button to save your changes. Then, restart the cluster. If your cluster is already running, click the Confirm and restart button to save your changes, and restart the cluster.
Additionally, follow the steps to set environment variables, adding the following:
- NEW_RELIC_API_KEY: Your New Relic user API key.
- NEW_RELIC_LICENSE_KEY: Your New Relic license key.
- NEW_RELIC_ACCOUNT_ID: Your New Relic account ID.
- NEW_RELIC_REGION: The region of your New Relic account; one of US or EU.
- NEW_RELIC_DATABRICKS_WORKSPACE_HOST: The instance name of the target Databricks instance.
- NEW_RELIC_DATABRICKS_ACCESS_TOKEN: To authenticate with a personal access token, your personal access token.
- NEW_RELIC_DATABRICKS_OAUTH_CLIENT_ID: To use a service principal to authenticate with Databricks (OAuth M2M), the OAuth client ID for the service principal.
- NEW_RELIC_DATABRICKS_OAUTH_CLIENT_SECRET: To use a service principal to authenticate with Databricks (OAuth M2M), an OAuth client secret associated with the service principal.
Tip
Note that the NEW_RELIC_API_KEY and NEW_RELIC_ACCOUNT_ID are currently unused, but are required by the new-relic-client-go module used by the integration.
Additionally, note that only the personal access token or OAuth credentials need to be specified, but not both. If both are specified, the OAuth credentials take precedence.
Finally, make sure to restart the cluster after configuring the environment variables.
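For reference, a complete set of these environment variables might look like the sketch below. All values are placeholders; supply either the personal access token or the OAuth client ID and secret, not both.

NEW_RELIC_API_KEY=<your New Relic user API key>
NEW_RELIC_LICENSE_KEY=<your New Relic license key>
NEW_RELIC_ACCOUNT_ID=<your New Relic account ID>
NEW_RELIC_REGION=US
NEW_RELIC_DATABRICKS_WORKSPACE_HOST=<your Databricks instance name>
NEW_RELIC_DATABRICKS_ACCESS_TOKEN=<your personal access token>

If you authenticate with a service principal, set NEW_RELIC_DATABRICKS_OAUTH_CLIENT_ID and NEW_RELIC_DATABRICKS_OAUTH_CLIENT_SECRET in place of the access token; as noted above, the OAuth credentials take precedence if both are present.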
Install our Databricks monitoring dashboard
To set up our pre-built Databricks dashboard to monitor your application metrics, go to the Databricks dashboard installation and follow the instructions. Once installed, the dashboard should display metrics.
If you need help with dashboards, see:
- Introduction to dashboards to customize your dashboard and carry out different actions.
- Manage your dashboard to adjust your display mode, or to add more content to your dashboard.