feat(triggers): Add Support for defining MBean Triggers to the cryostat agent (#197)

Signed-off-by: jmatsuok <[email protected]>
Co-authored-by: Andrew Azores <[email protected]>
Josh-Matsuoka and andrewazores authored Oct 3, 2023
1 parent 2d81089 commit ba975ca
Showing 12 changed files with 866 additions and 166 deletions.
38 changes: 37 additions & 1 deletion README.md
@@ -31,6 +31,41 @@ JAVA_OPTIONS="-Dcom.sun.management.jmxremote.port=9091 -Dcom.sun.management.jmxr
```
This assumes that the agent JAR has been included in the application image within `/deployments/app/`.

## SMART TRIGGERS

`cryostat-agent` supports smart triggers, which monitor the values of MBean counters and start recordings when a set of user-specified constraints is met.
The general form of a smart trigger expression is as follows:

```
[constraint1(&&/||)constraint2...constraintN]~recordingTemplate
```

An example that watches CPU usage and starts a recording using the Profiling template when `ProcessCpuLoad` exceeds 0.2 (the attribute is reported as a fraction between 0.0 and 1.0, so this corresponds to 20%):

```
[ProcessCpuLoad>0.2]~profile
```

An example that waits for the Thread Count to stay above 20 for longer than 10 seconds and then starts a recording using the Continuous template:

```
[ThreadCount>20&&TargetDuration>duration("10s")]~Continuous
```

These must be passed as an argument to the cryostat agent, for example:

```
JAVA_OPTIONS="-javaagent:/deployments/app/cryostat-agent-${CRYOSTAT_AGENT_VERSION}.jar=[ProcessCpuLoad>0.2]~profile"
```

Multiple smart trigger definitions may be specified and separated by commas, for example:

```
[ProcessCpuLoad>0.2]~profile,[ThreadCount>30]~Continuous
```
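To make the argument format concrete, here is a minimal sketch of how such a comma-separated argument could be split into expression and template pairs. The class name `TriggerDefinitionSketch`, its `Definition` holder, and the regular expression are hypothetical and written only for illustration; they are not taken from the agent's `TriggerParser`, which may validate and parse definitions differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TriggerDefinitionSketch {

    // Assumed shape of one definition: "[expression]~template".
    // The template name ends at the next comma or whitespace.
    private static final Pattern DEFINITION = Pattern.compile("\\[(.+?)\\]~([^,\\s]+)");

    /** Holder for one parsed definition (hypothetical type). */
    public static final class Definition {
        public final String expression;
        public final String template;

        Definition(String expression, String template) {
            this.expression = expression;
            this.template = template;
        }
    }

    /** Extracts every "[expression]~template" pair found in the agent argument. */
    public static List<Definition> parse(String agentArgs) {
        List<Definition> result = new ArrayList<>();
        if (agentArgs == null || agentArgs.isBlank()) {
            return result;
        }
        Matcher m = DEFINITION.matcher(agentArgs);
        while (m.find()) {
            result.add(new Definition(m.group(1), m.group(2)));
        }
        return result;
    }

    public static void main(String[] args) {
        String example = "[ProcessCpuLoad>0.2]~profile,[ThreadCount>30]~Continuous";
        for (Definition d : parse(example)) {
            System.out.println(d.expression + " -> " + d.template);
        }
    }
}
```

Matching whole definitions rather than splitting on commas keeps the sketch from breaking if an expression ever contains a comma, although a template name containing a comma would still be cut short.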

**NOTE**: Smart Triggers are evaluated on a polling basis. The poll period is configurable (see list below). This means that your conditions are subject to sampling bias: a condition that only holds briefly between two polls may never be observed.
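The polling behaviour can be pictured with a small sketch. It assumes a fixed-rate scheduler that reads the standard `ThreadCount` attribute of the platform Threading MXBean once per period; the class `PollingSketch` and its methods are hypothetical and do not reproduce the agent's `TriggerEvaluator`.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.IntPredicate;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class PollingSketch {

    /** Reads the ThreadCount attribute from the platform MBean server. */
    public static int readThreadCount() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName threading = new ObjectName(ManagementFactory.THREAD_MXBEAN_NAME);
            return (Integer) server.getAttribute(threading, "ThreadCount");
        } catch (Exception e) {
            throw new IllegalStateException("Could not read ThreadCount", e);
        }
    }

    /** Takes a single sample and checks it against the condition. */
    public static boolean evaluate(IntPredicate condition) {
        return condition.test(readThreadCount());
    }

    public static void main(String[] args) throws InterruptedException {
        long periodMs = 1000; // assumed to mirror the documented default poll period
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                () -> {
                    // Only the value at this instant is seen; anything that
                    // happened since the previous poll is invisible.
                    if (evaluate(count -> count > 20)) {
                        System.out.println("Condition met at this sample");
                    }
                },
                0,
                periodMs,
                TimeUnit.MILLISECONDS);
        Thread.sleep(3 * periodMs);
        scheduler.shutdownNow();
    }
}
```

A shorter period narrows the window in which a short-lived spike can be missed, at the cost of more frequent MBean reads.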

## CONFIGURATION

`cryostat-agent` uses [smallrye-config](https://github.com/smallrye/smallrye-config) for configuration.
@@ -54,7 +89,7 @@ and how it advertises itself to a Cryostat server instance. Required properties
- [ ] `cryostat.agent.app.jmx.port` [`int`]: the JMX RMI port that the application is listening on. The default is to attempt to determine this from the `com.sun.management.jmxremote.port` system property.
- [ ] `cryostat.agent.registration.retry-ms` [`long`]: the duration in milliseconds between attempts to register with the Cryostat server. Default `5000`.
- [ ] `cryostat.agent.exit.signals` [`[String]`]: a comma-separated list of signals that the agent should handle. When any of these signals is caught the agent initiates an orderly shutdown, deregistering from the Cryostat server and potentially uploading the latest harvested JFR data. Default `INT,TERM`.
- [ ] `cryostat.agent.exit.deregistration.timeout-ms` [`long`]: the duration in milliseconds to wait for a response from the Cryostat server when attempting to deregister at shutdown time . Default `3s`.
- [ ] `cryostat.agent.exit.deregistration.timeout-ms` [`long`]: the duration in milliseconds to wait for a response from the Cryostat server when attempting to deregister at shutdown time. Default `3000`.
- [ ] `cryostat.agent.harvester.period-ms` [`long`]: the length of time between JFR collections and pushes by the harvester. This also controls the maximum age of data stored in the buffer for the harvester's managed Flight Recording. Every `period-ms` the harvester will upload a JFR binary file to the `cryostat.agent.baseuri` archives. Default `-1`, which indicates no harvesting will be performed.
- [ ] `cryostat.agent.harvester.template` [`String`]: the name of the `.jfc` event template configuration to use for the harvester's managed Flight Recording. Default `default`, the continuous monitoring event template.
- [ ] `cryostat.agent.harvester.max-files` [`String`]: the maximum number of pushed files that Cryostat will keep over the network from the agent. This is supplied to the harvester's push requests which instructs Cryostat to prune, in a FIFO manner, the oldest JFR files within the attached JVM target's storage, while the number of stored recordings is greater than this configuration's maximum file limit. Default `2147483647` (`Integer.MAX_VALUE`).
@@ -63,6 +98,7 @@ and how it advertises itself to a Cryostat server instance. Required properties
- [ ] `cryostat.agent.harvester.exit.max-size-b` [`long`]: the JFR `maxsize` setting, specified in bytes, to apply to exit uploads as described above.
- [ ] `cryostat.agent.harvester.max-age-ms` [`long`]: the JFR `maxage` setting, specified in milliseconds, to apply to periodic uploads during the application lifecycle. Defaults to `0`, which is interpreted as 1.5x the harvester period (`cryostat.agent.harvester.period-ms`).
- [ ] `cryostat.agent.harvester.max-size-b` [`long`]: the JFR `maxsize` setting, specified in bytes, to apply to periodic uploads during the application lifecycle. Defaults to `0`, which means `unlimited`.
- [ ] `cryostat.agent.smart-trigger.evaluation.period-ms` [`long`]: the length of time, in milliseconds, between Smart Trigger evaluations. Default `1000`.

These properties can be set by JVM system properties or by environment variables. For example, the property
`cryostat.agent.baseuri` can be set using `-Dcryostat.agent.baseuri=https://mycryostat.example.com:1234/` or
16 changes: 16 additions & 0 deletions pom.xml
@@ -59,6 +59,7 @@
<javax.annotation.version>1.3.2</javax.annotation.version><!-- used by smallrye -->
<io.smallrye.config.version>2.12.3</io.smallrye.config.version>
<org.slf4j.version>2.0.7</org.slf4j.version>
<org.projectnessie.cel.bom.version>0.3.21</org.projectnessie.cel.bom.version>

<com.github.spotbugs.version>4.7.3</com.github.spotbugs.version>
<com.github.spotbugs.plugin.version>4.7.3.6</com.github.spotbugs.plugin.version>
@@ -70,6 +71,17 @@
<org.jsoup.version>1.15.3</org.jsoup.version>
</properties>

<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.projectnessie.cel</groupId>
<artifactId>cel-bom</artifactId>
<version>${org.projectnessie.cel.bom.version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies>
<dependency>
<groupId>io.cryostat</groupId>
@@ -82,6 +94,10 @@
<artifactId>dagger</artifactId>
<version>${com.google.dagger.version}</version>
</dependency>
<dependency>
<groupId>org.projectnessie.cel</groupId>
<artifactId>cel-tools</artifactId>
</dependency>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
5 changes: 5 additions & 0 deletions src/main/java/io/cryostat/agent/Agent.java
@@ -28,6 +28,8 @@
import javax.inject.Named;
import javax.inject.Singleton;

import io.cryostat.agent.triggers.TriggerEvaluator;

import dagger.Component;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
@@ -87,6 +89,7 @@ public static void main(String[] args) {
});
webServer.start();
registration.start();
client.triggerEvaluator().start(args);
log.info("Startup complete");
} catch (Exception e) {
log.error(Agent.class.getSimpleName() + " startup failure", e);
@@ -143,6 +146,8 @@ interface Client {

Harvester harvester();

TriggerEvaluator triggerEvaluator();

ScheduledExecutorService executor();

@Named(ConfigModule.CRYOSTAT_AGENT_EXIT_SIGNALS)
10 changes: 10 additions & 0 deletions src/main/java/io/cryostat/agent/ConfigModule.java
@@ -83,6 +83,9 @@ public abstract class ConfigModule {
public static final String CRYOSTAT_AGENT_HARVESTER_MAX_SIZE_B =
"cryostat.agent.harvester.max-size-b";

public static final String CRYOSTAT_AGENT_SMART_TRIGGER_EVALUATION_PERIOD_MS =
"cryostat.agent.smart-trigger.evaluation.period-ms";

public static final String CRYOSTAT_AGENT_API_WRITES_ENABLED =
"cryostat.agent.api.writes-enabled";

@@ -287,4 +290,11 @@ public static List<String> provideCryostatAgentExitSignals(SmallRyeConfig config
public static long provideCryostatAgentExitDeregistrationTimeoutMs(SmallRyeConfig config) {
return config.getValue(CRYOSTAT_AGENT_EXIT_DEREGISTRATION_TIMEOUT_MS, long.class);
}

@Provides
@Singleton
@Named(CRYOSTAT_AGENT_SMART_TRIGGER_EVALUATION_PERIOD_MS)
public static long provideCryostatSmartTriggerEvaluationPeriodMs(SmallRyeConfig config) {
return config.getValue(CRYOSTAT_AGENT_SMART_TRIGGER_EVALUATION_PERIOD_MS, long.class);
}
}
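For readers following the new provider, the lookup it performs can be approximated without smallrye-config. The sketch below assumes the usual convention of mapping a property name to an environment variable by upper-casing it and replacing non-alphanumeric characters with underscores, and it falls back to the documented default of `1000`. `EvaluationPeriodSketch` and its methods are hypothetical; the agent itself delegates all of this to `SmallRyeConfig`.

```java
import java.util.Locale;
import java.util.function.Function;

public class EvaluationPeriodSketch {

    static final String PROPERTY = "cryostat.agent.smart-trigger.evaluation.period-ms";
    static final long DEFAULT_PERIOD_MS = 1000L; // default stated in the README

    /** Converts a property name to its assumed environment variable form. */
    public static String toEnvName(String property) {
        return property.replaceAll("[^A-Za-z0-9]", "_").toUpperCase(Locale.ROOT);
    }

    /**
     * Resolves the period: system property first, then environment variable,
     * then the default. The lookups are passed in so they can be stubbed.
     */
    public static long resolve(
            Function<String, String> systemProperties, Function<String, String> environment) {
        String raw = systemProperties.apply(PROPERTY);
        if (raw == null) {
            raw = environment.apply(toEnvName(PROPERTY));
        }
        if (raw == null || raw.isBlank()) {
            return DEFAULT_PERIOD_MS;
        }
        return Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        System.out.println(resolve(System::getProperty, System::getenv));
    }
}
```

Passing the lookups as functions is only a convenience for experimenting with the precedence order; it is not how the Dagger module is wired.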
112 changes: 112 additions & 0 deletions src/main/java/io/cryostat/agent/FlightRecorderHelper.java
@@ -0,0 +1,112 @@
/*
* Copyright The Cryostat Authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package io.cryostat.agent;

import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;
import jdk.jfr.FlightRecorder;
import jdk.jfr.Recording;
import jdk.management.jfr.ConfigurationInfo;
import jdk.management.jfr.FlightRecorderMXBean;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class FlightRecorderHelper {

private final FlightRecorderMXBean bean =
ManagementFactory.getPlatformMXBean(FlightRecorderMXBean.class);
private final Logger log = LoggerFactory.getLogger(getClass());

// FIXME this is repeated logic shared with Harvester startRecording
public void startRecording(String templateNameOrLabel) {
getTemplate(templateNameOrLabel)
.ifPresentOrElse(
c -> {
long recordingId = bean.newRecording();
bean.setPredefinedConfiguration(recordingId, c.getName());
String recordingName =
String.format("cryostat-smart-trigger-%d", recordingId);
bean.setRecordingOptions(
recordingId, Map.of("name", recordingName, "disk", "true"));
bean.startRecording(recordingId);
log.info(
"Started recording \"{}\" using template \"{}\"",
recordingName,
templateNameOrLabel);
},
() ->
log.error(
"Cannot start recording with template named or labelled {}",
templateNameOrLabel));
}

public Optional<ConfigurationInfo> getTemplate(String nameOrLabel) {
return bean.getConfigurations().stream()
.filter(c -> c.getName().equals(nameOrLabel) || c.getLabel().equals(nameOrLabel))
.findFirst();
}

public boolean isValidTemplate(String nameOrLabel) {
return getTemplate(nameOrLabel).isPresent();
}

public List<RecordingInfo> getRecordings() {
return FlightRecorder.getFlightRecorder().getRecordings().stream()
.map(RecordingInfo::new)
.collect(Collectors.toList());
}

@SuppressFBWarnings(value = "URF_UNREAD_FIELD")
public static class RecordingInfo {

public final long id;
public final String name;
public final String state;
public final Map<String, String> options;
public final long startTime;
public final long duration;
public final boolean isContinuous;
public final boolean toDisk;
public final long maxSize;
public final long maxAge;

RecordingInfo(Recording rec) {
this.id = rec.getId();
this.name = rec.getName();
this.state = rec.getState().name();
this.options = rec.getSettings();
if (rec.getStartTime() != null) {
this.startTime = rec.getStartTime().toEpochMilli();
} else {
this.startTime = 0;
}
this.isContinuous = rec.getDuration() == null;
this.duration = this.isContinuous ? 0 : rec.getDuration().toMillis();
this.toDisk = rec.isToDisk();
this.maxSize = rec.getMaxSize();
if (rec.getMaxAge() != null) {
this.maxAge = rec.getMaxAge().toMillis();
} else {
this.maxAge = 0;
}
}
}
}
33 changes: 33 additions & 0 deletions src/main/java/io/cryostat/agent/MainModule.java
@@ -37,6 +37,8 @@
import io.cryostat.agent.Harvester.RecordingSettings;
import io.cryostat.agent.remote.RemoteContext;
import io.cryostat.agent.remote.RemoteModule;
import io.cryostat.agent.triggers.TriggerEvaluator;
import io.cryostat.agent.triggers.TriggerParser;
import io.cryostat.core.net.JFRConnection;
import io.cryostat.core.net.JFRConnectionToolkit;
import io.cryostat.core.sys.Environment;
@@ -69,6 +71,7 @@ public abstract class MainModule {
private static final int NUM_WORKER_THREADS = 3;
private static final String JVM_ID = "JVM_ID";
private static final String TEMPLATES_PATH = "TEMPLATES_PATH";
private static final String TRIGGER_SCHEDULER = "TRIGGER_SCHEDULER";

@Provides
@Singleton
@@ -270,6 +273,36 @@ public static Harvester provideHarvester(
registration);
}

@Provides
@Singleton
@Named(TRIGGER_SCHEDULER)
public static ScheduledExecutorService provideTriggerScheduler() {
return Executors.newScheduledThreadPool(0);
}

@Provides
@Singleton
public static FlightRecorderHelper provideFlightRecorderHelper() {
return new FlightRecorderHelper();
}

@Provides
@Singleton
public static TriggerParser provideTriggerParser(FlightRecorderHelper helper) {
return new TriggerParser(helper);
}

@Provides
@Singleton
public static TriggerEvaluator provideTriggerEvaluatorFactory(
@Named(TRIGGER_SCHEDULER) ScheduledExecutorService scheduler,
TriggerParser parser,
FlightRecorderHelper helper,
@Named(ConfigModule.CRYOSTAT_AGENT_SMART_TRIGGER_EVALUATION_PERIOD_MS)
long evaluationPeriodMs) {
return new TriggerEvaluator(scheduler, parser, helper, evaluationPeriodMs);
}

@Provides
@Singleton
public static FileSystem provideFileSystem() {