Add Diagnostics component #727

Merged
merged 55 commits into from
Oct 17, 2021
894edd0
Add thread cpu to diagnostics
desyncr Apr 5, 2021
64b256a
Add NodeDiagnostic module with NodeThreadDiagnostics
desyncr Apr 10, 2021
299c2df
Don't use single class imports
desyncr Apr 10, 2021
5b7b486
Fix indentation
desyncr Apr 10, 2021
cafa4a5
Don't use single class imports
desyncr Apr 10, 2021
c999a37
Create Diagnostics component to be able to scale to support multiple …
desyncr Apr 11, 2021
0bf9c25
Add license headers to new files
desyncr Apr 11, 2021
d1c75ad
Move thread info building into its own method
desyncr Apr 16, 2021
3c58658
Make NodeDiagnostics field private
desyncr Apr 16, 2021
23d6e38
Flatten NodeDiagnostics interface
desyncr Apr 16, 2021
de33536
Flatten NodeDiagnostics interface
desyncr Apr 17, 2021
784bdc2
Create NodeDiagnostics and ThreadDiagnostics interfaces and default i…
desyncr Apr 17, 2021
4c1beef
Reduce visibility for fields in ThreadDiagnostics
desyncr Apr 17, 2021
0f81f05
Use atomicReference for nodeThreadInfo list
desyncr Apr 17, 2021
f6aa0c7
Use thread interval to build data points
desyncr Apr 17, 2021
e8c1a5a
Fix calculation CPU time percentage
desyncr Apr 17, 2021
9aec574
Use NodeDiagnostics type interface rather than default implementation
desyncr Apr 17, 2021
b72db1b
Use DefaultNodeDiagnostics implementation
desyncr Apr 17, 2021
99835a9
Remove unnecessary finals in constructor
desyncr Apr 17, 2021
6282ffd
Remove unnecessary throw exception
desyncr Apr 17, 2021
8946cc0
Remove unnecessary copy
desyncr Apr 17, 2021
730f4d7
Fix CPU time percentage calculation
desyncr Apr 17, 2021
05914a3
Use Comparator class to simplify threads sorting
desyncr Apr 24, 2021
512dcd3
Use single loop to calculate delta and display
desyncr Apr 24, 2021
1996223
Rename private field to follow convention
desyncr Apr 24, 2021
e3c24ec
Show percentage cpu time between process threads
desyncr Apr 27, 2021
170961d
Separate presentation from actual data for NodeThreadInfo
desyncr Apr 29, 2021
ca9e9f3
Compute % CPU by calculate the total CPU time from all threads, not o…
desyncr Apr 29, 2021
d8241d3
Renaming internal variables
desyncr Apr 29, 2021
1c5c529
Re-introduce delta CPU Time and simplify code structure
desyncr Apr 29, 2021
2d36d9e
Output formatting in DiagnosticToadlet
desyncr Apr 30, 2021
56c9e0d
Handle case when thread.getThreadGroup returns null
desyncr May 2, 2021
dd2ccff
Remove long -> double coercion
desyncr May 9, 2021
9d93258
Use NodeThreadSnapshot to hold thread list, total CPU and interval
desyncr May 9, 2021
e52fc35
Update ConfigToadlet to support enabling/disabling node diagnostics m…
desyncr May 9, 2021
c80d2dc
Avoid unnecessary casting to double for getCpuTimeDelta
desyncr May 15, 2021
89b6cfe
Declare interface rather than implementation
desyncr May 15, 2021
085c5d7
Clean up nodeConfig callback
desyncr May 15, 2021
780b362
Update configuration description
desyncr May 15, 2021
030d225
Check thread snapshot is available when displaying
desyncr May 15, 2021
612428a
Simplify description and normalize names
desyncr May 15, 2021
a11e880
Clean up unnecessary space
desyncr May 15, 2021
e554662
Correct language and simplify terms
desyncr May 15, 2021
7fc2794
Fix grammar mistake on translation for DiagnosticsDescription
desyncr May 17, 2021
b64b826
Calculate CPU time as % of wall time
desyncr May 30, 2021
a37a2d7
Merge remote-tracking branch 'origin/thread-diagnostics-cpu' into thr…
desyncr May 30, 2021
62654e9
Avoid possible race condition on start up
desyncr Jun 24, 2021
d33ccaa
Add docblock to threadStats method
desyncr Jun 24, 2021
405aa8d
Fix tab vs space mix up
desyncr Jun 24, 2021
5cfa3c8
Remove unnecessary code style fixes
desyncr Jun 24, 2021
15ede8b
newline at end of line
desyncr Jun 24, 2021
ff72add
Create threadSnapshot inner class to avoid pooler executor messing th…
desyncr Aug 1, 2021
d1660b9
Replace threadCpu for threadSnapshot
desyncr Aug 1, 2021
65de06b
Use thread synchronization to avoid misnaming
desyncr Aug 1, 2021
456d9c2
Use job ID to keep track of CPU usage
desyncr Aug 14, 2021
72 changes: 64 additions & 8 deletions src/freenet/clients/http/DiagnosticToadlet.java
@@ -2,12 +2,16 @@

import java.io.File;
import java.io.IOException;
import java.lang.management.*;
import java.net.URI;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.stream.*;

import freenet.client.HighLevelSimpleClient;
import freenet.client.async.PersistenceDisabledException;
@@ -29,6 +33,8 @@
import freenet.node.PeerNodeStatus;
import freenet.node.RequestTracker;
import freenet.node.Version;
import freenet.node.diagnostics.*;
import freenet.node.diagnostics.threads.*;
import freenet.node.stats.DataStoreInstanceType;
import freenet.node.stats.DataStoreStats;
import freenet.node.stats.StatsNotAvailableException;
@@ -410,19 +416,69 @@ public int compare(PeerNodeStatus firstNode, PeerNodeStatus secondNode) {
textBuilder.append("\n");

// drawThreadPriorityStatsBox
textBuilder.append("Threads:\n");
int[] activeThreadsByPriority = stats.getActiveThreadsByPriority();
int[] waitingThreadsByPriority = stats.getWaitingThreadsByPriority();
for(int i=0; i<activeThreadsByPriority.length; i++) {
textBuilder.append(l10n("running")).append(": ").append(String.valueOf(activeThreadsByPriority[i])).append(" (").append(String.valueOf(i+1)).append(")\n");
textBuilder.append(l10n("waiting")).append(": ").append(String.valueOf(waitingThreadsByPriority[i])).append(" (").append(String.valueOf(i+1)).append(")\n");
if (node.isNodeDiagnosticsEnabled()) {
textBuilder.append(threadsStats());
textBuilder.append("\n");
}
textBuilder.append("\n");
}

this.writeTextReply(ctx, 200, "OK", textBuilder.toString());
}

/**
* Retrieves ThreadDiagnostics (through NodeDiagnostics) to display
* thread information (id, name, group, % cpu, etc).
* @return Thread information in tab separated format.
*/
private StringBuilder threadsStats() {
StringBuilder sb = new StringBuilder();

ThreadDiagnostics threadDiagnostics = node
.getNodeDiagnostics()
.getThreadDiagnostics();

NodeThreadSnapshot threadSnapshot = threadDiagnostics.getThreadSnapshot();

double wallTime = TimeUnit.MILLISECONDS.toNanos(
threadSnapshot.getInterval()
);

List<NodeThreadInfo> threads = threadSnapshot.getThreads();
threads.sort(Comparator.comparing(NodeThreadInfo::getCpuTime).reversed());

sb.append(String.format("Threads (%d):%n", threads.size()));

// Thread ID, Job ID, Name, Priority, Group (system, main), Status, % CPU
sb.append(
String.format(
"%10s %15s %-90s %5s %10s %-20s %-5s%n",
"Thread ID",
"Job ID",
"Name",
"Prio.",
"Group",
"Status",
"% CPU"
)
);

for (NodeThreadInfo thread : threads) {
String line = String.format(
"%10s %15s %-90s %5s %10s %-20s %.2f%n",
thread.getId(),
thread.getJobId(),
thread.getName().substring(0, Math.min(90, thread.getName().length())),
thread.getPrio(),
thread.getGroupName().substring(0, Math.min(10, thread.getGroupName().length())),
thread.getState(),
thread.getCpuTime() / wallTime * 100
);
sb.append(line);
}

return sb;
}

private int getPeerStatusCount(PeerNodeStatus[] peerNodeStatuses, int status) {
int count = 0;
for (PeerNodeStatus peerNodeStatus: peerNodeStatuses) {
@@ -471,4 +527,4 @@ private String l10n(String key, String[] patterns, String[] values) {
public String path() {
return TOADLET_URL;
}
}
}
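The `% CPU` column produced by `threadsStats()` divides each thread's CPU time (nanoseconds) by the snapshot interval converted from milliseconds to nanoseconds. The following is a minimal sketch of that arithmetic in isolation, not the PR's code: `ThreadSample` and `CpuPercentSketch` are hypothetical names standing in for `NodeThreadInfo` and the toadlet, and the zero-interval guard and the copy-before-sort are assumptions added here for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for NodeThreadInfo; only the fields this sketch needs. */
final class ThreadSample {
    final String name;
    final long cpuTimeNanos;

    ThreadSample(String name, long cpuTimeNanos) {
        this.name = name;
        this.cpuTimeNanos = cpuTimeNanos;
    }

    long getCpuTimeNanos() {
        return cpuTimeNanos;
    }
}

final class CpuPercentSketch {
    /** CPU time used during the interval, as a percentage of wall-clock time. */
    static double cpuPercent(long cpuTimeNanos, long intervalMillis) {
        if (intervalMillis <= 0) {
            return 0.0; // assumption: report 0% rather than divide by zero
        }
        double wallTimeNanos = TimeUnit.MILLISECONDS.toNanos(intervalMillis);
        return cpuTimeNanos / wallTimeNanos * 100;
    }

    /** Renders one line per thread, busiest first, without reordering the input list. */
    static String render(List<ThreadSample> samples, long intervalMillis) {
        List<ThreadSample> sorted = new ArrayList<>(samples); // sort a copy
        sorted.sort(Comparator.comparingLong(ThreadSample::getCpuTimeNanos).reversed());

        StringBuilder sb = new StringBuilder();
        for (ThreadSample s : sorted) {
            sb.append(String.format(Locale.ROOT, "%-20s %6.2f%n",
                    s.name, cpuPercent(s.cpuTimeNanos, intervalMillis)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<ThreadSample> samples = new ArrayList<>();
        samples.add(new ThreadSample("idle", 1_000_000L));
        samples.add(new ThreadSample("busy", 250_000_000L));
        System.out.print(render(samples, 1000L));
    }
}
```

Because the value is a share of wall time for a single thread, the percentages of all threads can add up to more than 100 on a multi-core machine.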
2 changes: 2 additions & 0 deletions src/freenet/l10n/freenet.l10n.en.properties
@@ -1105,6 +1105,8 @@ Node.enablePerNodeFailureTables=Enable per-node failure tables?
Node.enablePerNodeFailureTablesLong=Enable automatically rerouting around nodes that failed a request within the last 10 minutes?
Node.enableRoutedPing=Enable FNPRoutedPing?
Node.enableRoutedPingLong=Enable FNPRoutedPing? Only useful in simulations, not on the real network. Turn it off.
Node.enableDiagnostics=Enable Diagnostics?
Node.enableDiagnosticsLong=By enabling Diagnostics the node will keep detailed information about its inner workings (such as CPU usage per thread), which can help to troubleshoot problems. The collected data is kept in memory (i.e. not persisted on disk) and is not sent to anybody over the network.
Node.enableSwapping=Enable location swapping?
Node.enableSwappingLong=Enable location swapping?
Node.enableSwapQueueing=Enable queueing of swap requests?
54 changes: 52 additions & 2 deletions src/freenet/node/Node.java
@@ -38,6 +38,7 @@
import java.util.Set;

import freenet.config.*;
import freenet.node.diagnostics.*;
import freenet.node.useralerts.*;
import org.tanukisoftware.wrapper.WrapperManager;

@@ -723,6 +724,9 @@ public String[] getPossibleValues() {

public final SecurityLevels securityLevels;

/** Diagnostics */
private final DefaultNodeDiagnostics nodeDiagnostics;

// Things that's needed to keep track of
public final PluginManager pluginManager;

@@ -750,6 +754,8 @@ public String[] getPossibleValues() {

private boolean enableRoutedPing;

private boolean enableNodeDiagnostics;

private boolean peersOffersDismissed;

/**
@@ -2536,7 +2542,38 @@ public void set(Boolean val) throws InvalidConfigValueException,

});
enableRoutedPing = nodeConfig.getBoolean("enableRoutedPing");


nodeConfig.register(
"enableNodeDiagnostics",
false,
sortOrder++,
true,
false,
"Node.enableDiagnostics",
"Node.enableDiagnosticsLong",
new BooleanCallback() {
@Override
public Boolean get() {
synchronized (Node.this) {
return enableNodeDiagnostics;
}
}

@Override
public void set(Boolean val) {
synchronized (Node.this) {
enableNodeDiagnostics = val;
nodeDiagnostics.stop();

if (enableNodeDiagnostics) {
nodeDiagnostics.start();
}
}
}
}
);
enableNodeDiagnostics = nodeConfig.getBoolean("enableNodeDiagnostics");

updateMTU();

// peers-offers/*.fref files
@@ -2601,6 +2638,8 @@ public void realRun() {
System.out.println("Node constructor completed");

new BandwidthManager(this).start();

nodeDiagnostics = new DefaultNodeDiagnostics(this.nodeStats, this.ticker);
}

private void peersOffersFrefFilesConfiguration(SubConfig nodeConfig, int configOptionSortOrder) {
@@ -3166,6 +3205,10 @@ public void start(boolean noSwaps) throws NodeInitException {
// Process any data in the extra peer data directory
peers.readExtraPeerData();

if (enableNodeDiagnostics) {
nodeDiagnostics.start();
}

Logger.normal(this, "Started node");

hasStarted = true;
@@ -4892,5 +4935,12 @@ public PluginManager getPluginManager() {
DatabaseKey getDatabaseKey() {
return databaseKey;
}


public NodeDiagnostics getNodeDiagnostics() {
return nodeDiagnostics;
}

public boolean isNodeDiagnosticsEnabled() {
return enableNodeDiagnostics;
}
}
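The `enableNodeDiagnostics` callback in `Node.java` always calls `stop()` and then calls `start()` only when the new value is `true`, all under one lock. A minimal sketch of that toggle pattern follows, using hypothetical names (`Service`, `CountingService`, `DiagnosticsToggle`) instead of Freenet's `BooleanCallback` and `DefaultNodeDiagnostics`. It assumes that `stop()` is safe to call on a service that is not running, which the PR's code appears to rely on but which is not shown in this diff.

```java
/** Hypothetical minimal lifecycle interface; not part of Freenet. */
interface Service {
    void start();
    void stop();
}

/** Test double that records how often it was started and stopped. */
final class CountingService implements Service {
    int starts;
    int stops;
    boolean running;

    @Override
    public void start() {
        starts++;
        running = true;
    }

    @Override
    public void stop() {
        stops++;
        running = false; // assumed harmless when already stopped
    }
}

/** Stop-then-maybe-start toggle guarded by a single lock. */
final class DiagnosticsToggle {
    private final Service service;
    private boolean enabled;

    DiagnosticsToggle(Service service) {
        this.service = service;
    }

    synchronized boolean isEnabled() {
        return enabled;
    }

    synchronized void setEnabled(boolean value) {
        enabled = value;
        service.stop(); // always stop first, so enabling twice never runs two collectors
        if (enabled) {
            service.start();
        }
    }

    public static void main(String[] args) {
        CountingService service = new CountingService();
        DiagnosticsToggle toggle = new DiagnosticsToggle(service);
        toggle.setEnabled(true);
        toggle.setEnabled(true);  // restarts instead of starting a second time
        toggle.setEnabled(false);
        System.out.println("running=" + service.running
                + " starts=" + service.starts + " stops=" + service.stops);
    }
}
```

Stopping unconditionally keeps the setter idempotent at the cost of a brief restart when the option is re-enabled while already on.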
46 changes: 46 additions & 0 deletions src/freenet/node/diagnostics/DefaultNodeDiagnostics.java
@@ -0,0 +1,46 @@
/* This code is part of Freenet. It is distributed under the GNU General
* Public License, version 2 (or at your option any later version). See
* http://www.gnu.org/ for further details of the GPL. */
package freenet.node.diagnostics;

import freenet.node.diagnostics.threads.*;
import freenet.support.Ticker;
import freenet.node.NodeStats;

/**
* @author desyncr
*
* A class to retrieve data to build diagnostic dumps to help in determining
* node bottlenecks or misconfiguration.
*
* This class launches various threads at intervals to retrieve information. This information
* is available through the public methods.
* Some data points are obtained from the NodeStats object.
*/
public class DefaultNodeDiagnostics implements NodeDiagnostics {
private final DefaultThreadDiagnostics defaultThreadDiagnostics;

/**
* @param nodeStats Used to retrieve data points.
* @param ticker Used to queue timed jobs.
*/
public DefaultNodeDiagnostics(NodeStats nodeStats, Ticker ticker) {
defaultThreadDiagnostics = new DefaultThreadDiagnostics(nodeStats, ticker);
}

public void start() {
defaultThreadDiagnostics.start();
}

public void stop() {
defaultThreadDiagnostics.stop();
}

/**
* @return The ThreadDiagnostics instance, which exposes snapshots of the node's threads.
*/
@Override
public ThreadDiagnostics getThreadDiagnostics() {
return defaultThreadDiagnostics;
}
}
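`DefaultNodeDiagnostics` delegates to `DefaultThreadDiagnostics`, whose source is not included in this part of the diff. As a rough illustration of how per-thread CPU deltas can be collected at intervals, here is a sketch built only on the JDK's `ThreadMXBean` and an `AtomicReference`. It does not use Freenet's `Ticker` or `NodeStats`, and `CpuDeltaSampler` and its methods are hypothetical names, so treat it as one possible approach rather than the PR's implementation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sampler: CPU nanoseconds used per thread since the previous sample. */
final class CpuDeltaSampler {
    private final ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    private Map<Long, Long> previous = new HashMap<>();
    private final AtomicReference<Map<Long, Long>> latestDeltas =
            new AtomicReference<>(Collections.<Long, Long>emptyMap());

    /** Takes one sample; meant to be called periodically by a scheduler. */
    synchronized void sample() {
        if (!bean.isThreadCpuTimeSupported() || !bean.isThreadCpuTimeEnabled()) {
            return; // the JVM cannot report per-thread CPU time
        }
        Map<Long, Long> current = new HashMap<>();
        Map<Long, Long> deltas = new HashMap<>();
        for (long id : bean.getAllThreadIds()) {
            long cpu = bean.getThreadCpuTime(id);
            if (cpu < 0) {
                continue; // thread ended between listing and measuring
            }
            current.put(id, cpu);
            long before = previous.getOrDefault(id, cpu); // first sighting gives delta 0
            deltas.put(id, Math.max(0L, cpu - before));
        }
        previous = current;
        // Publish an unmodifiable map so readers never see a half-built sample.
        latestDeltas.set(Collections.unmodifiableMap(deltas));
    }

    /** Latest published deltas, keyed by thread ID; empty before the first sample. */
    Map<Long, Long> getLatestDeltas() {
        return latestDeltas.get();
    }

    public static void main(String[] args) {
        CpuDeltaSampler sampler = new CpuDeltaSampler();
        sampler.sample();
        long sink = 0;
        for (int i = 0; i < 5_000_000; i++) {
            sink += i; // burn a little CPU between samples
        }
        sampler.sample();
        System.out.println("threads sampled: " + sampler.getLatestDeltas().size()
                + " (sink=" + sink + ")");
    }
}
```

Publishing each finished sample through an `AtomicReference` lets a reader such as a status page fetch a consistent map without taking the sampler's lock.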
10 changes: 10 additions & 0 deletions src/freenet/node/diagnostics/NodeDiagnostics.java
@@ -0,0 +1,10 @@
/* This code is part of Freenet. It is distributed under the GNU General
* Public License, version 2 (or at your option any later version). See
* http://www.gnu.org/ for further details of the GPL. */
package freenet.node.diagnostics;

import freenet.node.diagnostics.threads.*;

public interface NodeDiagnostics {
ThreadDiagnostics getThreadDiagnostics();
}
10 changes: 10 additions & 0 deletions src/freenet/node/diagnostics/ThreadDiagnostics.java
@@ -0,0 +1,10 @@
/* This code is part of Freenet. It is distributed under the GNU General
* Public License, version 2 (or at your option any later version). See
* http://www.gnu.org/ for further details of the GPL. */
package freenet.node.diagnostics;

import freenet.node.diagnostics.threads.*;

public interface ThreadDiagnostics {
NodeThreadSnapshot getThreadSnapshot();
}
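`ThreadDiagnostics` exposes a single `getThreadSnapshot()` method, and commit 9d93258 describes `NodeThreadSnapshot` as holding the thread list, total CPU and interval; the class itself is not shown in this part of the diff. Since `DiagnosticToadlet` sorts the list it gets back, one way to keep a shared snapshot stable is to hand out copies. The sketch below shows that idea with a hypothetical generic `SnapshotSketch<T>`; the defensive copies are an assumption for illustration, not something the diff confirms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical immutable snapshot: a list of entries plus the sampling interval. */
final class SnapshotSketch<T> {
    private final List<T> threads;
    private final long intervalMillis;

    SnapshotSketch(List<T> threads, long intervalMillis) {
        this.threads = new ArrayList<>(threads); // copy in: later caller changes are ignored
        this.intervalMillis = intervalMillis;
    }

    /** Returns a fresh copy, so callers may sort or filter it freely. */
    List<T> getThreads() {
        return new ArrayList<>(threads);
    }

    long getInterval() {
        return intervalMillis;
    }

    public static void main(String[] args) {
        SnapshotSketch<Integer> snapshot =
                new SnapshotSketch<>(Arrays.asList(3, 1, 2), 1000L);
        List<Integer> view = snapshot.getThreads();
        Collections.sort(view); // reorders only the caller's copy
        System.out.println(view + " vs " + snapshot.getThreads());
    }
}
```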