Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CON-151] Refactor snapback into monitoring queue #3142

Merged
merged 9 commits into from
May 25, 2022

Conversation

theoilie
Copy link
Contributor

@theoilie theoilie commented May 24, 2022

Description

  • Made new StateMonitoringQueue and StateReconciliationQueue classes
    • Only StateMonitoringQueue is implemented so I can get feedback before doing something similar for StateReconciliationQueue
    • Add queue to service registry and bull-board
    • Rate limit the queue to a default of 1 job per minute to avoid spamming other nodes with requests. New config options (exposed in health check):
      • stateMonitoringQueueRateLimitInterval: interval (ms) during which at most stateMonitoringQueueRateLimitJobsPerInterval jobs will run
      • stateMonitoringQueueRateLimitJobsPerInterval: number of state monitoring jobs that can run in each interval (0 to pause queue)
  • Refactored the data-gathering part of processStateMachineOperation into its own file
    • Only using functions (no class variables) and new job data (jobs also return their results so bull-board gives more info
    • Made logs more searchable (removed characters like colon, equals, and parenthesis because loggly doesn't seem to be able to escape them)
  • Extracted parts of snapback and peerSetManager into various stateless modules (nodeHealthManager.js, nodeToSpIdManager.js, utils.js, monitoring/utils.js, and clockUtils.js)
  • Move the functionality that updates the list of Service Providers to its own top-level call so we don't query that endpoint on each run. This should be moved to its own cron/queue later

Tests

  • Verified local logs and bull-board job data/results -- rate limiting is working at 1 job per minute
  • Added tests for StateMonitoringQueue
  • This will need more tests. Most core functions are copy/pasted from snapback with minor edits, so they were mostly tested before but need coverage in their new context

How will this change be monitored? Are there sufficient logs?

  • Search logs for StateMonitoringQueue
  • Verify that the monitoring jobs are running on bull-board (/health/bull/ endpoint) at 1 job per minute, and inspect their data. Jobs now take data as input and output their results, so they can be more easily debugged from bull-board without combing logs
    • Compare the output of jobs in the new monitoring queue with the output of steps for jobs from the existing snapback (state-machine) queue. The results of each step in the decision tree should be the same
  • Look through logs containing processStateMonitoringJob if necessary. This is the new equivalent of processStateMachineOperation

@theoilie theoilie requested a review from SidSethi May 24, 2022 02:16
currentModuloSlice
}
try {
// TODO: Wire up metrics
Copy link
Contributor Author

@theoilie theoilie May 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will wire this up for its own health check later if desired. there can be separate checks for monitoring and reconciliation queues

Copy link
Contributor

@SidSethi SidSethi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

submitting what i have so far lol, so much code! still going through

@SidSethi SidSethi self-requested a review May 24, 2022 21:40
- Remove startTime (it's not used, and Bull already tells us the start time)
- Move RECONFIG_MODES to constants
- Rename nodeToSpId to cNodeToSpIdMap
theoilie added 3 commits May 25, 2022 16:47
- Separate business logic from event registration
- Consolidate clockUtils into utils
- Make util export functions instead of using class
Copy link
Contributor

@SidSethi SidSethi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

incredible - only ask is to change default values to 0 for now

creator-node/src/config.js Show resolved Hide resolved
@theoilie theoilie merged commit 27ccd72 into master May 25, 2022
@theoilie theoilie deleted the theo-refactor-snapback-queues branch May 25, 2022 21:49
@theoilie theoilie changed the title Refactor snapback into monitoring queue [CON-151] Refactor snapback into monitoring queue May 26, 2022
sliptype pushed a commit that referenced this pull request Sep 10, 2023
[a82b6d2] [C-2391 PLAT-833] Handle get-discovery-notification error, notification processing (#3141) Dylan Jeffers
[b6a8470] [PAY-1073] Mobile messages settings screen (#3131) Reed
[c567b35] Separate chat text input to own component (#3142) Reed
[1a38cb1] [PAY-1090] - Subscribe to special access gates to unlock new tracks (#3108) Saliou Diallo
[9e5a45d] Set dev optimizely key (#3126) Sebastian Klingler
[c5e86fa] [PLAT-699] Add tastemaker notification to web (#3140) Dylan Jeffers
[4bc608c] [C-2295] Only show now-playing-artwork-tile when song enqueued (#3137) Dylan Jeffers
[aa34b32] [PAY-1039] Mobile chat header navs to profile (#3139) Reed
[55f5819] Optimizely attributes - use null if empty instead of undefined (#3134) nicoback2
[92ab79e] [C-2387] Improve permalink and user handle caching (#3117) Dylan Jeffers
[086af81] [PAY-1107][PAY-1112] Center mobile chat text input, enlarge icon (#3136) Reed
[fcec3f5] [PAY-1123] Fix mobile chat scroll being stuck (#3135) Reed
[a276e86] Revert "Allow Optimizely targeting w CodePush on mobile C-2394" (#3133) nicoback2
[3335b4a] [C-2357] Update the icon color of the podcast skip buttons to accept the theme color (#3132) Kyle Shanks
[1af5730] Fix dapp store versionCode (#3121) Dylan Jeffers
[7787115] Fix app store version fetched from Apple in Fastlane being out of date first 24h after new release C-2397 (#3128) nicoback2
[06c181d] Allow Optimizely targeting w CodePush on mobile C-2394 (#3127) nicoback2
[ff07c32] [PAY-1038] Fix mobile profile page button borders (#3130) Reed
[9286959] [PAY-1027] Mobile chat websockets (#3129) Reed
[81ef3b3] [PAY-1031] Mobile chat list views use full height (#3123) Reed
[41e3f5d] [C-2231] Remove audius-data, dynamically import web3modal (#1771) Raymond Jacobson
[34d80be] [PAY-1111] Remove "Done" row on ios keyboard for mobile chats (#3124) Reed
[9eb83a8] [PAY-1102] - Add supporters info drawer (#3112) Saliou Diallo
[6fe7abe] Add dependencies, bundle to mobile README (#3111) Reed
[dc8d4f2] Show header shadow in chat screens (#3106) Reed
[f871a38] [PAY-906] Mobile chats KeyboardAvoidingView (#3122) Reed
[fd00bf2] Fix chat reactions with createEntityAdapter (#3118) Reed
[9876137] Switch to 1s (#3120) Raymond Jacobson
[fe9efea] Tip using both erc_wallet and wallet (#3049) Michael Piazza
[fdfc0b6] Update dapp store readme (#3119) Raymond Jacobson
[ca58e69] [C-2359, C-2380] Update duration text for podcast tiles (#3116) Kyle Shanks
[6441421] [PLAT-695] Add trending notifications to valid_types (#3109) Dylan Jeffers
[4233ee6] [PAY-1083] - Only allow prompt modal when single track is being uploaded (#3115) Saliou Diallo
[36909d1] Always include user signature for stream (#3114) Raymond Jacobson
[b90eaf6] [C-2386] Fix notification button not toggling panel (#3110) Dylan Jeffers
[a3c2e91] Check if messagesStatus undefined in ChatScreen (#3113) Reed
[64d89ca] [C-2382] Update dapp-store cli, short_description length (#3104) Dylan Jeffers
[680931d] [C-2361, C-2360, C-2355, C-2346] Update skip 15s button behavior on mobile for podcasts (#3105) Kyle Shanks
[3eb742f] [C-2353, C-2358] Update playback position tracking for web and mobile (#3097) Kyle Shanks
[e06c4d9] Bump sdk to 2.0.3-beta.1 (#3098) Isaac Solo
[8191694] Fix "Pick some for me" button on android (#3102) Sebastian Klingler
@AudiusProject AudiusProject deleted a comment from linear bot Sep 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants