-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Try to fix lock contention #1610
Conversation
WalkthroughThe recent updates involve enhancing the GraphQL service with a new caching mechanism, utilizing mutex locks to manage concurrent access. The resolver function for daily statistics now leverages this cache, ensuring thread-safe operations. Additionally, the parsing and storage of blockchain events have been refined to handle multiple items and ignore non-bridge events. Time-related functionality has been introduced for database querying, with new utility functions for managing a fallback time. Ownership assignments in the Changes
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on X ? TipsChat with CodeRabbit Bot (
|
@coderabbitai review |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1610 +/- ##
===================================================
- Coverage 51.78571% 51.41925% -0.36647%
===================================================
Files 374 366 -8
Lines 25648 24837 -811
Branches 284 284
===================================================
- Hits 13282 12771 -511
+ Misses 11076 10837 -239
+ Partials 1290 1229 -61
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 1
Configuration used: CodeRabbit UI
Files ignored due to filter (5)
- contrib/promexporter/go.mod
- contrib/promexporter/go.sum
- go.work.sum
- services/explorer/go.mod
- services/explorer/go.sum
Files selected for processing (3)
- services/explorer/graphql/server/gin.go (2 hunks)
- services/explorer/graphql/server/graph/queries.resolvers.go (1 hunks)
- services/explorer/graphql/server/graph/resolver.go (2 hunks)
Additional comments: 5
services/explorer/graphql/server/gin.go (2)
11-11: The import of the
mapmutex
package is correctly placed and follows the convention of grouping third-party imports together.41-41: The
CacheMutex
field is properly initialized withmapmutex.NewStringMapMutex()
. Ensure that all usages of thegraph.Resolver
struct across the codebase are updated to handle this new field.
The shell script output indicates that the
CacheMutex
field is only referenced in thegin.go
file where it was added. There are no other instances in the codebase where thegraph.Resolver
struct is being initialized or theCacheMutex
field is being handled. This suggests that the changes are localized and do not affect other parts of the codebase.Final review comment:
The initialization of the
CacheMutex
field inservices/explorer/graphql/server/gin.go
is correct and no further action is required as there are no other references to this field in the codebase.services/explorer/graphql/server/graph/queries.resolvers.go (1)
- 189-200: The implementation of caching in the
DailyStatisticsByChain
resolver function appears to be correct. The use of a mutex lock to prevent race conditions when accessing the cache is a good practice to mitigate lock contention issues. Ensure that thekeyGenHandleNilInt
function correctly handlesnil
values to prevent potential cache key collisions.services/explorer/graphql/server/graph/resolver.go (2)
4-4: The import of
mapmutex
is correctly added to usemapmutex.StringMapMutex
in theResolver
struct.22-24: The
DB
,Fetcher
, andCache
fields have been added to theResolver
struct. Ensure that these fields are properly initialized and used within the codebase, and that their addition is intentional and related to the PR's objective of fixing lock contention.
The verification scripts did not produce any output, which suggests that the new fields
DB
,Fetcher
, andCache
in theResolver
struct may not be initialized or used in the codebase. This could indicate that the changes are incomplete or that the patterns used in the scripts did not match the code structure correctly. However, without further evidence, we cannot definitively conclude that there is an issue.Final review comment:
The verification scripts did not find instances where the new
DB
,Fetcher
, andCache
fields in theResolver
struct are initialized or used. Please ensure these fields are properly integrated into the codebase if they are relevant to the PR's objective.
DB db.ConsumerDB | ||
Fetcher fetcher.ScribeFetcher | ||
Cache cache.Service | ||
// CacheMutex is a mutex used for caching. It is used to prevent multiple | ||
// stat queries from being made at the same time. In the future, this should be done somewhere else. | ||
CacheMutex mapmutex.StringMapMutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The addition of the CacheMutex
field to the Resolver
struct is consistent with the PR's objective to address lock contention. However, the comment on lines 25-26 suggests that this is a temporary solution and should be moved elsewhere in the future. It's important to track this technical debt and ensure it is addressed in subsequent work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 2
Configuration used: CodeRabbit UI
Files selected for processing (4)
- CODEOWNERS (2 hunks)
- services/explorer/api/suite_test.go (2 hunks)
- services/explorer/graphql/server/graph/fetcher.go (3 hunks)
- services/explorer/graphql/server/graph/queryutils.go (2 hunks)
Files skipped from review due to trivial changes (1)
- CODEOWNERS
Additional comments: 6
services/explorer/api/suite_test.go (1)
- 5-15: The addition of the
graph
andtime
package imports is consistent with the changes made in theSetupTest
function to manage fallback time for tests. Ensure that thegraph
package's time management functions are well-tested and thread-safe, especially if tests are run in parallel.services/explorer/graphql/server/graph/fetcher.go (3)
282-284: The loop added to
parseAndStoreLog
function to store each parsed log could potentially address lock contention by processing logs individually. Ensure that thestoreBridgeEvent
function is thread-safe to handle concurrent calls without causing new contention issues.315-317: Similarly, the loop added to
parseAndStoreLogCCTP
function to store each parsed log should be reviewed for thread safety. Confirm that concurrent execution ofstoreBridgeEvent
does not introduce race conditions or other concurrency issues.491-491: The comment "// will ignore non-bridge events" has been added, but the provided code snippet does not show any logic for filtering non-bridge events. Ensure that the comment accurately reflects the behavior of the
storeBridgeEvent
function.services/explorer/graphql/server/graph/queryutils.go (2)
1638-1639: Creating a context with a timeout based on a potentially mutable global variable can lead to unpredictable behavior if the variable is changed during operation. Ensure that the usage of
timeToFallback
is safe and consider if there's a need to protect against concurrent modifications when setting the fallback time.1667-1670: The error handling strategy in
GetDestinationBridgeTxBW
relies on thebridgeEventMV
being nil or having aTChainID
of 0 to trigger a fallback. Verify that the fallback functionbwDestinationFallback
is robust and that this error handling strategy is consistent with the rest of the application.
func (g *APISuite) SetupTest() { | ||
g.TestSuite.SetupTest() | ||
|
||
initialFallback := graph.GetFallbackTime() | ||
graph.UnsafeSetFallbackTime(time.Second * 20) | ||
g.TestSuite.DeferAfterTest(func() { | ||
graph.UnsafeSetFallbackTime(initialFallback) | ||
}) | ||
|
||
g.db, g.eventDB, g.gqlClient, g.logIndex, g.cleanup, g.testBackend, g.deployManager = testutil.NewTestEnvDB(g.GetTestContext(), g.T(), g.scribeMetrics) | ||
|
||
cleanup, port, err := clickhouse.NewClickhouseStore("explorer") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in the SetupTest
function to set and reset the fallback time are good practice to prevent side effects between tests. However, verify that the GetFallbackTime
and UnsafeSetFallbackTime
functions are thread-safe to avoid race conditions when tests are run concurrently.
var timeToFallback = time.Second * 0 | ||
|
||
// GetFallbackTime gets the fallback time for the bridge watcher. | ||
// this is intended only for testing | ||
func GetFallbackTime() time.Duration { | ||
return timeToFallback | ||
} | ||
|
||
// UnsafeSetFallbackTime sets the fallback time for the bridge watcher. | ||
// it is intended for testing. Plese remember to reset this value. | ||
func UnsafeSetFallbackTime(ttf time.Duration) { | ||
timeToFallback = ttf | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The introduction of a global variable for controlling timeouts (timeToFallback
) and the associated functions (GetFallbackTime
and UnsafeSetFallbackTime
) could lead to race conditions if accessed concurrently. Consider using a more thread-safe approach to manage timeouts, such as encapsulating the timeout within a struct and using methods with proper synchronization mechanisms.
- var timeToFallback = time.Second * 0
+ var timeToFallback = atomic.Value{}
- timeToFallback = ttf
+ timeToFallback.Store(ttf)
- return timeToFallback
+ return timeToFallback.Load().(time.Duration)
Committable suggestion
❗ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
var timeToFallback = time.Second * 0 | |
// GetFallbackTime gets the fallback time for the bridge watcher. | |
// this is intended only for testing | |
func GetFallbackTime() time.Duration { | |
return timeToFallback | |
} | |
// UnsafeSetFallbackTime sets the fallback time for the bridge watcher. | |
// it is intended for testing. Plese remember to reset this value. | |
func UnsafeSetFallbackTime(ttf time.Duration) { | |
timeToFallback = ttf | |
} | |
var timeToFallback = atomic.Value{} | |
// GetFallbackTime gets the fallback time for the bridge watcher. | |
// this is intended only for testing | |
func GetFallbackTime() time.Duration { | |
return timeToFallback.Load().(time.Duration) | |
} | |
// UnsafeSetFallbackTime sets the fallback time for the bridge watcher. | |
// it is intended for testing. Plese remember to reset this value. | |
func UnsafeSetFallbackTime(ttf time.Duration) { | |
timeToFallback.Store(ttf) | |
} |
Attempt lock contention fix
Summary by CodeRabbit
New Features
Enhancements
fetcher
to handle multiple bridge events.Refactor
Tests
Chores
CODEOWNERS
file to reflect changes in directory ownership assignments.