Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rfq] metrics, hotfixes #2854

Merged
merged 16 commits into from
Jul 8, 2024
Merged
9 changes: 9 additions & 0 deletions ethergo/listener/listener.go
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,8 @@ type chainListener struct {
newBlockHandler NewBlockHandler
finalityMode rpc.BlockNumber
blockWait uint64
// otelRecorder is the recorder for the otel metrics.
otelRecorder iOtelRecorder
}

var (
Expand All @@ -77,6 +79,12 @@ func NewChainListener(omnirpcClient client.EVM, store listenerDB.ChainListenerDB
option(c)
}

var err error
c.otelRecorder, err = newOtelRecorder(handler)
if err != nil {
return nil, fmt.Errorf("could not create otel recorder: %w", err)
}

return c, nil
}

Expand Down Expand Up @@ -183,6 +191,7 @@ func (c *chainListener) doPoll(parentCtx context.Context, handler HandleLog) (er
if err != nil {
return fmt.Errorf("could not put latest block: %w", err)
}
c.otelRecorder.RecordLastBlock(endBlock)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add tests for recording the last block.

The new code for recording the last block using the otelRecorder in the doPoll function is not covered by tests. Ensure that this case is tested to improve code reliability.

Do you want me to generate the unit testing code or open a GitHub issue to track this task?

Tools
GitHub Check: codecov/patch

[warning] 194-194: ethergo/listener/listener.go#L194
Added line #L194 was not covered by tests


c.startBlock = lastUnconfirmedBlock
return nil
Expand Down
117 changes: 117 additions & 0 deletions ethergo/listener/otel.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,117 @@
package listener

import (
"context"
"fmt"
"time"

"github.com/synapsecns/sanguine/core/metrics"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/metric"
)

const meterName = "github.com/synapsecns/sanguine/ethergo/listener"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unused constant meterName.

The constant meterName is defined but not used anywhere in the file. Consider removing it to clean up the code.

- const meterName = "github.com/synapsecns/sanguine/ethergo/listener"
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const meterName = "github.com/synapsecns/sanguine/ethergo/listener"
Tools
GitHub Check: Lint (ethergo)

[failure] 8-8:
meterName is unused (deadcode)


// generate an interface for otelRecorder that exports the public method.
// this allows us to avoid using recordX externally anad makes the package less confusing.
//
// =============================================================================
// =============================================================================
// IMPORTANT: DO NOT REMOVE THIS COMMENT.
// NOTICE: PLEASE MAKE SURE YOU UPDATE BOTH THE DOCS AND THE GRAFANA DASHBOARD (IF NEEDED) AFTER UPDATING METRICS.
// =============================================================================
// =============================================================================
//
//go:generate go run github.com/vburenin/ifacemaker -f otel.go -s otelRecorder -i iOtelRecorder -p listener -o otel_generated.go -c "autogenerated file"
type otelRecorder struct {
metrics metrics.Handler
// meter is the metrics meter.
meter metric.Meter
// lastBlockGauge is the gauge for the last block.
lastBlockGauge metric.Int64ObservableGauge
// lastBlockAgeGauge is the gauge for the last block age.
lastBlockAgeGauge metric.Float64ObservableGauge
// lastBlock is the last block processed by the listener.
lastBlock *uint64
// lastBlockFetchTime is the time the last block was fetched (used to calculate last block age).
lastBlockFetchTime *time.Time
// chainID is the chain ID for the listener.
chainID uint32
}

func newOtelRecorder(meterHandler metrics.Handler) (_ iOtelRecorder, err error) {
or := otelRecorder{
metrics: meterHandler,
meter: meterHandler.Meter(meterName),
lastBlock: nil,
lastBlockFetchTime: nil,
}

or.lastBlockGauge, err = or.meter.Int64ObservableGauge("last_block")
if err != nil {
return nil, fmt.Errorf("could not create last block gauge")
}

or.lastBlockAgeGauge, err = or.meter.Float64ObservableGauge("last_block_age")
if err != nil {
return nil, fmt.Errorf("could not create last block age gauge")
}

_, err = or.meter.RegisterCallback(or.recordLastBlock, or.lastBlockGauge)
if err != nil {
return nil, fmt.Errorf("could not register callback for last block gauge")
}

_, err = or.meter.RegisterCallback(or.recordLastBlockAge, or.lastBlockAgeGauge)
if err != nil {
return nil, fmt.Errorf("could not register callback for last block age gauge")
}

return &or, nil
}

func (o *otelRecorder) recordLastBlock(_ context.Context, observer metric.Observer) (err error) {
if o.metrics == nil || o.lastBlockGauge == nil || o.lastBlock == nil {
return nil
}
Comment on lines +74 to +77
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add tests for error handling.

The error handling blocks in the recordLastBlock function are not covered by tests. Ensure that these cases are tested to improve code reliability.

Do you want me to generate the unit testing code or open a GitHub issue to track this task?

Tools
GitHub Check: codecov/patch

[warning] 74-77: ethergo/listener/otel.go#L74-L77
Added lines #L74 - L77 were not covered by tests


opts := metric.WithAttributes(
attribute.Int(metrics.ChainID, int(o.chainID)),
)
observer.ObserveInt64(o.lastBlockGauge, int64(*o.lastBlock), opts)

return nil
}

func (o *otelRecorder) recordLastBlockAge(_ context.Context, observer metric.Observer) (err error) {
if o.metrics == nil || o.lastBlockAgeGauge == nil || o.lastBlockFetchTime == nil {
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add tests for error handling.

The error handling blocks in the recordLastBlockAge function are not covered by tests. Ensure that these cases are tested to improve code reliability.

Do you want me to generate the unit testing code or open a GitHub issue to track this task?

Tools
GitHub Check: codecov/patch

[warning] 87-90: ethergo/listener/otel.go#L87-L90
Added lines #L87 - L90 were not covered by tests


age := time.Since(*o.lastBlockFetchTime).Seconds()
opts := metric.WithAttributes(
attribute.Int(metrics.ChainID, int(o.chainID)),
)
observer.ObserveFloat64(o.lastBlockAgeGauge, age, opts)

return nil
}

// RecordLastBlock records the last block processed by the listener.
func (o *otelRecorder) RecordLastBlock(lastBlock uint64) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧠 logic: The logic for checking if the last block has changed seems inverted. It should be hasChanged = *o.lastBlock != lastBlock.

// verify if the last block has changed
var hasChanged bool
if o.lastBlock == nil {
hasChanged = true
} else {
hasChanged = *o.lastBlock == lastBlock
}
if !hasChanged {
return
}

// record the block
o.lastBlock = &lastBlock
fetchTime := time.Now()
o.lastBlockFetchTime = &fetchTime
}
9 changes: 9 additions & 0 deletions ethergo/listener/otel_generated.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 1 addition & 7 deletions services/rfq/relayer/service/relayer.go
Original file line number Diff line number Diff line change
Expand Up @@ -387,13 +387,7 @@
} else {
// process in parallel (new goroutine)
request := req // capture func literal
ok := r.semaphore.TryAcquire(1)
if !ok {
span.AddEvent("could not acquire semaphore", trace.WithAttributes(
attribute.String("transaction_id", hexutil.Encode(request.TransactionID[:])),
))
continue
}
err = r.semaphore.Acquire(ctx, 1)

Check warning on line 390 in services/rfq/relayer/service/relayer.go

View check run for this annotation

Codecov / codecov/patch

services/rfq/relayer/service/relayer.go#L390

Added line #L390 was not covered by tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding context with timeout for semaphore acquisition.

To avoid potential deadlocks, consider using a context with timeout when acquiring the semaphore.

-  err = r.semaphore.Acquire(ctx, 1)
+  acquireCtx, cancel := context.WithTimeout(ctx, time.Second * 30)
+  defer cancel()
+  err = r.semaphore.Acquire(acquireCtx, 1)
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
err = r.semaphore.Acquire(ctx, 1)
acquireCtx, cancel := context.WithTimeout(ctx, time.Second * 30)
defer cancel()
err = r.semaphore.Acquire(acquireCtx, 1)

if err != nil {
return fmt.Errorf("could not acquire semaphore: %w", err)
}
Expand Down
Loading