In internal/pkg/peer/blocksprovider/deliverer.go:150
The local variable totalDuration accumulates the total time slept between retries. Once it exceeds reconnectTotalTimeThreshold, the Deliverer exits if MaxRetryDurationExceededHandler returns true.
However, totalDuration is not reset on success (block reception). As a result, DeliverBlocks() may eventually exit once the cumulative sleep time grows large enough, even though no single consecutive sequence of reconnect failures exceeds the threshold for stopping retries.
    totalDuration += sleepDuration
    if totalDuration > d.MaxRetryDuration {
        if d.MaxRetryDurationExceededHandler() {
            d.Logger.Warningf("attempted to retry block delivery for more than peer.deliveryclient.reconnectTotalTimeThreshold duration %v, giving up", d.MaxRetryDuration)
            return
        }
        d.Logger.Warningf("peer is a static leader, ignoring peer.deliveryclient.reconnectTotalTimeThreshold")
    }
The solution is to reset totalDuration at the same point where failureCounter is reset, that is, after a block is successfully received.
Steps to reproduce
Trigger several bursts of failures, each burst followed by a success and each lasting less than reconnectTotalTimeThreshold.
With enough bursts, totalDuration will exceed reconnectTotalTimeThreshold and the Deliverer will give up, even though it should not.