bug: sync stuck due to networking error #240

Open
hunjixin opened this issue Jan 9, 2025 · 2 comments

hunjixin commented Jan 9, 2025

Reproduced on the Mocha network:

  1. Set up a new full node and start syncing.
  2. Cut the network connection (turn off Wi-Fi, unplug the cable, etc.).
  3. Wait for all peers to fail.
  4. Restore the network connection.
  5. The sync goroutine never recovers.

The sync goroutine is stuck:
image

If the network is down, the number of peers in peerQueue keeps shrinking, because every request fails with an error other than ErrNotFound (a failed network connection, etc.), and eventually the havePeer channel runs dry. From that point on, GetRangeByHeight waits on the havePeer channel forever, while getRangeByHeight waits on the result channel.
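
The stall described above can be reproduced in a toy model. This is only a sketch under stated assumptions: `peerQueue`, `request`, `stallsAfterOutage`, `errNotFound`, and `errNetwork` are hypothetical stand-ins written for illustration, not the library's actual types, and the model assumes that a peer is returned to the queue only on success or on a not-found error.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Illustrative errors: errNotFound stands in for ErrNotFound, errNetwork for
// any connection failure. Neither is the library's actual value.
var (
	errNotFound = errors.New("not found")
	errNetwork  = errors.New("network unreachable")
)

// peerQueue is a toy model in which havePeer holds the peers that are
// currently available for requests.
type peerQueue struct {
	havePeer chan string
}

func newPeerQueue(n int) *peerQueue {
	q := &peerQueue{havePeer: make(chan string, n)}
	for i := 0; i < n; i++ {
		q.havePeer <- fmt.Sprintf("peer-%d", i)
	}
	return q
}

// request takes a peer, runs do against it, and returns the peer to the queue
// only on success or errNotFound. Any other error drops the peer.
func (q *peerQueue) request(ctx context.Context, do func(peer string) error) error {
	select {
	case peer := <-q.havePeer:
		err := do(peer)
		if err == nil || errors.Is(err, errNotFound) {
			q.havePeer <- peer
		}
		return err
	case <-ctx.Done():
		return ctx.Err()
	}
}

// stallsAfterOutage fails one request per peer, then "restores" the network
// and reports whether the next request is still blocked after waitMillis.
func stallsAfterOutage(peers, waitMillis int) bool {
	q := newPeerQueue(peers)
	for i := 0; i < peers; i++ {
		_ = q.request(context.Background(), func(string) error { return errNetwork })
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(waitMillis)*time.Millisecond)
	defer cancel()
	err := q.request(ctx, func(string) error { return nil })
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("stalled:", stallsAfterOutage(3, 50)) // prints "stalled: true"
}
```

In this model the last request would succeed if any peer were left, but the queue was emptied during the outage and nothing refills it, so only the deadline ends the wait.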

One candidate fix is to push the peer's state back to the queue on an errEmptyResponse error: #238
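
A rough sketch of that idea, not the actual change in #238: the names `peerStat`, `peerQueue`, `requeueOnEmpty`, `recoversAfterOutage`, and the score adjustments are assumptions made for illustration, and `errEmptyResponse` here is a local stand-in. The queue is non-blocking to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for illustration only.
var (
	errEmptyResponse = errors.New("empty response")
	errNoPeers       = errors.New("no peers available")
)

// peerStat is a hypothetical per-peer record with a quality score.
type peerStat struct {
	id    string
	score int
}

type peerQueue struct {
	havePeer       chan *peerStat
	requeueOnEmpty bool // when true, peers survive an errEmptyResponse
}

func newPeerQueue(n int, requeueOnEmpty bool) *peerQueue {
	q := &peerQueue{havePeer: make(chan *peerStat, n), requeueOnEmpty: requeueOnEmpty}
	for i := 0; i < n; i++ {
		q.havePeer <- &peerStat{id: fmt.Sprintf("peer-%d", i)}
	}
	return q
}

// request pops a peer without blocking and runs do against it. On success the
// peer is rewarded and pushed back. On errEmptyResponse it is penalized and,
// if requeueOnEmpty is set, pushed back as well. Other errors drop the peer.
func (q *peerQueue) request(do func(id string) error) error {
	var p *peerStat
	select {
	case p = <-q.havePeer:
	default:
		return errNoPeers
	}
	err := do(p.id)
	switch {
	case err == nil:
		p.score++
		q.havePeer <- p
	case errors.Is(err, errEmptyResponse) && q.requeueOnEmpty:
		p.score--
		q.havePeer <- p
	}
	return err
}

// recoversAfterOutage fails one request per peer with errEmptyResponse and
// reports whether a request succeeds once the network is back.
func recoversAfterOutage(peers int, requeueOnEmpty bool) bool {
	q := newPeerQueue(peers, requeueOnEmpty)
	for i := 0; i < peers; i++ {
		_ = q.request(func(string) error { return errEmptyResponse })
	}
	return q.request(func(string) error { return nil }) == nil
}

func main() {
	fmt.Println(recoversAfterOutage(3, false)) // prints "false"
	fmt.Println(recoversAfterOutage(3, true))  // prints "true"
}
```

One trade-off of this approach is that a peer which is genuinely unresponsive also stays in the queue, so the score (or some similar signal) has to keep it from being picked indefinitely.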

Another fix is to add a timeout to the GetRangeByHeight call:

func (s *Syncer[H]) requestHeaders(
	ctx context.Context,
	fromHead H,
	to uint64,
) error {
	amount := to - fromHead.Height()
	// start requesting headers until amount remaining will be 0
	for amount > 0 {
		size := header.MaxRangeRequestSize
		if amount < size {
			size = amount
		}

		to := fromHead.Height() + size + 1
		s.metrics.rangeRequestStart()
		// To fix: add a timeout to this context.
		headers, err := s.getter.GetRangeByHeight(ctx, fromHead, to)
		s.metrics.updateGetRangeRequestInfo(s.ctx, int(size)/100, err != nil)
		s.metrics.rangeRequestStop()
		if err != nil {
			return err
		}

		if err := s.storeHeaders(ctx, headers...); err != nil {
			return err
		}

		amount -= size // size == len(headers)
		fromHead = headers[len(headers)-1]
	}
	return nil
}
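
The timeout idea marked in the snippet above could look roughly like this. It is a self-contained sketch: `rangeGetter`, `getRangeWithTimeout`, `stuckGetter`, and `timedOut` are hypothetical names, headers are simplified to `uint64` heights, and the timeout value is arbitrary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// rangeGetter is a simplified stand-in for a GetRangeByHeight-style call.
type rangeGetter func(ctx context.Context, from, to uint64) ([]uint64, error)

// getRangeWithTimeout bounds a single range request with its own deadline,
// derived from the caller's context, so one stuck request cannot block forever.
func getRangeWithTimeout(
	ctx context.Context,
	get rangeGetter,
	from, to uint64,
	timeoutMillis int,
) ([]uint64, error) {
	reqCtx, cancel := context.WithTimeout(ctx, time.Duration(timeoutMillis)*time.Millisecond)
	defer cancel()
	return get(reqCtx, from, to)
}

// stuckGetter models a request that never gets a peer: it returns only when
// its context is done.
func stuckGetter(ctx context.Context, _, _ uint64) ([]uint64, error) {
	<-ctx.Done()
	return nil, ctx.Err()
}

// timedOut reports whether a stuck request is released by the deadline.
func timedOut(timeoutMillis int) bool {
	_, err := getRangeWithTimeout(context.Background(), stuckGetter, 1, 65, timeoutMillis)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("timed out:", timedOut(50)) // prints "timed out: true"
}
```

A timeout only helps if the callee honors context cancellation, and it turns the stall into an error, so the caller still has to retry the range request for sync to resume.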

@hunjixin changed the title from "bug: sync" to "bug: sync stuck due to networking error" on Jan 9, 2025

@Wondertan
Member

Hey @hunjixin, thanks for the detailed bug report. I am gonna take a look soon, unless @vgonkivs wants to beat me to it and look sooner.

@vgonkivs
Member

vgonkivs commented Jan 9, 2025

Thanks for opening an issue, @hunjixin. I will take a closer look and report back to you asap.
