Fullnode is not syncing #6155

Open
cranycrane opened this issue Jan 23, 2025 · 2 comments

cranycrane commented Jan 23, 2025

1. What did you do?
Deployed a fullnode.

2. What did you expect to see?
The node gradually syncing up to the latest block on the network.

3. What did you see instead?
Node sync is extremely slow (only a few blocks per hour).

System:
OS: Debian GNU/Linux 11
CPU: Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz
RAM: 64GB
SSD: 3TB
Bandwidth: 700M

I followed the steps for deployment on https://github.com/tronprotocol/java-tron.
I have:

  1. Downloaded snapshot from November 2024
  2. Downloaded the GreatVoyage-v4.7.7.jar
  3. Created a systemd service:
ExecStart=/usr/bin/java -Xms16G -Xmx16G \
           -XX:ReservedCodeCacheSize=256m \
           -XX:MetaspaceSize=256m \
           -XX:MaxMetaspaceSize=512m \
           -XX:MaxDirectMemorySize=1G \
           -XX:+PrintGCDetails \
           -XX:+PrintGCDateStamps \
           -Xloggc:gc.log \
           -XX:+UseConcMarkSweepGC \
           -XX:NewRatio=2 \
           -XX:+CMSScavengeBeforeRemark \
           -XX:+ParallelRefProcEnabled \
           -XX:+HeapDumpOnOutOfMemoryError \
           -XX:+UseCMSInitiatingOccupancyOnly \
           -XX:CMSInitiatingOccupancyFraction=70 \
           -jar FullNode.jar -c main_net_config.conf
  4. Created a main_net_config.conf by copying from: https://github.com/tronprotocol/tron-deployment/blob/master/main_net_config.conf
  5. Enabled and set up Prometheus and Grafana
  6. Waited 24+ hours to see whether syncing would start
    Note: No other software is running on the machine or calling the node's API.

For the first 2-3 minutes, I see in the log that the node receives new blocks (around 10 in a matter of seconds), but then syncing stops and the node continuously connects to and disconnects from other peers.
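
To put a number on "slow", one option is to estimate the sync rate directly from tron.log. The sketch below assumes the log contains lines of the form `HH:MM:SS.mmm ... Success process block Num:<n>,ID:...` (the format of the entries quoted later in this thread) and that the log spans less than 24 hours. The function name `blocks_per_hour` is made up for illustration and is not part of java-tron.

```python
import re
from datetime import datetime, timedelta

# Matches the time-of-day prefix and the block number on "Success process block" lines.
LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) .*Success process block Num:(\d+)")

def blocks_per_hour(log_text):
    """Estimate the sync rate from the first and last matching log lines.

    Returns None when fewer than two matching lines are found or when
    no time has elapsed between them.
    """
    points = []
    for line in log_text.splitlines():
        m = LINE.match(line)
        if m:
            stamp = datetime.strptime(m.group(1), "%H:%M:%S.%f")
            points.append((stamp, int(m.group(2))))
    if len(points) < 2:
        return None
    (t0, n0), (t1, n1) = points[0], points[-1]
    elapsed = t1 - t0
    if elapsed < timedelta(0):
        # Log lines carry no date; assume at most one midnight rollover.
        elapsed += timedelta(days=1)
    if elapsed == timedelta(0):
        return None
    return (n1 - n0) / (elapsed.total_seconds() / 3600)
```

For example, `blocks_per_hour(open("tron.log").read())` gives a rough figure to compare before and after a configuration change.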

Appreciate any help.

Thank you!

Screenshots from Grafana: (three images)

Logs:
gc.log.gz
tron.log.gz

angrynurd commented Jan 23, 2025

@cranycrane
Below is the key error log for the last synced block in tron.log:

`09:43:20.920 INFO  [sync-handle-block] [DB](Manager.java:1354) PushBlock block number: 67414430, cost/txs: 3645/441 true.
09:43:20.920 INFO  [sync-handle-block] [net](TronNetDelegate.java:269) Success process block Num:67414430,ID:000000000404a99e808e926f5083fb08a88e3c72f0283d61dbab50032953b3a6
09:43:20.937 INFO  [sync-handle-block] [DB](Manager.java:1233) Block num: 67414431, re-push-size: 0, pending-size: 0, block-tx-size: 353, verify-tx-size: 353

09:43:25.376 INFO  [sync-handle-block] [VM](Program.java:1134) minTimeRatio: 0.0, maxTimeRatio: 5.0, vm should end time in us: 4669859944383, vm now time in us: 4669860184725, vm start time in us: 4669859978982
09:43:25.376 INFO  [sync-handle-block] [VM](VM.java:92) VM halted: [CPU timeout for 'RETURNDATASIZE' operation executing]
09:43:25.376 INFO  [sync-handle-block] [VM](VM.java:92) VM halted: [CPU timeout for 'RETURNDATASIZE' operation executing]
09:43:25.376 INFO  [sync-handle-block] [VM](VMActuator.java:276) timeout: CPU timeout for 'RETURNDATASIZE' operation executing
09:43:25.808 INFO  [sync-handle-block] [VM](Program.java:1134) minTimeRatio: 0.0, maxTimeRatio: 5.0, vm should end time in us: 4669860585204, vm now time in us: 4669860616969, vm start time in us: 4669860390579
09:43:25.808 INFO  [sync-handle-block] [VM](VM.java:92) VM halted: [CPU timeout for 'PUSH2' operation executing]
09:43:25.808 INFO  [sync-handle-block] [VM](VM.java:92) VM halted: [CPU timeout for 'PUSH2' operation executing]
09:43:25.808 INFO  [sync-handle-block] [VM](VM.java:92) VM halted: [CPU timeout for 'PUSH2' operation executing]
09:43:25.808 INFO  [sync-handle-block] [VM](VMActuator.java:276) timeout: CPU timeout for 'PUSH2' operation executing
09:43:25.808 INFO  [sync-handle-block] [DB](Manager.java:1486) Retry result when push: true, for tx id: 78955dad753055240809bb373c496fe239d953dca80aa0a729daaa27dfc38e04, tx resultCode in receipt: OUT_OF_TIME.
09:43:25.808 ERROR [sync-handle-block] [DB](Manager.java:1329) different resultCode txId: 78955dad753055240809bb373c496fe239d953dca80aa0a729daaa27dfc38e04, expect: SUCCESS, actual: OUT_OF_TIME
org.tron.core.exception.ReceiptCheckErrException: different resultCode txId: 78955dad753055240809bb373c496fe239d953dca80aa0a729daaa27dfc38e04, expect: SUCCESS, actual: OUT_OF_TIME
	`

This tells us:
Transaction 78955dad753055240809bb373c496fe239d953dca80aa0a729daaa27dfc38e04 failed due to a VM timeout during execution.
There were two VM timeout events: a RETURNDATASIZE operation timeout and a PUSH2 operation timeout. The expected result was SUCCESS, but the actual result was OUT_OF_TIME.
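
If you want to check the rest of the log for the same pattern, a small script can pull out the mismatches and how far each VM run overshot its deadline. This is only a sketch: the regular expressions are copied from the log lines quoted above and may not match other java-tron versions, and `summarize` is a hypothetical helper name.

```python
import re

# Patterns taken from the quoted tron.log lines; the wording may differ in other versions.
MISMATCH = re.compile(
    r"different resultCode txId: (?P<tx>[0-9a-f]{64}), "
    r"expect: (?P<expect>\w+), actual: (?P<actual>\w+)"
)
VM_TIMING = re.compile(
    r"vm should end time in us: (?P<end>\d+), "
    r"vm now time in us: (?P<now>\d+)"
)

def summarize(log_text):
    """Return (mismatches, overruns_us) found in the log text.

    mismatches: sorted unique (tx_id, expected, actual) tuples.
    overruns_us: for each VM timing line, how many microseconds
    "now" was past the "should end" deadline.
    """
    mismatches = set()
    overruns_us = []
    for line in log_text.splitlines():
        m = MISMATCH.search(line)
        if m:
            # The same mismatch is logged twice (ERROR line and exception line).
            mismatches.add((m["tx"], m["expect"], m["actual"]))
        t = VM_TIMING.search(line)
        if t:
            overruns_us.append(int(t["now"]) - int(t["end"]))
    return sorted(mismatches), overruns_us
```

On the excerpt above this yields a single mismatch (expected SUCCESS, got OUT_OF_TIME) and overruns of 240342 and 31765 microseconds, i.e. the local node needed noticeably longer to execute the transaction than the time it was allowed.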

I've reviewed your startup command, and the heap allocation looks too small: 16G does not meet the minimum I would recommend, and a bigger JVM heap is better (e.g. 32G). The system load metrics otherwise look normal across the board.

I would suggest two steps to resolve this:
1. Update your startup command to set both the initial and maximum heap size to 32G:
-Xms32G -Xmx32G
2. Change the config below to increase the time tolerance for transaction verification:

minTimeRatio = 0.0
maxTimeRatio = 20.0
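
Before restarting, it can help to confirm that both changes actually landed. The following is a minimal sketch, not an official tool: `heap_flags` and `time_ratios` are hypothetical helpers that simply search text with regular expressions, so they do not understand config comments or nesting. I believe these two keys sit in the `vm` section of the stock main_net_config.conf, but please verify that against your own file.

```python
import re

def heap_flags(exec_start):
    """Extract -Xms / -Xmx values (e.g. '32G') from a java command line."""
    return dict(re.findall(r"-(Xms|Xmx)(\d+[kKmMgG]?)", exec_start))

def time_ratios(config_text):
    """Extract minTimeRatio / maxTimeRatio assignments from config text.

    Plain text search only: a commented-out assignment would also match.
    """
    pairs = re.findall(
        r"\b(minTimeRatio|maxTimeRatio)\s*[=:]\s*([0-9.]+)", config_text
    )
    return {name: float(value) for name, value in pairs}

# Example using the values suggested in this comment.
print(heap_flags("/usr/bin/java -Xms32G -Xmx32G -jar FullNode.jar -c main_net_config.conf"))
print(time_ratios("minTimeRatio = 0.0\nmaxTimeRatio = 20.0"))
```

Since the node runs as a systemd service, remember to run `systemctl daemon-reload` after editing the unit file and then restart the service, otherwise the old heap flags stay in effect.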

cranycrane (Author) commented

Thank you, @angrynurd

I have tried both of the steps you suggested. At first glance the problem appears to be solved: blocks are now syncing continuously.

I will follow up later with an update on how the syncing is going.
