Assistance with Using Equiformer_v2 for Structure Relaxation #21

Open
hhh846 opened this issue Jan 7, 2025 · 4 comments

@hhh846

hhh846 commented Jan 7, 2025

Hello!

I hope you're doing well. I’d like to use Equiformer_v2 for structure relaxation, but I’m not very experienced with coding. Could you please provide me with an example? Specifically:

Where should I place the initial structure?
How should I modify the input files to work with the model?
Thank you in advance for your help!

Best regards,

@yilunliao
Member

Hi @hhh846

Thanks for your interest.

Below are the steps to run relaxations:

  1. To run relaxations, I recommend using this later version, as here.
  2. Once you have set things up following the instructions as here, you can uncomment these lines to enable the relaxation-related configs. You can specify the path to the dataset on which relaxation is performed here, and the resulting relaxation trajectories will be saved here.
  3. Then, similar to the script, you can run one of the following two commands to perform relaxation:
CHECKPOINT=""  # path to the checkpoint file; it must match the config used below
# Multi-node
python main.py \
    --mode 'run-relaxations' \
    --distributed \
    --num-gpus 8 \
    --num-nodes 2 \
    --config-yml 'experimental/configs/oc20/all-md/equiformer_v2/equiformer_v2_dens_N@20_L@6_M@3_lr@4e-4_epochs@[email protected]_dens-relax-data-only.yml' \
    --identifier 'relax-id-oc20-all-md_equiformer-v2-dens_epochs@2' \
    --run-dir 'models/oc20/all-md/equiformer_v2_dens/relax-id' \
    --submit \
    --amp \
    --checkpoint $CHECKPOINT

# Single-node
python -u -m torch.distributed.launch --nproc_per_node=8 main.py \
    --distributed \
    --num-gpus 8 \
    --mode 'run-relaxations' \
    --config-yml 'experimental/configs/oc20/all-md/equiformer_v2/equiformer_v2_dens_N@20_L@6_M@3_lr@4e-4_epochs@[email protected]_dens-relax-data-only.yml' \
    --identifier 'relax-id-oc20-all-md_equiformer-v2-dens_epochs@2' \
    --run-dir 'models/oc20/all-md/equiformer_v2_dens/relax-id' \
    --amp \
    --checkpoint $CHECKPOINT
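
Both commands above leave `CHECKPOINT` empty, so it has to be filled in before launching. As a minimal sketch (not part of the repository), the single-node command could be assembled in Python with a guard against an empty checkpoint path. The function name `build_relax_command` and the example paths are hypothetical; the flags themselves are copied from the single-node command above.

```python
import shlex


def build_relax_command(config_yml, checkpoint, identifier, run_dir, num_gpus=8):
    """Assemble the single-node relaxation command as an argument list."""
    if not checkpoint:
        # Mirrors the CHECKPOINT="" placeholder: refuse to launch without a path.
        raise ValueError("checkpoint path is empty; set it before launching")
    return [
        "python", "-u", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}", "main.py",
        "--distributed",
        "--num-gpus", str(num_gpus),
        "--mode", "run-relaxations",
        "--config-yml", config_yml,
        "--identifier", identifier,
        "--run-dir", run_dir,
        "--amp",
        "--checkpoint", checkpoint,
    ]


if __name__ == "__main__":
    cmd = build_relax_command(
        config_yml="path/to/relax-config.yml",  # hypothetical path
        checkpoint="path/to/checkpoint.pt",     # hypothetical path
        identifier="relax-id-oc20-all-md_equiformer-v2-dens_epochs@2",
        run_dir="models/oc20/all-md/equiformer_v2_dens/relax-id",
    )
    # Print the command so it can be inspected before it is actually run.
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root once the printed command looks right.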

Also, the initial structures are provided in the OC20 dataset itself, so the OC20 paper is a good place to look for more details.

Please feel free to let me know if you have any further questions.

@hhh846
Author

hhh846 commented Jan 10, 2025

Hi,

I followed your instructions and tested the setup by placing my initial structures in an LMDB file and using the checkpoint OC20 S2EF-2M_epoch30 for structure relaxation. However, I encountered the following error:

File "/share/apps/miniconda3/envs/equiformer_v2_torch1_ocp2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1671, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DistributedDataParallel:
size mismatch for module.mappingReduced.l_harmonic: copying a param with shape torch.Size([29]) from checkpoint, the shape in current model is torch.Size([37]).
size mismatch for module.mappingReduced.m_harmonic: copying a param with shape torch.Size([29]) from checkpoint, the shape in current model is torch.Size([37]).
...

I suspect this issue might be related to a mismatch in the model's hyperparameters. Could you please confirm if this is the root cause? If so, I would appreciate your guidance on how to adjust the hyperparameters to resolve the issue.
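A mismatch like the one in the traceback can be listed before the model is loaded by comparing parameter shapes by name. The sketch below is only an illustration: `find_shape_mismatches` is a hypothetical helper, and it takes plain name-to-shape mappings. With PyTorch these could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`; how the checkpoint file stores its weights is not shown in this thread. The example shapes are the two reported in the error.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for shared names whose shapes differ."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Names missing from the model are skipped; only true size mismatches are kept.
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


if __name__ == "__main__":
    # Shapes taken from the error message above.
    checkpoint_shapes = {
        "module.mappingReduced.l_harmonic": (29,),
        "module.mappingReduced.m_harmonic": (29,),
    }
    model_shapes = {
        "module.mappingReduced.l_harmonic": (37,),
        "module.mappingReduced.m_harmonic": (37,),
    }
    for name, (ckpt, model) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
        print(f"{name}: checkpoint {ckpt} vs model {model}")
```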

I have attached my scripts, the error log, and my LMDB files for your reference.

Thank you for your help!

Best regards
myfiles.zip

@yilunliao
Member

Hi @hhh846

I think this is caused by a mismatch between the config and the checkpoint.
You can take matching configs and checkpoints from here.
In your case, the checkpoint that corresponds to your config is the one for "OC20 S2EF-All+MD" (third row).
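
The sizes 29 and 37 in the error are consistent with this explanation. As a rough sketch, assume the reduced mapping keeps, for each degree l from 0 to lmax, only the orders m with |m| ≤ min(l, mmax). This counting rule is an assumption and is not stated in the thread, and the function name is hypothetical. Under that rule, the `L@6_M@3` in the config name gives 37 entries, while 29 would correspond to, for example, lmax = 6 with mmax = 2.

```python
def num_reduced_coefficients(lmax, mmax):
    """Count (l, m) pairs with 0 <= l <= lmax and |m| <= min(l, mmax)."""
    # Degree l contributes 2 * min(l, mmax) + 1 orders: m = -k, ..., 0, ..., +k.
    return sum(2 * min(l, mmax) + 1 for l in range(lmax + 1))


if __name__ == "__main__":
    print(num_reduced_coefficients(6, 3))  # 37, the size expected by the current model
    print(num_reduced_coefficients(6, 2))  # 29, the size stored in the checkpoint
```

Because the two counts differ, the checkpoint cannot be loaded into a model built from this config, which is why a checkpoint trained with the same config is needed.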

@hhh846
Author

hhh846 commented Jan 11, 2025

Thank you! I have successfully completed the structure relaxation.
