
Optimize the model_norm method #26

Open

wants to merge 1 commit into main
Conversation

volmodaoist

Tensor operations should avoid explicit loops wherever possible. The original model_norm function (i.e., the Euclidean distance between two models) is implemented with a for loop, which makes it both slow and hard to read. We therefore optimized the original model_norm method: the rewrite is more readable and roughly twice as fast, and the speedup is even more pronounced when the code runs on a GPU.

import math

import torch

# The for loop in the original model_norm (and the repeated state_dict() lookup
# inside it) prevents it from fully exploiting GPU acceleration.
def model_norm(model_1, model_2):
    squared_sum = 0
    for name, layer in model_1.named_parameters():
        squared_sum += torch.sum(torch.pow(layer.data - model_2.state_dict()[name].data, 2))
    return math.sqrt(squared_sum)

# The optimized model_norm: flatten each model's parameters into a single vector
# and take one vectorized norm. This version is both fast and readable.
def model_norm2(model_1, model_2):
    params_1 = torch.cat([param.view(-1) for param in model_1.parameters()])
    params_2 = torch.cat([param.view(-1) for param in model_2.parameters()])

    return torch.norm(params_1 - params_2, p=2)
    
# Sacrificing a little readability buys further speed: computing per-parameter
# differences first avoids materializing both full concatenated vectors. In
# practice, the code below noticeably shortens model training time.
def quick_model_norm(model_1, model_2):
    # Relies on both models yielding their parameters in the same order.
    diffs = [(p1 - p2).view(-1) for p1, p2 in zip(model_1.parameters(), model_2.parameters())]
    return torch.norm(torch.cat(diffs), p=2)
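
The claimed speedup is easy to verify. Below is a minimal timing sketch, not part of the PR itself: the nn.Sequential toy models, the bench helper, and the 100-call loop are arbitrary choices for illustration. On CUDA, a torch.cuda.synchronize() call would be needed before reading the timer, since GPU kernels run asynchronously.

import copy
import time

import torch
import torch.nn as nn

# Two identically shaped toy models (hypothetical stand-ins for the real ones).
model_1 = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 1024))
model_2 = copy.deepcopy(model_1)

# Perturb model_2 so the distance is nonzero.
with torch.no_grad():
    for p in model_2.parameters():
        p.add_(0.01)

# Sanity check: the original returns a Python float, the rewrite a 0-dim tensor.
assert abs(model_norm(model_1, model_2) - model_norm2(model_1, model_2).item()) < 1e-3

def bench(fn, n=100):
    fn(model_1, model_2)  # warm-up call
    start = time.perf_counter()
    for _ in range(n):
        fn(model_1, model_2)
    return (time.perf_counter() - start) / n

for fn in (model_norm, model_norm2, quick_model_norm):
    print(f"{fn.__name__}: {bench(fn) * 1e3:.3f} ms/call")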


Signed-off-by: volmodaoist <[email protected]>