Export training model to StableHlo #8366

Zantares · 2024-11-08T08:02:01Z

❓ Questions and Help

The export API only supports torch.nn.Module as input. Is there any way to export a training model with step_fn to StableHLO?

Here is a simple training case from the examples:

  def __init__(self):
    ...
    self.device = torch_xla.device()
    self.model = torchvision.models.resnet50().to(self.device)
    self.optimizer = optim.SGD(self.model.parameters(), weight_decay=1e-4)
    self.loss_fn = nn.CrossEntropyLoss()
    ...

  def run_optimizer(self):
    self.optimizer.step()

  def step_fn(self, data, target):
    self.optimizer.zero_grad()
    output = self.model(data)
    loss = self.loss_fn(output, target)
    loss.backward()
    self.run_optimizer()
    return loss

The guide at https://pytorch.org/xla/master/features/stablehlo.html#torch-export-to-stablehlo only explains how to export the original self.model. It does not explain how to export the model together with the optimizer and loss function.
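To make the question concrete, step_fn can also be written as a pure function of the parameters, which is the kind of function one would want to hand to an exporter. The sketch below is only an illustration under assumptions: it swaps resnet50 for a small nn.Linear, replaces optim.SGD with a hand-written update without weight decay, runs on CPU, and the names compute_loss, update and step_size are hypothetical. It uses torch.func.functional_call and torch.func.grad, and it does not show or claim that torch_xla can export such a function to StableHLO.

```python
import torch
from torch import nn
from torch.func import functional_call, grad

torch.manual_seed(0)

# Assumption: a tiny stand-in for torchvision.models.resnet50().
model = nn.Linear(4, 3)
loss_fn = nn.CrossEntropyLoss()
step_size = 0.1  # hypothetical learning rate


def compute_loss(params, data, target):
    # Run the module with explicitly supplied parameters instead of its own.
    output = functional_call(model, params, (data,))
    return loss_fn(output, target)


def update(params, data, target):
    # Gradient with respect to the first argument (the parameter dict).
    grads = grad(compute_loss)(params, data, target)
    # Plain SGD written by hand; no optimizer object and no in-place mutation.
    return {name: p - step_size * grads[name] for name, p in params.items()}


params = {name: p.detach() for name, p in model.named_parameters()}
data = torch.randn(8, 4)
target = torch.randint(0, 3, (8,))

loss_before = compute_loss(params, data, target)
new_params = update(params, data, target)
loss_after = compute_loss(new_params, data, target)
print(float(loss_before), float(loss_after))
```

Here the whole training step is a single function from (params, data, target) to new parameters, with no hidden optimizer state.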


JackCaoG · 2024-11-08T18:51:13Z

@qihqi not sure if exporting for training is something we support today.

Zantares · 2024-11-11T03:35:49Z

Adding more background:

Compared with Torch-XLA, I found that JAX has a convenient API that takes a jitted function as input. Here is an example from the JAX repo:

...
def loss(params, batch):
  inputs, targets = batch
  preds = predict(params, inputs)
  return -jnp.mean(jnp.sum(preds * targets, axis=1))
...


if __name__ == "__main__":
  @jit
  def update(params, batch):
    grads = grad(loss)(params, batch)
    return [(w - step_size * dw, b - step_size * db)
            for (w, b), (dw, db) in zip(params, grads)]

  ...
  params = update(params, next(batches))
  ...

Then it can be easily exported as below:

  # Export the function to StableHLO
  sh_exported = export.export(update)(params, batch)
  sh_text = get_stablehlo_asm(sh_exported.mlir_module())
  print(sh_text)

I can execute the generated StableHLO and get the expected results, so I'm wondering whether Torch-XLA can export a training model in the same way.
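For reference, the JAX flow described above can be reproduced end to end with a sketch like the following. It is a minimal version under assumptions: predict is a hypothetical single linear layer with log-softmax, the data is random, and step_size is arbitrary. It uses jax.export (available in recent JAX releases), where Exported.mlir_module() already returns the module as text, so the get_stablehlo_asm helper from the snippet above is not needed here.

```python
import jax
import jax.numpy as jnp
from jax import export, grad, jit

step_size = 0.1  # hypothetical learning rate


def predict(params, inputs):
    # Assumption: one linear layer followed by log-softmax.
    (w, b), = params
    return jax.nn.log_softmax(inputs @ w + b, axis=1)


def loss(params, batch):
    inputs, targets = batch
    preds = predict(params, inputs)
    return -jnp.mean(jnp.sum(preds * targets, axis=1))


@jit
def update(params, batch):
    grads = grad(loss)(params, batch)
    return [(w - step_size * dw, b - step_size * db)
            for (w, b), (dw, db) in zip(params, grads)]


# Random stand-in data: 8 samples, 4 features, 3 classes.
k_w, k_x = jax.random.split(jax.random.PRNGKey(0))
params = [(0.1 * jax.random.normal(k_w, (4, 3)), jnp.zeros((3,)))]
inputs = jax.random.normal(k_x, (8, 4))
targets = jax.nn.one_hot(jnp.arange(8) % 3, 3)
batch = (inputs, targets)

# Export the jitted training step and get its StableHLO text.
exported = export.export(update)(params, batch)
sh_text = exported.mlir_module()
print(sh_text[:400])

# Run the exported computation and compare with the original function.
exported_out = exported.call(params, batch)
direct_out = update(params, batch)
```

The exported module contains both the forward pass and the gradient computation, because grad is traced inside update before export.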

Zantares · 2025-01-09T02:00:37Z

Answered in #8486 (comment); closing this issue.

Zantares closed this as completed Jan 9, 2025

Zantares mentioned this issue Jan 9, 2025

2 questions for the composite op feature #8486

Open
