Apple Silicon GPU compatibility for Tensorflow #2184

comane · 2024-10-20T12:16:51Z

This pull request includes updates to the doc/sphinx/source/n3fit/runcard_detailed.rst file to clarify instructions for running parallel models and using GPUs on M1/M2 Macs.

Updates to parallel model instructions:

Added a note that savepseudodata must be set to false in the fitting section of the runcard to run with parallel models. (doc/sphinx/source/n3fit/runcard_detailed.rst)
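As a rough illustration of the constraint described above, here is a minimal sketch of a pre-flight check on a runcard that has already been loaded into a Python dictionary. The helper `check_parallel_runcard` is hypothetical and not part of n3fit, and the top-level key name `parallel_models` is an assumption; only the `fitting` section and the `savepseudodata` key are named in this PR.

```python
def check_parallel_runcard(runcard: dict) -> list:
    """Return a list of problems that would block a parallel-model fit.

    Hypothetical helper for illustration only; it is not part of n3fit.
    """
    problems = []
    # `parallel_models` is an assumed key name for enabling parallel replicas.
    if not runcard.get("parallel_models", False):
        return problems  # sequential fit: the constraint does not apply

    fitting = runcard.get("fitting", {})
    # The PR says savepseudodata must be set to false. The default value is not
    # stated in this thread, so an explicit `False` is required here.
    if fitting.get("savepseudodata") is not False:
        problems.append(
            "fitting.savepseudodata must be set to false to run with parallel models"
        )
    return problems


runcard = {"parallel_models": True, "fitting": {"savepseudodata": True}}
for problem in check_parallel_runcard(runcard):
    print(problem)
```

Requiring an explicit `False` is a deliberately conservative choice: a missing key is reported too, since the default is not given here.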

Updates for GPU usage on M1/M2 Macs:

Added instructions to install specific packages (tensorflow-deps, tensorflow-macos, tensorflow-metal, and wandb) to run replicas in parallel using GPUs on M1/M2 Macs. (doc/sphinx/source/n3fit/runcard_detailed.rst)
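After installing the packages listed above, one quick way to see whether TensorFlow detects the GPU is to list the physical devices. This is a small sketch and not part of the documented instructions; the helper name `visible_gpus` is made up for illustration, and it returns `None` when TensorFlow cannot be imported.

```python
def visible_gpus():
    """Return the GPU devices TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Returns a (possibly empty) list of PhysicalDevice objects.
    return tf.config.list_physical_devices("GPU")


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment")
elif not gpus:
    print("TensorFlow is installed but sees no GPU")
else:
    print(f"TensorFlow sees {len(gpus)} GPU device(s): {[g.name for g in gpus]}")
```

An empty list means the fit would fall back to the CPU, which is worth checking before comparing timings.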

Radonirinaunimi

Thanks @comane for this! Good to see that this is indeed what's needed to make it run.

Out of curiosity, how is the performance (how many replicas could you run, etc.)?

Radonirinaunimi · 2024-10-20T23:19:30Z

doc/sphinx/source/n3fit/runcard_detailed.rst

+.. code-block:: bash
+
+   conda install -c apple tensorflow-deps
+   pip install tensorflow-macos==2.13.0 tensorflow-metal wandb==0.15.9


Why is it necessary to really pin this version?

It is related to this issue here: wandb/wandb#5935

I was not able to make it run on Mac M2 GPUs with other versions.

Could you also add a reference to that issue in the docs?
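Since the pinned versions matter here, a short sketch for checking an environment against them may be handy. It uses only the standard-library `importlib.metadata`; the pins are copied from the install command in the diff (`tensorflow-metal` is unpinned there), and `check_pins` is a hypothetical helper name.

```python
from importlib import metadata

# Pins taken from the install command in the diff; tensorflow-metal is unpinned there.
PINS = {"tensorflow-macos": "2.13.0", "wandb": "0.15.9"}


def check_pins(pins, get_version=metadata.version):
    """Map each package name to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        report[name] = (installed, installed == wanted)
    return report


for name, (installed, ok) in check_pins(PINS).items():
    print(f"{name}: installed={installed} pinned={PINS[name]} ok={ok}")
```

The `get_version` parameter exists only so the lookup can be replaced when trying the function outside a real environment.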

comane · 2024-10-21T08:24:54Z

> Out of curiosity, how is the performance (how many replicas could you run, etc.)?

The performance is not really good, at least on my laptop: it takes longer than using CPUs. But this might be different for someone with a more powerful Mac.

scarlehoff · 2024-10-21T08:24:55Z

Out of curiosity, does it not work on M3 at all, or did you only have M1 and M2 to test?

comane · 2024-10-21T08:35:46Z

> Out of curiosity, does it not work on M3 at all, or did you only have M1 and M2 to test?

I only tested on M2, but the above-mentioned issue is for M1.
Maybe @ecole41, if time allows, can test it with her M3?

But I assume it works for M3 as well.

Radonirinaunimi · 2024-10-21T08:45:47Z

> Out of curiosity, how is the performance (how many replicas could you run, etc.)?

> The performance is not really good, at least on my laptop, it takes longer than using cpus. But this might be different for someone else with a more powerful mac

When you say longer, how much longer is it? With how many replicas? (Maybe you are hitting a memory bottleneck?)

But in any case, if it is not 4-5 times slower I'd say that's still good because you get all the replicas at the same time.

scarlehoff · 2024-10-21T08:51:38Z

If @ecole41 can test it that would be great.

I'd suggest anyway changing from M1/M2 to something along the lines of "Apple Silicon".

> But in any case, if it is not 4/5 times slower I'd say that's still good because you get all the replicas at the same time.

I would say even 4-5 times slower is still good. In my case, I can run entire fits in ~3 hours on my desktop's GPU, while a single replica takes about 40 minutes. It's about 5 times longer, but when the cluster is busy it is the difference between having the fits ready the same morning or one day later.

Radonirinaunimi · 2024-10-21T08:53:48Z

> I would say even 4/5 is still good. In my case, I can run entire fits in ~3 hours in my desktop's GPU, while a single replica takes about 40 minutes. It's about 5 times more but when the cluster is busy is the difference between having the fits ready in the same morning or one day later.

That's absolutely true! My threshold was really pessimistic 😅

comane · 2024-10-21T09:46:07Z

> When you say longer, how much is it? With how many replicas? (Maybe you are hitting memory bottleneck?)

Running with only 10 replicas on GPUs, it takes 15 minutes to get to epoch 4400 / 17000. If I run the same thing on CPU (still on my laptop), it takes 2 min 45 sec.
So, I think that at least on my computer it's more convenient to run things on CPU.
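To put these numbers side by side, here is a small worked calculation. It assumes that the CPU figure refers to the same 4400 epochs and that the time per epoch stays roughly constant up to epoch 17000; neither is confirmed in this thread, so the extrapolated totals are only indicative.

```python
def extrapolate(seconds_so_far, epochs_done, epochs_total):
    """Linearly extrapolate a partial timing to the full number of epochs."""
    return seconds_so_far * epochs_total / epochs_done


EPOCHS_DONE, EPOCHS_TOTAL = 4400, 17000
gpu_partial = 15 * 60        # 15 minutes on GPU, in seconds
cpu_partial = 2 * 60 + 45    # 2 min 45 sec on CPU, in seconds

# How many times slower the GPU run was over the measured epochs.
slowdown = gpu_partial / cpu_partial

# Assumed-linear estimate of the full 17000-epoch run for each device.
gpu_full = extrapolate(gpu_partial, EPOCHS_DONE, EPOCHS_TOTAL)
cpu_full = extrapolate(cpu_partial, EPOCHS_DONE, EPOCHS_TOTAL)

print(f"GPU is {slowdown:.1f}x slower than CPU over {EPOCHS_DONE} epochs")
print(f"Estimated full run: GPU {gpu_full / 60:.0f} min, CPU {cpu_full / 60:.1f} min")
```

Under these assumptions the GPU run is about 5.5 times slower, roughly 58 minutes against 11 minutes for the full fit.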

> I would say even 4/5 is still good. In my case, I can run entire fits in ~3 hours in my desktop's GPU, while a single replica takes about 40 minutes. It's about 5 times more but when the cluster is busy is the difference between having the fits ready in the same morning or one day later.

@scarlehoff when you say on your desktop, do you mean a Mac running macOS?

An interesting warning that I am getting is the following:

scarlehoff · 2024-10-21T10:02:21Z

> @scarlehoff when you say on your desktop do you mean a MaC Os?

Nope, a Linux desktop with an NVIDIA GPU (at some point I tried it as well with an AMD one and it worked, fwiw).

scarlehoff

Thanks for testing this (and adding it to the docs!!!)

doc/sphinx/source/n3fit/runcard_detailed.rst

scarlehoff · 2024-10-21T10:26:48Z

doc/sphinx/source/n3fit/runcard_detailed.rst

+.. code-block:: bash
+
+   conda install -c apple tensorflow-deps
+   pip install tensorflow-macos==2.13.0 tensorflow-metal wandb==0.15.9


Could you also add a reference to that issue in the docs?

ecole41 · 2024-10-22T09:29:23Z

Hello @scarlehoff @comane @Radonirinaunimi, I tested this on M3 GPUs and it worked, but the performance on GPUs was much slower than on CPUs on my Mac.
For 200 epochs and 100 replicas for the nnpdf40-like runcard:

GPUs: 3800s
CPUs: 2000s
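For reference, the same comparison expressed as a ratio and as a cost per replica and epoch. This is plain arithmetic on the two totals above; if the totals include start-up overhead, the per-epoch figures are upper bounds rather than pure training time.

```python
def seconds_per_replica_epoch(total_seconds, epochs, replicas):
    """Average wall-clock cost of one epoch of one replica."""
    return total_seconds / (epochs * replicas)


EPOCHS, REPLICAS = 200, 100
gpu_total, cpu_total = 3800, 2000  # seconds, as reported above

ratio = gpu_total / cpu_total
gpu_cost = seconds_per_replica_epoch(gpu_total, EPOCHS, REPLICAS)
cpu_cost = seconds_per_replica_epoch(cpu_total, EPOCHS, REPLICAS)

print(f"GPU/CPU wall-clock ratio: {ratio:.2f}")
print(f"Per replica-epoch: GPU {gpu_cost:.3f} s, CPU {cpu_cost:.3f} s")
```

On these totals the GPU run takes 1.9 times as long as the CPU run.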

scarlehoff · 2024-10-22T09:34:02Z

On CPU, did you also run 100 replicas, or is this 1 replica on CPU vs 100 on GPU?

ecole41 · 2024-10-22T10:10:07Z

Both GPU and CPU for 100 replicas

RoyStegeman · 2024-10-22T10:13:22Z

Is that the timing only for the 200 epochs or does it include overhead?

ecole41 · 2024-10-22T10:21:02Z

I'm not sure how to check this, let me know if this helps:

This is the time I get for the GPU:

This is for the CPU:

scarlehoff · 2024-10-22T10:25:20Z

I think it includes overhead. But in any case, it seems that running a fit on a Mac is not really going to be doable just yet :(

Maybe there's some low-hanging fruit to improve it, but I'm not sure the effort is worth it.

comane added 2 commits October 20, 2024 13:10

added dependencies for installation of tf for m1-m2 mac

597f490

added missing doc on savepseudodata

ceda0c8

comane added the documentation label Oct 20, 2024

Radonirinaunimi reviewed Oct 20, 2024

comane changed the title ~~M1-2 GPU compatibility for Tensorflow~~ Apple Silicon GPU compatibility for Tensorflow Oct 21, 2024

scarlehoff approved these changes Oct 21, 2024

added link to issue

d745186

comane merged commit 5f06deb into master Oct 21, 2024
6 checks passed

comane deleted the m1_2_gpu_tf_compatibility branch October 21, 2024 14:13
