Update and rename Cluster-Deletion to Cluster-Deletion.md
Finished rough draft
DaxLynch authored Jan 4, 2024
1 parent 0719695 commit 5a524d2
Showing 1 changed file with 12 additions and 12 deletions.
24 changes: 12 additions & 12 deletions docs/GCP/Cluster-Deletion → docs/GCP/Cluster-Deletion.md
@@ -34,26 +34,26 @@ Error: key "e4s2311clu-slurm-compute-script-ghpc_startup_sh" already present in
on .terraform/modules/slurm_controller.slurm_controller_instance/terraform/slurm_cluster/modules/slurm_controller_instance/main.tf line 281, in resource "google_compute_project_metadata_item" "compute_startup_scripts":
281: resource "google_compute_project_metadata_item" "compute_startup_scripts" {
```
You must now go through the process of manually deleting each of the keys and resources that are listed in the error list. As shown https://cloud.google.com/sdk/gcloud/reference/compute/project-info/describe, we will use `gcloud compute project-info describe` to see the cloud metadata, and `gcloud compute project-info remove-metadata --keys="the key" --project=YOUR-PROJECT`. You can either run this command once using a list, such as
You must now manually delete each of the keys listed in the errors. As described at https://cloud.google.com/sdk/gcloud/reference/compute/project-info/describe, use `gcloud compute project-info describe` to view the project metadata, and `gcloud compute project-info remove-metadata --keys="the key" --project=YOUR-PROJECT` to remove a key. You can either run this command once using a list, such as
```
gcloud compute project-info remove-metadata --keys=["CLUSTER-IMAGEclu-slurm-compute-script-ghpc_startup_sh","CLUSTER-IMAGEclu-slurm-controller-script-ghpc_startup_sh", … ]
```
where you put in each relevant key. Be very careful in this process to delete only the relevant keys, as this metadata can affect all of your projects.
Or you can also do it one at a time,
```
gcloud compute project-info remove-metadata --keys="CLUSTER-IMAE-clu-slurm-controller-script-ghpc_startup_sh" for each key listed in the error message.
gcloud compute project-info remove-metadata --keys="CLUSTER-IMAGEclu-slurm-controller-script-ghpc_startup_sh"
```
Repeat this for each key listed in the error message.
In my case the command looked like:
```
gcloud compute project-info remove-metadata --keys=["e4s2311clu-slurm-compute-script-ghpc_startup_sh","e4s2311clu-slurm-controller-script-ghpc_startup_sh","e4s2311clu-slurm-tpl-slurmdbd-conf","e4s2311clu-slurm-tpl-cgroup-conf","e4s2311clu-slurm-tpl-slurm-conf","e4s2311clu-slurm-partition-compute-script-ghpc_startup_sh","e4s2311clu-slurm-compute-script-ghpc_startup_sh","e4s2311clu-slurm-controller-script-ghpc_startup_sh","e4s2311clu-slurm-tpl-slurmdbd-conf","e4s2311clu-slurm-tpl-cgroup-conf"]
```
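Copying keys out of the error output by hand is error-prone, so the extraction can be scripted. The following is a minimal sketch, not part of the original procedure: it assumes each error line has the form `Error: key "KEY" already present in`, and the sample log and the `extract_keys` helper are hypothetical. It prints the command for review rather than running it. The gcloud reference documents `--keys` as a comma-separated list (`KEY,[KEY,…]`), so the sketch emits that form without surrounding brackets.

```shell
#!/bin/sh
# Sketch: collect the metadata keys named in Terraform's "already present" errors.
# Assumes each error line looks like: Error: key "KEY" already present in

extract_keys() {
    # Pull out each quoted key, drop duplicates, and join them with commas.
    sed -n 's/.*Error: key "\([^"]*\)" already present.*/\1/p' "$1" \
        | sort -u \
        | paste -sd, -
}

# Hypothetical sample log (with one duplicated key) standing in for saved output.
log=$(mktemp)
cat > "$log" <<'EOF'
Error: key "e4s2311clu-slurm-compute-script-ghpc_startup_sh" already present in
Error: key "e4s2311clu-slurm-tpl-slurm-conf" already present in
Error: key "e4s2311clu-slurm-compute-script-ghpc_startup_sh" already present in
EOF

keys=$(extract_keys "$log")
rm -f "$log"

# Dry run: print the command so it can be checked before anything is deleted.
echo "gcloud compute project-info remove-metadata --keys=$keys --project=YOUR-PROJECT"
```

Review the printed key list against `gcloud compute project-info describe` before running the command, since removing the wrong key affects other clusters in the project.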
Furthermore, the networking, and filestore resources will still be active, so those must be deleted.
By searching filestore you should the instances page, in my case it looks like this
I know that it is the filestore created by the instance I improperly deleted. In your case you must be 100% sure, asa if you delete thewrong one you will delete data for other clusters. Be sure to check the creation date. Delete this
By searching in your project you should be able to find the network resource
Furthermore, the networking and Filestore resources will still be active, so those must be deleted as well. Searching for filestore should bring up the instances page; in my case it looks like this:
![image](https://github.com/ParaToolsInc/E4S-Pro/assets/81718016/21e434a9-00a6-4018-8cb8-70c0df068e8f)
I know that this is the Filestore instance created by the cluster I improperly deleted. In your case you must be 100% sure, because if you delete the wrong one you will delete data for other clusters. Be sure to check the creation date, then delete the instance.
![image](https://github.com/ParaToolsInc/E4S-Pro/assets/81718016/6305427b-8740-4de6-a8b9-f767f3ad4684)
By searching in your project you should be able to find the network resource page,
![image](https://github.com/ParaToolsInc/E4S-Pro/assets/81718016/c9c3ab74-1052-4767-94c7-3224f862720d)
You must delete all resources that are listed in the `Error 409: The resource 'projects/YOUR-PROJECT/regions/us-central1/addresses/CLUSTER-IMAGE' already exists` errors. Network resources often have to be deleted in a specific order. It is likely that you should delete the NAT gateway, then the subnetwork, then the VPC network peering, the router, and the VPC, and finally release the IP address. If you can't delete a resource, it is probably still in use by another one; find and delete the resources that depend on it first, then delete it.
Now you should run `./ghpc destroy CLUSTER-IMAGE/` and `./ghpc create CLUSTER-IMAGE`. If any stray resources still exist, delete them as shown above and rerun these two commands.
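The deletion order described above can also be followed from the command line. This is a dry-run sketch only: every resource name below is a hypothetical placeholder (take the real names from your own `Error 409` messages), and the script prints the `gcloud` commands instead of executing them.

```shell
#!/bin/sh
# Dry-run sketch: print the deletion commands in the order described above.
# All names are hypothetical placeholders; substitute those from your errors.
REGION="us-central1"
NETWORK="CLUSTER-IMAGE-net"
SUBNET="CLUSTER-IMAGE-subnet"
ROUTER="CLUSTER-IMAGE-router"
NAT="CLUSTER-IMAGE-nat"
PEERING="CLUSTER-IMAGE-peering"
ADDRESS="CLUSTER-IMAGE"

plan_network_teardown() {
    # NAT gateway first, then subnetwork, peering, router, VPC, and IP address.
    echo "gcloud compute routers nats delete $NAT --router=$ROUTER --region=$REGION"
    echo "gcloud compute networks subnets delete $SUBNET --region=$REGION"
    echo "gcloud compute networks peerings delete $PEERING --network=$NETWORK"
    echo "gcloud compute routers delete $ROUTER --region=$REGION"
    echo "gcloud compute networks delete $NETWORK"
    echo "gcloud compute addresses delete $ADDRESS --region=$REGION"
}

plan=$(plan_network_teardown)
echo "$plan"
```

Running the printed commands one at a time makes it easier to see which resource is still blocking a deletion.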


