
CI: Use Cache@2 task in place of Lighthouse #30

Merged · 1 commit · Jul 9, 2020

Conversation

@DeeDeeG (Member) commented Jul 9, 2020

Requirements for Adding, Changing, or Removing a Feature (from template)


Issue or RFC Endorsed by Atom's Maintainers

As discussed here: #10 (comment)

Alternative implementation of: #11

Description of the Change

Cache the [repo_root]/node_modules, script/node_modules, and apm/node_modules folders with the official Cache@v2 task, rather than the Microsoft DevLabs Lighthouse-branded caching task.

This has several benefits:

- It's a newer, official project with a dedicated and active support team behind it. (The Lighthouse caching task is a DevLabs project and is "not officially supported".)
- The Lighthouse caching task must be installed from the marketplace; Cache@v2 "just works", with no install needed.
- The Lighthouse caching task needs a vstsFeed (Universal Artifact Feed) ID, which means manual reconfiguration for every fork. The new Cache@v2 task doesn't ask for a feed ID, so it can work across any and all forks of Atom, no reconfiguration needed!
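For reference, the general shape of a Cache@2 step looks roughly like this (a hedged sketch based on the Azure Pipelines docs; the paths and key parts are illustrative, not the exact config from this PR):

```yaml
steps:
  # Illustrative sketch only, not the exact config from this PR.
  # One Cache@2 step per cached folder; no marketplace install and
  # no vstsFeed ID needed, unlike the Lighthouse caching task.
  - task: Cache@2
    inputs:
      # Cache key: invalidated when the lockfile (or OS) changes.
      key: 'npm | "$(Agent.OS)" | package-lock.json'
      path: node_modules
    displayName: Cache node_modules
```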

Alternate Designs

Alternative implementation of: #11

Alternate solution versus [#13 --> #27]

Possible Drawbacks

The Cache@v2 task only saves its cache if the entire job (bootstrap, build, test) completes.

There is no explicit "save" action we can do like there was in the Lighthouse caching task; Cache@v2 only implicitly saves the cache after the job completes. (We could split bootstrapping out into a separate job, but that's more refactoring than I want to do in one PR.)

If there are issues with building a given branch, or the branch doesn't pass tests (including cases of flaky tests), the bootstrap cache won't be saved from that run.

That said, this effect is mitigated by the fact that caches from recent runs of the master branch can be used. Furthermore, previous runs of the same branch can be used. For pull requests, previous runs of the target branch (to be merged into) can also be used. See the docs for opportunistic cache hits across different branches/refs.

Once this PR has been merged into the repository (especially into master) for a bit, the chance of cache misses should go down a fair amount, so we should see less bootstrapping.

Some cache misses will still occur if we/upstream modify one or more of the script/vsts/platforms/[linux.yml, macos.yml, windows.yml] files, or any of the [[repo_root], script/, apm/][package.json, package-lock.json] files. But this is necessary, because in those cases it is fairly likely we do need to bootstrap again.

Not a regression in this PR, but a chance to maybe do better which I didn't try yet: perhaps the script/vsts/platforms/*.yml files can be taken out of the cache identifier. I think this might just have been a crude way of splitting out the caches by OS, and I copied it in case there was some other reason I'm unaware of. But bootstrapping "in the real world" (outside of CI) doesn't need these vsts files, so maybe we should drop them from the cache identifier for fewer cache misses. Hopefully we/upstream don't modify the vsts files much, which would make this moot, but it's worth thinking about. The question is: do changes to the script/vsts/platforms/*.yml files really indicate a need to do a fresh bootstrap? If yes, keep them in the cache identifier; if no, remove them. I suspect the answer is "no", and we can remove them.
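To make that trade-off concrete, here is a hedged sketch (key parts and file names illustrative, not the exact config from this PR) of the cache key with and without the platform .yml files; dropping them would mean CI-config-only edits no longer invalidate the bootstrap cache:

```yaml
# Illustrative sketch only.
# Current-style key (a CI config edit busts the cache):
#   key: 'bootstrap | "$(Agent.OS)" | script/vsts/platforms/linux.yml | package-lock.json'
# Proposed narrower key (only dependency changes bust the cache):
- task: Cache@2
  inputs:
    key: 'bootstrap | "$(Agent.OS)" | package-lock.json'
    path: node_modules
```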

Verification Process

Ran CI with these changes.

As expected: if the build/testing succeeds, cache saving also succeeds. Then, on any subsequent run with a similar enough state of the repository (the package[-lock].json files and the script/vsts/platforms/*.yml files are the same), the cache is restored and bootstrapping is skipped.
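The restore-then-skip flow described above can be sketched like this (assuming Cache@2's documented `cacheHitVar` input; the key and step names are hypothetical, not taken from this PR's actual config):

```yaml
# Illustrative sketch only.
- task: Cache@2
  inputs:
    key: 'bootstrap | "$(Agent.OS)" | package-lock.json'
    path: node_modules
    cacheHitVar: CACHE_RESTORED  # set to 'true' on an exact key hit
- script: script/bootstrap
  displayName: Bootstrap
  # Skip bootstrapping entirely when the cache was restored.
  condition: ne(variables.CACHE_RESTORED, 'true')
```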

I anticipate fewer cache misses for a given platform after this hits master and there is at least one successful CI run on master for that platform.

Also, I read the docs fairly closely.

Release Notes

N/A

@aminya (Member) commented Jul 9, 2020

Thanks for the PR! Not a biggie, but can we maybe combine the steps, like in #11?

@aminya (Member) commented Jul 9, 2020

I just saw your comment here: #11 (comment)

If it is not possible, we can keep this as it is.

@aminya (Member) commented Jul 9, 2020

As of now, the cache is only used for macOS builds, since they are in separate jobs. But since we have the cache for Windows too, we might separate its tests like macOS's.

Cache on Linux might be useless at this point(?)

@DeeDeeG (Member, Author) commented Jul 9, 2020

I believe the cache persists across CI runs.

It's saving it somewhere in Azure's servers at the end of the run. And then restoring it toward the beginning of the next run from that.

The Linux one might be a bit slow with electron-mksnapshot being so large... But it would otherwise have to download that huge file from GitHub's servers. Better to download it from one Microsoft server to another. Might even be in the same data center, hopefully. I think it's still faster to restore cache and skip bootstrapping, compared to doing a full bootstrap again.

Example: we change some small JavaScript file somewhere in Atom's own code. Atom's dependencies (its node_modules folders) don't need to be rebuilt; only Atom itself needs to be rebuilt. So we restore the dependencies from a previous run and continue to the next part, building Atom fresh with the updated JS code or whatever small change we made.

@aminya (Member) commented Jul 9, 2020

> I believe the cache persists across CI runs.

If this is the case, it might be useful to cache on Windows and Linux too.

@aminya (Member) left a comment

This looks good to me. Should I merge it? Make sure to make a PR to the upstream repo too.

We can divide Windows tests in a separate PR.

@DeeDeeG (Member, Author) commented Jul 9, 2020

I notice the mac config is split up: "build" as one job, "test" as another job that spawns three runners (core test, test 1, test 2).

I'm not sure why it does that, and it does use a lot of cache restoring. I'm also not sure why it needs to split build and testing into two jobs.

All three platforms can benefit from caching in this PR. And at upstream they all benefit from caching. At least the bootstrap step can be skipped if a usable cache is found and restored.

@aminya (Member) commented Jul 9, 2020

> I'm not sure why it does that. And it does use a lot of cache restoring. I'm not sure why it needs to split build and testing into two jobs, though.

I think this is a workaround for the agents' unresponsiveness or timeouts. For example, the Windows tests sometimes stop responding because they are such a long task. This is only my guess, though.

@DeeDeeG (Member, Author) commented Jul 9, 2020

> This looks good to me. Should I merge it? Make sure to make a PR to the upstream repo too.

Yeah, I think it's ready to merge. I'm not optimistic that upstream will want it ("why fix what isn't broken?", from their perspective), but it can't hurt to try.

> We can divide Windows tests in a separate PR.

I'll look at it, or discuss any proposal, and see if it makes sense. FWIW, I think the macOS config is doing some unique tests that aren't done on the other platforms. And I still can't understand why it's split into "build" then "test", so I hope we have a good reason (maybe quicker builds, or a higher chance of saving a good cache) to show for it if we split the other two platforms.

> I think this is a workaround for the agents' unresponsiveness or timeouts. For example, the Windows tests sometimes stop responding because they are such a long task. This is only my guess, though.

That makes sense. It does make me nervous how long these CI runs are, and that any little error causes them to halt and need to be restarted.

@DeeDeeG (Member, Author) commented Jul 9, 2020

Do you mind if I post PRs to upstream from my personal fork DeeDeeG/atom? I technically can't make branches here. I suppose it would put forward a more professional/organized appearance to send PRs from this fork/organization.

But on the other hand, I don't care so much about optics as I care about doing good work and being relaxed, so it suits me ok to post PRs from my personal fork. 😆 anyhow... posting the PR there soon.

@aminya (Member) commented Jul 9, 2020

Not at all. That's OK.

@aminya aminya added the CI label Jul 9, 2020
@aminya aminya merged commit 4bf419d into atom-community:master Jul 9, 2020
@aminya aminya mentioned this pull request Jul 9, 2020
@aminya aminya linked an issue Jul 9, 2020 that may be closed by this pull request
Successfully merging this pull request may close these issues: Setting up Azure pipelines

2 participants