-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revert "remove incorrect end-to-end implementation" #77
Revert "remove incorrect end-to-end implementation" #77
Conversation
You don't have to merge this right away if it was causing a short-term problem, but I'm staging it here for your convenience. An alternative: I can adjust this so it doesn't run automatically but instead runs on-demand when someone wants to run it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some thoughts:
- We want the end-to-end tests to be structured in a way where we have a clean file consuming the latest version of our training library. The tests should live in this repo so that they can be adjusted, and the proper advisories can be made WRT updating the upstream CLI repo.
- We are not planning on using the spell-checker going forward.
So while we might have chosen to break things into separate repos, I don't think we agreed to run them as completely independent projects with their own processes and governance. We still need to share and build off of each other IMO. |
@n1hility I don't disagree, and would like to introduce e2e tests that don't break consuming libraries. The issue with this implementation is that these tests live beyond our repo, and therefore are very fragile. We would like to keep everything within the same repo. Happy to implement this myself after the 15th. |
Relevant note: this test script has not been updated to use the new training code. It was changed to use legacy mode when new training was merged into the CLI. At the moment, the test coverage would be ensuring that everything can still be installed together, including whatever changes are present in a training PR. Once this issue is resolved, it would also ensure that |
Can you elaborate how it's fragile? This is a pretty common practice for e2e testing the combination of a library into a primary consumer, you run a smoke test of the consumer in your project so that you catch compatibility problems. |
Having end-to-end tests in a separate repo means that if we make a breaking change, we would need to fix the tests in the upstream to get them working on our end. It makes it so we are coupled to the upstream repo, and if these tests are to break for any reason, this would impact our PR status downstream. On the other hand, defining an end-to-end test suite within this repo allows us to ensure that it's reflective of how we intend this library to be used. This also makes it much more straightforward to develop the tests as we build out the library. In either case we'd be able to detect breakages, but one allows us much more freedom with how we go about testing the library. |
I'll just add to the debate of the value of the E2E that we've added it to the Eval repo and it has already helped identify the need for a dependency bump that @danmcp implemented: instructlab/eval#14 (comment) |
Thanks for sharing your thinking. What you describe is really mock integration testing than a form e2e testing. Both have their uses and are complementary, but mock testing is trading real world accuracy for speed, since you are only testing what you anticipate the interaction to be vs what it actually is. This makes a lot of sense when e2e is too slow for PR testing, since better to run it nightly and unit + mock tests in PR. However in this case, it's a fast test suite (only 30 mins mostly from build), and the alternative we have right now is no testing. Granted this is only testing the legacy training path, so its value is more limited than what it could be. However it will catch API breakage and dependency issues and some interaction. While a planned API change will cause these to fail, it's pretty easy to suspend them, or alternatively sustain compatibility until you update the dependent ilab project. |
A few things:
|
This is a pretty standard (if not fast) time for an e2e, I have seen ones in OCP that take hours. Sorry to jump into the convo super late here, but in all honesty I think what this entire conversations boils down to is: speed of development vs quality of code We have been struggling to find a balance here. So this is what I would suggest. I think we should add back the e2e, but with the full understanding that it might break sometimes with immediate or pretty quick fixes. As long as we can use the test to pinpoint which changes introduced an issue, that is better than no e2e AND it is better than stalling necessary development. We need to find a middle ground here, adding the test but understanding the circumstances seems pretty measured to me. I understand both perspectives here and I am open to any feedback on this stance! |
f9bfcee
to
5f4d371
Compare
Updated the PR to drop re-adding spellcheck. It was included in the commit I was reverting, but is not in scope for what I really care about here. |
So after a thorough discussion among the training team, our decision is as follows:
We feel that this is only fair considering these tests live beyond our repository. |
This reverts commit ebc5e31 from instructlab#76. Here is my comment on instructlab#76 that better explains how this workflow fits in to the bigger picture: > Testing the user end-to-end instructlab workflow is the point of > this workflow. It will give you an early indication if your changes to > the library break the CLI. Sometimes that might be on purpose, but in > case it's not, it will let you know before you publish a release. > > This is not intended to replace other functional testing aimed at the > library more directly, but the full end-to-end workflow is important. > > I'm OK if you want to disable it for the moment, but it needs to go > back on at some point. A better way to do that is to just edit the > workflow to turn it off instead of removing it from the repo. The previous commit also removed spellcheck (though it wasn't mentioned). I have left it out in this PR. Issue instructlab#63 Signed-off-by: Russell Bryant <[email protected]>
5f4d371
to
eb71ef7
Compare
After further discussion with related parties, we came to the conclusion that these e2e tests will be going in. |
This PR can be merged after instructlab/instructlab#1494 |
@RobotSail can this merge now? |
@russellb Can could you please update this test so that it's running with the full training flag, e.g. |
- name: Run e2e test | ||
run: | | ||
. venv/bin/activate | ||
./instructlab/scripts/basic-workflow-tests.sh -cm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please run this with -fT
for -f
fulltrain and -T
training library.
This has gone stale at this point. I would suggest considering an EC2-based workflow instead of this that ran on a github runner. That might as well be a new PR. |
This reverts commit ebc5e31 from #76.
Here is my comment on #76 that better explains how this workflow fits
in to the bigger picture:
Issue #63
Signed-off-by: Russell Bryant [email protected]