Set up a basic automated e2e smoke test #102
Update: This issue is fixed later in this PR. It is related to p4lang/tdi#107. I updated the TDI dependency at the time, and couldn't find the
Original comment below:
I'm encountering a new issue that could use some help.
Short description
The
Relevant logs shown in
How to reproduce?
Use the latest commit in this PR. Follow the instructions in
For the convenience of a quick look, below I copied the tdi_python
# Add ports: one TAP port per device port 0..3
for port_id in range(4):
    tdi.port.port.add(
        DEV_PORT=port_id,
        PORT_TYPE="BF_DPDK_TAP",
        PORT_DIR="PM_PORT_DIR_DEFAULT",
        PORT_IN_ID=port_id,
        PORT_OUT_ID=port_id,
        PIPE_IN="pipe",
        PIPE_OUT="pipe",
        MEMPOOL="MEMPOOL0",
        PORT_NAME=f"TAP{port_id}",
        MTU=1500
    )
# Add entries: forward each destination address to its port with a marker
from netaddr import IPAddress
control = tdi.main.pipe.MainControl
table = control.forwarding
table.add_with_mark_and_forward(
    dst_addr=IPAddress('192.168.1.101'),
    marker=0x01,
    port=1
)
table.add_with_mark_and_forward(
    dst_addr=IPAddress('192.168.1.102'),
    marker=0x02,
    port=2
)
The error occurs on the last two |
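As a sketch only, the port and table arguments from the script above can be collected into plain data so they are easy to check outside of tdi_python. The helper names `tap_port_kwargs`, `forwarding_kwargs`, and `FORWARDING_ENTRIES` are hypothetical and not part of this PR; the field names and values are copied from the script.

```python
from ipaddress import IPv4Address


def tap_port_kwargs(port_id, mtu=1500):
    """Return the keyword arguments used for one TAP port (hypothetical helper)."""
    return {
        "DEV_PORT": port_id,
        "PORT_TYPE": "BF_DPDK_TAP",
        "PORT_DIR": "PM_PORT_DIR_DEFAULT",
        "PORT_IN_ID": port_id,
        "PORT_OUT_ID": port_id,
        "PIPE_IN": "pipe",
        "PIPE_OUT": "pipe",
        "MEMPOOL": "MEMPOOL0",
        "PORT_NAME": f"TAP{port_id}",
        "MTU": mtu,
    }


# (destination address, marker, output port), as in the script above.
FORWARDING_ENTRIES = [
    ("192.168.1.101", 0x01, 1),
    ("192.168.1.102", 0x02, 2),
]


def forwarding_kwargs(dst_addr, marker, port):
    """Validate the address and return arguments for one forwarding entry."""
    IPv4Address(dst_addr)  # raises ValueError on a malformed address
    return {"dst_addr": dst_addr, "marker": marker, "port": port}
```

Inside tdi_python these would be passed on with `tdi.port.port.add(**tap_port_kwargs(port_id))`, and with `table.add_with_mark_and_forward(...)` after wrapping `dst_addr` in `IPAddress` as the script does.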
Are you able to create at least one simple test that passes, that is not blocked by the issue you have found and reported in this comment? #102 (comment) If so, it might be nice to focus on creating that simple passing test first, and file a separate issue for the bug you have found. |
Thanks Andy for your suggestion! I'll try to bypass this and come to a working test first. |
@satish153 @swaroopsarma @jamescchoi Can you suggest someone who can (a) review this? and (b) do whatever approval is required for the Github "1 workflow awaiting approval" message I see on this conversation? |
I am interested in running this script as part of CI. How much time is spent on setting up dependencies? Realistically, do we require a docker container to run this as part of CI? |
Thanks @fruffy for offering help! That would be great! I just tried building the Docker image again. It takes 45 min on my computer, and the final image is 4.78 GB. I feel the build time might be too long, and the image too large, to be usable in CI.
Without a Docker container, as long as all the dependencies are installed, the testing scripts should work. Maybe that's an easier solution on the CI side. I still think providing a Dockerfile is useful as a reproducible environment to try things out locally, even if it's not used in the CI. I feel the best middle ground might be what's been done in the p4c repo, where the Dockerfile simply invokes the ci-build.sh script, which is also used in CI to build required dependencies.
One challenge I faced in setting up dependencies is how to best bring in p4c. Building p4-dpdk-target itself doesn't require p4c, but the e2e tests need it. My current solution is to use the public p4lang/p4c image as the base image, so that p4c is automatically available. This is probably not ideal, because the p4c version is not fixed, and sometimes that might matter. If we do want to fix the p4c version in testing, then it seems building p4c from source is the only option?
I'd like to continue with more detailed discussions on this and some other issues, but I feel getting them all resolved in this PR might be too much. So I'm proposing to limit the scope of this PR to a single working test that is automated locally, and to expect more work before it is fully automated in CI. Hopefully this also makes reviewing this PR easier. |
Apart from the removal of the return status in dal_init.c, the rest looks fine.
The final image is 5.3GB. That seems too large. There should be room for optimization.
The testing scripts are not finished yet. But I ran into some issues that might need to be resolved before I can move on.
TDI is updated to the latest commit (on Mar 1, 2023). DPDK is updated to the latest release (v23.03).
There are still lots of issues to solve before this smoke test becomes meaningful enough. But this could be a good starting point.
This brings back the enable() method that seems to be a crucial step in setting up the pipeline. But this also results in some errors. Will try to fix them.
This is a temporary solution though.
It works this time!
Just rebased the pull request |
Sounds great! 45 min is also tolerable, and we can likely accelerate this.
Hmm, if we use ccache, building p4c from scratch can take around 5-8 minutes. We might consider that. We can also disable all back ends except the DPDK back end.
Yes, this was not a request, mostly curiosity. It will take a bit of time until we get to an automated testing run. |
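A rough sketch of the ccache and "DPDK back end only" idea, written as a Python helper that assembles a CMake command line. The function name `p4c_cmake_args` is hypothetical, and the `ENABLE_*` option names are assumptions based on p4c's CMake options; they should be checked against the p4c version actually used. `CMAKE_C_COMPILER_LAUNCHER` and `CMAKE_CXX_COMPILER_LAUNCHER` are standard CMake variables.

```python
def p4c_cmake_args(source_dir="..", use_ccache=True):
    """Build a cmake argument list that keeps only the DPDK back end.

    The ENABLE_* names are assumed, not verified against a specific p4c commit.
    """
    args = ["cmake", source_dir, "-DENABLE_DPDK=ON"]
    # Turn off back ends that the e2e smoke test does not need.
    for backend in ("BMV2", "EBPF", "UBPF", "P4C_GRAPHS"):
        args.append(f"-DENABLE_{backend}=OFF")
    if use_ccache:
        # Route compiler invocations through ccache to reuse earlier results.
        args.append("-DCMAKE_C_COMPILER_LAUNCHER=ccache")
        args.append("-DCMAKE_CXX_COMPILER_LAUNCHER=ccache")
    return args
```

The returned list can be handed to `subprocess.run(args, check=True)` from a build directory; ccache only helps if its cache directory is preserved between CI runs.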
@swaroopsarma Can you take another look at this to see if it looks good to you now? If I understand correctly, with the latest commit (number 11) on this PR, there are only new log messages output in the dal_init.c code, but no other functional changes. |
@jafingerhut Thanks for the ping and sorry about the premature pushes. I added 2 more commits and now they should be ready for another review. Also thank you @swaroopsarma for your previous review. Wish to hear more from you soon. |
Hi @qobilidop, this is easy to solve: you can docker pull a specific SHA of any Docker image instead of using e.g.
This is how you can pin a version. Just document in any scripts what the SHA corresponds to, e.g. BTW you can also use this to build your own Dockerfiles using a specific base image; I do this in the SONiC-DASH project. Here's an example Dockerfile which I base on a specific p4c Docker image. I then make a new Docker image which is smaller by throwing away unneeded backends. Then I publish this smaller custom p4c image to another container registry which a CI pipeline or user dev environment can pull down. I saved a few hundred MB this way. https://github.com/sonic-net/DASH/blob/main/dash-pipeline/dockerfiles/Dockerfile.p4c-bmv2 |
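To illustrate the pinning idea, here is a minimal sketch that builds a digest-pinned image reference in the `repository@sha256:<digest>` form Docker accepts, plus a documented `FROM` line. The helper names are hypothetical and the digest used in the usage note is a placeholder, not a real p4lang/p4c digest.

```python
import re

# A sha256 digest is the literal prefix followed by 64 lowercase hex characters.
_DIGEST_RE = re.compile(r"sha256:[0-9a-f]{64}")


def pinned_image(repository, digest):
    """Return 'repository@sha256:...' after checking the digest format."""
    if not _DIGEST_RE.fullmatch(digest):
        raise ValueError(f"not a sha256 digest: {digest!r}")
    return f"{repository}@{digest}"


def from_line(repository, digest, note):
    """Return a Dockerfile FROM line preceded by a comment explaining the pin."""
    return f"# {note}\nFROM {pinned_image(repository, digest)}"
```

For example, `from_line("p4lang/p4c", "sha256:" + "0" * 64, "placeholder digest, replace with the real one")` yields a comment line followed by `FROM p4lang/p4c@sha256:000...0`. The comment is where the script records what the SHA corresponds to.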
To avoid long docker build times, why don't you automate the docker build and publish workflow similar to many other p4lang repos, e.g. https://github.com/p4lang/p4c/blob/main/.github/workflows/ci-container-image.yml ? Everything gets built and published to Dockerhub daily. This will save hours, and perhaps attract more users because the initial user experience can be a deal-breaker. Strive for 5 Minutes TFD (Time for First Dopamine). |
Thanks @chrispsommers! Those are great suggestions. I didn't realize it could be that easy. The example Dockerfile you linked is especially useful. I see there is also a As to publishing Docker images using GitHub CI automatically, I completely agree with you that this would be a great thing to have. I'm just not sure if I am the right person to set it up. Because it seems this would require more permissions (both on GitHub and DockerHub) than I currently have. I'm willing to do the work, but it might be easier for someone already having those permissions to set this up in a follow-up PR. @swaroopsarma @jafingerhut @fruffy What do you think? |
@qobilidop while developing it in your forked dev branch, you can take a few temporary measures to develop it as far as possible w/o involving others, e.g.:
Actually pulling the trigger to publish to the p4lang/p4-dpdk docker registry would probably require the p4-dpdk account admin to install the dockerhub account credentials in the repo secrets (unless they are global for all of the p4lang GitHub site). |
Got it. I'll try these initial steps then. Thanks again @chrispsommers! |
One more piece of advice - any changes you make to this branch will trigger CI actions in the p4lang/p4-dpdk GitHub pipeline, because the PR is already opened. If you want to try out a new CI script or other potentially messy experiments without sharing with the entire world, you can make a scratch branch off of it in your workspace ( |
IIRC there is no formal process. The docker images are maintained on a voluntary basis. I believe @antoninbas has access to the p4lang Dockerhub. |
Size is reduced from 4.78GB to 1GB! Thanks @chrispsommers for the suggestion.
@chrispsommers I've adopted the multi-stage builds technique from the examples you linked to. It helps reduce the Docker image size from 4.78GB to 1GB, and the build time from 45min to 13min. Very impressive! For the docker image CI setup, I still plan to do the work, but on second thought, I again propose to leave it to a follow-up PR, so we can keep this already-long PR more focused (and potentially easier to review). I can work on this in my fork without waiting for this PR to be merged. So that's not a blocker. @swaroopsarma I've marked this PR as ready for review again. The main change needing your attention (based on your last review comments) is the diff in @jafingerhut @fruffy The bmv2 ptf testing work you mentioned in another meeting sounds pretty relevant here. I'm interested to learn how it works if you can share a link, and see if I can help with integrating with it in a follow-up PR. Thanks everyone! |
I believe Andy did a writeup here. The main advantage is the direct use of Protobuf messages. I do not think this setup requires the shell. Correct me if I am wrong, Andy. I am also working with a student at NYU (@Hoooao) on an automated setup for p4c, which uses P4Testgen to generate PTF tests for the SoftNIC and runs them. We were thinking of using your automated workflow and then combining it with the PTF tests setup. But this is still work in progress. |
@fruffy Interesting coincidence, I'd just messaged @jafingerhut about a proposal to include p4tools in all nightly p4c docker build/publish CI runs. It would make using p4testgen more convenient, just pull a pre-built docker. I'll file a PR shortly. |
@fruffy I've submitted PR p4lang/p4c#4049 to incorporate p4tools (including p4testgen) into Dockerfile builds which will eliminate the need to build from scratch unless you're developing the backend itself. Consumers can just |
My PTF tests use a Python library called p4runtime-shell just out of personal choice: https://github.com/p4lang/p4runtime-shell Any method of creating+sending, or receiving+parsing, the required P4Runtime Protobuf messages in Python will work. |
looks good
@qobilidop and @jafingerhut can we merge this PR ? |
I'm okay with merging it. And thanks for reviewing! |
@swaroopsarma I have no objections to merging it. Please do not count on me for a technical review of these changes -- that would be a job best requested of others who know this repo's contents better than I do. I am not sure if you were asking me to perform the merge, but if you were, I do not have permissions on this repository to do that. |
Summary
This PR sets up a basic automated e2e smoke test:
Background
In our current experimental usage of the P4 DPDK stack, we sometimes identify issues that could be caught early by a simple e2e smoke test. We think adding these tests to this public repo, and setting up GitHub CI to run them automatically, could be a useful addition to the existing unit tests.
Demo
Build a Docker image with the P4 DPDK stack (can take up to hours):
$ ./tools/docker_build.sh
Run the smoke test in a container:
For more details, see e2e-test/README.md.
Current status and further plans
The test added in this PR is very basic, but should serve as a good starting point for test organization.
@lzhzero and I will keep working on adding more tests in follow-up PRs, and reporting any issues we encounter, until we have a representative set of e2e tests that is fully automated in GitHub CI.