Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Grid] Support auto downloads in Grid #11702

Merged
merged 7 commits into from
Mar 15, 2023

Conversation

krmahadevan
Copy link
Contributor

Fixes #11656 #11658

Following has been done:

  • Specify the default base directory into which all downloads at a node will go into via the flag “—-base-dir-downloads”. If this flag does not have a value then we default to user’s home.
  • Turn ON managing download folders via the flag “-—enable-manage-downloads”
  • Enabled support for Chrome|Edge|Firefox browsers.
  • File downloads will be done only in a session aware directory for a given web driver session. After session is killed, the directory gets cleaned up as well.

Thanks for contributing to Selenium!
A PR well described will help maintainers to quickly review and merge it

Before submitting your PR, please check our contributing guidelines.
Avoid large PRs, help reviewers by making them as simple and short as possible.

Description

Motivation and Context

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • I have read the contributing document.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@codecov-commenter
Copy link

codecov-commenter commented Feb 24, 2023

Codecov Report

Patch and project coverage have no change.

Comparison is base (e6a7987) 54.89% compared to head (4cc48e7) 54.89%.

❗ Current head 4cc48e7 differs from pull request most recent head 5b2503d. Consider uploading reports for the commit 5b2503d to get more accurate results

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files
@@           Coverage Diff           @@
##            trunk   #11702   +/-   ##
=======================================
  Coverage   54.89%   54.89%           
=======================================
  Files          85       85           
  Lines        5684     5684           
  Branches      231      231           
=======================================
  Hits         3120     3120           
  Misses       2333     2333           
  Partials      231      231           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@krmahadevan krmahadevan force-pushed the revamp_downloads_folder branch 2 times, most recently from 8264ba5 to b646a06 Compare March 7, 2023 02:16
@titusfortner
Copy link
Member

I really think Selenium should manage the directory (in $HOME/.cache directory) and not use —-base-dir-downloads

@krmahadevan
Copy link
Contributor Author

I really think Selenium should manage the directory (in $HOME/.cache directory) and not use —-base-dir-downloads

@titusfortner - I had to define the ability to specify the home directory because when creating tests, bazel would not allow me to write into the home directory and as such the tests were failing. The only option I had was to alter the spawn-strategy to standalone for bazel, but @shs96c suggested that it wasn't a good idea to do that and tests should not be run in standalone mode.

The primary intent of allowing a user to specify this flag was to facilitate the tests. This parameter is not a must. If its not specified it defaults to the current user's home directory ($HOME/.cache). I use this flag to specify a temp directory for the tests to run.

Here's the slack message link which has more details https://seleniumhq.slack.com/archives/CBH302726/p1676899777428079

@titusfortner
Copy link
Member

Yeah, we really shouldn't need to implement a feature that only serves to test another feature, though. 😕

@diemol do you think it would make sense for the grid to return a se:downloadDirectory capability with the path on the remote machine? If we had that, tests would know where to look/update things to make sure features were working. Or does that not work either? Or is that overkill?

@krmahadevan
Copy link
Contributor Author

@titusfortner the current implementation doesnt need the client to know where the download directory is. Its all managed internally. I have just given the user a flexibility to specify a different base directory thats all. Everything else remains the same. Users would not need to know this directory for them to be able to download a file.

@krmahadevan
Copy link
Contributor Author

@titusfortner

I forgot to respond to this

do you think it would make sense for the grid to return a se:downloadDirectory capability with the path on the remote machine? If we had that, tests would know where to look/update things to make sure features were working. Or does that not work either? Or is that overkill?

The limitation with bazel is not with tests not knowing where the downloads directory is, its with tests being able to write anywhere outside of the temp directory when driven by bazel in a non standalone mode. What this means is that a node that was spun off by a bazel driven test, will NOT be able to write into the home directory since it's outside the temp folder. The flag helps get past this restriction by allowing the bazel driven test to be spin off a node whose downloads directory points to the temp directory.

Fixes SeleniumHQ#11656 SeleniumHQ#11658

Following has been done:

* Specify the default base directory into which 
all downloads at a node will go into via the flag 
“—-base-dir-downloads”. If this flag does not have 
a value then we default to user’s home.
* Turn ON managing download folders via the flag 
“-—enable-manage-downloads”
* Enabled support for Chrome|Edge|Firefox browsers.
* File downloads will be done only in a session aware 
directory for a given web driver session. After session
is killed, the directory gets cleaned up as well.
@krmahadevan krmahadevan force-pushed the revamp_downloads_folder branch from b646a06 to 406dea6 Compare March 10, 2023 17:01
@titusfortner titusfortner removed their request for review March 10, 2023 17:47
@titusfortner
Copy link
Member

Silly bazel.

Reviewing this is on Diego's plate along with everything else. :-/

@diemol
Copy link
Member

diemol commented Mar 13, 2023

Apologies for the delay on reviewing this. The PR is really large and I will need quite some time to go through it. In the future, please let's have smaller PRs whenever possible.

@diemol
Copy link
Member

diemol commented Mar 13, 2023

Yeah, we really shouldn't need to implement a feature that only serves to test another feature, though. 😕

@diemol do you think it would make sense for the grid to return a se:downloadDirectory capability with the path on the remote machine? If we had that, tests would know where to look/update things to make sure features were working. Or does that not work either? Or is that overkill?

I find —-base-dir-downloads useful, @titusfortner, because I can imagine sysadmins who want to control where the Grid can write and read files. These type of things should be configurable. Having this flag being useful for testing is a happy side effect from my point of view.

EDIT: I do not think we need a se:downloadDirectory capability.

@diemol
Copy link
Member

diemol commented Mar 13, 2023

I have a few comments but I due to the delays in the review I will add the changes to the PR. In general, I will change the "auto downloads" phrase to "downloads enabled", which I believe reflects better what this feature is.

and removing out of scope logic to determine
Node match and client side validations
@krmahadevan
Copy link
Contributor Author

I have a few comments but I due to the delays in the review I will add the changes to the PR. In general, I will change the "auto downloads" phrase to "downloads enabled", which I believe reflects better what this feature is.

@diemol - Thanks for taking the time to make the changes. I noticed that you have:

  • removed the edit check at the RemoteWebDriver level which ensures that this capability can be requested only for supported browsers
  • removed the support method in abstract options that helps to turn on this capability without the user having to remember the actual capability name.

So we don't need this is it ?

On a side note, I also noticed formatting changes. Is there a formatter that is being used by us to ensure that all code submissions adhere to the same formatting (or) is there any bazel target that I can run to ensure that the code gets auto formatted (I know that maven has such a plugin). That will ensure that my PRs adhere to the formatting asks as well.

@diemol
Copy link
Member

diemol commented Mar 14, 2023

I have a few comments but I due to the delays in the review I will add the changes to the PR. In general, I will change the "auto downloads" phrase to "downloads enabled", which I believe reflects better what this feature is.

@diemol - Thanks for taking the time to make the changes. I noticed that you have:

* removed the edit check at the `RemoteWebDriver` level which ensures that this capability can be requested only for supported browsers

* removed the support method in abstract options that helps to turn on this capability without the user having to remember the actual capability name.

So we don't need this is it ?

On a side note, I also noticed formatting changes. Is there a formatter that is being used by us to ensure that all code submissions adhere to the same formatting (or) is there any bazel target that I can run to ensure that the code gets auto formatted (I know that maven has such a plugin). That will ensure that my PRs adhere to the formatting asks as well.

The formatting changes were a mistake, I only noticed after the commit. I will try to revert them. We had(have?) some IntelliJ configuration checked in the tree but it seems to not work in all cases. So I guess we will deal with it when we have loads of developers sending pull requests.

I removed the check and support because this is still in beta, and this might create issues in different scenarios:

  • Users with this feature and a lower Grid version, they will report issues.
  • Users with this feature and a different Grid implementation (e.g. Selenoid)
  • Users with this feature and a cloud vendor that has not implemented this feature, they will report issues to their customer support.

So I think adding this to the bindings it is a good idea, but we should do it later, and for all bindings. This PR should only take care about Grid changes. I hope that makes sense.

@krmahadevan
Copy link
Contributor Author

@diemol

So I think adding this to the bindings it is a good idea, but we should do it later, and for all bindings. This PR should only take care about Grid changes. I hope that makes sense.

Yep. That makes sense. I forgot about having to be backward compatible :)

diemol added 4 commits March 14, 2023 17:57
With this, it will be transparent for the user
where files are written, and since we use the
caches then the deletion happens when the session
is closed.

Also, we do not need the `--base-dir-downloads`
parameter.
@krmahadevan
Copy link
Contributor Author

@diemol - I noticed that as part of this commit we got rid of the need to house the files downloaded for a given session into the user directory and moved it into temp folder.

I personally dont have any opinion on the location of the downloads directory and am fine with it. Just wanted to mention that I had to work with the home directory because that was one of the asks :)

I also noticed that you are having to spend more time on re-working on the PR and add additional commits. I am getting to learn more nuances of the codebase from your commits so thank u so much for that. But just that I also feel bad that I am perhaps taking up additional time from you. My apologies if it's not matching the expectations.

@diemol
Copy link
Member

diemol commented Mar 15, 2023

@diemol - I noticed that as part of this commit we got rid of the need to house the files downloaded for a given session into the user directory and moved it into temp folder.

I personally dont have any opinion on the location of the downloads directory and am fine with it. Just wanted to mention that I had to work with the home directory because that was one of the asks :)

I also noticed that you are having to spend more time on re-working on the PR and add additional commits. I am getting to learn more nuances of the codebase from your commits so thank u so much for that. But just that I also feel bad that I am perhaps taking up additional time from you. My apologies if it's not matching the expectations.

@krmahadevan that was a "last minute" change. I was going through the code and I noticed we use the temp file system to store the uploads already, and most of the code was already there, so I decided to reuse it to reduce code maintenance on the Node. I understand this was one of the "requirements", but things are not written on stone here. If we find a better way to implement something that leads to being less intrusive and reduce maintenance, it should be weighted in.

No need to apologize, your code was the initial step to get this done, without your contributions we would not be able to build this. Thank you.

Copy link
Member

@diemol diemol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @krmahadevan!

@diemol diemol merged commit 2de5561 into SeleniumHQ:trunk Mar 15, 2023
@krmahadevan krmahadevan deleted the revamp_downloads_folder branch March 15, 2023 08:39
alpatron pushed a commit to alpatron/selenium that referenced this pull request Mar 15, 2023
* [Grid] Support auto downloads in Grid

Fixes SeleniumHQ#11656 SeleniumHQ#11658

Following has been done:

* Turn ON managing download folders via the flag 
“-—enable-manage-downloads”
* Enabled support for Chrome|Edge|Firefox browsers.
* File downloads will be done only in a session aware 
directory for a given web driver session. After session
is killed, the directory gets cleaned up as well.

* [grid] Renaming to manage downloads enabled

and removing out of scope logic to determine
Node match and client side validations

* [grid] Using the temp file system utility

With this, it will be transparent for the user
where files are written, and since we use the
caches then the deletion happens when the session
is closed.

Also, we do not need the `--base-dir-downloads`
parameter.

* [grid] Adding a cleanup executor for downloaded files

* [grid] Adding e2e test and fixing bug found while adding test

* [grid] Removing test

---------

Co-authored-by: Diego Molina <[email protected]>
Co-authored-by: Diego Molina <[email protected]>
@karanjeetbirdeye
Copy link

does this support deleting files present in /session/{sessionId}/se/files ? I'm trying to make a DELETE request but it gives me 404
` try (HttpClient client = HttpClient.Factory.createDefault().createClient(new URL( "http://localhost:4444/" ) )) {

		 HttpRequest request = new HttpRequest( org.openqa.selenium.remote.http.HttpMethod.DELETE, uri);
		 HttpResponse response = client.execute(request);
		 String text = string(response);
		 log.info( "response is {}", text );
	 }`

404: {"value":{"error":"unknown command","message":"unknown command: unknown command: session/041ee5c716ac3bd31a7c270080700fcc/se/files","stacktrace":"Backtrace:\n\tGetHandleVerifier [0x007ADCE3+50899]\n\t(No symbol) [0x0073E111]\n\t(No symbol) [0x00645588]\n\t(No symbol) [0x00688311]\n\t(No symbol) [0x00687ADE]\n\t(No symbol) [0x00625391]\n\t(No symbol) [0x0062598E]\n\t(No symbol) [0x00625D6A]\n\tGetHandleVerifier [0x00A13EAE+2566302]\n\tGetHandleVerifier [0x00A492B1+2784417]\n\tGetHandleVerifier [0x00A4327C+2759788]\n\tGetHandleVerifier [0x00845740+672048]\n\t(No symbol) [0x00748872]\n\t(No symbol) [0x00624FEA]\n\t(No symbol) [0x006249EF]\n\tGetHandleVerifier [0x00A6BBCC+2926012]\n\tBaseThreadInitThunk [0x75B80099+25]\n\tRtlGetAppContainerNamedObjectPath [0x77177B6E+286]\n\tRtlGetAppContainerNamedObjectPath [0x77177B3E+238]\n"}}

@diemol
Copy link
Member

diemol commented May 3, 2023

Why do you need to delete the files? Files are deleted automatically when the session stops.

@krmahadevan
Copy link
Contributor Author

@karanjeetbirdeye its automatically taken care of. Ou dont need to be doing any explicit deletes and there is no endpoint for it as well. That explains the 404

@karanjeetbirdeye
Copy link

actually I have requirement where in one session multiple files are downloaded, so I fetch one -> delete it -> download another -> delete it. I can loop over a list of files but their names are similar so I was looking to delete them after processing.

@ShadowLNC
Copy link
Contributor

We have a similar scenario where the test will repeatedly download, so being able to delete old files would be great - I know timestamps were also discussed at one point but ended up not being added.

For now we're listing previous files before triggering the download, then using glob patterns to check if something matches since it will append (2) or a similar counter if the filename is a duplicate.

@karanjeetbirdeye
Copy link

thanks @ShadowLNC In our case the downloads just have random strings so it doesn't have appended values

@karanjeetbirdeye
Copy link

@diemol @krmahadevan can we have DELETE endpoint implementation for this?

@diemol
Copy link
Member

diemol commented May 3, 2023

@karanjeetbirdeye please create a feature request and share as much information as possible in it.

@karanjeetbirdeye
Copy link

hey @diemol requested feature here #11986

@aajron
Copy link

aajron commented Jun 26, 2023

@krmahadevan thanks for your hard work, here is my settings, the download pdf in C://Downloads will not be auto-delete when the browser is closed, and my Chrome. driver version is 4.10.
can i know which setting was wrong?
var options = new ChromeOptions();
options.AddUserProfilePreference("download.default_directory", "C://Downloads");

            //disable pop up for block download
            options.AddUserProfilePreference("download.prompt_for_download", false);
            options.AddUserProfilePreference("download.directory_upgrade", true);
            options.AddUserProfilePreference("plugins.always_open_pdf_externally", true);

            options.AddUserProfilePreference("safebrowsing.enabled", "false");
            options.AddUserProfilePreference("safebrowsing.disable_download_protection", "true");
            options.AddArguments("--disable-web-security");
            options.AddArgument("--base-dir-downloads");
            options.AddArgument("--enable-manage-downloads");

@diemol
Copy link
Member

diemol commented Jun 26, 2023

@aajron this is a feature only for Selenium Grid. For questions please use StackOverflow or https://www.selenium.dev/support/

@Bohdan-Khylia
Copy link

Bohdan-Khylia commented Jan 21, 2025

Hi
I have a problem with the functionality. In my case, I need to open certain file in the browser, so I need to know the file location. If I turn on the functionality, the storage folder will not be possible to set as a path and if the functionality is turned off, I don't have the possibility to delete files on driver.quit() from the node. Could you make a mechanism to do it? Or provide me with more information?
cc: @diemol @krmahadevan

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[🚀 Feature]: Grid implement capability to set download directory
8 participants