Implement zip extract #158
Some future consumers will need to know the expected fileSize depending on implementation (e.g. unzip). This wires up basic support for adding the fileSize as an argument to Consume; the value is already available at the time Consume is called.
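As a rough illustration of why an unzip-style consumer needs the size up front: Go's `archive/zip` reader takes an `io.ReaderAt` plus the total size, because the zip central directory sits at the end of the file. The sketch below is not pget's consumer code; `buildZip`, `listEntries`, and `listDemoArchive` are hypothetical helpers that build a small archive in memory and list its entries.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// buildZip returns an in-memory archive with a single entry (demo input only).
func buildZip(name, content string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.Create(name)
	if err != nil {
		return nil, err
	}
	if _, err := io.WriteString(w, content); err != nil {
		return nil, err
	}
	// Close writes the central directory at the end of the archive.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// listEntries is a hypothetical stand-in for a zip consumer. zip.NewReader
// cannot locate the central directory without knowing fileSize.
func listEntries(r io.ReaderAt, fileSize int64) ([]string, error) {
	zr, err := zip.NewReader(r, fileSize)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(zr.File))
	for _, f := range zr.File {
		names = append(names, f.Name)
	}
	return names, nil
}

// listDemoArchive wires the two helpers together for the demo.
func listDemoArchive() ([]string, error) {
	data, err := buildZip("hello.txt", "hi")
	if err != nil {
		return nil, err
	}
	return listEntries(bytes.NewReader(data), int64(len(data)))
}

func main() {
	names, err := listDemoArchive()
	fmt.Println(names, err)
}
```

A tar consumer can stream from a plain `io.Reader`, which is presumably why the size argument was not needed before.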
multiReader is a reader that implements the ReadAt functionality needed for some future consumers (e.g. unzip). At a basic level, multiReader consumes a multiChanReader via the NewmultiReader() function and returns an io.ReaderAt implementation. bufferedReader now has a .len() calculation that reports the content length once that header is received. Since we do not know the actual content length until the download starts, there is a new signal channel that indicates the download has started and allows us to read the size of the bufferedReader. This means there is a real likelihood that reading from multiReader will block more often than reading from chanmultiReader. multiReader may be able to implement Seek() and other related functions for reading the data out of strict order.
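The blocking behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `chunk` stands in for bufferedReader, `chunkReaderAt` for multiReader, and `fill` for the point where the download starts; all three names are assumptions.

```go
package main

import (
	"fmt"
	"io"
)

// chunk is a hypothetical stand-in for a buffered download segment whose
// size is unknown until the download has started.
type chunk struct {
	started chan struct{} // closed once data (and therefore size) is known
	data    []byte
}

func newChunk() *chunk { return &chunk{started: make(chan struct{})} }

// fill stores the data and signals that the size can now be read.
func (c *chunk) fill(b []byte) {
	c.data = b
	close(c.started)
}

// len blocks until the chunk has started, then reports its size.
func (c *chunk) len() int64 {
	<-c.started
	return int64(len(c.data))
}

// chunkReaderAt serves ReadAt across an ordered list of chunks.
type chunkReaderAt struct {
	chunks []*chunk
}

// ReadAt implements io.ReaderAt. It may block on any chunk that has not
// started yet, since offsets cannot be resolved without the sizes.
func (r *chunkReaderAt) ReadAt(p []byte, off int64) (int, error) {
	if off < 0 {
		return 0, fmt.Errorf("negative offset %d", off)
	}
	n := 0
	for _, c := range r.chunks {
		size := c.len() // blocks until this chunk has started
		if off >= size {
			off -= size // requested range begins after this chunk
			continue
		}
		n += copy(p[n:], c.data[off:])
		off = 0 // later chunks are read from their beginning
		if n == len(p) {
			return n, nil
		}
	}
	return n, io.EOF // ran out of data before filling p
}

func main() {
	a, b := newChunk(), newChunk()
	r := &chunkReaderAt{chunks: []*chunk{a, b}}
	go func() {
		a.fill([]byte("hello "))
		b.fill([]byte("world"))
	}()
	buf := make([]byte, 5)
	n, err := r.ReadAt(buf, 4)
	fmt.Println(n, err, string(buf[:n])) // prints "5 <nil> o wor"
}
```

Every ReadAt has to wait for the sizes of all chunks before the requested offset, which is one reason an offset-based reader would block more often than a purely sequential one.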
Implement ZipExtractor consumer
If the consumer is not File or tar-extractor when -x is used, log a warning that the tar-extractor supersedes the specified consumer.
Make the consumer handle overwriting explicitly. This addresses edge cases with tar and zip consumer when extracting files.
Move the ConsistentHashingStrategyKey to client not config.
e7d58cf to 9ca65bd
'unzip' is the binary used on Linux to extract from a zip file; let's stick with names that are more aligned with the CLI tools we otherwise use.
Fetch now returns contentType and consumers take ContentType as an argument. This is in preparation for multifile being able to direct different contentTypes to different consumers in the case of tar/zip extraction.
Multifile can now extract tar and zip files based upon the content-type. The -u and -t flags for multifile command control unzip and untar capabilities respectively.
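A minimal sketch of what content-type based routing could look like, assuming the server sends `application/zip` or `application/x-tar`. The function name `pickConsumer`, the consumer names it returns, and the exact media types matched are illustrative assumptions, not the values used in this PR.

```go
package main

import (
	"fmt"
	"mime"
)

// pickConsumer is a hypothetical helper: it maps a Content-Type header to a
// consumer name, honouring the untar/unzip switches. Names and media types
// are assumptions made for this sketch.
func pickConsumer(contentType string, untar, unzip bool) string {
	// ParseMediaType strips parameters such as "; charset=..." and
	// lowercases the media type.
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "file" // missing or malformed header: write the body as-is
	}
	switch {
	case unzip && mediaType == "application/zip":
		return "unzip"
	case untar && mediaType == "application/x-tar":
		return "tar-extractor"
	default:
		return "file"
	}
}

func main() {
	fmt.Println(pickConsumer("application/zip", true, true))
	fmt.Println(pickConsumer("application/x-tar", true, false))
	fmt.Println(pickConsumer("text/plain", true, true))
}
```

Falling back to a plain file write when the header is missing or unrecognised keeps a mixed manifest usable even if some servers report a generic type.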
* Implement `-u` shorthand for `--unzip`
* `--unzip` option for invoking the unzip consumer added
* multifile mode utilizes `--unzip/-u` and `--extract/-x` for tar and unzip modes
* Improved debugging logs for tar and unzip
* Update README
PreRun and PreRunE are mutually exclusive. This moves the extraction and unzip consumer handling via shorthand options to PreRunE, where we validate that -x and -u are not used concurrently.
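The validation could look something like the sketch below. It is written as a plain function so it runs without cobra; in the real command it would presumably be called from PreRunE with the parsed flag values. `consumerForFlags` and the returned consumer names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errMutuallyExclusive is returned when both shorthand options are set.
var errMutuallyExclusive = errors.New("-x/--extract and -u/--unzip cannot be used together")

// consumerForFlags is a hypothetical helper: it validates the shorthand
// flags and returns the consumer to use. "current" is whatever consumer was
// otherwise configured; the names returned are assumptions.
func consumerForFlags(current string, extract, unzip bool) (string, error) {
	switch {
	case extract && unzip:
		return "", errMutuallyExclusive
	case extract:
		return "tar-extractor", nil
	case unzip:
		return "unzip", nil
	default:
		return current, nil
	}
}

func main() {
	for _, tc := range []struct{ extract, unzip bool }{
		{false, false}, {true, false}, {false, true}, {true, true},
	} {
		consumer, err := consumerForFlags("file", tc.extract, tc.unzip)
		fmt.Printf("extract=%v unzip=%v -> %q, err=%v\n", tc.extract, tc.unzip, consumer, err)
	}
}
```

Returning an error here, rather than logging and picking one, is what makes PreRunE the natural hook: cobra aborts the command before any download starts.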
for _, file := range zipReader.File {
	err := handleFileFromZip(file, destPath, overwrite)
	if err != nil {
Do we check that extracted files do not end up outside the intended destination directory?
I assume the ZIP code checks the archive's size and structure for signs of a corrupted or junk archive.
Same for the other archive types.
Malicious tar and zip checking should be added. I do not want to support non-standard zip (read: extensions) unless there is a real need.
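For the path traversal part of that check, one common approach is to resolve each entry name against the destination and reject anything that lands outside it. A minimal sketch follows, assuming a hypothetical `safeJoin` helper that something like handleFileFromZip could call before creating files. It does not cover symlink entries or decompressed-size limits, which would need separate checks.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// safeJoin is a hypothetical helper: it joins an archive entry name onto
// destDir and rejects any result that would fall outside destDir.
func safeJoin(destDir, name string) (string, error) {
	dest := filepath.Clean(destDir)
	target := filepath.Join(dest, name) // Join also resolves ".." segments
	rel, err := filepath.Rel(dest, target)
	if err != nil {
		return "", err
	}
	// A relative path that is ".." or starts with "../" points above destDir.
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("illegal path %q: escapes %q", name, destDir)
	}
	return target, nil
}

func main() {
	for _, name := range []string{"a/b.txt", "../../etc/passwd", "a/../../x"} {
		p, err := safeJoin("/tmp/out", name)
		fmt.Printf("%q -> %q, err=%v\n", name, p, err)
	}
}
```

The same helper would apply to tar entry names, since both formats store paths as untrusted strings.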
Implements zip extraction consumer. This is required to handle dreambooth updates.
In Go, extraction of tar and zip is within ~300ms for equivalent files (test case for dreambooth processing), so let's lighten the load and handle zip files directly in pget.
* `--unzip` (`-u`) option for setting unzip consumer
* `--unzip` and `--extract` options now