Skip to content

Commit

Permalink
🐛 Fix rerun for entries that came from zips
Browse files Browse the repository at this point in the history
Previously, when rerunning an entry that came from a zip file, the rerun
would fail because it would look for the CSV in an assumed location
which is based on it's last importer ID.  Since this was a rerun, it
does not do an unzip into the assumed location so the directory does not
exist.

This commit will first check if the assumed location exists, and if not,
it will look for the location of the last unizpped files and use that
for the rerun.

This does cause an interesting behavior where if the entry is a work
with a file attached, it will add the file again resulting in duplicate
files.  I feel this is such an edge case though because typically if the
entry is successful, the user will not rerun it.  I added a hint text to
the importer to let the user know this is a possibility.

Ref:
- notch8/palni_palci_knapsack#210
  • Loading branch information
kirkkwang committed Jan 28, 2025
1 parent 0e8de61 commit 3b89489
Show file tree
Hide file tree
Showing 2 changed files with 17 additions and 1 deletion.
17 changes: 16 additions & 1 deletion app/parsers/bulkrax/csv_parser.rb
Original file line number Diff line number Diff line change
Expand Up @@ -360,6 +360,8 @@ def path_to_files(**args)
@path_to_files = File.join(
zip? ? importer_unzip_path : File.dirname(import_file_path), 'files', filename
)

Dir.exist?(@path_to_files) ? @path_to_files : File.join(real_importer_unzip_path, 'files', filename)
end

private
Expand All @@ -379,7 +381,7 @@ def unique_collection_identifier(collection_hash)
# We expect a single CSV at the top level of the zip in the CSVParser
# but we are willing to go look for it if need be
def real_import_file_path
return Dir["#{importer_unzip_path}/**/*.csv"].reject { |path| in_files_dir?(path) }.first if file? && zip?
return Dir["#{real_importer_unzip_path}/**/*.csv"].reject { |path| in_files_dir?(path) }.first if file? && zip?

parser_fields['import_file_path']
end
Expand All @@ -389,5 +391,18 @@ def real_import_file_path
def in_files_dir?(path)
File.dirname(path).ends_with?('files')
end

# If we don't have an existing unzip path, we'll try and find it.
# Just in case there are multiple paths, we sort by the number at the end of the path and get the last one
def real_importer_unzip_path
return importer_unzip_path if Dir.exist?(importer_unzip_path)

Dir.glob(base_importer_unzip_path + '*').sort_by { |path| path.split(base_importer_unzip_path).last[1..-1].to_i }.last
end

def base_importer_unzip_path
# turns "tmp/imports/tenant/import_1_20250122035229_1" to "tmp/imports/tenant/import_1_20250122035229"
importer_unzip_path.split('_')[0...-1].join('_')
end
end
end
1 change: 1 addition & 0 deletions app/views/bulkrax/importers/_edit_item_buttons.html.erb
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
<h5>Options for Updating an Entry</h5>
<hr />
<p>Rebuild metadata and files.</p>
<p class="text-warning">Files may be duplicated if this option is used on a successful entry. Consider using <em>Remove and then Build</em> instead.</p>
<%= link_to 'Build', item_entry_path(item, e), method: :patch, class: 'btn btn-primary' %>
<hr />
<p>Remove existing work and then recreate the works metadata and files.</p>
Expand Down

0 comments on commit 3b89489

Please sign in to comment.