Hyrax 4 valkyrie support #872

Merged
merged 117 commits into from
Apr 2, 2024
374e126
create an object factory that supports Valkyrie
bkiahstroud Jun 27, 2023
09de2cc
temp gem conflict workaround
bkiahstroud Jul 7, 2023
28875a8
:gear: upgrade dry-monads dependency to ~> 1.5.0
Aug 21, 2023
df96de6
:broom: Add extra parameter for fill_in_blank_source_identifiers
Aug 24, 2023
bae61a7
Revert ":broom: Add extra parameter for fill_in_blank_source_identifi…
Aug 24, 2023
fe51a43
:broom: delegate create_parent_child_relationships from importer to p…
Aug 25, 2023
3dac0f5
allow ruby 3 syntax in migrations
orangewolf Aug 29, 2023
a9a90ba
Merge remote-tracking branch 'origin/i672-valkyrie-support' into hyra…
Aug 29, 2023
86adf9a
:broom: change exists? to exist? to support Ruby 3.2
Aug 30, 2023
ba359f6
:construction: add support for Hyrax 5, valkyrie and ruby 3.2
Aug 30, 2023
e8677bb
add temp workaround for blank title and creator
Aug 31, 2023
f6fb201
:gear: Switch find methods with custom queries for Valkyrie
Sep 7, 2023
1def082
Merge branch 'main' into hyrax-4-valkyrie-support
orangewolf Sep 12, 2023
03544c7
hyrax 4 permission service does both valk and non-valk
orangewolf Sep 12, 2023
2bf6024
new bagit
orangewolf Sep 13, 2023
56101af
handle validation failure
orangewolf Sep 15, 2023
759a481
better failure detection for vaklyrie object
orangewolf Sep 15, 2023
9724643
fix validation message
orangewolf Sep 15, 2023
ed49dc2
importer failure helpers
orangewolf Sep 15, 2023
ba7a071
improve multiple detection in matchers
orangewolf Dec 12, 2023
784798d
fix matcher on missing field
orangewolf Dec 15, 2023
c2ee9bc
rob cant remember that its include?
orangewolf Dec 15, 2023
2f65930
Merge branch 'main' into hyrax-4-valkyrie-support
jeremyf Jan 24, 2024
b346c74
Appeasing rubocop
jeremyf Jan 24, 2024
028069c
♻️ Handle exist? and/or exists? for finding objects
jeremyf Jan 24, 2024
a2cca06
Add dry/monads require for specs
jeremyf Jan 25, 2024
837ab8a
I897 Bulkrax readiness for Hyku 6 and Hyrax 4 & 5 (#898)
ShanaLMoore Jan 25, 2024
0cabf79
📚 Adding documentation for configuration (#896)
jeremyf Jan 25, 2024
8311fe0
♻️ Extract Bulkrax::FactoryClassFinder (#900)
jeremyf Jan 25, 2024
2d33420
Merge branch 'main' into hyrax-4-valkyrie-support
jeremyf Jan 26, 2024
61111d8
Merge branch 'main' into hyrax-4-valkyrie-support
jeremyf Jan 26, 2024
2ada4ec
:bug: [i134] - Fix missing translations
Jan 26, 2024
91583af
Renaming method for parity
jeremyf Jan 29, 2024
687906e
♻️ Favor Bulkrax's persistence layer
jeremyf Jan 31, 2024
cb3b7c5
♻️ Favor Bulkrax.persistence_adapter over ActiveFedora::Base
jeremyf Jan 31, 2024
756768d
Moving methods to adapter pattern
jeremyf Jan 31, 2024
afcfc3d
use find_by_source_identifier instead of find_by_bulkrax_identifier (…
ShanaLMoore Feb 1, 2024
4844893
🧹 Make CreateRelationshipJob work for Valkyrie (#908)
kirkkwang Feb 2, 2024
0b2212e
Add todo comment
Feb 2, 2024
6a81ac1
Merge branch 'hyrax-4-valkyrie-support' of github.com:samvera-labs/bu…
Feb 2, 2024
c8f87ad
🎁 Switch transaction to listener
Feb 5, 2024
89bdb37
Merge branch 'main' into hyrax-4-valkyrie-support
jeremyf Feb 7, 2024
47a42c8
♻️ Migrate persistence layer methods to object factory (#911)
jeremyf Feb 7, 2024
67fbf9d
🧹 Get exporters to work
Feb 7, 2024
e9a527b
make updates work
Feb 7, 2024
22eb48e
🧹 Make DeleteJob work wth new class method .find (#912)
kirkkwang Feb 8, 2024
6d5517d
Merge branch 'main' into hyrax-4-valkyrie-support
jeremyf Feb 23, 2024
166d4b1
♻️ Remove constant
jeremyf Mar 8, 2024
0b1077b
♻️ Reworking structure
jeremyf Mar 8, 2024
30dc16d
Adding index to schema
jeremyf Mar 8, 2024
828beca
Merge branch 'main' into hyrax-4-valkyrie-support
jeremyf Mar 8, 2024
3c2d625
♻️ Favor asking about model_name over class (#934)
jeremyf Mar 8, 2024
01ff728
Merge branch 'main' into hyrax-4-valkyrie-support
jeremyf Mar 8, 2024
6f50c15
Merge branch 'main' into hyrax-4-valkyrie-support
jeremyf Mar 8, 2024
8c97ba6
Favor object factory for find
jeremyf Mar 8, 2024
fb8e944
♻️ Fix return value of transaction create
jeremyf Mar 8, 2024
8f8482b
Correct Hyrax.object_factory -> Bulkrax.object_factory
jeremyf Mar 11, 2024
7420b9c
Download cloud files later (#930)
kirkkwang Mar 8, 2024
64cf314
Merge branch 'main' into hyrax-4-valkyrie-support
jeremyf Mar 12, 2024
0cfd934
yMerge branch 'main' into hyrax-4-valkyrie-support
jeremyf Mar 13, 2024
e219b22
♻️ Favor configuration over hard-coding and reaching assumptions
jeremyf Mar 13, 2024
4d164c8
♻️ Extract Bulkrax.collection_class_method
jeremyf Mar 13, 2024
590d6e2
♻️ Favor Bulkrax.collection_model_class
jeremyf Mar 13, 2024
a053e9c
♻️ Favor Bulkrax.object_factory.find
jeremyf Mar 13, 2024
a8bdf43
♻️ Extract Bulkrax.object_factory.save! method for
jeremyf Mar 13, 2024
849ed13
♻️ Favor using object_factory for save!
jeremyf Mar 13, 2024
a374ed9
♻️ Extract Hyrax.object_factory.search_by_property
jeremyf Mar 14, 2024
cc2dd29
♻️ Extract method for Valkyrization
jeremyf Mar 14, 2024
c4e9a1c
🎁 Adding query for find_by_model_and_property_value
jeremyf Mar 14, 2024
da716d1
♻️ Remove custom Valkyrie search_by_identifer
jeremyf Mar 14, 2024
8880a77
♻️ Favor internal_resource definitions (when available)
jeremyf Mar 14, 2024
bca5afb
♻️ Extract internal_resources method for curation concerns
jeremyf Mar 14, 2024
aff40de
♻️ Favor Bulkrax.object_factory and add fault tolerance
jeremyf Mar 14, 2024
e56e63a
Addressing TODO and minor refactoring
jeremyf Mar 14, 2024
bfb6bdb
I161 import collection resources (#933)
ShanaLMoore Mar 15, 2024
6a89949
♻️ Extract logic for add_user_to_collection_permissions
jeremyf Mar 15, 2024
6ff917a
📚 Tidying documentation
jeremyf Mar 18, 2024
2f161e6
♻️ Refactor Object Factories to leverage more inheritance
jeremyf Mar 18, 2024
3e78e82
♻️ Extract abstract class for ObjectFactory
jeremyf Mar 18, 2024
471c872
♻️ Move method to interface
jeremyf Mar 18, 2024
99adc92
♻️ Organizing code for Valkyrie Object Factory
jeremyf Mar 18, 2024
809d581
Refactoring method names for sorting order
jeremyf Mar 18, 2024
f9e10d7
♻️ Handle Valkyrie::Resource situation
jeremyf Mar 18, 2024
a9bf883
♻️ Puzzling through implementation details
jeremyf Mar 18, 2024
4557b0a
♻️ Extract method to enable removal of conditionals
jeremyf Mar 18, 2024
ea92705
♻️ Extract FileFactory::InnerWorkings
jeremyf Mar 18, 2024
d296aec
♻️ Refactor to extract local variable
jeremyf Mar 18, 2024
5695435
Adding class attribute for Bulkrax::FileFactory
jeremyf Mar 18, 2024
2c7b042
♻️ Adding inner methods for file factory interaction
jeremyf Mar 18, 2024
11b4517
🐛🏳️ post Big refactor fixes
Mar 19, 2024
c476ac6
fix typo
Mar 19, 2024
2e57d21
🧹 Add case for `'collectionresource'`
Mar 19, 2024
25276b9
reload the object before calling persisted? on it
Mar 19, 2024
b542326
Merge branch 'hyrax-4-valkyrie-support' of github.com:samvera-labs/bu…
Mar 19, 2024
32781a0
:lipstick: rubocop fix
Mar 19, 2024
6d2cc56
🐛 Add return in ObjectFactory if valkyrie
Mar 20, 2024
f6a0fb9
save parent object to establish relationships
Mar 20, 2024
fdc3ea3
Add FileSet branch to coercer conditional
Mar 20, 2024
dd277e6
Add commit to clarify casecmp in CsvParser
Mar 20, 2024
a01e786
🎁 Add ability to use tar.gz files
Mar 21, 2024
04b256b
🐛 Changing guard to #respond_to?(:where)
Mar 21, 2024
cab3fb6
🎁 Change glyphicon to font awesome
Mar 21, 2024
8f04984
Add require ruby-progressbar (#942)
kerchner Mar 14, 2024
ba3d2d7
🐛 Ensure we include visibility and other keywords for collection
jeremyf Mar 25, 2024
f53dbbc
🐛 Fix visibility check on the object
Mar 26, 2024
d52800a
🐛 Save provided visibility from CSV
Mar 28, 2024
8b8082e
♻️ Extract methods for better composition
jeremyf Mar 29, 2024
ad54816
♻️ Extracting object factory methods
jeremyf Mar 29, 2024
785b793
💄 endless and ever appeasing of the coppers
jeremyf Mar 29, 2024
c726754
♻️ Favor object factory over hard-coded
jeremyf Apr 1, 2024
fd02e06
Amend the see/refer documentation for parser
jeremyf Apr 1, 2024
dcb9f9b
💄 endless and ever appeasing of the coppers
jeremyf Apr 1, 2024
5b75766
Merge branch 'main' into hyrax-4-valkyrie-support
jeremyf Apr 1, 2024
de69e7e
Updating test schema
jeremyf Apr 1, 2024
f6bb1a2
Remove transactions from initialization
jeremyf Apr 1, 2024
88ac373
♻️ Remove explicit calls to AdminSet
jeremyf Apr 2, 2024
3d81421
📚 Adding TODO items
jeremyf Apr 2, 2024
2 changes: 1 addition & 1 deletion app/controllers/bulkrax/exporters_controller.rb
@@ -60,7 +60,7 @@ def edit
end

# Correctly populate export_source_collection input
- @collection = Collection.find(@exporter.export_source) if @exporter.export_source.present? && @exporter.export_from == 'collection'
+ @collection = Bulkrax.object_factory.find(@exporter.export_source) if @exporter.export_source.present? && @exporter.export_from == 'collection'
end

# POST /exporters
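
The controller change above swaps a direct `Collection.find` for `Bulkrax.object_factory.find`, and the `ObjectFactory.find` added later in this PR rescues the backend's not-found error and re-raises an interface-level one. A minimal sketch of that error-translation idea follows. Everything under `Sketch` (`Backend`, `BackendNotFound`, the in-memory `RECORDS`) is a hypothetical stand-in for illustration, not Bulkrax or ActiveFedora code:

```ruby
module Sketch
  # Interface-level error that callers rescue, whatever the backend is.
  class ObjectNotFoundError < StandardError; end

  # Hypothetical backend-specific error (stands in for an error such as
  # ActiveFedora::ObjectNotFoundError).
  class BackendNotFound < StandardError; end

  # Hypothetical persistence backend backed by an in-memory hash.
  class Backend
    RECORDS = { "abc123" => "Collection abc123" }.freeze

    def self.find(id)
      RECORDS.fetch(id) { raise BackendNotFound, "no record with id #{id}" }
    end
  end

  # The factory hides the backend and translates its error type.
  class ObjectFactory
    def self.find(id)
      Backend.find(id)
    rescue BackendNotFound => e
      raise ObjectNotFoundError, e.message
    end
  end
end

found = Sketch::ObjectFactory.find("abc123")

missing_message =
  begin
    Sketch::ObjectFactory.find("nope")
  rescue Sketch::ObjectNotFoundError => e
    e.message
  end
```

With this shape, a caller such as a controller only needs to know about the factory's own error class, so swapping the backend should not ripple into every call site.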
298 changes: 135 additions & 163 deletions app/factories/bulkrax/object_factory.rb
@@ -1,153 +1,171 @@
# frozen_string_literal: true

module Bulkrax
class ObjectFactory # rubocop:disable Metrics/ClassLength
extend ActiveModel::Callbacks
# rubocop:disable Metrics/ClassLength
class ObjectFactory < ObjectFactoryInterface
include Bulkrax::FileFactory
include DynamicRecordLookup

# @api private
#
# These are the attributes that we assume all "work type" classes (e.g. the given :klass) will
# have in addition to their specific attributes.
#
# @return [Array<Symbol>]
# @see #permitted_attributes
class_attribute :base_permitted_attributes,
default: %i[id edit_users edit_groups read_groups visibility work_members_attributes admin_set_id]
##
# @!group Class Method Interface

# @return [Boolean]
#
# @example
# Bulkrax::ObjectFactory.transformation_removes_blank_hash_values = true
#
# @see #transform_attributes
# @see https://github.com/samvera-labs/bulkrax/pull/708 For discussion concerning this feature
# @see https://github.com/samvera-labs/bulkrax/wiki/Interacting-with-Metadata For documentation
# concerning default behavior.
class_attribute :transformation_removes_blank_hash_values, default: false
##
# @note This does not save either object. We need to do that in another
# loop. Why? Because we might be adding many items to the parent.
def self.add_child_to_parent_work(parent:, child:)
return true if parent.ordered_members.to_a.include?(child)

define_model_callbacks :save, :create
attr_reader :attributes, :object, :source_identifier_value, :klass, :replace_files, :update_files, :work_identifier, :work_identifier_search_field, :related_parents_parsed_mapping, :importer_run_id
parent.ordered_members << child
end

# rubocop:disable Metrics/ParameterLists
def initialize(attributes:, source_identifier_value:, work_identifier:, work_identifier_search_field:, related_parents_parsed_mapping: nil, replace_files: false, user: nil, klass: nil, importer_run_id: nil, update_files: false)
@attributes = ActiveSupport::HashWithIndifferentAccess.new(attributes)
@replace_files = replace_files
@update_files = update_files
@user = user || User.batch_user
@work_identifier = work_identifier
@work_identifier_search_field = work_identifier_search_field
@related_parents_parsed_mapping = related_parents_parsed_mapping
@source_identifier_value = source_identifier_value
@klass = klass || Bulkrax.default_work_type.constantize
@importer_run_id = importer_run_id
def self.add_resource_to_collection(collection:, resource:, user:)
collection.try(:reindex_extent=, Hyrax::Adapters::NestingIndexAdapter::LIMITED_REINDEX) if
defined?(Hyrax::Adapters::NestingIndexAdapter)
resource.member_of_collections << collection
save!(resource: resource, user: user)
end
# rubocop:enable Metrics/ParameterLists

# update files is set, replace files is set or this is a create
def with_files
update_files || replace_files || !object
def self.update_index_for_file_sets_of(resource:)
resource.file_sets.each(&:update_index) if resource.respond_to?(:file_sets)
end

def run
arg_hash = { id: attributes[:id], name: 'UPDATE', klass: klass }
@object = find
if object
object.reindex_extent = Hyrax::Adapters::NestingIndexAdapter::LIMITED_REINDEX if object.respond_to?(:reindex_extent)
ActiveSupport::Notifications.instrument('import.importer', arg_hash) { update }
else
ActiveSupport::Notifications.instrument('import.importer', arg_hash.merge(name: 'CREATE')) { create }
end
yield(object) if block_given?
object
##
# @see Bulkrax::ObjectFactoryInterface
def self.export_properties
# TODO: Consider how this may or may not work for Valkyrie
properties = Bulkrax.curation_concerns.map { |work| work.properties.keys }.flatten.uniq.sort
properties.reject { |prop| Bulkrax.reserved_properties.include?(prop) }
end

def run!
self.run
# Create the error exception if the object is not validly saved for some reason
raise ActiveFedora::RecordInvalid, object if !object.persisted? || object.changed?
object
def self.field_multi_value?(field:, model:)
return false unless field_supported?(field: field, model: model)
return false unless model.singleton_methods.include?(:properties)

model&.properties&.[](field)&.[]("multiple")
end

def update
raise "Object doesn't exist" unless object
destroy_existing_files if @replace_files && ![Collection, FileSet].include?(klass)
attrs = transform_attributes(update: true)
run_callbacks :save do
if klass == Collection
update_collection(attrs)
elsif klass == FileSet
update_file_set(attrs)
else
update_work(attrs)
end
end
object.apply_depositor_metadata(@user) && object.save! if object.depositor.nil?
log_updated(object)
def self.field_supported?(field:, model:)
model.method_defined?(field) && model.properties[field].present?
end

def self.file_sets_for(resource:)
return [] if resource.blank?
return [resource] if resource.is_a?(Bulkrax.file_model_class)

resource.file_sets
end

def find
found = find_by_id if attributes[:id].present?
return found if found.present?
return search_by_identifier if attributes[work_identifier].present?
##
#
# @see Bulkrax::ObjectFactoryInterface
def self.find(id)
ActiveFedora::Base.find(id)
rescue ActiveFedora::ObjectNotFoundError => e
raise ObjectFactoryInterface::ObjectNotFoundError, e.message
end

def find_by_id
klass.find(attributes[:id]) if klass.exists?(attributes[:id])
def self.find_or_create_default_admin_set
# NOTE: Hyrax 5+ removed this method
AdminSet.find_or_create_default_admin_set_id
end

def find_or_create
o = find
return o if o
run(&:save!)
def self.publish(**)
return true
end

def search_by_identifier
query = { work_identifier_search_field =>
source_identifier_value }
# Query can return partial matches (something6 matches both something6 and something68)
# so we need to weed out any that are not the correct full match. But other items might be
# in the multivalued field, so we have to go through them one at a time.
match = klass.where(query).detect { |m| m.send(work_identifier).include?(source_identifier_value) }
##
# @param value [String]
# @param klass [Class, #where]
# @param field [String, Symbol] A convenience parameter where we pass the
# same value to search_field and name_field.
# @param search_field [String, Symbol] the Solr field name
# (e.g. "title_tesim")
# @param name_field [String] the ActiveFedora::Base property name
# (e.g. "title")
# @param verify_property [TrueClass] when true, verify that the given :klass
#
# @return [NilClass] when no object is found.
# @return [ActiveFedora::Base] when a match is found, an instance of given
# :klass
# rubocop:disable Metrics/ParameterLists
#
# @note HEY WE'RE USING THIS FOR A WINGS CUSTOM QUERY. BE CAREFUL WITH
# REMOVING IT.
#
# @see # {Wings::CustomQueries::FindBySourceIdentifier#find_by_model_and_property_value}
def self.search_by_property(value:, klass:, field: nil, search_field: nil, name_field: nil, verify_property: false)
return nil unless klass.respond_to?(:where)
# We're not going to try to match nil nor "".
return if value.blank?
return if verify_property && !klass.properties.keys.include?(search_field)

search_field ||= field
name_field ||= field
raise "You must provide either (search_field AND name_field) OR field parameters" if search_field.nil? || name_field.nil?
# NOTE: Query can return partial matches (something6 matches both
# something6 and something68) so we need to weed out any that are not the
# correct full match. But other items might be in the multivalued field,
# so we have to go through them one at a time.
#
# A ssi field is string, so we're looking at exact matches.
# A tesi field is text, so partial matches work.
#
# We need to wrap the result in an Array, else we might have a scalar that
# will result again in partial matches.
match = klass.where(search_field => value).detect do |m|
# Don't use Array.wrap as we likely have an ActiveTriples::Relation
# which defiantly claims to be an Array yet does not behave consistently
# with an Array. Hopefully the name_field is not a Date or Time object,
# Because that too will be a mess.
Array(m.send(name_field)).include?(value)
end
return match if match
end
# rubocop:enable Metrics/ParameterLists

def self.query(q, **kwargs)
ActiveFedora::SolrService.query(q, **kwargs)
end

# An ActiveFedora bug when there are many habtm <-> has_many associations means they won't all get saved.
# https://github.com/projecthydra/active_fedora/issues/874
# 2+ years later, still open!
def create
attrs = transform_attributes
@object = klass.new
object.reindex_extent = Hyrax::Adapters::NestingIndexAdapter::LIMITED_REINDEX if defined?(Hyrax::Adapters::NestingIndexAdapter) && object.respond_to?(:reindex_extent)
run_callbacks :save do
run_callbacks :create do
if klass == Collection
create_collection(attrs)
elsif klass == FileSet
create_file_set(attrs)
else
create_work(attrs)
end
end
def self.clean!
super do
ActiveFedora::Cleaner.clean!
end
object.apply_depositor_metadata(@user) && object.save! if object.depositor.nil?
log_created(object)
end

def log_created(obj)
msg = "Created #{klass.model_name.human} #{obj.id}"
Rails.logger.info("#{msg} (#{Array(attributes[work_identifier]).first})")
def self.solr_name(field_name)
if defined?(Hyrax)
Hyrax.index_field_mapper.solr_name(field_name)
else
ActiveFedora.index_field_mapper.solr_name(field_name)
end
end

def self.ordered_file_sets_for(object)
object&.ordered_members.to_a.select(&:file_set?)
end

def self.save!(resource:, **)
resource.save!
end

def log_updated(obj)
msg = "Updated #{klass.model_name.human} #{obj.id}"
Rails.logger.info("#{msg} (#{Array(attributes[work_identifier]).first})")
def self.update_index(resources: [])
Array(resources).each(&:update_index)
end
# @!endgroup Class Method Interface
##

def log_deleted_fs(obj)
msg = "Deleted All Files from #{obj.id}"
Rails.logger.info("#{msg} (#{Array(attributes[work_identifier]).first})")
def find_by_id
return false if attributes[:id].blank?
# Rails / Ruby upgrade, we moved from :exists? to :exist? However we want to continue (for a
# bit) to support older versions.
method_name = klass.respond_to?(:exist?) ? :exist? : :exists?
klass.find(attributes[:id]) if klass.send(method_name, attributes[:id])
rescue Valkyrie::Persistence::ObjectNotFoundError
false
end

def delete(_user)
find&.delete
end

private
@@ -238,52 +256,6 @@ def handle_remote_file(remote_file:, actor:, update: false)
update == true ? actor.update_content(tmp_file) : actor.create_content(tmp_file, from_url: true)
tmp_file.close
end

def clean_attrs(attrs)
# avoid the "ArgumentError: Identifier must be a string of size > 0 in order to be treeified" error
# when setting object.attributes
attrs.delete('id') if attrs['id'].blank?
attrs
end

def collection_type(attrs)
return attrs if attrs['collection_type_gid'].present?

attrs['collection_type_gid'] = Hyrax::CollectionType.find_or_create_default_collection_type.to_global_id.to_s
attrs
end

# Override if we need to map the attributes from the parser in
# a way that is compatible with how the factory needs them.
def transform_attributes(update: false)
@transform_attributes = attributes.slice(*permitted_attributes)
@transform_attributes.merge!(file_attributes(update_files)) if with_files
@transform_attributes = remove_blank_hash_values(@transform_attributes) if transformation_removes_blank_hash_values?
update ? @transform_attributes.except(:id) : @transform_attributes
end

# Regardless of what the Parser gives us, these are the properties we are prepared to accept.
def permitted_attributes
klass.properties.keys.map(&:to_sym) + base_permitted_attributes
end

# Return a copy of the given attributes, such that all values that are empty or an array of all
# empty values are fully emptied. (See implementation details)
#
# @param attributes [Hash]
# @return [Hash]
#
# @see https://github.com/emory-libraries/dlp-curate/issues/1973
def remove_blank_hash_values(attributes)
dupe = attributes.dup
dupe.each do |key, values|
if values.is_a?(Array) && values.all? { |value| value.is_a?(String) && value.empty? }
dupe[key] = []
elsif values.is_a?(String) && values.empty?
dupe[key] = nil
end
end
dupe
end
end
# rubocop:enable Metrics/ClassLength
end
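
The comments in `search_by_property` above describe why the query result is filtered a second time: a text-field search can return partial matches ("something6" also matches "something68"), and the property may hold either a scalar or a multivalued list. The sketch below illustrates that filtering step in isolation. `FakeModel`, `Record`, and `exact_match` are hypothetical names invented for this example, and the substring-matching `where` only imitates a text-field search; none of it is the Bulkrax implementation:

```ruby
# Hypothetical record with an identifier that may be a String or an Array.
Record = Struct.new(:id, :identifier)

# Hypothetical model whose .where behaves like a partial-match text search.
class FakeModel
  RECORDS = [
    Record.new("1", ["something68"]),
    Record.new("2", "something6"),
    Record.new("3", ["other", "something6"])
  ].freeze

  def self.where(conditions)
    value = conditions.values.first
    RECORDS.select { |r| Array(r.identifier).any? { |v| v.include?(value) } }
  end
end

# Return the first record whose property contains exactly the given value.
def exact_match(klass:, value:, search_field:, name_field:)
  return nil unless klass.respond_to?(:where)
  return nil if value.nil? || value.to_s.strip.empty?

  klass.where(search_field => value).detect do |record|
    # Array() turns a scalar into a one-element list, so include? compares
    # whole values instead of doing a String substring check.
    Array(record.public_send(name_field)).include?(value)
  end
end

match = exact_match(klass: FakeModel, value: "something6",
                    search_field: "identifier_tesim", name_field: :identifier)
no_match = exact_match(klass: FakeModel, value: "something",
                       search_field: "identifier_tesim", name_field: :identifier)
```

Here `match` is the record with id "2": record "1" is returned by the partial search but rejected by the exact comparison. `no_match` is `nil` because no record holds exactly "something".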