You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We currently use a Ruby Hash to map keys to tuples in Bud collections. This works fine, but there is room to optimize Bud collections by using a custom hash-like type:
We currently store [key_cols] => tuple mappings. This is redundant, since the key column values can be extracted from the tuple when needed -- you'd just want custom hash and equality functions that would consult the collection schema to find the columns to hash on.
Extracting the key columns on every insertion could then also be avoided.
The current code extracts the key columns, checks for a matching entry in the hash table, and only then does an insertion. If the underlying storage provided an "upsert"-like method, we could avoid the need to probe the storage twice.
In the grand scheme of things, this isn't a priority but might be fun.
The text was updated successfully, but these errors were encountered:
Custom hash and equality functions on the tuple don't work because the tuple doesn't know its keycols. It cannot know its keycols because we pass a tuple around by ref.
Custom hash and equality functions in the collection don't work well; I experimented with a set data structure written in Ruby with schema-specific hash and equals support, but it doesn't match the built-in Ruby hash for performance.
The fastest method I have seen so far (staying in pure Ruby) is to wrap the tuple with an object with schema-specific hash and equals, but which doesn't extract the fields. This is faster than our current method of creating an array: on jruby it is twice is fast, but marginal (10%) on ruby 1.8.7 and 1.9.3
Given,
table :foo, [:k1, :k2] => [:v]
generate a key class as follows
class Foo_Key
def initialize(tup)
@tup = tup
end
def hash
@hash = @tup.k1.hash * 31 + @tup.k2.hash // good to cache the hashcode.
end
def ==(other)
other.class <= Foo_Key and @tup.k1 = other.tup.k1 and @tup.k2 == other.tup.k2
end
end
By "custom hash and equality functions", I was assuming we'd be able to control the hash/equality function APIs, so it would be easy to arrange to pass a reference to the hash table/collection when invoking the function. That's what I did in C4.
Right -- to beat the performance of the builtin Hash, we'd probably want a C extension.
We currently use a Ruby Hash to map keys to tuples in Bud collections. This works fine, but there is room to optimize Bud collections by using a custom hash-like type:
In the grand scheme of things, this isn't a priority but might be fun.
The text was updated successfully, but these errors were encountered: