Allow modification commands which don't contain unpure functions #240
Comments
Comment by gmcquillan Similarly, as mentioned in the pg_shard-users list, subqueries might be another workaround for the function issue, but they are not supported yet either (e.g. perform the function on data from a different table on the master and insert only constant values into the sharded table). |
Comment by jasonmp85
Are you sure this issue is due to the This might be an argument that the error message should be clarified to say which expression is non-constant. Is the HLL extension installed on the master? Does it know the details of these functions? |
Comment by jasonmp85
At the moment, yes. So unless we can fold an expression into a constant expression, modifications are unsafe if replication is in play. |
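To make the replication concern concrete, here is a minimal Python sketch. It is not pg_shard code: the `Replica` class and both `update_*` helpers are hypothetical, and it assumes the same modification is sent to every replica of a shard, each of which would otherwise evaluate the expression on its own.

```python
import random


class Replica:
    """Hypothetical stand-in for one copy of a shard."""

    def __init__(self, seed):
        # Each node has its own local state (clock, RNG, sequences, ...).
        self.rng = random.Random(seed)
        self.row = {"id": 1, "value": 0.0}

    def apply_expression(self):
        # The replica evaluates a non-deterministic expression locally.
        self.row["value"] = self.rng.random()

    def apply_constant(self, value):
        # The replica only stores a value that was computed elsewhere.
        self.row["value"] = value


def update_with_local_evaluation(replicas):
    for replica in replicas:
        replica.apply_expression()


def update_with_folded_constant(replicas, master_rng):
    value = master_rng.random()  # evaluated exactly once, on the master
    for replica in replicas:
        replica.apply_constant(value)


replicas = [Replica(seed=1), Replica(seed=2)]

update_with_local_evaluation(replicas)
diverged = replicas[0].row != replicas[1].row

update_with_folded_constant(replicas, random.Random(3))
consistent = replicas[0].row == replicas[1].row
```

With local evaluation the two copies end up holding different values; once the expression is folded into a constant before dispatch, every copy stores the same row.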
Comment by gmcquillan To usefully combine or update HLL datastructures, you need to call I didn't think about replication. That's interesting. I assumed that each shard was responsible for its own replication (streaming WAL logs), not that it was something that pg_shard handled for me (if I'm understanding you correctly). Thanks for taking the time to correct my misunderstanding. |
Comment by jasonmp85 Streaming replication replicates the entire database, which is incompatible with our use of many small "logical" shards. Because of this we've been keeping an eye on BDR/UDR, but don't have a timeline or even any concrete designs at this point. We could probably expand the use of functions to those which are |
Comment by gmcquillan Yeah, one really, really nice property of storing data as HLL data types is that mutation is idempotent, which has a nice resiliency in distributed systems. |
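The idempotence property can be illustrated with a deliberately simplified HyperLogLog-style register array in Python. This is a sketch under stated assumptions, not the HLL extension's implementation: the register count, the SHA-256 based hash, and the `add`/`union` helper names are all chosen here for illustration.

```python
import hashlib

P = 4               # index bits (tiny, for illustration only)
M = 1 << P          # number of registers
HASH_BITS = 64


def _hash64(item: str) -> int:
    """Derive a 64-bit hash from SHA-256 (an assumption of this sketch)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def add(registers, item):
    """Return a new register list with `item` folded in."""
    h = _hash64(item)
    index = h >> (HASH_BITS - P)
    rest = h & ((1 << (HASH_BITS - P)) - 1)
    # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
    rank = (HASH_BITS - P) - rest.bit_length() + 1
    updated = list(registers)
    updated[index] = max(updated[index], rank)
    return updated


def union(a, b):
    """Merge two register lists by taking the element-wise maximum."""
    return [max(x, y) for x, y in zip(a, b)]


empty = [0] * M
once = add(empty, "user-42")
twice = add(once, "user-42")   # re-applying the same write
```

Because every update is a `max`, replaying the same addition or merging a sketch with itself leaves the registers unchanged, which is why a retried or duplicated write does no harm.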
Comment by jasonmp85 Hm… so I just checked We find the HLL extension very useful for a number of our customers, so I want to make sure it's working well with Could you shoot me an email at engage at citusdata dot com to have a quick chat about the problem you're working on? |
Comment by jasonmp85 OK, so I've investigated what's going on here. Previously we had the assumption that This is obviously true when transforming something like I think we can relax this check by replacing it with a call to Because We'll look into this during a future cycle. |
@jasonmp85 - how long would it take to incorporate (Also, we're tracking performance / locking improvements for |
I think we should fix this by evaluating all functions on the master, |
Ran into this during a customer engagement. The use case was: `UPDATE table SET updated_at = now() WHERE ...`. |
@samay-sharma: I'm not sure that applies here, as `now` is decidedly `VOLATILE`. |
@samay-sharma: I'm not sure that applies here, as `now` is decidedly `VOLATILE`.
Agreed that it doesn't apply, but `now` is actually stable, not
volatile. It returns the transaction timestamp, not the clock time
(that'd be `clock_timestamp` which is indeed volatile).
|
@jasonmp85: Oops, should I move my comment to #213, then? |
This just came up in engage@, someone wanted to run a query like
The workaround I suggested was to use an upsert:
Which obviously isn't ideal. |
I assume upsert doesn't actually check this part of the query? Can you use |
Upsert uses `!contain_mutable_functions` whereas update uses `!IsA(targetEntry->expr, Const)`.
|
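The difference between the two checks can be modelled with a small Python sketch. The node classes and the `strict_check`/`relaxed_check` names are hypothetical; the volatility letters follow PostgreSQL's `pg_proc.provolatile` convention (`i` immutable, `s` stable, `v` volatile), and "mutable" here means anything that is not immutable.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Const:
    value: object


@dataclass
class Column:
    name: str


@dataclass
class FuncCall:
    name: str
    volatility: str  # 'i', 's' or 'v', as in pg_proc.provolatile
    args: List["Expr"] = field(default_factory=list)


Expr = Union[Const, Column, FuncCall]


def contains_mutable_functions(expr) -> bool:
    """True if any function in the tree is stable or volatile."""
    if isinstance(expr, FuncCall):
        if expr.volatility != "i":
            return True
        return any(contains_mutable_functions(arg) for arg in expr.args)
    return False


def strict_check(expr) -> bool:
    """Models the update path: only a plain constant is accepted."""
    return isinstance(expr, Const)


def relaxed_check(expr) -> bool:
    """Models the upsert path: anything without mutable functions."""
    return not contains_mutable_functions(expr)


immutable_call = FuncCall("lower", "i", [Column("name")])
stable_call = FuncCall("now", "s")
volatile_call = FuncCall("clock_timestamp", "v")
```

In this model `lower(name)` is rejected by the strict check but accepted by the relaxed one, while `now()` and `clock_timestamp()` are rejected by both, since a stable function still counts as mutable.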
Ah, ok, good. Yeah, we should just fix this. |
I talked to Sumedh and he seemed okay with fixing this given that it Does that sound like a reasonable estimate to someone who knows more than On Thu, 21 Apr 2016 at 20:03, Jason Petersen [email protected] wrote:
|
Sounds reasonable. It's probably 5 lines of code plus a bunch of testing. |
Yes, sounds reasonable here. On April 21, 2016 1:57:58 PM CDT, Brian Cloutier [email protected] wrote:
|
Once we solve this issue we can also provide a sneaky workaround for #211 by defining a PL/pgSQL wrapper for nextval that is marked as immutable. In that case it will get evaluated on the master as if it was a constant expression. |
Issue by gmcquillan
Tuesday May 05, 2015 at 22:53 GMT
Originally opened as citusdata/pg_shard#108
I'm experimenting with using HyperLogLog (HLL) data types in some columns. One problem with these is that they take up quite a lot more space than a BIGINT. pg_shard potentially alleviates a lot of those issues. The extension that provides the HLL datatype is this one by aggregateknowledge. These two extensions seem complementary for warehousing purposes.
Surprisingly, these data types work with sharded tables for most types of reads, but not for writes (see below). When I attempt an update like so:
I get:
I can sort of work around this by setting the literal bytes in this field, which works fine -- but adding HLL values requires a read and a write. Since pg_shard (understandably) doesn't allow for more than a single-statement transaction, this leaves my use case vulnerable to race conditions in multi-writer environments.
Since this function is available on the workers, and is deterministic based on the value of the existing row and the new HLL value to be added, there shouldn't be any issue with dispatching this expression through to the workers.
Is there a hard limitation preventing pg_shard from dispatching modifications for non-constant expressions?