-
Notifications
You must be signed in to change notification settings - Fork 691
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support RETURNING #578
Support RETURNING #578
Conversation
I think handling of node failures is the part requiring most testing, since I've whacked that around; and it's pretty much not covered in the regression tests. |
Thanks for splitting the commits, it was very useful. I've one high-level question on the change. We execute the RETURNING on the workers, and use rows that return from one of the workers. Also, there seems to be a bug, which I couldn't find where to write on the changes, so I add to here. When we define cursors on a router query, I observe an unexpected behaviour once the data finishes on the cursor: SELECT *
FROM multiple_hash
WHERE category = '0'
ORDER BY category;
category | data
----------+------
0 | 1-1
0 | 2-1
0 | 3-1
0 | 4-1
0 | 5-1
0 | 6-1
(6 rows)
-- now go with the cursors
BEGIN;
DECLARE test_cursor CURSOR FOR
SELECT *
FROM multiple_hash
WHERE category = '0'
ORDER BY category;
FETCH FORWARD 5 test_cursor;
category | data
----------+------
0 | 1-1
0 | 2-1
0 | 3-1
0 | 4-1
0 | 5-1
(5 rows)
FETCH FORWARD 5 test_cursor;
category | data
----------+------
0 | 6-1
(1 row)
FETCH FORWARD 5 test_cursor;
ERROR: scan directions other than forward scans are unsupported The scan direction is appeared as |
ExecuteSingleShardSelect(QueryDesc *queryDesc, uint64 tupleCount, Task *task, | ||
EState *executorState, TupleDesc tupleDescriptor, | ||
DestReceiver *destination) | ||
static int64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: It looks like we need to return uint
. Also, this function does not use executorState
parameter. You may want to remove that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*/
-static void
-ExecuteSingleShardSelect(QueryDesc *queryDesc, uint64 tupleCount, Task *task,
EState *executorState, TupleDesc tupleDescriptor,
+static int64DestReceiver *destination)
Minor: It looks like we need to return
uint
.
There's a related discussion (#554), and it looks like it's going to be
int64.
Also, this function does not use
executorState
parameter. You may
want to remove that.
Sound sensible.
It's going to come from the master.
Hm, that's rather likely not related to this change. Afaics that |
Just a note, the issue with Cursors I mentioned here doesn't show up on the |
* distributed table is successful. ExecuteDistributedModify returns the number | ||
* of modified rows in that case and errors in all others. This function will | ||
* also generate warnings for individual placement failures. | ||
* ExecuteTaskAndStoreResults executes the task on the remote node, retrieves |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ExecuteTaskAndStoreResults: It looks like the name and comment are obsolete. This function may not store the results anymore in case of ConsumeQueryResults
. So, maybe change the name to ExecuteRouterTask
or so?
@anarazel sorry for being late on the review. I added couple of notes for naming etc.. Other than that, the only issue I came across is the cursor issue mentioned above. (As you mentioned node failure testing, I did lots of failure testing. All looks good.) After you consider my comments, I'd like to do a final review and testing. |
bfbe4c8
to
3ff1eda
Compare
Hi Onder. I can actually reproduce the CURSOR issue on master without any problem. Which isn't surprising, given that the relevant code didn't change. Could you verify that you can also reproduce it there, and open an issue if so? |
3ff1eda
to
4922432
Compare
@onderkalaci I think this should be ready to go, what do you think? |
Hi @anarazel, CREATE TABLE multiple_hash
(
category text NOT NULL,
data text NOT NULL
);
INSERT INTO multiple_hash VALUES ('1', '1');
INSERT INTO multiple_hash VALUES ('1', '2');
INSERT INTO multiple_hash VALUES ('1', '3');
INSERT INTO multiple_hash VALUES ('1', '4');
INSERT INTO multiple_hash VALUES ('1', '5');
INSERT INTO multiple_hash VALUES ('1', '6');
INSERT INTO multiple_hash VALUES ('1', '7');
BEGIN;
DECLARE test_cursor CURSOR FOR
SELECT *
FROM multiple_hash
WHERE category = '0'
ORDER BY category;
-- the same error exists on both `master` and `feature/returning` branches:
FETCH FORWARD 4 test_cursor;
FETCH FORWARD 4 test_cursor;
ERROR: scan directions other than forward scans are unsupported
BEGIN;
DECLARE test_cursor CURSOR FOR
SELECT *
FROM multiple_hash
WHERE category = '1'
ORDER BY category;
FETCH FORWARD 4 test_cursor;
FETCH FORWARD 4 test_cursor;
-- for the following commands error happens on `feature/returning` branch
-- and not in `master` branch
FETCH FORWARD 4 test_cursor;
FETCH FORWARD 4 test_cursor; Though I couldn't locate exactly which change has led to the cursor issue, postgres changes the scan direction at this point. And portal->atEnd is set to true just below it. Reading this comment makes me think that we should only error out when scan direction is Since this patch introduces a small error that doesn't exist in |
* placements. | ||
*/ | ||
category = ERRCODE_TO_CATEGORY(ERRCODE_INTEGRITY_CONSTRAINT_VIOLATION); | ||
raiseError = SqlStateMatchesCategory(sqlStateString, category); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I was doing some failure testing, I realized that sqlStateString
becomes empty when two nodes are down but the connections have already been cached (i.e., status == PGRES_FATAL_ERROR
). This makes SqlStateMatchesCategory()
to crash.
Steps to reproduce:
CREATE TABLE multiple_hash
(
category text NOT NULL PRIMARY KEY,
data text NOT NULL
);
SELECT master_create_distributed_table('multiple_hash', 'category', 'hash');
SELECT master_create_worker_shards('multiple_hash', 4, 2);
INSERT INTO multiple_hash VALUES ('1', '1');
-- now terminate both worker nodes nodes
/usr/local/pgsql/bin/pg_ctl -D /Users/onderkalaci/Documents/data_dir/worker_9700 -o "-p 9700" -l /tmp/logfile_9700 stop -m f
/usr/local/pgsql/bin/pg_ctl -D /Users/onderkalaci/Documents/data_dir/worker_9701 -o "-p 9701" -l /tmp/logfile_9701 stop -m f
INSERT INTO multiple_hash VALUES ('1', '1');
WARNING: terminating connection due to administrator command
CONTEXT: while executing command on localhost:9701
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, this behavior does not exist on master
branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A slight change in your recipe shows the same result in master. If you kill -9/use -m immediate, instead of a normal restart, the same crash happens in master. I opened #634 for that issue. I think we just need a NULL check in SqlStateMatchesCategory().
On 2016-06-28 04:46:24 -0700, Önder Kalacı wrote:
I don't think this has anything to do with the patch, the conditions |
4922432
to
9f8db7d
Compare
I pushed two fixups (which will be squashed before merge), first adding a regression test showing the bug, and then a fix. |
#if (PG_VERSION_NUM < 90600) | ||
|
||
/* before 9.6, PostgreSQL used a uint32 for this field, so check */ | ||
Assert(currentAffectedTupleCount <= PG_UINT32_MAX); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this line does not compile on 9.4.6. How could travis pass the tests :)
executor/multi_router_executor.c:695:40: error: use of undeclared identifier 'PG_UINT32_MAX'
Assert(currentAffectedTupleCount <= PG_UINT32_MAX);
Hi @anarazel , I only want to ask you about my comment on the Other than that, as I noted, a single line does not compile on 9.4.6 when asserts are enabled. After the code compiles, I'll run check-full on 9.4.6, and then it's a |
I don't really agree. Your suggested function name leaves the separation of concerns between RouterExecutorStart and ExecuteRouterTasks less clear imo. And not having any rows to return is still computing the return value. I'll update the comment though. |
This unfortunately requires adding a new table, triggering renumbering of a number of shard ids.
…d scans. The targetlist contains TargetEntrys containing expressions, not expressions directly. That didn't matter so far, but with the upcoming RETURNING support, the targetlist is inspected to build a TupleDesc. ExecCleanTypeFromTL hits an assert when looking at something that's not a TargetEntry. Mark the entry as resjunk, so it's not actually used.
The old targetlist wasn't used so far, but the upcoming RETURNING support relies on it. This also allows to get rid of some crufty code in multi_executor.c:multi_ExecutorStart(), which used the worker query's targetlist instead of the main statement's (which didn't have one up to now).
9f8db7d
to
038b485
Compare
The upcoming RETURNING support would otherwise require too much duplication. This contains most of the pieces required for RETURNING support, except removing the planner checks and adjusting regression test output.
038b485
to
4549e06
Compare
@onderkalaci I've reworded the comment. If you're not happy with that, we can continue refining that. But I'm merging, because that unblocks other PRs (and you gave your The 9.4 issue worked on travis, because that compiles without assertions :(. See #619 |
Most of the code changes aren't really about returning itself, but about making it easy to support RETURNING. I've tried to break it down to individually review-able commits. There's one bigger commit "Combine router executor paths for select and modify commands." which I previously had split up further; I can do that again.
Fixes #242