Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

postgres_fdw does not work when pg_shard is on shared_preload_libraries #236

Closed
citus-github-bot opened this issue Feb 4, 2016 · 6 comments · Fixed by #329
Closed

postgres_fdw does not work when pg_shard is on shared_preload_libraries #236

citus-github-bot opened this issue Feb 4, 2016 · 6 comments · Fixed by #329
Assignees
Labels
Milestone

Comments

@citus-github-bot
Copy link

Issue by onderkalaci
Wednesday Mar 25, 2015 at 12:31 GMT
Originally opened as citusdata/pg_shard#96


@samay-sharma discovered that postgres_fdw does not work when shared_preload_libraries is set to 'pg_shard'.

When the instructions on this link is followed for postgres_fdw, we get the following error on INSERT statements to postgres_fdw table:

WARNING:  Connection failed to ~�:30098872
DETAIL:  Remote message: invalid port number: "30098872"
ERROR:  connection pointer is NULL

The root of the problem seems to be about linking/loading of shared libraries. postgres_fdw extension calls a function GetConnection() in somewhere in the code. However, since pg_shard is also loaded, pg_shard's GetConnection() function is called.

We might consider this comment in the Makefile.

Also note that, call to DeterminePlannerType(), returns PLANNER_TYPE_POSTGRES, and standard_planner() is called. This error is thrown after standard_planner() is called.

@citus-github-bot
Copy link
Author

Comment by amitlan
Tuesday Apr 14, 2015 at 13:29 GMT


Wow, looks like there's an avalanche of independent discoveries of errors related to shared library(ies) symbol resolution. Just a couple days ago, was looking at a segfault where a library loaded with shared_preload_libraries had a yet-unfixed-for-PG-9.4+ copy of a function that a later loaded library also had defined. And since the later loaded library has already fixed the function for-PG-9.4+, it took a while to figure this out since yet-unfixed copy was being used in all places.

We've discussed this a little bit and it seems you are planning to use -Bsymbolic on extension side. Have you considered downsides of using it though, as I encountered (or you may have) a few Bsymbolic-Considered-Harmful posts around web[1]. Maybe a little more reliable would be -Bsymbolic-functions?

[1] https://software.intel.com/en-us/articles/performance-tools-for-software-developers-bsymbolic-can-cause-dangerous-side-effects

@citus-github-bot citus-github-bot added this to the Next milestone Feb 4, 2016
@citus-github-bot
Copy link
Author

Comment by marcocitus
Thursday Nov 19, 2015 at 23:12 GMT


I tried to see what happens if we rename GetConnection in pg_shard to something else to avoid this conflict. I was able to create a postgres_fdw table that pointed to a pg_shard table, which allowed me to do joins between a local table and a distributed table.

However, when I ran an INSERT command I got the following error:

LOG:  statement: INSERT INTO customer_reviews_fdw VALUES
          ('HN802', '2004-01-01', 1, 10, 4, 'B00007B5DN',
           'Tug of War', 133191, 'Music', 'Indie Music', 'Pop', '{}');
LOG:  statement: SET search_path = pg_catalog
LOG:  statement: SET timezone = 'UTC'
LOG:  statement: SET datestyle = ISO
LOG:  statement: SET intervalstyle = postgres
LOG:  statement: SET extra_float_digits = 3
LOG:  statement: START TRANSACTION ISOLATION LEVEL REPEATABLE READ
ERROR:  unrecognized node type: 2100
STATEMENT:  INSERT INTO public.customer_reviews(customer_id, review_date, review_rating, review_votes, review_helpful_votes, product_id, product_title, product_sales_rank, produ)
ERROR:  unrecognized node type: 2100
CONTEXT:  Remote SQL command: INSERT INTO public.customer_reviews(customer_id, review_date, review_rating, review_votes, review_helpful_votes, product_id, product_title, product)
STATEMENT:  INSERT INTO customer_reviews_fdw VALUES
          ('HN802', '2004-01-01', 1, 10, 4, 'B00007B5DN',
           'Tug of War', 133191, 'Music', 'Indie Music', 'Pop', '{}');
LOG:  statement: ABORT TRANSACTION
TRAP: FailedAssertion("!(IsTransactionState())", File: "relcache.c", Line: 1650)
WARNING:  server closed the connection unexpectedly
                This probably means the server terminated abnormally
                before or while processing the request.
CONTEXT:  Remote SQL command: ABORT TRANSACTION
LOG:  server process (PID 9804) was terminated by signal 6: Aborted
DETAIL:  Failed process was running: ABORT TRANSACTION
LOG:  terminating any other active server processes

@ozgune
Copy link
Contributor

ozgune commented Feb 5, 2016

I'm adding fresh feedback from a user who tried this on pg_shard.

I had a bit of a scare yesterday trying to use postgres_fdw on our new Citus master. Your documentation asks that pg_shard be included in PostgreSQL's list of libraries to preload on start -- but pg_shard also has been written such that the C function GetConnection() collides with the names of postgres_fdw. Details of this are already written up at:

citusdata/pg_shard#96

Considering that both pg_shard and postgres_fdw are distributed by you all in the citusdb-contrib-4.0 package, I'd expect for both of them to work. The behavior I saw with postgres_fdw was thankfully just a failed connection to the FDW server on SELECT, but it looks like for other modules (such as pg_repack), pg_shard can end up crashing the process. That sounds bad.

@jasonmp85
Copy link
Contributor

Hey @ozgune, I edited your comment to put the user's text in quotes. I proposed a fix for this to @anarazel the other week, and it's been in my backlog for some time now (it's pretty simple), so I think we know how to go about this when we do decide to tackle it.

@ozgune
Copy link
Contributor

ozgune commented Feb 8, 2016

Hey @jasonmp85 -- could the fix for this be as simple as renaming the function on our end?

If it is, I'll chat with @sumedhpathak to see if we can squeeze this into our 5.0 release.

@jasonmp85
Copy link
Contributor

@ozgune that sounds fine as a quick fix. I'll audit symbol names to ensure there aren't any other obvious conflicts.

Long-term we should use visibility modifiers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants