Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UUID support #42

Open
mkgrgis opened this issue Aug 5, 2023 · 4 comments
Open

UUID support #42

mkgrgis opened this issue Aug 5, 2023 · 4 comments
Assignees

Comments

@mkgrgis
Copy link
Contributor

mkgrgis commented Aug 5, 2023

Hello, @ibarwick!

You have powerful inspired me for C FDW contributing. Some hours ago I have implemented UUID support for sqlite_fdw and check SQLite to Firebird data transport. Unfortunately there is some problems. Let's test.

Firebird gives us some UUID functions. Recommended special storage mode is CHAR(16) CHARACTER SET OCTETS.

Firebird

CREATE TABLE "UUID" ("UUID" CHAR(16) CHARACTER SET OCTETS NOT NULL);
-- Many times, 15-20 examples
INSERT INTO "UUID" ("UUID")  
SELECT gen_uuid() FROM rdb$database;

PostgreSQL

CREATE FOREIGN TABLE "uuid" (
  "UUID" uuid
)
SERVER firebird_server OPTIONS (table_name 'UUID', quote_identifier 'true');

First problem
During select * from uuid; there is error with malformed input.

Note: for text stored UUIDs there is no SELECT problems
Firebird

CREATE TABLE "UUIDt" ("UUID" VARCHAR (40) NOT NULL);
-- Many times, 15-20 examples
INSERT INTO "UUIDt" ("UUID")
SELECT uuid_to_char(gen_uuid()) FROM rdb$database;

PostgreSQL

CREATE FOREIGN TABLE "uuid text"(
  "UUID" uuid
)
SERVER firebird_server
OPTIONS (table_name 'UUIDt', quote_identifier 'true');

Sample listing

select * from "uuid text";
                 UUID                 
--------------------------------------
 91f8b0f1-db13-44f0-91c8-71a1acc402cf
 da89d679-8d3a-4a81-8027-f1a0af0dd38a
 29138623-a57f-44eb-9dc1-25641c8e66a1
 09e570c0-367a-4249-af30-62ff4667afe0
(4 行)

Second problem
Let's SELECT from text stored UUID column

select * from "uuid text" where "UUID" = '91f8b0f1-db13-44f0-91c8-71a1acc402cf';
 UUID 
------
(0 行)
select * from "uuid text" where "UUID" = '91F8B0F1-DB13-44F0-91C8-71A1ACC402CF';
 UUID 
------
(0 行)

Normalized FirebirdSQL form for UUID look like
91F8B0F1-DB13-44F0-91C8-71A1ACC402CF
Normalized PostgreSQL form for UUID look like
91f8b0f1-db13-44f0-91c8-71a1acc402cf

Before WHERE PostgreSQL normalize UUID value. Hence firebird_fdw should make upper transformation.

My help
I have resolved a problem with malformed UUID input from SQLite blob affinity and binding PostgreSQL UUID value to SQLite blob. Some solution there is in https://github.com/mkgrgis/sqlite_fdw/blob/draft_uuid/sqlite_query.c You can borrow this code.

Also you can point me to some lines in firebird_fdw code where i can try to add UUID support.

Many thanks for the long and hard work with firebird_fdw code!

ibarwick added a commit that referenced this issue Aug 20, 2023
Firebird does not have an explicit UUID type, but provides a function
"gen_uuid()" which generates a UUID as a series of raw byte values.
These can either be stored in a column defined with "CHARACTER SET
OCTETS", or converted to a string with "uuid_to_char()" for storage
in a normal text column.

However, if a PostgreSQL UUID value is pushed down as a string
literal, any UUID comparisons will fail unless the corresponding
value in Firebird is normalised in the same way as PostgreSQL
(which is likely not the case, as "uuid_to_char()" produces
UUIDs formatted with upper-case characters, whereas PostgreSQL
normalizes to lower-case values, per RFC4122 [1]).

To avoid this, we can simply not push down UUID values, so
comparisons can be performed on the PostgreSQL side, which
accepts a range of formats as valid UUID input values [2].

This is less efficient than performing comparisons on the Firebird
side; this could be implemented later, but would require knowledge of
how the UUID is stored in Firebird (expressed as a column-level
option or similar).

Note that currently no provision exists for inserting UUID
values into Firebird "CHARACTER SET OCTETS" columns.

[1] https://datatracker.ietf.org/doc/html/rfc4122
[2] https://www.postgresql.org/docs/current/datatype-uuid.html

This patch assumes a libfq version with support for
"CHARACTER SET OCTETS".

Addresses issues in GitHub #42.
@ibarwick ibarwick self-assigned this Aug 20, 2023
@ibarwick
Copy link
Owner

Firebird gives us some UUID functions. Recommended special storage mode is CHAR(16) CHARACTER SET OCTETS.

Firebird

CREATE TABLE "UUID" ("UUID" CHAR(16) CHARACTER SET OCTETS NOT NULL);
-- Many times, 15-20 examples
INSERT INTO "UUID" ("UUID")  
SELECT gen_uuid() FROM rdb$database;

PostgreSQL

CREATE FOREIGN TABLE "uuid" (
  "UUID" uuid
)
SERVER firebird_server OPTIONS (table_name 'UUID', quote_identifier 'true');

First problem During select * from uuid; there is error with malformed input.

OK, this is an issue with libfq, which didn't know about CHARACTER SET OCTETS. I've committed an update which resolves this by converting the raw data into hex values (which is what isql does); see: bb4ae5a.

(...)

Second problem Let's SELECT from text stored UUID column

select * from "uuid text" where "UUID" = '91f8b0f1-db13-44f0-91c8-71a1acc402cf';
 UUID 
------
(0 行)
select * from "uuid text" where "UUID" = '91F8B0F1-DB13-44F0-91C8-71A1ACC402CF';
 UUID 
------
(0 行)

Normalized FirebirdSQL form for UUID look like 91F8B0F1-DB13-44F0-91C8-71A1ACC402CF Normalized PostgreSQL form for UUID look like 91f8b0f1-db13-44f0-91c8-71a1acc402cf

Before WHERE PostgreSQL normalize UUID value. Hence firebird_fdw should make upper transformation.

I've had a look at the issue; it's a bit tricky because Firebird does not have an explicit UUID data type, so pushing down UUID text values to Firebird for comparison requires some knowledge of how the UUID is stored (i.e. as octets, or as text, and if the latter what format - it's conceivable that an app might store them in a different way to the uuid_to_char() format). This could be done, but would probably require some sort of column-level configuration.

However as a simple, initial solution, we can simply avoid pushing down the UUID value, so any comparisons are performed on the PostgreSQL side, which accepts various input formats for UUID values. In many cases this will be less efficient than performing the comparison in Firebird, but better than nothing.

Many thanks for the long and hard work with firebird_fdw code!

Thanks for the support and feedback over the years :).

@mkgrgis
Copy link
Contributor Author

mkgrgis commented Aug 21, 2023

Thanks, @ibarwick ! CHAR(16) CHARACTER SET OCTETS works fine. This look like BLOB affinity with 16 bytes in SQLite or Oracle RAW(16). In all this cases we have 1⇔1 transformation and no problems. Text case is harder.

I've had a look at the issue; it's a bit tricky because Firebird does not have an explicit UUID data type, so pushing down UUID text values to Firebird for comparison requires some knowledge of how the UUID is stored (i.e. as octets, or as text, and if the latter what format - it's conceivable that an app might store them in a different way to the uuid_to_char() format). This could be done, but would probably require some sort of column-level configuration.

1. Around of CHARACTER SET OCTETS for UUID

comparison requires some knowledge of how the UUID is stored

I think no problems. In Firebird as RDBMS getting C bool for CHARACTER SET OCTETS with 16 bytes length only is more easier than in SQLite where in 1st row UUID can be stored with BLOB affinity and in 2nd row will have text affinity for example (not normal and very ugly but really implemented in SQlite conception). In Firebird in case of CHARACTER SET OCTETS we have strong 1⇔1 transformation. This is fully pushdownable. Also you can implement length control like https://github.com/mkgrgis/sqlite_fdw/blob/939b6bc79033b74f7c24fd50ac22590e33bed9b0/sqlite_query.c#L378 Maybe pg_uuid_t with ->data will helpful for your implementation. In my case using pg_uuid_t cause increasing data transfer speed in both directions not less than 20-30% (pessimistic overview).

However as a simple, initial solution, we can simply avoid pushing down the UUID value, so any comparisons are performed on the PostgreSQL side, which accepts various input formats for UUID values. In many cases this will be less efficient than performing the comparison in Firebird, but better than nothing.

I think it will be not very hard to pushdown UUID comparsion for CHARACTER SET OCTETS case only.

2. Simple text case is more harder.

... or as text, and if the latter what format - it's conceivable that an app might store them in a different way to the uuid_to_char() format

Let's discuss. I have no preferred implementation. What about examples?

  • In oracle_fdw there is only RAW(16) as UUID supported, author follows common Oracle practice where there are no text UUIDs, hence 1⇔1.
  • In current mysql_fdw code there is no UUIDOID or UUID or uuid at all. Only binary(16) described at https://dev.mysql.com/blog-archive/mysql-8-0-uuid-support/ with tiny group of UUID functions: UUID (generating function), UUID_TO_BIN, BIN_TO_UUID, IS_UUID.
  • In current postgres_fdw code there is no UUIDOID or UUID or uuid at all.
  • In sqlite_fdw there was no UUID support at all before my PR.

It seems there are no examples. Let's collect some potentially helpful facts.

  • Both PostgreSQL and FirebirdSQL have normalized text forms for UUIDs;
  • The forms are different with upper and lower 1⇔1 transformation;
  • Both PostgreSQL text::uuid and FirebirdSQL char_to_uuid can read not only normalized text input.

What about deparsing to char_to_uuid if there is text (not CHARACTER SET OCTETS 16b) case? This means we need only fast CHAR(16) CHARACTER SET OCTETSpg_uuid_t in C for all cases. What if no need to support text UUIDs recognised by PostgreSQL but not recognised by Firebird char_to_uuid. How do you think, @ibarwick ?

Notes about char_to_uuid, sample errors:

  • SQL Error [335544606] [42000]: expression evaluation not supported; Human readable UUID argument for CHAR_TO_UUID must be of exact length 36 [SQLState:42000, ISC error code:335544606]
  • SQL Error [335544606] [42000]: expression evaluation not supported; Human readable UUID argument for CHAR_TO_UUID must have hex digit at position 5 instead of "- (ASCII 45)" [SQLState:42000, ISC error code:335544606]

Hence this function needs stable input length (no {}) and only one scheme of - in text (8-4-4-4-12).

P. S. What about my PR #38 ? Your documentation was selected as something like reference implementation for most of FDWs from https://github.com/pgspider. This PR is my feedback, because I think there is some usefully traditions for this README.md in https://github.com/pgspider documentation.

@mkgrgis
Copy link
Contributor Author

mkgrgis commented Nov 21, 2023

Ping, @ibarwick ! My PR with UUID support in SQLite-FDW was merged. Thanks for inspiration! How can I help you here? Maybe test something about text transformation, see previous message?

@mkgrgis
Copy link
Contributor Author

mkgrgis commented Apr 27, 2024

Ping, @ibarwick . After pgspider/sqlite_fdw@a272452 in sqlite_fdw there is full and unified support of both text and binary UUIDs against all supported PostgreSQL versions. How can I help you with with adopting of this code in Firebird FDW context?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants