-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove CqlType in favor of ColumnType #1166
Conversation
4d30a3c
to
6a9be35
Compare
See the following report for details: cargo semver-checks output
|
b905b51
to
1eee540
Compare
1eee540
to
f629f77
Compare
Rebased on main |
Again known failure with "Failed to apply group 0 change due to concurrent modification"... If Core is not willing to address this problem then maybe we should introduce client-side retries for those queries? |
scylla/src/cluster/metadata.rs
Outdated
/// matches the datatype used by the java driver: | ||
/// <https://github.com/apache/cassandra-java-driver/blob/85bb4065098b887d2dda26eb14423ce4fc687045/core/src/main/java/com/datastax/oss/driver/api/core/type/DataTypes.java#L77> | ||
dimensions: i32, | ||
dimensions: u16, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about (u16::MAX, i32::MAX]
values? I know that such values are impractical and probably will never appear there, but according to the specification it is possible, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which specification?
- CQLv5 binary protocol only mentions vectors once:
5.25 vector
For a vector of n dimensions of a fixed-length type, a sequence of those n elements.
For a vector with variable-length elements, the size of the elements will preced
each element. Each element is the [bytes] representing the serialized value. The
number of dimensions is not encoded, since it's part of the type definition.
There is no defined binary type encoding for vectors (which would give is the type to use for dimensions) - it is encoded as text (custom type) which is why #1165 needs to do so many changes.
- Cassandra docs say that the maximum amount of elements in a vector is 8K (2^13) which fits in u16: https://cassandra.apache.org/doc/latest/cassandra/reference/vector-data-type.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Cassandra docs say that the maximum amount of elements in a vector is 8K (2^13) which fits in u16: https://cassandra.apache.org/doc/latest/cassandra/reference/vector-data-type.html
Ok, this convinces me. I based my assumption on the code snippet (https://github.com/apache/cassandra-java-driver/blob/85bb4065098b887d2dda26eb14423ce4fc687045/core/src/main/java/com/datastax/oss/driver/api/core/type/DataTypes.java#L77). I forgot that Java does not have unsigned integral data types... But then, I believe that short
data type in Java should be enough to store a value up to 8k. I wonder why they chose int
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think they pay much attention to such things. If they did then CQL protocol wouldn't have such obvious issues unfixed for as long as it does now.
scylla/src/client/session_test.rs
Outdated
@@ -6,7 +6,7 @@ use crate as scylla; | |||
use crate::batch::{Batch, BatchStatement}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commit Remove CqlType and related types
- I expected this commit to contain a lot of changes. To my surprise, there are very few of them. Great job of splitting this PR into the commits - thanks to that the final replacement was trivial.
f629f77
to
84f856e
Compare
Adrressed the comments |
This sounds like a good workaround. |
84f856e
to
6a7db44
Compare
Addressed all the review comments. I'll rebase on main in a separate push. |
As part of an effort to unify CqlType with ColumnType we need to make those types compatible enough. First part of this is extracting simple types from ColumnType into their own enum.
We don't really support this type, and this removes is required for unification of CqlType and ColumnType.
As part of an effort to unify CqlType with ColumnType we need to make those types compatible enough. This commit extracts collection types into their own sub-type.
As part of an effort to unify CqlType with ColumnType we need to make those types compatible enough. CqlType stored UserDefinedType as a separate struct under Arc, in order to share it between different metadata structures. This commit does the same for ColumnType. Note that the field for UDT name is now "name" instead of the "type_name", because the old one was a bit redundant.
6a7db44
to
47d1456
Compare
|
This is done to keep this field name the same as in ColumnType.
According to Cassandra documentation maximum size of the vector is 8K, which fits fine in u16. Using unsigned type will be more pleasant to users, because they won't need to deal with negative values.
For now without any capability for serialization or deserialization. This is necessary for unification of CqlType and ColumnType.
This is the last step towards making CqlType and ColumnType (mostly) identical. After this change the only difference is that ColumnType may use borrowed data, and thus has a lifetime parameter.
This will make it easier to remove CqlType completely.
Removed types: CqlType, CollectionType, NativeType. Their usages are replaced with the relevant types from scylla_cql. Fixes: scylladb#691
This is more consistent after removing CqlType in favor of ColumnType.
It is less ugly, and matches the convention in other parts of the driver.
It is less ugly, and matches the convention in other parts of the driver.
47d1456
to
79c74f6
Compare
Fixed CI errors. I'll run |
It is less ugly, and matches the convention in other parts of the driver.
It is less ugly, and matches the convention in other parts of the driver.
It is less ugly, and matches the convention in other parts of the driver.
It is less ugly, and matches the convention in other parts of the driver.
Those were the only remaining occurrences of `type_` identifier in the driver.
79c74f6
to
59091a0
Compare
@@ -1269,109 +1342,116 @@ pub fn deser_cql_value( | |||
let v = Some(FrameSlice::new_borrowed(buf)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
📌 When removing the legacy deserialization framework, please check if it's possible to remove the workaround in FrameSlice
: the new_borrowed
constructor.
59091a0
to
cfeb8b9
Compare
This PR removes
CqlType
(used in schema metadata) and replaces its usages withColumnType
(used in statement metadata).Initially those two structures were quite different, so most of the commits are done to make them mostly identical, which made the unification possible.
Layout
CqlType was split into subtypes (native type, collection type, tuple, UDT) as defined by https://cassandra.apache.org/doc/stable/cassandra/cql/types.html
ColumnType was fully inlined - it had a variant for each possible type.
I decided to change ColumnType in this regard to follow CqlType, because CqlType's approach seems to be much more elegant.
Frozen flag
CqlType contained frozen flag, because the schema contains information about whether the collection/UDT is frozen or not.
ColumnType did not contain such flag, because this information is not present in result metadata / arguments metadata.
There wasn't any real choice - we have to store the frozen flag to accurately represent the schema metadata.
For types used in statements, this flag is simply always set to false.
UDT shared ownership
ColumnType stored the UDT data inline, this is how the variant looked:
CqlType stored it in a separate struct, which was held under Arc:
This is because it may be stored in a few places:
Keyspace::user_defined_types
This requires either cloning, or shared ownership.
Custom type
ColumnType had a variant for Custom type, because it is defined by the protocol.
However, this type is rarely used, and we don't really have any working support for it, so I decided to remove it.
Minor changes
I decided to change the size type of Vector to u16. It frees any users of this struct from dealing with negative numbers (which are disallowed by the protocol, and make no sense here).
Fixes #691
Pre-review checklist
I added relevant tests for new features and bug fixes.I have provided docstrings for the public items that I want to introduce.I have adjusted the documentation in- Nothing to adjust./docs/source/
.Fixes:
annotations to PR description.