Clickhouse string ordering and string filtering by UTF8 instead of bytes #6143
Hey @casab! Thanks for contributing! Could you please add an integration test for it?
645512c to 63b935c
…e_utf8_filter_order
@casab May I kindly ask you to test with the latest release and rebase your changes on top of it? We have migrated to a new ClickHouse client library recently. Thanks in advance!
@igorlukanin Can you approve running the workflows?
Hi @casab Could you please sync with the latest master, resolve conflicts and fix warnings/errors? Ping me whenever you need me to approve running workflows!
@KSDaemon of course, done.
@casab Hey! Thanks for the updates! Unfortunately, some lint errors still exist:
Maybe that's because of the merge... But anyway... Can you fix them, please?
…add data_type to AliasedColumn and TemplateColumn to be able to use the data type in the templates
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6143      +/-   ##
==========================================
- Coverage   83.57%   80.73%   -2.84%
==========================================
  Files         227      227
  Lines       81618    81646      +28
==========================================
- Hits        68210    65916    -2294
- Misses      13408    15730    +2322
@KSDaemon @mcheshkov Hey, I have fixed all the errors and implemented all the required parts. I would appreciate it if you could review it, especially the Rust part.
@@ -123,7 +125,7 @@ export class ClickHouseQuery extends BaseQuery {
       .join(' AND ');
   }

-  public getFieldAlias(id) {
+  public getField(id) {
Do we need to make it public? Maybe private would be better?
@@ -168,6 +188,43 @@ export class ClickHouseQuery extends BaseQuery {
     return `${fieldAlias} ${direction}`;
   }

+  public getCollation() {
And the same question here: should it be public?
    if (R.isEmpty(this.order)) {
      return '';
    }

    const collation = this.getCollation();

    const orderByString = R.pipe(
      R.map((order) => {
        let orderString = this.orderHashToString(order);
        if (collation && this.getFieldType(order) === 'string') {
          orderString = `${orderString} COLLATE '${collation}'`;
        }
        return orderString;
      }),
      R.reject(R.isNil),
      R.join(', ')
    )(this.order);
Would you mind rewriting this in plain JS instead of using Ramda?
Something like this:
    if (this.order.length === 0) {
      return '';
    }
    const collation = this.getCollation();
    const orderByString = this.order
      .map((order) => {
        let orderString = this.orderHashToString(order);
        if (collation && this.getFieldType(order) === 'string') {
          orderString = `${orderString} COLLATE '${collation}'`;
        }
        return orderString;
      })
      .filter(Boolean) // Analogue `R.reject(R.isNil)`
      .join(', ');
ClickHouse orders strings by bytes by default, and string manipulation functions such as `lower` and `upper` handle ASCII only. To overcome this limitation, ClickHouse provides the `COLLATE` keyword and the `lowerUTF8` and `upperUTF8` functions.

Check List

Description of Changes Made (if issue reference is not provided)
- Replaced the `CONCAT` SQL function with a JS template literal to avoid an unnecessary DB function call
- Used `lowerUTF8` instead of `lower` to support UTF-8-compatible search
- Added `COLLATE 'en'` when ordering by strings so that ordering is case-insensitive; ClickHouse orders by bytes by default
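To make the described changes concrete, here is a minimal sketch of how an ORDER BY clause with `COLLATE` and a case-insensitive filter with `lowerUTF8` could be assembled. The `OrderItem` shape and the `buildOrderBy` and `buildContains` helpers are hypothetical names invented for this example; they are not the actual `ClickHouseQuery` API, and lowercasing the search value in JS is an assumption about how the pattern might be prepared.

```typescript
// Hypothetical shape for illustration only; not the actual Cube types.
interface OrderItem {
  column: string;
  type: 'string' | 'number' | 'time';
  desc: boolean;
}

// Builds an ORDER BY clause, appending COLLATE only to string columns.
function buildOrderBy(order: OrderItem[], collation?: string): string {
  if (order.length === 0) {
    return '';
  }
  const parts = order.map(({ column, type, desc }) => {
    const base = `${column} ${desc ? 'DESC' : 'ASC'}`;
    return collation && type === 'string' ? `${base} COLLATE '${collation}'` : base;
  });
  return ` ORDER BY ${parts.join(', ')}`;
}

// Builds a case-insensitive "contains" condition. The LIKE pattern is
// assembled with a JS template literal instead of CONCAT in SQL, and the
// column is lowercased with lowerUTF8 so non-ASCII letters are handled.
// Note: this sketch does not escape % or _ inside the search value.
function buildContains(column: string, value: string): { sql: string; param: string } {
  return {
    sql: `lowerUTF8(${column}) LIKE ?`,
    param: `%${value.toLowerCase()}%`,
  };
}

const orderBy = buildOrderBy(
  [
    { column: 'name', type: 'string', desc: false },
    { column: 'age', type: 'number', desc: true },
  ],
  'en'
);
const condition = buildContains('name', 'ÇAY');

console.log(orderBy);   //  ORDER BY name ASC COLLATE 'en', age DESC
console.log(condition); // { sql: 'lowerUTF8(name) LIKE ?', param: '%çay%' }
```

Only string columns receive the collation here, mirroring the `getFieldType(order) === 'string'` check in the reviewed code.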