-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: add pg_trgm to project search #1309
Conversation
@aminlatifi pls review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sembrestels for the PR
Some tests are failing, please take a look and fix them if they are not in conflict with the new expected output.
Also, please add unit tests to prove the correctness of the new query.
src/resolvers/projectResolver.ts
Outdated
@@ -333,21 +333,38 @@ export class ProjectResolver { | |||
searchTerm?: string, | |||
) { | |||
if (!searchTerm) return query; | |||
const SIMILARITY_THRESHOLD = 0.4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While having a default value for threshold, please allow overriding it by an entry in configuration.
src/resolvers/projectResolver.ts
Outdated
}) | ||
.orWhere('project.impactLocation ILIKE :searchTerm', { | ||
qb.where( | ||
'SIMILARITY(project.title, :searchTerm) > :similarityThreshold', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To do efficient searching, it's necessary to index appropriately index target fields. Hopefully, these target fields are not updated frequently, and indexing them won't put a substantial load on the DB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given this explanation:
- If you're looking for approximate matches between strings and don't necessarily care about word boundaries, similarity might be sufficient.
- If you want to consider individual words within the strings and prefer a more granular similarity calculation, word_similarity could be a better option.
- If you want to ensure that matches occur within complete word boundaries and avoid partial word matches, strict_word_similarity might be more appropriate.
It seems work_similarity function would be a better fit for our use case, @sembrestels wdyt?
…mpact-graph into feat/project-search-improvement
The PR has a conflict with the staging branch, please resolve it. |
src/resolvers/projectResolver.ts
Outdated
@@ -331,23 +331,40 @@ export class ProjectResolver { | |||
static addSearchQuery( | |||
query: SelectQueryBuilder<Project>, | |||
searchTerm?: string, | |||
similarityThreshold = 0.4, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still hardcoded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant something like this:
Number(config.get('NUMBER_OF_UPDATE_RECURRING_DONATION_CONCURRENT_JOB')) || 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to pass it as a function param, so if you want to get it from config as above, it would be better to resolve it once globally somewhere at the top
src/resolvers/projectResolver.ts
Outdated
@@ -331,23 +331,40 @@ export class ProjectResolver { | |||
static addSearchQuery( | |||
query: SelectQueryBuilder<Project>, | |||
searchTerm?: string, | |||
similarityThreshold = 0.4, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant something like this:
Number(config.get('NUMBER_OF_UPDATE_RECURRING_DONATION_CONCURRENT_JOB')) || 1; |
This PR updates the project search query to use Postgres' pg_trm extension using SIMILARITY instead of ILIKE to improve the search quality.