-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CAPI-571 Fix SolrJ Convert Child Docs Correctly #62
base: bw_branch_8_11_2
Are you sure you want to change the base?
CAPI-571 Fix SolrJ Convert Child Docs Correctly #62
Conversation
The current solrj Document DTO to SolrInputDocument binder treat all child docs as anonymous rather than prefix them with the given path. This will result in the document being indexed without the `_nest_path_` field populated which is required for the ChildDocumentTransformer (`fl=[child]`) to work. In order to fix this if the Document DTO contain any fields which are also a DTO or a List/Array of DTOs, they should be converted as SolrInputDocuments on its own and nested as sub document(s) within the parent document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something that's fixed in later versions (e.g. 9)?
Or are we meant to submit the child docs as their own objects - at the same time as the parent?
Or should our calling code just convert from the DTOs to |
No its still a bug that exist on the HEAD - this class hasnt seen any changes for last couple of years, even that its just a formatting change. Probably this is a very underused code path in Solr, hence hadnt had much attention and care.. And yes if we want child docs to be indexed with the correct nested fields (and we do!), we need to submit child docs as their on SolrInputDocs embeded within the parent.. |
I think we should either make binder to convert DTO and generate correct/valid SolrInputDocument for everything OR we convert everything manually without the DTO binder. Ideally not mix and match and get DTO binder to convert some and we convert others to SolrInputDocs manually. From performance POV yes it could be seen as just an extra intermediary step we could get rid, but dont think its a bottleneck as such as.. I guess biggest time sink is the indexing time as oppose to these conversions.. |
Option 2 to address this issue: #63 - this feels a better non breaking approach than this PR.. |
fwiw we do have a direct Avro -> SolrInputDocument converter, auto-generated in our code.. |
The current solrj Document DTO to SolrInputDocument binder treat all child docs as anonymous rather than prefix them with the given path. This will result in the document being indexed without the
_nest_path_
field populated which is required for the ChildDocumentTransformer (fl=[child]
) to work.In order to fix this if the Document DTO contain any fields which are also a DTO or a List/Array of DTOs, they should be converted as SolrInputDocuments on its own and nested as sub document(s) within the parent document.