Implement Kafka scanner #38

dgoldenberg1234 · 2016-04-05T23:33:28Z

This ticket would add a scanner implementation that reads documents from a Kafka topic as a consumer. When documents are large, the item read from the topic is expected to be a pointer, with the FetchUrl processor subsequently used to obtain the content. We should also support including a content hash in the item read from Kafka, since the scanner cannot inspect the bytes of a document that is fetched further down the pipeline when deciding whether to process it.
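To make the content-hash idea concrete, here is a rough sketch of the skip decision only. It is not the project's API and does not touch a Kafka client: the JSON record layout (`id`, `url`, `hash`), the function names `parse_pointer` and `should_process`, and the in-memory `seen` map are all assumptions made up for illustration.

```python
import json
from typing import Dict


def parse_pointer(raw: bytes) -> Dict[str, str]:
    """Decode a hypothetical pointer record: {"id": ..., "url": ..., "hash": ...}.

    The field names are assumptions; the ticket does not define a record format.
    """
    record = json.loads(raw.decode("utf-8"))
    return {
        "id": record["id"],
        "url": record["url"],
        "hash": record.get("hash", ""),  # the hash is optional in this sketch
    }


def should_process(pointer: Dict[str, str], seen: Dict[str, str]) -> bool:
    """Return True when the document is new or its advertised hash has changed.

    `seen` maps document id -> last hash accepted; a real scanner would
    presumably persist this somewhere durable rather than in a dict.
    """
    advertised = pointer["hash"]
    if not advertised:
        # No hash supplied, so there is nothing to compare against.
        return True
    if seen.get(pointer["id"]) == advertised:
        return False
    seen[pointer["id"]] = advertised
    return True
```

With something like this, a record whose hash matches the last accepted one could be dropped before FetchUrl ever retrieves the content, while records without a hash would always be passed along.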

nsoft added the enhancement label Apr 19, 2016

nsoft added this to the 0.3 milestone Apr 19, 2016

nsoft modified the milestones: 0.3, 2.0 Sep 11, 2017

nsoft modified the milestones: 2.0, 1.1 Feb 21, 2023

nsoft modified the milestones: 1.1, 1.2 Apr 8, 2023

