First findings
It seems that DuckDB might be able to read multiple Parquet files concurrently -- but not one file concurrently.
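One way to sanity-check this empirically (a minimal sketch using DuckDB's C++ client; the file paths, thread count, and data layout are placeholders) is to compare the EXPLAIN ANALYZE profiles of a multi-file glob against a single file:

```cpp
#include "duckdb.hpp"

int main() {
	duckdb::DuckDB db(nullptr); // in-memory database
	duckdb::Connection con(db);

	// Make parallelism observable by allowing several threads.
	con.Query("SET threads = 8;");

	// Multiple files: DuckDB can hand different files to different threads.
	con.Query("EXPLAIN ANALYZE SELECT count(*) FROM 'data/*.parquet'")->Print();

	// Single file: whether this parallelizes depends on how DuckDB can
	// split the file internally (e.g., across row groups).
	con.Query("EXPLAIN ANALYZE SELECT count(*) FROM 'data/part0.parquet'")->Print();
	return 0;
}
```

The operator timings in the two profiles should show whether the single-file scan is actually serialized.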
Thoughts
In theory, we could do this by running COPY FROM with exactly the same number of threads and letting each thread use the location info of its corresponding sheetreader thread.
Would it be possible to partition the Excel sheet into chunks of 2048 / (number of threads) rows and make the buffers that size (see the sketch below)? Probably tricky, because we would have to know the number of columns up front (buffer size / number of columns is the number of rows that fit into one buffer).
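As a back-of-the-envelope illustration (a hypothetical helper; `RowsPerBuffer`, `kVectorSize`, and the example parameters are invented here, with 2048 being DuckDB's default vector size):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>

// DuckDB's default vector size: one DataChunk holds up to 2048 rows.
constexpr std::size_t kVectorSize = 2048;

// Hypothetical sizing helper. Each row occupies num_columns cells, so a
// buffer of buffer_cells cells holds at most buffer_cells / num_columns
// rows -- which is why the column count must be known up front.
std::size_t RowsPerBuffer(std::size_t num_threads, std::size_t num_columns,
                          std::size_t buffer_cells) {
	std::size_t target_rows = kVectorSize / num_threads;   // per-thread share of a chunk
	std::size_t fitting_rows = buffer_cells / num_columns; // capacity limit of a buffer
	return std::min(target_rows, fitting_rows);
}

int main() {
	// Example: 4 threads, 10 columns, buffers of 8192 cells -> 512 rows.
	std::printf("%zu rows per buffer\n", RowsPerBuffer(4, 10, 8192));
	return 0;
}
```

The min() captures the constraint from the parenthetical above: the column count caps how many rows fit into one buffer, regardless of the per-thread target.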
TODO
A multi-threaded scan would be interesting, since our copy/scan function takes some time.
Have a look at https://github.com/duckdb/duckdb_delta/blob/main/src/functions/delta_scan.cpp. According to the README, it supports a multi-threaded scan. I suspect that this doesn't need any new implementation, since they are reading Parquet files.
Find out whether this is due to the Parquet files
Find out whether DuckDB also supports a multi-threaded scan of the Apache Arrow format
Have a look at how the multi-threaded scan is implemented (a sketch of the typical API shape follows this list)
Find out whether we could copy concurrently -- this might not be possible, because sheetreader-core saves the data in a special way (per thread, and some rows are split across multiple threads -- there is only an implicit order)
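For reference, this is roughly how a parallel scan is structured in DuckDB's table-function API (a minimal sketch assuming a recent DuckDB version, not sheetreader's actual code: all `Sheet*` names are invented, the row-id output is a stand-in for real cell data, and a real version would hand out ranges that respect sheetreader-core's per-thread buffers):

```cpp
#include "duckdb.hpp"
#include <atomic>

using namespace duckdb;

struct SheetBindData : public FunctionData {
	idx_t total_rows = 1000000; // would come from sheetreader-core's metadata
	unique_ptr<FunctionData> Copy() const override {
		return make_uniq<SheetBindData>(*this);
	}
	bool Equals(const FunctionData &other) const override {
		return true;
	}
};

struct SheetGlobalState : public GlobalTableFunctionState {
	std::atomic<idx_t> next_row {0};
	// DuckDB runs at most this many scan threads; a real implementation
	// could derive it from the number of sheetreader threads.
	idx_t MaxThreads() const override {
		return 4;
	}
};

struct SheetLocalState : public LocalTableFunctionState {
	idx_t row = 0, end = 0; // row range currently claimed by this thread
};

static unique_ptr<FunctionData> SheetBind(ClientContext &, TableFunctionBindInput &,
                                          vector<LogicalType> &types, vector<string> &names) {
	types.push_back(LogicalType::BIGINT);
	names.push_back("row_id");
	return make_uniq<SheetBindData>();
}

static unique_ptr<GlobalTableFunctionState> SheetInitGlobal(ClientContext &, TableFunctionInitInput &) {
	return make_uniq<SheetGlobalState>();
}

static unique_ptr<LocalTableFunctionState> SheetInitLocal(ExecutionContext &, TableFunctionInitInput &,
                                                          GlobalTableFunctionState *) {
	return make_uniq<SheetLocalState>();
}

static void SheetScan(ClientContext &, TableFunctionInput &input, DataChunk &output) {
	auto &bind = input.bind_data->Cast<SheetBindData>();
	auto &gstate = input.global_state->Cast<SheetGlobalState>();
	auto &lstate = input.local_state->Cast<SheetLocalState>();

	if (lstate.row >= lstate.end) {
		// Claim the next batch; the atomic counter gives each thread a disjoint range.
		lstate.row = gstate.next_row.fetch_add(STANDARD_VECTOR_SIZE);
		lstate.end = MinValue<idx_t>(lstate.row + STANDARD_VECTOR_SIZE, bind.total_rows);
		if (lstate.row >= bind.total_rows) {
			return; // empty chunk: this thread is done
		}
	}
	idx_t count = lstate.end - lstate.row;
	for (idx_t i = 0; i < count; i++) {
		// Stand-in for copying real cell data out of sheetreader's buffers.
		output.SetValue(0, i, Value::BIGINT(lstate.row + i));
	}
	lstate.row = lstate.end;
	output.SetCardinality(count);
}

// Registered via:
// TableFunction fn("sheet_scan", {}, SheetScan, SheetBind, SheetInitGlobal, SheetInitLocal);
```

The key point is `MaxThreads()` plus the shared global state: DuckDB calls the scan function from several threads, and each thread claims a disjoint row range -- which is exactly where sheetreader-core's implicit per-thread ordering would have to be reconciled.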