Issue with Input file size and chunk size #6

Open
pbjarterot opened this issue Jan 19, 2023 · 2 comments

pbjarterot commented Jan 19, 2023

Hi,
I've been trying to use build/compress with an example file I generated in Python, as shown below. I keep getting the error

"Input file size is not a multiple of the chunk size"

I've tried many different values for -n, but none seem to work. The last attempt was

build/compress -i example_file.txt -o compressed_example_file.txt -n 512 1

Do you have a solution for this?
Also, would it be possible to provide a very simple example file that shows what the input should look like, together with the matching command? That would help people who are new to C++ tools, like myself.

import numpy as np

a = np.arange(1, 1024000, 1, dtype=float)

# Writes the values as comma-separated text
with open("example_file.txt", "w") as f:
    for i in a:
        f.write(f"{i}, ")
Collaborator

fknorr commented Jan 19, 2023

Hi @pbjarterot, the input is expected to be a file of binary floating-point values, with 4 bytes per float for the default single-precision setting and 8 bytes per float for -t double.
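To illustrate the difference between the comma-separated text from the original post and the binary layout described here, the following is a minimal sketch. The helper names are hypothetical and only NumPy is assumed; it compares how many bytes a single value occupies in each representation.

```python
import numpy as np

def text_bytes(value):
    # Bytes the original script writes for one value, e.g. "1.0, "
    return f"{float(value)}, ".encode("ascii")

def binary_bytes(value, dtype=np.float32):
    # Raw binary representation of one value, with no separator
    return np.asarray(value, dtype=dtype).tobytes()

print(len(text_bytes(1.0)))                # 5 bytes for "1.0, "
print(len(binary_bytes(1.0)))              # 4 bytes (single precision)
print(len(binary_bytes(1.0, np.float64)))  # 8 bytes (double precision)
```

The text form has a variable width per value, which is why a text file will generally not have the size a binary reader expects.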

As I understand the NumPy documentation, you should be able to write the desired format with ndarray.tofile().
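A minimal sketch of that suggestion, assuming single precision is wanted: the helper name write_raw_floats and the file name example_file.bin are made up for illustration. It uses np.arange(1024000) so that the file holds exactly 1,024,000 values.

```python
import os
import numpy as np

def write_raw_floats(path, values, dtype=np.float32):
    """Write values as headerless binary floats and return the element count."""
    arr = np.asarray(values, dtype=dtype)
    arr.tofile(path)  # the default sep="" writes raw binary, not text
    return arr.size

count = write_raw_floats("example_file.bin", np.arange(1024000))
print(count, os.path.getsize("example_file.bin"))  # 1024000 4096000
```

For the -t double setting, passing dtype=np.float64 should produce 8 bytes per value instead.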

-n takes one, two or three arguments for a one, two or three-dimensional input array respectively. For the file resulting from your array a this would be -n 1024000.
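As a rough way to check a file against the -n arguments before running the tool, the sketch below computes the size a dense array would occupy. It assumes the file must contain exactly one value per array element; the exact validation the tool performs is not shown in this thread, and the function names are hypothetical.

```python
import math
import numpy as np

def expected_file_size(extents, dtype=np.float32):
    """Bytes needed for a dense array with the given -n extents."""
    return math.prod(extents) * np.dtype(dtype).itemsize

def extents_match(file_size, extents, dtype=np.float32):
    # True if the file holds exactly one value per array element
    return file_size == expected_file_size(extents, dtype)

print(expected_file_size([1024000]))                 # 4096000
print(expected_file_size([1000, 1024], np.float64))  # 8192000
print(len(np.arange(1, 1024000, 1, dtype=float)))    # 1023999
```

Note that np.arange(1, 1024000, 1) excludes the end point and starts at 1, so it yields 1,023,999 values; np.arange(1024000) gives 1,024,000.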

Hope that helps.

@pbjarterot
Author

Thank you, it works now. However, when I change the values for -n, I still get a ratio of 1.0 every time. Is this due to the type of data or to one of the input arguments?
