Segfault in Read5 #40

Closed
vadimkantorov opened this issue Jul 7, 2016 · 30 comments

@vadimkantorov

vadimkantorov commented Jul 7, 2016

Hi, I'm reading a MATLAB file with the Torch wrapper for matio and getting a segfault at https://github.com/tbeu/matio/blob/master/src/mat5.c#L4331.

Program received signal SIGSEGV, Segmentation fault.
Read5 (mat=0x722ef0, matvar=<optimized out>) at mat5.c:4331
4331                    fields[i]->internal->fp = mat;
(gdb) bt
#0  Read5 (mat=0x722ef0, matvar=<optimized out>) at mat5.c:4331
#1  0x00007fffa425d3a0 in Read5 (mat=0x722ef0, matvar=<optimized out>) at mat5.c:4332
#2  0x00007fffa425d3a0 in Read5 (mat=0x722ef0, matvar=<optimized out>) at mat5.c:4332
#3  0x00007fffa4268714 in Mat_VarRead (mat=0x722ef0, name=<optimized out>) at mat.c:1987
#4  0x000000000048ce29 in lj_vm_ffi_call ()
#5  0x000000000045d4e0 in lj_ccall_func ()
#6  0x000000000045f7b6 in lj_cf_ffi_meta___call ()
#7  0x000000000048ae6a in lj_BC_FUNCC ()
#8  0x000000000047a6dd in lua_pcall ()
#9  0x000000000041131f in pmain ()
#10 0x000000000048ae6a in lj_BC_FUNCC ()
#11 0x000000000047a757 in lua_cpcall ()
#12 0x000000000040f234 in main ()

The file I'm loading is a few hundred megabytes, but I could provide it if needed. Thanks!

@tbeu
Owner

tbeu commented Jul 7, 2016

Was the MAT file created using MATLAB? Which version of matio did you use?
In order to reproduce and debug the problem I'd need the MAT file causing the segfault. It would be nice if you could reduce its size to a minimum. Thanks for your report.

@tbeu
Owner

tbeu commented Jul 9, 2016

Any file to share?

@vadimkantorov
Author

vadimkantorov commented Jul 11, 2016

Sorry for the delay. You can get the file (350 MB) from my OneDrive: https://1drv.ms/u/s!Apx8USiTtrYmoJRncwpfATtnCZLvuA

Yes, the file was created in MATLAB. I tried reinstalling the latest matio, but I get the same error.

@tbeu
Owner

tbeu commented Jul 11, 2016

Thanks. Can reproduce and will investigate it. Do you have some MCOS objects inside the struct? It looks like it when watching the binary stream but I did not find it when opening it in MATLAB R14SP3.

@vadimkantorov
Author

vadimkantorov commented Jul 11, 2016

Don't know what MCOS objects are. I produced this file running R2015a with code from this repo: https://github.com/hbilen/WSDDN/blob/master/scripts/prepare_wsddn.m#L18

The produced saved model is fine if I run this code as is and fails if I remove drop6 and drop7 from the list (the objects from the list get saved in the produced file in the end).

That's the code for producing this part of the object to be saved: https://github.com/vlfeat/matconvnet/blob/91399d47fcfcd06836f30ee3d88fcc7116ae40e0/matlab/%2Bdagnn/%40DagNN/saveobj.m

@tbeu
Owner

tbeu commented Jul 11, 2016

Yes, that is exactly what I mean by MCOS (MATLAB Class Object System). These class objects are currently not supported.

@vadimkantorov
Author

My understanding is that matconvnet's code converts the DagNN object to a plain struct before saving:
https://github.com/vlfeat/matconvnet/blob/91399d47fcfcd06836f30ee3d88fcc7116ae40e0/matlab/%2Bdagnn/%40DagNN/saveobj.m#L3

matio reads the saved file OK when it's saved with '-v6'.

@tbeu
Owner

tbeu commented Jul 11, 2016

But

$ ./matdump -f whos bug_matiov6.mat
Name                       Size           Bytes          Class

net                        1x1           515410578  mxSTRUCT_CLASS
stats                      0x0                   0  mxDOUBLE_CLASS
                           1x88                 88  mxUINT8_CLASS

gives a strange trailing 88-byte variable for the MCOS.
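
For readers who want to see where such an unnamed trailing variable sits in the file, here is a minimal Python sketch that walks the top-level data elements of an uncompressed (`-v6`) MAT-file. It relies only on the documented level 5 layout (128-byte header, then elements with an 8-byte tag). The names `list_elements` and `subsystem_offset` are made up for this illustration and are not part of matio. Whether the 88-byte variable is what the header's subsystem data offset points at is an assumption to check, not something this thread confirms.

```python
import struct

HEADER_SIZE = 128  # fixed-size MAT-file level 5 header
MI_MATRIX = 14     # data type tag of an array element


def _endian(raw: bytes) -> str:
    # Offsets 126..127 hold the endian indicator; b"IM" means little-endian.
    return "<" if raw[126:128] == b"IM" else ">"


def subsystem_offset(raw: bytes) -> int:
    """Return the 8-byte subsystem data offset stored at header offset 116."""
    (offset,) = struct.unpack_from(_endian(raw) + "Q", raw, 116)
    return offset


def list_elements(raw: bytes):
    """Return (file offset, data type, byte count, array class) per element.

    Assumes an uncompressed file whose top-level elements follow each
    other without extra padding.
    """
    endian = _endian(raw)
    rows = []
    pos = HEADER_SIZE
    while pos + 8 <= len(raw):
        dtype, nbytes = struct.unpack_from(endian + "II", raw, pos)
        array_class = None
        if dtype == MI_MATRIX and nbytes >= 16:
            # The first subelement is Array Flags: an 8-byte tag followed
            # by a word whose lowest byte is the array class.
            (flags_word,) = struct.unpack_from(endian + "I", raw, pos + 16)
            array_class = flags_word & 0xFF
        rows.append((pos, dtype, nbytes, array_class))
        pos += 8 + nbytes
    return rows
```

Running `list_elements(open("bug_matiov6.mat", "rb").read())` and comparing the last row's offset with `subsystem_offset(...)` would show whether the two coincide for this file.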

@tbeu
Owner

tbeu commented Jul 11, 2016

Anyway, I will investigate further. I have done this a few times before. It is a tedious byte-by-byte comparison of the zlib-inflated v7 MAT-file and the v6 MAT-file, which usually hints at where matio's decompression is faulty.
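
The comparison described above can be approximated with a short Python sketch. It inflates every top-level compressed element of a v7 file and reports the first byte offset at which the result differs from a v6 file. This is not the procedure actually used in this thread, only an illustration. `inflate_mat5` and `first_difference` are hypothetical helper names, and the sketch assumes compressed elements follow each other without padding.

```python
import struct
import zlib

HEADER_SIZE = 128   # fixed-size MAT-file level 5 header
MI_COMPRESSED = 15  # data type tag of a zlib-compressed element


def inflate_mat5(raw: bytes) -> bytes:
    """Return the file with every top-level compressed element inflated."""
    # Offsets 126..127 hold the endian indicator; b"IM" means little-endian.
    endian = "<" if raw[126:128] == b"IM" else ">"
    out = bytearray(raw[:HEADER_SIZE])
    pos = HEADER_SIZE
    while pos + 8 <= len(raw):
        dtype, nbytes = struct.unpack_from(endian + "II", raw, pos)
        if dtype == MI_COMPRESSED:
            # The payload inflates to one complete inner element, tag included.
            out += zlib.decompress(raw[pos + 8 : pos + 8 + nbytes])
        else:
            out += raw[pos : pos + 8 + nbytes]
        pos += 8 + nbytes
    return bytes(out)


def first_difference(a: bytes, b: bytes, start: int = HEADER_SIZE):
    """Offset of the first differing byte at or after start, else None."""
    shared = min(len(a), len(b))
    for i in range(start, shared):
        if a[i] != b[i]:
            return i
    # Same content up to the shorter length: differ only if lengths differ.
    return None if len(a) == len(b) else shared
```

The comparison starts after the header by default because the header text holds a creation timestamp and will normally differ between two saves. As noted later in the thread, the two files need not be bitwise identical even when both are valid, so a reported offset is a starting point for inspection and not proof of a bug.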

@vadimkantorov
Author

vadimkantorov commented Jul 11, 2016

Would you like me to upload the working v6 file? (though I'm not sure it would be bitwise equivalent, the weights in the model could be slightly different)

@tbeu
Owner

tbeu commented Jul 11, 2016

Thanks. I already created it myself using old MATLAB R14SP3.

load bug_matio.mat
save bug_matiov6.mat net stats -v6

@tbeu
Owner

tbeu commented Jul 11, 2016

The/one error is reproducible if only net.layers is saved. This reduces the file size significantly.

@vadimkantorov
Author

@tbeu
Owner

tbeu commented Jul 11, 2016

Oops, there is class(block). Not sure if this is supported. Does not look like plain vanilla.

@vadimkantorov
Author

"ClassName = class(object) returns a string specifying the class of object." (http://fr.mathworks.com/help/matlab/ref/class.html)

And it's indeed a byte vector when I read the file saved with -v6...

@tbeu
Owner

tbeu commented Jul 11, 2016

Right. Nothing to blame.

@tbeu
Owner

tbeu commented Jul 11, 2016

net.layers(1,22).block.levels is strange. What type is it?

@vadimkantorov
Author

It should be just an integer.


@tbeu
Owner

tbeu commented Jul 11, 2016

Well, it looks like an empty field.

@vadimkantorov
Author

When I load the buggy file, it's somehow a gpuArray:

class(a.net.layers(1, 22).block.levels)

ans =

gpuArray

@tbeu
Owner

tbeu commented Jul 11, 2016

Yes, confirmed. Where is it saved (in m source)?

@tbeu
Owner

tbeu commented Jul 11, 2016

Both the v6 and the v7 MAT-file have MCOS ... gpuArray in the binary stream there.

@vadimkantorov
Author

Yep, what's happening is that the model contains a custom layer (an MCOS object) that gets serialized at the block.save() call https://github.com/vlfeat/matconvnet/blob/91399d47fcfcd06836f30ee3d88fcc7116ae40e0/matlab/%2Bdagnn/%40DagNN/saveobj.m#L42

What's surprising is that if I keep the drop6 and drop7 objects in the graph, then both the v6 and v7 versions produce a readable file.

@tbeu
Owner

tbeu commented Jul 11, 2016

At least it should not crash on such data. Going to try to handle this exception.
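
As a language-neutral illustration of what "should not crash on such data" can mean, here is a minimal Python sketch of the defensive pattern: a field whose class cannot be decoded is skipped and never dereferenced. This is not matio's code and not the actual fix. `Var`, `read_field_header` and `attach_fields` are hypothetical names, and the class codes are placeholders chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Subset of class codes this toy reader understands (illustration only).
KNOWN_CLASSES = {2: "struct", 6: "double", 9: "uint8"}


@dataclass
class Var:
    name: str
    class_code: int
    fields: List["Var"] = field(default_factory=list)
    source: Optional[object] = None  # stands in for a back-pointer to the file


def read_field_header(name: str, class_code: int) -> Optional[Var]:
    """Hypothetical parser step: None means the class cannot be decoded."""
    if class_code not in KNOWN_CLASSES:
        return None
    return Var(name, class_code)


def attach_fields(parent: Var, raw_fields, source) -> List[str]:
    """Attach readable fields to parent and return the names skipped."""
    skipped = []
    for name, class_code in raw_fields:
        child = read_field_header(name, class_code)
        if child is None:
            # Check before touching the child; an unchecked access here is
            # the kind of step that would fail on an unreadable field.
            skipped.append(name)
            continue
        child.source = source
        parent.fields.append(child)
    return skipped
```

For example, `attach_fields(Var("block", 2), [("levels", 17), ("scale", 6)], source)` keeps `scale` and reports `levels` as skipped, with 17 standing in for an unknown class code.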

@vadimkantorov
Author

Thanks!

By the way, can this github project be considered as the official matio home?

@tbeu
Owner

tbeu commented Jul 11, 2016

This is my mirror of the official matio from sf.net. Once I file a new release I push to the sf.net repo and prepare the release there.

tbeu added a commit that referenced this issue Jul 11, 2016
…e class

* Skip unknown/undocumented opaque class
* As reported by #40
@tbeu
Owner

tbeu commented Jul 11, 2016

Could you test whether 3b6dc8f works for you? The unknown MCOS class is now skipped.

@vadimkantorov
Author

I'll check it later tonight!

@vadimkantorov
Author

The file doesn't break anymore, LGTM.

@tbeu
Owner

tbeu commented Oct 19, 2016

Since reading works, I am going to close this issue and move the existing writing problems to #47.

@tbeu tbeu closed this as completed Oct 19, 2016
papadop pushed a commit to papadop/matio that referenced this issue Nov 29, 2017
…e class

* Skip unknown/undocumented opaque class
* As reported by tbeu#40