Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sheets not found due to sheets being referenced by absolute path. #171

Closed
luke-clifton opened this issue Jan 12, 2024 · 4 comments
Closed

Comments

@luke-clifton
Copy link
Contributor

I have an xlsx file that won't parse because:

*** Exception: MissingFile "/xl/worksheets/sheet.xml"
CallStack (from HasCallStack):
  error, called at src/Codec/Xlsx/Parser.hs:94:22 in xlsx-1.1.1-054dc7745149d1a0a8c7f44a692517d728629241f887be768ba785b6fb33905d:Codec.Xlsx.Parser

That file is in the archive, but it seems that findEntryByPath returns Nothing with the leading slash.

file <- note (MissingFile filePath) $ Zip.fromEntry <$> Zip.findEntryByPath filePath ar

@qrilka
Copy link
Owner

qrilka commented Jan 12, 2024

Hi @luke-clifton could you attach a test file maybe? I'd try to find some time over the weekend

@luke-clifton
Copy link
Contributor Author

If I make the below change, it can parse my file. (I've added the dropWhile (== '/') in Codec.Xlsx.Parser

extractSheet ar sst contentTypes caches wf = do
  let filePath = dropWhile (== '/') $ wfPath wf
  file <- note (MissingFile filePath) $ Zip.fromEntry <$> Zip.findEntryByPath filePath ar

The file is the Profile.xlsx found in the Garmin FIT SDK

I've attached it here as well.

Profile.xlsx

@qrilka
Copy link
Owner

qrilka commented Jan 12, 2024

@luke-clifton I've checked the spec and according to the section "7.3.4 Mapping logical item names to ZIP item names" of Part 2 “Open Packaging Conventions” at https://ecma-international.org/publications-and-standards/standards/ecma-376/ we need to

Remove the leading forward slash (“/”) from the logical item name or, in the case of interleaved
parts, from each of the logical item names within the complete sequence.

I guess the part of interleaved parts is not applicable to xlsx (at least I don't remember such a thing)
Would you like to create a PR stripping any leading "/" in

lookupRelPath fp rels rId =
and with a comment why we do this?

@qrilka
Copy link
Owner

qrilka commented Jan 14, 2024

Fixed in #172

@qrilka qrilka closed this as completed Jan 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants