When validation of a warc.gz file fails with an error, I can't see the detailed error message:
e.g.
/home/prod/warc validate ./470779-431-20241101123640123-00000-kb-prod-har-013.kb.dk.warc.gz
Total time: 22.941703205s, files: 1, records: 25546, processed: 25546, errors: 1, duplicates: 0
Is it not possible to show the details of the error?
When I use jwat, I get the following:
Summary of '/home/release_software_dist/PROD/har-013/470779-431-20241101123640123-00000-kb-prod-har-013.kb.dk.warc.gz'
GZip.Warnings: 0
Warc.isValid: true
Warc.Records: 25546
Warc.Errors: 0
Warc.Warnings: 0
Job summary
GZip files: 0
Arc files: 0
Warc files: 0
Errors: 0
Warnings: 0
RuntimeErr: 1
Skipped: 0
Time: 00:00:37 (37507 ms.)
TotalBytes: 91.8 mb
AvgBytes: 2.4 mb/s
Exception while processing '/home/release_software_dist/PROD/har-013/470779-431-20241101123640123-00000-kb-prod-har-013.kb.dk.warc.gz'
StartOffset: 96236201 (0x5bc72a9)
Offset: 96280576 (0x5bd2000)
java.io.IOException: java.util.zip.DataFormatException: Data missing!
at org.jwat.gzip.GzipReader$GzipEntryInputStream.read(GzipReader.java:645)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.PushbackInputStream.read(PushbackInputStream.java:186)
at org.jwat.common.ByteCountingPushBackInputStream.read(ByteCountingPushBackInputStream.java:124)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.PushbackInputStream.read(PushbackInputStream.java:186)
at org.jwat.common.ByteCountingPushBackInputStream.read(ByteCountingPushBackInputStream.java:124)
at org.jwat.common.FixedLengthInputStream.read(FixedLengthInputStream.java:103)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.PushbackInputStream.read(PushbackInputStream.java:186)
at org.jwat.common.ByteCountingPushBackInputStream.read(ByteCountingPushBackInputStream.java:124)
at java.security.DigestInputStream.read(DigestInputStream.java:161)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.PushbackInputStream.read(PushbackInputStream.java:186)
at org.jwat.common.ByteCountingPushBackInputStream.read(ByteCountingPushBackInputStream.java:124)
at org.jwat.common.ByteCountingPushBackInputStream.read(ByteCountingPushBackInputStream.java:119)
at org.jwat.archive.ManagedPayload.manageRecord(ManagedPayload.java:254)
at org.jwat.archive.ManagedPayload.manageWarcRecord(ManagedPayload.java:219)
at org.jwat.tools.tasks.test.TestFile2.apcWarcRecordStart(TestFile2.java:199)
at org.jwat.archive.ArchiveParser.parse(ArchiveParser.java:154)
at org.jwat.tools.tasks.test.TestFile2.processFile(TestFile2.java:58)
at org.jwat.tools.tasks.test.TestTask$TaskRunnable.run(TestTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.util.zip.DataFormatException: Data missing!
at org.jwat.gzip.GzipReader.readInflated(GzipReader.java:566)
at org.jwat.gzip.GzipReader$GzipEntryInputStream.read(GzipReader.java:641)
... 33 more
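To cross-check the offsets jwat reports independently of both tools, the gzip members can be walked one by one. The following is only a minimal sketch, assuming the file is a plain multi-member gzip; the function name `first_bad_member` is hypothetical and not part of warc or jwat, and it uses nothing beyond Python's standard `zlib` module.

```python
import zlib


def first_bad_member(path, chunk_size=1 << 16):
    """Return (member_start_offset, message) for the first gzip member that
    cannot be decompressed, or None if every member is readable.

    Hypothetical helper for illustration; not part of warc or jwat.
    """
    with open(path, "rb") as fh:
        member_start = 0  # file offset where the current member begins
        buf = fh.read(chunk_size)
        while buf:
            # wbits=31 tells zlib to expect a gzip header and trailer.
            inflater = zlib.decompressobj(wbits=31)
            consumed = 0
            while True:
                try:
                    inflater.decompress(buf)  # output is discarded
                except zlib.error as exc:
                    return member_start, str(exc)
                if inflater.eof:
                    # Bytes after the end of this member belong to the next one.
                    consumed += len(buf) - len(inflater.unused_data)
                    buf = inflater.unused_data
                    break
                consumed += len(buf)
                buf = fh.read(chunk_size)
                if not buf:
                    return member_start, "truncated gzip member (unexpected end of file)"
            member_start += consumed
            if not buf:
                buf = fh.read(chunk_size)
    return None


if __name__ == "__main__":
    import sys

    print(first_bad_member(sys.argv[1]))
```

Run with the path to the warc.gz file as the only argument. If the returned offset matches the StartOffset that jwat prints, that would suggest the problem is in the gzip layer of that record rather than in the WARC record itself.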