ObjectStorage without chunk-encoding support corrupts files (the action_runner log cannot be viewed) #33638

Closed
OAMchronicle opened this issue Feb 18, 2025 · 30 comments
Labels
type/bug type/upstream This is an issue in one of Gitea's dependencies and should be reported there

Comments

@OAMchronicle

Description

actions.ReadLogs, zstd NewSeekableReader: failed to parse footer [52 98 55 51 101 13 10 13 10]: footer reserved bits 25 != 0
Gitea version: 1.23.3

Gitea Version

1.23.3

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Screenshots

Image

Git Version

No response

Operating System

No response

How are you running Gitea?

use system

Database

MySQL/MariaDB

@OAMchronicle
Author

My S3 provider is seaweed

@wxiaoguang
Contributor

Might be related to Support compression for Actions logs #31761

@wolfogre
Member

Does it happen for any logs of action runs, or only this log?

It seems that the file is broken for some reasons.

@wxiaoguang
Contributor

wxiaoguang commented Feb 19, 2025

It seems to be a plain-text file? The footer bytes (from log) are 4b73e\r\n\r\n
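
For reference, the footer bytes from the error message can be decoded with a few lines of Go. This is a minimal sketch, assuming the 9-byte footer layout of the zstd seekable format (4-byte frame count, then a 1-byte descriptor whose bits 6..2 are reserved, then a 4-byte magic number). `reservedBits` is a hypothetical helper written for illustration, not Gitea code.

```go
package main

import "fmt"

// Footer bytes copied from the error message in the issue description.
var footer = []byte{52, 98, 55, 51, 101, 13, 10, 13, 10}

// reservedBits returns bits 6..2 of the descriptor byte, which is assumed
// to sit at index 4 of the 9-byte seekable-format footer.
func reservedBits(footer []byte) byte {
	return (footer[4] >> 2) & 0x1f
}

func main() {
	// The bytes are printable ASCII followed by two CRLF pairs.
	fmt.Printf("%q\n", footer) // "4b73e\r\n\r\n"
	// The descriptor byte is 'e' (101), so the reserved bits are not zero.
	fmt.Println(reservedBits(footer)) // 25
}
```

Under that assumption, the value 25 in "footer reserved bits 25 != 0" is simply the letter `e` being read as a descriptor byte, which supports the observation that the file does not end with a zstd seek table.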

@OAMchronicle
Author

OAMchronicle commented Feb 19, 2025

// file gitea/modules/actions/log_test.go
This is the test case I wrote

package actions

import (
	"io"
	"os"
	"testing"

	"code.gitea.io/gitea/modules/zstd"
)

func TestTransferLogs(t *testing.T) {
	name := "1131.log"
	f, err := os.Open(name)
	if err != nil {
		t.Fatalf("open %s: %v", name, err)
	}
	defer f.Close()

	// Compress the plain-text log through a pipe with the seekable zstd writer.
	r, w := io.Pipe()
	var reader io.Reader = r
	zstdWriter, err := zstd.NewSeekableWriter(w, logZstdBlockSize)
	if err != nil {
		t.Fatalf("zstd NewSeekableWriter: %v", err)
	}
	go func() {
		defer func() {
			_ = w.CloseWithError(zstdWriter.Close())
		}()
		if _, err := io.Copy(zstdWriter, f); err != nil {
			_ = w.CloseWithError(err)
			return
		}
	}()

	// Write the compressed stream to a local file.
	file, err := os.OpenFile("1131.log.zst", os.O_CREATE|os.O_RDWR, 0666)
	if err != nil {
		t.Fatalf("create 1131.log.zst: %v", err)
	}
	defer file.Close()
	_, err = io.Copy(file, reader)
	if err != nil {
		t.Fatalf("write 1131.log.zst: %v", err)
	}
	/*
		minioStorage, err := storage.NewMinioStorage(context.Background(),
			&setting.Storage{
				MinioConfig: setting.MinioStorageConfig{
					Endpoint:           "x.x.x.x:8333",
					AccessKeyID:        "xxx",
					SecretAccessKey:    "xxx",
					Bucket:             "gitea",
					UseSSL:             false,
					InsecureSkipVerify: true,
				},
			})
		if err != nil {
			return
		}
		_, err = minioStorage.Save("logs/1131.log.zst", reader, -1)
		if err != nil {
			return
		}
	*/
}

There is no problem when writing to a local file. The issue is only triggered when uploading to S3.

@OAMchronicle
Author

Does it happen for any logs of action runs, or only this log?

It seems that the file is broken for some reasons.

So far, I have only seen it with this one log.

@wxiaoguang
Contributor

wxiaoguang commented Feb 19, 2025

Will the S3 storage auto decompress the compressed zstd files and respond with plain text?

@OAMchronicle
Author

I don't know, but I don't think it does. The S3 service I am using is Seaweed.

@wxiaoguang
Contributor

wxiaoguang commented Feb 19, 2025

It seems to be a plain-text file? The footer bytes (from log) are 4b73e\r\n\r\n

According to your log, the response is plain-text.

Could you help to confirm Seaweed's behavior? For example: use your test code, read the raw content of logs/1131.log.zst, and dump it to see whether it is compressed or decompressed? (Or maybe the response is just incorrect? Just a guess.)

@OAMchronicle
Author

OAMchronicle commented Feb 19, 2025

Image Image

1131.log.obs.zst was uploaded to S3 directly by the test code.
1131.log.zst was saved locally by the test code and then uploaded to S3 manually.

@OAMchronicle
Author

OAMchronicle commented Feb 19, 2025

Image
func TestTransferLogs1(t *testing.T) {
	// Upload the locally compressed file through Gitea's Minio storage wrapper.
	open, err := os.Open("1131.log.zst")
	if err != nil {
		t.Fatalf("open 1131.log.zst: %v", err)
	}
	defer open.Close()
	minioStorage, err := storage.NewMinioStorage(context.Background(),
		&setting.Storage{
			MinioConfig: setting.MinioStorageConfig{
				Endpoint:           "x.x.x.x:8333",
				AccessKeyID:        "xx",
				SecretAccessKey:    "xx",
				Bucket:             "gitea",
				UseSSL:             false,
				InsecureSkipVerify: true,
			},
		})
	if err != nil {
		t.Fatalf("NewMinioStorage: %v", err)
	}
	_, err = minioStorage.Save("logs/1131.log.local.zst", open, -1)
	if err != nil {
		t.Fatalf("Save: %v", err)
	}
}

After testing, it seems that Seaweed encountered an exception during upload

@wxiaoguang
Contributor

Thank you for the feedback! I haven't really used Actions or Seaweed, so at the moment I have no further ideas.

If you could help to figure out "what's wrong" (the key point, for example: is it Gitea's bug? which call/code is wrong; or Seaweed's problem? what's the difference), then maybe there could be some solutions to "fix".

@OAMchronicle
Author

Thank you for the feedback! I haven't really used Actions or Seaweed, so at the moment I have no further ideas.

If you could help to figure out "what's wrong" (the key point, for example: is it Gitea's bug? which call/code is wrong; or Seaweed's problem? what's the difference), then maybe there could be some solutions to "fix".

When I upload with the AWS S3 SDK, the problem does not occur.

@OAMchronicle
Author

OAMchronicle commented Feb 19, 2025

//test code

package S3

import (
	"context"
	"crypto/tls"
	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/credentials"
	"github.com/aws/aws-sdk-go-v2/service/s3"
	"net/http"
	"os"
	"testing"
	"upfile/src/conf"
)

var Cli *s3.Client

func init() {
	staticResolver := aws.EndpointResolverWithOptionsFunc(func(service, region string, options ...any) (aws.Endpoint, error) {
		return aws.Endpoint{
			PartitionID:       "aws",
			URL:               "http://x.x.x.x:8333",
			SigningRegion:     region,
			HostnameImmutable: true,
		}, nil
	})

	tr := &http.Transport{
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
	}
	httpClient := &http.Client{Transport: tr}

	cfg, _ := config.LoadDefaultConfig(context.Background(),
		config.WithRegion("xx"),
		config.WithEndpointResolverWithOptions(staticResolver),
		config.WithCredentialsProvider(credentials.NewStaticCredentialsProvider("xx", "xx", "")),
		config.WithHTTPClient(httpClient),
	)

	Cli = s3.NewFromConfig(cfg)
}

func puttest(key string, obsClient *s3.Client) bool {
	file, err := os.Open(key)
	if err != nil {
		conf.Logger.Error(err)
		return false
	}
	defer file.Close()
	_, err = obsClient.PutObject(context.TODO(), &s3.PutObjectInput{
		Bucket: aws.String("gitea"),
		Key:    &key,
		Body:   file,
	})

	if err != nil {
		return false
	}

	return true
}
func TestUp(t *testing.T) {

	puttest("1131.log.zst", Cli)
}
Image

@OAMchronicle
Author

So could this be caused by a difference between the Minio SDK and the AWS S3 SDK?

@wxiaoguang
Contributor

When using Minio SDK, what's the content of uploaded file? Decompressed? Truncated? Or randomly corrupted? Could you provide the file samples?

@OAMchronicle
Author

OAMchronicle commented Feb 19, 2025

1131.log
1131.log.zst.txt

You need to rename 1131.log.zst.txt to 1131.log.zst (GitHub does not accept .zst uploads).

1131.log.zst is compressed from 1131.log

@wxiaoguang
Contributor

I have checked your samples: 1131.log.zst.txt (1131.log.zst) is right, after decompression, it matches 1131.log.

And I would reiterate my previous comment:

It seems to be a plain-text file? The footer bytes (from log) are 4b73e\r\n\r\n
Will the S3 storage auto decompress the compressed zstd files and respond with plain text?

We can see that the footer bytes of 1131.log are also something like 9c8ca\r\n\r\n.

So the best guess is: when Gitea tries to read the compressed "zstd" file from your S3, your S3 responds with the decompressed plain-text file. Then Gitea's zstd reader can't read it.

@OAMchronicle
Author

I have checked your samples: 1131.log.zst.txt (1131.log.zst) is right, after decompression, it matches 1131.log.

And I would reiterate my previous comment:

It seems to be a plain-text file? The footer bytes (from log) are 4b73e\r\n\r\n
Will the S3 storage auto decompress the compressed zstd files and respond with plain text?

We can see that the footer bytes of 1131.log are also something like 9c8ca\r\n\r\n.

So the best guess is: when Gitea tries to read the compressed "zstd" file from your S3, your S3 responds with the decompressed plain-text file. Then Gitea's zstd reader can't read it.

No, the file I sent was compressed manually, so it must be the correct file. The issue is that the file gets damaged when it is uploaded with Minio's SDK.

@wxiaoguang
Contributor

No, the file I sent was compressed manually, so it must be the correct file. The issue is that the file gets damaged when it is uploaded with Minio's SDK.

Then how would you explain this "footer bytes" problem? That file is a plain-text file.

And I have said many times: you need to figure out the Seaweed's behavior (with Minio SDK):

Could you help to confirm Seaweed's behavior? For example: use your test code, read the raw content of logs/1131.log.zst, and dump it to see whether it is compressed or decompressed? (Or maybe the response is just incorrect? Just a guess.)

If:

  • the uploaded content is right (in zstd compression)
  • the downloaded content is right (in zstd compression)

Then nothing wrong would happen, right?

So, which step is wrong? "the uploaded content"? or "the downloaded content"?
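
One way to answer that question for a given object is to look at its first and last bytes. The following is a minimal sketch, assuming the standard zstd frame magic number (0xFD2FB528, little-endian) at the start and the zstd seekable format magic number (0x8F92EAB1, little-endian) at the end. `hasZstdHeader` and `hasSeekableFooter` are hypothetical helpers and the sample byte slices are made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// Magic numbers as they appear on disk (little-endian byte order).
var (
	zstdMagic     = []byte{0x28, 0xB5, 0x2F, 0xFD} // start of a zstd frame
	seekableMagic = []byte{0xB1, 0xEA, 0x92, 0x8F} // end of a seekable-format footer
)

// hasZstdHeader reports whether b starts with a zstd frame.
func hasZstdHeader(b []byte) bool { return bytes.HasPrefix(b, zstdMagic) }

// hasSeekableFooter reports whether b ends with the seekable-format magic number.
func hasSeekableFooter(b []byte) bool { return bytes.HasSuffix(b, seekableMagic) }

func main() {
	// Made-up stand-in for an intact object: magic, payload, magic.
	intact := append(append(append([]byte{}, zstdMagic...), []byte("payload")...), seekableMagic...)
	// Made-up stand-in for an object that starts and ends with ASCII text.
	suspect := []byte("10000;chunk-signature=0123\r\npayload...4b73e\r\n\r\n")

	fmt.Println(hasZstdHeader(intact), hasSeekableFooter(intact))   // true true
	fmt.Println(hasZstdHeader(suspect), hasSeekableFooter(suspect)) // false false
}
```

Running the two checks on the bytes that were sent and on the bytes that come back from the storage would show which side of the round trip changes the content.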

@wxiaoguang
Contributor

The issue is that the file gets damaged when it is uploaded with Minio's SDK.

How is the file "damaged"? What's the content of the uploaded file? Decompressed? Truncated? Or randomly corrupted? Could you provide the file samples?

Your samples above are ALL right.

@OAMchronicle
Author

the uploaded content

@wxiaoguang
Contributor

the uploaded content

But your samples above are ALL right.

@wxiaoguang
Contributor

What's the content of uploaded file?

I mean, the "UPLOADED" file.

Since you suppose that the uploaded file is damaged, could you provide the damaged file as it is stored after the upload, not the source file you uploaded from?
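
A quick way to compare the source file with the stored object is to find the first offset at which they differ. This is a minimal sketch using only the standard library. `firstDiff` is a hypothetical helper and the sample inputs are made up; in practice the two slices would come from `os.ReadFile` on the local file and on a downloaded copy of the object.

```go
package main

import "fmt"

// firstDiff returns the index of the first byte at which a and b differ.
// If one slice is a strict prefix of the other, it returns the length of the
// shorter slice. It returns -1 when the slices are identical.
func firstDiff(a, b []byte) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	if len(a) != len(b) {
		return n
	}
	return -1
}

func main() {
	local := []byte("\x28\xb5\x2f\xfdDATA")
	// Made-up example of a stored object with extra text in front of the data.
	stored := []byte("8;chunk-signature=ab\r\n\x28\xb5\x2f\xfdDATA\r\n")

	fmt.Println(firstDiff(local, local))  // -1
	fmt.Println(firstDiff(local, stored)) // 0
}
```

A difference at offset 0 points to something being prepended to the data, while a difference near the end points to truncation or appended bytes.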

@OAMchronicle
Author

OAMchronicle commented Feb 19, 2025

//test code

package actions

import (
	"context"
	"crypto/tls"
	"net/http"
	"os"
	"testing"

	"code.gitea.io/gitea/modules/setting"
	"code.gitea.io/gitea/modules/storage"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/credentials"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

func TestTransferLogs1(t *testing.T) {
	// Upload the locally compressed file through Gitea's Minio storage wrapper.
	open, err := os.Open("1131.log.zst")
	if err != nil {
		t.Fatalf("open 1131.log.zst: %v", err)
	}
	defer open.Close()
	minioStorage, err := storage.NewMinioStorage(context.Background(),
		&setting.Storage{
			MinioConfig: setting.MinioStorageConfig{
				Endpoint:           "x.x.x.x:8333",
				AccessKeyID:        "xx",
				SecretAccessKey:    "xx",
				Bucket:             "gitea",
				UseSSL:             false,
				InsecureSkipVerify: true,
			},
		})
	if err != nil {
		t.Fatalf("NewMinioStorage: %v", err)
	}
	_, err = minioStorage.Save("1131.log.minio.zst", open, -1)
	if err != nil {
		t.Fatalf("Save: %v", err)
	}
}

var Cli *s3.Client

func init() {
	staticResolver := aws.EndpointResolverWithOptionsFunc(func(service, region string, options ...any) (aws.Endpoint, error) {
		return aws.Endpoint{
			PartitionID:       "aws",
			URL:               "http://x.x.x.x:8333",
			SigningRegion:     region,
			HostnameImmutable: true,
		}, nil
	})

	tr := &http.Transport{
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
	}
	httpClient := &http.Client{Transport: tr}

	cfg, _ := config.LoadDefaultConfig(context.Background(),
		config.WithRegion("us-east-1"),
		config.WithEndpointResolverWithOptions(staticResolver),
		config.WithCredentialsProvider(credentials.NewStaticCredentialsProvider("xx", "xx", "")),
		config.WithHTTPClient(httpClient),
	)

	Cli = s3.NewFromConfig(cfg)
}

func puttest(key string, obsClient *s3.Client) bool {
	file, err := os.Open(key)
	if err != nil {
		return false
	}
	defer file.Close()
	_, err = obsClient.PutObject(context.TODO(), &s3.PutObjectInput{
		Bucket: aws.String("gitea"),
		Key:    aws.String("1131.log.s3.zst"),
		Body:   file,
	})

	if err != nil {
		return false
	}

	return true
}

func TestUp(t *testing.T) {
	puttest("1131.log.zst", Cli)
}

I executed TestUp and TestTransferLogs1 separately, and the results are as follows:

Image

1131.log.minio.zst.txt
1131.log.s3.zst.txt

@wxiaoguang
Contributor

OK, we can simply cat the damaged file, then the problem is clear:

https://stackoverflow.com/questions/55816844/aws-s3-get-after-put-includes-chunk-signature-data

Image

@wxiaoguang
Contributor

wxiaoguang commented Feb 19, 2025

SeaweedFS is not S3-compatible, it doesn't work with chunk-encoding.

And maybe also related to this:

@OAMchronicle
Author

ok, Thank you very much!

@wxiaoguang
Contributor

I guess we could either:

  • Suggest that SeaweedFS support chunk-encoding, then the Minio SDK could work
  • Use some tricks to make the Minio SDK not use chunk-encoding (haven't really looked into it, not sure whether there would be side effects)
  • Use the AWS SDK to replace the Minio SDK (well, not sure whether it is worth doing so .....)

@wxiaoguang wxiaoguang added the type/upstream This is an issue in one of Gitea's dependencies and should be reported there label Feb 19, 2025
@OAMchronicle
Author

At present, it seems that if I want to use it, I can only use the AWS SDK instead of the Minio SDK.
However, thank you very much for your help.

@wxiaoguang wxiaoguang changed the title When using S3 as storage, the action_runner log cannot be viewed ObjectStorage without chunk-encoding support corrupts files (the action_runner log cannot be viewed) Feb 19, 2025