Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make programming language default to Go #2821

Closed
cgwalters opened this issue Apr 18, 2022 · 10 comments · Fixed by #2919
Closed

Make programming language default to Go #2821

cgwalters opened this issue Apr 18, 2022 · 10 comments · Fixed by #2919

Comments

@cgwalters
Copy link
Member

It's no secret, coreos-assembler is a big mismash of bash, Python and Go. (Often calling CLI tools written in many other languages)

The Go drop mainly came when we merged in mantle. However since then, the addition of gangplank added a whole other Go subdirectory.

I would like to propose that we canonically make coreos-assembler a Go program, moving the go.mod and build infrastructure to the toplevel. So /usr/bin/coreos-assembler be a Go binary. However, at a practical level we would likely to start out by having this just e.g. execve() to /usr/lib/coreos-assembler/cmd-build and the like. But, it's interesting to consider switching from installing scripts directly to building them in-binary with Go embedding.

There are a bunch of motivations for this, but a specific one right at this moment is that I was trying to do a CentOS Stream 9 coreos-assembler so I could test out coreos/rpm-ostree#3610 and I ran into various miscellaneous python dependencies we have aren't packaged. If we were canonically in Go, we'd be using a consistent mechanism to vendor those.

@bgilbert
Copy link
Contributor

bgilbert commented Jun 8, 2022

I'm not 100% clear on the intended scope of this issue. If you're proposing a widespread rewrite of existing code, I doubt there's a lot of appetite for that. (And IMO you're proposing to standardize on the least pleasant of the three languages. 😏)

But I agree it'd be useful to move go.mod to the top level and treat cosa as a single Go project. (Though see also #2860.) We could then continue to perform incremental changes where they seem useful, including potentially converting the coreos-assembler binary.

@cgwalters
Copy link
Member Author

I'm not 100% clear on the intended scope of this issue. If you're proposing a widespread rewrite of existing code, I doubt there's a lot of appetite for that.

Definitely not proposing we drop everything and try a widespread rewrite. That's almost always a bad idea.

(And IMO you're proposing to standardize on the least pleasant of the three languages. smirk)

Well...one thing I'd like to try out is "shell script in Go"...a bit like https://docs.rs/sh-inline/latest/sh_inline/

But I agree it'd be useful to move go.mod to the top level and treat cosa as a single Go project.

Yep that's mainly what I mean!

We could then continue to perform incremental changes where they seem useful, including potentially converting the coreos-assembler binary.

Yep, exactly that.

cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jun 14, 2022
This creates an initial skeleton for Go code at the very toplevel
of the project.

What is currently the `coreos-assembler` shell script entrypoint is
changed to be embedded via Go file embedding into `/usr/bin/cosa`.
This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

There's an embryonic `bashexec` internal Go module that is designed
to help with this.

Closes: coreos#2821
@cgwalters
Copy link
Member Author

Working on #2907
was another great example of the pain of dealing with both Python and Go.

I actually typo'd the value when working on https://github.com/coreos/coreos-assembler/pull/2907/files#diff-2bd2de86358cc173c1095566872fa3da05357c2d22d49a522efdb97f1da43946R64
and since Python is all dynamic, it appeared to work until I did a cosa run which actually invokes kola qemuexec which actually uses the compiled schema, and failed validation.

If push-container had been in Go from the start, it wouldn't have compiled at all.

@cgwalters
Copy link
Member Author

Another great example in that code base is that we're parsing container image tags with e.g. plain strings.split(':'); I'm sure there are python eggs or whatever for this, but we're already vendoring Go libraries for this which would be better to use.

cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jun 16, 2022
This creates an initial skeleton for Go code at the very toplevel
of the project.

What is currently the `coreos-assembler` shell script entrypoint is
changed to be embedded via Go file embedding into `/usr/bin/cosa`.
This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

There's an embryonic `bashexec` internal Go module that is designed
to help with this.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jun 22, 2022
This creates an initial skeleton for Go code at the very toplevel
of the project.

What is currently the `coreos-assembler` shell script entrypoint is
changed to be embedded via Go file embedding into `/usr/bin/cosa`.
This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

There's an embryonic `bashexec` internal Go module that is designed
to help with this.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jul 6, 2022
This creates an initial skeleton for Go code at the very toplevel
of the project.

What is currently the `coreos-assembler` shell script entrypoint is
changed to be embedded via Go file embedding into `/usr/bin/cosa`.
This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

There's an embryonic `bashexec` internal Go module that is designed
to help with this.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jul 6, 2022
This creates an initial skeleton for Go code at the very toplevel
of the project.

What is currently the `coreos-assembler` shell script entrypoint is
changed to be embedded via Go file embedding into `/usr/bin/cosa`.
This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

There's an embryonic `bashexec` internal Go module that is designed
to help with this.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jul 6, 2022
This creates an initial skeleton for Go code at the very toplevel
of the project.

What is currently the `coreos-assembler` shell script entrypoint is
changed to be embedded via Go file embedding into `/usr/bin/cosa`.
This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

There's an embryonic `bashexec` internal Go module that is designed
to help with this.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jul 6, 2022
This creates an initial skeleton for Go code at the very toplevel
of the project.

What is currently the `coreos-assembler` shell script entrypoint is
changed to be embedded via Go file embedding into `/usr/bin/cosa`.
This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

There's an embryonic `bashexec` internal Go module that is designed
to help with this.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jul 6, 2022
This creates an initial skeleton for Go code at the very toplevel
of the project.

What is currently the `coreos-assembler` shell script entrypoint is
changed to be embedded via Go file embedding into `/usr/bin/cosa`.
This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

There's an embryonic `bashexec` internal Go module that is designed
to help with this.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jul 7, 2022
This creates an initial skeleton for Go code at the very toplevel
of the project.

What is currently the `coreos-assembler` shell script entrypoint is
changed to be embedded via Go file embedding into `/usr/bin/cosa`.
This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

There's an embryonic `bashexec` internal Go module that is designed
to help with this.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jul 7, 2022
This creates an initial skeleton for Go code at the very toplevel
of the project.

What is currently the `coreos-assembler` shell script entrypoint is
changed to be embedded via Go file embedding into `/usr/bin/cosa`.
This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

There's an embryonic `bashexec` internal Go module that is designed
to help with this.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jul 7, 2022
This creates an initial skeleton for Go code at the very toplevel
of the project.

What is currently the `coreos-assembler` shell script entrypoint is
changed to be embedded via Go file embedding into `/usr/bin/cosa`.
This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

There's an embryonic `bashexec` internal Go module that is designed
to help with this.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jul 8, 2022
This creates an initial skeleton for Go code at the very toplevel
of the project.

What is currently the `coreos-assembler` shell script entrypoint is
changed to be embedded via Go file embedding into `/usr/bin/cosa`.
This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

There's an embryonic `bashexec` internal Go module that is designed
to help with this.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jul 9, 2022
- Converts the entrypoint into Go code
- Add an internal library that exposes/wraps `cmdlib.sh`
  because we have a lot of stuff in there that can't be ported
  to Go yet.
- Add an internal library for running inline (named) bash scripts
- Port `clean` to Go

This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

Gradually perhaps, we may invert some things and change both
`cmdlib.sh` and `cmdlib.py` to exec the cosa Go process in some
cases too.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jul 9, 2022
- Converts the entrypoint into Go code
- Add an internal library that exposes/wraps `cmdlib.sh`
  because we have a lot of stuff in there that can't be ported
  to Go yet.
- Add an internal library for running inline (named) bash scripts
- Port `clean` to Go

This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

Gradually perhaps, we may invert some things and change both
`cmdlib.sh` and `cmdlib.py` to exec the cosa Go process in some
cases too.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jul 11, 2022
- Converts the entrypoint into Go code
- Add an internal library that exposes/wraps `cmdlib.sh`
  because we have a lot of stuff in there that can't be ported
  to Go yet.
- Add an internal library for running inline (named) bash scripts
- Port `clean` to Go

This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

Gradually perhaps, we may invert some things and change both
`cmdlib.sh` and `cmdlib.py` to exec the cosa Go process in some
cases too.

Closes: coreos#2821
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jul 11, 2022
- Converts the entrypoint into Go code
- Add an internal library that exposes/wraps `cmdlib.sh`
  because we have a lot of stuff in there that can't be ported
  to Go yet.
- Add an internal library for running inline (named) bash scripts
- Port `clean` to Go

This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

Gradually perhaps, we may invert some things and change both
`cmdlib.sh` and `cmdlib.py` to exec the cosa Go process in some
cases too.

Closes: coreos#2821
@cgwalters
Copy link
Member Author

I want to address a bit more explicitly: Why Go?

  • We have a large Go codebase in mantle (and gangplank) that is simply not going away
  • Go is the main language of the container ecosystem and we are very container oriented
  • Go is also very popular for infrastructure-level services, e.g. there's APIs/SDKs for GCP, AWS, etc. There's Terraform for example which is Go.

Many people on our team need to know all 4 of Go, Rust, Python and shell (at least); and some need to know C and C++. My goal is to ultimately shrink that mainly to Go and Rust (with not more than 10 lines shell scripts here and there), with a few C/C++ people.

Also specifically, many years ago I was a big Python fan and just have gradually been burned by it so many times (mainly the dynamic nature, but also they really messed up with the whole python3 unicode thing) that I just no longer enjoy writing code in it myself. Obviously Python is a popular language with numerous success stories. However, Go is more successful in our particular domain.

That said, some of our python code has load-bearing ties into other python libraries, most notably the koji bits; but hopefully the python code can fork off the Go code and parse JSON or so. So we'll likely need to maintain at least minimal bindings to the Python side for e.g. parsing builds/ and meta.json for the near-term future. I just don't think it should be the default.

@bgilbert
Copy link
Contributor

I agree that being burned by Python is an ongoing and real problem. I don't think we should be shipping Python code in the OS. But there are reasons we continue to use Python and shell in cosa, and I think we should understand those before trying to legislate them away. I'd argue we keep resorting to Python and shell because they have good code density (unlike especially Go), a low barrier to entry, and the team is broadly familiar with them. Also, lots of cosa just makes calls to other programs, which happens most naturally in shell.

I recently encountered some of this with tmpl8. I originally wrote it in Python, then decided to be more rigorous and rewrote it in Rust. I'm glad I did — the result feels more solid — but it was also an Undertaking that produced a Non-Trivial Rust Program, where the original version was a straightforward 100-line script. Go has the same problem but without any of Rust's ergonomics.

As another example: kola is a Real Go Program with reasonably nice layered internal APIs, but the current qemu support bypassed most of that and pasted in a bunch of additional ad-hoc functionality, producing some pretty complex code that's difficult to understand and has had a number of problems such as subtle resource leaks. In essence we treated it like we were hacking on script code, but in a language that demands more rigor. (Because Go code tends to become multithreaded more readily than Python, because of static typing, and perhaps because it's harder to understand complex Go code from scratch.) Relatedly, while I haven't looked at Gangplank, I have the sense that it's not broadly understood on the team. I think it's important that we avoid making similar mistakes in the future.

So, I remain unconvinced. At the least, if we decide we want to pursue this, we should think about ways to make it easier to understand and extend existing code. Is there cosa infrastructure we could build out to help with this?

@jlebon jlebon changed the title Make programming languge default to Go Make programming language default to Go Jul 12, 2022
@jlebon
Copy link
Member

jlebon commented Jul 12, 2022

I'd argue we keep resorting to Python and shell because they have good code density (unlike especially Go), a low barrier to entry, and the team is broadly familiar with them. Also, lots of cosa just makes calls to other programs, which happens most naturally in shell.

I think this is true. But the problem is that we also implement a lot of complex logic in bash and have gotten to a certain level where we're semi-regularly hitting against its idiosyncrasies (recent example). And by its dynamic nature, those are more expensive than golang ones (runtime vs compile time errors).

I also think we're at a point now where cosa is maturing and the development speed advantages afforded by dynamic languages like bash and Python are becoming less valuable in comparison to the increase in correctness we'd get from a statically typed language.

My primary concern with this honestly is just that unless we have some resources assigned to refactoring this codebase, this will just end up being another way to do things in cosa, increasing complexity overall.

@cgwalters
Copy link
Member Author

As another example: kola is a Real Go Program with reasonably nice layered internal APIs, but the current qemu support bypassed most of that and pasted in a bunch of additional ad-hoc functionality, producing some pretty complex code that's difficult to understand and has had a number of problems such as subtle resource leaks.

I think I wrote a lot of that and...you are 100% right. The code there is hacky. I'll be honest...I am not a great Go programmer. Though part of the problem with that specific code I think is that qemu simply ends up being different than clouds too.

But, that specific code is also a good example of something I think would be better to structure in a new internal/pkg/qemu or so, and instead of living in kola it'd be part of cosa proper. I think that would feel much more idiomatic.

We just ended up at this ad-hoc hacky shortcut trying to deduplicate the bash qemu code with the Go qemu code!

IOW, while I agree with your point, I hope as part of this we'll get better Go code 😄

dustymabe pushed a commit that referenced this issue Jul 13, 2022
- Converts the entrypoint into Go code
- Add an internal library that exposes/wraps `cmdlib.sh`
  because we have a lot of stuff in there that can't be ported
  to Go yet.
- Add an internal library for running inline (named) bash scripts
- Port `clean` to Go

This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

Gradually perhaps, we may invert some things and change both
`cmdlib.sh` and `cmdlib.py` to exec the cosa Go process in some
cases too.

Closes: #2821
@bgilbert
Copy link
Contributor

But, that specific code is also a good example of something I think would be better to structure in a new internal/pkg/qemu or so, and instead of living in kola it'd be part of cosa proper. I think that would feel much more idiomatic.

Moving the code around doesn't really address the issue, though. Fundamentally it's a mindset question. I don't think Go has proven to be a good fit for the relatively ad-hoc way we maintain this repository, and it doesn't sound like we have concrete plans to mitigate that as part of any trend toward using more Go here. If the team were broadly comfortable with Go programming, I'd have a different view, but I'm not sure that's the case. And for new code we've been trending pretty hard toward Rust, so the team isn't likely to get more Go experience over time.

So, counter-proposal: we encourage new code to be written in Python, and substantially improve the Python linting done in CI. There are a lot of tools available to reduce the dynamic gotchas of the language, if we made a practice of using them.

@cgwalters
Copy link
Member Author

I don't think Go has proven to be a good fit for the relatively ad-hoc way we maintain this repository, and it doesn't sound like we have concrete plans to mitigate that as part of any trend toward using more Go here.

Hmm. For sure, a lot of the repository is ad-hoc - and honestly that's mostly as it should be! This project is an implementation detail - an important one, but "polish" generally should go to user-facing tools and APIs I'd say.

But do think part of this is changing the mindset and improving the Go we write.

If the team were broadly comfortable with Go programming, I'd have a different view, but I'm not sure that's the case.

We have a big mix of language experience, which is normal. What I find very difficult actually is context switching between languages, so my goal is to reduce that.

And for new code we've been trending pretty hard toward Rust, so the team isn't likely to get more Go experience over time.

The Kubernetes/OpenShift Go center of gravity isn't going away, and just speaking for myself I regularly contribute to at least the MCO which is all Go.

So, counter-proposal: we encourage new code to be written in Python, and substantially improve the Python linting done in CI. There are a lot of tools available to reduce the dynamic gotchas of the language, if we made a practice of using them.

Well. Honestly...the momentum shifts involved here are really small. I think we're all agreed there is no major rewrite planned. Particularly for Python - we have a ton of it and it's not going away anytime soon, and really not in the forseeable future with some long-tail load bearing stuff there like Koji. If you want to write new code in Python, I personally will generally have no problem with that!

We do need to be strategic though about code duplication; and ultimately there has to be a kind of clear direction for some core functionality.

dustymabe pushed a commit to dustymabe/coreos-assembler that referenced this issue Aug 11, 2022
- Converts the entrypoint into Go code
- Add an internal library that exposes/wraps `cmdlib.sh`
  because we have a lot of stuff in there that can't be ported
  to Go yet.
- Add an internal library for running inline (named) bash scripts
- Port `clean` to Go

This is a pattern I think we'll use to aid the transition; rather
than trying to rewrite things wholesale in Go, we'll continue
to exec some shell scripts.

Gradually perhaps, we may invert some things and change both
`cmdlib.sh` and `cmdlib.py` to exec the cosa Go process in some
cases too.

Closes: coreos#2821
(cherry picked from commit aa944b8)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants