Roadmap to v4 #703

Closed
andreynering opened this issue Apr 16, 2022 Discussed in #694 · 46 comments

@andreynering
Member

Hi everybody,

There were some discussions at #694 with regard to ideas of a possible future major release. GitHub discussions are not too good for these discussions so I decided to open this issue to be more centralized and visible (issues can be pinned).

Also, that probably needs some direction and organization, as having more people involved means we now have different opinions we need to reconcile.

@ghostsquad started working on some ideas on his fork, but I consider that to be a Proof of Concept ™️, and things will likely be different in its final version in the repository.

This issue is a WIP and will be iterated as time passes.


Principles

  • Minimal breaking changes
    • Most people should be able to change version: '3' to version: '4' and their Taskfile should keep working mostly the same, with minimal changes
    • This means no unnecessary changes to the schema, like renaming cmds to something else, for example. Most changes should be for new features
    • The same applies to avoiding changes to CLI flags
    • There may be exceptions, of course, but they need a good justification. Variables will probably be the biggest one, as @ghostsquad has a proposal to make them lazy.
    • Also, for Taskfiles with version: '3' everything should keep working as before, with rare exceptions
    • I'm undecided about removing the code that keeps version: '2' working
  • Development process
    • Pull requests should be atomic, one for each feature, so they can be reviewed in isolation and discussion can happen with more efficiency about each feature (with regard to UX and implementation)
    • Tests should be written covering the relevant code

Steps

Preparation

  • We should try to do this in the master branch IMO, to avoid future conflicts between master and v3
  • Code re-organization? We have some old directory structure and may want to improve that
  • Tests refactor? Would help as we will write new tests for v3 stuff.

Implementation

This is an ongoing document and will be improved...

@andreynering andreynering pinned this issue Apr 16, 2022
@andreynering
Member Author

Anybody is free to comment here about what you expect for v4. The community feedback was always taken into account in the project. 🙂

Some things are still missing, and even the ones mentioned above won't necessarily be implemented.

/cc @ghostsquad @tylermmorton @kerma

@ghostsquad
Contributor

@andreynering please help me understand why you do not want breaking changes on v4. Fundamentally, the point of a major version bump is to allow changes that would otherwise be disruptive to occur. Non-breaking changes can & should continue on version: 3.

@andreynering
Member Author

@ghostsquad It's not that breaking changes are disallowed, but I want to avoid them unless necessary.

For example, I noticed you renamed some attributes on your PoC branch (cmds became do, summary became long, etc), and that's something I want to avoid unless it's an actual new feature.

Please let me know if you disagree; I'm open to hearing opinions! But I think the above is what most users would expect. If users have to rewrite all tasks, that would prevent many of them from upgrading to v4, the kind of situation I want to avoid.

@andreynering
Member Author

@ghostsquad Changing the subject a bit... can you describe how you see the lazy variables working? Maybe a couple of simple examples? I'm curious to see what is coming. 🙂

That's a situation where a slightly different syntax may be allowed if we get big improvements in that area, as variables/envs have always had their shortcomings. Even so, if it were possible to keep the syntax mostly the same, that would certainly have the benefit of helping people upgrade.

@ghostsquad
Contributor

ghostsquad commented Apr 16, 2022

unless it's an actual new feature.

A common misconception is that naming things doesn't matter. In fact, well named fields and values make a ton of difference. So this renaming is a feature.

https://levelup.gitconnected.com/how-choosing-better-names-can-improve-your-code-31a0050c6c93


https://blog.thecodewhisperer.com/permalink/a-model-for-improving-names

http://arlobelshee.com/tag/naming-is-a-process/


If users have to rewrite all tasks, that would prevent many of them from upgrading to v4, the kind of situation I want to avoid.

I am thinking about this as well, and in fact, in some cases I have been considering some backwards compatibility (at the expense of not having access to new features without an additional change).

Additionally, like with Kubernetes, since task v3 has a yaml schema, I'm looking at a schema translation mechanism, so that I can write a tool to allow a user to rewrite their v3 Taskfile into v4 format automatically.
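To make the translation idea concrete, here is a minimal sketch of the key-renaming part only. It assumes the Taskfile has already been decoded into a generic map, and it uses just the two renames mentioned earlier in this thread (cmds to do, summary to long). The `renames` table and `migrateTask` are hypothetical names for illustration, not part of Task or the PoC branch.

```go
package main

import "fmt"

// renames is a hypothetical rename table. It lists only the two
// renames mentioned in this thread; a real tool would need the
// full, agreed-upon mapping.
var renames = map[string]string{
	"cmds":    "do",
	"summary": "long",
}

// migrateTask returns a copy of a decoded task in which renamed keys
// are replaced by their new names. All other keys pass through as-is.
func migrateTask(task map[string]any) map[string]any {
	out := make(map[string]any, len(task))
	for key, value := range task {
		if newKey, ok := renames[key]; ok {
			key = newKey
		}
		out[key] = value
	}
	return out
}

func main() {
	v3 := map[string]any{
		"desc": "Build the project",
		"cmds": []string{"go build ./..."},
	}
	fmt.Println(migrateTask(v3))
}
```

A sketch like this only covers field names. Value-level changes (such as a different variable syntax inside command strings) and preserving comments or formatting in the YAML file are separate problems it does not address.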


can you describe how you see the lazy variables working? Maybe a couple of simple examples? I'm curious to see what is coming.

My initial testing uses a concept of Lazy values and variable "layers". A lazy looks like this:

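package lazy

import "sync"

// Lazy holds a value that is computed at most once, on first use.
type Lazy[T any] struct {
	new   func() (T, error)
	once  sync.Once
	value T
	err   error
}

// New is the constructor used by newLazyVar below. Its body was not
// shown in the original snippet, so this one-liner is an assumption.
func New[T any](fn func() (T, error)) *Lazy[T] {
	return &Lazy[T]{new: fn}
}

// SetNew replaces the function that produces the value.
func (l *Lazy[T]) SetNew(fn func() (T, error)) {
	l.new = fn
}

// Eval runs the producer function on the first call and returns the
// cached value and error on every later call.
func (l *Lazy[T]) Eval() (T, error) {
	l.once.Do(func() {
		if l.new != nil {
			v, err := l.new()
			l.value = v
			l.err = err
			l.new = nil // so that the producer func can now be GC'ed
		}
	})

	return l.value, l.err
}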

a Lazy Var looks like this:

func newLazyVar(txtVal string, tplData *TplData) *lazy.Lazy[any] {
	tpl := template.New("")
	template.Must(tpl.Parse(txtVal))

	return lazy.New[any](func() (any, error) {
		buf := bytes.NewBuffer([]byte{})
		err := tpl.Execute(buf, tplData)
		if err != nil {
			return nil, err
		}

		return buf.String(), nil
	})
}

The data that is passed to the lazy var template looks like this:

type TplData struct {
	v map[string]*lazy.Lazy[any]
}

func (t *TplData) V(key string) (any, error) {
	return t.v[key].Eval()
}

and some initial experimental tests looked like this:

at 12:31:18 ❯ go run ./cmd/taskv4/main.go '{{printf "%s-%s" (.V "var2") (.V "var2")}}' 'bar'
V: var1
(string) (len=7) "bar-bar"
(interface {}) <nil>
V: var2
(string) (len=3) "bar"
(interface {}) <nil>

at 12:31:32 ❯ go run ./cmd/taskv4/main.go 'bar' '{{printf "%s-%s" (.V "var1") (.V "var1")}}'
V: var1
(string) (len=3) "bar"
(interface {}) <nil>
V: var2
(string) (len=7) "bar-bar"
(interface {}) <nil>

There's extra information in this output, because here's the evaluation loop I'm using for my experiments:

for k, v := range tplData.v {
  fmt.Printf("V: %s\n", k)
  spew.Dump(v.Eval())
}

Layers are like global, task-level, step-level.

In this way, a variable can access any other variable adjacent to it (same layer) and any variable in lower layers. This allows for the same type of functionality we currently get by writing {{ .FOO | default "bar" }} (if .FOO is in a lower layer).
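As a rough sketch of the layering rule described above (not the PoC's actual code), lookup can be modelled as a search that starts in the current layer and falls through to lower ones. The `Layer` and `Layers` types and the `Lookup`/`LookupOr` helpers are hypothetical names, and plain strings stand in for lazy values to keep the example short.

```go
package main

import "fmt"

// Layer is a hypothetical set of variables declared at one level.
type Layer map[string]string

// Layers is ordered from lowest (global) to highest (step-level).
type Layers []Layer

// Lookup resolves key as seen from layer index from: the same layer
// is checked first, then each lower layer in turn. Higher layers are
// never consulted.
func (ls Layers) Lookup(from int, key string) (string, bool) {
	for i := from; i >= 0; i-- {
		if v, ok := ls[i][key]; ok {
			return v, true
		}
	}
	return "", false
}

// LookupOr behaves like Lookup but returns fallback when the key is
// not visible, similar in spirit to {{ .FOO | default "bar" }}.
func (ls Layers) LookupOr(from int, key, fallback string) string {
	if v, ok := ls.Lookup(from, key); ok {
		return v
	}
	return fallback
}

// demoLayers builds a small three-layer example.
func demoLayers() Layers {
	return Layers{
		{"FOO": "global-foo", "BAR": "global-bar"}, // 0: global
		{"FOO": "task-foo"},                        // 1: task-level
		{"BAZ": "step-baz"},                        // 2: step-level
	}
}

func main() {
	ls := demoLayers()
	fmt.Println(ls.Lookup(2, "FOO"))             // step sees the task-level FOO
	fmt.Println(ls.Lookup(0, "BAZ"))             // global cannot see step-level BAZ
	fmt.Println(ls.LookupOr(1, "MISSING", "bar")) // falls back to the default
}
```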

You can see, though, that how variables are called is slightly different: {{.V "var1"}}. This is because variables are not fields but functions, and Go's text/template has limitations in how that works. If a struct's field is a "function" type, accessing the field with .Foo simply returns the function and does not call it. To call it you would have to write call .Foo (.Foo() is not valid). But I don't like the verbosity of using call.

But if V is a method on the data value (that is, it has a receiver), the template engine calls it, just like any standalone template function. V is therefore the function that evaluates the lazy, and the syntax is {{.V "var1"}}: the variable name is a string passed to the .V method.
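The difference between a function-typed field and a method can be seen in a small standalone program. The `Data` type, its `F` field, and the `render` helper are made up for this illustration; only the text/template behaviour is the point.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Data is a hypothetical template context used only for this demo.
type Data struct {
	// F is a field of function type. Inside a template it is just a
	// value, so it has to be invoked with the built-in "call".
	F func() string
}

// V is a method, so the template engine invokes it directly and
// passes the arguments that follow it: {{.V "name"}}.
func (d Data) V(key string) (string, error) {
	return "value-of-" + key, nil
}

// render parses text and executes it against a fixed Data value.
func render(text string) (string, error) {
	tpl, err := template.New("demo").Parse(text)
	if err != nil {
		return "", err
	}
	data := Data{F: func() string { return "from-field" }}
	var sb strings.Builder
	if err := tpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	for _, text := range []string{`{{.V "var1"}}`, `{{call .F}}`} {
		out, err := render(text)
		fmt.Println(text, "=>", out, err)
	}
}
```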

Because variables are lazy, they are only ever evaluated if they are actually used. This is actually really great I think for task because task doesn't need to eagerly evaluate variables that have no effect on the program (similar to how the go compiler complains to you when you've declared a variable that is not used later).
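A small self-contained sketch of that property: the type below mirrors the shape of the Lazy shown earlier, and `NewLazy` and `countEvaluations` are hypothetical helpers added just for the demonstration. One lazy is evaluated twice, another is never touched, and counters record how often each producer function actually ran.

```go
package main

import (
	"fmt"
	"sync"
)

// Lazy mirrors the shape of the type shown earlier in this comment.
type Lazy[T any] struct {
	new   func() (T, error)
	once  sync.Once
	value T
	err   error
}

// NewLazy is an assumed constructor for this sketch.
func NewLazy[T any](fn func() (T, error)) *Lazy[T] {
	return &Lazy[T]{new: fn}
}

// Eval computes the value on first use and caches it afterwards.
func (l *Lazy[T]) Eval() (T, error) {
	l.once.Do(func() {
		if l.new != nil {
			l.value, l.err = l.new()
			l.new = nil
		}
	})
	return l.value, l.err
}

// countEvaluations creates two lazies, evaluates only the first one
// (twice), and reports how many times each producer function ran.
func countEvaluations() (used, unused int) {
	a := NewLazy(func() (string, error) { used++; return "bar", nil })
	_ = NewLazy(func() (string, error) { unused++; return "never", nil })

	a.Eval()
	a.Eval() // served from the cache, the producer does not run again

	return used, unused
}

func main() {
	used, unused := countEvaluations()
	fmt.Println("used producer ran:", used, "unused producer ran:", unused)
}
```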

I'm still working on a way to detect circular references in order to avoid a confusing panic:

at 12:28:02 ❯ go run ./cmd/taskv4/main.go '{{printf "%s-%s" (.V "var2") (.V "var2")}}' '{{ .V "var1" }}'
V: var1
fatal error: all goroutines are asleep - deadlock!

<and a 100+ line stack trace>
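One possible way to turn that deadlock into a readable error is to track which variables are currently being evaluated and fail when one is reached again. The sketch below is an assumption about how this could be done, not what the PoC does: it works on a simplified model where each variable just lists the names it references, and `Resolver`, `ErrCycle`, and `HasCycle` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrCycle is returned when a variable (directly or indirectly)
// references itself.
var ErrCycle = errors.New("circular variable reference")

// Resolver walks a simplified dependency model: deps maps each
// variable name to the variable names its template references.
type Resolver struct {
	deps     map[string][]string
	visiting map[string]bool // variables on the current evaluation stack
	done     map[string]bool // variables already fully resolved
}

// NewResolver prepares a Resolver for the given dependency map.
func NewResolver(deps map[string][]string) *Resolver {
	return &Resolver{
		deps:     deps,
		visiting: map[string]bool{},
		done:     map[string]bool{},
	}
}

// Resolve walks the dependencies of name depth-first and returns an
// error wrapping ErrCycle, including the path, if a cycle is found.
func (r *Resolver) Resolve(name string) error {
	return r.resolve(name, nil)
}

func (r *Resolver) resolve(name string, path []string) error {
	// The three-index slice forces a copy so sibling branches do not
	// share a backing array.
	path = append(path[:len(path):len(path)], name)
	if r.done[name] {
		return nil
	}
	if r.visiting[name] {
		return fmt.Errorf("%w: %s", ErrCycle, strings.Join(path, " -> "))
	}
	r.visiting[name] = true
	defer delete(r.visiting, name)

	for _, dep := range r.deps[name] {
		if err := r.resolve(dep, path); err != nil {
			return err
		}
	}
	r.done[name] = true
	return nil
}

// HasCycle reports whether resolving start runs into a cycle.
func HasCycle(deps map[string][]string, start string) bool {
	return errors.Is(NewResolver(deps).Resolve(start), ErrCycle)
}

func main() {
	// Same shape as the failing run above: var1 uses var2, var2 uses var1.
	deps := map[string][]string{"var1": {"var2"}, "var2": {"var1"}}
	fmt.Println(NewResolver(deps).Resolve("var1"))
}
```

In the real implementation the reference list would have to come from the templates themselves, or the check would have to live inside Eval, since a re-entrant call into the same sync.Once is what blocks forever.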

@ghostsquad
Contributor

ghostsquad commented Apr 16, 2022

For further clarification that in fact order of declaration of variables actually doesn't matter, here's another example:

for _, v := range []string{"1", "2", "3"} {
	varName := "var" + v
	fmt.Printf("evaluating %s\n", varName)
	spew.Dump(tplData.v[varName].Eval())
}

at 13:05:11 ❯ go run ./cmd/taskv4/main.go '{{.V "var3"}}'  '{{printf "%s-%s" (.V "var1") (.V "var1")}}' 'hello'

evaluating var1
(string) (len=5) "hello"
(interface {}) <nil>
evaluating var2
(string) (len=11) "hello-hello"
(interface {}) <nil>
evaluating var3
(string) (len=5) "hello"
(interface {}) <nil>

Notice that var3 (the last argument) is the only variable with a concrete value.

@max-sixty

I'm not sure how helpful "I would like these things" is, but given your question, I've made a brief list below. For context, I'm a huge fan of Taskfile; we now use it extensively at my company, and in some of the OSS projects I manage. I've been a sponsor for a few months (with no expectations ofc).

  • Non-string variables would make many of my workflows more coherent; require less boilerplate.

  • I'm using watchexec a great deal, it's a very efficient watcher, with lots of options. Currently I use watchexec with my own preferences (not committed to the repo), and task for common tasks that anyone might want to do, committed to the repo [1]. There's some space for a tool that bundles "how to watch & what to run" with the repo, e.g. "If .py files in foo path change, run the foo tests; if .md files in the bar path change, run the bar build". There's a watchexec proposal for this (Configuration file watchexec/watchexec#235) coming from the file watcher side. Could Task allow more watching options from its side?

    Possibly this leads to a path where there is a single system doing all watching / building / testing / incremental / etc. I'm not sure whether that's a good or bad thing...

  • I'm often writing steps to execute in Task and GitHub Actions or GitLab CI, and don't compound or share between them. I'm not sure what the best approach is to reduce the cost of changing between these. Many of the GitHub Actions runners are designed to run independently, so we need to write them twice or lose the benefits of that runner. I don't have a clear solution here (I guess dagger is trying to fix this), but raising it as a pain-point. (I could imagine a system that "compiles to GitHub Actions" using some data in the Taskfile.yml, but that might become quite complicated).

  • I occasionally want to run things in the context of docker or another language. This is something justfile does well.

Thank you again!

Footnotes

  1. Most of my examples are at work so I can't link, but here's an OSS one

@ghostsquad
Contributor

I could imagine a system that "compiles to GitHub Actions"

There's https://github.com/nektos/act which is designed to run your GitHub workflow locally. That's probably the closest you are going to get there.

Regarding managing complexity/differences in general between GHA and what's in Task...

I'm tackling this problem as well, and I've found that it's best to actually treat a GitHub actions workflow just like you would treat a developer sitting next to you.

Can you give me some examples of what you find yourself duplicating between GHA and Task?


I occasionally want to run things in the context of docker or another language.

Me too. The thing to consider with docker is the "environment" changes (inside vs outside of a container) as well as the ability to pass data around. Systems like GHA and ConcourseCI have developed a nice abstraction in the form of an ephemeral filesystem that transparently follows you around. Task is not as sophisticated (yet).


This is something justfile does well.

Time to go take a look there to get some more data. Can you describe specifics about what you like with justfile, and what you don't like?

@ghostsquad
Contributor

ghostsquad commented Apr 16, 2022

There's some space for a tool that bundles "how to watch & what to run"

This is interesting. I think this is a similar abstraction in fact to running specific tasks in docker.

Both of these things kind of feel like we need the decorator design pattern.

I'll go take a look at watchexec and the issue specifically you linked.

@max-sixty

Thanks for the response @ghostsquad

Me too. The thing to consider with docker is the "environment" changes (inside vs outside of a container) as well as the ability to pass data around. Systems like GHA and ConcourseCI have developed a nice abstraction in the form of an ephemeral filesystem that transparently follows you around. Task is not as sophisticated (yet).

Yeah, I've found just mapping the working directory as a volume can solve most problems here (though maybe there are problems I'm less familiar with)

Time to go take a look there to get some more data. Can you describe specifics about what you like with justfile, and what you don't like?

Great — just is the other "new cool task manager" that I'm aware of. Unfortunately I've hardly used it — I basically picked Taskfile because I thought its yaml files were a clearer way of representing tasks than just's custom files. But the link above is a reference to how they allow for arbitrary languages.

Can you give me some examples of what you find yourself duplicating between GHA and Task?

In the most trivial case, I have a task to build assets that's defined in both the Taskfile and the GHA workflow. (They are slightly different, but that supports my point that the divergence here is awkward!). It's obv not the fault of Taskfile that services like GHA & GitLab have proprietary formats.

There's https://github.com/nektos/act which is designed to run your GitHub workflow locally. That's probably the closest you are going to get there.

Thanks, this looks cool, I'll try it!

@tylermmorton
Contributor

tylermmorton commented Apr 17, 2022

I've been thinking about ways to add wildcards to task calls recently. I don't think it's a breaking change but this could be a good feature candidate in v4. It feels like it makes more sense now that multi-level includes have been introduced.

I'm working in a monorepo where we have a lot of different project folders. We try to structure each project's taskfiles so they all have common workflow tasks like build and clean. Then there's a root taskfile that includes all of them:

version: 4

includes:
  backend_proj:
    taskfile: "backend/Taskfile.yml"
  frontend_proj:
    taskfile: "frontend/Taskfile.yml"

tasks:
  build:
    cmds:
      # runs all `build` tasks in all included files
      # (quoted, because an unquoted * starts a YAML alias)
      - task: "*:build"
  clean:
    cmds:
      # runs all `clean` tasks in all included files (recursive)
      - task: "**:clean"

To me it would be useful to use an * like *:build or **:clean and have them act as wildcards when determining which tasks to run.

* covers any single namespace.
** covers nested namespaces.

I think we could do this by refactoring the type taskfile.Tasks from map[string]*Task into a tree structure that we could then traverse in depth first order. When I get some time I can take a swing at the implementation.
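Before committing to a tree, the matching rules themselves can be sketched on flat, colon-separated task names. This is a simpler stand-in for the tree traversal mentioned above, not an implementation of it. The semantics are assumptions based on the description: `*` matches exactly one namespace segment, and `**` matches one or more segments (so the root `clean` task does not match `**:clean` and call itself). `matchTask` and `expand` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// matchTask reports whether a namespaced task name such as
// "backend_proj:build" matches a pattern such as "*:build".
func matchTask(pattern, name string) bool {
	return matchSegments(strings.Split(pattern, ":"), strings.Split(name, ":"))
}

// matchSegments compares pattern segments against name segments.
func matchSegments(pat, segs []string) bool {
	if len(pat) == 0 {
		return len(segs) == 0
	}
	switch pat[0] {
	case "**":
		// Assumed: ** consumes one or more namespace segments.
		for n := 1; n <= len(segs); n++ {
			if matchSegments(pat[1:], segs[n:]) {
				return true
			}
		}
		return false
	case "*":
		// Assumed: * consumes exactly one namespace segment.
		return len(segs) > 0 && matchSegments(pat[1:], segs[1:])
	default:
		return len(segs) > 0 && pat[0] == segs[0] && matchSegments(pat[1:], segs[1:])
	}
}

// expand returns the sorted task names that match the pattern.
func expand(pattern string, tasks []string) []string {
	var out []string
	for _, t := range tasks {
		if matchTask(pattern, t) {
			out = append(out, t)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	tasks := []string{
		"build", "clean",
		"backend_proj:build", "backend_proj:clean",
		"frontend_proj:build", "frontend_proj:ui:clean",
	}
	fmt.Println(expand("*:build", tasks))
	fmt.Println(expand("**:clean", tasks))
}
```

Matching on names leaves the execution order undefined (the result is only sorted alphabetically here), which is one of the things a tree walked in depth-first order would pin down.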


There's https://github.com/nektos/act which is designed to run your GitHub workflow locally. That's probably the closest you are going to get there.

I'm definitely trying this. Thanks for posting!

For example, I noticed you renamed some attributes on your PoC branch (cmds became do, summary became long, etc), and that's something I want to avoid unless it's an actual new feature.

Please let me know if you disagree; I'm open to hearing opinions! But I think the above is what most users would expect. If users have to rewrite all tasks, that would prevent many of them from upgrading to v4, the kind of situation I want to avoid.

Renaming things to me feels like a natural progression of the tool. Maybe some criteria can be set around what we choose to rename and why. If there are features we can try to re-frame to be more intuitive I'm all for it. And as @ghostsquad mentioned, there's always the idea of writing a migration tool.

@ghostsquad
Contributor

In the most trivial case, I have a task to build assets that's defined in both the Taskfile and the GHA workflow. (They are slightly different, but that supports my point that the divergence here is awkward!). It's obv not the fault of Taskfile that services like GHA & GitLab have proprietary formats.

Why not just call task in GHA?

@ghostsquad
Contributor

I think we could do this by refactoring the type taskfile.Tasks from map[string]*Task into a tree structure that we could then traverse in depth first order. When I get some time I can take a swing at the implementation.

I was actually thinking of doing this similarly to the way Make uses "patterns". https://swcarpentry.github.io/make-novice/05-patterns/index.html

https://web.mit.edu/gnu/doc/html/make_10.html#SEC91

The order in which pattern rules appear in the makefile is important since this is the order in which they are considered. Of equally applicable rules, only the first one found is used. The rules you write take precedence over those that are built in. Note however, that a rule whose dependencies actually exist or are mentioned always takes priority over a rule with dependencies that must be made by chaining other implicit rules.

Though, I'm sure there are considerations where, in a Depth-first approach, you want FIFO (so start at the bottom, and work your way towards the current file) or possibly LILO (start closest, and move backwards). I'll have to play around with what makes sense as a default, and how one may choose to configure this.

@ghostsquad
Contributor

Renaming things to me feels like a natural progression of the tool. Maybe some criteria can be set around what we choose to rename and why. If there are features we can try to re-frame to be more intuitive I'm all for it. And as @ghostsquad mentioned, there's always the idea of writing a migration tool.

Ya I think when I land on something I really like, I can work on a "migration.md" document, that can explain some of the design rationale, and reasons for renaming. I'd like to also just make a design-rationale.md doc as well that explains the "rules engine" in v4, so that folks can understand not just individual features, but also just fundamentals of how the application is designed, and thus could help to inform current and future behavior, without having to explicitly document everything (though I still will be doing that).

@max-sixty

Why not just call task in GHA?

I'll expand on this a bit:

Many of the GitHub Actions runners are designed to run independently, so we need to write them twice or lose the benefits of that runner.

  • GitLab (and somewhat GHA) allows for parallelization, so there will be a Task which loops over n values which should run n different jobs in GitLab. They also allow for arbitrary DAGs, whereas Taskfiles are less flexible (deps are a property of the downstream task, not a workflow)

    I could maaaaybe imagine something which said "when compiling this taskfile into a .gitlab.ci.yml, split this into n jobs" but it gets complicated obv

  • GHA runners offer some encapsulations around environment-specific issues. In the OSS one I linked above, we use https://github.com/actions-rs/cargo a lot, which encapsulates installing rust, the various targets and some OS-specific workarounds.
  • GitLab doesn't carry its environments across jobs (which are often quite small), so if task isn't installed already, it needs to be installed on each job.
  • For OSS projects, which have a wider range of contributors, task is still unfamiliar to some
  • That said, there's probably more I can do here — if we're writing something to work locally, possibly we've already implemented much of what the runner offers us.

@ghostsquad
Contributor

ghostsquad commented Apr 17, 2022

GHA runners offer some encapsulations around environment-specific issues. In the OSS one I linked above, we use https://github.com/actions-rs/cargo a lot, which encapsulates installing rust, the various targets and some OS-specific workarounds.

I haven't looked into this in depth, but I've found asdf to be very nice. There's always going to be a non-gha process for installing dependencies on a developer machine. Encapsulate how that works, then just run that in GHA.

https://github.com/code-lever/asdf-rust

There will be some very minor drift, such as "well how do I install ASDF". On Mac, brew install asdf, on GHA you could use this: https://github.com/asdf-vm/actions. Use this to not only install asdf, rust, cargo, etc, but you can also use it to install task (https://github.com/particledecay/asdf-task).


GitLab doesn't carry its environments across jobs (which are often quite small), so if task isn't installed already, it needs to be installed on each job.

Similar to GHA: the solution, if you want to save time/bandwidth, is to declare the use of a pre-built image that contains the tools pre-installed. This is a time/complexity trade-off. Complexity is low if you install at runtime. Complexity goes up when you need to decide to build a new image (your toolset changes), and plumb that new version into the workflow. I'll leave this as an exercise for you to decide on.


For OSS projects, which have a wider range of contributors, task is still unfamiliar to some

Sounds like those OSS projects need better documentation. The very nature of task is to provide a simpler alternative to make. Many people are also unfamiliar with make because its original intended use is the compilation of C.


That said, there's probably more I can do here — if we're writing something to work locally, possibly we've already implemented much of what the runner offers us.

Focus on what you need to do to get up and running on your developer machine. Run those same processes (more or less) on GHA/GitLab. Task becomes your abstraction.

You may still end up running specific GHA/Gitlab features, such as matrix in order to improve build time, but each of those jobs should also ideally just be calling task for the actual work they are doing. The "setup" should be as minimal as possible. When it comes to installing your dependencies, you definitely want to unify that into a single process (like asdf) otherwise risk drift between how it works locally vs how it works in CI.

@max-sixty

I don't mean to use this issue to go into lots of detail on my specific issues, but I think about some of these issues differently:

  • There are actual tradeoffs, and we can't just wave them away. I'm a big fan of task because I think it's often a worthwhile tradeoff, but it's not just a documentation thing: I have worked with very few python / rust / julia / ocaml projects where there's already a layer of indirection like make, so task does introduce an additional layer.
  • In rust's case, rustup already provides the ability to run with multiple versions. So task will always add another layer of indirection. I'm not sure if you're claiming we should require people to install asdf to run the project — that would not be seen as a reasonable request by most contributors (do you know of any projects that require that?).

@ghostsquad
Contributor

@max-sixty I'd like to refocus on the problem that you mentioned.

I'm often writing steps to execute in Task and GitHub Actions or GitLab CI, and don't compound or share between them. I'm not sure what the best approach is to reduce the cost of changing between these. Many of the GitHub Actions runners are designed to run independently, so we need to write them twice or lose the benefits of that runner.

This problem, and even the possible solution you mention describes that of an abstraction layer. This is not to be confused with a layer of indirection.

https://www.silasreinagel.com/blog/2018/10/30/indirection-is-not-abstraction/

The hard part is to identify the "overlap" that is desirable, and the risk involved in maintaining the disparate parts.

[Image: ci-local-venn diagram]

**Note: yes, you can run multiple parallel docker containers in your CI without using your CI's job concept. I know plenty of folks that do this, such as with docker-compose. Depending on your CI, there may be some extra work to get docker-in-docker type of thing working, but besides that, it's totally doable these days.

If you keep finding yourself duplicating code within your CI yaml files, you may want to consider generating that file, and just checking in the generated file. Maybe you can generate both your CI yaml & the Taskfile.yml from the same datasource? Sounds like an interesting thing to do.

@ghostsquad
Contributor

One last thing to mention: you said that you were already using Makefile. You can of course choose to use task and make simultaneously, but I'd definitely probe into why you aren't trying to collapse those (into using just make or just task). To me, the only reason to use both is because there's not (enough) feature parity between the two programs.

@max-sixty

Great, I think we agree there are tradeoffs in generalizing from two modes to the intersection's one mode. If we can lower those costs, then great. Those costs are high enough at the moment that I often write everything twice, despite being a loyal cheerleader for Task. Maybe we disagree on those costs — though I notice that task uses native runners for most of its GHA too: 1, 2. I'd be most persuaded by real examples of this done well.

One last thing to mention, is that you said that you were already using Makefile

I think I must have been unclear — I mean "very few" like "almost never" :)

I have worked with very few python / rust / julia / ocaml projects where there's already a layer of indirection like make, so task does introduce an additional layer.

@ghostsquad
Contributor

I'd be most persuaded by real examples of this done well.

I do agree that more examples, use cases, best/recommended practices need to be documented. I will be working on that in v4.

I think I must have been unclear — I mean "very few" like "almost never"

Oh, well then, ignore that then! 😄

I have worked with very few python / rust / julia / ocaml projects where there's already a layer of indirection like make, so task does introduce an additional layer.

The introduction of a task runner is a feature, not a bug. Sometimes people don't do this in their own projects, or wait until later to do it because their build is already very simple. Example, if go build is all I need to do, why would I need anything else? Or maybe they wait until their project iteration speed increases, and thus the "pain" of doing things manually becomes more apparent. Either way, there's reasons to "abstract" away some commands/workflows.

@andreynering
Member Author

I still don't think we should rename attributes without a good reason. These are the suggestions from the PoC branch:

  • deps -> needs_of (or needs?)
  • desc -> short
  • summary -> long (or example?)
  • cmds -> do
  • sources -> from
  • generates -> makes

I don't think the suggestions are necessarily better than the current names. Even though I admit some are a slight improvement over before, the current ones are actually fine (and naming is also a bit of personal preference).

Doing that would mean that basically all core attributes would be renamed, and I can see that being a big frustration to users. They would need to rewrite their entire Taskfiles, they would need to re-learn a whole new schema (they are already used to the current one), the entire documentation would need to be rewritten, etc.

I see a possible automatic conversion as a non-ideal thing: it might not always work (the variable syntax will be different, and that's something it probably wouldn't be able to fix: {{.VAR}} -> {{.V "VAR"}}). Also, the format (maybe comments, too) would be lost once you parse and re-generate the YAML file. I expect that many would still prefer to migrate manually (and we're going to have a migration guide, but the shorter it is, the better).

It would also make the implementation itself more difficult. If we mostly add new attributes, we can iterate over the current struct definitions (although the mentioned refactor to make the directory structure better is welcome). With a different (duplicated) struct definition, we'd have to duplicate all the code that handles it as well. Not really practical.

And again, it's not that breaking changes are not allowed, but I really want to avoid them when they don't bring enough benefits.


I hope you understand the reasoning. 🙂

My goal is to be practical and to both reduce the amount of work we need to do in order to reach the objective and also reduce the friction for users to upgrade and keep using their Taskfiles.

@ameergituser

ameergituser commented Apr 20, 2022 via email

@danquah
Contributor

danquah commented Apr 20, 2022

Strong agree here.

Renaming properties such as deps and desc would for sure improve the readability and make the initial work with Task easier for a newcomer. On the other hand, renaming those properties will break all current users who have a lot more hours invested in Task, and any tooling that interfaces directly with Taskfiles, such as the VSCode extension and e.g. code that scaffolds Taskfiles for projects.

Adopting a major version update may require users and tools to handle some number of breaking changes. That's ok, that's why we have major versions. But I really think it is important that the value a given change gives in the long run should not only match, but outweigh the cost you are forcing the user to pay up front to adopt the change.

I quite like some of the new names, I'm just not seeing them make that much of a positive change in my future use of Task that I would be ok with having to commit updated versions of all of my Taskfiles.

@ghostsquad
Contributor

Task1 needs Task2
Task1 depends on Task2

I suppose depends on reads easier.

Regarding other renames. I suppose I should verbalize why I want to rename them.

  • desc and summary feel like synonyms, and even as I type this, I can't remember what the difference is without looking at the documentation.

  • cmds aren't commands. They are things to do, which could include other tasks. So I just thought do is both simpler and more accurate.

  • sources & generates are probably ok. I thought maybe shortening these would be beneficial.

  • status is described as run if essentially. So I wanted to make that even simpler with just if.


Ok but let's assume I don't change the field names, I may still end up changing their value types in order to allow for the functionality I need. I'm still considering how/when to allow the schema to accept just a plain string (as a shortcut) vs object.


Regarding the vscode extension, I've already looked that up. Updating the schema here:

https://github.com/schemastore/schemastore/blob/master/src/schemas/json/taskfile.json

To include sub-schemas and conditional logic allows vscode and others to continue to work with v3 and v4 schemas based on the value of the version field in their file.

https://json-schema.org/understanding-json-schema/reference/conditionals.html


Happy to continue talks. I still have the impression that, as Task has grown organically, it is failing to provide consistency, it's hard to use "intuition" to find an answer (I'm constantly referring to the docs), and there are features which many people want which are not backwards compatible.

Some of the features I'm working on aim to reduce the amount of stuff you have to write in your Taskfiles as well. So though there may be large Taskfiles out there, that might be because of limitations with Task itself, meaning the burden remains on the user.

@postlund

My two cents here, mostly in regards to including multiple Taskfile.yml in a large structure:

  • I have the same scenario as @tylermmorton, where I'm using task in a monorepo and use the same task structure in subdirectories. It would be great to have some syntax to simplify writing these tasks.
  • In the same scenario, it makes a lot of sense to run commands relative to the directory of the Taskfile.yml they are defined in. So some way to instruct a task to use that as dir, instead of having to hard-code it manually, would help.
  • Variables local to the current Taskfile.yml only would be great too when working in a multi-level layout. In many situations I end up with the same token (e.g. path or name) in multiple tasks, which would be great to put in a variable and re-use.
  • Something like a "template task", which is a task that can only be called by other tasks and is not listed amongst regular tasks (e.g. with --list-all). I have lots of software components that are built in the same way, except for the name, which varies. So I write a "generic" task and call that, passing name as a variable. That task should never be called manually though, so it just pollutes the output.
  • Not sure if this is already supported or not, but would be great if it was possible to customize the output of "up-to-date-tasks" a bit. As per above, I re-use tasks a lot and a typical output might look like task: Task "buildsystem:external_tool" is up to date, but since buildsystem:external_tool is called with various input, I have no idea what external tool this task refers to.
  • Support for a default task per Taskfile.yaml. So if I have something like this:

foobar/Taskfile.yml:

tasks:
  default:
    cmds:
      - echo foobat

And this:

includes:
  foobar: ./foobar

I want task foobar: to mean foobar:default.
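
The requested lookup rule could be sketched in Go roughly as follows. This only illustrates the proposed behaviour and is not how Task resolves names today; the helper name `ResolveTaskName` and the rule "a trailing colon means the namespace's default task" are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// ResolveTaskName is a hypothetical helper: a call that names only a
// namespace (for example "foobar:") is rewritten to that namespace's
// default task. Any other name is returned unchanged.
func ResolveTaskName(name string) string {
	if strings.HasSuffix(name, ":") {
		return name + "default"
	}
	return name
}

func main() {
	fmt.Println(ResolveTaskName("foobar:"))      // foobar:default
	fmt.Println(ResolveTaskName("foobar:build")) // foobar:build
}
```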

@amancevice
Contributor

amancevice commented Apr 27, 2022

Just my 2¢, but I definitely prefer keeping the existing attribute naming conventions. A possible compromise is to add aliases for the existing ones so both will work.

My most-wanted breaking change would be to change the order of importance in the variable interpolation rules so that variables passed via the command line (eg, environment vars) take precedence over all. The reason for this is so users have the option to override variables at the command line.
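
To make the requested precedence concrete, here is a minimal Go sketch in which variable layers are merged in order and the last layer wins. `MergeVars` and the two example maps are hypothetical names for illustration only; they do not describe Task's actual implementation.

```go
package main

import "fmt"

// MergeVars merges variable layers into one map.
// Later layers overwrite earlier ones, so the last layer has top priority.
func MergeVars(layers ...map[string]string) map[string]string {
	out := make(map[string]string)
	for _, layer := range layers {
		for k, v := range layer {
			out[k] = v
		}
	}
	return out
}

func main() {
	taskfileVars := map[string]string{"ENV": "dev", "REGION": "us-east-1"}
	cliVars := map[string]string{"ENV": "prod"} // values supplied from outside

	// Outside values are passed last so that they take precedence over all.
	vars := MergeVars(taskfileVars, cliVars)
	fmt.Println(vars["ENV"], vars["REGION"]) // prod us-east-1
}
```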

My most-wanted non-breaking change is to allow users to re-color log lines (or disable coloring) via ENV variables or possibly a log: section in the Taskfile (I know I have an open PR attempting to address this; apologies for letting it go stale)

Thanks!

@ghostsquad
Contributor

Ya I may just alias the fields.

@ghostsquad
Contributor

ghostsquad commented Apr 27, 2022

With ENV Vars it's more complicated. If you look at Make, there are several rules for setting an environment variable value.

https://www.gnu.org/software/make/manual/html_node/Setting.html#Setting

This functionality is something I sorely miss. I don't necessarily want all vars to be overwritten by the outside, I need choices.

@amancevice
Contributor

@ghostsquad I wonder if some nested keywords could help enable this feature. Maybe something like (and I'm just spitballing here):

vars:
  # static var
  FIZZ: buzz

  # dynamic var
  JAZZ:
    sh: echo fuzz

  # static var (with expansion)
  FOO:
    default: bar
    expand: {recursive,simple,if_unset}
  
  # dynamic var (with expansion)  
  GOO:
    sh: echo zoo
    expand: {recursive,simple,if_unset}    
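
One possible reading of the `if_unset` mode above, borrowing the meaning of Make's `?=`, is that the Taskfile value only applies when nothing outside has set the variable. The Go sketch below shows just that one rule. `VarSpec`, `Resolve`, and the choice that the other modes always keep the Taskfile value are assumptions for illustration, not part of any agreed design.

```go
package main

import "fmt"

// VarSpec is a hypothetical in-memory form of one entry under `vars:`.
type VarSpec struct {
	Default string
	Expand  string // "recursive", "simple" or "if_unset"
}

// Resolve returns the value for name. Only "if_unset" lets a value from
// outside win; the other modes keep the Taskfile value (an assumption).
func Resolve(name string, spec VarSpec, outside map[string]string) string {
	if spec.Expand == "if_unset" {
		if v, ok := outside[name]; ok {
			return v
		}
	}
	return spec.Default
}

func main() {
	outside := map[string]string{"FOO": "from-shell", "FIZZ": "from-shell"}

	fmt.Println(Resolve("FOO", VarSpec{Default: "bar", Expand: "if_unset"}, outside)) // from-shell
	fmt.Println(Resolve("FIZZ", VarSpec{Default: "buzz", Expand: "simple"}, outside)) // buzz
	fmt.Println(Resolve("GOO", VarSpec{Default: "zoo", Expand: "if_unset"}, nil))     // zoo
}
```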

@ghostsquad
Contributor

ghostsquad commented May 2, 2022

2022-05-02 Update

The syntax I'm working on continues to diverge from task v3. The more I think about how things should work, the more I try to simplify & make this intuitive, and the more posted issues I investigate, the more my design diverges.

Divergences

Here's some of the major divergences I'm currently experimenting with:

There are issues out there for all these things, but it would take me a while to link to all of them, so I may come back and do this.


Functions


A single task feels like a function. You have (optional) inputs, and there's also desire to have functions produce outputs.

More on this later...


Cmds vs Deps vs Vars


The difference between cmds and deps feels like it should be narrow, but in fact, it's quite hairy as currently implemented in Task.

What I see is things running serially or in parallel.

Vars support sh, which makes them feel a bit like cmds.

Reality is that there's an "evaluation priority", and it's not immediately or intuitively clear what the evaluation priority is, or even should be.

This is fundamentally different and unintuitive when compared to a run-of-the-mill function such as in Go.

Example of some simple functions

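import "os/exec"

func Build() error {
  // os/exec runs the command; each argument is passed separately
  return exec.Command("go", "build", ".").Run()
}

func Test() error {
  if err := Build(); err != nil {
    return err
  }
  return exec.Command("go", "test", "./...").Run()
}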

Task equivalent

tasks:
  build:
    cmds:
    - go build .
  test:
    deps:
    - build
    cmds:
    - go test ./...

  # OR

  test:
    cmds:
    - task: build
    - go test ./...

One recent change I found I really needed is to remove deps entirely and expand upon different "step types", like in ConcourseCI.

fns:
  build:
    do:
    - go build .

  test:
    do:
    - fn: build
    - go test ./...

Deciding what to do (Decisions & Memoization)


How would we handle, simply, various scenarios for deciding what to run?


Arbitrary Memoization

Function definition only memoization


Example of preventing the same task from running multiple times

// package-level declarations need var; := is only valid inside functions
var cache = memoize.NewMemoizer()

func Foo() error {
  // Something expensive, e.g.
  time.Sleep(3 * time.Second)
  return nil
}

func FooMemoized() error {
  return cache.Run(Foo)
}

func Complicated() error {
  err := FooMemoized()
  if err != nil {
    return fmt.Errorf("fooing: %w", err)
  }
  return nil
}
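
For readers who want something runnable, here is a minimal sketch of what such a memoizer could look like. It is not the `memoize` package used above: the `Memoizer` type is written from scratch for illustration, and, unlike the example above, its `Run` takes an explicit name as the cache key (an assumption that sidesteps the fact that Go functions cannot be map keys).

```go
package main

import (
	"fmt"
	"sync"
)

// Memoizer runs each named function at most once and replays its result.
type Memoizer struct {
	mu      sync.Mutex
	entries map[string]*entry
}

type entry struct {
	once sync.Once
	err  error
}

// NewMemoizer returns an empty, ready-to-use Memoizer.
func NewMemoizer() *Memoizer {
	return &Memoizer{entries: make(map[string]*entry)}
}

// Run executes fn the first time name is seen; later calls with the same
// name return the cached error without running fn again.
func (m *Memoizer) Run(name string, fn func() error) error {
	m.mu.Lock()
	e, ok := m.entries[name]
	if !ok {
		e = &entry{}
		m.entries[name] = e
	}
	m.mu.Unlock()

	e.once.Do(func() { e.err = fn() })
	return e.err
}

func main() {
	cache := NewMemoizer()
	calls := 0
	foo := func() error { calls++; return nil }

	cache.Run("foo", foo)
	cache.Run("foo", foo)
	fmt.Println(calls) // 1
}
```

Using `sync.Once` per key means concurrent callers of the same name block until the first run finishes, which matches the "run this task only once" intent.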

The Future

fns:
  # optionally declare a non-memoized function
  # only required if there are cases where
  # you want to do/recalculate every time it's called
  foo:
    do:
    - sleep 3

  foo:mem:
    memoize: true
    do:
    - fn: foo

  complicated:
    do:
    - fn: foo:mem
    - go test ./...

Sources/Generates


func GenHam() {
  hamRaw := "ham.raw"
  hamCooked := "ham.cooked"

  // get last modified time of the source
  hamRawFile, err := os.Stat(hamRaw)
  if err != nil {
    return // handle file doesn't exist
  }
  hamRawModifiedTime := hamRawFile.ModTime()

  // skip cooking when the output exists and is newer than the source
  hamCookedFile, err := os.Stat(hamCooked)
  if err == nil {
    hamCookedModifiedTime := hamCookedFile.ModTime()
    if hamRawModifiedTime.Before(hamCookedModifiedTime) {
      return
    }
  }

  cook(hamRawFile)
}

The above code can be abstracted into a reusable decision function, which then can be memoized. Note that memoization takes into account all inputs, including the do function, as a combined cache key.

This is a naive example, since functions cannot be used as a map key. Instead, this would be implemented similarly to how template.FuncMap works, where functions are named, and thus we are caching the name, not the actual function.

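// GetMaxModified is assumed to return the newest modification time as a time.Time
func FileModifiedDecision(sources []string, generates []string, do func() error) error {
  greatestSourceModified := GetMaxModified(sources)
  greatestGeneratesModified := GetMaxModified(generates)

  // time.Time values are compared with After/Before, not > or <
  if greatestSourceModified.After(greatestGeneratesModified) {
    return do()
  }

  return nil
}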
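
The `GetMaxModified` helper is not shown above. One possible shape for it, together with the resulting "does this need to run?" check, is sketched below. Skipping missing files (so that a missing output always triggers a run) is an assumption of this sketch, as is the `NeedsRun` name.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// GetMaxModified returns the newest modification time among paths.
// Missing files are skipped, so the result is the zero time when none exist.
func GetMaxModified(paths []string) time.Time {
	var newest time.Time
	for _, p := range paths {
		info, err := os.Stat(p)
		if err != nil {
			continue
		}
		if info.ModTime().After(newest) {
			newest = info.ModTime()
		}
	}
	return newest
}

// NeedsRun reports whether some source is newer than every generated file.
func NeedsRun(sources, generates []string) bool {
	return GetMaxModified(sources).After(GetMaxModified(generates))
}

func main() {
	dir, err := os.MkdirTemp("", "ham")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	raw := filepath.Join(dir, "ham.raw")
	cooked := filepath.Join(dir, "ham.cooked")

	os.WriteFile(raw, []byte("raw"), 0o644)
	fmt.Println(NeedsRun([]string{raw}, []string{cooked})) // true: nothing generated yet

	os.WriteFile(cooked, []byte("cooked"), 0o644)
	later := time.Now().Add(time.Hour)
	os.Chtimes(cooked, later, later) // make the output clearly newer than the source
	fmt.Println(NeedsRun([]string{raw}, []string{cooked})) // false
}
```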

This leads to the following:

fns:
  ham:
    memoize:
      sources:
        ham.raw: {}
      generates:
        ham.cooked: {}
    do:
    - cook ham.raw > ham.cooked

  breakfast:
    do:
    - fn: ham

Inputs & Outputs


// item should be a pointer
func Cook(item interface{}, temp int, timer time.Duration) error {
  oven.Load(item)
  oven.SetTemp(temp)
  return oven.Start(timer)
}

var cache = memoize.NewMemoizer()

func CookMemoized(item interface{}, temp int, timer time.Duration) error {
  return cache.Run(Cook, item, temp, timer)
}

fns:
  cook:
    memoize: true
    inputs:
      item:
        description: the food to cook
        required: true
        type: any
      temp:
        description: oven temperature needed
        default: 400
        type: number
      timer:
        description: how long to cook for
        required: true
        type: number
    do:
    - |-
      oven \
        --load {{.inputs.item}} \
        --temp {{.inputs.temp}} \
        --timer {{.inputs.timer}}
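
To illustrate how `required` and `default` might behave when a function is called, here is a small Go sketch. The `Input` type and `BindInputs` function are hypothetical names invented for this example, and type checking (`type: number`, etc.) is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// Input mirrors one entry under `inputs:` in the proposal above.
type Input struct {
	Required bool
	Default  any
}

// BindInputs fills in defaults and rejects calls that omit a required input.
func BindInputs(decl map[string]Input, given map[string]any) (map[string]any, error) {
	bound := make(map[string]any, len(decl))
	for name, in := range decl {
		if v, ok := given[name]; ok {
			bound[name] = v
			continue
		}
		if in.Required {
			return nil, errors.New("missing required input: " + name)
		}
		bound[name] = in.Default
	}
	return bound, nil
}

func main() {
	cook := map[string]Input{
		"item":  {Required: true},
		"temp":  {Default: 400},
		"timer": {Required: true},
	}

	bound, err := BindInputs(cook, map[string]any{"item": "ham", "timer": 30})
	fmt.Println(bound["temp"], err) // 400 <nil>

	_, err = BindInputs(cook, map[string]any{"item": "ham"})
	fmt.Println(err) // missing required input: timer
}
```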

Parallelism


// purposefully over-simplified: the goroutines are started but not waited on
func AllTheThings() {
  var things []Runner

  things = append(things, newThing1())
  things = append(things, newThing2())
  things = append(things, newThing3())

  for _, thing := range things {
    go thing.Run()
  }
}

fns:
  all-the-things:
    do:
    - parallel:
      - thing1
      - thing2
      - thing3
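
A slightly less simplified version of a `parallel` step would also wait for every step and report failures. The sketch below shows one way to do that with `sync.WaitGroup` and `errors.Join` (which needs Go 1.20 or later). `RunParallel` is a hypothetical name, and collecting all errors rather than stopping at the first one is a design assumption.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// RunParallel starts every step in its own goroutine, waits for all of
// them, and returns their errors joined together (nil if all succeeded).
func RunParallel(steps ...func() error) error {
	errs := make([]error, len(steps))
	var wg sync.WaitGroup
	for i, step := range steps {
		wg.Add(1)
		go func(i int, step func() error) {
			defer wg.Done()
			errs[i] = step() // each goroutine writes only its own slot
		}(i, step)
	}
	wg.Wait()
	return errors.Join(errs...)
}

func main() {
	ok := func() error { return nil }
	bad := func() error { return errors.New("thing2 failed") }

	fmt.Println(RunParallel(ok, ok, ok))  // <nil>
	fmt.Println(RunParallel(ok, bad, ok)) // thing2 failed
}
```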

There's even more I can write, but this post is already getting pretty long. The main point of this writing is that this is significantly diverging from Task, at this point.

Since backwards compatibility is still important for some folks, I've decided to move my work to go-fn/fn. This will allow me open/creative freedom to look at the space, requirements, use-cases, constraints, abstractions, etc and develop something without forcing me into a situation of trying to fit a square peg into a round hole.

@max-sixty

My general thoughts — as a cheerleader but nothing more — are whether we can move in this direction without overhauling things we know have worked well, even though we won't get all the way right now. task has been really successful, and we should approach wholesale changes with humility / the principle of Chesterton's fence.

For example: it would indeed be good to have parallel tasks. But can we do that with cmds: being serial and parallel_cmds: / parallel: true / similar — rather than overhauling this into a do:?

(I like the inputs as an option! And the memoize as an advanced option, if we can keep the existing approaches working too!)

And ofc if you don't think task is the right place to start, then very reasonable to start a new fork and evaluate ideas there; I look forward to seeing how it progresses and hope it does well.

@andreynering
Member Author

andreynering commented May 2, 2022

Hey everybody!

First of all, sorry for being a bit away. I've been indeed kinda busy recently.


@ghostsquad I hope you understand I'm looking out for what's best for the project, so please don't take it personally 🙂.

But I'm a bit worried that you're trying to propose too many changes to the project, and this comes with some drawbacks:

  1. This implies breaking changes to existing users (as we already talked about above)
  2. This may make v4 a work that will take too long to be released (I already experienced this in the past with v3 taking about a year and a half, while I'd prefer to do smaller releases)
  3. With so many discussions, it makes it a bit hard to even decide what we want to do and how

Given the big changes proposed, as an alternative, you could consider a fork or even a brand new project (@max-sixty said it a moment before me, but I was about to suggest the same 🙂). There's really no problem in doing that, if you feel like putting the effort in. It could actually make it easier for you to implement your ideas with more freedom.

If you prefer to work on Task, you're more than welcome 🙂, but smaller steps is what I propose to make it viable. Some of the principles I tried to explain in this issue are connected with this objective: avoiding breaking changes, iterating on existing code instead of re-writing, having proposals on separate and small PRs to make it easier to discuss in isolation, etc.


In short: everybody here has a full time job & life and limited time to dedicate to OSS. This means that to make it manageable (and avoid burnout) we need to be practical. In this context, not implementing every idea, and mostly doing small/backward-compatible iterations, is an intentional decision.

@ghostsquad
Contributor

ghostsquad commented May 2, 2022

principle of Chesterton's fence.

I believe that everything I'm doing here is being done with respect to this principle. The way I solve problems is by looking at what the end result is, not necessarily how you are getting to that end result. Similarly, in game design, design elements either support the primary objective/theme, or get in the way. It's important to aggressively cull elements which don't support the primary objective/theme.

With that said, I'll also reiterate that words matter, readability matters, and design principles matter, not only when it comes to understanding how something works, but also for being able to use the tool intuitively. The tool ideally should not get in the way of a user. It should not be the primary focus. It should get a job done, and get out of the way.

https://99percentinvisible.org/article/norman-doors-dont-know-whether-push-pull-blame-design/

In the book, Norman introduced the term affordance as it applied to design, borrowing James J. Gibson's concept from ecological psychology. Examples of affordances are flat plates on doors meant to be pushed, small finger-size push-buttons, and long and rounded bars we intuitively use as handles. As Norman used the term, the plate or button affords pushing, while the bar or handle affords pulling. Norman discussed door handles at length.

I can easily say that cmds aren't commands. They are either tasks or commands. What is a task? Well.. at its most basic level, it's one or more commands run serially (cmds), but those tasks/commands should not run until zero or more dependent task/commands (deps) run. Oh by the way, deps run in parallel. The current command(s) and/or their dependencies may not even run... <insert complex explanation of when variables are evaluated, and deps are run, etc, etc>

I'd also like to reiterate that I started this because I want to leverage all the creative work and interesting use cases that exist from users of Task. A Major Version bump should allow backwards incompatible affordances if they indeed provide value to users, which is why I started this journey as V4. I understand the pain in "migrating", and would absolutely, 100% be willing to make some sort of migration tool, but it likely wouldn't be exact, because in some or many cases it may be impossible to understand the intent of the author from the code itself, but I definitely would try.

I also absolutely love the Zen of Python. I use it often when I'm designing something.

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Zen of Python

Trust me when I say, I've put a LOT of thought into this. 😉

In this context, not implementing every idea, and mostly doing small/backward-compatible iterations, is an intentional decision.

Small/backwards-compatible iterations are well-suited for continuing with V3. I've spent a significant amount of time looking at the code base, the issues that come up, and considering what changes could be done that are actually backwards compatible. I can think of many situations (and have posted in several issues) that result in "can't fix/change, it would not be backwards compatible". That sucks. 😞

@ghostsquad
Contributor

@ghostsquad I hope you understand I'm looking out for what's best for the project, so please don't take it personally 🙂.

definitely not taking anything personally @andreynering - as I said, I'm here because task is cool, and I've learned a lot from it. It's simple, and yet powerful! I want to bring it to the next level, that is all. Sometimes that results in... standing on the shoulders of giants! 😉

Anyways, this has become quite a passion project of mine, because it closely aligns to work I do on a day-to-day basis. I'm an infrastructure automation engineer (some people call it devops, but that's a culture, not a title, IMO). So, literally, writing tools and services for use by developers is my day job. 😄

@prodigy-byron-jones

I would love there to be a distinction between the kind of information summary provides. Right now if I ask my users to run that command to get the detailed usage instructions I've provided they additionally get all the code. This is entirely redundant to them (and in most cases they would go directly to the code anyways).

I would propose something like:

--summary (This is in essence what we have right now but without the `commands:` component)
--detailed-summary (This gives the current functionality in all its glory)

Thank you,

@ghostsquad
Contributor

I would love there to be a distinction between the kind of information summary provides. Right now if I ask my users to run that command to get the detailed usage instructions I've provided they additionally get all the code. This is entirely redundant to them (and in most cases they would go directly to the code anyways).

I would propose something like:

--summary (This is in essence what we have right now but without the `commands:` component)
--detailed-summary (This gives the current functionality in all its glory)

Thank you,

What I am deriving from this request, is "show me the human-readable documentation without the code". Is this accurate?

@prodigy-byron-jones

prodigy-byron-jones commented May 30, 2022

I would love there to be a distinction between the kind of information summary provides. Right now if I ask my users to run that command to get the detailed usage instructions I've provided they additionally get all the code. This is entirely redundant to them (and in most cases they would go directly to the code anyways).
I would propose something like:

--summary (This is in essence what we have right now but without the `commands:` component)
--detailed-summary (This gives the current functionality in all its glory)

Thank you,

What I am deriving from this request, is "show me the human-readable documentation without the code". Is this accurate?

Correct. With a suggested implementation approach.

@ivotron

ivotron commented Aug 9, 2022

hello! adding my selfish list of requests:

  • allowing env variables to be overridden (Changing PATH env var doesn't seem to work #482)
  • support plugins to allow sharing of tasks. or define a list of built-in tasks to e.g. install tools (mimic what asdf does)
  • natively supporting running tasks in containers

thank y'all!

@amancevice
Contributor

@ivotron the way I've gotten around your second point is to keep tasks in users' home directories and reference them there:

includes:
  example: ~/.task/Taskfile.yml

@ivotron

ivotron commented Aug 9, 2022

thanks @amancevice. I meant sharing at community level, like ansible's galaxy or homebrew's formulas but for tasks

@amancevice
Contributor

oh, ha, yeah that's a great idea

@andreynering
Member Author

Hey everyone! 👋

I've been struggling to find enough time to make significant development on a next major version.

Because of this, and with my own mental health in mind, I've decided to close this issue for now, to avoid any expectations that we'll see a v4 soon.

What this means is that, in the foreseeable future, I plan to focus mostly on stability and small features that fit the current v3 version (that is, changes that are backwards compatible).

I may still consider a next major version in the future when I decide it's the right time. When it happens, there's a chance that I reduce its scope to make it easier to ship.

I'm grateful for all the support from the community. In particular, people that have been opening PRs with small changes have been a huge help to me. Others have been helping answer questions on GitHub or Discord, which also helps a lot. Thank you!

@andreynering andreynering unpinned this issue Aug 28, 2022
@andreynering andreynering closed this as not planned Aug 28, 2022
@ghostsquad
Contributor

@andreynering ya, I haven't even had very much time to work on my own implementation.

@pd93 pd93 removed the v4 label Oct 15, 2022
@ReillyBrogan
Contributor

Perhaps #852 would be a good inclusion for a future major version? Being able to have a DAG of tasks between different includes (modules) would be awesome to bring this more in line with other build tools.
