feat: add a new .taskrc.yml to enable experiments #1982

Open: wants to merge 10 commits into base: main

vmaerten
Member

@vmaerten vmaerten commented Dec 31, 2024

I like this idea!
This file should override the environment variable.

@vmaerten vmaerten linked an issue Dec 31, 2024 that may be closed by this pull request
@vmaerten vmaerten marked this pull request as ready for review December 31, 2024 16:09
@trulede

trulede commented Jan 2, 2025

Did you ever consider multidoc yaml files? Then you can have many schemas within a single Taskfile.yml, and solve other interesting problems related to advanced Task integrations (see #1916).

Changes in and around reader.go, but nothing significant.

---
version: 3
tasks: ...
---
task-experiments:
  TASK_X_FOO: BAR
---
foreign-schema:
  ignored-by-task: ...

The current mechanism for enabling experiments is a bit cumbersome. It would be ideal to also have the option to configure this directly in the Taskfile. Additional files perhaps do not help much, for instance with remote Taskfiles, where the remote Taskfile would need one of the experiments enabled.

@vmaerten
Member Author

vmaerten commented Jan 2, 2025

Great review as always @pd93 thanks!
Creating the PoC didn’t take me too long, so I am happy to rework and/or discard it if we do not want it.

Regarding the filename, how would you like to proceed? Should I post a message on Discord to discuss it?

@trulede, about configuring in the Taskfile itself, @pd93 already answered here #1978 (comment)

It can be tricky to manage if, for instance, three includes define the same experiments (e.g., ANY_MAPS) but with different values.

@vmaerten vmaerten requested a review from pd93 January 2, 2025 12:49
@trulede

trulede commented Jan 2, 2025

@trulede, about configuring in the Taskfile itself, @pd93 already answered here #1978 (comment)

I was suggesting using a multi-doc yaml file, containing the Taskfile (schema) and other schemas in subsequent documents contained within the same yaml file. It is just something to consider, as you can place multiple docs/schemas in a single YAML file.

The question I might have in respect to this PR is why not use the existing mechanism dotenv where I can give the "experiments" file any name (and path) I want? Yes, the .env file was a problem, and any new name for the file is not going to be any better.

There is a more fundamental problem here; if I design a Taskfile to use an experimental feature, then that feature may no longer be optional, so I need a way to ensure that it's enabled. Adding a new schema to the YAML file which contains the taskfile schema would be a handy way to achieve that.

Adding a .env or .task-experiments.yml file suggests to me that a more general problem is emerging, and that a more durable solution could also be possible. Embedding new schemas in the Task file, and/or loading those configuration schemas from the traditional locations (e.g. /usr, /usr/local, ~/ or the taskfile.yml itself) would create a more fine-grained configuration system ... or at least the potential for one.

@pd93
Member

pd93 commented Jan 2, 2025

I was suggesting using a multi-doc yaml file, containing the Taskfile (schema) and other schemas in subsequent documents contained within the same yaml file. It is just something to consider, as you can place multiple docs/schemas in a single YAML file.

Although putting multiple schemas in a single YAML file is possible, I'm not keen on this approach. Experiments were not designed to be used in production (hence the name) and I have no intention to design large features just to support cases where users have experiments enabled across teams. I would prefer to keep experimental code far away from the schema as experiments are inherently very changeable.

However, I do understand that people will and are using experiments across teams and that there needs to be some way (as @nathanperkins requested) to allow this. .env works fine for this and remains a very simple feature, but may conflict with other config files in a user's repo. This is why a Task-specific alternative was mentioned.

The question I might have in respect to this PR is why not use the existing mechanism dotenv where I can give the "experiments" file any name (and path) I want? Yes, the .env file was a problem, and any new name for the file is not going to be any better.

Can you explain (other than file conflicts) why you think the .env file is a problem?

Another reason that using the schema (including the dotenv) is not practical is because the Taskfile is not read/parsed and the dotenvs are not loaded until a significant amount of the code has executed. We may need/want to have experimental flags for things that run before a Taskfile is parsed. This is why the experiment flags are one of the first things to be evaluated in our code.

There is a more fundamental problem here; if I design a Taskfile to use an experimental feature, then that feature may no longer be optional, so I need a way to ensure that it's enabled. Adding a new schema to the YAML file which contains the taskfile schema would be a handy way to achieve that.

Maybe I'm misunderstanding, but as far as I can tell loading an extra file such as .env or .task-experiments.yml which is committed to the same repo fixes this problem.

Adding a .env or .task-experiments.yml file suggests to me that a more general problem is emerging, and that a more durable solution could also be possible. Embedding new schemas in the Task file, and/or loading those configuration schemas from the traditional locations (e.g. /usr, /usr/local, ~/ or the taskfile.yml itself) would create a more fine-grained configuration system ... or at least the potential for one.

Loading from /usr or /usr/local would not solve sharing experimental Taskfiles. I'm open to a more granular configuration scheme with hierarchy etc, but this feels like a separate feature (albeit one that can also enable experiments if we want).

@pd93
Member

pd93 commented Jan 2, 2025

Regarding the filename, how would you like to proceed? Should I post a message on Discord to discuss it?

I think we can just leave this issue open for a bit (no rush to merge) and let @andreynering and other members of the community leave some feedback on the change and filenames. Feel free to put a message on Discord that links people here, but not everyone has a Discord account, so GitHub is a more transparent place for this feedback.

@vmaerten
Member Author

vmaerten commented Jan 2, 2025

I think we can just leave this issue open for a bit (no rush to merge) and let @andreynering and other members of the community leave some feedback on the change and filenames.

Of course, no need to rush at all 🙂

For the filename, I can think of:

  1. .task-experiments.yaml
  2. .task-config.yaml
  3. .task-settings.yaml

Feel free to comment and suggest filenames.

@andreynering
Member

Hey guys,

What problem are we solving by having a separate config file? Can't the user just commit the .env file into the repository? I'm not saying I'm against the idea, just trying to understand if it is worth the added complexity.

If we decide to proceed, I strongly suggest .taskrc.yml, because it's shorter and the rc suffix is a well-established name for config files in the Unix world. Linters in the JS world also use that (.eslintrc.json, for example).

/cc @vmaerten @pd93

@andreynering
Member

Also, a more generic name means we can expand to contain more settings in the future if needed, so having a specific name like .task-experiments.yaml I wouldn't recommend.

@pd93
Member

pd93 commented Jan 2, 2025

What problem are we solving by having a separate config file?

The only problem I really have with .env is the fact that people may be using it for something else already and not want to pollute it with our settings. This all spun out of #1978 (comment).

If we decide to proceed, I strongly suggest .taskrc.yml

Not against this given the industry standard, though I always found this precedent weird since rc is supposed to be short for runcom or run commands which a settings file does not do 🤷

Also, a more generic name means we can expand to contain more settings in the future if needed, so having a specific name like .task-experiments.yaml I wouldn't recommend.

The experiments specific file was really just to avoid backwards incompatible changes with schemas. For example, if we include settings in the same file (regardless of filename), we will occasionally be removing valid settings from the config file. Having said that, we could just set the schema to a map[string]int for the experiments and ignore invalid keys.

@andreynering
Member

From Claude (pasting just out of curiosity):

In configuration files like .bashrc, what does rc mean?

The "rc" suffix in configuration filenames like .bashrc comes from Unix history. It stands for "run commands" or "runtime configuration." This naming convention dates back to the early days of Unix, specifically to the "/etc/rc" script that was used during system startup in 1977's UNIX 7th Edition.

The "rc" suffix became a convention for naming configuration files after that, and you can see it in many other examples:

  • .vimrc (Vim configuration)
  • .zshrc (Zsh shell configuration)
  • .npmrc (NPM configuration)
  • .screenrc (GNU Screen configuration)

This convention has persisted through decades of Unix and Linux development, though it's interesting to note that some modern programs opt for different naming schemes like using .config directories or .toml/.yaml files instead.

It's also used by some linters, right?

Yes, many linters and code quality tools use the "rc" suffix for their configuration files. Some common examples include:

  • .eslintrc - ESLint configuration (JavaScript/TypeScript linting)
  • .prettierrc - Prettier configuration (code formatting)
  • .babelrc - Babel configuration (JavaScript transpiling)
  • .stylelintrc - Stylelint configuration (CSS linting)

Many of these tools also support alternative formats with additional extensions like .eslintrc.json, .eslintrc.js, or .eslintrc.yaml, giving developers flexibility in how they want to write their configurations. However, the base "rc" suffix remains as a nod to the Unix heritage we discussed earlier.


Having said that, we could just set the schema to a map[string]int for the experiments and ignore invalid keys.

Yes, in theory we could just ignore any values. That said, I like the @nathanperkins idea (mentioned here and here) that we would error if an unrecognized / retired experiment is enabled, to avoid surprises to the user.

@vmaerten
Member Author

vmaerten commented Jan 2, 2025

Not against this given the industry standard, though I always found this precedent weird since rc is supposed to be short for runcom or run commands which a settings file does not do 🤷

From ChatGPT, it seems it also means "runtime configuration". We also have .npmrc

The term rc in file names like .eslintrc.json stands for "run commands" or "runtime configuration".
It indicates that the file contains configuration settings for a specific tool or application.


Also, a more generic name means we can expand to contain more settings in the future if needed, so having a specific name like .task-experiments.yaml I wouldn't recommend.

I also put this idea on the table here #1978 (comment)

The experiments specific file was really just to avoid backwards incompatible changes with schemas. For example, if we include settings in the same file (regardless of filename), we will occasionally be removing valid settings from the config file. Having said that, we could just set the schema to a map[string]int for the experiments and ignore invalid keys.

Yes, I started with the type as map[string]int. We could have a more global filename with a key like experiments typed as map[string]int, and ignore irrelevant keys / values.

@vmaerten
Member Author

vmaerten commented Jan 2, 2025

Ahah @andreynering I just did the same with ChatGPT 😂

@pd93
Member

pd93 commented Jan 2, 2025

@andreynering @vmaerten Interesting to see your AI tools of choice 😆

TIL it stands for runtime configuration too. I had only ever heard the runcom thing. In that case, this makes a lot more sense and I'm totally fine with it.

Sounds like the consensus is .taskrc.yml and/or .taskrc.yaml then? We can reuse this going forwards for other things if we want and for now it can just contain an experiments key of map[string]int.

I'm fine with erroring on unrecognised experiments. There is actually a leftover line that does this for the old ANY_VARIABLES experiments anyway.
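
A minimal sketch of what erroring on unrecognised or retired experiments might look like, assuming the experiments key has already been decoded into a map[string]int. The knownExperiments table and the validateExperiments function are hypothetical names for illustration and are not taken from Task's code; the allowed values (1, 2) follow the values mentioned in this thread.

```go
package main

import (
	"fmt"
	"sort"
)

// knownExperiments maps each active experiment to its allowed values.
// The entries are placeholders for illustration only.
var knownExperiments = map[string][]int{
	"MAP_VARIABLES": {1, 2},
}

// validateExperiments returns an error for the first experiment (in
// alphabetical order) that is unknown or set to a value it does not accept.
func validateExperiments(enabled map[string]int) error {
	names := make([]string, 0, len(enabled))
	for name := range enabled {
		names = append(names, name)
	}
	sort.Strings(names) // keep the reported error deterministic

	for _, name := range names {
		allowed, ok := knownExperiments[name]
		if !ok {
			return fmt.Errorf("unknown or retired experiment %q", name)
		}
		value := enabled[name]
		valid := false
		for _, a := range allowed {
			if a == value {
				valid = true
				break
			}
		}
		if !valid {
			return fmt.Errorf("experiment %q does not accept value %d (allowed: %v)", name, value, allowed)
		}
	}
	return nil
}

func main() {
	// A recognised experiment with an allowed value passes validation.
	fmt.Println(validateExperiments(map[string]int{"MAP_VARIABLES": 2}))
	// An experiment that is not in the table is reported as an error.
	fmt.Println(validateExperiments(map[string]int{"ANY_VARIABLES": 1}))
}
```

Failing loudly here, rather than silently ignoring the key, is the behaviour suggested above to avoid surprising users when an experiment is retired.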

@vmaerten
Member Author

vmaerten commented Jan 2, 2025

Sounds like the consensus is .taskrc.yml and/or .taskrc.yaml then? We can reuse this going forwards for other things if we want and for now it can just contain an experiments key of map[string]int.

Yes, at least for me!

I'm fine with erroring on unrecognised experiments. There is actually a leftover line that does this for the old ANY_VARIABLES experiments anyway.

I'm fine with it but it could / should be done in another PR

@vmaerten
Member Author

vmaerten commented Jan 2, 2025

So, for example, to enable the second version of the map variables experiment, we could do:
.taskrc.yml:

experiments:
  MAP_VARIABLES: 2
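
One way the experiments map above could be combined with TASK_X_* environment variables, with the file taking precedence as proposed at the top of this thread, is sketched below. This is only an illustration under stated assumptions: mergeExperiments is a hypothetical helper, the file content is assumed to be already decoded into a map[string]int, and non-integer environment values are simply skipped.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// envPrefix is the prefix used for experiment environment variables in this
// thread (for example TASK_X_FOO).
const envPrefix = "TASK_X_"

// mergeExperiments combines experiment values from environment variables
// (given as "KEY=VALUE" strings, the format os.Environ returns) with values
// from the experiments key of a config file. File values win on conflict.
func mergeExperiments(environ []string, fromFile map[string]int) map[string]int {
	merged := make(map[string]int)
	for _, kv := range environ {
		key, value, ok := strings.Cut(kv, "=")
		if !ok || !strings.HasPrefix(key, envPrefix) {
			continue
		}
		n, err := strconv.Atoi(value)
		if err != nil {
			continue // non-integer values are skipped in this sketch
		}
		merged[strings.TrimPrefix(key, envPrefix)] = n
	}
	// Apply the file last so that it overrides the environment.
	for name, n := range fromFile {
		merged[name] = n
	}
	return merged
}

func main() {
	environ := []string{"TASK_X_MAP_VARIABLES=1", "PATH=/usr/bin"}
	fromFile := map[string]int{"MAP_VARIABLES": 2}
	fmt.Println(mergeExperiments(environ, fromFile))
}
```

Swapping the order of the two loops would give the opposite precedence, where environment variables win.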

@pd93
Member

pd93 commented Jan 2, 2025

Question: Do we want a schema version like we have in the main Taskfile? If so, should it match the Taskfile schema version? i.e.

version: 3

experiments:
  MAP_VARIABLES: 2

My opinion is probably yes to both of these. Main reason being that it might be confusing to have different versions for different things.

@andreynering
Member

Question: Do we want a schema version like we have in the main Taskfile?

Good point. I'm not sure, because I don't expect this schema to change much, if at all.

@pd93
Member

pd93 commented Jan 2, 2025

Good point. I'm not sure, because I don't expect this schema to change much, if at all.

Haha, I'm gunna screenshot this comment and save it for a couple of years 😉

@vmaerten
Member Author

vmaerten commented Jan 2, 2025

Haha, for now, we'll only have experiments in it. Maybe we can delay the decision until we add other things?

@vmaerten vmaerten changed the title feat: add a new .task-experiments.yml to enable experiments feat: add a new .taskrc.yml to enable experiments Jan 2, 2025
@vmaerten
Member Author

vmaerten commented Jan 2, 2025

I've updated the code based on what we discussed. Currently, the parsing is done in experiment, but we'll move it to a more global scope if we decide to use this file for other settings.

I've also created a JSON Schema hosted directly on our website.

For now, I did not add a version, but that can be challenged.

@trulede

trulede commented Jan 2, 2025

If you put an "experiment" into the Task executable, it's not an experiment, it's a feature. I understand the intention, but in doing so it's created a config issue, and a difficult user experience. A plugin mechanism might be better, and open up Task to external developers.

Does Task really need a config? IMO nothing is being achieved with the current mechanism. If I design a task which uses an "experimental" feature, then I want to use that feature. Having some kind of enabling mechanism is just complicating the use of Task. Nothing else.

In any case, I suggest following the traditional patterns for config files (git does config this way). Start with something general (taskconfig) and then consider where to put it. Not suggesting to implement it, just as a way of thinking about the problem.

  • /usr/local/etc/taskconfig - system wide
  • ~/.taskconfig - global for the user
  • repodir/.taskconfig - repo only
  • Taskfile - local (to the Taskfile) as additional schema docs ... (git does this too, repo local settings), you should be more open to that ... it would enable some interesting mechanisms.
  • CLI options
  • environment variables - always the highest priority (since CLI options might be baked into a container image)

You can take it a long way ... that is the problem with configs.
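
The layered lookup listed above can be sketched as a simple ordered merge, where each later source overrides the earlier ones. This is only a way of thinking about the idea, not a proposal for Task's implementation: the layer type and resolve function are hypothetical, and the layers are assumed to be already loaded into maps.

```go
package main

import "fmt"

// layer is one configuration source, such as a system-wide file or the
// environment. The type is hypothetical and exists only for this sketch.
type layer struct {
	source      string
	experiments map[string]int
}

// resolve walks the layers from lowest to highest priority. A later layer
// overrides an earlier one. It returns the winning value for each experiment
// and the name of the source that supplied it.
func resolve(layers []layer) (map[string]int, map[string]string) {
	values := make(map[string]int)
	origin := make(map[string]string)
	for _, l := range layers {
		for name, v := range l.experiments {
			values[name] = v
			origin[name] = l.source
		}
	}
	return values, origin
}

func main() {
	// Ordered from lowest to highest priority, following the list above.
	layers := []layer{
		{source: "/usr/local/etc/taskconfig", experiments: map[string]int{"MAP_VARIABLES": 1}},
		{source: "~/.taskconfig", experiments: nil},
		{source: "repodir/.taskconfig", experiments: map[string]int{"MAP_VARIABLES": 2}},
		{source: "environment", experiments: nil},
	}
	values, origin := resolve(layers)
	for name, v := range values {
		fmt.Printf("%s=%d (from %s)\n", name, v, origin[name])
	}
}
```

Tracking the origin alongside the value makes it possible to tell a user which file or variable enabled an experiment.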

@trulede

trulede commented Jan 2, 2025

If we decide to proceed, I strongly suggest .taskrc.yml

Not against this given the industry standard, though I always found this precedent weird since rc is supposed to be short for runcom or run commands which a settings file does not do 🤷

Correct. rc is not a great choice.

@pd93
Member

pd93 commented Jan 2, 2025

If you put an "experiment" into the Task executable, it's not an experiment, it's a feature.

I don't understand what the alternative to having experiments in the executable would be. Where else would they go? Experiments are experimental features. They are things that we want to get feedback on before they are added without us having to make backwards compatible changes.

I understand the intention, but in doing so it's created a config issue, and a difficult user experience.

I don't believe there is a config issue. The way it works today is fine and has been since it was introduced. We're simply introducing an alternative to .env files so that conflicts do not occur. You still haven't clearly articulated what you think the problem with this approach is.

A plugin mechanism might be better, and open up Task to external developers.

Plugins are complex to develop in a cross-platform way. Also, plugins tend to "plug in" to a specific part of a system. Experiments need to be able to modify any aspect of the code. Plugins are not a suitable solution here.

Does Task really need a config? IMO nothing is being achieved with the current mechanism. If I design a task which uses an "experimental" feature, then I want to use that feature. Having some kind of enabling mechanism is just complicating the use of Task. Nothing else.

Experiments aren't always a case of "use it or don't". They are allowed to (and usually do) change behaviour. This means two users with the same task can see different behaviour if they have different experiment flags enabled. It's not complicating anything; it is necessary to stop us from breaking users' projects.

In any case, I suggest following the traditional patterns for config files (git does config this way). Start with something general (taskconfig) and then consider where to put it. Not suggesting to implement it, just as a way of thinking about the problem.

Yes, if we have a proper settings config, then maybe one day it will be hierarchical like you suggested, but this is a long way out of scope for this issue/PR. We are not implementing settings here. We are implementing an alternative way of enabling experiments which may use the same file as a potential future settings file.

Correct. rc is not a great choice.

As discussed in this thread rc is perfectly acceptable as a "runtime configuration" file. Nothing wrong with it.

Member

@pd93 pd93 left a comment

I've updated the code based on what we discussed. Currently, the parsing is done in experiment, but we'll move it to a more global scope if we decide to use this file for other settings.

Looks good! I left a couple of minor comments

I've also created a JSON Schema hosted directly on our website.

Once the PR is merged we should submit this to https://github.com/SchemaStore/schemastore

Reference to the main Taskfile schema (It just refers to the one we host on https://taskfile.dev) and the catalogue entry

For now, I did not add a version, but that can be challenged.

I personally lean towards adding it now so that we don't have to maintain support for no version when we inevitably decide to add it later. However, let's get everyone aligned on this before you change anything.

}

type ExperimentConfigFile struct {
Experiments map[string]string `yaml:"experiments"`
Member

Question: Currently the only acceptable values for experiments are integers (1, 2). However, technically the value is saved as a string (mostly because env vars are always strings). If we think that we will never use anything other than integers, then maybe the values in the config file should be ints:

Suggested change
- Experiments map[string]string `yaml:"experiments"`
+ Experiments map[string]int `yaml:"experiments"`

Open to ideas about why we'd want to support strings though.

Member Author

@pd93 Good question!

Since experiment values are typed as strings (because environment variables are strings), I considered two options:

  1. Define Experiments as map[string]string (with the yaml:"experiments" tag), allowing the parser to handle the conversion. This approach lets me keep the existing code.
  2. Define Experiments as map[string]int and modify the code to convert all environment variables to integers.

I chose the first one to keep the existing code, but you may be right: we could / should define that experiment values must be ints only.

Member Author

I've converted everything to int :)
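
For the second option discussed above, converting the string values that come from environment variables into integers could look roughly like the sketch below. The parseExperimentValue function is a hypothetical name, and treating an empty value as 0 (disabled) is an assumption made for the example rather than Task's documented behaviour.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseExperimentValue converts the raw string value of an experiment
// (as read from an environment variable) into an int.
// An empty value is treated as 0, meaning "disabled" in this sketch.
func parseExperimentValue(name, raw string) (int, error) {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return 0, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("experiment %s: value %q is not an integer", name, raw)
	}
	return n, nil
}

func main() {
	// Read the variable from the real environment; it may well be unset.
	value, err := parseExperimentValue("MAP_VARIABLES", os.Getenv("TASK_X_MAP_VARIABLES"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("MAP_VARIABLES =", value)
}
```

With a conversion like this at the boundary, the rest of the code can work with ints regardless of whether a value came from the environment or from the config file.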

@trulede

trulede commented Jan 2, 2025

If you put an "experiment" into the Task executable, it's not an experiment, it's a feature.

I don't understand what the alternative to having experiments in the executable would be. Where else would they go? Experiments are experimental features. They are things that we want to get feedback on before they are added without us having to make backwards compatible changes.

I understand the intention, but in doing so it's created a config issue, and a difficult user experience.

I don't believe there is a config issue. The way it works today is fine and has been since it was introduced. We're simply introducing an alternative to .env files so that conflicts do not occur. You still haven't clearly articulated what you think the problem with this approach is.

The problem with .env as a file name is already clear I would think.

It works, yes, but operationally the use of experiments is a nuisance. I understand the intention, but here is how it goes in a development cycle:

1/ Decide to use a remote task file. Actually, we use versioned Taskfiles (git tags) and associated containers, from remote repos - so this is a very powerful usage pattern.
2/ Implement that the easiest way, which currently is by using the remote taskfile feature.
3/ Overcome the operational complexity of Task - all related to the "experimental" behaviour. We can do this several ways, but for now have a shell script which automates Task (yeah, really ...) and sets environment variables.
4/ Document how to overcome the operational complexity of Task, for other developers and users.
5/ Hope that people only use the shell script, because no one ever reads the docs.
6/ Accept that the behaviour of remote taskfile feature will change in the future, and we will have to adapt to that.

Point 6 is important. We accept the behaviour will change, or disappear.
Point 3 is illustrative as to how we deal with Task operational difficulties. Note, the Taskfile is not in the root of the repo, and we avoided the complexity of a separate config file, or even thinking about it! It's just a mess to even consider - Task can be frustrating like that.

Now you get the feedback, I guess. Feature is great, operation is very un-task-like.

Correct. rc is not a great choice.

As discussed in this thread rc is perfectly acceptable as a "runtime configuration" file. Nothing wrong with it.

It's not a great choice. 'rc' historically stands for "run commands" which is semantically different to configuration.

Also, are you sure about versioning and a schema for a config file? Take a look at how other config files work; git or docker; that is not normally how config files are implemented. Of course, nothing wrong with it either.

It's much easier if switches for behaviour are represented in some kind of normal way. Environment variables, at the root of the taskfile schema, would be ideal. Simple, versioned schema already exists, can simply parse the taskfile early in the process and extract the necessary content. No practical difference to the current .env file parsing, really.

Only seems to be adding complexity to me. But as I explained above, I've avoided the config file, so I have no skin in the game.

@vmaerten
Member Author

vmaerten commented Jan 4, 2025

Once the PR is merged we should submit this to SchemaStore/schemastore

Reference to the main Taskfile schema (It just refers to the one we host on taskfile.dev) and the catalogue entry

Noted I'll do it :)

I personally lean towards adding it now so that we don't have to maintain support for no version when we inevitably decide to add it later. However, let's get everyone aligned on this before you change anything.

Actually both are fine to me, but yeah let's wait a bit to get everyone aligned

Development

Successfully merging this pull request may close these issues.

Support experiment enablement in Taskfile schema
4 participants