Add CITATION.cff #891

Draft
wants to merge 16 commits into
base: main
from

Conversation

ShadowMitia
Contributor

@ShadowMitia ShadowMitia commented Oct 20, 2021

To help visualise #848 I've created a branch with a sample file. The idea is to keep modifying it (by me and maintainers directly) until we're satisfied. But because of the visual aspect, we need a place to see what it looks like/generates.
All other contributors can open up PRs on my branch to suggest changes.

You can just view the result on https://github.com/ShadowMitia/algorithm-archive/tree/citation-file

@ShadowMitia
Contributor Author

The file documentation is here for those interested https://github.com/citation-file-format/citation-file-format/blob/1.1.0/README.md#identifier-objects

@Amaras
Member

Amaras commented Oct 20, 2021

I think it looks good.
Of course, every maintainer (definitely) and contributor (maybe) that wants to be cited should have a chance to have their name in the file. (Yes, I want my name in there).
I think there should be another message, but that's still editable, of course

@ShadowMitia
Contributor Author

I'm looking into the format for that. From what I understand there are two places we can have maintainers/contributors:

  • as authors obviously, but it might break software or sites? (Especially when it reaches hundreds of contributors).
  • in the references section, we can sort of have a contributor list? But then that list doesn't show up on the github stuff and probs not in the citation

@ShadowMitia
Contributor Author

The references section has a lot of useful things for us, I think. But I have NO IDEA how they show up in practice. If I understand it correctly, we could reference a lot of things associated with the AAA, including papers and contributors.

@Amaras
Member

Amaras commented Oct 20, 2021

Maintainers and contributors who have a substantial amount of code in there should probably be listed as authors if they want. Let's say at least 10 PRs merged for a reasonable-ish cut-off point. I think maintainers who have less than those 10 PRs should still be included as authors if they have over 5 PRs merged.
However, that is still an estimate that I don't know the scope of.

@Amaras
Member

Amaras commented Oct 20, 2021

After checking, the references section is basically for dependencies, so it's not really relevant for us, at least the way I understand it.
By the way, I should probably edit the CONTRIBUTORS.md file if I want my name in your file 😅

@ShadowMitia
Contributor Author

Well, technically the website code is a dependency for the actual website 😁
It's probably not the best, but the other solution it seems is a huge list of authors.
I can try it just to see what it does x)

@ShadowMitia
Contributor Author

Depending on how we finalise this file, it could replace CONTRIBUTORS.md completely.

This format has a validation tool with it. I would suggest adding the corresponding github action for it, or just ask people to run the validation before submitting a new version.
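
As a rough illustration of the kind of check such a validation step performs, here is a minimal Python sketch. It assumes the file has already been parsed into a dictionary (for example by a YAML library) and that the required top-level keys are `cff-version`, `message`, `title`, and `authors`, which is an assumption based on the 1.2.0 schema and may differ for other versions. The function name `missing_required_keys` and the sample values are hypothetical, and this is not the official validator.

```python
from __future__ import annotations

# Assumed required top-level keys (CFF 1.2.0); adjust for other schema versions.
REQUIRED_KEYS = ("cff-version", "message", "title", "authors")


def missing_required_keys(cff: dict) -> list[str]:
    """Return the assumed-required keys that are absent or empty in `cff`."""
    return [key for key in REQUIRED_KEYS if not cff.get(key)]


if __name__ == "__main__":
    # Placeholder content, not the real CITATION.cff from this PR.
    sample = {
        "cff-version": "1.2.0",
        "message": "If you use this work, please cite it as below.",
        "title": "Placeholder title",
        "authors": [],  # left empty on purpose, so the check reports it
    }
    print(missing_required_keys(sample))
```

A check like this only catches missing keys; the real validation tool also checks value formats, so it should stay the source of truth in CI.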

@ShadowMitia
Contributor Author

One thing I'm not sure about, it considers that this repo is hosting something of type software (the other possible value being dataset). I think it's fine as is, but maybe it's not the correct way of describing this content?

If it is not the correct way, we can use the references section to more properly describe everything. But I'm not familiar enough with academic citations to understand all the subtleties 😁 😅

@ShadowMitia ShadowMitia marked this pull request as ready for review October 21, 2021 21:11
@ShadowMitia
Contributor Author

ShadowMitia commented Oct 21, 2021

I think we're close to a first version, so I'm undrafting it. I'll clean up all the commits into one once all the reviewing is done (and don't hesitate to remind me 😁)

Member

@leios leios left a comment

Overall, I am a big fan of this! I am a bit wary of putting everyone in the citation file without their explicit permission, though. I feel we might need to tag each one and ask for their permission.

If they do not accept, they should stay in the Contributors.md file.

As a note: I think moving forward, we should use the citation.cff exclusively and move away from the contributors.md file

CITATION.cff Outdated
    alias: Leios
    orcid: "https://orcid.org/0000-0002-3243-8918"
  - family-names: Mazzuca
    given-names: Nicole
Member

I know Nicole said they do not want to be officially part of this project, due to some philosophical differences with how we write idiomatic C++ code

Contributor Author

If they provided code or help for other things they still should be here 😁 (If they want)

CITATION.cff Outdated
Comment on lines 63 to 66
  - family-names: Boyles
    given-names: William
  - family-names: Weinstein
    given-names: Max
Member

How do we organize all of these names?

Contributor Author

It's definitely going to be a big mess 😅

CITATION.cff Outdated
  - "open research"
  - "data structures"
  - "collection of data structures"
type: software
Member

Are there other types? I don't know if software is right, but it's also not wrong

Contributor Author

So this is where the file format gets a bit weird. Apparently it assumes the repo is either for "software" or "dataset". These are the only allowed types at that level of the file.

In the "references" you can have way more types, but it's just things that reference this file I guess? So I've been assuming that we're talking about the code for the website (which you can think of as a software you can run on your machine as well), but I agree it's not the best thing.

CITATION.cff Outdated
  - "data structures"
  - "collection of data structures"
type: software
commit: 16fd2180041821f8ee53ece14c535e5b25de34fe
Member

This essentially counts as a release for citations, right? That is to say that when people cite this work, they will be citing this, specific commit?

Contributor Author

Yeah, I was just trying something out. I thought that the file required a "version" to be attached, but it doesn't. But I was thinking maybe it made sense to be able to cite the website when it was in a specific state?

Member

I would think we should omit this for now

Member

I think it might be a good idea to tie this to a certain release (Maybe V 0.2021.1)? That way if people do cite this work, they can easily find the correct version. I also think zenodo links will usually be related to a specific release anyway, right?

Member

> I also think zenodo links will usually be related to a specific release anyway, right?

Correct.

@ShadowMitia
Contributor Author

I never thought about asking permissions 😅
I put everyone there, mainly as a good source to try out the limits of the file.
We should contact everyone, but what do we do if someone doesn't answer for example?

@ShadowMitia
Contributor Author

If we have to ask for approval maybe I should empty all the names and everyone adds themselves again?

@Amaras
Member

Amaras commented Oct 23, 2021

> If we have to ask for approval maybe I should empty all the names and everyone adds themselves again?

That is probably the best idea, even if that can become quite messy indeed.

@ShadowMitia
Contributor Author

ShadowMitia commented Oct 23, 2021

I'm looking at the citations.cff repo. I'll probably raise an issue about citing a resource or website instead of just software and dataset, and see what they recommend.

I'm listing a couple of issues in the meantime that might be interesting:

@leios
Member

leios commented Oct 24, 2021

To the discussion about what to do with people who do not want to be cited, here's a few solutions:

  1. Only put people in the Citation.cff file if they have ORCID iDs. This trims down the author list and makes sure that people who need citations get the citations. It also ensures that people using the AAA for academic work can contact other academics. Everyone else can be in a more readily available contributor's chapter in the AAA, maybe something like this: https://github.com/all-contributors/all-contributors
  2. Have different tiers of contributions. The Citation.cff file could be used for people who have contributed a "significant amount" to the AAA. It's a bit hard to figure out what significant means here, but it's an idea. We could use the dev team, maybe? Again, we need a better contributor's file here as well.
  3. Just have everyone who opts in to be in the citation file

In all the cases, the citation file should be opt-in, in my opinion. Also: for 1 and 2, we need a way to cite "everyone else" and link to the right page, so maybe an ORCID for the AAA organization?
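
Option 1 combined with the opt-in rule can be sketched in a few lines of Python. This is only an illustration: the `Contributor` class, its fields, and the `citable_authors` function are hypothetical names, not anything that exists in the repository.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Contributor:
    name: str
    opted_in: bool = False      # explicit consent to appear in the citation file
    orcid: str | None = None    # ORCID iD URL, if the person has one


def citable_authors(people: list[Contributor], require_orcid: bool = True) -> list[Contributor]:
    """Keep only people who opted in and, optionally, have an ORCID iD."""
    return [
        p for p in people
        if p.opted_in and (p.orcid is not None or not require_orcid)
    ]
```

Calling it with `require_orcid=False` corresponds to option 3 (everyone who opts in); option 2 would need an extra, still undefined, notion of "significant" contribution.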

@ntindle
Member

ntindle commented Oct 24, 2021

Heavy agree on limiting this to those who have an ORCID and having them add themselves. Those who care about it would need to do it in a PR if they aren’t a maintainer, and we would be able to approve based on past work. The ORCID limitation wouldn't really be a problem: I personally made one, as did @Amaras, to make sure we would appear.

@ShadowMitia
Contributor Author

I've heard Zenodo mentioned several times, and this looks like it could be relevant: https://guides.github.com/activities/citable-code/

@leios
Member

leios commented Oct 24, 2021

I was mentioning zenodo because it looks like this file creates a zenodo link. I am not sure if that link continually changes with each commit, though

@ShadowMitia
Contributor Author

I don't know about each commit but from citation-file:

> When you publish your software on Zenodo via the GitHub-Zenodo integration, they will use the metadata from your CITATION.cff file.

(Should've added that sorry 😅)

@leios
Member

leios commented Oct 24, 2021

Oh, I didn't know it went the other way as well. Ok, that's nice.

@ShadowMitia ShadowMitia marked this pull request as draft October 30, 2021 16:10
@ShadowMitia
Contributor Author

I got a response. The CITATION.cff file is only designed for citing software and datasets. Maybe we could use references inside to point to all the resources of the AAA (site, papers, etc).

There might be something to do with preferred-citation, which I'm looking into.

The other solution is to have a bibtex directly in the project. It won't be parsed by github, but it will still show up on the right for quick access. But won't autogenerate anything.

@ntindle
Member

ntindle commented Nov 4, 2021

I would say we are software and we can start versioning on every commit to master?

@ShadowMitia
Contributor Author

ShadowMitia commented Nov 4, 2021

We are software, but the thing we want to cite is the content in that software. It works as a workaround, but it's still not 100% what the citation file is designed for.

@ShadowMitia
Contributor Author

I've added a small CITATION.bib for reference. This would get picked up by github but would only point to that file.

@ntindle
Member

ntindle commented Nov 5, 2021

Right, it’s not what it’s designed for, but I figure it’s better than nothing? I would prefer we use the cff a bit loosely, as they are still releasing new versions, rather than a bib that’s not very flexible.

@ShadowMitia
Contributor Author

Oh wait hang on, there is a section on github on how to use CITATION.cff for other things : https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files#citing-something-other-than-software....

So actually we should be good to go on CITATION.cff. It probably won't show up perfectly inside GitHub, but then we can get Zotero and Zenodo integrations, and that should work fine.
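
To make the override idea concrete, here is a small Python sketch of how a tool might pick what to cite: if the file carries a `preferred-citation` entry, that entry is used, otherwise the top-level metadata is. The dictionary mirrors the YAML keys of a CITATION.cff file; every value is a placeholder, `type: book` is just one reference type assumed to be allowed by the schema, and `citation_target` is a hypothetical helper, not part of any real tool.

```python
def citation_target(cff: dict) -> dict:
    """Return the entry to cite: the override if present, else the file's own metadata."""
    return cff.get("preferred-citation", cff)


# Placeholder metadata, not the real CITATION.cff from this PR.
cff = {
    "cff-version": "1.2.0",
    "message": "If you use this work, please cite it as below.",
    "title": "Placeholder repository title",
    "type": "software",
    "authors": [{"name": "Placeholder Author"}],
    "preferred-citation": {
        "type": "book",  # assumed reference type for the written content
        "title": "Placeholder content title",
        "authors": [{"name": "Placeholder Author"}],
    },
}

if __name__ == "__main__":
    print(citation_target(cff)["title"])
```

With this layout the repository can stay typed as software while the citation itself points at the content.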

So the only real remaining question is : what do we do about authors/contributors for the citation?
Do we only stick to authors? (And who counts as one)
Do we need to have all the contributors in the file, or do we only cite them as "Contributors of AAA" or something?
If we want them in the citation, until we get a contributor tag, do we put them as authors or do we put them in a reference section somewhere?

Let me know if I'm missing anything else.

@ntindle
Member

ntindle commented Nov 15, 2021

I will answer the above based on my opinions but I think the ultimate answer falls to @leios

> Do we only stick to authors? (And who counts as one)

Add all existing people from the contributors.md to the list. Ping them all in one big notification on this repo and give them a week to opt out.

> Do we only stick to authors? (And who counts as one)

Until we define what an author is, I feel it is unfair to exclude people.

> Do we need to have all the contributors in the file, or do we only cite them as "Contributors of AAA" or something?

This replaces the contributors.md in my mind.

> If we want them in the citation, until we get a contributor tag, do we put them as authors or do we put them in a reference section somewhere?

Add them as authors for now, ranked by contribution count. Possibly with maintainers first (selfishly, this is because I likely wouldn't be cited by name and would end up in the "et al."; if this isn't a priority, I don't mind being near the bottom).

@ntindle ntindle requested review from leios, berquist and Amaras November 15, 2021 23:27
@Amaras
Member

Amaras commented Nov 16, 2021

> > Do we only stick to authors? (And who counts as one)
>
> Add all existing people from the contributors.md to the list. Ping them all in one big notification on this repo and give them a week to opt out.

I don't feel right making this list opt-out, since I assume consent is not given by default (it *could* be opt-out for maintainers and "authors" once we define that term, though).

> > Do we only stick to authors? (And who counts as one)
>
> Until we define what an author is, I feel it is unfair to exclude people.

It feels unfair to exclude people, but it also feels unfair that we "forcefully" include inactive people who don't read their GH notifications.

> > Do we need to have all the contributors in the file, or do we only cite them as "Contributors of AAA" or something?
>
> This replaces the contributors.md in my mind.

Same for me.

> > If we want them in the citation, until we get a contributor tag, do we put them as authors or do we put them in a reference section somewhere?
>
> Add them as authors for now, ranked by contribution count. Possibly with maintainers first (selfishly, this is because I likely wouldn't be cited by name and would end up in the "et al."; if this isn't a priority, I don't mind being near the bottom).

+1 on that: Leios first, then optionally the other maintainers ranked by contributions, then authors ranked by contributions. With a possibility to move down on demand, of course.

@berquist
Member

> > > Do we only stick to authors? (And who counts as one)
> >
> > Add all existing people from the contributors.md to the list. Ping them all in one big notification on this repo and give them a week to opt out.
>
> I don't feel right making this list opt-out, since I assume consent is not given by default (it *could* be opt-out for maintainers and "authors" once we define that term, though).

I am in camp opt-out.

> > > Do we only stick to authors? (And who counts as one)
> >
> > Until we define what an author is, I feel it is unfair to exclude people.
>
> It feels unfair to exclude people, but it also feels unfair that we "forcefully" include inactive people who don't read their GH notifications.

> > > Do we need to have all the contributors in the file, or do we only cite them as "Contributors of AAA" or something?
> >
> > This replaces the contributors.md in my mind.
>
> Same for me.

Authors are those in CONTRIBUTORS.md, and this replaces that file.

> > > If we want them in the citation, until we get a contributor tag, do we put them as authors or do we put them in a reference section somewhere?
> >
> > Add them as authors for now, ranked by contribution count. Possibly with maintainers first (selfishly, this is because I likely wouldn't be cited by name and would end up in the "et al."; if this isn't a priority, I don't mind being near the bottom).
>
> +1 on that: Leios first, then optionally the other maintainers ranked by contributions, then authors ranked by contributions. With a possibility to move down on demand, of course.

I abstain from this point, other than to say that one model you can consider, which other large (chemistry) software projects adopt, is to have a largely alphabetical ranking, then have the first and/or last author(s) be those who originated the project. Example, Example

@ntindle
Member

ntindle commented Nov 16, 2021

> > > > If we want them in the citation, until we get a contributor tag, do we put them as authors or do we put them in a reference section somewhere?
> > >
> > > Add them as authors for now, ranked by contribution count. Possibly with maintainers first (selfishly, this is because I likely wouldn't be cited by name and would end up in the "et al."; if this isn't a priority, I don't mind being near the bottom).
> >
> > +1 on that: Leios first, then optionally the other maintainers ranked by contributions, then authors ranked by contributions. With a possibility to move down on demand, of course.
>
> I abstain from this point, other than to say that one model you can consider, which other large (chemistry) software projects adopt, is to have a largely alphabetical ranking, then have the first and/or last author(s) be those who originated the project. Example, Example

Alphabetical other than X (X to be determined, if not just @leios) is very reasonable to me.

@ntindle ntindle requested a review from jiegillet November 16, 2021 04:29
Member

@jiegillet jiegillet left a comment

I didn't follow all the discussions extremely closely, but it looks like something worth doing.

@leios
Member

leios commented Dec 4, 2021

For authorship order, we are currently either considering Alphabetical or Activity-based right?

The other option is to have me first, then the algorithm archivists (maintainers) in alphabetical order, followed by everyone from the CONTRIBUTORS.md file in alphabetical order? I guess maintainers are opt-out, while contributors are opt-in?

For this, I guess we just ping all the maintainers here and ask if they are ok with this. If not, we can take them off the list. We should maybe give them a week to respond.

Afterwards, we ping all the contributors. We might want to give everyone 2 weeks to respond to this?
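
The ordering proposed above (project lead first, then maintainers alphabetically, then everyone else alphabetically) could be generated rather than maintained by hand. The following Python sketch shows one way to do it; `order_authors` is a hypothetical helper, it sorts by whatever name string it is given (handles in the example, though family names may be preferable), and the example role assignments are illustrative only.

```python
from __future__ import annotations


def order_authors(lead: str, maintainers: list[str], contributors: list[str]) -> list[str]:
    """Lead first, then maintainers A-Z, then contributors A-Z, skipping duplicates."""
    ordered = [lead]
    for group in (maintainers, contributors):
        # casefold makes the sort case-insensitive, so "Amaras" sorts before "berquist".
        for name in sorted(group, key=str.casefold):
            if name not in ordered:
                ordered.append(name)
    return ordered


if __name__ == "__main__":
    print(order_authors(
        "leios",
        maintainers=["ntindle", "Amaras", "berquist"],
        contributors=["ShadowMitia", "Amaras"],
    ))
```

Anyone who opts out would simply be left out of the input lists before calling the function, and moving someone down on request would need a manual adjustment afterwards.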

6 participants