LangProbe Benchmark Updates #1593

Merged
merged 8 commits into from
Oct 7, 2024

@klopsahlong (Collaborator) commented Oct 7, 2024

  • Adds the following tasks to the LangProbe Benchmark: HotpotQA Conditional, Iris, Iris-Typo, HoVer, and Heart Disease
  • Makes a few minor improvements to the testing setup, including:
    • minor updates to the README, which now provides instructions using dspy.LM
    • adding a max_output_tokens() function for each task so that the output token limit can be set for the task model on a task-by-task basis
    • improving logging
  • Also includes minor improvements to the hackercup_utils.py script, which now suppresses printouts when executing generated code
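
For readers who have not opened the diff, here is a rough sketch of the two mechanisms described above: a per-task `max_output_tokens()` hook and silent execution of generated code. It is not the code from this PR. The class names `Task` and `IrisTask`, the helper `run_quietly`, and the token values are all hypothetical and chosen only for illustration; the actual task classes and `hackercup_utils.py` may be structured differently.

```python
import contextlib
import io


class Task:
    """Hypothetical base class; the real LangProbe task classes may differ."""

    def max_output_tokens(self) -> int:
        # Assumed default budget for tasks that do not override this method.
        return 1024


class IrisTask(Task):
    """Hypothetical task with short, label-style answers."""

    def max_output_tokens(self) -> int:
        # Assumed value: a classification label needs far fewer tokens.
        return 256


def run_quietly(source: str) -> dict:
    """Execute generated code and discard anything it prints to stdout."""
    namespace: dict = {}
    with contextlib.redirect_stdout(io.StringIO()):
        exec(source, namespace)
    # Return the namespace so callers can still read the code's results.
    return namespace


budget = IrisTask().max_output_tokens()
result = run_quietly("print('noisy'); answer = 6 * 7")
```

In a setup like this, the value returned by `max_output_tokens()` could be passed as `max_tokens` when constructing the task model with `dspy.LM`. Redirecting stdout rather than editing the generated code keeps the code itself unchanged while hiding its output.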

@okhat okhat merged commit 6a00c85 into main Oct 7, 2024
4 checks passed