Support returning logprobs in Predictor #1895

Merged: 4 commits, Dec 10, 2024

@veronicalyu320 veronicalyu320 commented Dec 6, 2024

This PR lets Predict return the log probability (logprob) of each generated token as part of the Prediction.

Example usage

If you want logprobs:
Set logprobs=True on the LM and, optionally, top_logprobs (an integer between 0 and 20; see the OpenAI documentation for details):

import dspy
from dspy import Predict

predict_instance = Predict("question -> answer")
lm = dspy.LM("gpt-4o-mini", logprobs=True)
dspy.configure(lm=lm)
result = predict_instance(question="Where is the Eiffel Tower located?")
print(result)

Output:

Prediction(
    answer='The Eiffel Tower is located in Paris, France, on the Champ de Mars near the Seine River.',
    logprobs={'content': [{'token': '[[', 'bytes': [91, 91], 'logprob': -2.1008714e-06, 'top_logprobs': []}, {'token': ' ##', 'bytes': [32, 35, 35], 'logprob': -1.9361265e-07, 'top_logprobs': []}, {'token': ' answer', 'bytes': [32, 97, 110, 115, 119, 101, 114], 'logprob': 0.0, 'top_logprobs': []}, {'token': ' ##', 'bytes': [32, 35, 35], 'logprob': -8.180258e-06, 'top_logprobs': []}, {'token': ' ]]\n', 'bytes': [32, 93, 93, 10], 'logprob': -0.00010926496, 'top_logprobs': []}, {'token': 'The', 'bytes': [84, 104, 101], 'logprob': 0.0, 'top_logprobs': []}, {'token': ' Eiffel', 'bytes': [32, 69, 105, 102, 102, 101, 108], 'logprob': 0.0, 'top_logprobs': []}, ...], 'refusal': None}
)
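
As a rough illustration of what can be done with the returned field, the sketch below sums per-token logprobs into a sequence log probability and converts each logprob into a probability. It assumes `result.logprobs` is a plain dict shaped like the output above; the helper names `sequence_logprob` and `token_probabilities` are hypothetical and not part of dspy.

```python
import math

# Sample shaped like the `logprobs` field printed above, truncated to three tokens.
# In real use this would be `result.logprobs` (assumed to be a plain dict).
sample_logprobs = {
    "content": [
        {"token": " ]]\n", "bytes": [32, 93, 93, 10], "logprob": -0.00010926496, "top_logprobs": []},
        {"token": "The", "bytes": [84, 104, 101], "logprob": 0.0, "top_logprobs": []},
        {"token": " Eiffel", "bytes": [32, 69, 105, 102, 102, 101, 108], "logprob": 0.0, "top_logprobs": []},
    ],
    "refusal": None,
}


def sequence_logprob(logprobs):
    """Sum the per-token log probabilities to get the log probability of the whole sequence."""
    return sum(entry["logprob"] for entry in logprobs["content"])


def token_probabilities(logprobs):
    """Pair each token with its probability, computed as exp(logprob)."""
    return [(entry["token"], math.exp(entry["logprob"])) for entry in logprobs["content"]]


total = sequence_logprob(sample_logprobs)
probs = token_probabilities(sample_logprobs)
```

A logprob of 0.0 corresponds to a probability of 1.0, so values very close to zero, as in the output above, indicate tokens the model was nearly certain about.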

If you don't want logprobs:
Usage and behavior are the same as before:

... # everything else stays the same
lm = dspy.LM("gpt-4o-mini")
...

Output:

Prediction(
    answer='The Eiffel Tower is located in Paris, France, on the Champ de Mars near the Seine River.'
) 

Caveat

If an LM (e.g. o1-mini) doesn't accept logprobs as an argument and you still set dspy.LM(..., logprobs=True), you will get an error like: openai does not support parameters: {'logprobs': True}, for model=o1-mini. Check the corresponding LM documentation before setting this parameter.
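
One way to avoid this error is to build the LM keyword arguments conditionally. The sketch below is only an illustration: `MODELS_WITHOUT_LOGPROBS` and `lm_kwargs` are hypothetical names, the deny-list holds just the one model mentioned in the caveat, and it assumes `top_logprobs` is passed to `dspy.LM` alongside `logprobs`.

```python
# Hypothetical deny-list; fill it in from each provider's documentation.
MODELS_WITHOUT_LOGPROBS = {"o1-mini"}


def lm_kwargs(model, want_logprobs=False, top_logprobs=None):
    """Build keyword arguments for dspy.LM, omitting logprobs for models known to reject it."""
    kwargs = {}
    if want_logprobs and model not in MODELS_WITHOUT_LOGPROBS:
        kwargs["logprobs"] = True
        if top_logprobs is not None:
            # The PR description gives 0 to 20 as the allowed range.
            if not 0 <= top_logprobs <= 20:
                raise ValueError("top_logprobs must be between 0 and 20")
            kwargs["top_logprobs"] = top_logprobs
    return kwargs


# Intended use (not executed here, since it needs dspy and an API key):
#   lm = dspy.LM(model, **lm_kwargs(model, want_logprobs=True))
```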

@veronicalyu320 veronicalyu320 marked this pull request as ready for review December 6, 2024 01:19
@veronicalyu320 veronicalyu320 changed the title [WIP] Support returning logprobs in Predictor Support returning logprobs in Predictor Dec 6, 2024

else:
    outputs = [
        {
            "text": c.message.content if hasattr(c, "message") else c["text"],

Collaborator commented:

This appears to change the return type of __call__ even if the user doesn't request logprobs? We normally return a list of strings. This seems to return a list of dicts even if logprobs=False.

Contributor Author replied:

SG, updated.
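
The behavior agreed on in this thread (plain strings by default, dicts only when logprobs are requested) could look roughly like the sketch below. This is not the merged lm.py code: `extract_outputs` and its `logprobs_requested` flag are hypothetical, and the shape of each choice object is assumed from the diff excerpt above.

```python
def extract_outputs(choices, logprobs_requested):
    """Return a list of strings, or a list of dicts when logprobs were requested."""
    outputs = []
    for c in choices:
        # Chat-style choices expose .message.content; text-style choices are dict-like.
        text = c.message.content if hasattr(c, "message") else c["text"]
        if logprobs_requested:
            logprobs = c.logprobs if hasattr(c, "logprobs") else c["logprobs"]
            outputs.append({"text": text, "logprobs": logprobs})
        else:
            # Default path: keep the original return type (list of str).
            outputs.append(text)
    return outputs
```

Keeping the string-only path as the default means existing callers that index into the result and treat each item as a str continue to work unchanged.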

@okhat okhat removed their request for review December 10, 2024 15:54
@okhat okhat merged commit e690743 into stanfordnlp:main Dec 10, 2024
4 checks passed
isaacbmiller pushed a commit that referenced this pull request Dec 11, 2024
* support returning logprobs in Predictor

* allow output to be either str or dict

* return outputs as a list of strings if user doesn't set logprobs

* Update lm.py

---------

Co-authored-by: Omar Khattab <[email protected]>