Retries via LiteLLM RetryPolicy #1866
Conversation
Signed-off-by: dbczumar <[email protected]>
cc @okhat - feel free to test this out in the meantime while I work with @krrishdholakia on BerriAI/litellm#6916 |
```diff
@@ -36,7 +37,7 @@ def __init__(
     max_tokens: int = 1000,
     cache: bool = True,
     callbacks: Optional[List[BaseCallback]] = None,
-    num_retries: int = 3,
+    num_retries: int = 8,
```
Empirically, this provides roughly one minute of retries, which is typically necessary to overcome rate-limit errors (providers like Azure OpenAI and Databricks enforce RPM rate limits).
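To see where "roughly one minute" comes from, here is a back-of-the-envelope sketch. The exact schedule depends on LiteLLM's backoff implementation; this assumes a hypothetical exponential backoff with a 0.25 s base delay that doubles on each attempt:

```python
def total_backoff(num_retries: int, base: float = 0.25) -> float:
    """Sum of exponential backoff delays across all retries.

    Assumes delay doubles each attempt: base, 2*base, 4*base, ...
    (Illustrative only; LiteLLM's real schedule may differ.)
    """
    return sum(base * (2 ** i) for i in range(num_retries))

# 8 retries: 0.25 + 0.5 + 1 + 2 + 4 + 8 + 16 + 32 = 63.75 seconds
print(total_backoff(8))
```

Under these assumed parameters, 8 retries spend about 64 seconds waiting, which is enough to ride out a one-minute RPM rate-limit window.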
```diff
@@ -102,14 +103,13 @@ def __call__(self, prompt=None, messages=None, **kwargs):
         outputs = [
             {
                 "text": c.message.content if hasattr(c, "message") else c["text"],
-                "logprobs": c.logprobs if hasattr(c, "logprobs") else c["logprobs"]
+                "logprobs": c.logprobs if hasattr(c, "logprobs") else c["logprobs"],
```
Just the linter being itself...
```diff
@@ -7,6 +7,7 @@ model_list:
     model: "dspy-test-provider/dspy-test-model"

 litellm_settings:
+  num_retries: 0
```
Disable retries on the server side to ensure that server retries don't stack atop client retries, which can cause test failures due to a mismatch between the expected and actual number of retries.
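The stacking problem is multiplicative, not additive: each client-side attempt can itself be retried by the proxy. A minimal sketch of the worst-case arithmetic (illustrative helper, not part of the codebase):

```python
def worst_case_requests(client_retries: int, server_retries: int) -> int:
    """Worst-case upstream requests for a single logical call.

    Each of the (1 + client_retries) client attempts can trigger up to
    (1 + server_retries) server-side attempts, so the counts multiply.
    """
    return (client_retries + 1) * (server_retries + 1)

# With num_retries=8 on the client and 3 on the proxy, one call can
# produce 36 upstream requests; setting server retries to 0 keeps it at 9.
print(worst_case_requests(8, 3))
print(worst_case_requests(8, 0))
```

This is why the test proxy config pins `num_retries: 0` server-side: with it, the observed request count matches exactly what the client-side policy dictates.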
```diff
@@ -38,7 +38,7 @@ dependencies = [
     "pydantic~=2.0",
     "jinja2",
     "magicattr~=0.1.6",
-    "litellm",
+    "litellm==1.55.3",
```
LiteLLM version 1.55.3 is the only version that correctly supports passing `retry_policy` to `completion()` while respecting the number of retries specified by the policy.
```
cloudpickle
jinja2
```
Linter reordering
This depends on BerriAI/litellm#6916 being merged and released (may take some coordination)