Hi there,
I know that DSPy uses LiteLLM, and that LangChain has an in-memory rate limiter you can pass into the chat model.
However, I can't figure out what I should do to restrict usage to, say, N requests/second or T tokens/second.
Any example would be great.
Cheers.
DSPy doesn't support LiteLLM Router integrations yet, but the instructions in the documentation mentioned above, together with an existing PR, might help illustrate how this would work.
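For reference, here is a minimal sketch of how per-deployment rate limits are declared on a LiteLLM Router; the model names, keys, and limit values are placeholders, and rpm/tpm are per-minute limits that LiteLLM's usage-based routing strategies take into account. DSPy can't consume a Router object directly yet, but this shows the shape such an integration would take:

from litellm import Router

# Placeholder deployment: rpm/tpm cap it at 60 requests and
# 100k tokens per minute.
router = Router(
    model_list=[
        {
            "model_name": "gpt-4o-mini",  # alias that callers reference
            "litellm_params": {
                "model": "openai/gpt-4o-mini",
                "api_key": "sk-...",      # placeholder provider key
                "rpm": 60,
                "tpm": 100_000,
            },
        },
    ],
    routing_strategy="usage-based-routing",  # strategy that uses tpm/rpm
)

response = router.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hello"}],
)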
Hi @robomotic,
You can start the LiteLLM proxy server using this; it will run on port 4000. Use the UI to add models to the proxy server.
Then, in your DSPy program, you can simply do:
import dspy

# Point DSPy at the proxy's OpenAI-compatible endpoint.
lm = dspy.LM(
    model="openai/LITELLM PROXY PUBLIC MODEL NAME",
    api_base="http://localhost:4000/",
    api_key="sk-YOUR LITELLM MASTER KEY",
    cache=False,
)
dspy.configure(lm=lm)
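Note that the rate limits themselves live on the proxy, not in DSPy, and LiteLLM expresses them per minute rather than per second. One way to enforce them is per virtual key via the proxy's /key/generate endpoint; a minimal sketch, with placeholder limit values:

import requests

# Ask the proxy to mint a virtual key capped at 10 requests/min
# and 20k tokens/min.
resp = requests.post(
    "http://localhost:4000/key/generate",
    headers={"Authorization": "Bearer sk-YOUR LITELLM MASTER KEY"},
    json={"rpm_limit": 10, "tpm_limit": 20_000},
)
limited_key = resp.json()["key"]

# Use limited_key as the api_key in dspy.LM above; the proxy rejects
# requests that exceed the key's limits.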