Add test coverage for caching against a LiteLLM test server #1769
Conversation
Signed-off-by: dbczumar <[email protected]>
@@ -149,6 +149,7 @@ pre-commit = "^3.7.0"
 ipykernel = "^6.29.4"
 semver = "^3.0.2"
 pillow = "^10.1.0"
+litellm = {version = "1.51.0", extras = ["proxy"]}
The [proxy] extra is required in order to run the LiteLLM server, so I'm adding it as a dev dependency here. This resulted in a fairly hefty lockfile update. @okhat let me know if there are concerns here - we can always add it elsewhere (e.g. requirements-dev.txt)
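For context, here is a minimal sketch of how a litellm_test_server fixture could launch the proxy that these tests talk to. The config file name, the port handling, and the request-log environment variable below are illustrative assumptions, not necessarily what this PR actually does:

# Sketch only: launches the LiteLLM proxy via the CLI provided by the
# litellm[proxy] extra. The config path and the env var used to tell the
# server where to write request logs are assumptions, not the PR's code.
import os
import socket
import subprocess
import time

import pytest


def _find_free_port():
    # Ask the OS for an unused TCP port to run the proxy on.
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


@pytest.fixture()
def litellm_test_server(tmp_path):
    port = _find_free_port()
    request_logs_path = str(tmp_path / "request_logs.jsonl")
    config_path = os.path.join(os.path.dirname(__file__), "litellm_server_config.yaml")
    process = subprocess.Popen(
        ["litellm", "--config", config_path, "--port", str(port)],
        env={**os.environ, "LITELLM_TEST_SERVER_LOG_FILE_PATH": request_logs_path},
    )
    try:
        # Crude startup wait; a real fixture would poll the port until it accepts connections.
        time.sleep(5)
        yield f"http://127.0.0.1:{port}", request_logs_path
    finally:
        process.terminate()
        process.wait()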
def test_lm_calls_are_cached_across_interpreter_sessions(litellm_test_server, temporary_populated_cache_dir):
    """
    Verifies that LM calls are cached across interpreter sessions. Pytest test cases effectively
    simulate separate interpreter sessions.
    """
    api_base, server_log_file_path = litellm_test_server

    lm1 = dspy.LM(
        model="openai/dspy-test-model",
        api_base=api_base,
        api_key="fakekey",
    )
    lm1("Example query")

    request_logs = read_litellm_test_server_request_logs(server_log_file_path)
    assert len(request_logs) == 0
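The read_litellm_test_server_request_logs helper isn't part of this hunk; a plausible sketch, assuming the test server appends one JSON record per proxied request to a JSONL file:

# Sketch, not the PR's implementation: parse the JSONL request log written by
# the test server, one JSON document per line.
import json


def read_litellm_test_server_request_logs(server_log_file_path):
    entries = []
    with open(server_log_file_path) as log_file:
        for line in log_file:
            line = line.strip()
            if line:
                entries.append(json.loads(line))
    return entries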
This test fails when #1760 is applied, so it would have caught the regression:
_________________________________________________________________________________________ test_lm_calls_are_cached_across_interpreter_sessions __________________________________________________________________________________________
litellm_test_server = ('http://127.0.0.1:50616', '/var/folders/8f/s8n7wq1j2j3440sfljwmmpr80000gp/T/tmpgh8wz779/request_logs.jsonl'), temporary_populated_cache_dir = '/var/folders/8f/s8n7wq1j2j3440sfljwmmpr80000gp/T/tmpitn2cu32'
def test_lm_calls_are_cached_across_interpreter_sessions(litellm_test_server, temporary_populated_cache_dir):
"""
Verifies that LM calls are cached across interpreter sessions. Pytest test cases effectively
simulate separate interpreter sessions.
"""
api_base, server_log_file_path = litellm_test_server
lm1 = dspy.LM(
model="openai/dspy-test-model",
api_base=api_base,
api_key="fakekey",
)
lm1("Example query")
request_logs = read_litellm_test_server_request_logs(server_log_file_path)
> assert len(request_logs) == 0
E AssertionError: assert 1 == 0
E + where 1 = len([[{'messages': [{'content': 'Example query', 'role': 'user'}], 'model': 'dspy-test-model'}]])
test_caching.py:90: AssertionError
--------------------------------------------------------------------------------------------------------- Captured stdout setup ----------------------------------------------------------------------------------------------------------
Starting LiteLLM proxy server on port 50616
--------------------------------------------------------------------------------------------------------- Captured stderr setup ----------------------------------------------------------------------------------------------------------
INFO: Started server process [90639]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:50616 (Press CTRL+C to quit)
---------------------------------------------------------------------------------------------------------- Captured stdout call ----------------------------------------------------------------------------------------------------------
#------------------------------------------------------------#
# #
# 'I get frustrated when the product...' #
# https://github.com/BerriAI/litellm/issues/new #
# #
#------------------------------------------------------------#
Thank you for using LiteLLM! - Krrish & Ishaan
Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM: Proxy initialized with Config, Set models:
dspy-test-model
dspy-test-model-2
INFO: 127.0.0.1:50639 - "POST /chat/completions HTTP/1.1" 200 OK
============================================================================================================ warnings summary ============================================================================================================
../../../miniconda3/envs/default/lib/python3.10/site-packages/pydantic/_internal/_config.py:291
/Users/corey.zumar/miniconda3/envs/default/lib/python3.10/site-packages/pydantic/_internal/_config.py:291: PydanticDeprecatedSince20: Support for class-based `config` is deprecated, use ConfigDict instead. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.9/migration/
warnings.warn(DEPRECATION_MESSAGE, DeprecationWarning)
../../../miniconda3/envs/default/lib/python3.10/site-packages/wandb/env.py:16
/Users/corey.zumar/miniconda3/envs/default/lib/python3.10/site-packages/wandb/env.py:16: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
from distutils.util import strtobool
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================================================================================================== short test summary info =========================================================================================================
FAILED test_caching.py::test_lm_calls_are_cached_across_interpreter_sessions - AssertionError: assert 1 == 0
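The test above also depends on a temporary_populated_cache_dir fixture. A minimal sketch, assuming the on-disk cache is checked in under tests/caching/example_cache and its location is controlled by an environment variable (the variable name here is an assumption):

# Sketch only: copy the checked-in example cache into a temp directory and
# point the cache at it. The "DSPY_CACHEDIR" env var name is an assumption;
# in practice the cache may also need to be re-initialized after the change.
import os
import shutil

import pytest


@pytest.fixture()
def temporary_populated_cache_dir(tmp_path, monkeypatch):
    example_cache = os.path.join(os.path.dirname(__file__), "example_cache")
    populated_dir = str(tmp_path / "cache")
    shutil.copytree(example_cache, populated_dir)
    monkeypatch.setenv("DSPY_CACHEDIR", populated_dir)
    yield populated_dir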
tests/caching/example_cache/cache.db
I've confirmed via inspection with the sqlite3 CLI that it doesn't contain anything sensitive.
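For anyone repeating that check, an equivalent inspection using Python's built-in sqlite3 module instead of the CLI (the table layout depends on the cache implementation):

# Dump every table in the example cache database so its contents can be
# reviewed for anything sensitive.
import sqlite3

conn = sqlite3.connect("tests/caching/example_cache/cache.db")
table_names = [row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")]
for table_name in table_names:
    print(f"== {table_name} ==")
    for row in conn.execute(f"SELECT * FROM {table_name}"):
        print(row)
conn.close()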
Signed-off-by: dbczumar <[email protected]>
# @pytest.mark.parametrize("keys_in_env_vars", [True, False])
# def test_lm_chat_respects_max_retries(keys_in_env_vars, monkeypatch):
#     model_name = "openai/gpt4o"
#     num_retries = 17
#     temperature = 0.5
#     max_tokens = 100
#     prompt = "Hello, world!"
#     api_version = "2024-02-01"
#     api_key = "apikey"
Let's remove these unused test cases for now and reintroduce them once BerriAI/litellm#6623 is addressed
#     prompt = "Hello, world!"
#     api_version = "2024-02-01"
#     api_key = "apikey"
def test_lms_can_be_queried(litellm_test_server):
Added this bonus (non-caching) smoke test case. This would've caught #1753; if we apply that diff, this test fails with:
================================================================================================================ FAILURES ================================================================================================================
________________________________________________________________________________________________________ test_lms_can_be_queried _________________________________________________________________________________________________________
litellm_test_server = ('http://127.0.0.1:53056', '/var/folders/8f/s8n7wq1j2j3440sfljwmmpr80000gp/T/tmp40p6i2nc/request_logs.jsonl')
def test_lms_can_be_queried(litellm_test_server):
api_base, _ = litellm_test_server
openai_lm = dspy.LM(
model="openai/dspy-test-model",
api_base=api_base,
api_key="fakekey",
)
> openai_lm("openai query")
test_lm.py:17:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../dspy/utils/callback.py:202: in wrapper
return fn(instance, *args, **kwargs)
../../dspy/clients/lm.py:85: in __call__
response = completion(
../../dspy/clients/lm.py:169: in cached_litellm_completion
return litellm_completion(
../../dspy/clients/lm.py:178: in litellm_completion
router = _get_litellm_router(model=kwargs["model"], num_retries=num_retries)
../../dspy/clients/lm.py:210: in _get_litellm_router
return Router(
../../../miniconda3/envs/default/lib/python3.10/site-packages/litellm/router.py:368: in __init__
self.set_model_list(model_list)
../../../miniconda3/envs/default/lib/python3.10/site-packages/litellm/router.py:3870: in set_model_list
self._create_deployment(
../../../miniconda3/envs/default/lib/python3.10/site-packages/litellm/router.py:3789: in _create_deployment
deployment = self._add_deployment(deployment=deployment)
../../../miniconda3/envs/default/lib/python3.10/site-packages/litellm/router.py:3951: in _add_deployment
InitalizeOpenAISDKClient.set_client(
../../../miniconda3/envs/default/lib/python3.10/site-packages/litellm/router_utils/client_initalization_utils.py:463: in set_client
_client = openai.AsyncOpenAI( # type: ignore
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <openai.AsyncOpenAI object at 0x317ce97b0>
def __init__(
self,
*,
api_key: str | None = None,
organization: str | None = None,
project: str | None = None,
base_url: str | httpx.URL | None = None,
timeout: Union[float, Timeout, None, NotGiven] = NOT_GIVEN,
max_retries: int = DEFAULT_MAX_RETRIES,
default_headers: Mapping[str, str] | None = None,
default_query: Mapping[str, object] | None = None,
# Configure a custom httpx client.
# We provide a `DefaultAsyncHttpxClient` class that you can pass to retain the default values we use for `limits`, `timeout` & `follow_redirects`.
# See the [httpx documentation](https://www.python-httpx.org/api/#asyncclient) for more details.
http_client: httpx.AsyncClient | None = None,
# Enable or disable schema validation for data returned by the API.
# When enabled an error APIResponseValidationError is raised
# if the API responds with invalid data for the expected schema.
#
# This parameter may be removed or changed in the future.
# If you rely on this feature, please open a GitHub issue
# outlining your use-case to help us decide if it should be
# part of our public interface in the future.
_strict_response_validation: bool = False,
) -> None:
"""Construct a new async openai client instance.
This automatically infers the following arguments from their corresponding environment variables if they are not provided:
- `api_key` from `OPENAI_API_KEY`
- `organization` from `OPENAI_ORG_ID`
- `project` from `OPENAI_PROJECT_ID`
"""
if api_key is None:
api_key = os.environ.get("OPENAI_API_KEY")
if api_key is None:
> raise OpenAIError(
"The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable"
)
E openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
../../../miniconda3/envs/default/lib/python3.10/site-packages/openai/_client.py:319: OpenAIError
32 KB is fine, but because git keeps files in history permanently, these could add up, so we may want to treat adding caches to the repo as an exceptional move. Learned this the hard way from having cache/ in the repo's root.
Add test coverage for caching against a LiteLLM test server, preventing caching regressions such as #1760