Rework dspy logger #1732

chenmoneygithub · 2024-10-31T21:46:52Z

This PR replaces the current structlog-based logger by standard logging module-based logger. The proposed approach is simpler and more robust. Furthermore, it doesn't affect dependency package's logging setup, e.g., OpenAI's logger will remain at WARNING level.

People can also enable or disable DSPy event logging by calling dspy.enable_logging()/dspy.disable_logging().

Sample logging (with this PR's change) from running MiproV2 optimizer:

(dspy) *[fix-logger][~/Documents/mlflow_team/dspy]$ python3 tmp.py
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 7473/7473 [00:00<00:00, 67228.69it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 1319/1319 [00:00<00:00, 79755.02it/s]
2024/10/31 14:38:05 INFO dspy.teleprompt.mipro_optimizer_v2: Projected Language Model (LM) Calls

Based on the parameters you have set, the maximum number of LM calls is projected as follows:

- Prompt Generation: 10 data summarizer calls + 3 * 1 lm calls in program + (2) lm calls in program-aware proposer = 15 prompt model calls
- Program Evaluation: 2 examples in minibatch * 30 batches + 8 examples in val set * 4 full evals = 92 LM Program calls

Estimated Cost Calculation:

Total Cost = (Number of calls to task model * (Avg Input Token Length per Call * Task Model Price per Input Token + Avg Output Token Length per Call * Task Model Price per Output Token) 
            + (Number of program calls * (Avg Input Token Length per Call * Task Prompt Price per Input Token + Avg Output Token Length per Call * Prompt Model Price per Output Token).

For a preliminary estimate of potential costs, we recommend you perform your own calculations based on the task
and prompt models you intend to use. If the projected costs exceed your budget or expectations, you may consider:

- Reducing the number of trials (`num_trials`), the size of the valset, or the number of LM calls in your program.
- Using a cheaper task model to optimize the prompt.
- Setting `minibatch=True` if you haven't already.

2024/10/31 14:38:05 INFO dspy.teleprompt.mipro_optimizer_v2: To proceed with the execution of this program, please confirm by typing 'y' for yes or 'n' for no.

If you would like to bypass this confirmation step in future executions, set the `requires_permission_to_run` flag to `False` when calling compile.

Awaiting your input...

Do you wish to continue? (y/n): y
2024/10/31 14:38:39 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 1: BOOTSTRAP FEWSHOT EXAMPLES <==
2024/10/31 14:38:39 INFO dspy.teleprompt.mipro_optimizer_v2: These will be used as few-shot example candidates for our program and for creating instructions.

2024/10/31 14:38:39 INFO dspy.teleprompt.mipro_optimizer_v2: Bootstrapping N=3 sets of demonstrations...
Bootstrapping set 1/3
Bootstrapping set 2/3
Bootstrapping set 3/3
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:05<00:00,  2.50s/it]
Bootstrapped 2 full traces after 1 examples for up to 1 rounds, amounting to 2 attempts.
2024/10/31 14:38:44 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 2: PROPOSE INSTRUCTION CANDIDATES <==
2024/10/31 14:38:44 INFO dspy.teleprompt.mipro_optimizer_v2: We will use the few-shot examples from the previous step, a generated dataset summary, a summary of the program code, and a randomly selected prompting tip to propose instructions.
2024/10/31 14:38:48 INFO dspy.teleprompt.mipro_optimizer_v2: 
Proposing instructions...

2024/10/31 14:38:57 INFO dspy.teleprompt.mipro_optimizer_v2: Proposed Instructions for Predictor 0:

2024/10/31 14:38:57 INFO dspy.teleprompt.mipro_optimizer_v2: 0: Given the fields `question`, produce the fields `answer`.

2024/10/31 14:38:57 INFO dspy.teleprompt.mipro_optimizer_v2: 1: Imagine you are a detective solving a high-stakes case where every second counts. You receive a critical question that requires you to calculate scores or distances based on the evidence presented. Your task is to not only provide the answer but also to explain your reasoning step by step. Given the question, produce the answer along with a clear explanation of your thought process.

2024/10/31 14:38:57 INFO dspy.teleprompt.mipro_optimizer_v2: 2: Given the `question`, analyze it step-by-step to provide a detailed `reasoning` process and calculate the final `answer`.

2024/10/31 14:38:57 INFO dspy.teleprompt.mipro_optimizer_v2: 

2024/10/31 14:38:57 INFO dspy.teleprompt.mipro_optimizer_v2: Evaluating the default program...

Average Metric: 7 / 8  (87.5): 100%|█████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:03<00:00,  2.02it/s]
2024/10/31 14:39:01 INFO dspy.evaluate.evaluate: Average Metric: 7 / 8 (87.5%)
2024/10/31 14:39:01 INFO dspy.teleprompt.mipro_optimizer_v2: Default program score: 87.5

2024/10/31 14:39:01 INFO dspy.teleprompt.mipro_optimizer_v2: ==> STEP 3: FINDING OPTIMAL PROMPT PARAMETERS <==
2024/10/31 14:39:01 INFO dspy.teleprompt.mipro_optimizer_v2: We will evaluate the program over a series of trials with different combinations of instructions and few-shot examples to find the optimal combination using Bayesian Optimization.

/Users/chen.qian/miniconda3/envs/dspy/lib/python3.12/site-packages/optuna/samplers/_tpe/sampler.py:319: ExperimentalWarning: ``multivariate`` option is an experimental feature. The interface can change in the future.
  warnings.warn(
2024/10/31 14:39:01 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 1 / 30 ==
Average Metric: 2 / 2  (100.0): 100%|████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00,  1.02s/it]
2024/10/31 14:39:03 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)
2024/10/31 14:39:03 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 100.0 on minibatch of size 2 with parameters ['Predictor 0: Instruction 1', 'Predictor 0: Few-Shot Set 2'].
2024/10/31 14:39:03 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [100.0]
2024/10/31 14:39:03 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [87.5]
2024/10/31 14:39:03 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 87.5
2024/10/31 14:39:03 INFO dspy.teleprompt.mipro_optimizer_v2: ============================


2024/10/31 14:39:03 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 2 / 30 ==
Average Metric: 2 / 2  (100.0): 100%|████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00,  1.28s/it]
2024/10/31 14:39:06 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)
2024/10/31 14:39:06 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 100.0 on minibatch of size 2 with parameters ['Predictor 0: Instruction 0', 'Predictor 0: Few-Shot Set 2'].
2024/10/31 14:39:06 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [100.0, 100.0]
2024/10/31 14:39:06 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [87.5]
2024/10/31 14:39:06 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 87.5
2024/10/31 14:39:06 INFO dspy.teleprompt.mipro_optimizer_v2: ============================


2024/10/31 14:39:06 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 3 / 30 ==
Average Metric: 2 / 2  (100.0): 100%|████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00,  1.12s/it]
2024/10/31 14:39:08 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)
2024/10/31 14:39:08 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 100.0 on minibatch of size 2 with parameters ['Predictor 0: Instruction 0', 'Predictor 0: Few-Shot Set 1'].
2024/10/31 14:39:08 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [100.0, 100.0, 100.0]
2024/10/31 14:39:08 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [87.5]
2024/10/31 14:39:08 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 87.5
2024/10/31 14:39:08 INFO dspy.teleprompt.mipro_optimizer_v2: ============================

okhat · 2024-11-01T11:00:25Z

Lgtm. Thanks a lot @chenmoneygithub !!

Rework dspy logger

3b09cde

chenmoneygithub requested a review from okhat October 31, 2024 21:47

add enable/disable logging support

fad1c52

chenmoneygithub force-pushed the fix-logger branch 2 times, most recently from a4f9f0a to 52e03d2 Compare October 31, 2024 22:10

file rename

00aa84b

chenmoneygithub force-pushed the fix-logger branch from 52e03d2 to 00aa84b Compare October 31, 2024 22:12

okhat merged commit a8492a0 into stanfordnlp:main Nov 1, 2024
4 checks passed

chenmoneygithub deleted the fix-logger branch December 27, 2024 22:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rework dspy logger #1732

Rework dspy logger #1732

chenmoneygithub commented Oct 31, 2024 •

edited

Loading

okhat commented Nov 1, 2024

Rework dspy logger #1732

Rework dspy logger #1732

Conversation

chenmoneygithub commented Oct 31, 2024 • edited Loading

okhat commented Nov 1, 2024

chenmoneygithub commented Oct 31, 2024 •

edited

Loading