Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework dspy logger #1732

Merged
merged 3 commits into from
Nov 1, 2024
Merged

Rework dspy logger #1732

merged 3 commits into from
Nov 1, 2024

Conversation

chenmoneygithub
Copy link
Collaborator

@chenmoneygithub chenmoneygithub commented Oct 31, 2024

This PR replaces the current structlog-based logger by standard logging module-based logger. The proposed approach is simpler and more robust. Furthermore, it doesn't affect dependency package's logging setup, e.g., OpenAI's logger will remain at WARNING level.

People can also enable or disable DSPy event logging by calling dspy.enable_logging()/dspy.disable_logging().

Sample logging (with this PR's change) from running MiproV2 optimizer:

(dspy) *[fix-logger][~/Documents/mlflow_team/dspy]$ python3 tmp.py
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 7473/7473 [00:00<00:00, 67228.69it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 1319/1319 [00:00<00:00, 79755.02it/s]
2024/10/31 14:38:05 INFO dspy.teleprompt.mipro_optimizer_v2: Projected Language Model (LM) Calls

Based on the parameters you have set, the maximum number of LM calls is projected as follows:

- Prompt Generation: 10 data summarizer calls + 3 * 1 lm calls in program + (2) lm calls in program-aware proposer = 15 prompt model calls
- Program Evaluation: 2 examples in minibatch * 30 batches + 8 examples in val set * 4 full evals = 92 LM Program calls

Estimated Cost Calculation:

Total Cost = (Number of calls to task model * (Avg Input Token Length per Call * Task Model Price per Input Token + Avg Output Token Length per Call * Task Model Price per Output Token) 
            + (Number of program calls * (Avg Input Token Length per Call * Task Prompt Price per Input Token + Avg Output Token Length per Call * Prompt Model Price per Output Token).

For a preliminary estimate of potential costs, we recommend you perform your own calculations based on the task
and prompt models you intend to use. If the projected costs exceed your budget or expectations, you may consider:

- Reducing the number of trials (`num_trials`), the size of the valset, or the number of LM calls in your program.
- Using a cheaper task model to optimize the prompt.
- Setting `minibatch=True` if you haven't already.

2024/10/31 14:38:05 INFO dspy.teleprompt.mipro_optimizer_v2: To proceed with the execution of this program, please confirm by typing 'y' for yes or 'n' for no.

If you would like to bypass this confirmation step in future executions, set the `requires_permission_to_run` flag to `False` when calling compile.

Awaiting your input...

Do you wish to continue? (y/n): y
2024/10/31 14:38:39 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 1: BOOTSTRAP FEWSHOT EXAMPLES <==
2024/10/31 14:38:39 INFO dspy.teleprompt.mipro_optimizer_v2: These will be used as few-shot example candidates for our program and for creating instructions.

2024/10/31 14:38:39 INFO dspy.teleprompt.mipro_optimizer_v2: Bootstrapping N=3 sets of demonstrations...
Bootstrapping set 1/3
Bootstrapping set 2/3
Bootstrapping set 3/3
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:05<00:00,  2.50s/it]
Bootstrapped 2 full traces after 1 examples for up to 1 rounds, amounting to 2 attempts.
2024/10/31 14:38:44 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 2: PROPOSE INSTRUCTION CANDIDATES <==
2024/10/31 14:38:44 INFO dspy.teleprompt.mipro_optimizer_v2: We will use the few-shot examples from the previous step, a generated dataset summary, a summary of the program code, and a randomly selected prompting tip to propose instructions.
2024/10/31 14:38:48 INFO dspy.teleprompt.mipro_optimizer_v2: 
Proposing instructions...

2024/10/31 14:38:57 INFO dspy.teleprompt.mipro_optimizer_v2: Proposed Instructions for Predictor 0:

2024/10/31 14:38:57 INFO dspy.teleprompt.mipro_optimizer_v2: 0: Given the fields `question`, produce the fields `answer`.

2024/10/31 14:38:57 INFO dspy.teleprompt.mipro_optimizer_v2: 1: Imagine you are a detective solving a high-stakes case where every second counts. You receive a critical question that requires you to calculate scores or distances based on the evidence presented. Your task is to not only provide the answer but also to explain your reasoning step by step. Given the question, produce the answer along with a clear explanation of your thought process.

2024/10/31 14:38:57 INFO dspy.teleprompt.mipro_optimizer_v2: 2: Given the `question`, analyze it step-by-step to provide a detailed `reasoning` process and calculate the final `answer`.

2024/10/31 14:38:57 INFO dspy.teleprompt.mipro_optimizer_v2: 

2024/10/31 14:38:57 INFO dspy.teleprompt.mipro_optimizer_v2: Evaluating the default program...

Average Metric: 7 / 8  (87.5): 100%|█████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:03<00:00,  2.02it/s]
2024/10/31 14:39:01 INFO dspy.evaluate.evaluate: Average Metric: 7 / 8 (87.5%)
2024/10/31 14:39:01 INFO dspy.teleprompt.mipro_optimizer_v2: Default program score: 87.5

2024/10/31 14:39:01 INFO dspy.teleprompt.mipro_optimizer_v2: ==> STEP 3: FINDING OPTIMAL PROMPT PARAMETERS <==
2024/10/31 14:39:01 INFO dspy.teleprompt.mipro_optimizer_v2: We will evaluate the program over a series of trials with different combinations of instructions and few-shot examples to find the optimal combination using Bayesian Optimization.

/Users/chen.qian/miniconda3/envs/dspy/lib/python3.12/site-packages/optuna/samplers/_tpe/sampler.py:319: ExperimentalWarning: ``multivariate`` option is an experimental feature. The interface can change in the future.
  warnings.warn(
2024/10/31 14:39:01 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 1 / 30 ==
Average Metric: 2 / 2  (100.0): 100%|████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00,  1.02s/it]
2024/10/31 14:39:03 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)
2024/10/31 14:39:03 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 100.0 on minibatch of size 2 with parameters ['Predictor 0: Instruction 1', 'Predictor 0: Few-Shot Set 2'].
2024/10/31 14:39:03 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [100.0]
2024/10/31 14:39:03 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [87.5]
2024/10/31 14:39:03 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 87.5
2024/10/31 14:39:03 INFO dspy.teleprompt.mipro_optimizer_v2: ============================


2024/10/31 14:39:03 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 2 / 30 ==
Average Metric: 2 / 2  (100.0): 100%|████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00,  1.28s/it]
2024/10/31 14:39:06 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)
2024/10/31 14:39:06 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 100.0 on minibatch of size 2 with parameters ['Predictor 0: Instruction 0', 'Predictor 0: Few-Shot Set 2'].
2024/10/31 14:39:06 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [100.0, 100.0]
2024/10/31 14:39:06 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [87.5]
2024/10/31 14:39:06 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 87.5
2024/10/31 14:39:06 INFO dspy.teleprompt.mipro_optimizer_v2: ============================


2024/10/31 14:39:06 INFO dspy.teleprompt.mipro_optimizer_v2: == Minibatch Trial 3 / 30 ==
Average Metric: 2 / 2  (100.0): 100%|████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00,  1.12s/it]
2024/10/31 14:39:08 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)
2024/10/31 14:39:08 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 100.0 on minibatch of size 2 with parameters ['Predictor 0: Instruction 0', 'Predictor 0: Few-Shot Set 1'].
2024/10/31 14:39:08 INFO dspy.teleprompt.mipro_optimizer_v2: Minibatch scores so far: [100.0, 100.0, 100.0]
2024/10/31 14:39:08 INFO dspy.teleprompt.mipro_optimizer_v2: Full eval scores so far: [87.5]
2024/10/31 14:39:08 INFO dspy.teleprompt.mipro_optimizer_v2: Best full score so far: 87.5
2024/10/31 14:39:08 INFO dspy.teleprompt.mipro_optimizer_v2: ============================

@chenmoneygithub chenmoneygithub requested a review from okhat October 31, 2024 21:47
@chenmoneygithub chenmoneygithub force-pushed the fix-logger branch 2 times, most recently from a4f9f0a to 52e03d2 Compare October 31, 2024 22:10
@okhat
Copy link
Collaborator

okhat commented Nov 1, 2024

Lgtm. Thanks a lot @chenmoneygithub !!

@okhat okhat merged commit a8492a0 into stanfordnlp:main Nov 1, 2024
4 checks passed
@chenmoneygithub chenmoneygithub deleted the fix-logger branch December 27, 2024 22:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants