Add support for DeepSeek language models in STORM Wiki pipeline #84

Merged
merged 2 commits into from
Jul 19, 2024

Conversation

rmcc3
Contributor

@rmcc3 rmcc3 commented Jul 17, 2024

Description

This pull request adds support for using DeepSeek language models in the STORM Wiki pipeline, providing an alternative to the existing OpenAI models. The integration allows users to easily switch between OpenAI and DeepSeek models, enhancing the flexibility and capabilities of the STORM Wiki system.

Key Changes

  1. Implemented a new DeepSeekModel class in src/lm.py, which is compatible with the existing dspy.OpenAI interface.
  2. Created a new example script examples/run_storm_wiki_deepseek.py to demonstrate how to use DeepSeek models with the STORM Wiki pipeline.
  3. Updated the STORMWikiRunner class to be model-agnostic, ensuring compatibility with both OpenAI and DeepSeek models.
  4. Added topic name sanitization to handle special characters in file names. This fixes a bug that crashed the research step when the topic contained special characters.
  5. Added Unicode normalization to handle non-ASCII characters.
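
Items 4 and 5 can be illustrated with a minimal sketch. The PR diff is not shown in this thread, so the function name `sanitize_topic`, the allowed character set, and the `unnamed_topic` fallback are assumptions made for illustration, not the code that was merged.

```python
import re
import unicodedata


def sanitize_topic(topic: str) -> str:
    """Turn a free-form topic into a string that is safe to use as a directory name.

    Hypothetical helper: the name and the exact rules are assumptions.
    """
    # NFKD splits accented characters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", topic)
    # Encoding to ASCII with "ignore" drops the combining marks and any other non-ASCII character.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Runs of whitespace become a single underscore.
    underscored = re.sub(r"\s+", "_", ascii_only.strip())
    # Anything outside letters, digits, underscore, and hyphen is removed.
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "", underscored)
    # Fall back to a fixed name so the result is never an empty path component.
    return cleaned or "unnamed_topic"
```

One limitation of this approach is that topics written entirely in a non-Latin script reduce to the fallback name, so a real implementation may prefer to keep Unicode letters and only strip path separators and reserved characters.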

How to Use

Users can now run the STORM Wiki pipeline with DeepSeek models by using the new run_storm_wiki_deepseek.py script. The script allows configuration of DeepSeek-specific parameters such as model choice, temperature, and top_p sampling. The API key must be set in the DEEPSEEK_API_KEY environment variable. The DeepSeek API base URL can optionally be overridden with DEEPSEEK_API_BASE.
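
As a rough sketch of how the two environment variables might be read, the helper below collects them into a settings dictionary. The function name `load_deepseek_settings` and the default base URL are assumptions; the script in the PR may handle these values differently.

```python
import os
from typing import Mapping, Optional

# Assumed default endpoint; confirm against the DeepSeek documentation before relying on it.
DEFAULT_API_BASE = "https://api.deepseek.com"


def load_deepseek_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect DeepSeek connection settings from environment variables (hypothetical helper)."""
    # Accept an explicit mapping so the function can be exercised without touching os.environ.
    env = os.environ if env is None else env
    api_key = env.get("DEEPSEEK_API_KEY")
    if not api_key:
        # Fail early with a clear message instead of sending unauthenticated requests.
        raise RuntimeError("DEEPSEEK_API_KEY is not set")
    return {
        "api_key": api_key,
        # DEEPSEEK_API_BASE is optional; an unset or empty value falls back to the default.
        "api_base": env.get("DEEPSEEK_API_BASE") or DEFAULT_API_BASE,
    }
```

Calling `load_deepseek_settings()` with no argument reads the real process environment, so exporting `DEEPSEEK_API_KEY` in the shell before running the script is enough for the common case.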

Testing

  • Tested the integration with DeepSeek models.
  • Verified compatibility with existing retrieval methods.
  • Ensured proper handling of topic names with special characters.

Future Considerations

  • Consider implementing a unified interface for easily switching between different model providers and retrievers.

Collaborator

@shaoyijia shaoyijia left a comment

Hi @rmcc3, thank you so much for the prompt response! The quality of the example outputs is pretty good - we are very happy to support DeepSeek models in our project. Could you make the following changes so that I can merge this PR?

  • Resolve the merge conflict. (We recently made a breaking change to support installing our project via pip.)
  • Add a few lines (see my comment) to DeepSeekModel to ensure the call history is tracked. Currently, llm_call_history.jsonl is empty in your shared output. After adding these lines, it should include the call history for the session.
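
The lines the reviewer asked for are in an inline comment that is not reproduced in this thread. The sketch below only illustrates the general pattern of recording each call so that it can later be written out as JSONL. `HistoryTrackingLM`, `dump_history`, and the response layout are hypothetical and are not taken from the PR or from dspy.

```python
import json
from typing import Callable, Dict, List


class HistoryTrackingLM:
    """Hypothetical wrapper that records every call made to a backend function."""

    def __init__(self, backend: Callable[..., dict]):
        # backend is assumed to return a dict shaped like {"choices": [{"text": ...}]}.
        self.backend = backend
        self.history: List[Dict] = []

    def __call__(self, prompt: str, **kwargs) -> List[str]:
        response = self.backend(prompt, **kwargs)
        # Record the call before returning; skipping this step leaves the log empty.
        self.history.append({"prompt": prompt, "response": response, "kwargs": kwargs})
        return [choice["text"] for choice in response["choices"]]


def dump_history(lm: HistoryTrackingLM, path: str) -> None:
    """Write one JSON object per line, which is the layout a .jsonl file uses."""
    with open(path, "w", encoding="utf-8") as f:
        for record in lm.history:
            f.write(json.dumps(record) + "\n")
```

If the model class overrides the call method without appending to its history, a log built from that history ends up empty, which matches the symptom described in the review.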

src/lm.py (inline review comment, outdated and resolved)
@rmcc3 rmcc3 reopened this Jul 19, 2024
Contributor Author

rmcc3 commented Jul 19, 2024

Seems updating to head caused this to close. Working on the change now.

@rmcc3 rmcc3 marked this pull request as draft July 19, 2024 07:19
@rmcc3 rmcc3 marked this pull request as ready for review July 19, 2024 07:38
@rmcc3 rmcc3 requested a review from shaoyijia July 19, 2024 07:40
Contributor Author

rmcc3 commented Jul 19, 2024

llm_call_history.txt

Here is the call history (I had to upload it as .txt because GitHub does not accept .jsonl).

Collaborator

@shaoyijia shaoyijia left a comment

LGTM, thank you so much!!

@shaoyijia shaoyijia merged commit 61159c9 into stanford-oval:main Jul 19, 2024
feldges pushed a commit to feldges/storm that referenced this pull request Dec 4, 2024
Add support for DeepSeek language models in STORM Wiki pipeline
2 participants