Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use llguidance library for constraints (including json schemas) #899

Merged
merged 29 commits into from
Dec 2, 2024

Conversation

mmoskal
Copy link
Contributor

@mmoskal mmoskal commented Nov 6, 2024

Fixes #754

This imports the llguidance library for dealing with constraints (regexp, json schema, CFGs in lark syntax) and removes the aici submodule which was implementing regexps and CFGs (in yacc syntax) before. The code for both is related (I wrote the original versions of both).

This is WIP - putting it out there for comments about how likely it is to be merged.

TODO

  • remove testing scriptgs (scripst/*.sh)
  • remove aici folder
  • update python APIs to accept "json_schema"
  • update examples

@mmoskal mmoskal mentioned this pull request Nov 6, 2024
Copy link

github-actions bot commented Nov 6, 2024

Code Metrics Report
  ===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 C Header                2           35           28            0            7
 Dockerfile              1           41           22           10            9
 Happy                   1          442          369            0           73
 JSON                   12          105          104            0            1
 Python                 57         2435         2094           67          274
 Shell                   1           57           22           18           17
 Plain Text              3         3723            0         2413         1310
 TOML                   18          594          529            2           63
 YAML                    2           21           19            2            0
-------------------------------------------------------------------------------
 Jupyter Notebooks       4            0            0            0            0
 |- Markdown             2           77           32           31           14
 |- Python               2          205          178            1           26
 (Total)                            282          210           32           40
-------------------------------------------------------------------------------
 Markdown               44         3314            0         2510          804
 |- BASH                 6          101           98            0            3
 |- JSON                 1           12           12            0            0
 |- Python               7          121          109            0           12
 |- Rust                12          406          344            0           62
 |- TOML                 2           75           63            0           12
 (Total)                           4029          626         2510          893
-------------------------------------------------------------------------------
 Rust                  289        88547        79444         1844         7259
 |- Markdown           140         1499           25         1366          108
 (Total)                          90046        79469         3210         7367
===============================================================================
 Total                 434        99314        82631         6866         9817
===============================================================================
  

@EricLBuehler
Copy link
Owner

Hi @mmoskal! Thanks for the PR.

I took a look at the code and it looks great already. As you noted, adding some examples and updating the APIs + docs would be great (maybe some tests?). After that, this is something I would certainly be excited to merge.

@mmoskal
Copy link
Contributor Author

mmoskal commented Nov 10, 2024

Thanks for the review! I probably won't have time for it this week, but hopefully the one after that.

@EricLBuehler
Copy link
Owner

@mmoskal friendly ping :) Let me know if I can help!

@mmoskal
Copy link
Contributor Author

mmoskal commented Nov 19, 2024

It seems I won't have much time in the next two-three weeks. If anyone else is willing to take on the TODOs that would be great!

I'll be extending the llguidance APIs to be able to perform multithreaded mask computation in background (one of the things I will be busy with). Enabling this will get the constraint computation out of the hot path of sampling - but that should be a one-line change here.

@EricLBuehler
Copy link
Owner

EricLBuehler commented Nov 28, 2024

@mmoskal that sounds exciting - I totally understand however! I tried to recreate this PR to quickly merge it and get something working but it seems many of the APIs have changed since the code here was written. Looking forward to the enhancements and hopefully getting this merged!

@mmoskal mmoskal marked this pull request as ready for review December 1, 2024 19:41
@mmoskal
Copy link
Contributor Author

mmoskal commented Dec 1, 2024

@EricLBuehler should be good for review!

Copy link
Owner

@EricLBuehler EricLBuehler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks really great! I was wondering if you could please add more examples, namely:

  • Server: add an example of JSON schema and maybe a llguidance grammar
  • Python: similarly, add examples of Lark, JSON schema, and maybe a llguidance grammar
  • Rust: similarly, add examples of Lark, JSON schema, and maybe a llguidance grammar

Otherwise, the code looks great. I left a few comments, and it looks like formatting and clippy are failing in CI.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this is really cool! I also saw a JSON yacc grammar here which might be nice to include as a seperate example, too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, but I'm not sure this is the right place to point people to - getting a grammar for specific JSON is somewhat tricky (for example, you need to care about quoting in strings).

I have also added example in Llguidance format that combine Lark with JSON Schema - that's a more robust way of doing this.

mistralrs-core/src/request.rs Show resolved Hide resolved
@mmoskal mmoskal requested a review from EricLBuehler December 2, 2024 01:35
@mmoskal
Copy link
Contributor Author

mmoskal commented Dec 2, 2024

Added samples in rust, python and for server; fixed clippy and fmt. Should be good to go!

Copy link
Owner

@EricLBuehler EricLBuehler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much! This is really amazing and looks great.

I'll keep my eye on llguidance's development for new features!

@EricLBuehler EricLBuehler merged commit 4b92d4b into EricLBuehler:master Dec 2, 2024
12 checks passed
@mmoskal mmoskal deleted the llg_cleanup branch December 2, 2024 17:04
@mmoskal
Copy link
Contributor Author

mmoskal commented Dec 2, 2024

Thanks for merging!

There is a few areas where the integration can be improved, I filed #963, #964, #965 to track.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AICI -> llguidance?
2 participants