Expected Behavior
When using ./main with the --grammar flag, llama.cpp generates output that conforms to the supplied grammar string.
The same behavior is expected to carry over to ./parallel.
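For reference, a minimal ./main invocation that honors a grammar looks like the following (the model path is a placeholder, and the grammar is a trimmed version of the one from the reproduction steps below):

./main --model <model_path> --prompt "What's your favorite number?" --grammar 'root ::= "1" | "2" | "3"' --n-predict 8

With the grammar active, ./main emits only "1", "2", or "3".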
Current Behavior
./parallel <args> ... --grammar <grammar_string> doesn't respect the grammar, so llama.cpp generates free-form text.
Environment and Context
MacBook Pro, M1 Pro chip, macOS Sonoma
Operating System:
$ uname -a
Darwin <my_username>.local 23.0.0 Darwin Kernel Version 23.0.0: Fri Sep 15 14:41:43 PDT 2023; root:xnu-10002.1.13~1/RELEASE_ARM64_T6000 arm64
SDK version:
$ python3 --version
$ make --version
GNU Make 3.81
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
This program built for i386-apple-darwin11.3.0
$ g++ --version
Apple clang version 15.0.0 (clang-1500.0.40.1)
Target: arm64-apple-darwin23.0.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
Failure Information (for bugs)
I'm not sure if it's related, but I noticed that parallel decoding treats each line of the prompt as a separate prompt (one per sequence).
Also, parallel decoding seems to run in a chat setting rather than a completion setting.
Steps to Reproduce
For example, try this:
./parallel --prompt "What's your favorite number?" --in-prefix '' --in-suffix '' --model <model_path> --ctx-size 8192 --color --n-predict 128 --keep 0 --temp 0.8 --repeat-penalty 1.1 --repeat-last-n 64 --grammar '# `root` specifies the pattern for the overall output
root ::= (
value
)
value ::= "1" | "2" | "3"
' --parallel 1 --sequences 1 --threads 10 --n-gpu-layers 128 --main-gpu 0
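With this grammar, root can only match one of the literal characters "1", "2", or "3", so any longer free-form answer from ./parallel indicates the --grammar argument is being ignored.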