Problem Statement

Current LLM development is moving toward structured output, which has been shown to improve model performance across a variety of tasks. Training with structured output would also let us explore long-context training, which we have not done yet.

Idea

Structured responses significantly improve the model's systematic reasoning ability. To achieve this, the authors design <SUMMARY>, <CAPTION>, <REASONING>, and <CONCLUSION> tags that help the model recognize which stage of reasoning it is in, and they build the LLaVA-o1-100k dataset by using GPT-4o to generate stage-level reasoning annotations (see the tag-format sketch below).

Inference-time scaling: unlike previous methods such as best-of-N search and sentence-level beam search, the paper proposes a stage-level beam search. Multiple candidate responses are generated for each stage (marked by the tags), and the best one is selected before proceeding to the next stage (see the search sketch below).
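A minimal sketch of what the staged output format could look like and how to split it back into stages. The exact open/close tag layout and the example response here are assumptions for illustration, not taken verbatim from the paper.

```python
import re

# The four reasoning stages described in the paper.
STAGES = ["SUMMARY", "CAPTION", "REASONING", "CONCLUSION"]

def parse_staged_response(text: str) -> dict:
    """Split a response into its four stages.

    Assumes each stage is wrapped in literal <STAGE>...</STAGE> tags.
    Missing stages come back as empty strings.
    """
    stages = {}
    for name in STAGES:
        match = re.search(rf"<{name}>(.*?)</{name}>", text, flags=re.DOTALL)
        stages[name] = match.group(1).strip() if match else ""
    return stages

# Hypothetical model output used only to exercise the parser.
example = (
    "<SUMMARY>Identify the tallest bar in the chart.</SUMMARY>"
    "<CAPTION>A bar chart with four bars labeled A-D.</CAPTION>"
    "<REASONING>Bar C is visibly taller than A, B, and D.</REASONING>"
    "<CONCLUSION>C</CONCLUSION>"
)
print(parse_staged_response(example)["CONCLUSION"])  # -> "C"
```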
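A minimal sketch of the stage-level selection loop, assuming greedy selection of one winner per stage. `generate` and `score` are hypothetical placeholders for "sample one candidate for the current stage" and "judge candidate quality" (the paper has the model itself compare candidates); they are not real APIs.

```python
from typing import Callable, List

STAGES = ["SUMMARY", "CAPTION", "REASONING", "CONCLUSION"]

def stage_level_beam_search(
    prompt: str,
    generate: Callable[[str, str], str],  # (context, stage_name) -> one candidate for that stage
    score: Callable[[str, str], float],   # (context, candidate) -> quality score
    num_candidates: int = 4,
) -> str:
    """Sample several candidates per stage, keep the best one,
    append it to the context, then move on to the next stage."""
    context = prompt
    for stage in STAGES:
        candidates: List[str] = [generate(context, stage) for _ in range(num_candidates)]
        best = max(candidates, key=lambda c: score(context, c))
        context += best  # commit the winning stage before generating the next one
    return context
```

Compared with best-of-N (which scores only complete responses) or sentence-level beam search (which branches at every sentence), this commits to one candidate at each tagged stage boundary, so errors in an early stage can be pruned before the later stages are generated.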
Reference: https://arxiv.org/pdf/2411.10440