How can I convert output.jsonl into a format similar to SWE-Gym/OpenHands-SFT-Trajectories? #6830

Open
lycfight opened this issue Feb 19, 2025 · 0 comments
Labels
evaluation Related to running evaluations with OpenHands troubleshooting/help User requires help

Comments

@lycfight

What problem or use case are you trying to solve?

Describe the UX of the solution you'd like

Do you have thoughts on the technical implementation?

Describe alternatives you've considered

Additional context

@xingyaoww

I want to train a model using long dialogue trajectory data generated by OpenHands to improve performance. However, I am confused by the output.jsonl data generated by run_infer. Some history entries have an odd number of turns, some have an even number, and some rounds are empty. This differs significantly from the common multi-turn dialogue SFT data format.

Which parts of the data can be used to form a proper multi-turn dialogue training format? Can you explain the meaning of the fields in this file in detail? Also, how can I convert output.jsonl into a format similar to SWE-Gym/OpenHands-SFT-Trajectories?
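
To make the question concrete, here is a minimal sketch of the kind of conversion I have in mind. Everything about the schema is an assumption on my side, not confirmed OpenHands behavior: that each line of output.jsonl is a JSON object with a `history` list, that each history event is a dict with a `source` field (`agent`, `user`, or `environment`) and its text under `message` or `content`, and that the target format is a `messages` list of `role`/`content` pairs. The helper names are hypothetical. Empty events are dropped and consecutive events with the same role are merged, so the result alternates roles regardless of how many raw events there were.

```python
import json
from typing import Any, Optional


def event_to_message(event: dict[str, Any]) -> Optional[dict[str, str]]:
    """Map one history event to a chat message, or None if it has no text."""
    # ASSUMPTION: text lives under "content" or "message"; real events may differ.
    text = (event.get("content") or event.get("message") or "").strip()
    if not text:
        return None  # skip the "empty rounds"
    # ASSUMPTION: source == "agent" is the model; everything else is treated as input.
    role = "assistant" if event.get("source") == "agent" else "user"
    return {"role": role, "content": text}


def history_to_messages(history: list[Any]) -> list[dict[str, str]]:
    """Build a role-alternating message list from a flat list of events."""
    messages: list[dict[str, str]] = []
    for event in history:
        if not isinstance(event, dict):
            continue  # ignore entries that are not event dicts
        msg = event_to_message(event)
        if msg is None:
            continue
        if messages and messages[-1]["role"] == msg["role"]:
            # Merge consecutive same-role events into one turn.
            messages[-1]["content"] += "\n\n" + msg["content"]
        else:
            messages.append(msg)
    return messages


def convert_line(line: str) -> dict[str, Any]:
    """Convert one output.jsonl line into a {"messages": [...]} record."""
    record = json.loads(line)
    return {"messages": history_to_messages(record.get("history", []))}
```

Each line of output.jsonl would be passed through `convert_line` and the results written out as a new JSONL file. Whether this captures the agent's actual tool calls, and whether the system prompt has to be added separately, is exactly what I am unsure about.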

@lycfight lycfight added the enhancement New feature or request label Feb 19, 2025
@mamoodi mamoodi added evaluation Related to running evaluations with OpenHands troubleshooting/help User requires help and removed enhancement New feature or request labels Feb 19, 2025