[Optimization] compute real max_logprobs in batch #5430
base: develop
Conversation
Thanks for your contribution!
self.top_p_normalized_logprobs = True
self.prompt_logprobs_reqs: dict[str, Request] = {}
self.in_progress_prompt_logprobs: dict[str, LogprobsTensors] = {}
self.forward_batch_reqs_list: list[Request] = [None for _ in range(self.scheduler_config.max_num_seqs)]
Clear this in clear_requests as well.
logprobs = d.get("logprobs", None)
if logprobs is not None:
    if logprobs is True:
        sampling_params.logprobs = d.get("top_logprobs", None)
    elif logprobs is False:
        sampling_params.logprobs = None
Simplify this:
Suggested change
- logprobs = d.get("logprobs", None)
- if logprobs is not None:
-     if logprobs is True:
-         sampling_params.logprobs = d.get("top_logprobs", None)
-     elif logprobs is False:
-         sampling_params.logprobs = None
+ logprobs = d.get("logprobs", None)
+ if logprobs:
+     sampling_params.logprobs = d.get("top_logprobs", None)
+ else:
+     sampling_params.logprobs = None
logprobs may be true, false, or an integer value [-1, 0, 1, 2, ...]; the chat interface needs to map the bool type to a number or None.
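To illustrate why the truthiness-based simplification is not equivalent, here is a minimal sketch. The helper names `map_strict` and `map_truthy` and the `current` parameter are hypothetical and not part of the PR; `current` stands in for whatever `sampling_params.logprobs` held before the mapping ran, and the sketch assumes an integer `logprobs` was already stored there.

```python
from typing import Optional


def map_strict(d: dict, current: Optional[int]) -> Optional[int]:
    """Identity checks: only the literal booleans are remapped."""
    logprobs = d.get("logprobs", None)
    if logprobs is True:
        return d.get("top_logprobs", None)
    if logprobs is False:
        return None
    # None or an integer: keep the value that was already set (assumption).
    return current


def map_truthy(d: dict, current: Optional[int]) -> Optional[int]:
    """Truthiness check: integers are remapped as well, `current` is ignored."""
    if d.get("logprobs", None):
        return d.get("top_logprobs", None)
    return None


for value in (True, False, -1, 0, 2, None):
    d = {"logprobs": value, "top_logprobs": 5}
    # Assume an integer request value was stored before the mapping ran.
    is_int = isinstance(value, int) and not isinstance(value, bool)
    current = value if is_int else None
    print(value, map_strict(d, current), map_truthy(d, current))
```

With the identity checks, integer inputs pass through untouched. With the truthiness check, `0` is overwritten with None and any non-zero integer is overwritten with `top_logprobs`.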
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #5430 +/- ##
==========================================
Coverage ? 59.59%
==========================================
Files ? 327
Lines ? 40666
Branches ? 6175
==========================================
Hits ? 24233
Misses ? 14555
Partials ? 1878
Motivation
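Compute logprobs using the actual logprobs value requested in each batch, rather than always using the maximum of 20. This improves end-to-end performance by 10%.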
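The idea can be sketched as follows. This is a hedged illustration only, not the PR's implementation: `batch_max_logprobs` and `MAX_LOGPROBS_CAP` are hypothetical names, and treating a negative value such as -1 as "use the cap" is an assumption made for the sketch.

```python
from typing import Optional, Sequence

# The fixed upper bound mentioned in the motivation (name is hypothetical).
MAX_LOGPROBS_CAP = 20


def batch_max_logprobs(
    requested: Sequence[Optional[int]], cap: int = MAX_LOGPROBS_CAP
) -> int:
    """Return the largest logprobs count any request in the batch asked for."""
    values = [k for k in requested if k is not None]
    if not values:
        return 0  # no request wants logprobs, so nothing needs to be computed
    if any(k < 0 for k in values):
        return cap  # assumption: a negative value falls back to the cap
    return min(max(values), cap)


# Requests that asked for 2 and 5 logprobs, plus one that asked for none.
print(batch_max_logprobs([None, 2, 5]))
```

Sizing the computation to this per-batch value instead of the constant 20 is what the motivation describes.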
Modifications
No changes.
Usage or Command
No changes.
Accuracy Tests
Already covered by existing tests.
Checklist
[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
pre-commit before commit.
release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.