Enable QNN EP weight sharing generation using public API #23702
…s if ep.share_ep_contexts is enabled
You can commit the suggested changes from lintrunner.
#include "command_args_parser.h"
#include <google/protobuf/stubs/common.h>

#include "core/session/onnxruntime_session_options_config_keys.h"
#include "core/session/inference_session.h"
#include "core/session/ort_env.h"
Hi, I believe this is still an internal header. I also see that this tool still depends on the internal graph classes onnxruntime::Graph and onnxruntime::Node, which have a public header but a private/internal implementation. Since this still requires the tool to be compiled with internal ORT code, would this prevent users from integrating this into their own toolchains?
Good point. For the post-processing part, users should be able to use the ONNX API to update the ONNX model. That is not the main part we want to cover in this tool, but I will change it to use the ONNX API to make that clear.
-v: Show verbose information.

-C: [session_config_entries]: Specify session configuration entries as key-value pairs: -C "<key1>|<val1> <key2>|<val2>"
Refer to onnxruntime_session_options_config_keys.h for valid keys and values.
[Example] -C "ep.context_enable|1 ep.context_embed_mode|0"
[Example] -C "ep.context_enable|1 ep.context_embed_mode|0". These are set as default so can be ignored.

-i: [provider_options]: Specify QNN EP specific runtime options as key value pairs. Different runtime options available are:
The -i help string still mentions QNN EP.
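The -C format quoted in the help text above (whitespace-separated `<key>|<value>` pairs) can be illustrated with a minimal sketch. `ParseSessionConfigEntries` is a hypothetical helper written for this example, not the tool's actual parser, and its error handling is an assumption.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical helper (not the tool's actual parser): splits
// "<key1>|<val1> <key2>|<val2>" into a key/value map.
std::unordered_map<std::string, std::string> ParseSessionConfigEntries(
    const std::string& arg) {
  std::unordered_map<std::string, std::string> entries;
  std::istringstream tokens(arg);
  std::string token;
  while (tokens >> token) {  // entries are separated by whitespace
    const auto sep = token.find('|');
    // Assumption: an entry with a missing key or value is rejected.
    if (sep == std::string::npos || sep == 0 || sep + 1 == token.size()) {
      throw std::invalid_argument("Expected <key>|<value>, got: " + token);
    }
    entries[token.substr(0, sep)] = token.substr(sep + 1);
  }
  return entries;
}
```

With the public C++ API, each resulting pair could then be passed to `Ort::SessionOptions::AddConfigEntry(key, value)`; whether the tool does exactly this is not shown in the quoted hunk.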
@@ -43,6 +43,35 @@ static const std::string& GetNodeAttr(const Node& node, const std::string& attr_
return default_val;
}

// from the context ache Onnx model, find the EPContext node with main_context=1,
Typo: "ache" should be "cache".
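For readers unfamiliar with the lookup described in the code comment quoted above, here is a rough sketch of finding the EPContext node with main_context=1. `NodeInfo` and `FindMainEpContextNode` are hypothetical stand-ins invented for this illustration; the PR itself reads these fields from the ONNX model, and its actual code is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a graph node; the real tool reads these
// fields from the ONNX model instead.
struct NodeInfo {
  std::string name;
  std::string op_type;
  std::unordered_map<std::string, std::int64_t> int_attrs;
};

// Returns the index of the first EPContext node whose main_context
// attribute is 1, or std::nullopt if there is none.
std::optional<std::size_t> FindMainEpContextNode(
    const std::vector<NodeInfo>& nodes) {
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    const NodeInfo& node = nodes[i];
    if (node.op_type != "EPContext") continue;
    const auto it = node.int_attrs.find("main_context");
    if (it != node.int_attrs.end() && it->second == 1) return i;
  }
  return std::nullopt;
}
```

Returning `std::optional` (C++17) keeps the "no main context node" case explicit, so a caller has to handle a model that contains no such node.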
cmake/onnxruntime_unittests.cmake
Outdated
endif()
target_link_libraries(onnxruntime_qnn_ctx_gen PRIVATE onnx_test_runner_common onnxruntime_test_utils onnxruntime_common onnxruntime_graph onnxruntime_session onnxruntime_providers onnxruntime_framework onnxruntime_util onnxruntime_mlas onnxruntime_optimizer onnxruntime_flatbuffers onnx_test_data_proto ${onnxruntime_test_providers_libs} ${onnxruntime_EXTERNAL_LIBRARIES} ${GETOPT_LIB_WIDE} ${SYS_PATH_LIB} ${CMAKE_DL_LIBS})
target_link_libraries(ep_weight_sharing_ctx_gen PRIVATE onnx_test_runner_common onnxruntime_test_utils onnxruntime_common onnxruntime_graph onnxruntime_session onnxruntime_providers onnxruntime_framework onnxruntime_util onnxruntime_mlas onnxruntime_optimizer onnxruntime_flatbuffers onnx_test_data_proto ${onnxruntime_test_providers_libs} ${onnxruntime_EXTERNAL_LIBRARIES} ${GETOPT_LIB_WIDE} ${SYS_PATH_LIB} ${CMAKE_DL_LIBS})
Can some of these internal libraries be removed now that the tool uses public APIs?
#endif

if (test_config.model_file_paths.size() > 2) {
std::cerr << "QNN EP only support 2 models for the weight sharing feature.";
There are cases where more than 2 models share weights.
Description
Enable QNN EP weight sharing generation using the public API instead of internal interfaces, so that users can integrate it into their own toolchains.
Change the tool name from onnxruntime_qnn_ctx_gen to ep_weight_sharing_ctx_gen, so that it can be shared with other EPs.