-
Notifications
You must be signed in to change notification settings - Fork 440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ExtAPI: questions about new format #1186
Comments
Your proposal seems good to me. Could you please make a pr? @shuangxiangkan could you take a look at this? Make any comments if you have. |
Another quick question: it looks like all functions that are annotated with STATIC in extapi.c are also annotated with ALLOC_RET; is that intentional? The aliasing behaviors of those annotations seem contradictory: ALLOC_RET always returns a new object, whereas all calls to a STATIC function should alias. Most other functions in extapi.c seem to have function-level annotations as well. E.g. "strtok", should that not be implementable as follows:
without any function-level annotation? |
Though calling the same static function, it will return different new objects. You could have a try to validate it. If a function in extapi.c has one or more annotations, the function body will be ignored and the side-effect is based on the annotations. |
(1) If the application code has a declared function, you could provide your own implementated function in extapi.c For your case |
Thanks for your reply. As far as I know, But even if they didn't (e.g. |
I see. I think you are right. STATIC and ALLOC_RET sometimes are contradictory to each other. We will need to sort this out. What suggestions for the fix? |
Would be good to go through each function with these two annotations. For a function which is not STATIC, the annotation STATIC should be removed. For a function that is STATIC, both annotations should be removed and make the function body return a malloc object so that the application code will share the same malloc site with only one object for all its invocations. |
Would SVF not be able to handle the STATIC annotation alone? If we return a malloc() object, will that not lead SVF to believe that every call to this function would result in a new object? |
No, if we have a function body returns malloc in extapi. The SVFIR or PAG will treat this malloc as a single object returning to all its call sites that call this function. If we use annotation like RET_ALLOC, SVF will return different objects for each callsite calls this function |
I see. From the point of view of allocation wrappers, we frequently encounter not simply wrapper functions (such as xmalloc), but also initialization functions or functions that contain additional pointer behavior beyond simple allocation, e.g.:
How could we let SVF capture that
With the old extapi format, we were able to generate an extapi entry that captured both the new object allocation and the copy edge between the 2nd parameter and the 1st parameter. We were hoping to also capture the initialization of the struct field in the new format, but how could we achieve this? If we specify "ALLOC_RET", the body of the function will be ignored? |
In your case, you can simply remove the ALLOC_RET annotation and use your self-defined function body in ExtAPI.c It should work. |
If you specify "ALLOC_RET", yes, the body of the function will be ignored. |
Okay, but would SVF not treat all calls to this function as returning the same object then? Which would defeat the point of adding entries for the allocation wrappers in the first place, because we are trying to have SVF treat calls to the allocation wrapper function as separate objects. |
Proposal: would it be possible to "inline" the PAG information obtained from the extapi entries in their callers (1-level context sensitive)? Most extapi functions could then be entirely defined by their bodies (no function-level annotations), and function-level annotations would be mandatory on declarations in extapi (functions without body). e.g. // mandatory function-level attribute on functions without body
// svf_extapi_new_obj is an artificial function that just provides the new object node
// the only necessary annotations (I think) are ALLOC_RET and STATIC
__attribute__((annotate("ALLOC_RET"))
void* svf_extapi_new_obj();
// no need for ALLOC_RET here; PAG information of body is inlined into every caller -> no two callsites will alias
void* malloc(size_t bytes) {
return svf_extapi_new_obj();
}
// no need for ALLOC_ARGi, the PAG of this body will already contain information about where the new object is stored
int asprintf(char **restrict strp, const char *restrict fmt, ...) {
*strp = svf_extapi_new_obj();
return 0;
}
// no need for REALLOC_RET, the PAG of this body handles the NULL case
void *realloc(void *ptr, size_t size) {
if (ptr)
return ptr;
return svf_extapi_new_obj();
}
// another artificial function that provides a pointer to the location of the
// strtok global. all calls to this function will alias
// note: function-level annotation mandatory because the function has no body
__attribute__((annotate("STATIC"))
char** svf_extapi_strtok_global_loc();
// again, no need for an attribute here: the body contains all the necessary information
char *strtok(char * str, const char * delim) {
char** global_loc = svf_extapi_strtok_global_loc();
if (str)
*global_loc = str;
return *global_loc;
}
// __ctype_b_loc: this function returns a pointer to a STATIC pointer to a STATIC array
// this corresponds to the old EFT_STAT2 entry (multi-level static), which no longer exists in the new extapi format (?)
// none of the entries in the top-level array should alias each other
// when we inline PAG information, we can describe arbitrarily deep/complex datastructures without additional annotations
// "STATIC" means "return value points to static var"
__attribute__((annotate("STATIC"))
const unsigned short* svf_extapi_ctype_b_array();
__attribute__((annotate("STATIC"))
const unsigned short** svf_extapi_ctype_b_loc();
const unsigned short** __ctype_b_loc() {
const unsigned short** loc = svf_extapi_ctype_b_loc();
for (int = -128; i < 256; i++)
loc[i] = svf_extapi_ctype_b_array();
return loc;
} Given a few restrictions on extapi entries (no indirect calls, no recursion), we could then express arbitrarily complex pointer behavior in the extapi, including the example from our discussion above: // no annotation needed
void* init_struct(void* ptr, void** secondptr) {
struct* ret = svf_extapi_new_obj();
ret->field = ptr;
*secondptr = ptr;
return ret;
} In my understanding, this format provides the following benefits:
|
The proposal looks pretty good! there are two challenges:
|
re 1: I'm not too familiar with how the SVFIR is constructed, but I believe calls to functions that have an extapi counterpart should be handled specially. When encountering a call to such a function, e.g., malloc, I imagine we should not just add a directcallgraphedge to the callgraph (CallGraphBuilder.cpp), but instead add the ICFG nodes of the callee to the ICFG of the caller (inlining the SVFIR representation of the function). To be clear, I don't think we should inline the actual extapi code into the app module. re 2. As long as we call the declarations in the extapi, they should not get optimized out, I think. But if it proves to be an issue, we can use the same workaround as in extapi.c right now and add a dummy body. The bodies of functions with annotations on them will then simply be ignored. (note that this is still different from the current format; non-annotated functions can still behave like "ALLOC_RET"). I'm out of my depth on how to implement such "inlining" in SVF; could you point me towards where the SVFIR of the current extapi entries is queried during pointer analysis? I imagine that the changes would have to happen there. |
I agree with your idea of inline the PAG information from the external API into the caller's PAG. However, constructing the PAG can involve various data structures, and if the body of the external API function is complex, inlining the PAG from the external API into the caller's PAG can be challenging. This is because it's not as simple as adding a copy or store edge at the external callsite, as was done in the previous ExtAPI.json. SVFIR requires the information from the external API to be present at this place: SVF/svf-llvm/lib/SVFIRBuilder.cpp Line 850 in dab4f64
Even when calling a declaration within extapi.c, while using Clang to compile extapi.c into extapi.bc with optimization flags disabled (-O0), the annotations information still cannot be preserved. Therefore, it is necessary to include an empty function body to retain the annotations information. |
For the second challenge, welcome any solution to remove the dummy function body for external functions with annotation. |
I've looked a little more at how to do this (challenge 1), and I believe it would be good to create separate SVFInstructions per-callsite for the extapi functions. So in All later analyses would observe the contents of the extapi functions as if they were inlined into the app functions, which achieves our goal, I believe. One (major?) issue i see with this currently are indirect calls to extapi functions. At the SVFFunction construction level, do we know yet an estimation of the targets of indirect calls? |
In pull request #1188, "attribute((annotate("STATIC")))" has been removed, and instead, returns malloc() within the function body. When functions with the "STATIC" annotation in previous extapi.c are called multiple times, only the same object will be returned. |
@adriaanjacobs can you take a look at #1188 before I merge it? |
Hi
We've submitted patches in the past to allow extending the ExtAPI programmatically with new entries to describe allocation wrappers (#821, #1083), and would like to continue updating that functionality to support the new format. To that purpose, we have a few questions about the direction of the new format, and how you would like extensibility to be handled:
OVERWRITE
attribute will now always clear the function definition in the app module (app_def). This seems different from the previous behaviour of theOVERWRITE_APP
flag: we used to be able to keep the original definition in the app, while SVF simply ignored the app definition and used the behaviour provided by the extapi database (now extapi.bc, previously extapi.json) to handle it.As always, happy to send PRs for these changes. Just wanted to confirm whether they are welcome :)
The text was updated successfully, but these errors were encountered: