-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add HasId
method to ValueStore
for debug checks
#4866
base: trunk
Are you sure you want to change the base?
Conversation
// Checks whether the ID has a value in this store. | ||
// | ||
// This is usually unnecessary to check, but can add safety to debug code. | ||
// Note that the ID may originate from a different store with a different | ||
// value. | ||
auto HasId(IdT id) const -> bool { | ||
return id.index >= 0 && static_cast<size_t>(id.index) < size(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's it look like if you just check against size()
in debug code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonmeow I was writing size()
checks in Dump
functions, but found it to be a little noisy when working with ValueStore
. Here's an example of my first sketch:
toolchain/check/dump.cpp:48-55
:
+ if (static_cast<size_t>(loc_id.import_ir_inst_id().index) >=
+ context.sem_ir().import_ir_insts().size()) {
+ llvm::errs() << "LocId(<out of bounds import instruction>)";
+ return;
+ }
auto import_ir_id = context.sem_ir()
.import_ir_insts()
.Get(loc_id.import_ir_inst_id())
.ir_id;
+ if (static_cast<size_t>(import_ir_id.index) >=
+ context.sem_ir().import_irs().size()) {
+ llvm::errs() << "LocId(<out of bounds import file>)";
+ return;
+ }
const auto* import_file =
context.sem_ir().import_irs().Get(import_ir_id).sem_ir;
llvm::errs() << "LocId(import from \""
<< FormatEscaped(import_file->filename()) << "\")";
For Lex & Parse, each having only one small Dump
function, checking against size()
is relatively quiet and works well.
Looking now at toolchain/sem_ir/dump.cpp
, I realize that many ValueStore
types there are wrappers. 😅 So HasId
won't be as widely available as I originally thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One alternative would be a template function like MemUsage::Collect
for types that have an array_ref()
or size()
member. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure whether this is what you're thinking about with the Collect
example, but how about:
- Add support to
ErrorOr
to support printing ifT
is printable. - Add a
SafeGet
helper somewhere liketoolchain/base/dump.h
(i.e., separating the dump-specific functions from regular code) that looks something liketemplate<ValueStoreT, IdT> auto SafeGet(ValueStoreT value_store, IdT id) -> ErrorOr<ValueStoreT::ConstRefType>
[if any value store-ish things are missingsize
, I think it'd be fair to add)
Note, I'm suggesting this to allow changes like:
static auto DumpNoNewline(const File& file, InstId inst_id) -> void {
llvm::errs() << inst_id;
if (inst_id.has_value()) {
- llvm::errs() << ": " << file.insts().Get(inst_id);
+ llvm::errs() << ": " << SafeGet(file.insts(), inst_id);
}
}
The example you're looking at with LocId
is a little more complex and would need to examine the ErrorOr
, but for the SemIR
case it should be easier to replace.
@@ -68,13 +68,13 @@ class ValueStore | |||
|
|||
// Returns a mutable value for an ID. | |||
auto Get(IdT id) -> RefType { | |||
CARBON_DCHECK(id.index >= 0, "{0}", id); | |||
CARBON_DCHECK(HasId(id), "{0}", id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny sidenote, SmallVector
has a assert(idx < size());
, which already covers what's changing here.
// Checks whether the ID has a value in this store. | ||
// | ||
// This is usually unnecessary to check, but can add safety to debug code. | ||
// Note that the ID may originate from a different store with a different | ||
// value. | ||
auto HasId(IdT id) const -> bool { | ||
return id.index >= 0 && static_cast<size_t>(id.index) < size(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure whether this is what you're thinking about with the Collect
example, but how about:
- Add support to
ErrorOr
to support printing ifT
is printable. - Add a
SafeGet
helper somewhere liketoolchain/base/dump.h
(i.e., separating the dump-specific functions from regular code) that looks something liketemplate<ValueStoreT, IdT> auto SafeGet(ValueStoreT value_store, IdT id) -> ErrorOr<ValueStoreT::ConstRefType>
[if any value store-ish things are missingsize
, I think it'd be fair to add)
Note, I'm suggesting this to allow changes like:
static auto DumpNoNewline(const File& file, InstId inst_id) -> void {
llvm::errs() << inst_id;
if (inst_id.has_value()) {
- llvm::errs() << ": " << file.insts().Get(inst_id);
+ llvm::errs() << ": " << SafeGet(file.insts(), inst_id);
}
}
The example you're looking at with LocId
is a little more complex and would need to examine the ErrorOr
, but for the SemIR
case it should be easier to replace.
### Context & Motivation The error handling utilities in `//base/error.h` are very useful for writing code with strong safety guarantees. While hardening the `Dump` debug utilities (from review in #4866), I encountered a rough edge with references and pointers. After a [brief Discord discussion in #contributing-help](https://discord.com/channels/655572317891461132/1052653651895779359/1334675462877610038), it was suggested that adding support for references to `ErrorOr` would be a good candidate to move forward. Using a reference type with the `ErrorOr` class (e.g. `ErrorOr<Node&>`) produces two errors: <ol> <li><strong><code>variant can not have a reference type as an alternative</code></strong> <ul><li>From private field: <code>std::variant<Error, T> val_;</code></li></ul> </li> <li><strong><code>'operator->' declared as a pointer to a reference</code></strong> <ul><li>From member function: <code>auto operator->() -> T*</code></li></ul> </li> </ol> ### Changes To support reference types, both errors are resolved: 1. `std::reference_wrapper` is conditionally used for storage when `T` is a reference type 2. type trait aliases like `using ValueT = std::remove_reference_t<T>` are used to produce compatible types for methods like `auto operator->() -> ValueT*`
Small addition to
ValueStore
to check if an ID can be safely used to access values from the store. This change iterates on the DCHECK bounds check that currently exists inValueStore::Get
.When working with multiple
ValueStore
s, IDs can become mixed up and misused to access the wrongValueStore
. This always indicates a bug in the code, so this method is added for the convenience of debug checks to identify bugs.In many situations there is only one
ValueStore
instance available with a unique ID type, so it's guaranteed that those IDs came fromValueStore::Add
; thus, this check is usually unnecessary. To deter overly-cautious code, this method is documented as a debug tool.I plan to use this method in a follow-up PR to harden
Dump
functions against out-of-bounds IDs.