VAULT-28477 Bootstrap and persist autopilot versions #28186
Awesome job! I had a few questions and comments but I don't think anything showstopping.
physical/raft/raft_autopilot.go
Outdated
}
if upgradeVersion == "" {
	upgradeVersion = d.upgradeVersion
	d.logger.Debug("no persisted state, using leader upgrade version version", "id", id, "upgrade_version", d.effectiveSDKVersion)
- d.logger.Debug("no persisted state, using leader upgrade version version", "id", id, "upgrade_version", d.effectiveSDKVersion)
+ d.logger.Debug("no persisted state, using leader upgrade version version", "id", id, "upgrade_version", d.upgradeVersion)
Description
When setting up Raft and autopilot, the leader node will read a storage entry (`core/raft/autopilot/states`), which has a map of server IDs to upgrade and SDK versions. Autopilot will store this map in memory in a structure called `persistedStates`. If the storage entry does not exist, an error is logged but operation continues.

Whenever the autopilot library calls `NotifyStates` to inform the Vault delegate that a state has changed, the leader will check to see if the cluster membership differs from what is in the `persistedStates` map, or if the upgrade version or SDK version differs from the `persistedStates` map. If either of these conditions is true, `persistedStates` is updated and written to storage at the path `core/raft/autopilot/states`. `core/raft/autopilot/states` is not replicated to performance or DR secondaries.

New nodes joining a cluster will include their SDK version and upgrade version when they answer the Raft bootstrap challenge. When the active node receives this answer, the versions will be stored (along with the other server state) in the follower states map.
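The change check that runs on `NotifyStates`, as described above, can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the `serverVersions` type, its JSON field names, and the `statesChanged` and `syncPersisted` helpers are all hypothetical, and the `write` callback stands in for whatever storage call the leader actually uses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statesPath is the storage path named in the PR description.
const statesPath = "core/raft/autopilot/states"

// serverVersions is a hypothetical per-server record; the real struct and
// its field names may differ.
type serverVersions struct {
	UpgradeVersion string `json:"upgrade_version"`
	SDKVersion     string `json:"sdk_version"`
}

// statesChanged reports whether the cluster membership differs from the
// persisted map, or whether any server's upgrade or SDK version differs.
func statesChanged(persisted, current map[string]serverVersions) bool {
	if len(persisted) != len(current) {
		return true
	}
	for id, cur := range current {
		old, ok := persisted[id]
		if !ok || old != cur {
			return true
		}
	}
	return false
}

// syncPersisted returns the map to keep in memory. When nothing changed it
// returns the persisted map untouched and performs no write. Otherwise it
// copies the current view, encodes it, and hands it to write (a stand-in
// for the storage layer).
func syncPersisted(persisted, current map[string]serverVersions, write func([]byte) error) (map[string]serverVersions, error) {
	if !statesChanged(persisted, current) {
		return persisted, nil
	}
	updated := make(map[string]serverVersions, len(current))
	for id, v := range current {
		updated[id] = v
	}
	buf, err := json.Marshal(updated)
	if err != nil {
		return persisted, err
	}
	if err := write(buf); err != nil {
		return persisted, err
	}
	return updated, nil
}

func main() {
	persisted := map[string]serverVersions{
		"node-1": {UpgradeVersion: "1.17.0", SDKVersion: "1.17.0"},
	}
	// node-2 has joined, so membership differs and a write is expected.
	current := map[string]serverVersions{
		"node-1": {UpgradeVersion: "1.17.0", SDKVersion: "1.17.0"},
		"node-2": {UpgradeVersion: "1.17.0", SDKVersion: "1.17.0"},
	}
	write := func(buf []byte) error {
		fmt.Printf("write %s: %s\n", statesPath, buf)
		return nil
	}
	if _, err := syncPersisted(persisted, current, write); err != nil {
		fmt.Println("persist failed:", err)
	}
}
```

Comparing before writing keeps storage writes limited to actual membership or version changes, which matches the "if either of these conditions is true" wording in the description.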
When autopilot routinely calls `KnownServers` to get information about the nodes in the cluster, the leader will:

The persisted states need to exist in order to ensure that a new leader doesn’t demote existing voters if their heartbeat is late. If the persisted states weren’t available, a new leader wouldn’t have any knowledge of the other nodes’ versions until the heartbeats happened.
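One way to picture why the persisted states matter for a new leader is a version lookup with fallbacks. The sketch below is an assumption pieced together from the paragraph above and the reviewed snippet (which falls back to the leader's own upgrade version when there is no persisted state); the function name, parameters, and the exact order of the lookup are hypothetical and are not taken from the PR.

```go
package main

import "fmt"

// resolveUpgradeVersion picks the upgrade version to report for a server.
// Assumed order (hypothetical):
//  1. the version the follower reported (heartbeat or bootstrap answer),
//  2. the version found in the persisted states map,
//  3. the leader's own upgrade version as a last resort.
// It also returns a short label naming which source was used.
func resolveUpgradeVersion(id, reported string, persisted map[string]string, leaderVersion string) (version, source string) {
	if reported != "" {
		return reported, "follower state"
	}
	if v, ok := persisted[id]; ok && v != "" {
		return v, "persisted state"
	}
	return leaderVersion, "leader"
}

func main() {
	persisted := map[string]string{"node-2": "1.17.0"}

	// A new leader has not yet received a heartbeat from node-2, so the
	// reported version is empty and the persisted value is used instead.
	v, src := resolveUpgradeVersion("node-2", "", persisted, "1.18.0")
	fmt.Println(v, src)

	// node-3 has neither a heartbeat nor a persisted entry.
	v, src = resolveUpgradeVersion("node-3", "", persisted, "1.18.0")
	fmt.Println(v, src)
}
```

Under these assumptions, a late heartbeat from an existing voter still resolves to its last persisted version, so the new leader has no reason to treat it as an unknown or mismatched node.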
Ent PR: https://github.com/hashicorp/vault-enterprise/pull/6351
Doc: https://docs.google.com/document/d/10MY9U-r8dH46-ICIdrObEjVfHKwQxLzqzh33YWWnYrQ/edit
TODO only if you're a HashiCorp employee
to N, N-1, and N-2, using the `backport/ent/x.x.x+ent` labels. If this PR is in the CE repo, you should only backport to N, using the `backport/x.x.x` label, not the enterprise labels.
of a public function, even if that change is in a CE file, double check that applying the patch for this PR to the ENT repo and running tests doesn't break any tests. Sometimes ENT only tests rely on public functions in CE files.
in the PR description, commit message, or branch name.
description. Also, make sure the changelog is in this PR, not in your ENT PR.