Releases: ClusterLabs/pacemaker
Pacemaker 1.1.11 - Release Candidate 3
Changes since Release Candidate 2
- Fix: ipc: fix memory leak for failed ipc client connections.
- Fix: pengine: Fixes memory leak in regex pattern matching code for constraints.
- Low: Avoid potentially misleading and inaccurate compression time log msg
- Fix: crm_report: Suppress logging errors after the target directory has been compressed
- Fix: crm_attribute: Do not swallow hostname lookup failures
- Fix: crmd: Avoid deleting the 'shutdown' attribute
- Log: attrd: Quote attribute names
- Doc: Pacemaker_Explained: Fix formatting
If you are a user of pacemaker_remoted, you should take the time to read about changes to the online wire protocol that are present in this release.
Pacemaker 1.1.11 - Release Candidate 2
Changes since Release Candidate 1
- Build: cts: Install cib_xml.py
- Low: report: Add support for xz compressed logs
- Fix: attrd: Memory leak
- Fix overflow on SMTP subject line
- Fix: Removes unnecessary newlines in crm_resource -O output
- Fix: crmd: Memory leak
- Fix a problem where chkconfig was unable to get a list correctly on RHEL5.
If you are a user of pacemaker_remoted, you should take the time to read about changes to the online wire protocol that are present in this release.
Pacemaker 1.1.11 - Release Candidate 1
The most notable changes/fixes since Pacemaker-1.1.10 include:
- attrd: Implementation of a truly atomic attrd for use with corosync 2.x
- cib: Allow values to be added/updated and removed in a single update
- cib: Support XML comments in diffs
- Core: Allow blackbox logging to be disabled with SIGUSR2
- crmd: Do not block on proxied calls from pacemaker_remoted
- crmd: Enable cluster-wide throttling when the cib heavily exceeds its target load
- crmd: Use the load on our peers to know how many jobs to send them
- crm_mon: add --hide-headers option to hide all headers
- crm_report: Collect logs directly from journald if available
- Fencing: On timeout, clean up the agent's entire process group
- Fencing: Support agents that need the host to be unfenced at startup
- ipc: Raise the default buffer size to 128k
- PE: Add a special attribute for distinguishing between real nodes and containers in constraint rules
- PE: Allow location constraints to take a regex pattern to match against resource IDs
- pengine: Distinguish between the agent being missing and something the agent needs being missing
- remote: Properly version the remote connection protocol
- services: Detect missing agents and permission errors before forking
- Bug cl#5171 - pengine: Don't prevent clones from running due to dependent resources
- Bug cl#5179 - Corosync: Attempt to retrieve a peer's node name if it is not already known
- Bug cl#5181 - corosync: Ensure node IDs are written to the CIB as unsigned integers
If you are a user of pacemaker_remoted, you should take the time to read about changes to the online wire protocol that are present in this release.
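The Release Candidate 1 list above mentions that location constraints can take a regex pattern to match against resource IDs. The following is a minimal sketch of that idea only, not Pacemaker's code: Pacemaker is written in C and its regex dialect may differ from Python's `re`. The function name `resources_matching` and the resource IDs are hypothetical.

```python
import re


def resources_matching(pattern, resource_ids):
    """Return the resource IDs in which the pattern is found (unanchored search)."""
    regex = re.compile(pattern)
    return [rid for rid in resource_ids if regex.search(rid)]


# Hypothetical resource IDs, for illustration only.
ids = ["ip-web-1", "ip-web-2", "db-primary", "ip-db"]

# One pattern selects several resources, so one constraint could cover them all.
selected = resources_matching(r"^ip-web-", ids)
print(selected)
```

The point of the feature is that a single constraint can apply to every resource whose ID matches, instead of one constraint per resource.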
1.1.10 - final
Details - 1.1.10 - final
Changesets | 602 |
Diff | 143 files changed, 8162 insertions(+), 5159 deletions(-) |
Highlights
Features added since Pacemaker-1.1.9
- Core: Convert all exit codes to positive errno values
- crm_error: Add the ability to list and print error symbols
- crm_resource: Allow individual resources to be reprobed
- crm_resource: Allow options to be set recursively
- crm_resource: Implement --ban for moving resources away from nodes and --clear (replaces --unmove)
- crm_resource: Support OCF tracing when using --force-(check|start|stop)
- PE: Allow active nodes in our current membership to be fenced without quorum
- PE: Suppress meaningless IDs when displaying anonymous clone status
- Turn off auto-respawning of systemd services when the cluster starts them
- Bug cl#5128 - pengine: Support maintenance mode for a single node
Changes since Pacemaker-1.1.9
- crmd: cib: stonithd: Memory leaks resolved and improved use of glib reference counting
- attrd: Fixes deleted attributes during dc election
- Bug cl#5153 - Correctly display clone failcounts in crm_mon
- Bug cl#5133 - pengine: Correctly observe on-fail=block for failed demote operation
- Bug cl#5148 - legacy: Correctly remove a node that used to have a different nodeid
- Bug cl#5151 - Ensure node names are consistently compared without case
- Bug cl#5152 - crmd: Correctly clean up fenced nodes during membership changes
- Bug cl#5154 - Do not expire failures when on-fail=block is present
- Bug cl#5155 - pengine: Block the stop of resources if any depending resource is unmanaged
- Bug cl#5157 - Allow migration in the absence of some colocation constraints
- Bug cl#5161 - crmd: Prevent memory leak in operation cache
- Bug cl#5164 - crmd: Fixes crash when using pacemaker-remote
- Bug cl#5164 - pengine: Fixes segfault when calculating transition with remote-nodes.
- Bug cl#5167 - crm_mon: Only print "stopped" node list for incomplete clone sets
- Bug cl#5168 - Prevent clones from being bounced around the cluster due to location constraints
- Bug cl#5170 - Correctly support on-fail=block for clones
- cib: Correctly read back archived configurations if the primary is corrupted
- cib: The result is not valid when diffs fail to apply cleanly for CLI tools
- cib: Restore the ability to embed comments in the configuration
- cluster: Detect and warn about node names with capitals
- cman: Do not pretend we know the state of nodes we've never seen
- cman: Do not unconditionally start cman if it is already running
- cman: Support non-blocking CPG calls
- Core: Ensure the blackbox is saved on abnormal program termination
- corosync: Detect the loss of members for which we only know the nodeid
- corosync: Do not pretend we know the state of nodes we've never seen
- corosync: Ensure removed peers are erased from all caches
- corosync: Nodes that can persist in sending CPG messages must be alive after all
- crmd: Do not get stuck in S_POLICY_ENGINE if a node we couldn't fence returns
- crmd: Do not update fail-count and last-failure for old failures
- crmd: Ensure all membership operations can complete while trying to cancel a transition
- crmd: Ensure operations for cleaned up resources don't block recovery
- crmd: Ensure we return to a stable state if there have been too many fencing failures
- crmd: Initiate node shutdown if another node claims to have successfully fenced us
- crmd: Prevent messages for remote crmd clients from being relayed to wrong daemons
- crmd: Properly handle recurring monitor operations for remote-node agent
- crmd: Store last-run and last-rc-change for all operations
- crm_mon: Ensure stale pid files are updated when a new process is started
- crm_report: Correctly collect logs when 'uname -n' reports fully qualified names
- fencing: Fail the operation once all peers have been exhausted
- fencing: Restore the ability to manually confirm that fencing completed
- ipc: Allow unprivileged clients to clean up after server failures
- ipc: Restore the ability for members of the haclient group to connect to the cluster
- legacy: Support "crm_node --remove" with a node name for corosync plugin (bnc#805278)
- lrmd: Default to the upstream location for resource agent scratch directory
- lrmd: Pass errors from lsb metadata generation back to the caller
- pengine: Correctly handle resources that recover before we operate on them
- pengine: Delete the old resource state on every node whenever the resource type is changed
- pengine: Detect constraints with inappropriate actions (i.e. promote for a clone)
- pengine: Ensure per-node resource parameters are used during probes
- pengine: If fencing is unavailable or disabled, block further recovery for resources that fail to stop
- pengine: Implement the rest of get_timet_now() and rename to get_effective_time
- pengine: Re-initiate active recurring monitors that previously failed but have timed out
- remote: Workaround for inconsistent tls handshake behavior between gnutls versions
- systemd: Ensure we get shut down correctly by systemd
- systemd: Reload systemd after adding/removing override files for cluster services
- xml: Check for and replace non-printing characters with their octal equivalent while exporting xml text
- xml: Prevent lockups by setting a more reliable buffer allocation strategy
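One entry above mentions replacing non-printing characters with their octal equivalent while exporting XML text. As a rough sketch of the technique, assuming a backslash followed by three octal digits as the escape format and treating only ASCII control characters and DEL as non-printing (both are assumptions, and the function name is hypothetical; this is not Pacemaker's C implementation):

```python
def escape_non_printing(text):
    """Replace ASCII control characters and DEL with a backslash-octal escape."""
    pieces = []
    for ch in text:
        code = ord(ch)
        if code < 0x20 or code == 0x7F:
            # Assumed format: backslash plus a zero-padded, three-digit octal code.
            pieces.append("\\%03o" % code)
        else:
            pieces.append(ch)
    return "".join(pieces)


print(escape_non_printing("name\x01value\x7f"))  # prints name\001value\177
```

Escaping on export keeps the resulting text printable, so a stray control byte in an attribute value does not corrupt the output.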
1.1.10 - Release Candidate 7
Details - 1.1.10-rc7
Changesets | 57 |
Diff | 37 files changed, 414 insertions(+), 331 deletions(-) |
Features added in Pacemaker-1.1.10-rc7
- N/A
Changes since Pacemaker-1.1.10-rc6
- Bug cl#5168 - Prevent clones from being bounced around the cluster due to location constraints
- Bug cl#5170 - Correctly support on-fail=block for clones
- Bug cl#5164 - crmd: Fixes crmd crash when using pacemaker-remote
- cib: The result is not valid when diffs fail to apply cleanly for CLI tools
- cluster: Correctly construct the header for compressed messages
- cluster: Detect and warn about node names with capitals
- Core: Remove the mainloop_trigger objects that are no longer needed
- corosync: Ensure removed peers are erased from all caches
- cpg: Correctly free sent messages
- crmd: Prevent messages for remote crmd clients from being relayed to wrong daemons
- crmd: Properly handle recurring monitor operations for remote-node agent
- crm_mon: Bug cl#5167 - Only print "stopped" node list for incomplete clone sets
- crm_node: Return 0 if --remove passed
- fencing: Correctly detect existing device entries when registering a new one
- lrmd: Prevent use-of-NULL in client library
- pengine: cl5164 - Fixes pengine segfault when calculating transition with remote-nodes.
- pengine: Do the right thing when admins specify the internal resource instead of the clone
- pengine: Re-allow ordering constraints with fencing devices now that it is safe to do so
1.1.10 - Release Candidate 6
Details
Changesets | 63 |
Diff | 24 files changed, 356 insertions(+), 133 deletions(-) |
Highlights
Features added
- tools: crm_mon --neg-location drbd-fence-by-handler
- pengine: cl#5128 - Support maintenance mode for a single node
Other Changes
- cluster: Correctly remove duplicate peer entries
- crmd: Ensure operations for cleaned up resources don't block recovery
- pengine: Bug cl#5157 - Allow migration in the absence of some colocation constraints
- pengine: Delete the old resource state on every node whenever the resource type is changed
- pengine: Detect constraints with inappropriate actions (i.e. promote for a clone)
- pengine: Do the right thing when admins specify the internal resource instead of the clone
1.1.10 - Release Candidate 5
Details
Changesets | 168 |
Diff | 96 files changed, 4983 insertions(+), 3097 deletions(-) |
Features added
- crm_error: Add the ability to list and print error symbols
- crm_resource: Allow individual resources to be reprobed
- crm_resource: Implement --ban for moving resources away from nodes and --clear (replaces --unmove)
- crm_resource: Support OCF tracing when using --force-(check|start|stop)
- PE: Allow active nodes in our current membership to be fenced without quorum
- Turn off auto-respawning of systemd services when the cluster starts them
Other Changes
- Bug pengine: cl#5155 - Block the stop of resources if any depending resource is unmanaged
- Convert all exit codes to positive errno values
- Core: Ensure the blackbox is saved on abnormal program termination
- corosync: Detect the loss of members for which we only know the nodeid
- corosync: Do not pretend we know the state of nodes we've never seen
- corosync: Nodes that can persist in sending CPG messages must be alive after all
- crmd: Do not get stuck in S_POLICY_ENGINE if a node we couldn't fence returns
- crmd: Ensure all membership operations can complete while trying to cancel a transition
- crmd: Everyone who gets a fencing notification should mark the node as down
- crmd: Initiate node shutdown if another node claims to have successfully fenced us
- crmd: Update the status section with details of nodes for which we only know the nodeid
- crm_report: Find logs in compressed files
- logging: If SIGTRAP is sent before tracing is turned on, turn it on
- pengine: If fencing is unavailable or disabled, block further recovery for resources that fail to stop
- remote: Workaround for inconsistent tls handshake behavior between gnutls versions
- systemd: Ensure we get shut down correctly by systemd
1.1.10 - Release Candidate 3
Details
Changesets | 116 |
Diff | 59 files changed, 707 insertions(+), 408 deletions(-) |
Highlights
Features added
- PE: Display a list of nodes on which stopped anonymous clones are not active instead of meaningless clone IDs
- PE: Suppress meaningless IDs when displaying anonymous clone status
Other Changes
- Bug cl#5133 - pengine: Correctly observe on-fail=block for failed demote operation
- Bug cl#5151 - Ensure node names are consistently compared without case
- Check for and replace non-printing characters with their octal equivalent while exporting xml text
- cib: CID#1023858 - Explicit null dereferenced
- cib: CID#1023862 - Improper use of negative value
- cib: CID#739562 - Improper use of negative value
- cman: Our daemons have no need to connect to pacemakerd in a cman based cluster
- crmd: Do not record pending delete operations in the CIB
- crmd: Ensure pending and lost actions have values for last-run and last-rc-change
- crmd: Insert async failures so that they appear in the correct order
- crmd: Store last-run and last-rc-change for fail operations
- Detect child processes that terminate before our SIGCHLD handler is installed
- fencing: CID#739461 - Double close
- fencing: Correctly broadcast manual fencing ACKs
- fencing: Correctly mark manual confirmations as complete
- fencing: Do not send duplicate replies for manual confirmation operations
- fencing: Restore the ability to manually confirm that fencing completed
- lrmd: CID#1023851 - Truncated stdio return value
- lrmd: Don't complain when heartbeat invokes us with -r
- pengine: Correctly handle resources that recover before we operate on them
- pengine: Re-initiate active recurring monitors that previously failed but have timed out
- xml: Restore the ability to embed comments in the cib
1.1.10 - Release Candidate 2
Details - 1.1.10-rc2
Changesets | 31 |
Diff | 30 files changed, 687 insertions(+), 138 deletions(-) |
Highlights
Features added in Pacemaker-1.1.10-rc2
N/A
Changes since Pacemaker-1.1.10-rc1
- Bug cl#5152 - Correctly clean up fenced nodes during membership changes
- Bug cl#5153 - Correctly display clone failcounts in crm_mon
- Bug cl#5154 - Do not expire failures when on-fail=block is present
- cman: Skip cman_pre_stop in the init script if fenced is not running
- Core: Ensure the last field in transition keys is 36 characters
- crm_mon: Check if a process can be daemonized before forking so the parent can report an error
- crm_mon: Ensure stale pid files are updated when a new process is started
- crm_report: Correctly collect logs when 'uname -n' reports fully qualified names
- crm_resource: Allow --cleanup without a resource name
- init: Unless specified otherwise, assume cman is in use if cluster.conf exists
- mcp: inhibit error messages without cman
- pengine: Ensure per-node resource parameters are used during probes
- pengine: Implement the rest of get_timet_now() and rename to get_effective_time
1.1.10 - Release Candidate 1
Details - 1.1.10-rc1
Changesets | 143 |
Diff | 104 files changed, 3327 insertions(+), 1186 deletions(-) |
Highlights
Features added in Pacemaker-1.1.10
- crm_resource: Allow individual resources to be reprobed
- mcp: Alternate Upstart job controlling both pacemaker and corosync
- mcp: Prevent the cluster from trying to use cman even when it is installed
Changes since Pacemaker-1.1.9
- Allow programs in the haclient group to use CRM_CORE_DIR
- cman: Do not unconditionally start cman if it is already running
- core: Ensure custom error codes are less than 256
- crmd: Clean up memory at exit
- crmd: Do not update fail-count and last-failure for old failures
- crmd: Ensure we return to a stable state if there have been too many fencing failures
- crmd: Indicate completion of refresh to callers
- crmd: Indicate completion of re-probe to callers
- crmd: Only perform a dry run for deletions if built with ACL support
- crmd: Prevent use-after-free when the blackbox is enabled
- crmd: Suppress secondary errors when no metadata is found
- doc: Pacemaker Remote deployment and reference guide
- fencing: Avoid memory leak in can_fence_host_with_device()
- fencing: Clean up memory at exit
- fencing: Correctly filter devices when no nodes are configured yet
- fencing: Correctly unpack device parameters before using them
- fencing: Fail the operation once all peers have been exhausted
- fencing: Fix memory leaks during query phase
- fencing: Prevent empty call-id during notification processing
- fencing: Prevent invalid read in parse_host_list()
- fencing: Prevent memory leak when registering devices
- crmd: lrmd: stonithd: fixed memory leaks
- ipc: Allow unprivileged clients to clean up after server failures
- ipc: Restore the ability for members of the haclient group to connect to the cluster
- legacy: cl#5148 - Correctly remove a node that used to have a different nodeid
- legacy: Support "crm_node --remove" with a node name for corosync plugin (bnc#805278)
- logging: Better checks when determining if file based logging will work
- Pass errors from lsb metadata generation back to the caller
- pengine: Do not use functions from the cib library during unpack
- Prevent use-of-NULL when reading CIB_shadow from the environment
- Skip WNOHANG when waiting after sending SIGKILL to child processes
- tools: crm_mon - Print a timing field only if its value is non-zero
- Use custom OCF_ROOT_DIR if requested
- xml: Prevent lockups by setting a more reliable buffer allocation strategy
- xml: Prevent use-after-free in cib_process_xpath()
- xml: Prevent use-after-free when not processing all xpath query results
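The list above includes "Ensure custom error codes are less than 256". A likely motivation, stated here as an assumption rather than taken from the release notes, is that on POSIX systems a parent process only sees the low 8 bits of a child's exit status. The sketch below demonstrates that truncation with a hypothetical helper; it assumes a POSIX platform and is unrelated to Pacemaker's own code.

```python
import subprocess
import sys


def exit_status_seen_by_parent(code):
    """Run a child Python process that exits with `code`; return what the parent observes."""
    child = [sys.executable, "-c", f"import sys; sys.exit({code})"]
    return subprocess.run(child).returncode


# Codes below 256 arrive intact; larger ones wrap modulo 256 on POSIX.
for requested in (201, 256, 300):
    print(requested, "->", exit_status_seen_by_parent(requested))
```

A custom code of 256 would therefore be indistinguishable from success, which is why keeping such codes below 256 matters.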
Details - 1.1.9
Changesets | 731 |
Diff | 1301 files changed, 92909 insertions(+), 57455 deletions(-) |
Highlights
Features added in Pacemaker-1.1.9
- corosync: Allow cman and corosync 2.0 nodes to use a name other than uname()
- corosync: Use queues to avoid blocking when sending CPG messages
- ipc: Compress messages that exceed the configured IPC message limit
- ipc: Use queues to prevent slow clients from blocking the server
- ipc: Use shared memory by default
- lrmd: Support nagios remote monitoring
- lrmd: Pacemaker Remote Daemon for extending pacemaker functionality outside corosync cluster.
- pengine: Check for master/slave resources that are not OCF agents
- pengine: Support a 'requires' resource meta-attribute for controlling whether it needs quorum, fencing or nothing
- pengine: Support for resource container
- pengine: Support resources that require unfencing before start
Changes since Pacemaker-1.1.8
- attrd: Correctly handle deletion of non-existent attributes
- Bug cl#5135 - Improved detection of the active cluster type
- Bug rhbz#913093 - Use crm_node instead of uname
- cib: Avoid use-after-free by correctly supporting cib_no_children for non-xpath queries
- cib: Correctly process XML diffs involving element removal
- cib: Performance improvements for non-DC nodes
- cib: Prevent error message by correctly handling peer replies
- cib: Prevent ordering changes when applying xml diffs
- cib: Remove text nodes from cib replace operations
- cluster: Detect node name collisions in corosync
- cluster: Preserve corosync membership state when matching node name/id entries
- cman: Force fenced to terminate on shutdown
- cman: Ignore qdisk 'nodes'
- core: Drop per-user core directories
- corosync: Avoid errors when closing failed connections
- corosync: Ensure peer state is preserved when matching names to nodeids
- corosync: Clean up CMAP connections after querying node name
- corosync: Correctly detect corosync 2.0 clusters even if we don't have permission to access it
- crmd: Bug cl#5144 - Do not update the expected status of failed nodes
- crmd: Correctly determine if cluster disconnection was abnormal
- crmd: Correctly relay messages for remote clients (bnc#805626, bnc#804704)
- crmd: Correctly stall the FSA when waiting for additional inputs
- crmd: Detect and recover when we are evicted from CPG
- crmd: Differentiate between a node that is up and coming up in peer_update_callback()
- crmd: Have cib operation timeouts scale with node count
- crmd: Improved continue/wait logic in do_dc_join_finalize()
- crmd: Prevent election storms caused by getrusage() values being too close
- crmd: Prevent timeouts when performing pacemaker level membership negotiation
- crmd: Prevent use-after-free of fsa_message_queue during exit
- crmd: Store all current actions when stalling the FSA
- crm_mon: Do not try to render a blank cib and indicate the previous output is now stale
- crm_mon: Fixes crm_mon crash when using snmp traps.
- crm_mon: Look for the correct error codes when applying configuration updates
- crm_report: Ensure policy engine logs are found
- crm_report: Fix node list detection
- crm_resource: Have crm_resource generate a valid transition key when sending resource commands to the crmd
- date/time: Bug cl#5118 - Correctly convert seconds-since-epoch to the current time
- fencing: Attempt to provide more information than just 'generic error' for failed actions
- fencing: Correctly record completed but previously unknown fencing operations
- fencing: Correctly terminate when all device options have been exhausted
- fencing: cov#739453 - String not null terminated
- fencing: Do not merge new fencing requests with stale ones from dead nodes
- fencing: Do not start fencing until entire device topology is found or query results timeout.
- fencing: Do not wait for the query timeout if all replies have arrived
- fencing: Fix passing of parameters from CMAN containing '='
- fencing: Fix non-comparison when sorting devices by priority
- fencing: On failure, only try a topology device once from the remote level.
- fencing: Only try peers for non-topology based operations once
- fencing: Retry stonith device for duration of action's timeout period.
- heartbeat: Remove incorrect assert during cluster connect
- ipc: Bug cl#5110 - Prevent 100% CPU usage when looking for synchronous replies
- ipc: Use 50k as the default compression threshold
- legacy: Prevent assertion failure on routing ais messages (bnc#805626)
- legacy: Re-enable logging from the pacemaker plugin
- legacy: Relax the 'active' check for plugin based clusters to avoid false negatives
- legacy: Skip peer process check if the process list is empty in crm_is_corosync_peer_active()
- mcp: Only define HA_DEBUGLOG to avoid agent calls to ocf_log printing everything twice
- mcp: Re-attach to existing pacemaker components when mcp fails
- pengine: Any location constraint for the slave role applies to all roles
- pengine: Avoid leaking memory when cleaning up failcounts and using containers
- pengine: Bug cl#5101 - Ensure stop order is preserved for partially active groups
- pengine: Bug cl#5140 - Allow set members to be stopped when the subsequent set has require-all=false
- pengine: Bug cl#5143 - Prevent shuffling of anonymous master/slave instances
- pengine: Bug rhbz#880249 - Ensure orphan masters are demoted before being stopped
- pengine: Bug rhbz#880249 - Teach the PE how to recover masters into primitives
- pengine: cl#5025 - Automatically clear failcount for start/monitor failures after resource parameters change
- pengine: cl#5099 - Probe operation uses the timeout value from the minimum interval monitor by default (bnc#776386)
- pengine: cl#5111 - When clone/master child rsc has on-fail=stop, ensure all children stop on failure.
- pengine: cl#5142 - Do not delete orphaned children of an anonymous clone
- pengine: Correctly unpack active anonymous clones
- pengine: Ensure previous migrations are closed out before attempting another one
- pengine: Introducing the whitebox container resources feature
- pengine: Prevent double-free for cloned primitive from template
- pengine: Process rsc_ticket dependencies earlier for correctly allocating resources (bnc#802307)
- pengine: Remove special cases for fencing resources
- pengine: rhbz#902459 - Remove rsc node status for orphan resources
- systemd: Gracefully handle unexpected DBus return types
- Replace the use of the insecure mktemp(3) with mkstemp(3)