- Changes that address cluster instability.
Fixes rolled up from previous hotfix releases
- Improved reliability of JEA endpoint creation.
- Fixed bug to unblock concurrent VM creation in batch sizes of 20 or above.
- Improved the reliability and stability of the portal, adding a monitoring capability to restart the hosting service if it experiences any downtime.
- Addressed an issue where some alerts were not paused during update.
- Improved diagnostics around failures in DSC resources.
- Improved error message generated by an unexpected failure in bare metal deployment script.
- Added resiliency during physical node repair operations.
- Fixed a code defect that sometimes caused HRP SF app to become unhealthy. Also fixed a code defect that prevented alerts from being suspended during update.
- Added resiliency to image creation code when the destination path is unexpectedly not present.
- Added disk cleanup interface for ERCS VMs and ensured that it runs prior to attempting to install new content to those VMs.
- Improved quorum check for Service Fabric node repair in the auto-remediation path.
- Improved logic around bringing cluster nodes back online in rare cases where outside intervention puts them into an unexpected state.
- Improved resiliency of engine code to ensure typos in machine name casing do not cause unexpected state in the ECE configuration when manual actions are used to add and remove nodes.
- Added a health check to detect VM or physical node repair operations that were left in a partially completed state from previous support sessions.
- Improved diagnostic logging for installation of content from NuGet packages during update orchestration.
- Fixed the internal secret rotation failure for customers who use AAD as identity system, and block ERCS outbound internet connectivity.
- Increased the default timeout of Test-AzureStack for AzsScenarios to 45 minutes.
- Improved HealthAgent update reliability.
- Fixed an issue where VM repair of ERCS VMs was not being triggered during remediation actions.
- Made host update resilient to issues caused by a silent failure to clean up stale infrastructure VM files.
- Added a preventative fix for certutil parsing errors when using randomly generated passwords.
- Added a round of health checks prior to the engine update, so that failed admin operations can be allowed to continue running with their original version of orchestration code.
- Fixed ACS backup failure when the ACSSettingsService backup finished first.
- Upgraded Azure Stack AD FS farm behavior level to v4. Azure Stack Hubs deployed with 1908 or later are already on v4.
- Improved reliability of the host update process.
- Fixed a certificate renewal issue that could have caused internal secret rotation to fail.
- Fixed the new time server sync alert to correct an issue where it incorrectly detects a time sync issue when the time source was specified with the 0x8 flag.
- Corrected a validation constraint error that occurred when using the new automatic log collection interface, and it detected
https://login.windows.net/ as an invalid Azure AD endpoint. - Fixed an issue that prevented the use of SQL auto backup via the SQLIaaSExtension.
- Corrected the alerting used in Test-AzureStack when validating the network controller certificates.
- Upgraded Azure Stack AD FS farm behavior level to v4. Azure Stack Hubs deployed with 1908 or later are already on v4.
- Improved reliability of the host update process.
- Fixed a certificate renewal issue that could have caused internal secret rotation to fail.
- Reduced alert triggers in order to avoid unnecessary proactive log collections.
- Improved reliability of storage upgrade by eliminating Windows Health Service WMI call timeout.