And what about non-corosync configurations?
Without the module, the server reboots. All watchdogs are blacklisted by default in Ubuntu and can be enabled if needed (for example, in a case where corosync wants to rely on the HW watchdog for making sure that [...]
intel_idle+0xe7/0x160 [ 5493.734432] <
iLO Event Log
[ 5492.505988] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0-123.9.2.el7.x86_64 #1
[ 5492.605615] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/02/2014
[ 5492.692636] ffffffffa03ae2d8 17844fa82b224426 ffff880fffa06de0
If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.
I tried to update Intelligent Provisioning, but there is no new update for the DL360 G8 server (2012 R2). I have also updated Windows with the latest updates from Microsoft. Finally, I ran a full anti-virus scan (Trend Micro Worry Free) and Malwarebytes, and nothing was detected.
Systems are crashing with the following panic message:
[ 5492.146364] Kernel panic - not syncing: An NMI occurred.
In addition, I think there is a second problem here. So it is recommended to use the following kernel cmdline option on all HP ProLiant Gen8 or newer servers: "intremap=no_x2apic_optout".
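For anyone who wants to verify that a host actually booted with the recommended option, here is a minimal sketch in Python. It assumes the running kernel's command line is readable from /proc/cmdline and that intremap= may carry several comma-separated values; the function names are made up for illustration and are not part of any HP or Ubuntu tool.

```python
def has_intremap_flag(cmdline: str, flag: str = "no_x2apic_optout") -> bool:
    """Return True if an `intremap=` token on the command line contains `flag`.

    Assumption: values of intremap= may be comma-separated
    (e.g. "intremap=nosid,no_x2apic_optout").
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "intremap" and sep and flag in value.split(","):
            return True
    return False


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line (Linux only)."""
    with open(path, encoding="ascii", errors="replace") as fh:
        return fh.read().strip()
```

On a Linux host, has_intremap_flag(read_cmdline()) should report whether the option took effect after the bootloader change and a reboot.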
This seems to be a kernel/driver/firmware/platform issue that prevented the watchdog NMI from being reported in customer-friendly terms. Thanks a lot for investigating. We have a Ceph cluster with 3 hosts and 3 monitors up and running in this lab, and everything seems to be quite good. HP is trying to figure out what is generating the NMIs with intel_idle, but it might be the case to recommend that all HP servers deactivate the intel_idle module (in a near [...]
Solution Verified - Updated 2016-08-29T04:26:10+00:00
Thank you!
Rafael David Tinoco (inaddy) wrote on 2015-03-18: #6
Sorry, there is a misunderstanding regarding the case and this bug.
https://bugs.launchpad.net/bugs/1432837
Rafael David Tinoco (inaddy) wrote on 2015-04-07: #12
Checked /lib/modprobe.d/blacklist_linux_* on Precise, Trusty, Utopic and Vivid, and all of them have hpwdt blacklisted.
Please test the kernel and update this bug with the results.
[...] (They replaced the PSP with another acronym for G7s and above) supports Emulex cards. That's Emulex rather than the HP-branded variety.
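The blacklist check described in comment #12 can be repeated on any host with a few lines of Python. This is only a sketch: it assumes the usual modprobe.d syntax ("blacklist <module>" on its own line, "#" starting a comment line), and the function name is hypothetical.

```python
def blacklisted_modules(conf_text: str) -> set:
    """Collect module names from `blacklist <module>` lines of modprobe.d-style text."""
    modules = set()
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "blacklist":
            modules.add(parts[1])
    return modules


# Illustrative input only, not copied from a real Ubuntu file.
sample = """\
# watchdog drivers
blacklist hpwdt
blacklist iTCO_wdt

alias foo bar
"""
print("hpwdt" in blacklisted_modules(sample))
```

To check a real system, read each file matching /lib/modprobe.d/blacklist_linux_* and pass its contents to the function.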
proxmox-ve: 4.0-16 (running kernel: 4.2.2-1-pve)
pve-manager: 4.0-50 (running version: 4.0-50/d3a6b7e5)
pve-kernel-4.2.2-1-pve: 4.2.2-16
lvm2: 2.02.116-pve1
corosync-pve: 2.3.5-1
libqb0: 0.17.2-1
pve-cluster: 4.0-23
qemu-server: 4.0-31
pve-firmware: 1.1-7
libpve-common-perl: 4.0-32
libpve-access-control: 4.0-9
libpve-storage-perl: 4.0-27
pve-libspice-server1: [...]
The latest version is 2014.02.0(B). This occurs only on the HP server. This is exactly what I got.
We have backported the fix to Ubuntu-3.13.0-35.61.
iLO: "76 | Critical | System Error | 03/12/2015 12:42 | 03/12/2015 12:07 | 2 | An Unrecoverable System Error (NMI) has occurred (System error code 0x0000002B, 0x00000000)"
Examples:
PID: 0 TASK: ffffffff81c1a480 CPU: 0 COMMAND: "swapper/0"
#0 [ffff88085fc05c88] machine_kexec at [...]
A few months ago they phoned me because one of their LOB programs was reporting some errors, and several times a day when they try to save or open Word documents [...]
We are getting feedback from the community that these options are enough to keep the ProLiant server family from having kernel panics, and they might be released as a "public recommendation".
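When comparing iLO/IML entries across several servers, it can help to pull the two codes out of the NMI message automatically. Below is a rough Python sketch; the regular expression is only an assumption based on the message wording quoted in this thread ("System error code" and "Service Information" variants), not an official iLO log format, and the function name is made up.

```python
import re

# Assumed message shape, taken from the entries quoted in this thread.
_NMI_RE = re.compile(
    r"An Unrecoverable System Error \(NMI\) has occurred\s*"
    r"\((?:System error code|Service Information:?)\s*"
    r"(0x[0-9A-Fa-f]{8}),\s*(0x[0-9A-Fa-f]{8})\)",
    re.IGNORECASE,
)


def parse_nmi_codes(message: str):
    """Return the two codes as integers, or None if the message does not match."""
    match = _NMI_RE.search(message)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


entry = ("An Unrecoverable System Error (NMI) has occurred "
         "(System error code 0x0000002B, 0x00000000)")
print(parse_nmi_codes(entry))
```

The same function also accepts the "Service Information: 0x..., 0x..." spelling, so entries from different firmware versions can be grouped by code.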
It seems a buggy iLO driver can cause NMI ASRs under some conditions. Removing the watchdog is not a proper solution. I then created a scheduled task to reboot the server every day at 06:30 AM in the hope that it would solve the problem.
The issue occurs most often when we use live migration.
BRs, Spyros (kaito.7)
06-02-2014, 05:00 AM #2 Ser Olmy
If you blacklist the watchdog module, the server does not panic but resets immediately.
You'll need to look at any system events and error codes prior to the ASR to determine the reason.
06-02-2014, 06:33 AM
The IML log is on the System Status page of the iLO web interface.
[...] but it's a bit different, you are right.
#14 pipomambo, Nov 11, 2015
adamb: pipomambo said: It has 10 GB of RAM and 2x 250 GB SATA HDDs in a RAID 1 configuration. This issue is not a Proxmox VE one.
In our case the problems appear only on the server where we have Microsoft Virtual Server 2005 and Hyper-V.
Regards, Andres
cevers
Doesn't sound quite like the same issue. I was wrong. My next step was to run all diagnostics in the Insight Diagnostics Online Edition. If the problem is solved, change the tag 'verification-needed-precise' to 'verification-done-precise'.
We have a cluster on Proxmox V4.0-48 with two Dell R900 and one HP DL380 G9.