CentOS 6/7: network outages caused by the NIC going to sleep



Since upgrading a CentOS 6.4 machine (SuperMicro X8SIE-F/X9SCL board, Intel 82574L NIC) to kernel 2.6.32-431.el6.x86_64 and rebooting, I have been seeing consistent NIC failures: the NIC shuts down and stays down until a soft reboot is performed.
In short, after the upgrade from CentOS 6.4 to CentOS 6.5 the network drops frequently.

Checking the system messages log showed the following kernel errors:

# cat /var/log/messages | grep eth1
kernel: NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
kernel: e1000e 0000:02:00.0: eth1: Reset adapter unexpectedly
kernel: e1000e 0000:02:00.0: eth1: Timesync Tx Control register not set as expected
kernel: e1000e 0000:02:00.0: eth1: Timesync Tx Control register not set as expected
kernel: e1000e 0000:02:00.0: eth1: Timesync Tx Control register not set as expected
kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
# uname -a
Linux svr.lifelinux.com 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
# ethtool -i eth0
driver: e1000e
version: 2.3.2-k
firmware-version: 0.13-4
bus-info: 0000:00:19.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
# dmidecode --type baseboard
SMBIOS 2.7 present.

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
        Manufacturer: Supermicro
        Product Name: X9SCL/X9SCM
        Version: 1.11A
        Serial Number: ZM2BS41603
        Asset Tag: To be filled by O.E.M.
        Features:
                Board is a hosting board
                Board is replaceable
        Location In Chassis: To be filled by O.E.M.
        Chassis Handle: 0x0003
        Type: Motherboard
        Contained Object Handles: 0
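
To check whether another host shows the same symptom, the log search above can be wrapped in a small helper. This is only a sketch: the function name count_e1000e_faults is made up for illustration, and the patterns are taken from the two log messages quoted above (the watchdog timeout and the unexpected adapter reset).

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): count e1000e transmit-queue
# timeouts and unexpected adapter resets in a syslog-style file.
# $1 = path to the log file, e.g. /var/log/messages
count_e1000e_faults() {
    grep -c -E 'NETDEV WATCHDOG: .*\(e1000e\): transmit queue [0-9]+ timed out|e1000e .*Reset adapter unexpectedly' "$1"
}

# Example (run as root on the affected host):
# count_e1000e_faults /var/log/messages
```

A non-zero count suggests the machine is hitting the same failure described here.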
I found the following discussions online, which describe exactly the same situation.


References:
https://bugzilla.redhat.com/show_bug.cgi?id=625776
https://lkml.org/lkml/2012/3/17/48
http://lists.centos.org/pipermail/centos/2011-September/118027.html
http://sourceforge.net/p/e1000/bugs/358/

Solution


1. Log in over SSH as root

2. Add pcie_aspm=off to grub.conf (this kernel parameter disables PCIe Active State Power Management, ASPM)

# cd /boot/grub
# vi grub.conf

Find the kernel line of the relevant boot entry, as shown below, and append the "pcie_aspm=off" parameter to the end of that line


default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.32-431.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-431.el6.x86_64 ro root=/dev/mapper/vg_host10588-lv_root nomodeset rd_NO_LUKS rd_MD_UUID=083ae648:88342690:2c5c8edb:4af053ee rd_LVM_LV=vg_host10588/lv_root SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_MD_UUID=8263fd2d:c008be86:08ead055:2b929ecf rd_LVM_LV=vg_host10588/lv_swap  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM LANG=en_US.UTF-8 rhgb quiet pcie_aspm=off
        initrd /initramfs-2.6.32-431.el6.x86_64.img
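
The manual edit above can also be scripted. The following is a minimal sketch, assuming a GRUB legacy grub.conf like the one shown; the function name add_kernel_arg is invented for illustration. It appends the parameter to every kernel line that does not already have it, and keeps a copy of the file as it was before the run with a .bak suffix.

```shell
#!/bin/sh
# Hypothetical helper: append a kernel parameter to each "kernel" line of a
# GRUB legacy config file, skipping lines that already contain it.
# $1 = config file, $2 = parameter (for example pcie_aspm=off)
add_kernel_arg() {
    conf="$1"
    arg="$2"
    # Copy the current file to <file>.bak before rewriting it.
    cp -p "$conf" "$conf.bak" || return 1
    awk -v arg="$arg" '
        $1 == "kernel" {
            present = 0
            for (i = 2; i <= NF; i++) if ($i == arg) present = 1
            # Strip trailing blanks, then append the parameter once.
            if (!present) { sub(/[ \t]+$/, ""); $0 = $0 " " arg }
        }
        { print }
    ' "$conf.bak" > "$conf"
}

# Example (run as root):
# add_kernel_arg /boot/grub/grub.conf pcie_aspm=off
```

On systems that ship grubby, including CentOS 7 with GRUB 2, running grubby --update-kernel=ALL --args="pcie_aspm=off" should achieve the same result without editing the file by hand.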
        


3. Save the file and reboot

# shutdown -r now
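
After the reboot it is worth confirming that the kernel actually booted with the new parameter. A small sketch, assuming the standard /proc/cmdline interface; the function name has_kernel_arg is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given parameter appears as a whole word
# on the kernel command line.
# $1 = parameter, $2 = cmdline file (optional, defaults to /proc/cmdline)
has_kernel_arg() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example:
# has_kernel_arg pcie_aspm=off && echo "pcie_aspm=off is active"
```

If the check fails, the parameter was most likely added to a boot entry other than the one that was started.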