ATA regs: error 10, sector count 8, LBA low ff, LBA mid ff, LBA high ff, device 4f, status 51

Problem

Under heavy I/O load, the kernel reports the following error messages:

Oct 30 07:23:59 acc kernel: IAL: COMPLETION ERROR, adapter 0, channel 2, flags=104
Oct 30 07:23:59 acc kernel: ATA regs: error 10, sector count 8, LBA low ff, LBA mid ff, LBA high ff, device 4f, status 51
Oct 30 07:24:00 acc kernel: Badness in mvMicroSecondsDelay at /home/paul/sysadmin/highpoint.2/mvlib.c:56
Oct 30 07:24:00 acc kernel: [] mvMicroSecondsDelay+0x5c/0x85 [hptmv]
Oct 30 07:24:00 acc kernel: [] mvSataChannelHardReset+0x1a6/0x2a0 [hptmv]
Oct 30 07:24:00 acc kernel: [] mvSataFlushDmaQueue+0x37/0x50 [hptmv]
Oct 30 07:24:00 acc kernel: [] MvSataResetChannel+0x40/0x11c [hptmv]
Oct 30 07:24:00 acc kernel: [] handleEdmaError+0x6d/0xa1 [hptmv]
Oct 30 07:24:00 acc kernel: [] CheckPendingCall+0x3e/0x60 [hptmv]
Oct 30 07:24:00 acc kernel: [] hptmv_int_handler+0x47/0x8d [hptmv]
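To check whether a server is affected, the system log can be searched for the completion error shown above. The following is a minimal sketch, not part of the original procedure: the function name count_completion_errors is hypothetical, and the default log location /var/log/messages is an assumption that depends on the local syslog configuration.

```shell
#!/bin/sh
# Count log lines that contain the hptmv completion error.
# count_completion_errors is a hypothetical helper name used for illustration.
count_completion_errors() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's status clean.
    grep -c -- 'IAL: COMPLETION ERROR' "$1" || true
}

# Assumed default log location; pass another file as the first argument.
LOG=${1:-/var/log/messages}
if [ -r "$LOG" ]; then
    count_completion_errors "$LOG"
fi
```

A non-zero count on a system with one of the controllers listed below suggests that the driver upgrade is worth applying.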

Affected OS:

Red Hat Enterprise Linux 4 / CentOS 4.x

Affected controllers:

Highpoint RocketRAID 1810-A, 1820-A

Solution

This is a bug in the hptmv driver that is fixed in version 1.13. To upgrade the driver, perform the following steps:

  1. Download the driver source code from:

    http://updates.aslab.com/kernel/source/RAID/highpoint/rr182x-opensource-v1.13-0923[1].tar

  2. Compile the new driver
    # mkdir hpt
    # cd hpt
    # tar xf 'rr182x-opensource-v1.13-0923[1].tar'
    # make
    
  3. Install the new driver
    # cp -f hptmv.ko /lib/modules/2.6.9-11.ELsmp/kernel/drivers/scsi

  4. Create a new initial ramdisk
    # mkinitrd -f /boot/initrd-2.6.9-11.ELsmp.img 2.6.9-11.ELsmp
    
  5. Reboot the server

Note: If necessary, replace 2.6.9-11.ELsmp with the version of the running kernel. The command 'uname -r' prints just the kernel release; 'uname -a' shows it along with other system information.
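The kernel-specific paths in steps 3 and 4 can be derived from the running kernel instead of being typed by hand. The sketch below only prints the two commands so they can be reviewed before being run as root. The helper names module_dir and initrd_path and the KVER variable are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Kernel release to build paths for; defaults to the running kernel.
# KVER is a hypothetical variable name; override it to target another kernel.
KVER=${KVER:-$(uname -r)}

# Directory that holds the SCSI driver modules for a given kernel release.
module_dir() {
    printf '/lib/modules/%s/kernel/drivers/scsi\n' "$1"
}

# Path of the initial ramdisk image for a given kernel release.
initrd_path() {
    printf '/boot/initrd-%s.img\n' "$1"
}

# Dry run: print the commands from steps 3 and 4 without executing them.
echo "cp -f hptmv.ko $(module_dir "$KVER")"
echo "mkinitrd -f $(initrd_path "$KVER") $KVER"
```

Printing rather than executing keeps the sketch safe to run as an unprivileged user; the printed lines can then be run as root from the directory that contains the compiled hptmv.ko.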