GRUB Boot Error

Update 20150419: This OCZ SSD Drive is now entirely broken.

My desktop computer (it is still an ASUS Barebone V3-M3N8200) sometimes gives me the following error when I turn it on:

error: attempt to read or write outside of disk 'hd0'.
Entering rescue mode...
grub rescue> _


My observations:

  • This has happened now and then for a while
  • It seems to happen more often when the computer has been off for a longer period of time (sounds unlikely, I know)
  • Ctrl-Alt-Del: It always boots properly the second time

I have three SATA drives. The BIOS boots the first hard drive, where GRUB is installed in the MBR. On that drive, / is the first and only partition, and /boot lives on the / partition.

The drive in question is (from dmesg):

[    1.339215] ata3.00: ATA-8: OCZ-VERTEX PLUS R2, 1.2, max UDMA/133
[    1.339217] ata3.00: 120817072 sectors, multi 1: LBA48 NCQ (depth 31/32)
[    1.339323] ata3.00: configured for UDMA/133
[    1.339466] scsi 2:0:0:0: Direct-Access     ATA      OCZ-VERTEX PLUS  1.2  PQ: 0 ANSI: 5
[    1.339623] sd 2:0:0:0: Attached scsi generic sg1 type 0
[    1.339715] sd 2:0:0:0: [sda] 120817072 512-byte logical blocks: (61.8 GB/57.6 GiB)

That is, a 60 GB SSD from OCZ (yes, I had another OCZ SSD that died).
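
As a sanity check on the dmesg numbers, the capacity can be recomputed from the sector count. This is only a small sketch: the sector count and the 512-byte sector size come from the log above, while the helper name `capacity` is made up for illustration.

```python
# Recompute the drive capacity from the dmesg sector count.
SECTORS = 120817072      # from "120817072 512-byte logical blocks"
SECTOR_BYTES = 512       # logical block size reported by the kernel


def capacity(sectors, sector_bytes=SECTOR_BYTES):
    """Return (bytes, decimal gigabytes, binary gibibytes)."""
    total = sectors * sector_bytes
    return total, total / 10**9, total / 2**30


total_bytes, gb, gib = capacity(SECTORS)
print(total_bytes)                          # 61858340864
print(f"{gb:.2f} GB / {gib:.2f} GiB")       # 61.86 GB / 57.61 GiB
```

The result agrees with the 61.8 GB/57.6 GiB in the kernel line, which appears to cut off the extra digits rather than round them.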

I cannot explain my occasional boot errors, but I have some theories:

  • The SSD drive is broken/corrupted (but no signs within Ubuntu of anything like it)
  • The drive is somehow not fully initialized when GRUB executes (?)
  • Somehow, more than one hard drive is involved in the boot process, and they are not all initialized at the same time (but this does not seem to be the case)
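
To make the theories above a bit more concrete: as far as I understand it, the "outside of disk" error means that GRUB asked for sectors beyond the disk size reported to it at boot time. The sketch below is a simplified model of that kind of bounds check, not GRUB's actual code. The function name `in_range`, the start sector and the "wrong" disk sizes are all hypothetical values chosen for illustration.

```python
# Simplified model of a "read outside of disk" bounds check.
# This is NOT GRUB's code; names and example values are hypothetical.
SECTOR_SIZE = 512


def in_range(start_sector, byte_count, total_sectors):
    """True if the read fits inside a disk of total_sectors sectors."""
    sectors_needed = (byte_count + SECTOR_SIZE - 1) // SECTOR_SIZE
    return (start_sector < total_sectors
            and sectors_needed <= total_sectors - start_sector)


FULL_SIZE = 120817072        # sector count from dmesg
START = 100_000_000          # hypothetical location of a file GRUB needs

print(in_range(START, 4096, FULL_SIZE))   # True: the read fits
print(in_range(START, 4096, 2**25))       # False: disk reported too small
print(in_range(START, 4096, 0))           # False: disk reported as empty
```

If the drive (or the BIOS) occasionally reports a wrong size on a cold start, a check like this would fail even though nothing on the disk has changed, which would fit with the second boot always working.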

GSmartControl gives me some suspicious output about my drive… but I do not know how to interpret it:

Error in ATA Error Log structure: checksum error
Error in Self-Test Log structure: checksum error

The (Short) self test completes without errors.

Any ideas or suggestions are welcome! I will update this post if I learn anything new.