Question about the AMD SR5690 / SP5100 chipset


Мир
26-02-2012, 10:52
I have a Supermicro H8QGi+-F motherboard (http://www.supermicro.nl/Aplus/motherboard/Opteron6000/SR56x0/H8QGi_-F.cfm) with four Opteron 6168 CPUs, running Gentoo Linux (kernel 3.2.1).
I have two 1 TB Samsung HD103SJ drives and one 2 TB Samsung HD204UI (firmware flashed to work around the known bug).
The SATA ports are handled by the AMD SR5690 / SP5100 chipset.
The BIOS is set to AHCI.

The 1 TB drives work without any problems. The 2 TB one is giving me a lot of headaches. Right after the server is physically powered on it works, then after a few days it inevitably starts throwing errors:

ata3.00: exception Emask 0x0 SAct 0x8000 SErr 0x0 action 0x6 frozen
ata3.00: failed command: READ FPDMA QUEUED
ata3.00: cmd 60/10:78:90:da:e4/00:00:3d:00:00/40 tag 15 ncq 8192 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3.00: status: { DRDY }
ata3: hard resetting link
ata3: softreset failed (device not ready)
ata3: hard resetting link
ata3: softreset failed (device not ready)
ata3: hard resetting link
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: configured for UDMA/133
ata3.00: device reported invalid CHS sector 0
ata3: EH complete
ata3.00: exception Emask 0x0 SAct 0x1ff0 SErr 0x0 action 0x6 frozen
ata3.00: failed command: WRITE FPDMA QUEUED
ata3.00: cmd 61/00:20:50:84:18/04:00:3e:00:00/40 tag 4 ncq 524288 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3.00: status: { DRDY }
ata3.00: failed command: WRITE FPDMA QUEUED
ata3.00: cmd 61/a8:28:50:90:18/03:00:3e:00:00/40 tag 5 ncq 479232 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3.00: status: { DRDY }
...
...
ata3: hard resetting link
ata3: softreset failed (device not ready)
ata3: hard resetting link
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: configured for UDMA/133
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
sd 2:0:0:0: [sdc] Unhandled error code
sd 2:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 2:0:0:0: [sdc] CDB: Write(10): 2a 00 3e 18 1c e0 00 04 00 00
end_request: I/O error, dev sdc, sector 1041767648
sd 2:0:0:0: [sdc] Unhandled error code
sd 2:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 2:0:0:0: [sdc] CDB: Write(10): 2a 00 3e 18 78 c0 00 03 90 00
end_request: I/O error, dev sdc, sector 1041791168
sd 2:0:0:0: [sdc] Unhandled error code
sd 2:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 2:0:0:0: [sdc] CDB: Write(10): 2a 00 3e 18 7c 50 00 04 00 00
end_request: I/O error, dev sdc, sector 1041792080
sd 2:0:0:0: [sdc] Unhandled error code
sd 2:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 2:0:0:0: [sdc] CDB: Write(10): 2a 00 3e 18 80 50 00 04 00 00
end_request: I/O error, dev sdc, sector 1041793104
sd 2:0:0:0: [sdc] Unhandled error code
sd 2:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 2:0:0:0: [sdc] CDB: Write(10): 2a 00 3e 18 88 50 00 04 00 00
end_request: I/O error, dev sdc, sector 1041795152
sd 2:0:0:0: [sdc] Unhandled error code
sd 2:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 2:0:0:0: [sdc] CDB: Write(10): 2a 00 3e 18 8c 50 00 04 00 00
end_request: I/O error, dev sdc, sector 1041796176
ata3: EH complete
Buffer I/O error on device sdc1, logical block 130220316
lost page write due to I/O error on sdc1
Buffer I/O error on device sdc1, logical block 130220317
lost page write due to I/O error on sdc1
Buffer I/O error on device sdc1, logical block 130220318
lost page write due to I/O error on sdc1
Buffer I/O error on device sdc1, logical block 130220319
lost page write due to I/O error on sdc1
Buffer I/O error on device sdc1, logical block 130220320
lost page write due to I/O error on sdc1
Buffer I/O error on device sdc1, logical block 130220321
lost page write due to I/O error on sdc1
Buffer I/O error on device sdc1, logical block 130220322
lost page write due to I/O error on sdc1
Buffer I/O error on device sdc1, logical block 130220323
lost page write due to I/O error on sdc1
Buffer I/O error on device sdc1, logical block 130220324
lost page write due to I/O error on sdc1
Buffer I/O error on device sdc1, logical block 130220325
lost page write due to I/O error on sdc1
ata3.00: exception Emask 0x0 SAct 0x7e003066 SErr 0x0 action 0x6 frozen
ata3.00: failed command: WRITE FPDMA QUEUED
ata3.00: cmd 61/20:08:b0:79:99/02:00:b4:00:00/40 tag 1 ncq 278528 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3.00: status: { DRDY }
ata3.00: failed command: WRITE FPDMA QUEUED
ata3.00: cmd 61/00:10:c8:70:ef/04:00:7a:00:00/40 tag 2 ncq 524288 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3.00: status: { DRDY }
ata3.00: failed command: WRITE FPDMA QUEUED
ata3.00: cmd 61/00:28:c8:68:ef/04:00:7a:00:00/40 tag 5 ncq 524288 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3.00: status: { DRDY }
ata3.00: status: { DRDY }
ata3.00: failed command: WRITE FPDMA QUEUED
ata3.00: cmd 61/00:f0:b0:75:99/04:00:b4:00:00/40 tag 30 ncq 524288 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3.00: status: { DRDY }
ata3: hard resetting link
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: configured for UDMA/133
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
sd 2:0:0:0: [sdc] Unhandled error code
sd 2:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 2:0:0:0: [sdc] CDB: Write(10): 2a 00 7a ef 64 c8 00 04 00 00
end_request: I/O error, dev sdc, sector 2062509256
ata3: EH complete


The only way to reset everything is to physically power off the server.
The 2 TB drive, which I use only for backups, is practically unusable.
Is anyone aware of problems with AMD 5600-series chipsets and 2 TB drives?
Or can anyone suggest what to try? :D
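
As an aid to reading the kernel log above, here is a minimal Python sketch that decodes three of its fields: the libata `cmd` line of a failed NCQ command, the SCSI Write(10) CDB, and the `SAct` bitmask. The function names are hypothetical. The field layout is an assumption based on how libata's error handler prints a taskfile (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device), with NCQ commands carrying the sector count in the feature fields and the tag in the upper bits of nsect.

```python
def decode_ncq_taskfile(cmd):
    """Decode the 'cmd' string libata prints for an NCQ (FPDMA QUEUED) command.

    Assumed layout: command/feature:nsect:lbal:lbam:lbah/
                    hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
    """
    parts = cmd.split("/")
    command = int(parts[0], 16)
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in parts[1].split(":"))
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in parts[2].split(":")
    )
    # 48-bit LBA: the "hob" (high order byte) fields hold bits 24..47.
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    # For NCQ the sector count travels in the feature fields.
    sectors = (hob_feature << 8) | feature
    # For NCQ the tag sits in bits 7:3 of the sector count register.
    return {"command": command, "lba": lba, "sectors": sectors, "tag": nsect >> 3}


def decode_write10_cdb(cdb):
    """Decode a SCSI Write(10) CDB given as space-separated hex bytes."""
    raw = bytes.fromhex(cdb)
    if len(raw) != 10 or raw[0] != 0x2A:
        raise ValueError("not a Write(10) CDB")
    return {
        "lba": int.from_bytes(raw[2:6], "big"),      # bytes 2..5, big-endian
        "sectors": int.from_bytes(raw[7:9], "big"),  # bytes 7..8, big-endian
    }


def sact_tags(sact):
    """List the NCQ tags whose bits are set in an SAct value."""
    return [bit for bit in range(32) if (sact >> bit) & 1]


if __name__ == "__main__":
    tf = decode_ncq_taskfile("61/00:20:50:84:18/04:00:3e:00:00/40")
    print(tf["tag"], tf["sectors"], tf["sectors"] * 512)  # 4 1024 524288
    print(decode_write10_cdb("2a 00 3e 18 1c e0 00 04 00 00"))
    # {'lba': 1041767648, 'sectors': 1024}
    print(sact_tags(0x1FF0))  # [4, 5, 6, 7, 8, 9, 10, 11, 12]
```

Under these assumptions the decoded values agree with what the kernel prints alongside them: tag 4 and 524288 bytes for the first failed write, and LBA 1041767648 for the first `end_request` error. The decoding only makes the log easier to read; it does not by itself say whether the disk, the cable or the controller is at fault.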

Tasslehoff
27-02-2012, 12:51
Try running some checks with fsck, but at a rough guess I think the most plausible cause is a fault in that disk.

Occam's razor rulez ;)

Мир
27-02-2012, 23:12
Originally posted by Tasslehoff:
Try running some checks with fsck, but at a rough guess I think the most plausible cause is a fault in that disk.

Occam's razor rulez ;)

Thanks, at least one reply! :D
The thing is, I have run xfs_repair over and over. Everything goes back to normal, it works for a few days and then... boom, the error is back... as regular as a Swiss watch...
I can't tell whether it's the disk or something else... the fact that it recovers after the fsck, works fine for a while and then crashes makes me think more of a drive-software-chipset incompatibility...
I'll try playing around with hdparm a bit...
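
Since every failed command in the log is an NCQ one (READ/WRITE FPDMA QUEUED), one low-risk thing to look at before touching hdparm settings is the queue depth the kernel is using for the disk. A minimal Python sketch follows; the helper name is hypothetical, `sdc` is taken from the log, and the `sysfs_root` parameter exists only so the function can be pointed at a test directory. It reads the standard sysfs attribute `/sys/block/<disk>/device/queue_depth`.

```python
from pathlib import Path


def ncq_queue_depth(disk, sysfs_root="/sys"):
    """Return the current command queue depth for a disk such as 'sdc'."""
    path = Path(sysfs_root) / "block" / disk / "device" / "queue_depth"
    return int(path.read_text().strip())


if __name__ == "__main__":
    depth = ncq_queue_depth("sdc")
    # A depth of 1 means only one command is outstanding at a time,
    # i.e. NCQ is effectively not in use.
    print("queue depth:", depth, "(NCQ in use)" if depth > 1 else "(NCQ not in use)")
```

Writing `1` to the same file as root limits the disk to one outstanding command. Whether that helps with this drive and chipset is only a guess; it is a way to test the incompatibility theory, not a confirmed fix.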

Tasslehoff
28-02-2012, 14:24
Be careful with hdparm: it's a double-edged sword, and you risk getting bogged down in a stormy sea of obscure settings that can do more harm than good.

I'd recommend using badblocks to check the drive for physical problems.
When you run it, specify the block size correctly; you can retrieve it with the disktype utility.
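
The badblocks suggestion above could be sketched as follows in Python. The helper name is hypothetical, `/dev/sdc` is taken from the log, and the 4096-byte block size is only a placeholder for whatever value disktype reports. The sketch builds a read-only scan (badblocks reads by default unless `-n` or `-w` is given) and prints the command instead of running it.

```python
import shlex


def badblocks_command(device, block_size):
    """Build a read-only badblocks invocation with an explicit block size."""
    # Block sizes are powers of two (512, 1024, 4096, ...); reject anything else.
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a positive power of two")
    # -s shows progress, -v is verbose, -b sets the block size in bytes.
    return ["badblocks", "-s", "-v", "-b", str(block_size), device]


if __name__ == "__main__":
    cmd = badblocks_command("/dev/sdc", 4096)  # 4096 is an assumed value
    # Print rather than execute: a full scan of a 2 TB disk takes hours
    # and needs root, so run the printed command by hand.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the scan read-only matters here because the disk holds backups; the destructive `-w` mode would overwrite them.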