HDD health and advice?


psychok9
03-02-2009, 15:59
For some time I've been reading S.M.A.R.T. data that, to my untrained eye, look like an early warning at the very least, and I'd like to share them with you in the hope that you can help me interpret them correctly and maybe advise me on what to do. I have two identical 160 GB Seagate drives, bought together and used very often in RAID, and a 500 GB Samsung used as a "backup".
I should point out that the PC has often been left on 24/7 for weeks or months at a time, almost always with spindown disabled in both Windows and Linux.
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family: SAMSUNG SpinPoint T166 series
Device Model: SAMSUNG HD501LJ
Serial Number: S0MUJ1DPC08094
Firmware Version: CR100-12
User Capacity: 500,107,862,016 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 3b
Local Time is: Tue Feb 3 15:52:37 2009 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (8960) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 153) minutes.
SCT capabilities: (0x003f) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 100 100 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0007 100 100 015 Pre-fail Always - 7360
4 Start_Stop_Count 0x0032 099 099 000 Old_age Always - 1266
5 Reallocated_Sector_Ct 0x0033 253 253 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 253 253 051 Pre-fail Always - 0
8 Seek_Time_Performance 0x0025 253 253 015 Pre-fail Offline - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 6789
10 Spin_Retry_Count 0x0033 253 253 051 Pre-fail Always - 0
11 Calibration_Retry_Count 0x0012 253 100 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 869
13 Read_Soft_Error_Rate 0x000e 100 100 000 Old_age Always - 6945580
187 Reported_Uncorrect 0x0032 048 048 000 Old_age Always - 53
188 Unknown_Attribute 0x0032 253 253 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 070 050 000 Old_age Always - 30
194 Temperature_Celsius 0x0022 148 085 000 Old_age Always - 30
195 Hardware_ECC_Recovered 0x001a 100 100 000 Old_age Always - 6945580
196 Reallocated_Event_Count 0x0032 253 253 000 Old_age Always - 0
197 Current_Pending_Sector 0x0012 253 099 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 253 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x000a 100 100 000 Old_age Always - 0
201 Soft_Read_Error_Rate 0x000a 100 100 000 Old_age Always - 0
202 TA_Increase_Count 0x0032 100 100 000 Old_age Always - 3

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 2172 -

SMART Selective Self-Test Log Data Structure Revision Number (0) should be 1
SMART Selective self-test log data structure revision number 0
Warning: ATA Specification requires selective self-test log data structure revision number = 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.7 and 7200.7 Plus family
Device Model: ST3160827AS
Serial Number: 5MT0KC88
Firmware Version: 3.42
User Capacity: 160,041,885,696 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 6
ATA Standard is: ATA/ATAPI-6 T13 1410D revision 2
Local Time is: Tue Feb 3 15:53:10 2009 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 430) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
No General Purpose Logging support.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 94) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 057 046 006 Pre-fail Always - 180397822
3 Spin_Up_Time 0x0003 097 096 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 099 099 020 Old_age Always - 1870
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 2
7 Seek_Error_Rate 0x000f 090 060 030 Pre-fail Always - 5270040678
9 Power_On_Hours 0x0032 070 070 000 Old_age Always - 26731
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 098 098 020 Old_age Always - 2243
194 Temperature_Celsius 0x0022 028 048 000 Old_age Always - 28 (0 10 0 0)
195 Hardware_ECC_Recovered 0x001a 057 045 000 Old_age Always - 180397822
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 1
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 1
199 UDMA_CRC_Error_Count 0x003e 200 199 000 Old_age Always - 2
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 TA_Increase_Count 0x0032 100 253 000 Old_age Always - 0

SMART Error Log Version: 1
ATA Error Count: 88 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 88 occurred at disk power-on lifetime: 3758 hours (156 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 7a 6d 21 48 Error: UNC at LBA = 0x08216d7a = 136408442

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 80 7a 6d 21 48 00 10:24:41.132 READ DMA
ca 00 56 aa d2 b5 48 00 10:24:41.131 WRITE DMA
c8 00 80 2a 95 b5 48 00 10:24:41.130 READ DMA
ca 00 80 2a d2 b5 48 00 10:24:41.129 WRITE DMA
c8 00 2a 00 95 b5 48 00 10:24:41.127 READ DMA

Error 87 occurred at disk power-on lifetime: 3758 hours (156 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 7a 6d 21 48 Error: UNC at LBA = 0x08216d7a = 136408442

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 80 7a 6d 21 48 00 10:24:36.096 READ DMA
ca 00 80 2a 95 b5 48 00 10:24:36.095 WRITE DMA
c8 00 2a 00 58 b5 48 00 10:24:36.092 READ DMA
ca 00 2a 00 95 b5 48 00 10:24:36.091 WRITE DMA
c8 00 56 aa 57 b5 48 00 10:24:36.090 READ DMA

Error 86 occurred at disk power-on lifetime: 3758 hours (156 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 7a 6d 21 48 Error: UNC at LBA = 0x08216d7a = 136408442

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 80 7a 6d 21 48 00 10:24:30.698 READ DMA
ca 00 2a 00 58 b5 48 00 10:24:30.698 WRITE DMA
c8 00 56 aa 1a b5 48 00 10:24:30.696 READ DMA
ca 00 56 aa 57 b5 48 00 10:24:30.696 WRITE DMA
c8 00 80 2a 1a b5 48 00 10:24:30.695 READ DMA

Error 85 occurred at disk power-on lifetime: 3758 hours (156 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 7a 6d 21 48 Error: UNC at LBA = 0x08216d7a = 136408442

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 80 7a 6d 21 48 00 10:24:25.388 READ DMA
c8 00 56 aa dd b4 48 00 10:24:25.387 READ DMA
ca 00 56 aa 1a b5 48 00 10:24:25.387 WRITE DMA
c8 00 80 2a dd b4 48 00 10:24:25.384 READ DMA
ca 00 80 2a 1a b5 48 00 10:24:25.383 WRITE DMA

Error 84 occurred at disk power-on lifetime: 3758 hours (156 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 7a 6d 21 48 Error: UNC at LBA = 0x08216d7a = 136408442

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 80 7a 6d 21 48 00 10:24:20.597 READ DMA
ca 00 56 aa dd b4 48 00 10:24:20.596 WRITE DMA
c8 00 80 2a a0 b4 48 00 10:24:20.595 READ DMA
ca 00 80 2a dd b4 48 00 10:24:20.594 WRITE DMA
c8 00 2a 00 a0 b4 48 00 10:24:20.593 READ DMA

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 26232 -
# 2 Extended offline Aborted by host 90% 26228 -
# 3 Extended offline Aborted by host 90% 26228 -
# 4 Short offline Completed without error 00% 22494 -
# 5 Short offline Completed without error 00% 20021 -
# 6 Extended offline Completed without error 00% 11957 -
# 7 Short offline Completed without error 00% 11956 -
# 8 Extended offline Completed without error 00% 11952 -
# 9 Short offline Completed without error 00% 11951 -
#10 Extended offline Interrupted (host reset) 90% 11949 -
#11 Short offline Completed without error 00% 11948 -
#12 Short offline Completed without error 00% 11947 -
#13 Extended offline Completed without error 00% 11945 -
#14 Short offline Completed without error 00% 11944 -
#15 Extended offline Interrupted (host reset) 40% 11919 -
#16 Short offline Completed without error 00% 11918 -
#17 Extended offline Interrupted (host reset) 80% 11917 -
#18 Short offline Completed without error 00% 11917 -
#19 Extended offline Interrupted (host reset) 90% 11917 -
#20 Short offline Completed without error 00% 11917 -
#21 Extended offline Interrupted (host reset) 20% 11916 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.7 and 7200.7 Plus family
Device Model: ST3160827AS
Serial Number: 5MT0NE5D
Firmware Version: 3.42
User Capacity: 160,041,885,696 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 6
ATA Standard is: ATA/ATAPI-6 T13 1410D revision 2
Local Time is: Tue Feb 3 15:53:21 2009 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 430) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
No General Purpose Logging support.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 94) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 065 045 006 Pre-fail Always - 6
3 Spin_Up_Time 0x0003 097 096 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 098 098 020 Old_age Always - 2117
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 087 060 030 Pre-fail Always - 4790419724
9 Power_On_Hours 0x0032 071 071 000 Old_age Always - 25469
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 098 098 020 Old_age Always - 2150
194 Temperature_Celsius 0x0022 031 052 000 Old_age Always - 31 (0 14 0 0)
195 Hardware_ECC_Recovered 0x001a 065 045 000 Old_age Always - 6
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 199 000 Old_age Always - 1
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 TA_Increase_Count 0x0032 100 253 000 Old_age Always - 0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 21183 -
# 2 Short offline Completed without error 00% 18736 -
# 3 Short offline Completed without error 00% 10947 -
# 4 Short offline Completed without error 00% 4395 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


What do you think?

Another $100,000 question: since I'm not interested in the negligible energy savings (the PC is also somewhat overclocked), in your opinion does spinning down idle hard disks help extend their life, or shorten it?

psychok9
07-02-2009, 22:33
Bump

Danilo Cecconi
08-02-2009, 13:28
Have you tried checking those values with CrystalDisk?
Or with HDTune.
Then maybe post the screenshots.

CRL
08-02-2009, 13:51
I'd also ask you to run the check with HDTune (Health screen) and post it here; you can upload the image to www.imageshack.us or a similar site and link it here. The parameters are much easier to read that way, thanks.

- CRL -

psychok9
09-02-2009, 04:53
So: after a hellish day with gparted, when I booted Windows and started HDTune I remembered that Windows has a nice limitation: unlike Linux, it cannot access the hard disks and their SMART data directly when they are configured in RAID :muro:
If you like, I can try posting them from a graphical program (still on Linux).
I also discovered something, already yesterday:
When I run an HDTune test on the whole RAID 0 array, something happens that looks anomalous and possibly worrying to me, and it happens every single time I repeat the test: there is a vertical drop, almost down to zero, between 40% and 50% of the array :eek:
I just repeated the test with the same results:
Seagate 2x160 GB - RAID 0
http://img13.imageshack.us/img13/3710/hdtunebenchmarkintelraimk6.png
Samsung 500 GB (normal test)
http://img13.imageshack.us/img13/8537/hdtunebenchmarksamsunghca5.png

What do you think? Do you think a low-level format could bring them back to health?

psychok9
09-02-2009, 06:56
The two grandpas as seen by GSmartControl:
5MT0KC88:http://img264.imageshack.us/img264/1433/5mt0kc88gm6.th.png (http://img264.imageshack.us/my.php?image=5mt0kc88gm6.png)
5MT0NE5D:http://img179.imageshack.us/img179/2540/5mt0ne5dac1.th.png (http://img179.imageshack.us/my.php?image=5mt0ne5dac1.png)

psychok9
11-02-2009, 02:20
Bump.

CRL
11-02-2009, 12:01
2 damaged sectors + 1 pending on one of the two disks; they are probably in the middle of the disk, where you see the performance drop, or maybe not.
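
As a rough check of where the logged error sits, here is a minimal Python sketch that rebuilds the LBA from the register values in the 5MT0KC88 error log (SN=7a, CL=6d, CH=21, DH=48) and compares it with the drive's "User Capacity". It assumes 512-byte sectors and 28-bit LBA addressing (the logged commands are plain READ DMA / WRITE DMA), and the function name is just illustrative. Keep in mind that those errors were logged at 3758 power-on hours, so they are not necessarily the same sectors that are pending today, and that the position on one member disk only roughly corresponds to the same percentage of a RAID 0 array.

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Combine the ATA registers into a 28-bit LBA.

    Only the low 4 bits of the Device/Head register belong to the address.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


SECTOR_SIZE = 512                 # assumption: 512-byte logical sectors
capacity_bytes = 160_041_885_696  # "User Capacity" from the smartctl output
total_sectors = capacity_bytes // SECTOR_SIZE

# Register values from the error log: SN=7a CL=6d CH=21 DH=48
lba = lba28_from_registers(sn=0x7A, cl=0x6D, ch=0x21, dh=0x48)
position = lba / total_sectors

print(f"LBA {lba} (0x{lba:08x}) is {position:.1%} into the disk")
# prints: LBA 136408442 (0x08216d7a) is 43.6% into the disk
```

The decoded value matches the "UNC at LBA = 0x08216d7a = 136408442" line printed by smartctl, and roughly 44% of the disk would be consistent with the drop seen between 40% and 50% in the benchmark, although that is not proof.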

The other disk is OK. There's nothing to worry about, but if you have the time and inclination you could break up the array, low-level format the disks, and start from scratch.

Anyway, you can do it later too; 3 damaged sectors are nothing. I have a Maxtor with, I think, 300 of them that has been stuck at that number for years and works without problems.

Keep an eye on that red Reallocated entry: if the 2 goes up, the disk is finding more.
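
To keep an eye on it without reading the whole table every time, something like the following Python sketch could work. It parses attribute rows in the format printed by smartctl 5.38 in the first post (ten whitespace-separated columns, raw value in the last one) and reports which of the sector-related counters have grown since a saved baseline. The choice of IDs 5, 197 and 198 and the function names are my own assumptions, not something smartctl provides.

```python
WATCHED_IDS = {5, 197, 198}  # reallocated, pending, offline-uncorrectable


def watched_raw_values(attribute_lines):
    """Return {attribute_name: raw_value} for the watched attribute IDs."""
    found = {}
    for line in attribute_lines:
        fields = line.split()
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # skip the header and anything that is not an attribute row
        if int(fields[0]) in WATCHED_IDS:
            found[fields[1]] = int(fields[9])  # column 10 is RAW_VALUE
    return found


def grown(previous, current):
    """Names of watched attributes whose raw value increased."""
    return sorted(name for name, value in current.items()
                  if value > previous.get(name, 0))


# Rows copied from the 5MT0KC88 output in the first post.
sample = [
    "ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE",
    "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 2",
    "197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 1",
    "198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 1",
    "199 UDMA_CRC_Error_Count 0x003e 200 199 000 Old_age Always - 2",
]
baseline = watched_raw_values(sample)
print(baseline)
```

Saving `baseline` once and calling `grown(baseline, new_values)` after a later smartctl run would list only the counters that went up.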

- CRL -

psychok9
11-02-2009, 15:48
After trying "Full erase", the drop seems to be gone :D (I need to re-check in a few days), but I don't know whether that was a low-level format... and I couldn't find any other option in the Seagate utilities. What makes me doubt it is that the pending sector is still pending... shouldn't it "make up its mind"? :/
Thanks a lot for the advice :) I'll keep an eye on it...

psychok9
11-02-2009, 21:51
I noticed something I hadn't noticed before, perhaps because of the fan noise: the hard disks sometimes seem to do "hidden" work, that is, without it showing on the case LED... what causes this behavior? Until a few days ago I had heard a lighter, shorter noise (8-10 seconds), as if the disk were doing a routine check... whereas just now it churned a bit longer, but again without the LED blinking. Are there commands that reach the hard disk without being signaled?
Another thing: are there programs like HDTune for Linux? There's hdparm... but it only gives a very approximate average speed, and only as a number.
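
For what it's worth, a very crude stand-in for the HDTune curve can be sketched in a few lines of Python: read one chunk at evenly spaced offsets and time each read. This is only an illustration under several assumptions: the function name and parameters are made up, reading a raw device normally requires root, and the Linux page cache can inflate the numbers for data that was read recently, so the figures are indicative at best.

```python
import os
import time


def sample_read_speed(path, samples=10, chunk_size=1024 * 1024):
    """Read one chunk at evenly spaced offsets.

    Returns a list of (fraction_of_device, megabytes_per_second) pairs.
    """
    results = []
    with open(path, "rb", buffering=0) as dev:
        size = dev.seek(0, os.SEEK_END)  # total size in bytes
        for i in range(samples):
            # Spread the offsets from the start to the last full chunk.
            offset = max((size - chunk_size) * i // max(samples - 1, 1), 0)
            dev.seek(offset)
            start = time.perf_counter()
            data = dev.read(chunk_size)
            elapsed = time.perf_counter() - start
            speed = len(data) / 1e6 / elapsed if elapsed > 0 else float("inf")
            results.append((offset / size if size else 0.0, speed))
    return results
```

Printing the pairs returned for the array's device node would give a coarse "speed versus position" table, where a collapse around 40-50% should show up as one or two very low values.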