ALT Linux hardware support
* [Hardware] softraid resync problems
From: Eugene Prokopiev @ 2008-11-18 12:54 UTC
  To: Hardware

Hello!

The system is branch/4.1.

This is what I'm seeing:

Nov 18 16:36:53 my-desktop kernel: md: md0: raid array is not clean -- starting background reconstruction
Nov 18 16:36:53 my-desktop kernel: raid1: raid set md0 active with 2 out of 2 mirrors
Nov 18 16:36:53 my-desktop kernel: md: ... autorun DONE.
...
Nov 18 16:36:53 my-desktop klogd: klogd startup succeeded
Nov 18 16:36:53 my-desktop kernel: md: resync of RAID array md0
Nov 18 16:36:53 my-desktop kernel: md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
Nov 18 16:36:53 my-desktop kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
Nov 18 16:36:53 my-desktop kernel: md: using 128k window, over a total of 5116608 blocks.
Nov 18 16:36:53 my-desktop kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Nov 18 16:36:53 my-desktop kernel: ata1.00: cmd ca/00:08:87:38:23/00:00:00:00:00/e0 tag 0 dma 4096 out
Nov 18 16:36:53 my-desktop kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov 18 16:36:53 my-desktop kernel: ata1.00: status: { DRDY }
Nov 18 16:36:53 my-desktop kernel: ata1: soft resetting link
Nov 18 16:36:53 my-desktop kernel: ata1.00: configured for UDMA/100
Nov 18 16:36:53 my-desktop kernel: ata1: EH complete
Nov 18 16:36:53 my-desktop kernel: sd 0:0:0:0: [sda] 234439535 512-byte hardware sectors (120033 MB)
Nov 18 16:36:53 my-desktop kernel: sd 0:0:0:0: [sda] Write Protect is off
Nov 18 16:36:53 my-desktop kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Is it time to throw out the sda device?
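
As a reading aid for the log above, the "cmd" field of the ata1.00 error can be decoded by hand. The sketch below assumes the field order command/feature:nsect:lbal:lbam:lbah/(five HOB bytes)/device and 28-bit LBA addressing; the function name and the small opcode table are illustrative only, not part of any tool mentioned in this thread.

```python
# Sketch: decode the "cmd" taskfile of a libata error line (LBA28 commands only).
# Assumed field order: command/feature:nsect:lbal:lbam:lbah/<5 HOB bytes>/device

# Tiny excerpt of the ATA command set; extend as needed.
ATA_COMMANDS = {0xC8: "READ DMA", 0xCA: "WRITE DMA"}


def decode_taskfile(taskfile):
    command, low, _hob, device = taskfile.split("/")
    _feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in low.split(":"))
    opcode = int(command, 16)
    # In LBA28 addressing, bits 27..24 of the LBA live in the low nibble of "device".
    lba = ((int(device, 16) & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
    sectors = nsect or 256  # a sector count of 0 means 256 in LBA28
    return {
        "command": ATA_COMMANDS.get(opcode, hex(opcode)),
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * 512,  # 512-byte sectors, as reported for sda in the log
    }


print(decode_taskfile("ca/00:08:87:38:23/00:00:00:00:00/e0"))
```

Under these assumptions the timed-out command is a WRITE DMA of 8 sectors (4096 bytes, matching "dma 4096 out") starting at LBA 2308231.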

-- 
Best regards,
Eugene Prokopiev


* Re: [Hardware] softraid resync problems
From: Michael Shigorin @ 2008-11-19 13:07 UTC
  To: Hardware

On Tue, Nov 18, 2008 at 03:54:08PM +0300, Eugene Prokopiev wrote:
> Nov 18 16:36:53 my-desktop kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Nov 18 16:36:53 my-desktop kernel: ata1.00: cmd ca/00:08:87:38:23/00:00:00:00:00/e0 tag 0 dma 4096 out
> Nov 18 16:36:53 my-desktop kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> Nov 18 16:36:53 my-desktop kernel: ata1.00: status: { DRDY }
> Nov 18 16:36:53 my-desktop kernel: ata1: soft resetting link
> Nov 18 16:36:53 my-desktop kernel: ata1.00: configured for UDMA/100
> Nov 18 16:36:53 my-desktop kernel: ata1: EH complete
> 
> Is it time to throw out the sda device?

Looks like it; just in case, also take a look at smartctl -d ata -a /dev/sda.
The disk is about three years old, I'd guess?

-- 
 ---- WBR, Michael Shigorin <mike@altlinux.ru>
  ------ Linux.Kiev http://www.linux.kiev.ua/



* Re: [Hardware] softraid resync problems
From: Eugene Prokopiev @ 2008-11-23 18:02 UTC
  To: hardware, shigorin, Hardware

On 19.11.08, Michael Shigorin wrote:

> On Tue, Nov 18, 2008 at 03:54:08PM +0300, Eugene Prokopiev wrote:
>  > Nov 18 16:36:53 my-desktop kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>  > Nov 18 16:36:53 my-desktop kernel: ata1.00: cmd ca/00:08:87:38:23/00:00:00:00:00/e0 tag 0 dma 4096 out
>  > Nov 18 16:36:53 my-desktop kernel:          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>  > Nov 18 16:36:53 my-desktop kernel: ata1.00: status: { DRDY }
>  > Nov 18 16:36:53 my-desktop kernel: ata1: soft resetting link
>  > Nov 18 16:36:53 my-desktop kernel: ata1.00: configured for UDMA/100
>  > Nov 18 16:36:53 my-desktop kernel: ata1: EH complete
>  >
>  > Is it time to throw out the sda device?
>
>  Looks like it; just in case, also take a look at smartctl -d ata -a /dev/sda

# smartctl -d ata -a /dev/sda -s on
smartctl version 5.38 [i586-alt-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.9 family
Device Model:     ST3120814A
Serial Number:    5LS00NJ3
Firmware Version: 2AAA
User Capacity:    120,033,041,920 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sun Nov 23 20:48:29 2008 MSK
SMART support is: Available - device has SMART capability.
SMART support is: Disabled

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 430) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  51) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   068   042   006    Pre-fail  Always       -       221193856
  3 Spin_Up_Time            0x0003   100   098   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       166
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
  7 Seek_Error_Rate         0x000f   090   060   030    Pre-fail  Always       -       952819715
  9 Power_On_Hours          0x0032   077   077   000    Old_age   Always       -       20608
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       292
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   090   090   000    Old_age   Always       -       10
190 Airflow_Temperature_Cel 0x0022   054   035   045    Old_age   Always   In_the_past 46 (255 255 47 46)
194 Temperature_Celsius     0x0022   046   065   000    Old_age   Always       -       46 (0 20 0 0)
195 Hardware_ECC_Recovered  0x001a   057   049   000    Old_age   Always       -       16983244
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     19830         -
# 2  Short offline       Completed without error       00%     19806         -
# 3  Short offline       Completed without error       00%     19782         -
# 4  Short offline       Completed without error       00%     19759         -
# 5  Extended offline    Completed without error       00%     19738         -
# 6  Short offline       Completed without error       00%     19735         -
# 7  Short offline       Completed without error       00%     19711         -
# 8  Short offline       Completed without error       00%     19688         -
# 9  Short offline       Completed without error       00%     19664         -
#10  Short offline       Completed without error       00%     19641         -
#11  Short offline       Completed without error       00%     19617         -
#12  Short offline       Completed without error       00%     19593         -
#13  Extended offline    Completed without error       00%     19573         -
#14  Short offline       Completed without error       00%     19570         -
#15  Short offline       Completed without error       00%     19546         -
#16  Short offline       Completed without error       00%     19522         -
#17  Short offline       Completed without error       00%     19499         -
#18  Short offline       Completed without error       00%     19476         -
#19  Short offline       Completed without error       00%     19452         -
#20  Extended offline    Completed without error       00%     19408         -
#21  Short offline       Completed without error       00%     19405         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
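
The overall result above is PASSED, yet smartctl also points at "marginal Attributes". A minimal sketch of picking those out of the attribute table follows; it assumes each row sits on a single line, and the regular expression and function names are illustrative rather than anything smartmontools provides. The sample rows are copied from the output above.

```python
import re

# A few rows from the smartctl output above, one attribute per line.
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x000f   068   042   006    Pre-fail  Always       -       221193856
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
190 Airflow_Temperature_Cel 0x0022   054   035   045    Old_age   Always   In_the_past 46 (255 255 47 46)
194 Temperature_Celsius     0x0022   046   065   000    Old_age   Always       -       46 (0 20 0 0)
"""

ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]{4}\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<type>\S+)\s+(?P<updated>\S+)\s+(?P<when_failed>\S+)\s+(?P<raw>.+)$"
)


def parse_attributes(text):
    """Return one dict per attribute row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            row = match.groupdict()
            for key in ("id", "value", "worst", "thresh"):
                row[key] = int(row[key])
            rows.append(row)
    return rows


def marginal(rows):
    """Names of attributes whose WHEN_FAILED column is not '-'."""
    return [row["name"] for row in rows if row["when_failed"] != "-"]


print(marginal(parse_attributes(SAMPLE)))
```

On the sample rows this singles out Airflow_Temperature_Cel, the only attribute marked In_the_past.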

>  The disk is about three years old, I'd guess?

About that, if not more :)
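
For a rough cross-check of the age, the Power_On_Hours raw value of 20608 from the smartctl output can be converted to years. This is only a sketch: it assumes the raw value really is in hours and it measures powered-on time, not calendar age.

```python
HOURS_PER_YEAR = 24 * 365  # ignores leap days


def powered_on_years(power_on_hours):
    """Convert a Power_On_Hours raw value to years of powered-on time."""
    return power_on_hours / HOURS_PER_YEAR


print(f"{powered_on_years(20608):.2f} years powered on")  # 2.35 years powered on
```

A drive that was not running around the clock would be older in calendar terms than this figure suggests.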

Then again, the computer was shaken out/cleaned/greased and it stopped
acting up ...

-- 
Best regards,
Eugene Prokopiev


* Re: [Hardware] softraid resync problems
From: Michael Shigorin @ 2008-11-27 21:40 UTC
  To: Eugene Prokopiev; +Cc: hardware

On Sun, Nov 23, 2008 at 09:02:12PM +0300, Eugene Prokopiev wrote:
> >  > Nov 18 16:36:53 my-desktop kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> > > Is it time to throw out the sda device?
> >  Looks like it; just in case, also take a look at smartctl -d ata -a /dev/sda
> UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     221193856

It's usually lower here. On three-year-old Hitachi VLA drives:
1703961
131075
196610
262146

On one-year-old Seagate ES drives (it seems they measure in different units):
134919036
57858588

>   5 Reallocated_Sector_Ct   1

Minimal, but nonzero.

>   7 Seek_Error_Rate         0x000f   090   060   030    Pre-fail  Always       -       952819715

Seagate:
315881520
325946994
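
On the "different units" remark: a widely circulated but unofficial interpretation is that Seagate packs two counters into the raw value of attributes 1 and 7, with the low 32 bits counting operations and the bits above them counting errors. The sketch below is based purely on that assumption; it is not confirmed by anything in this thread, and the function name is made up for illustration.

```python
def split_seagate_raw(raw):
    """Split a raw value into (errors, operations) under the ASSUMED layout:
    low 32 bits = operation count, higher bits = error count."""
    return raw >> 32, raw & 0xFFFFFFFF


# Raw values quoted in this thread for the ST3120814A
for name, raw in (("Raw_Read_Error_Rate", 221193856), ("Seek_Error_Rate", 952819715)):
    errors, operations = split_seagate_raw(raw)
    print(f"{name}: errors={errors} operations={operations}")
```

If that layout holds, both values fit entirely in the low 32 bits, so the error part is zero and the large numbers would only reflect how much work the drive has done.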

> 194 Temperature_Celsius     46 (0 20 0 0)

That's a lot.
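
The smartctl output earlier in the thread hints at a worse past: attribute 190 (Airflow_Temperature_Cel) has VALUE 054, WORST 035, THRESH 045 and is flagged In_the_past. A small sketch, assuming the normalized value is simply 100 minus the temperature in Celsius (which agrees with the current pair 054 and 46):

```python
def airflow_celsius(normalized):
    """Temperature implied by a normalized attribute-190 value.
    Assumed encoding: normalized = 100 - degrees Celsius."""
    return 100 - normalized


for label, normalized in (("current", 54), ("worst", 35), ("threshold", 45)):
    print(f"{label}: {airflow_celsius(normalized)} C")
```

Under that assumption the drive has at some point reached about 65 C against a limit of about 55 C, which would fit the WORST value of 065 shown for attribute 194.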

> >  The disk is about three years old, I'd guess?
> About that, if not more :)

Replace it; that's a well-known rule of thumb.

> Then again, the computer was shaken out/cleaned/greased and it
> stopped acting up ...

That's only temporary, I'm afraid: it has remapped sectors for now, but with
overheating like this you shouldn't count on a long and happy life.

-- 
 ---- WBR, Michael Shigorin <mike@altlinux.ru>
  ------ Linux.Kiev http://www.linux.kiev.ua/

