Posts Tagged ‘Linux’

Software raid 1 – Failing and recovering a disk

Tuesday, April 29th, 2008

A disk in a software raid group failed in one of my servers yesterday.

The kernel was spewing SCSI errors:

kernel: ata2: status=0xd0 { Busy }
kernel: SCSI error : return code = 0x8000002

# mdadm --detail /dev/md0
# mdadm --detail /dev/md1

Both reported failed members on sdb.
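A quick way to spot this state without reading the full mdadm output is to scan /proc/mdstat for a status map with a missing member, such as [U_]. The following is a minimal sketch; the function name is my own, and it assumes the usual mdstat layout where the array name line is followed by a line ending in the status map.

```shell
# Print the name of every md array whose status map (for example [U_])
# shows a missing member. Reads /proc/mdstat-formatted text on stdin.
# Hypothetical helper: the function name is illustrative.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { array = $1 }          # remember which array we are in
        /\[[U_]*_[U_]*\]/ && array != "" {    # status map with at least one "_"
            print array
            array = ""
        }
    '
}

# Usage:
#   degraded_arrays < /proc/mdstat
```

An empty result means every array listed has all of its members.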

The procedure to rebuild the md groups is as follows:

Replace bad disk (sdb in this scenario.) Note that if you do not bring down the server to replace the disk, be sure to “remove” the disk from the raid groups using mdadm.

# mdadm --remove /dev/md0 /dev/sdb1
# mdadm --remove /dev/md1 /dev/sdb2
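mdadm will only remove a member that is already marked faulty or is a spare. If the kernel has failed one partition on the disk but not the other, the healthy-looking one has to be failed by hand first. Here is a rough sketch of that; the helper names and the DRY_RUN switch are my own additions, and the device names in the usage line are assumptions about which partition belongs to which array.

```shell
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mark a member faulty, then remove it from its array.
# Hypothetical helper: the name is illustrative.
drop_member() {
    array=$1
    member=$2
    run mdadm "$array" --fail "$member" &&
    run mdadm "$array" --remove "$member"
}

# Usage (assumed mapping of partition to array):
#   DRY_RUN=1; drop_member /dev/md0 /dev/sdb1
```

Running it once with DRY_RUN set is a cheap way to check the device names before touching the arrays.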

Read the good disk’s partition table (sda in this scenario.)

# fdisk -l /dev/sda
Disk /dev/sda: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   fd  Linux raid autodetect
/dev/sda2              14       19457   156183930   fd  Linux raid autodetect

Install identical partition table on newly replaced disk. Create partitions that start and end on the same listed cylinders and are of type “fd.” Be sure to set the boot flag, and don’t forget to write the changes.

# fdisk /dev/sdb

Add partitions back to the appropriate raid groups.

# mdadm --add /dev/md0 /dev/sdb1
# mdadm --add /dev/md1 /dev/sdb2

Ensure the raid groups are rebuilding properly.

# mdadm --detail /dev/md0
# mdadm --detail /dev/md1
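The rebuild progress also shows up in /proc/mdstat as a "recovery = N%" line under the array being rebuilt. A small sketch for pulling out just that number follows; the function name is hypothetical, and the parsing assumes the usual mdstat layout.

```shell
# Print "<array> <percent>" for every array that is rebuilding.
# Reads /proc/mdstat-formatted text on stdin.
# Hypothetical helper: the function name is illustrative.
rebuild_progress() {
    awk '
        /^md[0-9]+ :/ { array = $1 }                  # remember which array we are in
        match($0, /(recovery|resync) *= *[0-9.]+%/) {
            status = substr($0, RSTART, RLENGTH)      # e.g. "recovery = 12.6%"
            sub(/^[^=]*= */, "", status)              # keep only the percentage
            print array, status
        }
    '
}

# Usage:
#   rebuild_progress < /proc/mdstat
```

No output means no array is currently rebuilding.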

Searching and executing with find(1)

Thursday, April 17th, 2008

This afternoon I was faced with searching a directory tree for large files that have rotated within the last 24 hours – a symptom of a problem we were experiencing with a service.

Here’s what I put together quickly:

# find . -iname 'name-*.log' -mtime 0 -exec du -sh {} \;

Explanation of the switches (from the find man page):

-iname pattern
Base of file name (the path with the leading directories removed) matches case insensitive shell pattern pattern

-mtime n
File's data was last modified n*24 hours ago (so -mtime 0 matches files modified within the last 24 hours).

-exec command {} \;
Run the specified command once for each matched file, with {} replaced by the file's path.
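A variation on the same idea: ending -exec with + instead of \; passes many files to a single du call, and sorting the result puts the largest files first. This is a sketch rather than what I ran; the function name is made up, and it assumes a find that supports -iname (GNU and BSD find both do).

```shell
# List files under a directory that match a case-insensitive pattern and
# were modified within the last 24 hours, largest first (sizes in KB).
# Hypothetical helper: the function name is illustrative.
recent_logs() {
    dir=$1
    pattern=$2
    # "{} +" batches the matched files into as few du invocations as possible.
    find "$dir" -type f -iname "$pattern" -mtime 0 -exec du -k {} + | sort -rn
}

# Usage:
#   recent_logs . 'name-*.log'
```

The pattern is quoted so the shell hands it to find unexpanded.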

It’s not complex (and probably not post-worthy,) but someone may find it helpful.