ddrescue: read non-tried blocks - recovery

I'm trying to rescue a 1TB disk which has read errors. Because I didn't have a free 1TB drive, I created a RAID 0 of two 500GB drives.
I used the command line from Wikipedia for the first run:
sudo ddrescue -f -n /dev/sdk /dev/md/md_test /home/user/rescue.map
ddrescue already completed this run after approximately 20 hours and more than 7000 read errors.
Now I'm trying to do a second run
sudo ddrescue -d -f -v -r3 /dev/sdk /dev/md/md_test /home/user/rescue.map
to read the non-tried blocks, but ddrescue gives me this:
GNU ddrescue 1.23
About to copy 1000 GBytes from '/dev/sdk' to '/dev/md/md_test'
Starting positions: infile = 0 B, outfile = 0 B
Copy block size: 128 sectors Initial skip size: 19584 sectors
Sector size: 512 Bytes
Press Ctrl-C to interrupt
Initial status (read from mapfile)
rescued: 635060 MB, tried: 0 B, bad-sector: 0 B, bad areas: 0
Current status
ipos: 1000 GB, non-trimmed: 0 B, current rate: 0 B/s
opos: 1000 GB, non-scraped: 0 B, average rate: 0 B/s
non-tried: 365109 MB, bad-sector: 0 B, error rate: 0 B/s
rescued: 635060 MB, bad areas: 0, run time: 0s
pct rescued: 63.49%, read errors: 0, remaining time: n/a
time since last successful read: n/a
Copying non-tried blocks... Pass 1 (forwards)
ddrescue: Write error: Invalid argument
I can't figure out what this write error means; I've already searched the manual for answers.
Any help is appreciated. Thanks!

After a while I found the cause of the write error: the capacity of the corrupt drive is 931.5G, but the total capacity of the RAID 0 was just 931.3G.
I realized it while taking a closer look at the output of the lsblk command.
So I rebuilt the RAID 0 array with three 500G drives, and ddrescue now works as expected.
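For anyone hitting the same wall: lsblk's human-readable sizes round away exactly this kind of difference, so it's worth comparing exact byte counts before the first pass. A minimal check (assuming util-linux's blockdev is available):
sudo blockdev --getsize64 /dev/sdk          # exact size of the failing source disk, in bytes
sudo blockdev --getsize64 /dev/md/md_test   # the destination must report at least as many bytes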

Related

rsync tries to copy although the file is already copied

I'm trying to understand the following; maybe one of you can help me out by explaining what exactly happens here:
My goal is to write a script which takes the files from a find command, runs parallel rsync commands, and backs them up, re-creating the folder structure from the source on the destination as well.
Somehow my initial script does not work like that, and I don't really have an idea of how to fix it to behave that way.
Initially, I copied the files with "ls -1 /foldername" as sourceFolder, which is not best practice. So I tried to change it to use "find" as described below.
find "$sourceFolder" -maxdepth 2 -mindepth 2 -print0 | xargs --verbose -0 -I {} -P $maxThreads -n 1 rsync -valhP {}/ "$destFolder"/ --checksum --human-readable --stats --dry-run >> "$logDir"/result.log
If I run this script, it literally recopies the file(s), although they already exist in the destination folder where I would expect them to be.
sending incremental file list
.d..tp..... ./
>f+++++++++ macs.txt
Number of files: 2 (reg: 1, dir: 1)
Number of created files: 1 (reg: 1)
Number of deleted files: 0
Number of regular files transferred: 1
Total file size: 242 bytes
Total transferred file size: 242 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 0
File list generation time: 0.484 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 83
Total bytes received: 22
So as it turned out, sourceFolder and destFolder have to match, and I can't run anything like sourceFolder/folder1/folder1a destFolder/.
Can anyone help me out with what's wrong and why it behaves like that?
Thanks a Lot,
M.
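What trips the script up is that each path found is synced against the top level of destFolder rather than against the matching subpath, so rsync sees every file as new. One way to preserve the source structure is rsync's --relative (-R) option combined with a /./ marker in the find path; a minimal sketch with hypothetical paths, not the original script:
# The /./ marks where rsync -R starts re-creating directories on the
# destination, so /data/src/a/b ends up as /data/dest/a/b.
find /data/src/./ -maxdepth 2 -mindepth 2 -print0 |
  xargs -0 -P 4 -I {} rsync -aR --checksum {} /data/dest/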

Gem5 in full system mode running SPEC2006 runs out of memory

What I am trying to do is run a SPEC CPU2006 benchmark (namely 410.bwaves) in full system mode.
I have made a .rcS script to pass to the fs.py script, and the command I type to start the simulation is as follows:
build/X86/gem5.opt configs/example/fs.py --script="../run_bwaves.rcS" --disk-image=ubuntu-14.04.img --kernel=x86_64-vmlinux-2.6.22.9
The result after some time is:
Free swap: 0kB
131072 pages of RAM
3650 reserved pages
18 pages shared
0 pages swap cached
Out of memory: kill process 807 (bwaves) score 13154 or a child
Killed process 807 (bwaves)
/tmp/script: line 39: 807 Killed ./bwaves
Full Linux output here
Gem5 output here
I am guessing it has something to do with this line: warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
but I am not sure.
I have tried adding a --mem-size=... flag, but it breaks the simulation with a Memory size not divisible by page size error.
If anyone could help me I would be glad.
Edit: As suggested in a comment, I used a large enough --mem-size value divisible by the page size. The error has now turned into:
bwaves[807]: segfault at 00007ffee647624c rip 0000000000410eb5 rsp 00007fff664761c0 error 4
/tmp/script: line 39: 807 Segmentation fault ./bwaves
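For reference, the invocation that gets past the divisibility error would look something like this (a sketch, not the exact command from the thread; 8192MB matches the DRAM device capacity from the warning and is a multiple of the 4 KiB x86 page size):
build/X86/gem5.opt configs/example/fs.py --script="../run_bwaves.rcS" \
    --disk-image=ubuntu-14.04.img --kernel=x86_64-vmlinux-2.6.22.9 \
    --mem-size=8192MB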

Process information in dump

I learned that the .tlist command in WinDbg lists all the processes that were running on the system at the time the crash dump was created.
I would like to see the memory information for each process, so that I can tell whether the system is overloaded by any specific process.
!process 0 1 will list all the processes and show memory-related info for each. I issued this command using livekd and got all the processes. Here's my chrome process (which I picked out from the output):
PROCESS fffffa8007cb4200
SessionId: 1 Cid: 1158 Peb: 7efdf000 ParentCid: 0ff8
DirBase: 1b7962000 ObjectTable: fffff8a00addb010 HandleCount: 135.
Image: chrome.exe
VadRoot fffffa80090a6f80 Vads 169 Clone 0 Private 4037. Modified 3702. Locked 0.
DeviceMap 0000000000000000
Token fffff8a0091f9120
ElapsedTime 00:05:49.161
UserTime 00:00:00.000
KernelTime 00:00:00.000
QuotaPoolUsage[PagedPool] 0
QuotaPoolUsage[NonPagedPool] 0
Working Set Sizes (now,min,max) (8020, 50, 345) (32080KB, 200KB, 1380KB)
PeakWorkingSetSize 10137
VirtualSize 144 Mb
PeakVirtualSize 151 Mb
PageFaultCount 66631
MemoryPriority BACKGROUND
BasePriority 8
CommitCharge 5784
Job fffffa8009822e30
Note the memory-related properties such as "Working Set Sizes", "Virtual Size", etc.
P.S. This works with livekd and with system memory dumps (which I believe is what livekd uses under the hood).
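For anyone repeating this, the session is roughly as follows (a sketch; livekd is the Sysinternals tool that runs the kernel debugger against the live system):
C:\> livekd
kd> !process 0 1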
Marc
This information is not contained in a process dump. .tlist queries your current system, not the state when the dump was taken. If you can take a system dump, then you can check out processes and their memory usage, as Marc Sherman already answered.

Solaris CPU run queue

Is there a command which can tell me what's in the Solaris run queue?
I can get a count using vmstat, but I need to know what processes/threads are in there.
The run-queue is always changing, so it's almost impossible to get the set of processes in the current run-queue.
That said, you can get an approximation by looking at the STAT (state) field of the process list from ps. When running the command below:
$ ps aux
...if the STAT field begins with R, then the process is marked RUNNABLE by the kernel, which on most operating systems means that it is in the run queue. Here's what a runnable process looks like on my machine:
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 78179 0.0 0.0 599828 480 s003 R+ 7:51AM 0:00.00 ps aux
On Solaris, you can also use the prstat command and look at the STATE column. The value run indicates that the process is on the run queue. (Also note that the value cpuN indicates that the process is currently running on processor N.)
For example:
$ prstat -s cpu -n 5
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
13974 kincaid 888K 432K run 40 0 36:14.51 67% cpuhog/1
27354 kincaid 2216K 1928K run 31 0 314:48.51 27% server/5
14690 root 136M 46M sleep 59 0 0:00.59 2.3% Xsun/1
14797 kincaid 9192K 7496K sleep 59 0 0:00.10 0.9% dtwm/8
14851 kincaid 24M 14M sleep 48 0 0:00.03 0.3% netscape/1
Total: 97 processes, 190 lwps, load averages: 2.18, 2.15, 2.11
I was about to correct 0xfe's answer when I saw you already did it. The run queue contains threads, not processes, so the -L option is mandatory with the prstat command if you want the number of "state run" lines to more or less match the run queue. Beware that sampling artifacts will probably prevent you from getting accurate matches.
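For example (a sketch; -L makes prstat report one line per LWP, i.e. per thread, rather than one per process):
$ prstat -L -s cpu -n 10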
In any case, if you want to know precisely which processes/threads are sitting in the run queue, you'd better go the DTrace way, assuming you are running Solaris 10 or newer.
The whoqueue.d script, which might already be in the /usr/demo/dtrace directory on your machine, is a good start:
# dtrace -s /usr/demo/dtrace/whoqueue.d
Run queue of length 1:
24349/1 (dtrace)
Run queue of length 3:
0/0 (sched)
0/0 (sched)
0/0 (sched)
Run queue of length 4:
22468/30 (java)
22468/17 (java)
22468/23 (java)
22468/10 (java)
Have a look at this page for details.

Comprehensive methods of viewing memory usage on Solaris [closed]

On Linux, the "top" command shows a detailed but high level overview of your memory usage, showing:
Total Memory, Used Memory, Free Memory, Buffer Usage, Cache Usage, Swap size and Swap Usage.
My question is, what commands are available to show these memory usage figures in a clear and simple way? Bonus points if they're present in the "Core" install of Solaris. 'sar' doesn't count :)
Here are the basics. I'm not sure that any of these count as "clear and simple" though.
ps(1)
For process-level view:
$ ps -opid,vsz,rss,osz,args
PID VSZ RSS SZ COMMAND
1831 1776 1008 222 ps -opid,vsz,rss,osz,args
1782 3464 2504 433 -bash
$
vsz/VSZ: total virtual process size (kb)
rss/RSS: resident set size (kb, may be inaccurate(!), see man)
osz/SZ: total size in memory (pages)
To compute the byte size from pages (with $pid set to the process ID of interest):
$ sz_pages=$(ps -o osz -p $pid | grep -v SZ )
$ sz_bytes=$(( $sz_pages * $(pagesize) ))
$ sz_mbytes=$(( $sz_bytes / ( 1024 * 1024 ) ))
$ echo "$pid OSZ=$sz_mbytes MB"
vmstat(1M)
$ vmstat 5 5
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr rm s3 -- -- in sy cs us sy id
0 0 0 535832 219880 1 2 0 0 0 0 0 -0 0 0 0 402 19 97 0 1 99
0 0 0 514376 203648 1 4 0 0 0 0 0 0 0 0 0 402 19 96 0 1 99
^C
prstat(1M)
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
1852 martin 4840K 3600K cpu0 59 0 0:00:00 0.3% prstat/1
1780 martin 9384K 2920K sleep 59 0 0:00:00 0.0% sshd/1
...
swap(1)
"Long listing" and "summary" modes:
$ swap -l
swapfile dev swaplo blocks free
/dev/zvol/dsk/rpool/swap 256,1 16 1048560 1048560
$ swap -s
total: 42352k bytes allocated + 20192k reserved = 62544k used, 607672k available
$
top(1)
An older version (3.51) is available on the Solaris companion CD from Sun, with the disclaimer that this is "Community (not Sun) supported".
More recent binary packages are available from sunfreeware.com or blastwave.org.
load averages: 0.02, 0.00, 0.00; up 2+12:31:38 08:53:58
31 processes: 30 sleeping, 1 on cpu
CPU states: 98.0% idle, 0.0% user, 2.0% kernel, 0.0% iowait, 0.0% swap
Memory: 1024M phys mem, 197M free mem, 512M total swap, 512M free swap
PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
1898 martin 1 54 0 3336K 1808K cpu 0:00 0.96% top
7 root 11 59 0 10M 7912K sleep 0:09 0.02% svc.startd
sar(1M)
And just what's wrong with sar? :)
mdb(1)
# echo ::memstat | mdb -k
Page Summary Pages MB %Tot
------------ ---------------- ---------------- ----
Kernel 7308 57 23%
Anon 9055 70 29%
Exec and libs 1968 15 6%
Page cache 2224 17 7%
Free (cachelist) 6470 50 20%
Free (freelist) 4641 36 15%
Total 31666 247
Physical 31256 244
"top" is usually available on Solaris.
If not then revert to "vmstat" which is available on most UNIX system.
It should look something like this (from an AIX box)
vmstat
System configuration: lcpu=4 mem=12288MB ent=2.00
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------------------
r b avm fre re pi po fr sr cy in sy cs us sy id wa pc ec
2 1 1614644 585722 0 0 1 22 104 0 808 29047 2767 12 8 77 3 0.45 22.3
the colums "avm" and "fre" tell you the total memory and free memery.
a "man vmstat" should get you the gory details.
Top can be compiled from sources or downloaded from sunfreeware.com. As previously posted, vmstat is available (I believe it's in the core install?).
The command free is nice. It takes a short while to understand the "+/- buffers/cache" line, but the idea is that cache and buffers don't really count when evaluating "free", since they can be dropped right away. Therefore, to see how much free (and used) memory you have, you need to subtract the cache/buffer usage, which is conveniently done for you.
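For illustration, here is what that looks like on a Linux box with an older procps free (hypothetical numbers; Solaris itself does not ship free). The "-/+ buffers/cache" row is simply used minus buffers/cache, and free plus buffers/cache:
$ free -m
             total       used       free     shared    buffers     cached
Mem:          1024        900        124          0         80        400
-/+ buffers/cache:        420        604
Swap:         2048          0       2048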