Ghostscript: use bbox to crop a PostScript file from the command line

What I am trying to accomplish is to crop my PostScript file called example.ps using the output described in bbox. I am doing this in a batch process where the bbox might be different for certain files. I have looked at pdfcrop and seen that it uses a similar approach. Here is the command I am using to crop right now.
gs -o cropped.pdf \
-sDEVICE=pdfwrite \
-dDEVICEWIDTHPOINTS=160 \
-dDEVICEHEIGHTPOINTS=840 \
-dFIXEDMEDIA \
-c "0 0 translate 0 0 160 840 rectclip" \
-f example.ps
The issue with this command is that I have to specify the width and height myself. What I want is to somehow call bbox first and then feed its result into this command, either through code or through some command-line redirection.

First, be aware that the pages of a multi-page PostScript file will rarely all show the exact same "bounding box" values. So you probably want the union of all per-page bounding boxes, that is, the smallest box that encloses every one of them.
Second, what you see in the console window when you run gs -sDEVICE=bbox is a mix of the stdout and stderr output channels. The info you're after goes to stderr. If you simply redirect the command output to a file, you're capturing stdout, not stderr! To suppress some of the version and debugging info, add -q to the command line.
So in order to get a 'clean' output of the bounding boxes for all pages, you have to redirect the stderr channel to stdout first (2>&1), which you can then capture in the file info.txt. So run a command like this (or similar):
gs \
-dBATCH \
-dNOPAUSE \
-q \
-sDEVICE=bbox \
example.ps \
2>&1 \
| tee info.txt
or even this, should you not need the info about the HiResBoundingBox:
gs \
-dBATCH \
-dNOPAUSE \
-q \
-sDEVICE=bbox \
example.ps \
2>&1 \
| grep ^%%Bound \
| tee info.txt
Also, BTW, note that Ghostscript can determine the bounding boxes of PostScript as well as PDF input files.
This should give you output like the following, where each line represents a page of the input file, starting with page 1 on the first line:
%%BoundingBox: 36 18 553 802
%%BoundingBox: 37 18 553 804
%%BoundingBox: 36 18 553 802
%%BoundingBox: 37 668 552 803
%%BoundingBox: 40 68 532 757
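To feed these values back into the cropping command from the question, the per-page boxes can be reduced to a single box that encloses all pages. The following is a minimal sketch, not a tested recipe: union_bbox and crop_geometry are hypothetical helper names, and it assumes info.txt holds lines in the format shown above.

```shell
# Reduce per-page "%%BoundingBox: llx lly urx ury" lines (read from stdin)
# to one box enclosing every page: smallest lower-left, largest upper-right.
union_bbox() {
  awk '/^%%BoundingBox:/ {
         if (!seen || $2 + 0 < llx) llx = $2 + 0
         if (!seen || $3 + 0 < lly) lly = $3 + 0
         if (!seen || $4 + 0 > urx) urx = $4 + 0
         if (!seen || $5 + 0 > ury) ury = $5 + 0
         seen = 1
       }
       END { if (seen) print llx, lly, urx, ury }'
}

# Turn a box (llx lly urx ury) into "width height xoffset yoffset",
# where the offsets move the lower-left corner of the box to 0 0.
crop_geometry() {
  printf '%d %d %d %d\n' "$(( $3 - $1 ))" "$(( $4 - $2 ))" "$(( -$1 ))" "$(( -$2 ))"
}

# Assumed usage (info.txt as produced by the commands above):
#   set -- $(union_bbox < info.txt)
#   crop_geometry "$@"
```

With the sample output above, union_bbox prints 36 18 553 804 and crop_geometry prints 517 786 -36 -18. The first two numbers would replace 160 and 840 in the question's command, and the last two would replace the 0 0 in front of translate. Whether that translate survives the page setup of a particular input file is something to verify per file.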
Lastly, you might want to read up in the following answers for some background info about Ghostscript's bbox device. You'll also find some alternative PostScript code for the cropping job there:
PDF - Remove White Margins
How to crop a section of a PDF file to PNG using Ghostscript

Related

Extract mapped reads from SAM to classify them

I want to extract mapped reads from a SAM file (from a resistome analysis using AMR++) to taxonomically classify them.
I searched mainly the samtools manual and Stack Overflow and found these steps:
samtools view -@ 20 -S -b SRR4454621_1.alignment.sam > SRR4454621_1.bam ## to convert SAM to BAM
samtools view -@ 20 -c SRR4454621_1.bam ### to count reads: 10000126
samtools view -@ 20 -c -F 260 SRR4454621_1.bam ### to count mapped reads: 53189
samtools view -@ 20 -b -F 4 SRR4454621_1.bam > SRR4454621_1_mapped.bam ### to get mapped reads
samtools view -@ 20 -c SRR4454621_1_mapped.bam ### new check to count mapped reads: 53189
samtools bam2fq SRR4454621_1_mapped.bam | seqtk seq -A > SRR4454621_1_mapped.fa ## to extract sequences
grep ">" SRR4454621_1_mapped.fa | wc -l ### to check whether everything is going right: 53063 (lost 126 sequences)
Then I run centrifuge to classify them.
centrifuge -f -x centridb/hpvc testing/SRR4454621_1_mapped.fa -S testing/SRR4454621_1_mapped.tsv -p 24 --report-file testing/SRR4454621_1_mapped_report.tsv
My problem is that the sum of the "numReads" column in SRR4454621_1_mapped_report.tsv is 107682, whereas I would expect it to equal the sum of the equivalent column from the resistome analysis, which is 51961.
Where could the problem be? Are the main steps I described above correct?
Thank you very much for your help,
Manuel

Command line pdftotext decimals don't work

I have a problem with pdftotext: if I pass a decimal value to the "-W" option, reading the PDF document does not work. Here is an example of my command:
pdftotext -layout -y 714 -x 102 -W 14,39 -H 28 example.pdf pdf-example.txt
If, on the other hand, I delete the decimal part from the command, everything works correctly.
I hope I have been clear enough.
Greetings
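One possible workaround, assuming the crop options (-x, -y, -W, -H) accept only whole numbers: round the value before passing it to pdftotext. The sketch below uses a hypothetical helper named to_int, which also converts a decimal comma to a decimal point first. It is only meant for non-negative values.

```shell
# Round a non-negative decimal (with "," or "." as separator) to the
# nearest whole number. LC_ALL=C keeps awk from applying locale rules.
to_int() {
  printf '%s\n' "$1" | tr ',' '.' | LC_ALL=C awk '{ printf "%d\n", $1 + 0.5 }'
}

# Assumed usage with the command from the question:
#   pdftotext -layout -y 714 -x 102 -W "$(to_int 14,39)" -H 28 example.pdf pdf-example.txt
```

Here to_int 14,39 yields 14, so the command ends up identical to the variant that already works.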

How to get -vf to ffmpeg from powershell commandline without parsing

I made a PowerShell function to recode video with some extra parameters. It basically runs Get-ChildItem in the directory and feeds every item it finds to a foreach loop. This worked well as long as I had default values inside my function that get passed to the ffmpeg command in the loop when I do not provide anything on the command line (number of passes, audio quality, etc.). Now I want to integrate the -vf ffmpeg filter option. My problem is that I usually don't need it, so there is no sane default I could use, and I cannot have something like -vf $filteroption in my command line. So I am trying to figure out how to get that "-vf" inside the variable without PowerShell or ffmpeg getting in the way. At the moment I either get an error about a missing - in what ffmpeg sees (I guess PowerShell parses it away), or, when I \ escape the -, I do see it in the ffmpeg line, but ffmpeg does not recognize it as a single parameter.
examples which work:
&$encoder -hide_banner -i $i -c:v libvpx-vp9 -b:v 0 -crf $quality -tile-columns 6 -tile-rows 2 -threads 8 -speed 2 -frame-parallel 0 -row-mt 1 -c:a libopus -b:a $bitrate -af aformat=channel_layouts=$audio -c:s copy -auto-alt-ref 1 -lag-in-frames 25 -y $outfile;
here I provide $quality, $audio etc. with powershell parameters to the function like -quality 31 -audio stereo and it all works.
But now I need to get something like "-vf scale=1920:-1" or "" inside that line and that does not work with something like just this:
&$encoder -hide_banner -i $i -c:v libvpx-vp9 -b:v 0 -crf $quality -tile-columns 6 -tile-rows 2 -threads 8 -speed 2 -frame-parallel 0 -row-mt 1 -c:a libopus -b:a $bitrate -af aformat=channel_layouts=$audio -c:s copy -auto-alt-ref 1 -lag-in-frames 25 -y $extra $outfile;
When I call the function with "RecodeVP9 -extra -vf scale=1920:-1", PowerShell takes away the -. If I try escaping the - with -, ffmpeg complains that it is "Unable to find a suitable output format for '-vf'". I also tried "" and "-" with similar results. So it seems that either PowerShell or ffmpeg is getting in the way.
So to sum it up:
I need a way to get extra ffmpeg arguments WITH the parameter name itself from the powershell command line into my powershell function (like -vf scale=1920:-1).

Create a .m4s file with ffmpeg

I'm trying to create .m4s files and I'm using this command with ffmpeg: ffmpeg -i input.mov -c:v copy output.m4s
The file can't be created.
Output: Unable to find a suitable output format for tempM4S.m4s tempM4S.m4s: Invalid argument
I'm guessing the file format .m4s is not supported by ffmpeg, which is strange because ffmpeg can create .m4s files when creating segments for MPEG-DASH. Is there a workaround for this problem? Will I have to use other tools or check ffmpeg's source code for hints?
m4s files do not work by themselves as they do not contain a moov box required for playback. They require an initialization fragment as well.
I am guessing you want to create an m4s file to include it as part of an m4s series. As @szatmary mentioned, these files are not independent. So you can try this:
1. Merge the m4s files into one mov file.
2. Merge your mov file with the step 1 output file.
3. Split the output file of step 2 into m4s files again.
Here's how I achieved this. My use case is an audio-only stream. The backend service ships MPEG-DASH files to static hosting. I upload the initializing .mp4 segment once, subsequently only uploading each new .m4s segment.
Also, the playlist .mpd is updated every segment, in order to tell a newly arriving listener where to begin playback. The listener will pick up an initializing .mp4 file for the desired bitrate, followed by the current and following .m4s segments.
My source material is a series of 10-second media segments in uncompressed .wav format.
I run ffmpeg once for each new media segment.
I've specified a segment size of 11 seconds to ensure that ffmpeg generates exactly 1 output .m4s segment for each 10-second source media segment.
ffmpeg \
-i pelicans-1234.wav \
-f hls \
-ac 2 \
-c:a aac \
-b:a 128k \
-minrate 128k \
-maxrate 128k \
-start_number 1234 \
-hls_fmp4_init_filename pelicans-128k-IS.mp4 \
-hls_segment_filename pelicans-128k-%d.m4s \
-hls_segment_type fmp4 \
-hls_time 11 \
temp.m3u8

Do not show directories in rsync output

Does anybody know if there is an rsync option so that directories being traversed do not show on stdout?
I'm syncing music libraries, and the massive number of directories makes it very hard to see which file changes are actually happening.
I've already tried -v and -i, but both also show directories.
If you're using --delete in your rsync command, the problem with calling grep -E -v '/$' is that it will omit the information lines like:
deleting folder1/
deleting folder2/
deleting folder3/folder4/
If you're making a backup and the remote folder has been completely wiped out for some reason, rsync will also wipe out your local folder, and you won't notice because you don't see the deleting lines.
To omit the already existing folders but keep the deleting lines at the same time, you can use this expression:
rsync -av --delete remote_folder local_folder | grep -E '^deleting|[^/]$'
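To illustrate what that expression keeps and drops, here is a small sketch that runs the same grep over canned text rather than a real rsync run. The function name filter_rsync_output and the sample lines are made up for the demonstration.

```shell
# Keep lines that start with "deleting" or that do not end in a slash;
# drop plain directory lines (and empty lines).
filter_rsync_output() {
  grep -E '^deleting|[^/]$'
}

# Sample input resembling rsync -av --delete output (hypothetical paths).
printf '%s\n' \
  'sending incremental file list' \
  'deleting folder1/' \
  'first/' \
  'first/file1' \
  'second/' \
  'second/file3' | filter_rsync_output
```

The lines first/ and second/ are dropped, while deleting folder1/ survives because the first alternative matches it even though it ends in a slash.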
I'd be tempted to filter by piping through grep -E -v '/$', which uses an end-of-line anchor to remove lines that finish with a slash (a directory).
Here's the demo terminal session where I checked it...
cefn@cefn-natty-dell:~$ mkdir rsynctest
cefn@cefn-natty-dell:~$ cd rsynctest/
cefn@cefn-natty-dell:~/rsynctest$ mkdir 1
cefn@cefn-natty-dell:~/rsynctest$ mkdir 2
cefn@cefn-natty-dell:~/rsynctest$ mkdir -p 1/first 1/second
cefn@cefn-natty-dell:~/rsynctest$ touch 1/first/file1
cefn@cefn-natty-dell:~/rsynctest$ touch 1/first/file2
cefn@cefn-natty-dell:~/rsynctest$ touch 1/second/file3
cefn@cefn-natty-dell:~/rsynctest$ touch 1/second/file4
cefn@cefn-natty-dell:~/rsynctest$ rsync -r -v 1/ 2
sending incremental file list
first/
first/file1
first/file2
second/
second/file3
second/file4
sent 294 bytes received 96 bytes 780.00 bytes/sec
total size is 0 speedup is 0.00
cefn@cefn-natty-dell:~/rsynctest$ rsync -r -v 1/ 2 | grep -E -v '/$'
sending incremental file list
first/file1
first/file2
second/file3
second/file4
sent 294 bytes received 96 bytes 780.00 bytes/sec
total size is 0 speedup is 0.00