Interpreting openSMILE feature output - audeering-opensmile

My output features have -0 and -1 appended to their names; what does this mean? For example, I have both pcm_RMSenergy_sma_amean-0 and pcm_RMSenergy_sma_amean-1, with different values. This happens with the GeMAPSv01a.conf and IS10_paraling.conf config files (I haven't checked others).

Related

How to partly rename downloaded files using wget?

I'd like to download many files (about 10000) from an FTP server. The file names are too long, and I'd like to save them with only the date in the name. For example, I would prefer ABCDE201604120000-abcde.nc to be saved as 20160412.nc.
Is it possible?
I am not sure whether wget provides similar functionality; with curl, however, one can take advantage of the relatively rich syntax it offers for specifying the URL of interest. For example:
curl \
"https://ftp5.gwdg.de/pub/misc/openstreetmap/SOTMEU2014/[53-54].{mp3,mp4}" \
-o "file_#1.#2"
will download files 53.mp3, 53.mp4, 54.mp3, 54.mp4. The output file is specified as file_#1.#2 - here, #1 is replaced by curl with the value of the sequence [53-54] corresponding to the file being downloaded. Similarly, #2 is replaced with either mp3 or mp4. Thus, e.g., 53.mp3 will be saved as file_53.mp3.
ewcz's answer works fine if you can enumerate the file names as shown in the post. However, if the filenames are difficult to enumerate, for example, because the integers are sparsely populated, this solution would result in a lot of 404 Not Found requests.
If this is the case, then it is probably better to download all the files recursively, as you have shown, and rename them afterwards. If the file names follow a fixed pattern, you can select the substring from the original name and use it as the new name. In the given example, the new file names start at position 5 (zero-based) and are 8 characters long. The following bash command renames all *.nc files in the current directory.
for f in *.nc; do mv "$f" "${f:5:8}.nc" ; done
If the filenames do not follow a fixed pattern and vary in length, you can use more complex pattern substitution with sed, as sketched below; see this SO post for another example.
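For instance, a minimal sketch, assuming the date is the first run of exactly eight digits in each name (the file names here are only an illustration):
for f in *.nc; do
    # Keep the first 8-digit run as the new name; drop everything else.
    new=$(echo "$f" | sed -E 's/^[^0-9]*([0-9]{8}).*/\1.nc/')
    # Rename only if the extraction produced a different name.
    [ "$new" != "$f" ] && mv "$f" "$new"
done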

clang-tidy: How to suppress warnings?

I recently started experimenting with LLVM's clang-tidy tool. Now I am trying to suppress false warnings from third-party library code. For this I want to use the command-line options
-header-filter=<string> or -line-filter=<string>
but so far without success. So for people with limited time I will put the question here at the beginning and explain later what I already tried.
Question
What option do I need to give to the clang-tidy tool to suppress a warning from a certain line and file?
Or, if this is not possible:
What option works to suppress warnings from external header files?
What I did so far
My original call to clang-tidy looks like this
clang-tidy-3.8 -checks=-*,clang-analyzer-*,-clang-analyzer-alpha* -p Generated/LinuxMakeClangNoPCH Sources/CodeAssistant/ModuleListsFileManipulator_fixtures.cpp
and the first line of the resulting warning that I want to suppress looks like this
.../gmock/gmock-spec-builders.h:1272:5: warning: Use of memory after it is freed [clang-analyzer-cplusplus.NewDelete]
return function_mocker_->AddNewExpectation(
The gmock people told me that this is a false positive, so I want to suppress it. First I tried to use the -line-filter=<string> option. The documentation says:
-line-filter=<string> - List of files with line ranges to filter the warnings. Can be used together with -header-filter. The format of the list is a JSON array of objects:
[
  {"name":"file1.cpp","lines":[[1,3],[5,7]]},
  {"name":"file2.h"}
]
I assumed that warnings on the given lines are filtered out, but the documentation does not say whether they are filtered out or in.
After some fiddling around I created a .json file with the content
[
{"name":"gmock-spec-builders.h","lines":[[1272,1272]]}
]
and modified the command line to
clang-tidy-3.8 -checks=-*,clang-analyzer-*,-clang-analyzer-alpha* -p Generated/LinuxMakeClangNoPCH -line-filter="$(< Sources/CodeAssistant/CodeAssistant_ClangTidySuppressions.json)" Sources/CodeAssistant/ModuleListsFileManipulator_fixtures.cpp
which writes the content of the file into the argument. This suppresses not only that warning but all warnings from the ModuleListsFileManipulator_fixtures.cpp file. I tried several variations but could not make it work.
So I tried the -header-filter=<string> option. Here the documentation states that one has to give a regular expression that matches all the header files from which diagnostics shall be displayed. Ok, I thought, let's use a regular expression that matches everything in the same folder as the analyzed .cpp file. I can live with that, although it may remove warnings that result from me using external headers incorrectly.
Here I was not sure if the regular expression must match the full (absolute) filename or only a part of the filename. I tried
-header-filter=.*\/CodeAssistant\/.*.h
which matches all absolute header filenames in the CodeAssistant folder but it did not suppress the warnings from the gmock-spec-builders.h file.
So preferably I would like to suppress each warning individually so I can determine for each if it is a real problem or not, but if this is not possible I could also live with suppressing warnings from entire external headers.
Thank you for your time.
I solved the problem by adding // NOLINT to line 1790 of gmock-spec-builders.h
Here is the diff:
--- gmock-spec-builders.orig.h 2016-09-17 09:46:48.527313088 +0200
+++ gmock-spec-builders.h 2016-09-17 09:46:58.958353697 +0200
@@ -1787,7 +1787,7 @@
#define ON_CALL(obj, call) GMOCK_ON_CALL_IMPL_(obj, call)
#define GMOCK_EXPECT_CALL_IMPL_(obj, call) \
- ((obj).gmock_##call).InternalExpectedAt(__FILE__, __LINE__, #obj, #call)
+ ((obj).gmock_##call).InternalExpectedAt(__FILE__, __LINE__, #obj, #call) // NOLINT
#define EXPECT_CALL(obj, call) GMOCK_EXPECT_CALL_IMPL_(obj, call)
#endif // GMOCK_INCLUDE_GMOCK_GMOCK_SPEC_BUILDERS_H_
It would be nice to either upstream this patch (I see other NOLINT in the code) or post a bug report with the clang-tidy folks.
I have found another non-invasive (without adding // NOLINT to a third-party library) way to suppress warnings. For example, the current version of Google Test fails some cppcoreguidelines-* checks. The following code allows you to validate the current diff excluding lines that contain gtest's macros:
git diff -U3 | sed '
s/^+\( *TEST(\)/ \1/;
s/^+\( *EXPECT_[A-Z]*(\)/ \1/;
s/^+\( *ASSERT_[A-Z]*(\)/ \1/;
' | recountdiff | interdiff -U0 /dev/null /dev/stdin | clang-tidy-diff.py -p1 -path build
It assumes that the file build/compile_commands.json has been generated beforehand and that clang-tidy-diff.py is available in your environment. recountdiff and interdiff, from patchutils, are the standard tools for manipulating patches.
The script works as follows:
git diff -U3 generates a patch with 3 context lines.
sed ... removes the prefix + from the undesired lines, i.e. turns them into context lines.
recountdiff corrects the offsets (in the first ranges) of the hunk headers.
interdiff -U0 /dev/null /dev/stdin just removes all context lines from the patch. As a result, it splits the initial hunks.
clang-tidy-diff.py reads only the second ranges from the hunk headers and passes them to clang-tidy via the -line-filter option.
UPD: It's important to provide interdiff with a sufficient number of context lines; otherwise it may produce artifacts in the result. See this quotation from man interdiff:
For best results, the diffs must have at least three lines of context.
In particular, I have found that git diff -U0 | ... | interdiff generates spurious literals like $!otj after splitting hunks.
Use -isystem instead of -I to set your system and 3rd party include paths. -I should only be used to include code that is part of the project being built.
This is the only thing required to make clang-tidy ignore all errors in external code. All the other answers (at the time of writing) are just poor workarounds for something that is perfectly solved with -isystem.
If you use a build system like CMake or Meson it will automatically set -I and -isystem correctly for you.
-isystem is also the mechanism that is used for telling compilers, at least GCC and Clang, what's not your code. If you start to use -isystem you can also enable more compiler warnings without getting "false positives" from external code.
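For example, a minimal sketch of an invocation (the paths are hypothetical; clang-tidy forwards everything after -- to the compiler front end):
# Project headers stay on -I; third-party headers go on -isystem, so
# diagnostics originating inside them are suppressed.
clang-tidy Sources/CodeAssistant/ModuleListsFileManipulator_fixtures.cpp -- \
    -I Sources \
    -isystem ThirdParty/gmock/include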
I couldn't achieve what I wanted with the command-line options, so I will use the // NOLINT comments in the cpp files, as proposed by the accepted answer.
I will also try to push the fix to googletest.
I found out that the lines given in the -line-filter option are filtered in.
But giving concrete lines is no real solution for my problem anyway.
What I really need is a suppression mechanism like the one implemented in Valgrind.

Counting files and directories in a very large subversion repository

Here at work, we have a rather large Subversion repository. As part of our internal monitoring, we want a count of all files and directories for every revision in all our repositories. The problem is, one of them has around 29000 revisions and contains around 300000 directories, with almost 4 million files. Our previous method simply used the output of the 'svnlook' command in a Perl script to count everything. I've tried using the output of 'svnlook changed' to build a count, and it mostly works, but there is some rather annoying guesswork involved. As a side note, the repos are hosted on a Xen VM, so I/O performance is a bit of an issue. Does anyone have a better way to do this?
Assuming you are talking about server-side repos:
svn list -R --xml file:///svnrepos/myrepo | grep kind=\"file\" | wc -l
It's not very fast, but it is accurate.
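Counting directories works the same way, since svn list marks them with kind="dir" (a sketch along the same lines; adjust the repository URL):
# Count directories instead of files.
svn list -R --xml file:///svnrepos/myrepo | grep kind=\"dir\" | wc -l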
I'd look into the svnadmin dump delta format. I've played with it a little, but basically it's one huge patch-type file containing all the files and all the revisions. It's text in nature, so relatively straightforward to process with something like Perl, and it is fairly small compared to going through the whole of each revision one at a time.
You'd probably need to have a representation of all the files (if 4 million, maybe use SQLite for this) and update them as you pass through each revision. The delta does the revisions in order, so it ought to be relatively straightforward. (Maybe I am being optimistic.)
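Producing the delta dump itself is a one-liner (the repository path is hypothetical; --deltas emits diffs instead of full texts, which keeps the dump comparatively small):
# Dump every revision as deltas for offline processing.
svnadmin dump --deltas /svnrepos/myrepo > myrepo-deltas.dump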
How about something like:
find /svndir | wc -l
The output from find on Linux or Unix generates one line per file or directory, and it is recursive. Pipe the output to "wc -l" to count the lines.
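If you want files and directories counted separately, as the monitoring above seems to need, find can filter by type (a sketch assuming a working copy at /svndir; the .svn metadata is pruned so it does not inflate the numbers):
# Files only, then directories only, skipping .svn bookkeeping.
find /svndir -name .svn -prune -o -type f -print | wc -l
find /svndir -name .svn -prune -o -type d -print | wc -l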

Showing differences within a line in diff output

This StackOverflow answer has an image of KDiff3 highlighting intra-line differences. Does someone know of a tool which can show the same (e.g., via color) on the command line?
Another way to think of this is wanting to diff each difference in a patch file.
I don't know if this is sufficiently command line for your purpose, but vimdiff can do this (even does colour). See for example the image in this related question.
I tried all the tools I could find (wdiff, dwdiff, kdiff3, vimdiff) to show the difference between two long and slightly different lines. My favourite is diff-highlight (part of git contrib):
it supports the diff format - a great advantage over tools that require two files, like dwdiff, e.g. if you need to visualize the output of unit tests
it highlights in black and white, or in color if you pipe it into colordiff
it highlights character-wise - helpful for comparing long lines without spaces (better than wdiff)
Installation
On Ubuntu, you probably already have it as part of git contrib (installed within the git deb package).
Copy or link it into your ~/bin folder from /usr/share/doc/git/contrib/diff-highlight/diff-highlight
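Something like this should work (the paths match the Ubuntu layout mentioned above; on some versions you may first need to run make in that contrib directory):
# Copy the script onto your PATH and make it executable.
cp /usr/share/doc/git/contrib/diff-highlight/diff-highlight ~/bin/
chmod +x ~/bin/diff-highlight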
Usage example
cat tmp.diff | diff-highlight | colordiff
Result: (screenshot of the highlighted output not reproduced here)
Another intuitive way to see all word-sized differences (though not side-by-side) is to use wdiff together with colordiff (you might need to install both). An example of this would be:
wdiff -n {file-A} {file-B} | colordiff
You can optionally pipe this into less -R to scroll through the output (-R is used to show the colors in less).
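Put together, the full pipeline might look like this (file names are only an illustration):
# Word-level diff, colored, scrollable; -R makes less pass the colors through.
wdiff -n old.txt new.txt | colordiff | less -R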
I had a similar problem and wanted to avoid using vimdiff. I found dwdiff (which is available in Debian) to have several advantages over wdiff.
The most useful feature of dwdiff is that you can customise the delimiters with -d [CHARS], so it's useful for comparing all kinds of output. It also has color built in with the -c flag.
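For example (a sketch; the delimiter set and file names are only an illustration):
# Treat commas and semicolons as word boundaries too, and colorize the output.
dwdiff -c -d ',;' old.csv new.csv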
You might be able to use colordiff for this.
From its man page:
Any options passed to colordiff are passed through to diff except for the colordiff-specific option 'difftype', e.g.
colordiff --difftype=debdiff file1 file2
Valid values for 'difftype' are: diff, diffc, diffu, diffy, wdiff, debdiff; these correspond to plain diffs, context diffs, unified diffs, side-by-side diffs, wdiff output and debdiff output respectively. Use these overrides when colordiff is not able to determine the diff-type automatically.
I haven't tested it, but the side-by-side output (as produced by diff -y file1 file2) might give you the equivalent of in-line differences.
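So the combination to try would be something like the following (untested, as noted; file names are only an illustration):
# Side-by-side diff, colorized; the explicit difftype tells colordiff
# how to parse the -y output.
diff -y file1 file2 | colordiff --difftype=diffy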
ccdiff is a convenient dedicated tool for the task. (The original answer shows a screenshot of its output, not reproduced here.)
By default, it highlights the differences in color, but it can be used on a console without color support too.
The package is included in the main repository of Debian:
ccdiff is a colored diff that also colors inside changed lines.
All command-line tools that show the difference between two files fall short in showing minor changes in a visually useful way. ccdiff tries to give the look and feel of diff --color or colordiff, extending the colored output from whole deleted and added lines to the deleted and added characters within the changed lines.
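Basic usage is just the two files (a quick sketch; see the manual for the various highlighting options):
# Character-level coloring inside changed lines, out of the box.
ccdiff old.txt new.txt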

How does Perl know a file is binary?

I know you can use the file test operator -B to test if a file is binary, but how does Perl implement this internally?
From perldoc -f -B:
The -T and -B switches work as follows. The first block or so of the file is examined for odd characters such as strange control codes or characters with the high bit set. If too many strange characters (>30%) are found, it's a -B file; otherwise it's a -T file. Also, any file containing null in the first block is considered a binary file.
If -T or -B is used on a filehandle, the current IO buffer is examined rather than the first block. Both -T and -B return true on a null file, or a file at EOF when testing a filehandle. Because you have to read a file to do the -T test, on most occasions you want to use a -f against the file first, as in "next unless -f $file && -T $file".
According to Chapter 11 of the book Learning Perl:
The answer is **Perl cheats**: it opens the file, looks at the first few thousand bytes, and makes an educated guess. If it sees a lot of null bytes, unusual control characters, and bytes with the high bit set, then that looks like a binary file. If there’s not much weird stuff, then it looks like text. It sometimes guesses wrong. If a text file has a lot of Swedish or French words (which may have characters represented with the high bit set, as some ISO-8859-something variant, or perhaps even a Unicode version), it may fool Perl into declaring it binary. So it’s not perfect, but if you need to separate your source code from compiled files, or HTML files from PNGs, these tests should do the trick.
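You can watch the heuristic at work from the shell with one-liners like these (the sample paths are typical of a Linux system but otherwise arbitrary):
# -B and -T are the operators the quoted docs describe: they read the first
# block and guess from the proportion of "strange" bytes.
perl -e 'print((-B $ARGV[0] ? "binary" : "text"), "\n")' /bin/ls
perl -e 'print((-T $ARGV[0] ? "text" : "binary"), "\n")' /etc/hostname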