Merge files using Hadd - merge

I am attempting to merge three ntuples (just an example but there are more) stored in a directory that are labeled as the following:
[1] mc16a_SUSY.root
[2] mc16d_SUSY.root
[3] mc16e_SUSY.root
[4] ......
To do this I am using the command hadd (hadd outputfile inputfiles..)
os.system("hadd -f Combined_SUSY_SAMPLES.root mc16*SUSY*.root")
For the output file I want to combine all files with mc16 and SUSY in the file name
But I receive the error:
hadd Target file: Combined_SUSY_SAMPLES.root
hadd compression setting for all output: 1
hadd Source file 1: mc16*SUSY*.root
Error in <TFile::TFile>: file mc16*SUSY*.root does not exist
Error in <TFileMerger::AddFile>: cannot open file mc16*SUSY*.root
hadd exiting due to error in mc16*SUSY*.root
It states that there aren't any files with mc16*SUSY*.root but these files do exist. Any solutions? Thanks for the help in advance.
I use the "*" because there are too many files to list them individually.

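I'm late to respond here, but could it be that you don't have the full path to the files? Some databases will send you the file names with certain commands, but not the absolute locations. If the wildcard matches nothing in the directory the command runs from, the literal pattern is handed to hadd unchanged, which is exactly what the error shows. It could be that you need to prepend the folder to the filename.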
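As a minimal sketch of that suggestion, the wildcard can be expanded in Python with the folder joined onto it, so the match no longer depends on the working directory. The function names build_hadd_command and merge and the input_dir argument are made up for illustration; only hadd -f and the file pattern come from the question.

```python
import glob
import os
import subprocess


def build_hadd_command(input_dir, pattern, output_name):
    """Expand the wildcard in Python and return the hadd argument list."""
    # Join the directory onto the pattern so the match does not depend on
    # the current working directory of the script.
    inputs = sorted(glob.glob(os.path.join(input_dir, pattern)))
    if not inputs:
        raise FileNotFoundError(f"no files match {pattern!r} in {input_dir!r}")
    return ["hadd", "-f", output_name, *inputs]


def merge(input_dir, pattern="mc16*SUSY*.root",
          output_name="Combined_SUSY_SAMPLES.root"):
    # Passing a list means no shell is involved, so hadd receives the
    # already-expanded file names rather than the literal pattern.
    subprocess.run(build_hadd_command(input_dir, pattern, output_name),
                   check=True)
```

Raising when nothing matches makes a wrong directory fail early in the script, before hadd is started at all.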

Related

mapping values are not allowed in this context in "<unicode string>"

In my loop, I run a dbt command and save the output to a .yml file. The following command works and generates a schema in my .yml file accurately:
for file in models/l30_mart/*.sql; do
table=$(basename "$file" .sql)
dbt run-operation generate_model_yaml --args "{\"model_name\": \"$table\"}" > test.yml
done
However, in the example above, I am saving the test.yml file in the root directory. When I try to save the file in another path for example models/l30_mart/test.yml like this, it doesn't work:
for file in models/l30_mart/*.sql; do
table=$(basename "$file" .sql)
dbt run-operation generate_model_yaml --args "{\"model_name\": \"$table\"}" > models/l30_mart/test.yml
done
In this case, when I open the test.yml file, I see this:
12:06:42 Running with dbt=1.0.1
12:06:43 Encountered an error:
Compilation Error
The schema file at models/l30_mart/test.yml is invalid because no version is specified. Please consult the documentation for more information on schema.yml syntax:
https://docs.getdbt.com/docs/schemayml-files
What am I missing out on?
If I try something like this to save different files with the extracted tablename variable as the filename, it also doesn't work:
for file in models/l30_mart/*.sql; do
table=$(basename "$file" .sql)
dbt run-operation generate_model_yaml --args "{\"model_name\": \"$table\"}" > models/l30_mart/$table.yml
done
In this case, the files either have this output:
20:39:44 Running with dbt=1.0.1
20:39:45 Encountered an error:
Compilation Error
The schema file at models/l30_mart/firsttable.yml is invalid because no version is specified. Please consult the documentation for more information on schema.yml syntax:
https://docs.getdbt.com/docs/schemayml-files
or this (eg in the secondtablename.yml file):
20:39:48 Running with dbt=1.0.1
20:39:49 Encountered an error:
Parsing Error
Error reading dbt_4flow: l30_mart/firstablename.yml - Runtime Error
Syntax error near line 2
------------------------------
1 | 20:39:44 Running with dbt=1.0.1
2 | 20:39:45 Encountered an error:
3 | Compilation Error
4 | The schema file at models/l30_mart/firsttablename.yml is invalid because no version is specified. Please consult the documentation for more information on schema.yml syntax:
5 |
Raw Error:
------------------------------
mapping values are not allowed in this context
in "<unicode string>", line 2, column 31
Note that the secondtablename.yml mentions the firsttablename.yml.
I don't know dbt, but the likely explanation is that dbt parses all *.yml files in that target directory whenever you call it. Because the shell creates the redirection target before starting dbt, the *.yml file already exists (but is still empty) when dbt runs. dbt expects the file to contain a version, so it fails, and its error message is what ends up in the file.
To check whether this assessment is correct, write into a temporary file:
for file in models/l30_mart/*.sql; do
  target_file=$(mktemp)
  table=$(basename "$file" .sql)
  dbt run-operation generate_model_yaml --args "{\"model_name\": \"$table\"}" > "$target_file"
  mv "$target_file" models/l30_mart/test.yml
done
(Be aware of mktemp shenanigans if you're using macOS)
Edit: Since dbt seems to be affected by the files existing, you can also try to generate all files and move them into the correct directory afterwards:
target_dir=$(mktemp -d)
for file in models/l30_mart/*.sql; do
  table=$(basename "$file" .sql)
  dbt run-operation generate_model_yaml --args "{\"model_name\": \"$table\"}" > "$target_dir/$table.yml"
done
mv "$target_dir"/*.yml models/l30_mart/
rmdir "$target_dir"
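The same "generate everything elsewhere, then move" idea can be sketched in Python, in case the loop is driven from a script instead of the shell. The names generate_then_move, dbt_command and build_command are hypothetical helpers introduced here; the dbt arguments are copied from the question, and nothing below has been run against a real dbt project.

```python
import json
import shutil
import subprocess
import tempfile
from pathlib import Path


def dbt_command(table):
    # Same invocation as in the question, built as an argument list.
    return ["dbt", "run-operation", "generate_model_yaml",
            "--args", json.dumps({"model_name": table})]


def generate_then_move(model_dir, build_command=dbt_command):
    """Write every generated file to a temp dir, then move them all over."""
    model_dir = Path(model_dir)
    moved = []
    with tempfile.TemporaryDirectory() as tmp:
        for sql_file in sorted(model_dir.glob("*.sql")):
            table = sql_file.stem
            out = Path(tmp) / f"{table}.yml"
            # The output file lives outside model_dir while the command runs,
            # so the tool never sees an empty or half-written .yml there.
            with out.open("w") as fh:
                subprocess.run(build_command(table), stdout=fh, check=True)
        for yml in sorted(Path(tmp).glob("*.yml")):
            dest = model_dir / yml.name
            shutil.move(str(yml), str(dest))
            moved.append(dest)
    return moved
```

With check=True a failing command raises before any file is moved, so the models directory is left untouched on error.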

Read multiple h5 files but there is an os error couldn't find these files

When I try to read many h5 files in one shot, I get an OS error like this:
OSError: Unable to open file (unable to open file: name = '/scratch-lustre/hpc-0227/deepcpgData/c{1,2,3,4,5,7,9,11,13}_*.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
I am sure all of these files exist and that their names match c{1,2,3,4,5,7,9,11,13}_*.h5 correctly. I am also using an absolute path. My bash script looks like this:
data_dir="/scratch-lustre/hpc-0227/deepcpgData"
train_files="$data_dir/c{1,2,3,4,5,7,9,11,13}_*.h5"
It works when I use the full name of a single file, for example, c16_1572864-1605632.h5. However, I need to read a lot of files. So I have to read them in one round. I have searched many other answers but none of them deal with multiple files and the os error at the same time.
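One thing worth noting about the script above: bash does not perform brace expansion or globbing inside double quotes, so train_files holds the literal pattern, and that is the name shown in the error. If the reading program is Python, the list can be built there instead. This is only a sketch; the function name expand_chromosome_files and its arguments are invented for illustration, and it assumes the c<number>_ prefix seen in the question.

```python
import glob
import os


def expand_chromosome_files(data_dir, chromosomes):
    """Return every file named c<N>_*.h5 for the given N values."""
    files = []
    for chrom in chromosomes:
        # Python's glob has no {a,b} brace syntax, so each prefix is
        # expanded with its own pattern.
        pattern = os.path.join(data_dir, f"c{chrom}_*.h5")
        files.extend(sorted(glob.glob(pattern)))
    return files
```

A call such as expand_chromosome_files("/scratch-lustre/hpc-0227/deepcpgData", [1, 2, 3, 4, 5, 7, 9, 11, 13]) would then yield real file names that can be opened one by one.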

Load all records that contain `sym value from splayed tables in directory

I have tables called quotes, trades and sym saved as splayed tables in a directory called splay in my q directory. I cannot figure out how to load these tables using the methods described on the code.kx.com website. When I check the file properties, the file type is shown as File, so I do not know which extension to put after the filename. Once I have managed to load these files, I need to select all records that contain the symbol IBM (in the column sym of the tables). This is what I have tried so far:
q)\cd splay
q)\l quotes
'quotes. OS reports: The system cannot find the file specified.
[0] (.Q.l)
q)\l trades
'trades. OS reports: The system cannot find the file specified.
[0] (.Q.l)
.Q )\l trades.q
'trades.q. OS reports: The system cannot find the file specified.
[2] (<load>)
))\l trades.dat
'trades.dat. OS reports: The system cannot find the file specified.
[4] (.Q.l)
to no avail. I tried the same approach for the directory itself:
q)\l splay
I have tried to just run the files without loading by being in the directory but this has also not been successful.
q)\cd splay
q)\cd
"C:\\Users\\Lewis\\splay"
q)t:get`:trades
'trades. OS reports: The system cannot find the file specified.
[0] t:get`:trades
^
q)q:get `:quotes
'quotes. OS reports: The system cannot find the file specified.
[0] q:get `:quotes
^
q)load`quotes
'quotes. OS reports: The system cannot find the file specified.
[0] load`quotes
^
One of the ways the code.kx.com website says to do this, and one of my first approaches:
C:\Users\Lewis\q>q/q.exe splay
KDB+ 3.5 2017.10.11 Copyright (C) 1993-2017 Kx Systems
w32/ 4()core . . .
Welcome to kdb+ 32bit edition
For support please see http://groups.google.com/d/forum/personal-kdbplus
Tutorials can be found at http://code.kx.com/q
To exit, type \\
To remove this startup msg, edit q.q
'/q.exe. OS reports: The system cannot find the file specified.
[0] (.Q.l)
.Q )
and the final approach I have had to load these files or directory is:
q)))load `splay
'splay. OS reports: Access is denied.
[6] load `splay
^
q))))\cd splay
q))))load `splay
'splay. OS reports: Access is denied.
[9] load `splay
^
Please, help me!
If you are in the directory /Users/Lewis you should be able to pass the splay as a command line parameter, like this: q splay. There may be an issue with the path you are using to your q application q\q.exe which is causing an error to flag up.
Alternatively you should be able to open it from inside an active q session like: \l splay provided you are in the directory /Users/Lewis OR like \l . if you are in the directory /Users/Lewis/splay, where . is a shortcut for the current directory.
Additionally, you stated that you have the tables trades, quotes and sym. It all depends on how you saved the data to disk, but the sym file should not be a table like the other two, which you should see when you load the data in.
The error OS reports: Access is denied. is probably due to the q process not having appropriate permissions to access the file. If you start the process with admin privileges you should be able to get around this error.

unoconv fails to save in my specified directory

I am using unoconv to convert an ods spreadsheet to a csv file.
Here is the command:
unoconv -vvv --doctype=spreadsheet --format=csv --output= ~/Dropbox/mariners_site/textFiles/expenses.csv ~/Dropbox/Aldeburgh/expenses/expenses.ods
It saves the output file in the same directory as the source file, not in the specified directory. The error message is:
Output file: /home/richard/Dropbox/mariners_site/textFiles/expenses.csv
unoconv: UnoException during export phase:
Unable to store document to file:///home/richard/Dropbox/mariners_site/textFiles/expenses.csv (ErrCode 19468)
I'm sure that this worked initially, but it has since stopped.
I have checked for permissions and they are identical for both directories.
I translated ErrCode 19468 for you and it boils down to meaning ERRCODE_SFX_DOCUMENTREADONLY.
You can find more information about the specific meaning of LibreOffice ErrCode numbers from the unoconv documentation at: https://github.com/dagwieers/unoconv/blob/master/doc/errcode.adoc
The clue here is that you have a whitespace character between --output= and the filename (--output= ~/Dropbox/mariners_site/textFiles/expenses.csv). Because of that, unoconv gets an empty output value (which means the current directory) and is given two input files. That explains why you get this specific error, IMO.
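If the conversion is launched from a script, building the argument list explicitly makes this kind of stray space hard to introduce. A minimal sketch, assuming the same flags as in the question; the helper name build_unoconv_command is made up here:

```python
import os


def build_unoconv_command(source, output):
    """Return the unoconv argument list with --output= and its path joined."""
    # No shell is involved when a list is passed to subprocess, so "~"
    # has to be expanded explicitly.
    source = os.path.expanduser(source)
    output = os.path.expanduser(output)
    return [
        "unoconv", "-vvv",
        "--doctype=spreadsheet",
        "--format=csv",
        f"--output={output}",  # one argument, no space after "="
        source,
    ]
```

The resulting list can be handed to subprocess.run(..., check=True); since the output path is part of the same argument as the flag, it cannot be mistaken for a second input file.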

Copy all files with given extension to output directory using CMake

I've seen that I can use this command in order to copy a directory using cmake:
file(COPY "myDir" DESTINATION "myDestination")
(from this post)
My problem is that I don't want to copy all of myDir, but only the .h files that are in there. I've tried with
file(COPY "myDir/*.h" DESTINATION "myDestination")
but I obtain the following error:
CMake Error at CMakeLists.txt:23 (file):
file COPY cannot find
"/full/path/to/myDIR/*.h".
How can I filter the files that I want to copy to a destination folder?
I've found the solution by myself:
file(GLOB MY_PUBLIC_HEADERS
"myDir/*.h"
)
file(COPY ${MY_PUBLIC_HEADERS} DESTINATION myDestination)
this also works for me:
install(DIRECTORY "myDir/"
DESTINATION "myDestination"
FILES_MATCHING PATTERN "*.h" )
The alternative approach provided by jepessen does not take into account that the number of files to be copied can be too high. I ran into this issue when copying more than 110 files.
Due to a limitation on Windows on the number of characters (2047 or 8191) in a single command line, this approach may randomly fail depending on the number of headers that are in the folder. More info here https://support.microsoft.com/en-gb/help/830473/command-prompt-cmd-exe-command-line-string-limitation
Here is my solution:
file(GLOB MY_HEADERS myDir/*.h)
foreach(CurrentHeaderFile IN LISTS MY_HEADERS)
  add_custom_command(
    TARGET MyTarget PRE_BUILD
    COMMAND ${CMAKE_COMMAND} -E copy_if_different ${CurrentHeaderFile} ${myDestination}
    COMMENT "Copying header: ${CurrentHeaderFile}")
endforeach()
This works like a charm on macOS. However, if you have another target that depends on MyTarget and needs to use these headers, you may get compile errors on Windows because the includes are not found. In that case you may prefer the following option, which defines an intermediate target.
function (CopyFile ORIGINAL_TARGET FILE_PATH COPY_OUTPUT_DIRECTORY)
  # Copy to the disk at build time so that when the header file changes, it is detected by the build system.
  set(input ${FILE_PATH})
  get_filename_component(file_name ${FILE_PATH} NAME)
  set(output ${COPY_OUTPUT_DIRECTORY}/${file_name})
  set(copyTarget ${ORIGINAL_TARGET}-${file_name})
  add_custom_target(${copyTarget} DEPENDS ${output})
  add_dependencies(${ORIGINAL_TARGET} ${copyTarget})
  add_custom_command(
    DEPENDS ${input}
    OUTPUT ${output}
    COMMAND ${CMAKE_COMMAND} -E copy_if_different ${input} ${output}
    COMMENT "Copying file to ${output}."
  )
endfunction ()
foreach(HeaderFile IN LISTS MY_HEADERS)
  CopyFile(MyTarget ${HeaderFile} ${myDestination})
endforeach()
The downside is that you end up with multiple targets (one per copied file), but they should all be listed together (alphabetically) since they start with the same prefix ORIGINAL_TARGET -> "MyTarget".