I am trying to get PowerShell's Restore-DfsrPreservedFiles to restore about 100GB of files that were placed in the DfsrPrivate\PreExisting folder. I have the command worked out, and it works until it hits a file that causes a "path name too long" error. At this point, having 99.9GB restored with 1 file missing, which I can get from backups later, would not be a problem.
I can't figure out how (if it's possible) to make it skip the 1 or 2 files it's having problems with and keep going.
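One way to find the offending files up front is to list every file whose restored path would reach the classic Windows limit of 260 characters. The following is only a sketch in Python, not part of the DFSR tooling: `find_long_paths`, `root` and `dest_root` are hypothetical names, and it assumes the error comes from the length of the destination path. It does not change how Restore-DfsrPreservedFiles behaves; it only reports candidates that could be moved aside or renamed before the restore.

```python
import os
import tempfile

MAX_PATH = 260  # classic Windows path limit, counting the terminating NUL


def find_long_paths(root, dest_root, limit=MAX_PATH):
    """Return files under root whose path below dest_root would reach limit."""
    too_long = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            # Rebuild the path the file would have after being restored.
            rel = os.path.relpath(src, root)
            dest = os.path.join(dest_root, rel)
            if len(dest) >= limit:
                too_long.append(src)
    return too_long
```

Calling `find_long_paths(preexisting_folder, restore_target)` with the two real folders would give the short list of files to handle by hand.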
Yesterday I installed PostgreSQL 14.5 on a Windows 10 laptop.
I then ran an old script to load images into a table.
The script uses the pg_read_binary_file function.
Some of the images are .jpg files and some are .png files.
Of the 34 files, only 5 were successfully processed (1 .jpg and 4 .png). The other 29 failed with the following error:
[Exception, Error code 0, SQLState XX000] ERROR: could not open file "file absolute path" for reading: Invalid argument
For instance, the following statement executes without errors
select pg_read_binary_file('C:\Users\Jorge\OneDrive\Documents\000\020-logos\adalid.png') as adalid_png;
... and the following statement fails
select pg_read_binary_file('C:\Users\Jorge\OneDrive\Documents\000\020-logos\oper.png') as oper_png;
... with the following error message
[Exception, Error code 0, SQLState XX000] ERROR: could not open file "C:/Users/Jorge/OneDrive/Documents/000/020-logos/oper.png" for reading: Invalid argument
So far, I have not been able to identify any difference in the files that could be the cause of the error. Also, I'm pretty sure the script works on earlier releases of version 14. Unfortunately I have not been able to find a website to download any of those earlier releases to test it again.
Has anyone else found this problem, and its solution?
I think the issue is somehow caused by OneDrive. This laptop is new. When I logged in with my Microsoft account, the OneDrive directory was automatically created and updated. Apparently this operation only updates the directory entries, leaving the contents of the files in the cloud until they are opened. When I zipped the directory that contains all my images, a message from OneDrive appeared saying that it was about to restore some files. After that, all the commands in my scripts worked.
My theory is that pg_read_binary_file gets the file entry from the directory, so it doesn't give the "No such file or directory" message; but then fails reading the contents, giving the "Invalid argument" message instead.
The unanswered question would be: why does 7-Zip make OneDrive restore the files but pg_read_binary_file does not?
UPDATE
After more testing, and reading "Save disk space with OneDrive Files On-Demand for Windows", I am now sure that pg_read_binary_file can fail with the message "Invalid argument" when the OneDrive file is not locally available. In Windows File Explorer such a file has a blue cloud icon next to it.
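To check in advance which files are online-only, one option is to look at the Windows file attributes. This is a rough Python sketch, not something PostgreSQL provides: the helper names are hypothetical, and it assumes OneDrive marks online-only files with the offline or recall attributes (the constants are the standard Windows values). On non-Windows systems `st_file_attributes` does not exist, so the check simply returns False.

```python
import os

# Standard Windows file attribute values.
FILE_ATTRIBUTE_OFFLINE = 0x00001000
FILE_ATTRIBUTE_RECALL_ON_OPEN = 0x00040000
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS = 0x00400000

_CLOUD_MASK = (
    FILE_ATTRIBUTE_OFFLINE
    | FILE_ATTRIBUTE_RECALL_ON_OPEN
    | FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS
)


def is_cloud_only(attributes):
    """True if any 'contents are not stored locally' attribute bit is set."""
    return bool(attributes & _CLOUD_MASK)


def file_is_cloud_only(path):
    """Check a file on disk; reading the attributes does not open its data."""
    attributes = getattr(os.stat(path), "st_file_attributes", 0)
    return is_cloud_only(attributes)
```

Files flagged this way could be set to "Always keep on this device" in File Explorer before running the script.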
I have a script regularly appending to a log file. When I use entr (discovered here) to monitor that log file, and I then touch the log, everything works fine, but when the script appends to the file, entr fails. This may be because I have noatime set in my fstab - but that only stops the updating of the access time not the modify time, so this confuses me.
I've checked and while atime is not updating, ctime (ls -lc) definitely is. Could entr really be depending on atime? I use noatime because I have an SSD. So what should I do? I just stumbled on lazytime. Would that solve the problem?
Since monitoring the log file was not working, I tried entr -cdr on the directory of files that are updated (a new file is created) at the same time as the log (the log is in a different directory). entr recognizes when the directory contents change, but the -r does not work. The entr process just ends, saying "entr: directory altered".
Any idea how to fix this or whether I should just go back to inotify, would be appreciated.
Edit: I have written it with inotify now, and the event reported when the log file is written to is, sensibly enough, "MODIFY."
It turns out that entr does not respond to IN_MODIFY events, but only to these (in Linux):
IN_CLOSE_WRITE|IN_DELETE_SELF|IN_MOVE_SELF|IN_CREATE
Also, IN_ATTRIB, but only if the file-mode or inode numbers change.
In BSD/OSX, it's:
NOTE_DELETE|NOTE_WRITE|NOTE_RENAME|NOTE_TRUNCATE|NOTE_ATTRIB
Also, the option -r has no effect in the context of the -d option. It only works when entr is monitoring files.
See the developer's comments. Also, more info on entr.
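The difference can be illustrated with the inotify bit masks themselves. This is only a sketch in Python, not entr's actual code: `entr_reacts` is a hypothetical helper, the constants are the values from `<sys/inotify.h>`, and the conditional IN_ATTRIB case is left out. A writer that appends while keeping the file open produces IN_MODIFY, and IN_CLOSE_WRITE only arrives once it closes the file.

```python
# Event bits as defined in <sys/inotify.h>.
IN_MODIFY = 0x002
IN_ATTRIB = 0x004
IN_CLOSE_WRITE = 0x008
IN_CREATE = 0x100
IN_DELETE_SELF = 0x400
IN_MOVE_SELF = 0x800

# The Linux events listed above (IN_ATTRIB is conditional, so it is omitted).
ENTR_LINUX_MASK = IN_CLOSE_WRITE | IN_DELETE_SELF | IN_MOVE_SELF | IN_CREATE


def entr_reacts(event_mask):
    """True if the event mask contains at least one event from the list."""
    return bool(event_mask & ENTR_LINUX_MASK)
```

With these definitions, `entr_reacts(IN_MODIFY)` is False while `entr_reacts(IN_CLOSE_WRITE)` is True.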
I'm trying to create parquet files for several days locally. The first time I run the code, everything works fine. The second time it fails to delete a file. The third time it fails to delete another file. Which file cannot be deleted is completely random.
I need this to work because I want to create parquet files every day for the last seven days, so the parquet files that are already there should be overwritten with the updated data.
I use Project SDK 1.8, Scala version 2.11.8 and Spark version 2.0.2.
After running that line of code the second time:
newDF.repartition(1).write.mode(SaveMode.Overwrite).parquet(
OutputFilePath + "/day=" + DateOfData)
this error occurs:
WARN FileUtil:
Failed to delete file or dir [C:\Users\...\day=2018-07-15\._SUCCESS.crc]:
it still exists.
Exception in thread "main" java.io.IOException:
Unable to clear output directory file:/C:/Users/.../day=2018-07-15
prior to writing to it
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:91)
After the third time:
WARN FileUtil: Failed to delete file or dir
[C:\Users\day=2018-07-20\part-r-00000-8d1a2bde-c39a-47b2-81bb-decdef8ea2f9.snappy.parquet]: it still exists.
Exception in thread "main" java.io.IOException: Unable to clear output directory file:/C:/Users/day=2018-07-20 prior to writing to it
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:91)
As you can see, it is a different file from the one that failed on the second run.
And so on. After deleting the files manually, all parquet files can be created.
Does anybody know this issue and how to fix it?
Edit: It's always a crc-file that can't be deleted.
Thanks for your answers. :)
The solution is not to write to the Users directory. There seems to be a permission problem. So I created a new folder on the C: drive and it works perfectly.
This problem occurs when the destination directory is open in Windows. You just need to close the directory.
Perhaps another Windows process has a lock on the file so it can't be deleted.
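To see which process-locked files are in the way, the output directory can be cleared by hand before the write, with a few retries, and whatever survives can be reported. This is a sketch in Python rather than anything from the Spark API: `clear_output_dir` and its parameters are hypothetical, and it assumes the lock is short-lived enough that a retry might help.

```python
import os
import shutil
import tempfile
import time


def clear_output_dir(path, attempts=3, delay_seconds=0.5):
    """Try to remove path; return the files that still exist afterwards."""
    for _ in range(attempts):
        # ignore_errors keeps going past files that another process holds.
        shutil.rmtree(path, ignore_errors=True)
        if not os.path.exists(path):
            return []
        time.sleep(delay_seconds)
    leftovers = []
    for dirpath, _dirnames, filenames in os.walk(path):
        leftovers.extend(os.path.join(dirpath, name) for name in filenames)
    return leftovers
```

If the returned list is not empty, those are the files some other process (for example an open Explorer window) is still holding.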
I am configuring BLAST+ on my Mac (macOS Sierra) and am having trouble configuring the nr and nt databases that I also downloaded locally. I am trying to follow NCBI's instructions here, and am getting hung up on the Configuration and Example Execution steps.
They say to change my .bash_profile so that it says:
export PATH=$PATH:$HOME/Documents/Luke/Research/Pedulla\ 17-18/blast/ncbi-blast-2.6.0+/bin
That works fine, and they say configure a path for BLASTDB "similarly" but to the file where my DB will be, so I have done this:
export BLASTDB=$BLASTDB:$HOME/Documents/Luke/Research/Pedulla\ 17-18/blast/blastdb/nt.00
which specifies the exact folder that I got when I unzipped the nt tar file from their FTP. With this path, if I run the command...
blastn -query test_query.fa -db nt.00 -task blastn -outfmt "7 qseqid sseqid evalue bitscore" -max_target_seqs 5
then it runs successfully and I get results, but I am worried that these are only being checked against the nt.00 section of the entire nt.00 database file, especially because if I run my test_query.fa sequence on the Web Blast, I get different results.
Also, their instructions say that the path only needs to point to the folder that contains the whole database folder nt.00, from the tar I unzipped--and not the specific nt.00 itself--, which in my case would just be "blastdb/" (As opposed to "blastdb/nt.00/" which then contains nt.00.nhd, nt.00.nal, etc.). That makes sense because when I am working I want to be able to blastn on the nt database but also blastp on the nr one, etc. by changing the -db flag on my command, and there shouldn't be a problem with having them all in this folder, right? But if I must specify the path for BLASTDB with the nt.00 DB added to the end, how could I ever use nr.00 in the same folder (blastdb/)? Essentially, I want to do as the instructions say, and just have this:
export BLASTDB=$BLASTDB:$HOME/Documents/Luke/Research/Pedulla\ 17-18/blast/blastdb/
And then depending on what database I want to use I could just say so after the -db flag on my command. But when I make the path like that above, it gives me this error:
BLAST Database error: No alias or index file found for nucleotide database [nt] in search path [/Users/LJStout::/Users/LJStout/Documents/Luke/Research/Pedulla 17-18/blast/blastdb:]
I have tried running that same blastn command from above and swapping out "nt" for "nt.00", and have tried these commands with the path for BLASTDB ending in both "blastdb/" and "blastdb/nt" and of course "blastdb/nt.00" which is the only one that runs without errors.
Here's an example of another thread I read where the OP is worried about his executions not checking the entire nt.00 folder, this was different than my problem however.
Thanks for your help!
This whole problem came down to having the nt.00 and nr.00 folders (the original folders that result from unzipping their respective .tar.gz files) in the same parent folder, when it is their contents that should be in the same parent folder. I simply deleted the folders they came in and copied the contents over to my new, single parent folder. I was somewhat misled by the instructions; it was a simple mistake. Now I have one folder, blastdb/, that contains all of the contents of every database I plan on using, including nt, nr, and refseq.
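The same flattening can be scripted if there are many volumes. Below is a small Python sketch, not an NCBI tool: `flatten_db_folders` is a hypothetical helper that moves everything out of each immediate subfolder (such as nt.00/ or nr.00/) into the parent folder and then removes the emptied subfolder. It refuses to overwrite a file that already exists in the parent.

```python
import os
import shutil
import tempfile


def flatten_db_folders(parent):
    """Move the contents of each immediate subfolder of parent into parent."""
    moved = []
    for entry in sorted(os.listdir(parent)):
        sub = os.path.join(parent, entry)
        if not os.path.isdir(sub):
            continue  # files already in the parent stay where they are
        for name in sorted(os.listdir(sub)):
            src = os.path.join(sub, name)
            dest = os.path.join(parent, name)
            if os.path.exists(dest):
                raise FileExistsError(dest)
            shutil.move(src, dest)
            moved.append(name)
        os.rmdir(sub)  # the subfolder is empty now
    return moved
```

After running it on blastdb/, BLASTDB only needs to point at that one folder, and the database is chosen with the -db flag.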
I have a script that reads data, processes it, and prints an output. The script is not running in the directory where it is saved. I tried to change the directory via os.chdir, but I still get the error "file not found". I placed the scripts in the phone storage by drag and drop from my PC.