Moses SMT translation difficulty with creating model (German to English) from two parallel file datasets - moses

I try to create the model in Moses SMT from two parallel language files.
I completed all the stages of creation of the model.
But when I run translation:
echo "Um zu bestimmen" | ~/mosesdecoder/bin/moses -f ~/mosesdecoder/0_my_test/align_2016.08.19_14.24.05/model/moses.ini
it gives me some exception (in attached picture - in full detail, and written below (only exception body)), asking to execute the command:
compile-lm --text yes /home/user/mosesdecoder/0_my_test/align_2016.08.19_14.24.05/lm/de_lm_proc.gz /home/user/mosesdecoder/0_my_test/align_2016.08.19_14.24.05/lm/de_lm_proc.gz.arpa
attached picture of the result of running moses command mentioned above
.....
Exception: lm/read_arpa.cc:64 in void lm::ReadARPACounts(util::FilePiece&, std::vector&) threw FormatLoadException because `line == "iARPA"'.
This looks like an IRSTLM iARPA file. You need an ARPA file. Run
compile-lm --text yes /home/user/mosesdecoder/0_my_test/align_2016.08.19_14.24.05/lm/de_lm_proc.gz /home/user/mosesdecoder/0_my_test/align_2016.08.19_14.24.05/lm/de_lm_proc.gz.arpa
first. Byte: 6
But even after I executed that command I get the same exception.
What should I do?

Related

Synapse Spark exception handling - Can't write to log file

I have written PySpark code to hit a REST API and extract the contents in an XML format and later wrote to Parquet in a data lake container.
I am trying to add logging functionality where I not only write out errors but updates of actions/process we execute.
I am comparatively new to Spark I have been relying on online articles and samples. All explain the error handling and logging through "1/0" examples and saving logs in the default folder structure (not in ADLS account/container/folder) which do not help at all. Most of the code written in Pure Python doesn't run as-is.
Could I get some assistance with setting up the following:
Push errors to a log file under a designated folder sitting under a data lake storage account/container/folder hierarchy".
Catching REST specific exceptions.
This is a sample of what I have written:
''''
LogFilepath = "abfss://raw#.dfs.core.windows.net/Data/logging/data.log"
#LogFilepath2 = "adl://.azuredatalakestore.net/raw/Data/logging/data.log"
print(LogFilepath)
try:
1/0
except Exception as e:
print('My Error...' + str(e))
with open(LogFilepath, "a") as f:
f.write("An error occured: {}\n".format(e))
''''
I have tried it both ABFSS and ADL file paths with no luck. The log file is already available in the storage account/container/folder.
I have reproduced the above using abfss path in with open() function but it gave me the below error.
FileNotFoundError: [Errno 2] No such file or directory: 'abfss://synapsedata#rakeshgen2.dfs.core.windows.net/datalogs.logs'
As per this Documentation
we can use open() on ADLS file with a path like /synfs/{jobId}/mountpoint/{filename}.
For that, first we need to mount the ADLS.
Here I have mounted it using ADLS linked service. you can mount either by Storage account access key or SAS as per your requirement.
mssparkutils.fs.mount(
"abfss://<container_name>#<storage_account_name>.dfs.core.windows.net",
"/mountpoint",
{"linkedService":"<ADLS linked service name>"}
)
Now use the below code to achieve your requirement.
from datetime import datetime
currentDateAndTime = datetime.now()
jobid=mssparkutils.env.getJobId()
LogFilepath='/synfs/'+jobid+'/synapsedata/datalogs.log'
print(LogFilepath)
try:
1/0
except Exception as e:
print('My Error...' + str(e))
with open(LogFilepath, "a") as f:
f.write("Time : {}- Error : {}\n".format(currentDateAndTime,e))
Here I am writing date time along with the error and there is no need to create the log file first. The above code will create and append the error.
If you want to generate the logs daily, you can generate date file names log files as per your requirement.
My Execution:
Here I have executed 2 times.

For some reason, a warning is issued when calling the procedure SYSPROC.ADMIN_CMD ('EXPORT to ...')

I have the following problem:
I am using the following command:
EXPORT TO "D:\ExportFiles\ACTIVATE_DICT.csv" OF DEL MODIFIED BY TIMESTAMPFORMAT="YYYY/MM/DD HH:MM:SS" STRIPLZEROS MESSAGES "D:\ExportFiles\FMessage.txt" SELECT * FROM DB2INST4.ACTIVATE_DICT;
In the Command Editor of the program, the Control Center successfully exported data from the ACTIVATE_DICT table to a CSV file ACTIVATE_DICT.csv.
But for a number of reasons, I need you to execute this command in the IBM Data Studio or DataGrip program, and there it cannot be executed in this form.
Therefore, I read the following manual enter link description here
and based on it wrote the following command:
CALL SYSPROC.ADMIN_CMD('EXPORT to /lotus/ExportFiles/ACTIVATE_DICT.csv OF DEL MODIFIED BY TIMESTAMPFORMAT="YYYY/MM/DD HH:MM:SS" STRIPLZEROS MESSAGES /lotus/ExportFiles/FMessage.txt SELECT * FROM DB2INST4.ACTIVATE_DICT');
Here is the message on the result of the command:
[2018-10-11 15:15:23] [ ][3107] There is at least one warning
message in the message file.. SQLCODE=3107, SQLSTATE= ,
DRIVER=4.23.42 [2018-10-11 15:15:23] 1 row retrieved starting from 1
in 75 ms (execution: 29 ms, fetching: 46 ms)
And in the / lotus / ExportFiles / directory there is no ACTIVATE_DICT.csv file and there is no FMessage.txt file in the / lotus / ExportFiles / directory.
Question: How then to correctly execute this command ??? Maybe I'm doing something wrong?
sqlcode 3107 is a warning message:
SQL3107W At least one warning message was encountered during LOAD processing.
Explanation
You can load data into a database from a file, tape, or named pipe using the LOAD command. You can specify that any warnings or errors from the LOAD processing be printed to a message file. If no message file is specified, the warnings or errors are printed to standard out (unless the database manager instance is configured as a partitioned-database environment.)
It is to tell you to read message log in the message file you specified. In your case: /lotus/ExportFiles/FMessage.txt
Please read into the file to see what error is logged and if you need help understand what is logged, please post the content of the file.
This message is returned when at least one warning was received during processing. If a message file is being used, the warnings and errors will be printed there.
This warning does not affect processing.
User response
Review the message file warning.
EXPORT command using the ADMIN_CMD procedure
See use of the 'MESSAGES ON SERVER' clause, and how to get these messages using the result set returned by this routine in this case.

Cudafy chapter 3 example has path issue how to fix?

Using Cudafy version 1.29, which can be downloaded from here
I am executing the examples that are found in the install folder CudafyV1.29\CudafyByExample\
Specifically, "chapter 3" example that begins line 42 of program.cs calls the following:
simple_kernel.Execute();
which is this:
public static void Execute()
{
CudafyModule km = CudafyTranslator.Cudafy(); // <--exception thrown!
GPGPU gpu = CudafyHost.GetDevice(CudafyModes.Target, CudafyModes.DeviceId);
gpu.LoadModule(km);
gpu.Launch().thekernel(); // or gpu.Launch(1, 1, "kernel");
Console.WriteLine("Hello, World!");
}
The indicated line throws this exception:
Compilation error: CUDAFYSOURCETEMP.cu
'C:\Program' is not recognized as an internal or external command,
operable program or batch file. .
Which is immediately obvious that the path has spaces and the programmer did not double quote or use ~ to make it operational.
So, I did not write this code. And I cannot step through the sealed code contained within CudafyModule km = CudafyTranslator.Cudafy();In fact I don't even know the full path that is causing the exception, it is cut-off in the exception message.
Does anyone have a suggestion for how to fix this issue?
Update #1: I discovered where CUDAFYSOURCETEMP.cu lives on my computer, here it is:
C:\Users\humphrt\Desktop\Active Projects\Visual Studio
Projects\CudafyV1.29\CudafyByExample\bin\Debug
...I'm still trying to determine what the program is looking for along the path to 'C:\Program~'.
I was able to apply a workaround to bypass this issue. The workaround is to reinstall all components of cudafy in to folders with paths with no ' ' (spaces). My setup looks like the below screenshot. Notice that I also installed the CUDA TOOLKIT from NVIDIA in the same folder - also with no spaces in folder names.
I created a folder named "C:\CUDA" and installed all components within it, here is the folder structure:

Model Error: It appears that build process was unable to locate some Utility(e.g. make,compiler,linker, etc.)

I am new to S-Functions and Real TIme WorkShop. The S-Function has been created using an Embedded Matlab Function. The S-Function Works fine, but when i try to build the same, i get the following error report:
It appears that the build process was unable to locate some utility (e.g. make, compiler, linker, etc.). Please verify your path and tool environment variables are correct. You should be able to execute the make command: .\Radius_S_func2.bat at an MS DOS Command Prompt in the directory: C:\Users\skaushik\Desktop\Matlab Models\WIP\TESTs\Sfunc_2\Radius_S_func2_Source Currently, this generates the following error:
C:\Users\skaushik\Desktop\Matlab Models\WIP\TESTs\Sfunc_2\Radius_S_func2_Source>set MATLAB=C:\MATLAB\R2010b
C:\Users\skaushik\Desktop\Matlab Models\WIP\TESTs\Sfunc_2\Radius_S_func2_Source>.......\u_Utils\Build\make -f Radius_S_func2.mk GENERATE_REPORT=0 INCLUDE_MDL_TERMINATE_FCN=0 COMBINE_OUTPUT_UPDATE_FCNS=1 MAT_FILE=0 MULTI_INSTANCE_CODE=0 INTEGER_CODE=0 PORTABLE_WORDSIZES=0 GENERATE_ERT_S_FUNCTION=0 GENERATE_ASAP2=0 EXT_MODE=0 EXTMODE_STATIC_ALLOC=0 EXTMODE_STATIC_ALLOC_SIZE=1000000 EXTMODE_TRANSPORT=0 TMW_EXTMODE_TESTING=0 MODELLIB=Radius_S_func2lib.lib RELATIVE_PATH_TO_ANCHOR=.. MODELREF_TARGET_TYPE=NONE OPTS="-DRT -DUSE_RTMODEL -DERT" The system cannot find the path specified.
PLease guide me how to understand the error, so i can take care of it myself.
Thank you!!
Solution: The build target was not specified therefore matlab gave the above mentioned error. To resolve this one should specify the build environment for a specific target.

Tesseract problem with mftraining stage

I've successfully created a box file with tesseract
now after running the unicharset_extractor
having it creating the unicharset file that looks like:
...
n 3 NULL -1
s 3 NULL 23
t 3 NULL 43
...
I've continued with this command
mftraining -U unicharset -O testlang.unicharset testlang.tr
only to get the next error
Reading testlang.tr ...
testlang has no defined properties.
Error: Illegal short name for a feature!
I've never worked with Tesseract, but it seems there is an open issue in the bug database that looks a lot like your problem : http://code.google.com/p/tesseract-ocr/issues/detail?id=385
It seems that it is related to scientific notation not being correctly supported by some functions.
On the issue page a user suggests a solution, and another one proposes a patch. You could try applying the patch to see if it helps.