I am trying to create an email containing some job status information, which I want to spread across multiple lines. However, whatever I do, the output ends up on one line. I have changed the MIME type to HTML and tried "\n", "\r", "\r\n", and a String object's newline. Nothing seems to work.
I did notice that these characters are being processed, even though the outcome isn't as expected: they don't appear in the email body, which suggests that the text processor accepts them but doesn't handle them the way it should. Is this a bug in the component?
I am on Talend Open Studio 7.0.1, on an Ubuntu 16.04.4 VM, on a Windows 10 system (if that helps).
HTML <br> works.
I tried it earlier, but it looks like I didn't structure my HTML tags well, so it failed. I did it again from the start and got it right.
Guess what - The more you try, the more you learn. :)
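For anyone who wants to see the idea outside Talend, here is a minimal Python sketch (not the Talend component itself) of why <br> works where "\n" does not: in an HTML body, newlines are treated as ordinary whitespace, so each status line needs an explicit <br>. The function name build_status_email and the sample addresses are hypothetical.

```python
from email.message import EmailMessage
from html import escape


def build_status_email(status_lines, sender, recipient, subject):
    """Build a message whose HTML part shows one status line per row."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject

    # Plain-text fallback: here "\n" really is a line break.
    msg.set_content("\n".join(status_lines))

    # HTML part: newlines collapse to whitespace, so join with <br>.
    html_body = "<br>\n".join(escape(line) for line in status_lines)
    msg.add_alternative(
        f"<html><body><p>{html_body}</p></body></html>", subtype="html"
    )
    return msg


if __name__ == "__main__":
    message = build_status_email(
        ["Job A: OK", "Job B: FAILED"],
        "sender@example.com",
        "recipient@example.com",
        "Job status",
    )
    print(message.get_body(preferencelist=("html",)).get_content())
```

The text is escaped before the <br> tags are added, so a status line containing "<" or "&" does not break the markup.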
I'm trying to do a mail merge and am having issues at the PRE-merge stage, i.e. as soon as I write the .docx.
The xml tag in question is: «customer_name»
In the XML, the tags and single quotes behave fine...
...UNTIL a single quote sits DIRECTLY next to the mail merge delimiter,
i.e. «customer_name»’
Then, when I look at the XML, the tag is split:
<w:t>«</w:t>
<w:t>customer_name»</w:t>
<w:t>’s</w:t>
Does anyone have any pointers?
Many thanks,
Chris
I am importing a dataset from Google Cloud Storage (parameterized) into Dataprep. So far, this has worked perfectly fine, and one of the features I like is that it auto-detects that the first row of my (application/octet-stream) .csv file contains my headers.
However, today I tried to import a new dataset and it did not detect the headers; instead it auto-assigned column1, column2...
What has changed, and why is this the case? I have checked the boxes for auto-detect and UTF-8:
While the auto-detect option is usually pretty good, there are times that it fails for numerous reasons. I've specifically noticed this when the field names contain certain characters (e.g. comma, invisible characters like zero-width-non-joiners, null bytes), or when multiple different styles of newline delimiters are used within the same file.
Another case where I saw this is when there were more columns of data than there were headers.
As you already hit on, you can use the following snippet to do mostly the same thing:
rename type: header method: filter sanitize: true
. . . or make separate recipe steps to convert the first row to header and then bulk-rename to your own liking.
More often than not, however, I've found that when auto-detect fails on a previously working file, it is a sign of some issue with the source file. I would look for mismatched data and misplaced commas in the output, and compare the header and a few data rows against the original source in a plaintext editor.
When all else fails, you can try a CSV validator . . . but in my experience they tend to be incredibly opinionated about the file's formatting options, so depending on the system generating the CSV, a validator could either miss errors or give false positives. I have had two experiences where auto-detect failed for no apparent reason on perfectly clean files, so it is possible that the process was simply skipped for some reason.
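If you would rather check the specific failure causes mentioned above yourself, something like the following Python sketch may help. It is only a rough local check, not what Dataprep does internally; the function name diagnose_csv and the wording of the findings are my own.

```python
import csv
import io
import unicodedata


def diagnose_csv(text):
    """Return a list of findings that might explain a failed header detection."""
    findings = []

    # 1. Mixed newline styles within the same file.
    crlf = text.count("\r\n")
    lf = text.count("\n") - crlf
    cr = text.count("\r") - crlf
    styles = [name for name, n in (("CRLF", crlf), ("LF", lf), ("CR", cr)) if n]
    if len(styles) > 1:
        findings.append("mixed newline styles: " + ", ".join(styles))

    # 2. Invisible characters (format/control, e.g. U+200C or NUL) in the header line.
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    hidden = [c for c in first_line if unicodedata.category(c) in ("Cf", "Cc")]
    if hidden:
        codes = ", ".join(f"U+{ord(c):04X}" for c in hidden)
        findings.append("invisible characters in header: " + codes)

    # 3. Data rows with more fields than the header.
    #    NUL bytes are stripped first because older csv modules reject them.
    reader = csv.reader(io.StringIO(text.replace("\x00", ""), newline=""))
    rows = list(reader)
    if rows:
        width = len(rows[0])
        for number, row in enumerate(rows[1:], start=2):
            if len(row) > width:
                findings.append(
                    f"row {number} has {len(row)} fields but the header has {width}"
                )
    return findings


if __name__ == "__main__":
    sample = "id,na\u200cme\r\n1,a,extra\n2,b\n"
    for finding in diagnose_csv(sample):
        print(finding)
```

An empty result does not prove the file is clean; it only rules out these three causes.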
It should also be noted that if you have a structured file that was correctly detected but want to revert it, you can go to the dataset details, select the "..." (More) button, and choose "Remove structure..." (I'm hoping that one day they'll let you do the opposite when you want to add structure to a raw dataset or work around bugs like this!)
Best of luck!
Can be resolved as a transformation within a Flow:
rename type: header method: filter sanitize: true
I have used Hudson in the past and was very happy with it. It seemed to work well.
I recently installed Jenkins and set up the editable email plugin.
Jenkins Version: 1.513
Email-ext plugin version: 2.28
Unfortunately, when I try to add other tokens or override the default email, it just puts all the tokens on the same line.
This is confusing. I have the email set up for html.
Any hints on how to format this nicer?
The default email sent (not the editable one) works ok, but I would like more useful information.
Unfortunately the format of this email makes it close to useless.
Here is my editable content:
$BUILD_TAG
$BUILD_ID
$SVN_REVISION
$CHANGES
$CAUSE
$DEFAULT_CONTENT
$WARNINGS_NEW
$WARNINGS_COUNT
Here is the email received:
jenkins-DotNet-43 2013-05-13_16-09-40 7481 [kevin] -help layout Started by an SCM change DotNet - Build # 43 - Successful: Check console output at http://[buildserver]:8080/job/DotNet/43/ to view the results. [kevin] -help layout Started by an SCM change [...truncated 142 lines...] CopyFilesToOutputDirectory: Copying file from "obj\Release\Model.Wpf.dll" to "bin\Release\Model.Wpf.dll". Model.Wpf -> C:\Jenkins.jenkins\jobs\DotNet\workspace\dotnet\Messenger\Model\Model.Generic\bin\Release\Model.Wpf.dll Copying file from "obj\Release\Model.Wpf.pdb" to "bin\Release\Model.Wpf.pdb". Done Building Project "C:\Jenkins.jenkins\jobs\DotNet\workspace\dotnet\Messenger\Model\Model.Ge
EDIT
Note: when I put "<br>" entries between items, they are separated by line breaks in the email. Unfortunately, within the tokens themselves (like the change list) there are NO line separators; for example, multiple commits are all listed on one line.
The content is there, but it is difficult to decipher. It seems there is a bug in the mail plugin or some other related system.
You already noticed that you need to actually use HTML line breaks between tokens so they don't show up on the same line, so I'll just answer the part about the multiple change log entries on the same line.
From the Content Token Reference, bold emphasis mine:
${CHANGES, showPaths, showDependencies, format, pathFormat}
Displays the changes since the last build.
showDependencies - if true, changes to projects this build depends on are shown.
Defaults to false.
showPaths - if true, the paths modified by a commit are shown.
Defaults to false.
format - for each commit listed, a string containing %X, where %X is one of %a for author, %d for date, %m for message, %p for paths, or %r for revision. Not all revision systems support %d and %r. If specified, showPaths is ignored.
Defaults to "[%a] %m\n".
pathFormat - a string containing %p to indicate how to print paths.
Defaults to "\t%p\n".
The unparameterized ${CHANGES} token is set up for display in a plain text email. You need to configure it so it displays properly in an HTML environment.
Example: <ul>${CHANGES, format="<li>[%a] %m</li>"}</ul>
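To illustrate why the default format runs together in an HTML email, here is a small Python imitation of the expansion. This is not the plugin's code: render_changes and collapse_whitespace are hypothetical helpers, only %a and %m are handled, and the second commit message is made-up sample data.

```python
def render_changes(commits, fmt="[%a] %m\n"):
    """Expand a CHANGES-style format string once per (author, message) pair."""
    return "".join(
        fmt.replace("%a", author).replace("%m", message)
        for author, message in commits
    )


def collapse_whitespace(markup):
    """Rough stand-in for HTML rendering: whitespace runs become one space."""
    return " ".join(markup.split())


if __name__ == "__main__":
    commits = [("kevin", "-help layout"), ("kevin", "fix build")]

    # Default format: the "\n" separators vanish once rendered as HTML.
    print(collapse_whitespace(render_changes(commits)))

    # HTML-aware format: each commit becomes its own list item.
    print("<ul>" + render_changes(commits, "<li>[%a] %m</li>") + "</ul>")
```

The list-item format keeps each commit separate because the separation is carried by tags rather than by newline characters.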
One may try
mimeType:'HTML/text'
with the emailext plugin and use HTML <br> tag for new lines.
Surprisingly mimeType:'text/html' didn't work in my case whereas mimeType:'HTML/text' did.
I'm not sure how this can be a FOP issue, but I've never seen it with PDFs from any other source, so I've tried to investigate further.
Our application creates PDFs via xsl-fo, using FOP. This has worked great for a couple of years -- occasionally a user will have trouble printing a specific document, and see a very particular type of corruption, wherein most characters are "incremented". That is to say, 1 becomes 2, M becomes N, period becomes a slash, and the word invoice becomes the mildly amusing "jowpjdf". The document displays fine (typically in Adobe Reader). We've generally worked around it, but now an even odder case presents itself.
A new addition to our application creates 2 substantially similar PDFs created with FOP, then concatenates them using Perl's PDF::Reuse to grab the files from the filesystem and create a new document, which is then sent to the user by email. User opens document fine in Reader, hits print, and something new happens... Page 1 prints perfectly, but page 2 is corrupt in exactly the manner described above.
If it was a consistent print driver issue, I'd expect to see both pages corrupted. If it was a FOP issue, likewise. If it was a PDF::Reuse issue, I'd expect to see more fundamental breakage, and this breakage is not new since we started concatenating documents. I'm at a loss where to investigate next.
Has anyone seen similar corruption in PDFs, especially when generating using Apache FOP?
tl;dr PDFs created using FOP sometimes print with every character shifted by 1, e.g. A->B, 3->4
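To make the pattern concrete: every example above is consistent with each character's code being increased by exactly one, regardless of whether it is a letter, digit or punctuation mark. A tiny Python sketch (shift_text is just an illustrative helper and says nothing about the cause):

```python
def shift_text(text, offset):
    """Shift every character's code point by a fixed offset."""
    return "".join(chr(ord(ch) + offset) for ch in text)


if __name__ == "__main__":
    print(shift_text("invoice", 1))   # what the printer produces
    print(shift_text("jowpjdf", -1))  # undoing the shift
    print(shift_text("1M.", 1))       # digit, letter and period
```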
After a lot of experiments, I still can't get the following script working. I need some guidance on how to diagnose this particular Perl problem. Thanks in advance.
This script is for testing the use of Office 2007 OCR API:
use warnings;
use strict;
use Win32::OLE;
use Win32::OLE::Const;
Win32::OLE::Const->Load("Microsoft Office Document Imaging 12\.0 Type Library")
or
die "Cannot use the Office 2007 OCR API";
my $miDoc = Win32::OLE->new('MODI.Document')
or die "Cannot create a MODI object";
#Loads an existing TIFF file
$miDoc->Create('OCR-test.tif');
#Performs OCR with the OCR language set to English
$miDoc->OCR(LangId => 'miLANG_ENGLISH');
#Get the OCR result
my $OCRresult = $miDoc->{Images}->Item(0)->{Layout}{Text};
print $OCRresult;
I did a small test: I loaded an .MDI file that already contained OCR information, deleted the OCR method line, ran the script, and got the expected text output from "print $OCRresult". With the OCR line in place, however, Perl throws the following error:
Use of uninitialized value $OCRresult in print at E:\OCR-test.pl line 15
I'm suspecting that something's wrong with the line
$miDoc->OCR(LangId => 'miLANG_ENGLISH');
I tried leaving the parens empty or using three parameters, like 'miLANG_ENGLISH',1,1 etc., but without any luck.
I also tried using Microsoft Office Document Imaging to test whether the text in the TIF I'm experimenting with was recognizable, and the result was positive.
So what other diagnostic methods do I have?
Or can someone who happens to have Office 2007 test my code with any JPG, BMP or TIF picture that has text content and see if something's wrong?
Thanks in advance.
UPDATE
Haha, I've finally figured out where the problem is and how I can solve it. @hobbs, thank you for leaving the comment :) Things are interesting. While I was trying to respond to your comment, I added the link to the Office Document Imaging 2003 VBA Language Reference and took yet another look at the material there. The following information caught my eye:
LangId can be one of the following MiLANGUAGES constants.
miLANG_CHINESE_SIMPLIFIED (2052, &H804)
I changed the following OCR method line:
$miDoc->OCR('miLANG_ENGLISH',1,1);
to this:
$miDoc->OCR(2052,1,1);
A few notes:
1. I'm running ActivePerl 5.10.0 on Windows XP (Chinese version)
2. Before this, I already tried $miDoc->OCR(9) but without luck
And suddenly, and kind of magically, that pesky ERROR saying "Use of uninitialized value $OCRresult in print at E:\OCR-test.pl line 15" disappeared completely and the OCRed text appeared on the screen. The OCR result was not satisfying, but then the parameter "2052" refers to Chinese and the TIF image contains only English. So I changed the parameter to
$miDoc->OCR(9,1,1) but this time without luck. Windows threw me this error:
unknown software exception (0x0000000d)
I changed the TIF image to one that contains all Chinese characters and changed the parameter to "$miDoc->OCR(2052,1,1);" again and this time everything worked just like expected. The OCR result was satisfying.
Now I think there's something weird about my Office 2007 OCR API, and someone who happens to run Windows XP (English version) with Office 2007 installed would probably not encounter that exception error with the parameter
$miDoc->OCR(9,1,1);
Anyway, I'm really happy that I've finally got things working :D
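To summarize what seems to have been going on: the OCR method apparently wants the numeric value of the MiLANGUAGES constant, not the constant's name passed as a string. A language-neutral way to picture the lookup, sketched in Python rather than Perl (the table holds only the two values that came up in this thread, and resolve_lang_id is a hypothetical helper, not part of the MODI API):

```python
# Numeric values of the two MiLANGUAGES constants discussed above.
MI_LANGUAGES = {
    "miLANG_ENGLISH": 9,
    "miLANG_CHINESE_SIMPLIFIED": 2052,  # &H804
}


def resolve_lang_id(lang):
    """Turn a constant name (or an already numeric id) into the numeric LangId."""
    if isinstance(lang, int):
        return lang
    try:
        return MI_LANGUAGES[lang]
    except KeyError:
        raise ValueError(f"unknown MiLANGUAGES constant: {lang!r}") from None


if __name__ == "__main__":
    print(resolve_lang_id("miLANG_CHINESE_SIMPLIFIED"))
    print(hex(resolve_lang_id("miLANG_CHINESE_SIMPLIFIED")))
```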
For starters I would try dumping the value of $miDoc->{Images} -- does it exist? If it exists and it's a collection does it contain anything? If it contains anything, what is it? An error? Or maybe just a different structure than you're expecting? warn, Dumper, and a little exploration can go a long way.
Incidentally, if you want to do the "modern" thing and don't mind grabbing a nifty tool off of CPAN, try Devel::Dwarn -- it makes dumping to stderr even more fun than it was already :)