Does anyone know how I can determine, within a DataStage job, whether the input sequential file has Microsoft (CRLF) or Unix (LF) end-of-line markers, so that the result can direct the path through the rest of the job?
Thanks
If you just want to make sure you can read all the data regardless of the line endings, this might be useful: it lets you do it in one job instead of maintaining separate jobs for DOS and UNIX files. So this answer does not match the exact question, but it might still help.
In the Sequential File Stage, you can define a filter sed 's/\r//' to convert DOS (Windows) line breaks (\r\n) to UNIX/Linux line breaks (\n).
In the Format tab, define UNIX newline as Record delimiter.
Note that defining a filter in a Sequential File stage has some drawbacks:
The Option First Line is Column Names is not available when using a filter.
If your input file has column names in the first line, you need to remove it manually by extending the filter with a tail command: sed 's/\r//' | tail -n +2
The Option Report Progress is not available when using a filter.
You might want to set the type of the last column in your column definition to VarChar or NVarChar; I've seen strange behaviour when defining fixed-length types like NChar or Char as the last column's type. I didn't research this more deeply, but I believe it has to do with using a filter.
Tested in DS 11.7.1
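To address the detection part of the original question: outside DataStage, e.g. in a script launched by a before-job command, you could sniff the file for CRLF and branch on the result. A minimal Python sketch (the function name and the branching idea are my own illustration, not a DataStage feature):

```python
def detect_line_endings(path, sample_bytes=65536):
    """Return 'dos' if a sample of the file contains CRLF line endings, else 'unix'."""
    with open(path, "rb") as f:          # binary mode so \r\n is not translated
        sample = f.read(sample_bytes)    # a sample is enough; no need to read a huge file
    return "dos" if b"\r\n" in sample else "unix"
```

The script's result (e.g. its exit code or output) could then be used to set a job parameter that steers the downstream path.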
I am wondering if I have some code with line numbers embedded,
1 int a;
2 MyC b;
3 YourC c;
etc., and then I copy and paste them into Eclipse, how can I get rid of these line numbers to make the source code valid? Is there any convenient way, or a shortcut key?
Thank you.
Simply use the Alt+Shift+A (Eclipse 3.5 M5 and above) shortcut to toggle block selection mode. Then select the column with line numbers and delete it!
To make it easier you could set up a macro, but that requires an additional plug-in. I'm not aware of an even easier way.
Try this link. It is an online tool where it is very easy to just copy-paste code and get it back without line numbers:
http://remove-line-numbers.ruurtjan.com/
You could use a script to do the work. For instance, with sed, something like sed 's/^[ \t]*[0-9]\+ \?//' numbered.c > clean.c strips a leading line number (and one following space) from each line; the exact pattern depends on how the numbers are formatted.
I removed the line numbers with find-and-replace using the regular-expression option: replace \d+\s\s with an empty string, where \d+ means one or more digits and \s is a space. (Requiring the trailing spaces avoids matching numbers that appear inside the code itself.)
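The same find-and-replace can be scripted. A small Python sketch of that idea, anchoring the pattern at the start of each line so numbers inside the code are left alone (the sample input is taken from the question):

```python
import re

numbered = """1 int a;
2 MyC b;
3 YourC c;"""

# (?m) makes ^ match at the start of every line; each leading run of
# digits plus the whitespace after it is removed.
clean = re.sub(r"(?m)^\d+\s+", "", numbered)
print(clean)
```

This prints the three declarations with the line numbers stripped.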
Another way is a sed-style substitute command in the vi editor. Open the copied code in vi and run:
:%s/^[0-9]\+\s*//
This removes any run of leading digits (and the whitespace after them) from every line, so it works no matter how many lines the file has or how many digits the numbers use. (Note: a character class like [0-9|10-99|100-999] does not express numeric ranges; inside brackets those are just individual characters, which is why the \+ repetition is the right tool here.)
ask_question MC16_Phase2 : 3156 occurences (100.00%) : module abc_testbench/abc_top_0/abc**
This statement is in a file. There are multiple entries of this statement and other stuff is also present. I need to read it from there and put it in another file in the following manner:
3156 abc_testbench/abc_top_0/abc**
Fixed entities in that statement are:
ask_question
occurences
module
Could you please elaborate on that? I am new to Perl; could you please walk me through the whole scenario from the very beginning, from reading the file to extracting the pieces in the given manner. Thanks Ray Toal.
You will want a regex with two capturing groups. Based on the information given, the regex would be:
/ask_question[^:]*:\s*(\d+)\s*occurences[^:]*:\s*module\s*([^*]*\*\*)/
Apply this regex throughout the input, and write the captures, separated by a space, to your output file.
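For illustration, here is that extraction sketched in Python (the pattern is Perl-compatible, so it carries over to a Perl script unchanged); the sample line comes from the question, and [^:]* skips over the percentage field up to the next colon:

```python
import re

pattern = re.compile(
    r"ask_question[^:]*:\s*(\d+)\s*occurences[^:]*:\s*module\s*([^*]*\*\*)"
)

line = ("ask_question MC16_Phase2 : 3156 occurences (100.00%) : "
        "module abc_testbench/abc_top_0/abc**")

m = pattern.search(line)          # m is None for lines that do not match
# The two captures, separated by a space, form the output-file line.
result = f"{m.group(1)} {m.group(2)}"
print(result)  # 3156 abc_testbench/abc_top_0/abc**
```

In a real script you would loop over the input file, test each line for a match, and write `result` to the output file only when the match succeeds.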
In my application I have a simple logbook where the user can save simple posts about an event. The format is like this:
Date, duration (seconds), distance (km), a comment, AND categories (a variable number between 0 and 4) AND circumstances/conditions (a variable number between 0 and 4).
An example would be:
Header of CSV file
Date,Duration,Distance,Comment
Then multiple rows like this
07.02.11,7800,300,"A comment"
07.02.11,7800,300,"A comment"
07.02.11,7800,300,"A comment"
But how can I add the categories and conditions to this format, and how would I know where the categories/conditions end in the CSV if I later want to import the file again in the application?
(I do not need help with how to save this to a file, that part is already done, but I could use guidance on how to format it, thank you)
(This seems pretty odd)
Header of CSV file
Date,Duration,Distance,Comment, Category, Category, Category, Category, Condition, Condition, Condition, Condition
Then multiple rows like this
07.02.11,7800,300,"A comment", "Categoryname", "Categoryname", "Categoryname","Categoryname", "Condtion", "Condition", "Condtion", "Condition"
(Would this be better?)
Header of CSV file
Date,Duration,Distance,Comment, Category, Condition
Then multiple rows like this
07.02.11,7800,300,"A comment", "Multiple category names separted by -", "Multiple condition names separted by -"
I think your last proposal, separating conditions or categories with a special separator symbol (the hyphen in your example), is the right one.
By the way I would suggest two extra things:
use a less common separator, i.e. something you can forbid the user from typing without really limiting their choices; the hyphen is probably a character you don't want to forbid, so use a different sequence such as three pipes (|||), which is not common.
if possible (but be careful in this case about the final destination of the CSV file) you can avoid the standard comma separator. The reason is that if a comma appears inside field content, that content must be wrapped in double quotes, which is sometimes problematic if you need to do custom parsing in other software. When I know my CSV will not be used as an import source for other software (e.g. Numbers or Excel), I prefer a different separator, e.g. a sequence of two hashes (##) or something more unusual. Note that in this case you are no longer strictly CSV compliant! But some software, like OpenOffice, is more flexible with these special formats.
The second solution you are proposing will work, but it practically defeats the purpose of using a standard format, since you will need to parse the categories and conditions yourself instead of relying on a standard CSV parser. Writing your own parser is rarely a good idea.
I would personally do this differently: not try to put everything in a single file, but have two files, one for events (each event has a unique id) and one for categories/conditions (each category/condition is associated with an event through the event's id; multiple categories/conditions for a given event appear on multiple lines sharing the same event id). Both files would be standard CSV files.
As an alternative, if you are not tied to CSV for any reason, you might consider JSON, which allows a richer set of data types, including arrays, and offers plenty of code you can reuse. This would not require much change to your code.
Another option, more "canonical" (IMO) but also more expensive in terms of code rewrite, would be using sqlite3.
If I had to choose, I would go for JSON, but I don't know if this is ok for you.
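To sketch the JSON alternative (the field names here are just examples, not a required schema): each post becomes one object, and the variable-length category/condition lists map naturally onto arrays, so there is no separator ambiguity on re-import:

```python
import json

posts = [
    {
        "date": "07.02.11",
        "duration": 7800,        # seconds
        "distance": 300,         # km
        "comment": "A comment",
        "categories": ["Categoryname1", "Categoryname2"],  # 0-4 entries
        "conditions": ["Condition1"],                      # 0-4 entries
    }
]

text = json.dumps(posts, indent=2)   # this string is what you save to the logbook file
loaded = json.loads(text)            # importing it back needs no custom parsing
```

Because the arrays carry their own boundaries, "where do the categories end" stops being a question at all.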
I am trying to query a Sybase ASA database using the dbisqlc.exe command-line on a Windows system and would like to collect the column headers along with the associated table data.
Example:
dbisqlc.exe -nogui -c "ENG=myDB;DBN=dbName;UID=dba;PWD=mypwd;CommLinks=tcpip{PORT=12345}" select * from myTable; OUTPUT TO C:\OutputFile.txt
I would prefer this command to write to stdout, but that does not appear to be an option, aside from using dbisql.exe, which is not available in my environment.
When I run it in this format, the header and data are generated, but in a format that is hard to parse.
Any assistance would be greatly appreciated.
Try adding the 'FORMAT SQL' clause to the OUTPUT statement. It will give you the select statement containing the column names as well as the data.
In reviewing the output from the following dbisqlc.exe command, it appears as though I can parse the output using perl.
Command:
dbisqlc.exe -nogui -c "ENG=myDB;DBN=dbName;UID=dba;PWD=mypwd;CommLinks=tcpip{PORT=12345}" select * from myTable; OUTPUT TO C:\OutputFile.txt
The output appears to break in odd places when viewed in text editors such as vi or TextPad; however, the output from this command actually uses fixed column widths.
The second line of the output is a row of = signs spanning the width of each column. What I did was build a "template" string based on these ='s, which can be passed to Perl's unpack function. I then use this template to build an array of column names and parse the result set with unpack.
This may not be the most efficient method however I think it should give me the results I am looking for.
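The same ruler-line trick can be sketched in Python with plain string slicing instead of Perl's unpack. This is an illustration under the stated assumptions: line 1 is the header, line 2 is runs of = separated by single spaces, and data rows follow:

```python
def parse_fixed_width(lines):
    """Parse dbisqlc-style fixed-width output using the second (ruler) line."""
    header, ruler = lines[0], lines[1]
    # Column widths come from the runs of '=' in the ruler line.
    widths = [len(run) for run in ruler.split()]

    def split_row(row):
        fields, pos = [], 0
        for w in widths:
            fields.append(row[pos:pos + w].strip())
            pos += w + 1  # +1 skips the single space between columns
        return fields

    names = split_row(header)
    return [dict(zip(names, split_row(row))) for row in lines[2:]]
```

For example, feeding it a header line, a ruler line like `==== =====`, and two data rows yields one dict per row keyed by column name, which is easy to write out to the second file in whatever layout is needed.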