Removing Unwanted commas from a csv - progress-4gl

I'm writing a program in Progress, OpenEdge, ABL, and whatever else it's known as.
I have a CSV file that is delimited by commas. However, there is a "gift message" field, and users enter messages with "commas", so now my program will see additional entries because of those bad commas.
The CSV fields are not in double quotes, so I CANNOT just use my main method, which is:
/** this next block of code will remove all unwanted commas from the data. **/
if v-line-cnt > 1 then /** we won't run this against the headers. Otherwise they will get deleted **/
assign
v-data = replace(v-data,'","',"\t") /** Here is a special technique to replace the comma delim with a tab **/
v-data = replace(v-data,','," ") /** now that we removed the comma delim above, we can remove all nuisance commas **/
v-data = replace(v-data,"\t",'","'). /** all nuisance commas are gone, we turn the tabs back to commas. **/
Any advice?
edit:
From Progress, I can call Linux commands, so I should be able to execute C++/PHP/shell scripts etc. from my Progress program. I look forward to advice; until then I shall look into using external scripts.

You are not providing quite enough data for a perfect answer but given what you say I think the IMPORT statement should handle this automatically.
In my example here commaimport.csv is a comma-separated csv-file with quotes around text fields. Integers, logical variables etc have no quotes. The last field contains a comma in one line:
commaimport.csv
=======================
"Id1", 123, NO, "This is a message"
"Id2", 124, YES, "This is a another message, with a comma"
"Id3", 323, NO, "This is a another message without a comma"
To import this file I define a temp-table matching the file layout and use the IMPORT statement with comma as delimiter:
DEFINE TEMP-TABLE ttImport NO-UNDO
FIELD field1 AS CHARACTER FORMAT "xxx"
FIELD field2 AS INTEGER FORMAT "zz9"
FIELD field3 AS LOGICAL
FIELD field4 AS CHARACTER FORMAT "x(50)".
INPUT FROM VALUE("c:\temp\commaimport.csv").
REPEAT :
CREATE ttImport.
IMPORT DELIMITER "," ttImport.
END.
INPUT CLOSE.
FOR EACH ttImport:
DISPLAY ttImport.
END.
You don't have to import into a temp-table. You could import into variables instead.
DEFINE VARIABLE c AS CHARACTER NO-UNDO FORMAT "xxx".
DEFINE VARIABLE i AS INTEGER NO-UNDO FORMAT "zz9".
DEFINE VARIABLE l AS LOGICAL NO-UNDO.
DEFINE VARIABLE d AS CHARACTER NO-UNDO FORMAT "x(50)".
INPUT FROM VALUE("c:\temp\commaimport.csv").
REPEAT :
IMPORT DELIMITER "," c i l d.
DISP c i l d.
END.
INPUT CLOSE.
This will render basically the same output.
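For comparison outside Progress (the question mentions being able to call external scripts), here is a minimal Python sketch of the same idea: when text fields are double-quoted, a CSV parser keeps embedded commas inside the field. The sample data is copied from commaimport.csv above; the function name read_quoted_csv is only illustrative.

```python
import csv
import io

# Sample data copied from the commaimport.csv example above.
SAMPLE = '''"Id1", 123, NO, "This is a message"
"Id2", 124, YES, "This is a another message, with a comma"
"Id3", 323, NO, "This is a another message without a comma"
'''

def read_quoted_csv(text):
    """Parse comma-delimited text whose character fields are double-quoted."""
    # skipinitialspace=True ignores the blank after each comma, so the
    # opening quote of the next field is still recognised as a quote.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [row for row in reader if row]

rows = read_quoted_csv(SAMPLE)
for row in rows:
    print(row)
```

Each row comes back with exactly four fields, including the line whose message contains a comma.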

You don't show what your data file looks like. But if the problematic field is the last one, and there are no quotes, then your best bet is probably to read it using IMPORT UNFORMATTED to get it a line at a time, and then split the line into fields using ENTRY(). That way you can treat everything after the nth comma as a single field no matter how many commas the line has.
For example, say your input file has three columns like this:
boris,14.23,12 the avenue
mark,32.10,flat 1, the grange
percy,1.00,Bleak house, Dartmouth
... so that column three is an address which might contain a comma and is not enclosed in quotes so that IMPORT DELIMITER can't help you.
Something like this would work in that case:
/* ...skipping a lot of definitions here ... */
input from "datafile.csv".
repeat:
import unformatted v-line.
create tt-thing.
assign tt-thing.name = entry(1, v-line, ',')
tt-thing.price = entry(2, v-line, ',')
tt-thing.address = entry(3, v-line, ',').
do v-i = 4 to num-entries(v-line, ','):
tt-thing.address = tt-thing.address
+ ','
+ entry(v-i, v-line, ',').
end.
end.
input close.
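If an external script is an option, the same "everything after the nth comma is one field" idea can be sketched in Python with a limited split. This is only an illustration under the same assumption as above (the free-text field is the last column); split_record and field_count are hypothetical names.

```python
def split_record(line, field_count=3):
    """Split on commas, keeping any extra commas inside the last field."""
    # maxsplit = field_count - 1 stops splitting once the last field starts,
    # so commas in the address stay part of the address.
    return line.rstrip("\n").split(",", field_count - 1)

# Lines copied from the three-column example above.
for sample in ("boris,14.23,12 the avenue",
               "mark,32.10,flat 1, the grange",
               "percy,1.00,Bleak house, Dartmouth"):
    print(split_record(sample))
```

Like the ENTRY() loop, this only works when the comma-bearing field is the last one on the line.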

Related

Converting numbers into timestamps (inserting colons at specific places)

I'm using AutoHotkey for this as the code is the most understandable to me. So I have a document with numbers and text, for example like this
120344 text text text
234000 text text
and the desired output is
12:03:44 text text text
23:40:00 text text
I'm sure StrReplace can be used to insert the colons in, but I'm not sure how to specify the position of the colons or ask AHK to 'find' specific strings of 6 digit numbers. Before, I would have highlighted the text I want to apply StrReplace to and then press a hotkey, but I was wondering if there is a more efficient way to do this that doesn't need my interaction. Even just pointing to the relevant functions I would need to look into to do this would be helpful! Thanks so much, I'm still very new to programming.
hfontanez's answer was very helpful in figuring out that for this problem, I had to use a loop and substring function. I'm sure there are much less messy ways to write this code, but this is the final version of what worked for my purposes:
Loop, read, C:\[location of input file]
{
{ If A_LoopReadLine = ;
Continue ; this part is to ignore the blank lines in the file
}
{
one := A_LoopReadLine
x := SubStr(one, 1, 2)
y := SubStr(one, 3, 2)
z := SubStr(one, 5)
two := x . ":" . y . ":" . z
FileAppend, %two%`r`n, C:\[location of output file]
}
}
return
Assuming that the "timestamp" component is always 6 characters long and always at the beginning of the string, this solution should work just fine.
String test = "012345 test test test";
test = test.substring(0, 2) + ":" + test.substring(2, 4) + ":" + test.substring(4, test.length());
This outputs 01:23:45 test test test
Why? Because you are temporarily creating a String object that is two characters long, and then you insert the colon before taking the next pair. Lastly, you append the rest of the String and assign the result to whichever String variable you want. Remember, the substring method doesn't modify the String object you call it on; it returns a new String object. Therefore, the variable test is unmodified until the assignment at the end.
Alternatively, you can use a StringBuilder and append each component like this:
StringBuilder sbuff = new StringBuilder();
sbuff.append(test.substring(0,2));
sbuff.append(":");
sbuff.append(test.substring(2,4));
sbuff.append(":");
sbuff.append(test.substring(4,test.length()));
test = sbuff.toString();
You could also use a "fancy" loop to do this, but I think for something this simple, looping is just overkill. Oh, I almost forgot, this should work with both of your test strings because after the last colon insert, the code takes the substring from index position 4 all the way to the end of the string indiscriminately.
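The question also asked how to "find" the six-digit numbers without selecting them by hand. One way to do that is a regular expression; the sketch below uses Python rather than AutoHotkey or Java, and assumes (as above) that the digits sit at the start of each line. The names TIMESTAMP and add_colons are illustrative.

```python
import re

# Exactly six digits at the start of a line, not followed by a seventh digit.
TIMESTAMP = re.compile(r"^(\d{2})(\d{2})(\d{2})(?!\d)", re.MULTILINE)

def add_colons(text):
    """Rewrite a leading HHMMSS block as HH:MM:SS on every line of text."""
    return TIMESTAMP.sub(r"\1:\2:\3", text)

print(add_colons("120344 text text text\n234000 text text"))
```

Lines that do not start with six digits, including blank lines, pass through unchanged.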

How do I parse out a number from this returned XML string in python?

I have the following string:
{\"Id\":\"135\",\"Type\":0}
The number in the Id field will vary, but will always be an integer with no comma separator. I'm not sure how to get just that value from that string given that it's string data type and not real "XML". I was toying with the replace() function, but the special characters are making it more complex than it seems it needs to be.
Is there a way to convert that to XML, or to something else that lets me reference the Id value directly?
Maybe use a regular expression, e.g.
import re
txt = "{\"Id\":\"135\",\"Type\":0}"
x = re.search('"Id":"([0-9]+)"', txt)
if x:
    print(x.group(1))
gives
135
It is assumed here that the ids are numeric and consist of at least one digit.
A non-regex answer:
\" is an escape sequence in Python.
So if you assign {\"Id\":\"135\",\"Type\":0} to a Python variable as an ordinary string literal, like
a = '{\"Id\":\"135\",\"Type\":0}'
the backslashes are consumed and you get
>>> a
'{"Id":"135","Type":0}'
OR
If the string already contains literal backslashes (that is, the \ characters are part of the data), then do a = a.replace("\\","") which will give you the string without the backslashes.
Now just load this string into a dict and access the element Id like below.
import json
d = json.loads(a)
d['Id']
Output:
'135'
The value is a string because it is quoted in the data; use int(d['Id']) if you need a number.
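Putting the two cases together, a minimal sketch might look like the following. The helper name extract_id is hypothetical, and it assumes the only backslashes in the data are the ones in front of the quotes.

```python
import json

def extract_id(raw):
    """Return the Id value as an int, with or without backslash-escaped quotes."""
    # Turn every literal \" into a plain " so the text becomes valid JSON.
    cleaned = raw.replace('\\"', '"')
    return int(json.loads(cleaned)["Id"])

print(extract_id(r'{\"Id\":\"135\",\"Type\":0}'))
print(extract_id('{"Id":"135","Type":0}'))
```

Both calls return the integer 135.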

Progress 4GL for each and select * from cust

I often do the following progress 4GL code
output to /OUTText.txt.
def var dRow as char.
dRow = "cmpid|CustNum|Cur".
put unformatted dRow skip.
for each Cust no-lock:
dRow = subst("&1|&2|&3", Cust.CmpId, Cust.CustNum, Cust.Curr).
put unformatted dRow skip.
end.
output close.
in order to mimic
select * from cust (in MS SQL)
My question: is there a way to make this block of code more closely resemble "select *" in 4GL, so that I don't have to type each column name and it prints all values in all columns? My thinking is something like this:
output to /OUTText.txt.
def var dRow as char.
dRow = "cmpid|CustNum|Cur".
put unformatted dRow skip.
for each Cust no-lock:
if row = 1 then do:
for each Column in Cust:
**'PRINT THE COLUMN HEADER**
end.
end.
else do:
**'PRINT EACH CELL**
end.
end.
output close.
If there is such a thing, then I don't have to keep explicit column names in dRow.
You can do what you're after if you first output all field labels (or names) and then use EXPORT to output the table content.
To change to field name instead of label: change :LABEL below to :NAME
For instance:
DEFINE VARIABLE i AS INTEGER NO-UNDO.
OUTPUT TO c:\temp\somefile.txt.
DO i = 1 TO BUFFER Customer:NUM-FIELDS.
PUT QUOTER(BUFFER Customer:BUFFER-FIELD(i):LABEL).
IF i < BUFFER Customer:NUM-FIELDS THEN
PUT UNFORMATTED ";".
ELSE IF i = BUFFER Customer:NUM-FIELDS THEN
PUT SKIP.
END.
FOR EACH Customer NO-LOCK:
EXPORT DELIMITER ";" Customer.
END.
OUTPUT CLOSE.
You could put the header part in a separate program to call dynamically every time you want to do something similar:
DEFINE STREAM str.
OUTPUT STREAM str TO c:\temp\somefile.txt.
RUN putHeaders.p(INPUT BUFFER Customer:HANDLE, INPUT ";", INPUT STREAM str:HANDLE).
FOR EACH Customer NO-LOCK:
EXPORT STREAM str DELIMITER ";" Customer.
END.
OUTPUT STREAM str CLOSE.
putHeaders.p
============
DEFINE INPUT PARAMETER phBufferHandle AS HANDLE NO-UNDO.
DEFINE INPUT PARAMETER pcDelimiter AS CHARACTER NO-UNDO.
DEFINE INPUT PARAMETER phStreamHandle AS HANDLE NO-UNDO.
DEFINE VARIABLE i AS INTEGER NO-UNDO.
DO i = 1 TO phBufferHandle:NUM-FIELDS.
PUT STREAM-HANDLE phStreamHandle UNFORMATTED QUOTER(phBufferHandle:BUFFER-FIELD(i):LABEL).
IF i < phBufferHandle:NUM-FIELDS THEN
PUT STREAM-HANDLE phStreamHandle UNFORMATTED pcDelimiter.
ELSE IF i = phBufferHandle:NUM-FIELDS THEN
PUT STREAM-HANDLE phStreamHandle SKIP.
END.
output to "somefile".
for each customer no-lock:
display customer.
end.
I wouldn't generally mention this as the embedded SQL-89 within the 4GL is the highway to hell (that dialect of SQL only works for the most basic and trivial of purposes and really shouldn't be used at all in production code), but as it happens:
output to "somefile".
select * from customer.
does just happen to work to the spec of the original question (although, like the DISPLAY solution, it also does not support a delimiter...)

Excel Export as Text Using Progress 4GL

I need help with an Excel export. I'm trying to export a column as text using Progress 4GL. The column contains numbers with a leading "0", which Excel keeps stripping when it opens the file.
I tried using the STRING function to convert the variable to a string before exporting it, but that did not work. Is there any other way to export with leading 0s?
I assume that you are saving the file in progress as a CSV and when the file is opened in Excel it loses the leading 0.
When outputting the string you can enclose it as follows so that Excel reads it in as a string.
put unformatted '="' string("00123") '"'.
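To make the resulting file content concrete, here is a small Python sketch that builds one CSV line using the same ="..." wrapper. The names excel_text and csv_line are hypothetical, and the sketch assumes the values contain no commas or double quotes, which would need extra escaping.

```python
def excel_text(value):
    """Wrap a value as ="value" so that Excel reads it as a string."""
    return '="{}"'.format(value)

def csv_line(fields):
    """Join already-formatted fields with commas (no escaping is applied)."""
    return ",".join(fields)

line = csv_line([excel_text("00123"), "Widget"])
print(line)  # ="00123",Widget
```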
If you're writing directly to Excel, you can put a ' character at the beginning of the number, and then Excel will treat the value as text rather than as a number.
You can see it in action here:
def var ch-excel as com-handle no-undo.
def var ch-wrk as com-handle no-undo.
create "Excel.Application" ch-excel no-error.
ch-excel:visible = no no-error.
ch-excel:DisplayAlerts = no no-error.
ch-wrk = ch-excel:workbooks:add.
ch-excel:cells(1,1) = "'01".
ch-wrk:SaveAs("c:\temp\test.xlsx", 51, "", "", false, false, ) no-error. /* 51 = xlOpenXMLWorkbook */
ch-excel:DisplayAlerts = yes.
ch-excel:quit().
release object ch-wrk.
release object ch-excel.
Since I've been using Excel to generate reports for a while, I've created a small library that generates an Excel file based on a temp-table definition. I think it might be helpful; you can check it out at: https://github.com/rodolfoag/4gl-excel
When you import manually into Excel, set the columns to TEXT rather than GENERAL, and the leading zero will not disappear.
You can set the format of the cell, something like this:
h-excel:Range("A12"):numberformat = FILL("0",x).
where x would be the length of the variable you want to insert.

Autohotkey: Splitting a concatenated string into string and number

I am using an input box to request a string from the user that has the form "sometext5". I would like to separate this via regexp into a variable for the string component and a variable for the number. The number then shall be used in a loop.
The following just returns "0", even when I enter a string in the form "itemize5"
!n::
InputBox, UserEnv, Environment, Please enter an environment!, , 240, 120
If ErrorLevel
return
Else
FoundPos := RegExMatch(%UserEnv%, "\d+$")
MsgBox %FoundPos%
return
FoundPos, as its name implies, contains the position of the leftmost occurrence of the needle. It does not contain anything you specifically want to match with your regex.
When passing variable contents to a function, don't enclose the variable names in percent signs (like %UserEnv%).
Your regex \d+$ will only match numbers at the end of the string, not the text before it.
A possible solution:
myText := "sometext55"
if( RegExMatch(myText, "(.*?)(\d+)$", splitted) ) {
msgbox, Text: %splitted1%`nNumber: %splitted2%
}
As described in the docs, splitted will be set to a pseudo-array (splitted1, splitted2 ...), with each element containing the matched subpattern of your regex (the stuff that is in between round brackets).
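The same pattern can be tried outside AutoHotkey. The Python sketch below uses the identical regex, (.*?)(\d+)$, to show what each capture group holds; split_text_number is an illustrative name, and returning None for input without trailing digits is an assumption about the desired behaviour.

```python
import re

def split_text_number(value):
    """Split 'sometext5' into ('sometext', 5); return None if no trailing digits."""
    # Group 1 is the lazy text part, group 2 the run of digits at the end.
    match = re.match(r"(.*?)(\d+)$", value)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

print(split_text_number("itemize5"))
print(split_text_number("sometext55"))
```

Converting the second group to an integer makes it directly usable as a loop count.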