How to append a CSV file into a DBF?

Do you know how to append a .csv file into a DBF by using a FoxPro command, instead of using the Import Wizard?
I have tried the code below, but it is not working.
Code:
Append from getfile() Delimited With ,

You can use append from:
local lcFileName
lcFileName = getfile()
if !empty(m.lcFileName)
    append from (m.lcFileName) type delimited
endif
This would append everything, including the header line. However, if you have a field that is not of character type, you can use it to skip the header line. For example, say there is a column named theDate of type Date that should never be empty; then you could say:
append from (m.lcFileName) type delimited for !empty(theDate)
Or, if you are appending into an empty cursor, you could say ... for recno() > 1.

Related

Getting Python to accept a CSV into a PostgreSQL table with ":" in the headers

I receive a .csv export every 10 minutes that I'd like to import into a PostgreSQL server. Working with a test CSV, I got everything to work, but I didn't notice that my actual CSV file has a forced ":" at the end of each column header (but not on the first header, for some reason). It is built into the back end of the exporter, so I can't get it removed; I already asked the company. So I added the ":"s to my test CSV to match.
My INSERT INTO statements no longer work and give me syntax errors. I'm trying to add the rows using the following code:
print("Reading file contents and copying into table...")
with open('C:\\Users\\admin\\Desktop\\test2.csv') as csvfile:
readCSV = csv.reader(csvfile, delimiter=',')
columns = next(readCSV) #skips the header row
query = 'insert into test({0}) values ({1})'
query = query.format(','.join(columns), ','.join('?' * len(columns)))
for data in readCSV:
cursor.execute(query, data)
con.commit()
This results in a '42601' syntax error near the ":" in the second column header.
The results are the same when I explicitly list the column headers and the ? placeholders in the INSERT INTO statement.
What is the syntax to get the script to accept ":" in the column headers? If there's no way, is there a way to scan through the headers and remove the ":" at the end of each?
Because : is a special character, if your column is named year: in the DB, you must double quote its name --> select "year:" from test;
You are getting a PG error because you are referencing the unquoted column name (insert into test({0})), so add double quotes there.
query = 'insert into test("year:","day:", "etc:") values (...)'
That being said, it might be simpler to remove every occurrence of : from your CSV's first line.
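If you do want to keep the colons, here is a minimal sketch of the quoting approach, assuming psycopg2 is the driver (the question's ? placeholders suggest a different driver may be in use, in which case only the quoting idea carries over); cursor and con are the question's existing connection objects:
import csv
from psycopg2 import sql

with open('C:\\Users\\admin\\Desktop\\test2.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    columns = next(readCSV)  # header row, e.g. ['year:', 'day:', ...]

    # Let psycopg2 double-quote every header so the trailing ':' is accepted.
    query = sql.SQL('insert into test ({}) values ({})').format(
        sql.SQL(', ').join(map(sql.Identifier, columns)),
        sql.SQL(', ').join(sql.Placeholder() * len(columns)))

    for data in readCSV:
        cursor.execute(query, data)
con.commit()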
Much appreciated JGH and Adrian. I went with your suggestion to remove every occurrence of : by adding the following line after the first columns = ... statement:
columns = [column.strip(':') for column in columns]
It worked well.

Import length-delimited file with PowerShell and export as csv file

I have a source file which is in .txt format. It looks like a semi-colon separated file:
100;200;ThisisastringcolumnA;4;
101;400;Thisisastringc;lumnA;5;
102;600;ThisisastringcolumnB;6;
104;600;Thisisa;;ringcolumnB;6;
However, the fields are determined by length, so it is really a length-delimited (fixed-width) file.
The first column, for example, runs from the first character to the third (100), and then a semi-colon follows.
The second column starts at the 5th position (inclusive) and runs until the 7th position (inclusive). A string column can itself contain a semi-colon.
Now I want to import this length-delimited txt file with PowerShell and export it as a CSV file that is genuinely semi-colon separated. The result should look like:
100;200;ThisisastringcolumnA;4;
101;400;"Thisisastringc;lumnA";5;
102;600;ThisisastringcolumnB;6;
104;600;"Thisisa;;ringcolumnB";6;
But I simply have no idea how to do it. I googled it, but I did not find many useful code examples for importing length-delimited txt files with PowerShell.
Unfortunately, I cannot use Python. I am not sure whether this task is even possible with PowerShell, because when exporting, PowerShell also needs to recognize that there are string values containing the separator, so it has to pay attention to the quoting: "Thisisa;;ringcolumnB". It would also be fine for me if the whole column were quoted, so that every entry in a string column gets quotes added.
You can use regex to describe a string in which the 3rd "column" contains a ; and then inject the quotation marks with the -replace operator:
$lines = Get-Content path\to\file.txt
@($lines) -replace '(.{3});(.{3});(.{20}(?<=;.{0,19}));(.);', '$1;$2;"$3";$4;'
The expression (.{20}(?<=;.{0,19})) matches the 20-character 3rd column value only if it contains at least one semi-colon, so lines with no semi-colon in that column are left alone:
# let's try it out with your test data
$lines = @'
100;200;ThisisastringcolumnA;4;
101;400;Thisisastringc;lumnA;5;
102;600;ThisisastringcolumnB;6;
104;600;Thisisa;;ringcolumnB;6;
'@ -split '\r?\n'
@($lines) -replace '(.{3});(.{3});(.{20}(?<=;.{0,19}));(.);', '$1;$2;"$3";$4;'
Which yields the following four strings:
100;200;ThisisastringcolumnA;4;
101;400;"Thisisastringc;lumnA";5;
102;600;ThisisastringcolumnB;6;
104;600;"Thisisa;;ringcolumnB";6;
To write the output back to a file, use Set-Content:
@($lines) -replace '(.{3});(.{3});(.{20}(?<=;.{0,19}));(.);', '$1;$2;"$3";$4;' | Set-Content path\to\fixed_output.scsv

Adding field delimiter ";" after the last column of the header line

I'm new to DataStage and trying to create a sequential file with ";" as the delimiter.
I would like to add the delimiter just after the last column in the header.
Please see the example below for a better understanding.
Currently I have this in my sequential file:
SERVICE_ID;OFFER_ID;MINIMUM;MAXIMUM
19441;162887;;;
19442;162889;;;
Expected result, with the delimiter after the last column in the header:
SERVICE_ID;OFFER_ID;MINIMUM;MAXIMUM;
19441;162887;;;
19442;162889;;;
How can I do that, please?
Use the Final Delimiter property in the Sequential File stage format properties.

Converting CSV to Parquet in Spark gives an error if CSV column headers contain spaces

I have a CSV file which I am converting to Parquet files using the Databricks library in Scala. I am using the code below:
val spark = SparkSession.builder().master("local[*]").config("spark.sql.warehouse.dir", "local").getOrCreate()
var csvdf = spark.read.format("org.apache.spark.csv").option("header", true).csv(csvfile)
csvdf.write.parquet(csvfile + "parquet")
Now the above code works fine if I don't have spaces in my column headers. But if any CSV file has spaces in the column headers, it doesn't work and errors out stating invalid column headers. My CSV files are delimited by ,.
Also, I cannot change the spaces in the CSV's column names. The column names have to stay as they are, even if they contain spaces, because they are given by the end user.
Any idea on how to fix this?
per @CodeHunter's request
Sadly, the Parquet file format does not allow for spaces in column names;
the error that it'll spit out when you try is: contains invalid character(s) among " ,;{}()\n\t=".
ORC also does not allow for spaces in column names :(
Most SQL engines don't support column names with spaces, so you'll probably be best off converting your columns to your preference of foo_bar or fooBar, or something along those lines.
I would rename the offending columns in the DataFrame, changing spaces to underscores, before saving. This could be done with select "foo bar" as "foo_bar" or with .withColumnRenamed("foo bar", "foo_bar").
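The question is in Scala, but for illustration here is a minimal PySpark sketch of the same rename-before-write idea (the file paths are placeholders, not from the question):
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").getOrCreate()
csvdf = spark.read.option("header", True).csv("input.csv")

# Replace spaces with underscores in every column name, since Parquet
# rejects names containing characters from " ,;{}()\n\t=".
for name in csvdf.columns:
    if " " in name:
        csvdf = csvdf.withColumnRenamed(name, name.replace(" ", "_"))

csvdf.write.parquet("output.parquet")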

Python PostgreSQL using copy_from to COPY list of objects to table

I'm using Python 2.7 and psycopg2 to connect to my DB server (PostgreSQL 9.3), and I have a list of objects of a Product class that holds the items I want to insert:
products_list = []
products_list.append(product1)
products_list.append(product2)
And I want to use copy_from to insert this products list into the product table. I tried some tutorials, and I had a problem converting the products list to CSV format because the values contain single quotes, newlines, tabs and double quotes. For example (Product Description):
<div class="product_desc">
Details :
Product's Name : name
</div>
The escaping corrupted the HTML code by adding a single quote before every single quote. So I need a safe way to convert the list into CSV in order to COPY it, or any other way to insert the list without converting it to CSV format at all.
I figured it out. First of all, I created a function to convert my object to a CSV row:
import csv

@staticmethod
def adding_product_to_csv(item, out):
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, quotechar='"',
                        delimiter=',', lineterminator="\r\n")
    writer.writerow([item.name, item.description])
Then in my code I created a CSV file using Python I/O to store the data to be COPYed, and wrote every object to the CSV file using my previous function:
file_name = "/tmp/file.csv"
myfile = open(file_name, 'a')
for item in object_items:
adding_product_to_csv(item, myfile)
Now the CSV file is created and ready to be copied into the table using copy_expert, which exists in psycopg2:
# The file must be closed first so that buffered writes are flushed to disk
myfile.close()
cursor.copy_expert("COPY products(name, description) from stdin with delimiter as ',' csv QUOTE '\"' ESCAPE '\"' NULL 'null' ", open(file_name))
conn.commit()
# Clearing the file
open(file_name, 'w').close()
And it's working now.