Perl - Scanning CSV files for rows that match user-specified criteria?

I am trying to write/learn a simple Perl parser for some CSV files that I have and I need some help.
In a directory I have a series of date-indexed CSV files named in the form Text-Date.csv. The date is in the form Month-DD-YYYY (e.g., January-07-2011). One CSV file is generated for each weekday.
The Perl script should open each file, look for the row that matches a user-entered criterion, and return that row. Each row holds price data for one stock, with different stocks in different rows. The script should return the price of a particular stock (e.g., IBM) across all dates for which CSVs were generated.
I have the parser working for a specific CSV/date that I choose, but I want to be able to pluck out the row from all CSVs. Also, when I print the IBM price for each dated CSV, I want to display the date next to the price (e.g., January-07-2011 IBM 147.93).
Can you help me get this done?

If your question is how to crawl a bunch of files and run some function on each one, you probably want File::Find. To parse CSV, definitely use Text::xSV and not a custom parser. There is more to parsing CSV than calling split(",").

To parse CSV files, use the Text::CSV module.
It is more complex to decide how you are going to apply the criteria - you'll need to determine what the user specifies and work out how to translate that into Perl code that evaluates the condition correctly.
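To make the overall shape concrete, here is a minimal sketch of the loop described above: walk the directory, pull the date out of each file name, parse the file with a real CSV parser, and keep the rows that match. It is written in Python rather than Perl purely as an illustration; the same structure maps onto File::Find plus Text::CSV. The file name pattern, the column positions (symbol in column 0, price in column 1), and the function names are all assumptions, not something the question specifies.

```python
import csv
import re
from pathlib import Path

# Assumed file name shape: <Text>-<Month>-<DD>-<YYYY>.csv
DATE_IN_NAME = re.compile(r"([A-Za-z]+-\d{2}-\d{4})\.csv$")


def find_rows(directory, symbol, symbol_col=0):
    """Yield (date_string, row) for every row whose symbol column matches."""
    for path in sorted(Path(directory).glob("*.csv")):
        match = DATE_IN_NAME.search(path.name)
        if match is None:
            continue  # skip files that do not follow the naming scheme
        with path.open(newline="") as handle:
            for row in csv.reader(handle):
                if len(row) > symbol_col and row[symbol_col] == symbol:
                    yield match.group(1), row


def report(directory, symbol, price_col=1):
    """Return lines such as 'January-07-2011 IBM 147.93'.

    Assumes every matching row actually has a price column.
    """
    return [f"{date} {symbol} {row[price_col]}"
            for date, row in find_rows(directory, symbol)]
```

Calling report("some/dir", "IBM") would give one line per dated file that contains an IBM row. Note that the files are visited in alphabetical order of file name, which is not chronological for month names; parse the date string if you need a true date ordering.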

Related

Dataprep: Invalid array type after running a job to an Excel file

I am using an array-type column in Dataprep, and it looks fine in the Dataprep UI, as in the picture below.
But when I run a job with .csv output, the array column contains invalid values.
Why does the .csv output differ from what Dataprep displays?
Array in Dataprep display
Array in csv output
It looks like these two columns each contain the complete record...? I also see some non-English characters in there. I suspect something to do with line breaks and/or encoding.
What do you see if you open the CSV file in a plaintext editor, instead of Excel?
What edition of Dataprep are you using (click Help => About Dataprep => see the Edition heading)?
What version of Excel are you using to open the CSV file?
Assuming that this is a straightforward flow with a single dataset and recipe, could you post a few rows of data and the recipe itself (which you can download), for testing purposes?

jaspersoft print specific records from csv file

I have a Label set up in Jaspersoft Studio that references a data adapter file in CSV format. The csv file contains thousands of records. I want the end user to be able to select or key enter the "order no" for the specific records to print. If one order no is entered - record is found and printed. If ten order no's are entered - 10 records will print.
Thank You.
You could use parameters to take input from the user. Because a CSV data source has no query language, you apply the parameter through a filter expression on the data source instead.
Here's how to add a parameter:
https://community.jaspersoft.com/wiki/using-report-parameters
And this is how to use a filter expression with a CSV data source:
https://community.jaspersoft.com/wiki/how-apply-parameters-csv-data-source
Details: filter by a single attribute:
In the query dialog, select the Filter Expression tab and fill in the field with code like this:
$F{order_no}.equals($P{order_no})
This expression keeps only the CSV rows whose order_no field equals the order_no parameter.
Filter by multiple order_no values:
JasperReports lets users write expressions in Java or Groovy (depending on the report's settings), so you can do more complicated things, such as splitting the input into an array and using it to match rows.
What I have in mind is to ask the end user to separate the order numbers with spaces and use this expression to filter the data:
Arrays.asList($P{order_no}.split(" ")).indexOf($F{order_no}) > -1
I have not tested this code yet, but hopefully you get the idea. (Experiment with the script).
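The logic of that multi-value filter can be checked outside the report. The sketch below mirrors it in Python only to show the idea (split the input, then test membership per row); it is not Jaspersoft code, and the row layout and function names are made up for the illustration.

```python
def parse_order_numbers(raw):
    """Split the user's input on any run of whitespace and drop empties."""
    return set(raw.split())


def keep_row(row, wanted):
    """Mirror of the filter expression: keep rows whose order_no was requested."""
    return row["order_no"] in wanted


# Hypothetical records standing in for the CSV rows.
rows = [{"order_no": "1001"}, {"order_no": "1002"}, {"order_no": "1003"}]
wanted = parse_order_numbers("1001  1003")  # note the accidental double space
selected = [r for r in rows if keep_row(r, wanted)]
```

One thing this makes visible: splitting on a single literal space, as in the Java expression, produces empty strings when the user types two spaces in a row. That is usually harmless, but it would match a row whose order_no is empty.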

MS Access importing dates

At the end of importing a .txt file with the wizard, I get a message that some elements were not imported correctly. The .txt file has a column that should contain dates, but when I select that column and set its type to Date/Time, Access does not recognize the values as dates. I think it's because of the regional difference: I use dates like 1.1.2011, whereas Access expects 1/1/2011.
Where can I change the format?
You can change it in the Advanced section of the Import Wizard.
If that doesn't work, don't import but link the file and specify the date field as text.
Then create a simple select query where you use the linked table as source. Select all the fields you need.
For the date field, use this expression:
TrueDate: CDate(Replace([YourTextDateField], ".", "/"))
Clean up other fields as well.
Now use this query for the further processing of the data.
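As far as I know, CDate interprets the day/month order according to the machine's regional settings, so it is worth being explicit about which part is the day. The sketch below shows the same conversion in Python with the order spelled out; it assumes the text is day.month.year, which the question does not actually state, and the function name is hypothetical.

```python
from datetime import datetime


def parse_dotted_date(text):
    """Parse 'D.M.YYYY' text (day first, by assumption) into a date object."""
    return datetime.strptime(text.strip(), "%d.%m.%Y").date()
```

For example, parse_dotted_date("13.2.2011") gives 13 February 2011, and a value that does not fit the pattern raises ValueError instead of being silently misread.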

Data Type Cast Won't Stick in SSIS

I'm trying to automate a process with SSIS that exports data into a flat file (.csv) that is then saved to a directory, where it will be scanned and imported by some accounting software. The software (unfortunately) only recognizes dates that are in MM/DD/YYYY fashion. I have tried every which way to cast or convert the data pulled from SQL to be in the MM/DD/YYYY, but somehow the data is always recognized as either a DT_Date or DT_dbDate data type in the flat file connection, and saved down as YYYY-MM-DD.
I've tried various combinations of data conversion, derived columns, and changing the properties of the flat file columns to string in hopes that I can at least use substring operations to get this formatted correctly, but it never fails to save down as YYYY-MM-DD. It is truly baffling. The preview in the OLE DB source will show the dates as "MM/DD/YYYY" but somehow it always changes to "YYYY-MM-DD" when it hits the flat file.
I've tried to look up solutions (for example, here: Stubborn column data type in SSIS flat flat file connection manager won't change. :() but with no luck. Amazingly if I merely open the file in Excel and save it, it will then show dates in a text editor as "MM/DD/YYYY", only adding more mystery to this Bermuda Triangle-esque caper.
If there are any tips, I would be very appreciative.
This is a date formatting issue.
In SQL and in SSIS, dates have one literal string format, and that is YYYY-MM-DD. Ignore the way they appear to you in the data previewer and/or Excel. Dates are displayed to you based upon your Windows regional preferences.
For example, unlike in the US, folks in the UK will see all dates as DD/MM/YYYY. The way dates are shown to us is NOT the way they are stored on disk. When you open the file in Excel, it does this conversion as a favor. It's not until you SAVE that the dates are stored, as text, according to your regional preferences.
To get dates to always display the same way, we need to save them not as dates but as strings of text. To do this, we have to get the data out of a date column (DT_DATE or DT_DBDATE) and into a string column (DT_STR or DT_WSTR). Then, map this new string column into your csv file. There are two ways to do this "date-to-string" conversion...
First, have SQL do it. Update your OLE DB Source query and add one more column...
SELECT
*,
CONVERT(VARCHAR(10), MyDateColumn, 101) AS MyFormattedDateColumn
FROM MyTable
The other way is let SSIS do it. Add a Derived Column component with the expression
SUBSTRING([MyDateColumn],6,2) + "/" + SUBSTRING([MyDateColumn],9,2) + "/" + SUBSTRING([MyDateColumn],1,4)
Map the string columns into your csv file, NOT the date columns. Hope this helps.
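For anyone who wants to sanity-check the target format before wiring up the package, here is a small Python sketch of both ideas: formatting a real date value as MM/DD/YYYY, and rearranging a YYYY-MM-DD string by position. It is only an illustration of the conversion, not SSIS code, and the function names are made up. Keep in mind that SSIS SUBSTRING positions are 1-based while Python slices are 0-based.

```python
from datetime import date


def to_us_string(value):
    """Format a date value as zero-padded MM/DD/YYYY text."""
    return value.strftime("%m/%d/%Y")


def rearrange_iso(text):
    """Rearrange 'YYYY-MM-DD' text into 'MM/DD/YYYY' by character position."""
    # 0-based slices: month is chars 5-6, day is chars 8-9, year is chars 0-3.
    return text[5:7] + "/" + text[8:10] + "/" + text[0:4]
```

Both functions return plain text, which is the point of the answer above: once the value is a string, nothing downstream can reformat it according to regional settings.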
It's been a while but I just came across this today because I had the same issue and hope to be able to spare someone the trouble of figuring it out. What worked for me was adding a new field in the Derived Column transform rather than trying to change the existing field.
Edit
I can't comment on Troy Witthoeft's answer, but wanted to note that if you have a Date type input, you wouldn't be able to do SUBSTRING. Instead, you could use something like this:
RIGHT("0" + (DT_WSTR,2)(MONTH([Visit Date])), 2) + "/" + RIGHT("0" + (DT_WSTR,2)(DAY([Visit Date])), 2) + "/" + (DT_WSTR,4)(YEAR([Visit Date]))

Changing text in CSV file using Perl

I know nothing about Perl. I have looked at some online tutorials and am at a loss for the following.
I do a query in PostgreSQL that saves to a CSV file. However, one element needs to be changed after the CSV file is created, and I have no idea how to do it.
The existing query results are like this
phone date time staff email and customer ID -- my explanation
1112223333,10/21/2013,3:00 AM,sklund#myemail.comSMIB010170 -- data in csv
After query is completed, the data in the time field must be converted to:
1112223333,10/21/2013,03:00am,sklund#myemail.comSMIB010170
As you can see, the time needs to be amended to include a leading 0 if the hour is less than ten, and the AM must be changed to am.
Is there a simple Perl script that can do this? The lines of data, of course, will be different, as each line would reflect results of the query for the day.
If someone can point me to a tutorial, link, or help in this I'd be very grateful.
This will do what you need.
I assume you want the space before AM removed as well? You don't mention it in your question.
perl -pe 's/,(\d{1,2}):(\d\d)\s+([AP]M),/sprintf ",%02d:%02d%s,",$1,$2,lc $3/ei' mylogfile > newlogfile
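In case the one-liner is hard to read, here is the same substitution spelled out as a Python sketch. It uses the same pattern: a comma, the hour, a colon, the minutes, whitespace, AM or PM in any case, and a closing comma. The function name is made up, and count=1 is used so that, like the Perl version without /g, only the first match on each line is rewritten.

```python
import re

# Same pattern as the Perl one-liner; IGNORECASE corresponds to the /i flag.
TIME_FIELD = re.compile(r",(\d{1,2}):(\d\d)\s+([AP]M),", re.IGNORECASE)


def normalize_time(line):
    """Zero-pad the hour, drop the space, and lowercase AM/PM."""
    def repl(match):
        hour = int(match.group(1))
        return ",%02d:%s%s," % (hour, match.group(2), match.group(3).lower())
    return TIME_FIELD.sub(repl, line, count=1)
```

Because the pattern requires a comma on both sides, it only touches a time that sits in its own field; lines without such a field pass through unchanged.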