SED and/or awk help required - sed

I am using OpenVMS but have access to versions of awk and/or sed on this platform. I wondered if anyone can help with a text-file processing job.
My file looks like this:
START-OF-DATA
Stock ID|XYZ
START-TIME 11:30
END_TIME 12:30
11:31|BID|12.5|ASK|12.7
11:34|BID|12.6|ASK|12.7
END-OF-DATA
START-OF-DATA
Stock ID|ABC
START-TIME 11:30
END_TIME 12:30
11:40|BID|.245|ASK|.248
11:34|BID|.246|ASK|.249
END-OF-DATA
Basically, I want to prepend the Stock ID to the BID/ASK data records, so the above file should look like this:
START-OF-DATA
Stock ID|XYZ
START-TIME 11:30
END_TIME 12:30
XYZ|11:31|BID|12.5|ASK|12.7
XYZ|11:34|BID|12.6|ASK|12.7
END-OF-DATA
START-OF-DATA
Stock ID|ABC
START-TIME 11:30
END_TIME 12:30
ABC|11:40|BID|.245|ASK|.248
ABC|11:34|BID|.246|ASK|.249
END-OF-DATA
Can anyone help?

Like this:
awk -F'|' 'BEGIN{OFS="|"} /^Stock/{S=$2} /BID|ASK/{print S,$0}' file
Explanation (with thanks to Olivier Dulac)
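It updates the variable S each time it encounters a line starting with "Stock", and then prepends S to lines containing "BID" or "ASK" (using | as the separator for both reading and outputting). Note that, as written, it prints only the BID/ASK records; the other lines are not passed through to the output.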

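Try this:
awk -F'|' 'NF==2{pre=$2}NF>2{$0=pre FS $0}7' file
It works for the given example: a line with exactly two fields is taken to carry the ID, any line with more than two fields gets the ID prepended, and the trailing 7 is simply a true pattern that prints every line.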

Using awk
awk '/Stock ID/{s=$2}/BID|ASK/{$0=s FS $0}1' FS=\| file
START-OF-DATA
Stock ID|XYZ
START-TIME 11:30
END_TIME 12:30
XYZ|11:31|BID|12.5|ASK|12.7
XYZ|11:34|BID|12.6|ASK|12.7
END-OF-DATA
START-OF-DATA
Stock ID|ABC
START-TIME 11:30
END_TIME 12:30
ABC|11:40|BID|.245|ASK|.248
ABC|11:34|BID|.246|ASK|.249
END-OF-DATA

This might work for you (GNU sed):
sed -r '/Stock ID/h;/BID|ASK/{G;s/(.*)\n.*\|(.*)/\2|\1/}' file
Save the Stock ID in the hold space and prepend it to records containing BID or ASK.
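For comparison, the approach shared by the awk and sed answers (remember the most recent Stock ID, then prefix it to every BID/ASK record) can be sketched in Python. This is only an illustrative sketch: the function name prepend_stock_id is made up here, and it assumes the ID line always starts with "Stock ID|" as in the sample data.

```python
def prepend_stock_id(lines, sep="|"):
    """Yield every line, prefixing BID/ASK records with the latest Stock ID."""
    stock_id = ""  # hypothetical default if a record appears before any ID line
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("Stock ID" + sep):
            # Remember the ID; the header line itself passes through unchanged.
            stock_id = line.split(sep, 1)[1]
        elif "BID" in line or "ASK" in line:
            # Same loose "contains BID or ASK" test as the awk/sed answers.
            line = stock_id + sep + line
        yield line


sample = """START-OF-DATA
Stock ID|XYZ
START-TIME 11:30
END_TIME 12:30
11:31|BID|12.5|ASK|12.7
END-OF-DATA
START-OF-DATA
Stock ID|ABC
11:40|BID|.245|ASK|.248
END-OF-DATA"""

for out in prepend_stock_id(sample.splitlines()):
    print(out)
```

Unlike the first awk answer, this passes every line through, which matches the output the question asks for.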


Problem with displaying reformatted string into a four-digit year in Stata 17

I turned to the Stata video "Data management: How to create a date variable from a date stored as a string" by Chuck Huber to make sure my date variables were formatted properly. However, I cannot get the reformatted variable (school_year2) to display as a year (e.g. 2018).
Can someone let me know what I may be missing here?
Thank you,
.do file
gen school_year2 = date(school_year,"Y")
format %ty school_year2
list school_year school_year2 in 1/10
     +---------------------+
     | school~r   school~2 |
     |---------------------|
  1. |     2016    2.0e+04 |
  2. |     2016    2.0e+04 |
  3. |     2016    2.0e+04 |
  4. |     2016    2.0e+04 |
  5. |     2016    2.0e+04 |
     |---------------------|
  6. |     2016    2.0e+04 |
  7. |     2016    2.0e+04 |
  8. |     2016    2.0e+04 |
  9. |     2016    2.0e+04 |
 10. |     2016    2.0e+04 |
     +---------------------+
.
end of do-file
The underlying value is still the number of days from 1 Jan 1960, because you are using the date() function. So keep a %td format, as you are working with days here, not years. You can, however, have it display only the year by using %tdCCYY, where CC stands for the century and YY for the two-digit year within it. But remember that the underlying data point is still the day 1 Jan 2016, not the year 2016.
clear
input str4 school_year
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
end
gen school_year2 = date(school_year,"Y")
format %tdCCYY school_year2
list school_year school_year2 in 1/10
If the year is all you want to work with, use the year() function to extract the year from the date. The example below shows steps you can experiment with.
clear
input str4 school_year
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
end
gen school_year2 = date(school_year,"Y")
gen school_year3 = year(school_year2)
format %tdCCYY school_year2
format %ty school_year3
list in 1/10
Note that in the last example, all values look the same to you. But the first variable is a string with the text "2016", the second is a date stored as the number of days from 1 Jan 1960 with only its year value displayed, and the last is a number with the number of years from year 0 displayed as a year (which in this case would have been the same had it been displayed as its underlying number).
@TheiceBear has already explained the main point, but here is the story told a little differently in case that is helpful.
The fallacy here is to suppose that changing the (display) format does more than that. It is just a change in format. It has no effect on what is stored, that is, on the values held within the variables in the question.
You are using generate to create new variables, which is fine, but the basic principles can be seen directly using di (display) on scalar constants. That's also a good way to check understanding of Stata's rules.
The date() function -- despite its historic name -- is for creating numeric daily dates (only). If you tell date() that your input is a string containing the year only, then it imputes 1 January as day and month. The result is an integer, counted from the origin of the scale at 1 January 1960.
. di date("2016", "Y")
20454
. di date("1 Jan 2016", "DMY")
20454
. di date("1 Jan 1960", "DMY")
0
It is a fair bet that few are willing or able to work out what 20454 is on such a scale, but you can specify a daily date display format so that you and readers of your code can see directly.
. di %td 20454
01jan2016
There are many minor variations on that to display daily dates (or parts of them, such as monthly or yearly dates). The different format names for daily dates all start %td.
Conversely, if you say that the value 20454 is to be displayed using a yearly format, you are referring to the year 20454, several thousand years into the future. Stata doesn't act puzzled, except that it doesn't expect such values as years and just shows you a year rounded to 2.0e+04, that is 20000. If you had good reason to work with dates thousands or millions of years into the future, date display formats are likely to be neither needed nor helpful.
. di %ty 20454
2.0e+04
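The arithmetic behind these numbers can be checked outside Stata. The following is a minimal Python sketch, not Stata code; the helper names to_stata_daily and from_stata_daily are invented for illustration, and the only fact taken from the answer is that daily dates count days from 1 January 1960.

```python
from datetime import date, timedelta

STATA_EPOCH = date(1960, 1, 1)  # day 0 on the daily-date scale described above


def to_stata_daily(d):
    """Number of days from 1 Jan 1960 to d (what date() stores)."""
    return (d - STATA_EPOCH).days


def from_stata_daily(n):
    """Calendar date corresponding to daily-date value n."""
    return STATA_EPOCH + timedelta(days=n)


days = to_stata_daily(date(2016, 1, 1))
print(days)                         # 20454
print(from_stata_daily(days))       # 2016-01-01
print(from_stata_daily(days).year)  # 2016, the analogue of year()
```

The stored value is 20454 throughout; only the last line extracts a genuine year number, which is why a yearly display format applied to 20454 gives a nonsensical result.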
This paper riffs on the idea that a change in display format is only that and does not affect stored values.

Manipulate time on System.DateTime object

I have a (probably) small problem in a function that I am writing. The input to the function is two objects (start and stop) of the type System.DateTime, acquired with Get-Date. The output is an array of System.DateTime objects (start and stop) that fall between the input objects, i.e.
Start Stop
2018-01-14 13:54:15 2018-01-14 13:54:15
2018-01-15 13:54:15 2018-01-15 13:54:15
This works. However, in the process of building this array I need to manipulate the time on the start and stop objects, and to be honest I haven't the slightest clue how to set the hour, minute and second on an existing object. The desired output is, i.e.
Start Stop
2018-01-14 08:00 2018-01-14 17:00
2018-01-15 08:00 2018-01-15 17:00
I've tried using the ParseExact method but it touches the date as well. I can probably send in more than the time to it, but that seems like 'code from hell' ...
PS C:\tmp> (Get-Date).AddDays(-7)
den 10 januari 2018 14:07:13
PS C:\tmp> [datetime](Get-Date).AddDays(-7)::ParseExact("09:00","hh:mm",$null)
den 17 januari 2018 09:00:00
PS C:\tmp>
How on earth can I manipulate the time, and only the time, on an existing System.DateTime object?
As the function itself adds nothing to the problem, it has been excluded.
Use the Date portion of the timestamp, then add the desired number of hours (8 for 8AM, 17 for 5PM):
PS> (Get-Date).Date.AddHours(8)
den 17 januari 2018 08:00:00
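The same idea (keep the date part, then set the time of day) can be sketched in Python for comparison. This is an illustration rather than PowerShell; the helper name at_time is hypothetical.

```python
from datetime import datetime, time


def at_time(stamp, hour, minute=0):
    """Return a new datetime on the same calendar day at hour:minute:00."""
    # Drop the original time of day, then attach the requested one.
    return datetime.combine(stamp.date(), time(hour, minute))


stamp = datetime(2018, 1, 14, 13, 54, 15)
print(at_time(stamp, 8))   # 2018-01-14 08:00:00
print(at_time(stamp, 17))  # 2018-01-14 17:00:00
```

As with .Date.AddHours(8), the original object is not modified; a new value is returned with the date preserved and only the time replaced.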
Your question is not very clear. Is this what you are looking for?
$date1=[datetime]"2018-01-14 13:54:15"
PS>$date1.AddSeconds(-15)
Sunday, January 14, 2018 1:54:00 PM

Get DayOFYear in XXX format in Powershell

I am trying to get the day of the year in XXX format. For example, on the 55th day it should show 055, and on the 110th day it should show 110.
(Get-Date).DayOfYear gives just 55, but the string that I need to work with has to have the format XXX.
Try This:
"{0:D3}" -f (Get-Date).DayofYear
more info here
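The D3 specifier pads an integer with leading zeros to at least three digits. For comparison, the same zero-padding can be sketched in Python; the function name day_of_year_3 is made up for this example.

```python
from datetime import date


def day_of_year_3(d):
    """Day of the year as a zero-padded, three-character string."""
    return f"{d.timetuple().tm_yday:03d}"


print(day_of_year_3(date(2018, 2, 24)))  # 055
print(day_of_year_3(date(2018, 4, 20)))  # 110
```

Python's strftime("%j") directive produces the same zero-padded three-digit day of the year directly.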

Add date (24 format) to file name?

This line does a fine job of renaming my file, but it only uses the hours 01-12 and not 13-24: after 12 it begins again with 01. Since it does not append AM or PM, you can't tell when the file was created just by looking at the file name. I would like it to use the 24-hour format.
dir C:\script\logged_in_users-.csv | Rename-Item -NewName {$_.BaseName+(Get-Date -f yyyy-MM-dd-hh)+$_.Extension}
In your format string, use HH instead of hh, to output hours in 24-hour format.
Swap hh for HH for 24-hour clock timestamp.
See: http://technet.microsoft.com/en-us/library/ee692801.aspx
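The 12-hour versus 24-hour distinction is common to most date-formatting systems, not just .NET format strings. As an illustration only, here is the equivalent pair in Python, where %I is the 12-hour directive and %H the 24-hour one; the timestamp is an arbitrary example value.

```python
from datetime import datetime

stamp = datetime(2018, 1, 14, 13, 54, 15)  # an afternoon time, chosen for illustration

twelve_hour = stamp.strftime("%Y-%m-%d-%I")       # ambiguous without AM/PM
twenty_four_hour = stamp.strftime("%Y-%m-%d-%H")  # unambiguous

print(twelve_hour)       # 2018-01-14-01
print(twenty_four_hour)  # 2018-01-14-13
```

With the 12-hour form, 01:54 and 13:54 produce the same file-name suffix, which is exactly the ambiguity described in the question.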

day difference between dates and check convert the text in date

My intention in writing a shell script (ksh) is to list all the files in a directory and check their created dates. If a file is more than 30 days old, it is zipped into another location.
ksh code :
--extracts the day and date of the file
ls -al | awk '{print $6$7}'
output
May23 Jun13 .......
Now, when I extract the day and date, I believe the result is text. My requirement is to convert that text into a date and check whether the created date is less than or more than 30 days ago.
I googled and found some good suggestions, but none satisfies my requirement (as far as I searched).
Could you please suggest what I need to do?
Thanks in advance.
Don't use ls for this. Use find, e.g.
find . -type f -ctime +30
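or a similar command. Note that -ctime tests the time of the last status (inode) change rather than a creation time; use -mtime instead if you want to test the last modification time.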
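The age test that find performs can also be sketched in Python, which avoids parsing ls output altogether. This is a rough sketch under stated assumptions: the function name files_older_than is hypothetical, it uses the modification time (the analogue of -mtime, not -ctime), it does not recurse into subdirectories, and the zipping step is left out.

```python
import os
import time

SECONDS_PER_DAY = 86400


def files_older_than(directory, days, now=None):
    """Return sorted paths of regular files in directory last modified more than `days` days ago."""
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    old = []
    for entry in os.scandir(directory):
        # Compare the numeric modification timestamp, not a formatted date string.
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            old.append(entry.path)
    return sorted(old)


print(files_older_than(".", 30))
```

Working with timestamps in seconds sidesteps the original problem of turning text such as "May23" back into a date.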