SPARQL how to make DAY/MONTH always return 2 digits - date

I have a datetime in my SPARQL-query that I want to transform to a date.
Therefore I do:
BIND(CONCAT(YEAR(?dateTime), "-",MONTH(?dateTime), "-", DAY(?dateTime)) as ?date)
This part of code works but returns for example 2022-2-3, I want it to be 2022-02-03. If the dateTime is 2022-11-23, nothing should change.

You can take the integers you get back from the YEAR, MONTH, and DAY functions and pad them with the appropriate number of zeros (after turning them into strings):
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT * WHERE {
BIND(2022 AS ?yearInt) # this would come from your YEAR(?dateTime) call
BIND(2 AS ?monthInt) # this would come from your MONTH(?dateTime) call
BIND(13 AS ?dayInt) # this would come from your DAY(?dateTime) call
# convert to strings
BIND(STR(?yearInt) AS ?year)
BIND(STR(?monthInt) AS ?month)
BIND(STR(?dayInt) AS ?day)
# pad with zeros
BIND(CONCAT("00", ?year) AS ?paddedYear)
BIND(CONCAT("0000", ?month) AS ?paddedMonth)
BIND(CONCAT("00", ?day) AS ?paddedDay)
# extract the right number of digits from the padded strings
BIND(SUBSTR(?paddedYear, STRLEN(?paddedYear)-3) AS ?fourDigitYear)
BIND(SUBSTR(?paddedDay, STRLEN(?paddedDay)-1) AS ?twoDigitDay)
BIND(SUBSTR(?paddedMonth, STRLEN(?paddedMonth)-1) AS ?twoDigitMonth)
# put it all back together
BIND(CONCAT(?fourDigitYear, "-", ?twoDigitMonth, "-", ?twoDigitDay) as ?date)
}

#gregory-williams gives a portable answer. An alternative is functions from F&O (XPath and XQuery Functions and Operators 3.1) "fn:format-...."
I'm not sure of the coverage in various triplestores - Apache Jena provides fn:format-number, which is needed for the question, but not fn-format-dateTime etc
See
https://www.w3.org/TR/xpath-functions-3/#formatting-the-number
https://www.w3.org/TR/xpath-functions-3/#formatting-dates-and-times
For example:
fn:format-number(1,"000") returns the string "001".
Apache Jena also has a local extension afn:sprintf using the C or Java syntax of sprintf:
afn:sprintf("%03d", 1) returns "001".

Related

Regex expression in q to match specific integer range following string

Using q’s like function, how can we achieve the following match using a single regex string regstr?
q) ("foo7"; "foo8"; "foo9"; "foo10"; "foo11"; "foo12"; "foo13") like regstr
>>> 0111110b
That is, like regstr matches the foo-strings which end in the numbers 8,9,10,11,12.
Using regstr:"foo[8-12]" confuses the square brackets (how does it interpret this?) since 12 is not a single digit, while regstr:"foo[1[0-2]|[1-9]]" returns a type error, even without the foo-string complication.
As the other comments and answers mentioned, this can't be done using a single regex. Another alternative method is to construct the list of strings that you want to compare against:
q)str:("foo7";"foo8";"foo9";"foo10";"foo11";"foo12";"foo13")
q)match:{x in y,/:string z[0]+til 1+neg(-/)z}
q)match[str;"foo";8 12]
0111110b
If your eventual goal is to filter on the matching entries, you can replace in with inter:
q)match:{x inter y,/:string z[0]+til 1+neg(-/)z}
q)match[str;"foo";8 12]
"foo8"
"foo9"
"foo10"
"foo11"
"foo12"
A variation on Cillian’s method: test the prefix and numbers separately.
q)range:{x+til 1+y-x}.
q)s:"foo",/:string 82,range 7 13 / include "foo82" in tests
q)match:{min(x~/:;in[;string range y]')#'flip count[x]cut'z}
q)match["foo";8 12;] s
00111110b
Note how unary derived functions x~/: and in[;string range y]' are paired by #' to the split strings, then min used to AND the result:
q)flip 3 cut's
"foo" "foo" "foo" "foo" "foo" "foo" "foo" "foo"
"82" ,"7" ,"8" ,"9" "10" "11" "12" "13"
q)("foo"~/:;in[;string range 8 12]')#'flip 3 cut's
11111111b
00111110b
Compositions rock.
As the comments state, regex in kdb+ is extremely limited. If the number of trailing digits is known like in the example above then the following can be used to check multiple patterns
q)str:("foo7"; "foo8"; "foo9"; "foo10"; "foo11"; "foo12"; "foo13"; "foo3x"; "foo123")
q)any str like/:("foo[0-9]";"foo[0-9][0-9]")
111111100b
Checking for a range like 8-12 is not currently possible within kdb+ regex. One possible workaround is to write a function to implement this logic. The function range checks a list of strings start with a passed string and end with a number within the range specified.
range:{
/ checking for strings starting with string y
s:((c:count y)#'x)like y;
/ convert remainder of string to long, check if within range
d:("J"$c _'x)within z;
/ find strings satisfying both conditions
s&d
}
Example use:
q)range[str;"foo";8 12]
011111000b
q)str where range[str;"foo";8 12]
"foo8"
"foo9"
"foo10"
"foo11"
"foo12"
This could be made more efficient by checking the trailing digits only on the subset of strings starting with "foo".
For your example you can pad, fill with a char, and then simple regex works fine:
("."^5$("foo7";"foo8";"foo9";"foo10";"foo11";"foo12";"foo13")) like "foo[1|8-9][.|0-2]"

Trying to convert the number numeric value in this string to a percent. Is there any easy way to do this in powershell?

Trying to convert the number numeric value in this string to a percent. Is there any easy way to do this in powershell?
"Percentage of records ","0.02"
So, the output would look like :
Percentage of records , 2%
Thanks in advance for any suggestions you can provide.
Yes, you can convert the string to a float data type (single, double, decimal), and then convert it back using a format string, like so:
"Percentage of records ", ([double]'0.02').ToString('P0')
And if you want it to output in a single line, you could use:
"Percentage of records: $(([double]'0.02').ToString('P0'))"
Explanation:
Convert your string to a float datatype: [double]'0.02'
Convert that float back into a string: .ToString()
But we want to format it as a percentage, so we supply P0 as a parameter.
i. P - means to format the value as a percentage, this performs the N * 100 operation for you and then adds on the percent sign
ii. 0 - controls the number of decimal places to show. In your case, you want to show zero decimal places.
Note: The percentage format string will round your value to the nearest decimal that you specify.
Example:
0.021.ToString('P0')
# returns 2%
0.025.ToString('P0')
# returns 3%
As #mklement0 pointed out in the comments. I hadn't considered that your sample may be a single string, like:
'"Percentage of records ","0.02"'
I assumed it was two strings, which you separated with a comma.
In the event it is a single string, then you need to extract the number to use it. Once you have isolated the number, then you can use my advice above:
$yourString = '"Percentage of records ","0.02"'
# probably the more "proper" way
$pctValue = ($yourString -split ',' -replace '"')[1]
# or
# a hacky way I just thought of that happens to work in this scenario
$pctValue = (iex $yourString)[1]
Explanation of first example:
-split ',' - Take the string, and break it out into multiple strings, separating them by comma
-replace '"','' - Replace all instances of " with blank. The second parameter is optional since you are removing. Could be written as -replace '"'
(...)[1] - This is saying to take the SECOND string that it returned (starts at zero). In this case it would be your 0.02 value.
Explanation of second example (this is a bit of a hack, but thought it would be fun to include anyway):
iex - alias for Invoke-Expression - it's telling powershell to run whatever is inside of the string verbatim. So it's the equivalent of typing "Percentage of records ","0.02" into powershell and hitting enter. Which in PowerShell terms, that is the equivalent of passing it a list of strings.
Use -f (format operator) in powerhsell for build your string :
"Percentage of records, {0:0%} " -f 0.02
or in percentage :
"Percentage of records, {0:P0} " -f 0.02

Using fscanf in MATLAB to read an unknown number of columns

I want to use fscanf for reading a text file containing 4 rows with an unknown number of columns. The newline is represented by two consecutive spaces.
It was suggested that I pass : as the sizeA parameter but it doesn't work.
How can I read in my data?
update: The file format is
String1 String2 String3
10 20 30
a b c
1 2 3
I have to fill 4 arrays, one for each row.
See if this will work for your application.
fid1=fopen('test.txt');
i=1;
check=0;
while check~=1
str=fscanf(fid1,'%s',1);
if strcmp(str,'')~=1;
string(i)={str};
end
i=i+1;
check=strcmp(str,'');
end
fclose(fid1);
X=reshape(string,[],4);
ar1=X(:,1)
ar2=X(:,2)
ar3=X(:,3)
ar4=X(:,4)
Once you have 'ar1','ar2','ar3','ar4' you can parse them however you want.
I have found a solution, i don't know if it is the only one but it works fine:
A=fscanf(fid,'%[^\n] *\n')
B=sscanf(A,'%c ')
Z=fscanf(fid,'%[^\n] *\n')
C=sscanf(Z,'%d')
....
You could use
rawText = getl(fid);
lines = regexp(thisLine,' ','split);
tokens = {};
for ix = 1:numel(lines)
tokens{end+1} = regexp(lines{ix},' ','split'};
end
This will give you a cell array of strings having the row and column shape or your original data.
To read an arbitrary line of text then break it up according the the formating information you have available. My example uses a single space character.
This uses regular expressions to define the separator. Regular expressions powerful but too complex to describe here. See the MATLAB help for regexp and regular expressions.

Function to split string in matlab and return second number

I have a string and I need two characters to be returned.
I tried with strsplit but the delimiter must be a string and I don't have any delimiters in my string. Instead, I always want to get the second number in my string. The number is always 2 digits.
Example: 001a02.jpg I use the fileparts function to delete the extension of the image (jpg), so I get this string: 001a02
The expected return value is 02
Another example: 001A43a . Return values: 43
Another one: 002A12. Return values: 12
All the filenames are in a matrix 1002x1. Maybe I can use textscan but in the second example, it gives "43a" as a result.
(Just so this question doesn't remain unanswered, here's a possible approach: )
One way to go about this uses splitting with regular expressions (MATLAB's strsplit which you mentioned):
str = '001a02.jpg';
C = strsplit(str,'[a-zA-Z.]','DelimiterType','RegularExpression');
Results in:
C =
'001' '02' ''
In older versions of MATLAB, before strsplit was introduced, similar functionality was achieved using regexp(...,'split').
If you want to learn more about regular expressions (abbreviated as "regex" or "regexp"), there are many online resources (JGI..)
In your case, if you only need to take the 5th and 6th characters from the string you could use:
D = str(5:6);
... and if you want to convert those into numbers you could use:
E = str2double(str(5:6));
If your number is always at a certain position in the string, you can simply index this position.
In the examples you gave, the number is always the 5th and 6th characters in the string.
filename = '002A12';
num = str2num(filename(5:6));
Otherwise, if the formating is more complex, you may want to use a regular expression. There is a similar question matlab - extracting numbers from (odd) string. Modifying the code found there you can do the following
all_num = regexp(filename, '\d+', 'match'); %Find all numbers in the filename
num = str2num(all_num{2}) %Convert second number from str

Perl Xpath: search item before a date year

I have an xml database that contains films, for example:
<film id="5">
<title>The Avengers</title>
<date>2012-09-24</date>
<family>Comics</family>
</film>
From a Perl script I want to find film by date.
If I search films of an exacly year, for example:
my $query = "//collection/film[date = 2012]";
it works exactly and return all films of 2012 year, but if I search all film before a year, it didn't work, for example:
my $query = "//collection/film[date < 2012]";
it returns all film..
Well, as usual, there's more than one way to do it. ) Either you let XPath tool know that it should compare dates (it doesn't know from the start) with something like this:
my $query = '//collection/film[xs:date(./date) < xs:date("2012-01-01")]';
... or you just bite the bullet and just compare the 'yyyy' substrings:
my $query = '//collection/film[substring(date, 1, 4) < "2012"]';
The former is better semantically, I suppose, but requires an advanced XML parser tool which supports XPath 2.0. And the latter was successfully verified with XML::XPath.
UPDATE: I'd like to give my explanation of why your first query works. ) See, you don't compare dates there - you compare numbers, but only because of '=' operator. Quote from the doc:
When neither object to be compared is a node-set and the operator is =
or !=, then the objects are compared by converting them to a common
type as follows and then comparing them. If at least one object to be
compared is a boolean, then each object to be compared is converted to
a boolean as if by applying the boolean function. Otherwise, if at
least one object to be compared is a number, then each object to be
compared is converted to a number as if by applying the number
function.
See? Your '2012-09-24' was converted to number - and became 2012. Which, of course, is equal to 2012. )
This doesn't work with any other comparative operators, though: that's why you need to either use substring, or convert the date-string to number. I supposed the first approach would be more readable - and faster as well, perhaps. )
Use this XPath, to check the year
//collection/film[substring-before(date, '-') < '2012']
Your Perl script will be,
my $query = "//collection/film[substring-before(date, '-') < '2012']";
OR
my $query = "//collection/film[substring-before(date, '-') = '2012']";
Simply use:
//collection/film[translate(date, '-', '') < 20120101]
This removes the dashes from the date then compares it for being less than 2012-01-01 (with the dashes removed).
In the same way you can get all films with dates prior a given date (not only year):
//collection/film[translate(date, '-', '') < translate($theDate, '-', '']