Find rows where string contains certain character at specific place - postgresql

I have a field in my database that contains 10 characters,
e.g. 1234567891
I want to find the rows where the field has, for example, the digits 8 and 9 in positions 5 and 6.
So for example,
if the rows are
a) 1234567891
b) 1234897891
c) 1234877891
I only want b) returned in my select.
The type of the field is string/character varying.
I have tried using:
where field like '%89%'
but that won't work, because I need it to be 89 at a specific place in the string.

The fastest solution would be
WHERE substr(field, 5, 2) = '89'
If the positions are not adjacent, you end up with two conditions joined with AND.
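One way to try the position-based predicate without a PostgreSQL server is to run it through Python's built-in sqlite3 module, since SQLite's substr is also 1-based. This is only a sketch: the table name items and the helper rows_matching are made up for illustration, SQLite stands in for PostgreSQL, and the second query is an assumed example of two non-adjacent positions (5 and 7).

```python
import sqlite3

def rows_matching(values, where_clause):
    """Load values into a throwaway in-memory table and return those matching where_clause."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE items (field TEXT)")  # hypothetical table name
    con.executemany("INSERT INTO items VALUES (?)", [(v,) for v in values])
    rows = con.execute(
        f"SELECT field FROM items WHERE {where_clause} ORDER BY rowid"
    ).fetchall()
    con.close()
    return [r[0] for r in rows]

sample = ["1234567891", "1234897891", "1234877891"]

# Adjacent positions 5 and 6: one substr call is enough.
adjacent = rows_matching(sample, "substr(field, 5, 2) = '89'")

# Non-adjacent positions (assumed example: 5 and 7): two conditions joined with AND.
non_adjacent = rows_matching(
    sample, "substr(field, 5, 1) = '8' AND substr(field, 7, 1) = '7'"
)

print(adjacent)
print(non_adjacent)
```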

In a LIKE pattern the underscore (_) matches exactly one character, so you can skip the first four characters with four underscores and then match 89:
where field like '____89%'
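The pattern can also be built from the position rather than typed by hand. The sketch below assumes a hypothetical helper like_pattern and a throwaway table called items, and uses Python's sqlite3 module as a stand-in for PostgreSQL (both treat _ as a single-character wildcard and % as any remainder).

```python
import sqlite3

def like_pattern(position, text):
    """Build a LIKE pattern that requires `text` to start at 1-based `position`."""
    return "_" * (position - 1) + text + "%"

def matching_rows(values, pattern):
    """Return the values that match the LIKE pattern, in insertion order."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE items (field TEXT)")  # hypothetical table name
    con.executemany("INSERT INTO items VALUES (?)", [(v,) for v in values])
    rows = con.execute(
        "SELECT field FROM items WHERE field LIKE ? ORDER BY rowid", (pattern,)
    ).fetchall()
    con.close()
    return [r[0] for r in rows]

pattern = like_pattern(5, "89")
matches = matching_rows(["1234567891", "1234897891", "1234877891"], pattern)
print(pattern, matches)
```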

KDB Apply where phrase only if column exists

I'm looking for a way to write a functional select in KDB such that the where phrase is only applied if the column exists (in order to avoid an error). If the column doesn't exist, the condition should default to true.
I tried this but it didn't work
enlist(|;enlist(in;`colname;key flip table);enlist(in;`colname;filteredValues[`colname]));
I tried to write a simple boolean expression and use parse to get my functional form
(table[`colname] in values)|(not `colname in key flip table)
But kdb doesn't short-circuit, so the left-hand expression is still evaluated even though the right-hand expression evaluates to true. This caused a weird output, `boolean$(), which is a list of booleans all evaluating to false (0b).
Any help is appreciated. Thanks!
EDIT 1: I have to join a series of conditions, with the parameters specified in the dictionary filters
cond,:(,/) {[l;k] enlist(in;k;enlist l[k])}[filters]'[a:(key filters)]
Then I pass this cond on and it gets executed in a few different selects on different tables. How can I make sure that whatever conditional expression I put in place of enlist(in;k;enlist l[k]) will only get evaluated when the select statement gets executed?
You can use the if-else conditional $ here to do what you want
For example:
q)$[`bid in cols`quotes;enlist (>;`bid;35);()]
> `bid 35
q)$[`bad in cols`quotes;enlist (>;`bad;35);()]
Note that in the second example the return is an empty list, as this column isn't in the quotes table.
So you can put this into the functional select like so:
?[`quotes;$[`bid in cols`quotes;enlist (>;`bid;35);()];0b;()]
and the where clause will be applied if the column is present; otherwise no where clause will be applied:
q)count ?[`quotes;$[`bid in cols`quotes;enlist (>;`bid;35);()];0b;()]
541 //where clause applied, table filtered
q)count ?[`quotes;$[`bad in cols`quotes;enlist (>;`bad;35);()];0b;()]
1000 //where clause not applied, full table returned
Hope this helps
Jonathon
AquaQ Analytics
EDIT: If I'm understanding your updated question correctly, you might be able to do something like the following. Firstly, let's define an example "filters" dictionary:
q)filters:`a`b`c!(1 2 3;"abc";`d`e`f)
q)filters
a| 1 2 3
b| a b c
c| d e f
So here we are assuming a few different columns of different types, for illustration purposes. You can build up your list of where clauses like so:
q)(in),'flip (key filters;value filters)
in `a 1 2 3
in `b "abc"
in `c `d`e`f
(this is equivalent to the code you had to generate cond, but it's a little neater & more efficient - you also have the values enlisted, which isn't necessary)
You could then use a vector conditional to generate your list of where clauses to apply to a given table e.g.
q)t:([] a:1 2 3 4 5 6;b:"adcghf")
q)?[key[filters] in cols[t];(in),'flip (key filters;value filters);count[filters]#()]
(in;`a;,1 2 3)
(in;`b;,"abc")
()
As you can see, in this example the table "t" has columns a and b, but not c. So using the vector conditional, you get the where clauses for a and b but not c.
Finally to actually apply this list of output where clauses to the table, you can make use of an over to apply each in turn:
q)l:?[key[filters] in cols[t];(in),'flip (key filters;value filters);count[filters]#()]
q){?[x;$[y~();y;enlist y];0b;()]}/[t;l]
a b
---
1 a
3 c
One thing to note here is that in the where clause of the functional select we need to check if y is an empty list - this is so we can enlist it if it is not an empty list
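For readers who don't use q, the same idea (drop the filters whose column is missing, then apply the remaining ones one after another) can be sketched in Python. This is an analogy rather than kdb code: the table is modelled as a list of dicts, and the function names applicable_filters and select_where are invented for the example.

```python
def applicable_filters(filters, columns):
    """Keep only the filters whose column actually exists in the table."""
    return {col: allowed for col, allowed in filters.items() if col in columns}

def select_where(rows, filters):
    """Apply each surviving filter in turn, like the over (/) in the q version."""
    columns = set(rows[0]) if rows else set()
    for col, allowed in applicable_filters(filters, columns).items():
        rows = [row for row in rows if row[col] in allowed]
    return rows

# Same sample data as the q example: t has columns a and b, but not c.
t = [{"a": a, "b": b} for a, b in zip([1, 2, 3, 4, 5, 6], "adcghf")]
filters = {"a": [1, 2, 3], "b": list("abc"), "c": ["d", "e", "f"]}

result = select_where(t, filters)
print(result)
```

The filter on c is silently skipped because t has no such column, which mirrors the empty where clause in the q output.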
Hope this helps

Matlab split text column in a table

I have a table object in MATLAB with a text column. This text column is a "tag" and contains underscores to split the tag.
I'd like to create a column with the second element of the tag. I used strsplit but it didn't work. I also tried regexp, but it gives me a cell object with 126 cell objects inside, and I don't know how to extract the second element of every cell.
Any suggestion?
Example:
a = {'a_b'; 'a_c';'a_n';'a_t'}
t = table(a)
I just want a vector with the second element.
Thanks.
How about
t=[t rowfun(@(x) x{1}(3),t)]
with 1 being the column and 3 being the element you want. If the string parts have an undefined length it gets a little bit more tricky:
t=[t rowfun(@(X) X{1}(strfind(X{1},'_')+1:end),t,'OutputFormat','cell')];
strfind() finds the position of the '_' character, so (strfind(...)+1:end) is the rest of the string. As the parts can be of different lengths, everything has to be returned as a cell and then added to the table. If the column changes, you have to adapt the code in both {1} places.
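As a language-neutral illustration of the same step, here is a short Python sketch that splits each tag on the underscore and keeps the second element. The helper name second_part is made up, and returning an empty string for tags without an underscore is an assumption, not something the question specifies.

```python
def second_part(tag, sep="_"):
    """Return the second element of a separator-delimited tag, or '' if there is none."""
    parts = tag.split(sep)
    return parts[1] if len(parts) > 1 else ""

tags = ["a_b", "a_c", "a_n", "a_t"]  # same sample data as in the question
second = [second_part(tag) for tag in tags]
print(second)
```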

Scalding: Create list from column in Pipe

I need to take a pipe that has a column of labels with associated values, and pivot that pipe so that there is a column for each label with the correct values in each column. So for example if I have this:
Id Label Value
1 Red 5
1 Blue 6
2 Red 7
2 Blue 8
3 Red 9
3 Blue 10
I need to turn it into this:
ID Red Blue
1 5 6
2 7 8
3 9 10
I know how to do this using the pivot command, but I have to explicitly know the values of the labels. How can I dynamically read the labels from the “label” column into a list that I can then pass into the pivot command? I have tried to create a list with:
pipe.groupBy('id) {_.toList('label) }
, but I get a type mismatch saying it found a symbol but is expecting (cascading.tuple.Fields, cascading.tuple.Fields). Also, from reading online, it sounds like using toList is frowned upon. The number of things in 'label is finite and not that big (30-50 items maybe), but may be different depending on what sample of data I am working with.
Any suggestions you have would be great. Thanks very much!
I think you're on the right track, you just need to map the desired values to Symbols:
val newHeaders = lines
.map(_.split(" "))
.map(a=>a(1))
.distinct
.map(f=>Symbol(f))
.toList
The Execution type will help you to combine with the subsequent pivot, for performance reasons.
Note that I'm using a TypedPipe for the lines variable.
If you want your code to be super-concise, you could combine the first two map calls into one, but it's just a stylistic choice:
.map(_.split(" ")(1))
Try using Execution to get the list of values from the data. More info on executions: https://github.com/twitter/scalding/wiki/Calling-Scalding-from-inside-your-application
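The two-step idea (first collect the distinct labels, then pivot using that list) can be sketched outside Scalding in plain Python. This is only an in-memory illustration with invented helper names (distinct_labels, pivot); it says nothing about how Scalding schedules the job or how Execution behaves.

```python
def distinct_labels(rows):
    """Collect the labels in first-seen order, without knowing them in advance."""
    return list(dict.fromkeys(label for _, label, _ in rows))

def pivot(rows, labels):
    """Turn (id, label, value) rows into one (id, value_per_label...) tuple per id."""
    table = {}
    for row_id, label, value in rows:
        # Missing labels for an id stay None.
        table.setdefault(row_id, dict.fromkeys(labels))[label] = value
    return [(row_id, *(cols[label] for label in labels)) for row_id, cols in table.items()]

rows = [
    (1, "Red", 5), (1, "Blue", 6),
    (2, "Red", 7), (2, "Blue", 8),
    (3, "Red", 9), (3, "Blue", 10),
]

labels = distinct_labels(rows)
pivoted = pivot(rows, labels)
print(["ID", *labels])
print(pivoted)
```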

reading via matlab a number after a specific string in a txt file

To restate my problem: in a large file a.txt I have
Amount of Food is 1
Desired Travel is 5
I need to read the 1 after the 'Amount of Food is ' expression and the 5 after the 'Desired Travel is' expression, Thanks again
You can have a look at this: with regexpi you can simply look for numbers in your strings.
The syntax is as simple as this:
startIndex = regexpi(str,expression)
where the expression parameter is a regular expression (e.g. '\d*' to retrieve consecutive digits).
In your specific case a way to perform this with regular expressions would be:
First you have to decide what strings are valid in your search
for example:
firstpar = 'First parameter is [0-9]+';
means that you are looking for a string 'First parameter is '
that ends with a sequence of digits.
Then you could use regexp or regexpi in the following way:
results = regexp(mystring, firstpar, 'match');
Where mystring is the text you perform the search on and 'match' means that you want parts of the text as output, not indexes.
Now, results is a cell array with each cell containing a string that appeared in your text and fulfilled your firstpar definition. In order to extract just the numbers from the cell array of strings you could use regexp again, but now helping yourself with cellfun, which applies your command to every cell of a cell array:
numbers = cellfun(@(x) str2num(regexp(x, '[0-9]+', 'match', 'once')), results);
numbers is an array of numbers that you were looking for.
You can do the same for different string patterns. If you want more general string definitions (instead of the straightforward firstpar that we used here), read the MATLAB documentation about regular expressions (alexcasalboni pasted it in his comment), scroll down to Input Arguments and expand 'expressions'.
The difference between regexp and regexpi is that the latter is case insensitive.
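The same extraction can be sketched in Python with the standard re module, for comparison with the MATLAB calls above. The helper name number_after is made up, the sample text is taken from the question, and allowing optional whitespace before the digits is an assumption about the file format.

```python
import re

def number_after(label, text):
    """Return the integer that follows `label` in `text`, or None if it is absent."""
    # re.escape guards against special characters in the label;
    # IGNORECASE mirrors the case-insensitive behaviour of regexpi.
    match = re.search(re.escape(label) + r"\s*(\d+)", text, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None

text = "Amount of Food is 1\nDesired Travel is 5\n"

food = number_after("Amount of Food is", text)
travel = number_after("Desired Travel is", text)
print(food, travel)
```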

Postgresql, treat text as numbers for getting MAX function result

I still haven't fixed the issue with dates written as strings, and here comes another problem.
I have a text column where only numbers are written (as text).
Using the MAX function I get an incorrect result, because there 9 is bigger than 30.
Is there any inline function like VAL or CINT or something similar, so that I can compare and use textual data (only numbers) as numbers in queries with SELECT, MAX and the like?
How would that look in the following example:
mCmd = New OdbcCommand("SELECT MAX(myTextColumn) FROM " & myTable, mCon)
You need to use max(to_number(myTextColumn, '999999'))
More details are in the manual: http://www.postgresql.org/docs/current/static/functions-formatting.html
If all "numbers" are integers, you can also use the cast operator: max(myTextColumn::int)
If your text values are properly formatted you can simply cast them to a numeric type, e.g.: '3.14'::numeric.
If the text is not formatted according to the language settings you need to use to_number() with a format mask containing the decimal separator: to_number('3.14', '9.99')
To get MAX to work properly, you first need to convert your text field to a numeric format:
mCmd = New OdbcCommand("SELECT MAX(TO_NUMBER(myTextColumn, '99999')) FROM " & myTable, mCon)
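The difference between a text comparison and a numeric comparison can be reproduced with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite stands in for PostgreSQL, CAST(... AS INTEGER) plays the role of ::int or to_number(), and the table and column names are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE my_table (my_text_column TEXT)")  # hypothetical names
con.executemany("INSERT INTO my_table VALUES (?)", [("9",), ("30",), ("100",)])

# Compared as text, '9' sorts after '30' and '100' because '9' > '3' > '1'.
text_max = con.execute("SELECT MAX(my_text_column) FROM my_table").fetchone()[0]

# Casting first makes MAX compare the values as numbers.
numeric_max = con.execute(
    "SELECT MAX(CAST(my_text_column AS INTEGER)) FROM my_table"
).fetchone()[0]

con.close()
print(text_max, numeric_max)
```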