In the first query, I'm making the JOB column lowercase and then searching through it, but it's not finding any data. Why? Thanks. Just FYI, since you don't have the database: all records in the JOB column are uppercase (that's why it isn't returning anything), but that's also why I'm making it lowercase first.
In the second query, I'm trying to concatenate only the ENAME values that meet specific criteria: anything that has an 'r' in the ENAME column (there are multiple records with an 'r' in them), but it isn't working (no data found). Why? How do I get it done? Thanks.
SELECT LOWER(JOB) FROM EMP
WHERE JOB = LOWER('MANAGER');
SELECT CONCAT('My name is ',ename)
FROM EMP
WHERE ENAME LIKE '%r%';
I tested both of your SQL statements and they work fine for me. Are you sure the records are in the DB? Are you sure the column names are correct?
EDIT: OK, so the name of the column is in lowercase, but in your WHERE it's in uppercase. That's all :)
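For reference, a sketch of the corrected queries, assuming (as stated above) that the stored values are uppercase: lowercase the column rather than the literal, and match an uppercase R in the LIKE pattern.
-- Lowercase the column, then compare against a lowercase literal
SELECT LOWER(JOB) FROM EMP
WHERE LOWER(JOB) = 'manager';
-- The stored names are uppercase, so the pattern needs an uppercase R
SELECT CONCAT('My name is ', ENAME)
FROM EMP
WHERE ENAME LIKE '%R%';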
I am working on a solution that involves combining two queries in Power Query to retrieve a single data table back to Excel. The first query is always populated, but the other comes from an ERP and might return an empty table from time to time.
Appending the two queries requires making the header names the same in both queries before the append takes place. As the second query sometimes returns an empty table, the error arises in the steps where Power Query modifies the header names in the second table (it cannot modify the header names, as there are no headers).
"Error message: Expression.Error: The column 'PartMtl_Company' of the table wasn't found.
Details: PartMtl_Company" where the PartMtl_Company is the leftmost column in my table.
I am thinking that I would need to evaluate whether the second table is empty and skip the renaming steps if that is the case. I assume appending the populated first table with an empty table would cause no problem and would simply yield the first table. I have tried to look around for suitable M code but have not come across any.
I'm thinking you might be able to use Table.RowCount to solve this. Something along the lines of:
= if Table.RowCount(Table2) > 0 then...
You would modify the headers only if there is data in the second table. Same goes for the appending of the tables: you would only append if there is data in the second table, since you won't have renamed any headers otherwise.
Thank you Marc! That did the trick.
In the end, I wrote something along the lines of
= if Table.RowCount(Table2) > 0 then... (code that works on a non-empty table) ...else Table2
which returns the empty table if it is empty to begin with. Appending the second table onto the first table did not throw an error and returned only the first table, as planned.
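For later readers, here is a minimal M sketch of the whole pattern. Table1, Table2, and the "NewName" rename target are placeholders (only PartMtl_Company comes from the question):
let
    // Rename headers only when the ERP query actually returned rows;
    // otherwise pass the empty table through untouched.
    Renamed = if Table.RowCount(Table2) > 0
        then Table.RenameColumns(Table2, {{"PartMtl_Company", "NewName"}})
        else Table2,
    // Appending an empty second table simply yields the first table.
    Result = Table.Combine({Table1, Renamed})
in
    Result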
I would like to get the row count of every table in the same instance.
I have not used kdb for a while and I forgot how to make this work.
This is what I got:
tablelist:tables[]
{select count i from x} each tablelist
but I got a type error
Your statement doesn't contain a trailing semicolon (;) at the end of the first line, which will cause an error in an IDE like qpad (assuming you are running it as written).
If not running from an IDE, I would check my HDB for any possible missing data and run some sanity checks: can I select from each of my tables normally? Do types match across partitions? (i is a virtual column representing the row count, so non-conforming types in your other columns are probably not the cause, but investigating may yield the right answer.) One such check is sketched below.
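A minimal sketch, assuming the tables live in the root namespace: trap-at (@[f;x;errhandler]) counts each table and traps any failure instead of aborting the loop, so the offending table identifies itself:
/ count each table, trapping errors so one bad table doesn't abort the run
{@[{count value x}; x; {[e] `$"error: ",e}]} each tables[]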
One way to achieve what you're trying is (using dummy data):
q){flip select counts:count i,tab:1#x from x}each tablelist:tables[]
counts tab
-------------
5469 depth
3150 quotes
3005 trades
Here I select the count for each table, but also add on the name of the table, then flip each result into a dictionary. That yields a list of dictionaries with conforming types and key names, which is in fact a table, hence the result above. This way you have a nice record of what you're actually counting.
Each select query you run is returning a table in the form:
x
-
3
It would be better to use exec as opposed to select, to simply return the value of the count, e.g.:
q){exec count i from x} each tables[]
3 2
Your current method attempts to return a list of tables, e.g.:
q){select count i from x} each tables[]
+(,`x)!,,3
+(,`x)!,,2
However, the type error makes me think there may be an issue with your tables as this should not error for in-memory tables.
Here's one way
count each `. tables[]
I am using 3.6 2018.05.17 and your expression worked for me. I then changed the select to an exec to return just a list of counts.
q){exec count i from x} each tables[]
The code below gets the count of each table along with the table name.
q)flip (`table;`msgcount)! flip {x, count value x}#'tables[]
To get only the count, without the table name:
q){count value x}#'tables[]
I have a CLOB(2000000) field in a DB2 (v10) database, and I would like to run a simple UPDATE query on it to replace each occurrence of "foo" with "baaz".
Since the contents of the field are more than 32k, I get the following error:
"{some char data from field}" is too long.. SQLCODE=-433, SQLSTATE=22001
How can I replace the values?
UPDATE:
The query was the following (changed UPDATE into SELECT for easier testing):
SELECT REPLACE(my_clob_column, 'foo', 'baaz') FROM my_table WHERE id = 10726
UPDATE 2
As mustaccio pointed out, REPLACE does not work on CLOB fields (or at least not without casting the data to VARCHAR first, which in my case is not possible since the data is more than 32k in size). The question is about finding an alternative way to achieve the REPLACE functionality for CLOB fields.
Thanks,
krisy
Finally, since I found no way to do this with an SQL query, I ended up exporting the table, editing its LOB content in Notepad++, and importing the table back again.
Not sure if this applies to your case: there are two different REPLACE functions offered by DB2, SYSIBM.REPLACE and SYSFUN.REPLACE. The version of REPLACE in SYSFUN accepts CLOBs and supports values up to 1 MByte. In case your values are longer, you would need to write your own (SQL-based?) function.
BTW: You can check function resolution by executing "values(current path)"
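An untested sketch of that suggestion, schema-qualifying the call so resolution picks the SYSFUN version (and assuming the CLOB values stay under 1 MByte):
-- Force the SYSFUN version, which accepts CLOB arguments
UPDATE my_table
SET my_clob_column = SYSFUN.REPLACE(my_clob_column, 'foo', 'baaz')
WHERE id = 10726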
I am trying to figure out the best way to locate duplicates in six-column CSV data. The real data has more than a million rows in it.
These are the six columns:
Name, address, city, post-code, phone number, machine number
The data does not have a fixed length, and data in certain columns might be missing in certain instances.
I am thinking of using Perl to first normalize all the short forms used in the names, cities, and addresses. Fellow Perl enthusiasts from Stack Overflow have helped me a lot.
But there would still be a lot of data that would be difficult to match.
So I am wondering: is it possible to match content based on likeness/similarity (e.g. 'google' is similar to 'gugl')? The likeness matching would be needed to overcome errors that crept in while the data was collected.
I have two tasks at hand w.r.t. the data:
Flag duplicate rows with a certain identifier
Report the percentage match between similar rows.
I would really appreciate suggestions as to which methods could be employed, and which would probably work best given their respective merits.
You could write a Perl program to do this, but it will be easier and faster to put it into a SQL database and use that.
Most SQL databases have a way to import CSV. For this answer, I suggest PostgreSQL because it has very powerful string functions which you will need to find your fuzzy duplicates. Create your table with an auto incremented ID column if your CSV data doesn't already have unique IDs.
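By way of illustration only: the table layout below mirrors the six CSV columns, and \copy is psql's client-side CSV import (the table and file names are hypothetical):
CREATE TABLE whatever (
    id serial PRIMARY KEY,  -- auto-incremented unique ID
    name text,
    address text,
    city text,
    post_code text,
    phone_number text,
    machine_number text
);
-- expects a CSV file with a header row
\copy whatever (name, address, city, post_code, phone_number, machine_number) FROM 'data.csv' WITH (FORMAT csv, HEADER true)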
Once the import is done, add indexes on the columns you want to check for duplicates.
CREATE INDEX name ON whatever (name);
You can do a self-join to look for duplicates in whatever way you like. Here's an example that finds duplicate names.
SELECT id
FROM whatever t1
JOIN whatever t2 ON t1.id < t2.id
WHERE t1.name = t2.name
PostgreSQL has powerful string functions including regexes to do the comparisons.
Indexes will have a hard time working on expressions like lower(t1.name). Depending on the sorts of duplicates you want to work with, you can add indexes for these transforms (this is a feature of PostgreSQL). For example, if you want to search case-insensitively, you can add an index on the lower-cased name. (Thanks #asjo for pointing that out.)
CREATE INDEX ON whatever ((lower(name)));
-- This will be muuuuuch faster
SELECT id
FROM whatever t1
JOIN whatever t2 ON t1.id < t2.id
WHERE lower(t1.name) = lower(t2.name)
A "likeness" match can be achieved in several ways, a simple one would be to use the fuzzystrmatch functions like metaphone(). Same trick as before, add a column with the transformed row and index it.
Other simple things like data normalization are better done on the data itself before adding indexes and looking for duplicates. For example, trim out and squish extra whitespace.
UPDATE whatever SET name = trim(both from name);
UPDATE whatever SET name = regexp_replace(name, '[[:space:]]+', ' ', 'g');
Finally, you can use the Postgres Trigram module to add fuzzy indexing to your table (thanks again to #asjo).
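A sketch of the trigram route; the % operator and similarity() come from the pg_trgm extension, and the similarity score doubles as the percentage match asked for in the question:
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX ON whatever USING gin (name gin_trgm_ops);
-- % matches pairs above the similarity threshold; similarity() gives the score
SELECT t1.id, t2.id, similarity(t1.name, t2.name) AS score
FROM whatever t1
JOIN whatever t2 ON t1.id < t2.id
WHERE t1.name % t2.name;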
I have the following words in a database table:
name id
aba 0
abac 1
abaca 2
abace 3
I want to make the following select:
Select id from words where name='abaca';
I tried it, but it doesn't work. I want an exact match on the word I'm entering.
I tried with your entries, and the following query works perfectly fine for an exact match:
SELECT * FROM Match WHERE title='abaca';
Where Match is the table name, with the following schema:
CREATE TABLE Match (title text,recordId integer)
(A screenshot of the inserted records followed here.)
Though I made the same kind of mistake, taking Match as the table name just as you took id as the column name, I only realized it while posting the code. That should not really make much difference here.
Have you used it in some code to fetch results from the database? If yes, you should consider avoiding keywords as variable or column names. Just a suggestion.
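If you do end up with a keyword as an identifier, quoting it is the usual escape hatch. A small sketch with standard SQL double quotes (some engines use backticks or brackets instead):
-- Quoting the identifier sidesteps the keyword collision on MATCH
SELECT recordId FROM "Match" WHERE title = 'abaca';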