The following command imports the data from the CSV file correctly. But the problem is that there are 2 entries for the same number.
I need both the entries for 417176718 in the same document (so no $set). How do I keep both these values using mongoimport?
cat final.txt
number, date, duration, type, destination
417176718 , 2013-01-23 20:09:00 , 1 , NORMAL_CLEARING , 61998487
409334392 , 2013-01-24 11:25:18 , 40 , NO_ANSWER , 09821973636
919480909 , 2013-01-25 20:58:00 , 40 , NORMAL_CLEARING , 09919480909
417176718 , 2013-01-24 20:09:00 , 1 , FAILED , 61998487
mongoimport -d mydb -c vcalls --type csv --file final.txt --headerline
This is exactly what a map reduce is for.
Once you've got this in the db, run a map reduce like this:
// Emit each call keyed by its number, wrapped in a one-element "data" array
var mapper = function () {
    emit(this.number, {
        data: [{ date: this.date, duration: this.duration, type: this.type, destination: this.destination }]
    });
};

// Concatenate the "data" arrays of every value emitted for the same number
var reducer = function (k, v) {
    var data = [];
    for (var i = 0; i < v.length; i++) {
        for (var j = 0; j < v[i].data.length; j++) {
            data.push(v[i].data[j]);
        }
    }
    return { data: data };
};

db.vcalls.mapReduce(mapper, reducer, { out: 'reducedcalls' })
This should give you a single record per number with a list that contains the calls (map-reduce stores the reduced result under each output document's value field).
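If you are on MongoDB 2.6 or newer, the aggregation framework can produce the same grouping without map-reduce; this is a sketch that reuses the collection and field names from above, with $out writing the result to 'reducedcalls':

db.vcalls.aggregate([
    // Group all calls that share a number and push each call's fields into one array
    { $group: {
        _id: '$number',
        data: { $push: { date: '$date', duration: '$duration', type: '$type', destination: '$destination' } }
    } },
    // Write the grouped documents to a collection instead of returning them inline
    { $out: 'reducedcalls' }
])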
It is taking almost 10 hours to finish loading into the tables.
Here is my ctl file:
OPTIONS (
skip =1,
ERRORS=4000000,
READSIZE=5000000,
BINDSIZE=8000000,
direct=true
unrecoverable
)
load data
INFILE 'weeklydata1108.csv'
INSERT INTO TABLE t_location_data
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(f_start_ip,
f_end_ip,
f_country,
f_count_code ,
f_reg,
f_stat,
f_city,
f_p_code ,
f_area,
f_lat,
f_long,
f_anon_stat ,
f_pro_detect date "YYYY-MM-DD",
f_date "SYSDATE")
And the sqlldr command for running it is:
sqlldr username#\"\(DESCRIPTION=\(ADDRESS=\(HOST=**mydbip***\)\(PROTOCOL=TCP\)\(PORT=1521\)\)\(CONNECT_DATA=\(SID=Real\)\)\)\"/geolocation control='myload.ctl' log='insert.log' bad='insert.bad'
db.fs.files.distinct( "metadata.user" )
[
"5027",
"6048",
"6049",
]
The X below represents where I would like the numbers from the above query to appear:
db.fs.files.find({ 'metadata.user' : X, 'metadata.folder' : 'inbox' }).count()
I'm trying to find a way to iterate through each of the users in the first query and count the total number of results in the second query. Is there an easy way to craft this query in the MongoDB Shell?
The output I would be looking for would be (just looking for pure numbers):
User Inbox
5027 9872
6048 12
6049 125
Update:
I was able to accomplish something pretty close to what I was looking for:
# for x in $(psql -d ****** postgres -c "select user_name from users where user_name ~ '^[0-9]+$' order by user_name;" | tail -n +3 | head -n -2); do mongo vmdb --quiet --eval "db.fs.files.find({ 'metadata.user' : '$x'}).count()"; done| sort -nr
1381
1073
982
However, I'm missing out on the username part. The point is to generate a list of users with the number of messages in their mailboxes.
Please try this:
// Iterate over the distinct users and count each user's inbox messages
var users = db.fs.files.distinct("metadata.user");
users.forEach(function (user) {
    var inboxCount = db.fs.files.find({ 'metadata.user': user, 'metadata.folder': 'inbox' }).count();
    print(user + "\t" + inboxCount);
});
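If your MongoDB build includes the aggregation framework, the same per-user inbox counts can be produced in a single query; a sketch using the fields from the question:

db.fs.files.aggregate([
    // Keep only inbox messages
    { $match: { 'metadata.folder': 'inbox' } },
    // Count the remaining documents per user
    { $group: { _id: '$metadata.user', inbox: { $sum: 1 } } },
    // Sort by user id for readability
    { $sort: { _id: 1 } }
])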
Is it possible for the COPY command to evaluate expressions upon insertion?
For example, consider the following table
create table test1 ( a int, b int)
and we have a file to import:
5 , case when b = 1 then 100 else 101
25 , case when b = 1 then 100 else 101
145, case when b = 1 then 100 else 101
The following command will fail
COPY test1 FROM 'file' USING DELIMITERS ',';
with the following error
ERROR: invalid input syntax for integer
which means that it cannot evaluate the CASE expression. Is there any workaround?
The COPY command only copies data (obviously) and does not evaluate SQL code, as explained in the documentation: http://www.postgresql.org/docs/9.3/static/sql-copy.html
As far as I know there is no workaround to make COPY evaluate SQL code.
You must preprocess your CSV file and convert it to a standard SQL script with INSERT statements of this form:
INSERT INTO your_table VALUES(145, CASE WHEN 1 = 1 THEN 100 ELSE 101 END);
Then execute the SQL script with the client you are using; e.g. with psql you would use the -f option:
psql -d your_database -f your_sql_script
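If the real goal is to compute a column value during the load (rather than to store literal SQL text in the CSV), another common workaround is to COPY the raw values into a staging table and evaluate the expression with INSERT ... SELECT. A sketch, assuming the file actually carries the raw value of b; the staging table name is illustrative:

-- Staging table mirroring the raw CSV columns
CREATE TEMP TABLE test1_staging (a int, b int);

-- Plain COPY into the staging table, no expressions needed here
COPY test1_staging FROM 'file' USING DELIMITERS ',';

-- Evaluate the CASE expression while moving the rows into the real table
INSERT INTO test1 (a, b)
SELECT a, CASE WHEN b = 1 THEN 100 ELSE 101 END
FROM test1_staging;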
I'm fairly new to postgres. I am trying to copy over a file from my computer to a postgres server. I first initialize the table with
CREATE TABLE coredb (
id text, lng numeric(6,4), lat numeric(6,4),
score1 numeric(5,4), score2 numeric(5,4));
And my CSV looks like this:
ID lng lat score1 score2
1 -72.298 43.218 0.561 0.894
2 -72.298 43.218 0.472 0.970
3 -72.285 43.250 0.322 0.959
4 -72.285 43.250 0.370 0.934
5 -72.325 43.173 0.099 0.976
6 -72.325 43.173 0.099 0.985
However, when I try to copy the CSV over, I get the following error
COPY coredb FROM '/home/usr/Documents/filefordb.csv' DELIMITER ',' CSV;
ERROR: invalid input syntax for type numeric: "lng"
CONTEXT: COPY nhcore, line 1, column lng: "lng"
Oddly enough, the CSV imports just fine when I set the CREATE TABLE parameters to text for all the columns. Could someone explain why this is happening? I am using psql 9.4.1.
You have to use HEADER true to tell COPY to skip the header line. Without it, COPY tries to parse the literal column name "lng" from the first line as a numeric value, which is why the import only succeeds when every column is declared as text.
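For example, the COPY command from the question with the header option added:

COPY coredb FROM '/home/usr/Documents/filefordb.csv' DELIMITER ',' CSV HEADER;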
I am trying to import a CSV file into a PostgreSQL database.
I already tried set datestyle = mdy
\copy "Recon".snapdeal_sales (REFERENCES , ORDER_CODE ,SUB_ORDER_CODE ,
PRODUCT_NAME , ORDER_VERIFIED_DATE , ORDER_CREATED_DATE, AWBNO ,
SHIPPING_PROVIDER , SHIPPING_CITY , SHIPPING_METHOD , INVOICE_NUMBER ,
INVOICE_DATE , IMEI_SERIAL , STATUS , MANIFEST_BY_DATE , SHIPPED_ON ,
DELIVERED_ON , RETURN_INITIATED_ON , RETURN_DELIVERED_ON , SKU_CODE ,
PACKAGE_ID ,PRODUCT_CATEGORY, ATTRIBUTES , IMAGE_URL , PDP_URL , FREEBIES
,TRACKING_URL , ITEM_ID , MANIFEST_CODE , PROMISED_SHIP_DATE ,
NON_SERVICABLE_FROM , HOLD_DATE , HOLD_REASON , MRP
,EXPECTED_DELIVERY_DATE ,TAX_PERCENTAGE , CREATED ,RPI_DATE
,RPI_ISSUE_CATEGORY , RPR_DATE) FROM 'C:\Users\YAM\Documents\SALES.csv' DELIMITER ',' CSV HEADER;
First, run this query:
SET datestyle = dmy;
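If you are not sure which order the server currently expects, you can check it first; a quick sketch in psql:

SHOW datestyle;       -- shows the current setting, e.g. "ISO, MDY"
SET datestyle = dmy;  -- tell the server the dates in the file are day-month-year

Then re-run the \copy from the question.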
In my case I was getting this error:
psycopg2.errors.DatetimeFieldOverflow: date/time field value out of range 23-09-2021
Solution: check the date format used in your database and change the date format in your query accordingly.
In my case it was in the format yyyy-mm-dd and I was querying with dd-mm-yyyy, which caused the error; simply change the format.
Example: http://localhost:port/tweets?date=2021-09-23