So I know the command to export from the command line is
mongoexport --host localhost --db dbname --collection name --csv > test.csv
but I also need to do a "join" (in quotes because I know Mongo doesn't actually do joins, so to speak). I haven't found documentation on doing this in the mongo shell.
I have two collections, users and details.
I want to export: users (name,email) and details (city,country) where users (place [object ID]) = details (_id [object ID]).
Tables look like this:
users
+----+------+---------------+-----------------------+
| id | name | email | place |
+----+------+---------------+-----------------------+
| 1 | bob | bob#bob.com | ObjectId("123456abc") |
| 2 | mark | mark#mark.com | ObjectId("654321abc") |
| 3 | dave | dave#dave.com | ObjectId("987655abc") |
+----+------+---------------+-----------------------+
details
+----+-------+---------------+-----------------------+
| id | city | country | _id |
+----+-------+---------------+-----------------------+
| 1 | Perth | Australia | ObjectId("123456abc") |
| 2 | Tokyo | Japan | ObjectId("654321abc") |
| 3 | NY | United States | ObjectId("987655abc") |
+----+-------+---------------+-----------------------+
And the result I'm trying to achieve is this:
+----+------+---------------+-------+---------------+
| id | name | email | city | country |
+----+------+---------------+-------+---------------+
| 1 | bob | bob#bob.com | Perth | Australia |
| 2 | mark | mark#mark.com | Tokyo | Japan |
| 3 | dave | dave#dave.com | NY | United States |
+----+------+---------------+-------+---------------+
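One possible approach is to do the "join" on the client side before writing the CSV. The following is only a minimal Python sketch of that idea, assuming both collections have already been read into lists of dicts (for example with a driver, which is not shown here) and that the object IDs compare equal as plain values. The function names join_users_with_details and to_csv are made up for illustration.

```python
import csv
import io


def join_users_with_details(users, details):
    """Attach city/country to each user by matching users.place to details._id."""
    # Index details by _id so each user lookup is a single dict access.
    details_by_id = {d["_id"]: d for d in details}
    rows = []
    for user in users:
        place = details_by_id.get(user["place"], {})
        rows.append({
            "name": user["name"],
            "email": user["email"],
            "city": place.get("city", ""),
            "country": place.get("country", ""),
        })
    return rows


def to_csv(rows):
    """Serialize the joined rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["name", "email", "city", "country"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Sample data copied from the tables above (object IDs shown as plain strings).
users = [
    {"name": "bob", "email": "bob#bob.com", "place": "123456abc"},
    {"name": "mark", "email": "mark#mark.com", "place": "654321abc"},
    {"name": "dave", "email": "dave#dave.com", "place": "987655abc"},
]
details = [
    {"_id": "123456abc", "city": "Perth", "country": "Australia"},
    {"_id": "654321abc", "city": "Tokyo", "country": "Japan"},
    {"_id": "987655abc", "city": "NY", "country": "United States"},
]

joined = join_users_with_details(users, details)
csv_text = to_csv(joined)
```

On the server side, the $lookup aggregation stage (MongoDB 3.2 and later) does a comparable match between two collections, so the same result could probably be produced there as well.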
Note: I've already gone over related questions such as the following, which don't address my query:
SQL: how to pick one row for each set of rows with duplicate value in one column?
Fill missing values with first non-null following value in Redshift
I have a sparse, unclean dataset like this
| id | operation | title | channel_type | mode |
|-----|-----------|----------|--------------|------|
| abc | Start | | | |
| abc | Start | recovery | | Link |
| abc | Start | recovery | SMS | |
| abc | Set | | Email | |
| abc | Verify | | Email | |
| pqr | Start | | | OTP |
| pqr | Verfiy | sign_in | Push | |
| pqr | Verify | | | |
| xyz | Start | sign_up | | Link |
and I need to fill the empty fields in each id's rows with non-empty values available from that id's other rows
| id | operation | title | channel_type | mode |
|-----|-----------|----------|--------------|------|
| abc | Start | recovery | SMS | Link |
| abc | Start | recovery | SMS | Link |
| abc | Start | recovery | SMS | Link |
| abc | Set | recovery | Email | Link |
| abc | Verify | recovery | Email | Link |
| pqr | Start | sign_in | Push | OTP |
| pqr | Verfiy | sign_in | Push | OTP |
| pqr | Verify | sign_in | Push | OTP |
| xyz | Start | sign_up | | Link |
Notes:
some ids can have a certain field empty in all rows
while most ids will have the same non-empty value for each field, edge cases could have different values. For such groups, filling any one of the non-empty values into all rows is acceptable. [this is too rare in my dataset and can be ignored]
another pattern is that certain fields are mostly present only on rows of certain operations, e.g. mode is only present on operation='Start' rows
I've tried grouping rows by id while performing listagg over title, channel_type and mode columns, followed by coalesce, something along the lines of this:
WITH my_data AS (
    SELECT
        id,
        operation,
        title,
        channel_type,
        mode
    FROM
        my_db.my_table
),
list_aggregated_data AS (
    SELECT
        id,
        listagg(title) AS titles,
        listagg(channel_type) AS channel_types,
        listagg(mode) AS modes
    FROM
        my_data
    GROUP BY
        id
),
coalesced_data AS (
    SELECT DISTINCT
        id,
        coalesce(titles) AS title,
        coalesce(channel_types) AS channel_type,
        coalesce(modes) AS mode
    FROM
        list_aggregated_data
),
joined_data AS (
    SELECT
        md.id,
        md.operation,
        cd.title,
        cd.channel_type,
        cd.mode
    FROM
        my_data AS md
    LEFT JOIN
        coalesced_data AS cd ON cd.id = md.id
)
SELECT
    *
FROM
    joined_data
ORDER BY
    id,
    operation
But for some reason this results in concatenated values (presumably from the coalesce operation), and I get
| id | operation | title | channel_type | mode |
|-----|-----------|------------------|--------------|------|
| abc | Start | recoveryrecovery | SMS | Link |
| abc | Start | recoveryrecovery | SMS | Link |
| abc | Start | recoveryrecovery | SMS | Link |
| abc | Set | recoveryrecovery | Email | Link |
| abc | Verify | recoveryrecovery | Email | Link |
| pqr | Start | sign_in | Push | OTP |
| pqr | Verfiy | sign_in | Push | OTP |
| pqr | Verify | sign_in | Push | OTP |
| xyz | Start | sign_up | | Link |
What's the correct way to approach this problem?
I'd start with the first_value() window function with the ignore nulls option. You will partition by the first 2 columns and will need to work out the edge cases with some data massaging, likely in the order by clause of the window function.
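To make the intended behaviour concrete, here is a small Python sketch of what "first non-null value per partition" does to the sample data. It is not Redshift code. It assumes the partition is the id column alone (chosen because the expected output fills values across different operations), that empty fields arrive as empty strings or None, and that rows are processed in the order listed. The function name fill_from_group is hypothetical.

```python
def fill_from_group(rows, key="id", fields=("title", "channel_type", "mode")):
    """Fill empty fields with the first non-empty value seen for the same key."""
    # Pass 1: remember the first non-empty value per (key, field),
    # similar in spirit to first_value(... ignore nulls) over a partition.
    first_seen = {}
    for row in rows:
        for field in fields:
            value = row.get(field)
            if value:
                first_seen.setdefault((row[key], field), value)

    # Pass 2: keep existing values, fill only the empty ones (like coalesce).
    filled = []
    for row in rows:
        new_row = dict(row)
        for field in fields:
            if not new_row.get(field):
                new_row[field] = first_seen.get((row[key], field), "")
        filled.append(new_row)
    return filled


columns = ("id", "operation", "title", "channel_type", "mode")
raw = [
    ("abc", "Start", "", "", ""),
    ("abc", "Start", "recovery", "", "Link"),
    ("abc", "Start", "recovery", "SMS", ""),
    ("abc", "Set", "", "Email", ""),
    ("abc", "Verify", "", "Email", ""),
    ("pqr", "Start", "", "", "OTP"),
    ("pqr", "Verfiy", "sign_in", "Push", ""),
    ("pqr", "Verify", "", "", ""),
    ("xyz", "Start", "sign_up", "", "Link"),
]
rows = [dict(zip(columns, values)) for values in raw]
filled = fill_from_group(rows)
```

Rows that already carry a value (such as channel_type = Email) keep it, and a field that is empty for every row of an id (channel_type for xyz) stays empty.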
I am new to PostgreSQL.
I have a big table that can be split into multiple tables:
ID_Student | Name_Student | Departement_Student | is_Student_works | job_title | Work_Departement | Location |
==============================================================================================================
1          | Rolf         | Software Eng        | Yes              | intern SE | data Studio      | london   |
2          | Silvya       | Accounting          | Yes              | Accounter | TORnivo          | New York |
I want to split it into 3 tables ( student, departement, work) :
STUDENT TABLE
ID_Student | Name_Student | is_Student_works | Location|
========================================================
1 | Rolf | yes | london |
2 | Silvya | Yes | New York|
DEPARTEMENT TABLE
ID_DEPARTEMENT | Name_DEPARTEMENT |
===================================
1 | Software Eng |
2 | Accounting |
WORK TABLE
ID_WORK | Name_WORK |
===================================
1 | intern SE |
2 | Accounter |
I only need the query that splits the table into multiple tables; the creation of the tables is not needed.
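One way this kind of split is commonly written is with INSERT ... SELECT statements, one per target table. The sketch below runs them through Python's sqlite3 module against an in-memory database purely so that it is runnable; the INSERT ... SELECT statements themselves are plain SQL that should carry over to PostgreSQL, but that is an assumption to verify. The source table name big_table and the lowercase target table names are made up, and the new IDs are assumed to be generated by the target tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Setup only (hypothetical names); the question says the tables already exist.
CREATE TABLE big_table (
    ID_Student INTEGER, Name_Student TEXT, Departement_Student TEXT,
    is_Student_works TEXT, job_title TEXT, Work_Departement TEXT, Location TEXT
);
INSERT INTO big_table VALUES
    (1, 'Rolf', 'Software Eng', 'Yes', 'intern SE', 'data Studio', 'london'),
    (2, 'Silvya', 'Accounting', 'Yes', 'Accounter', 'TORnivo', 'New York');
CREATE TABLE student (ID_Student INTEGER PRIMARY KEY, Name_Student TEXT,
                      is_Student_works TEXT, Location TEXT);
CREATE TABLE departement (ID_DEPARTEMENT INTEGER PRIMARY KEY, Name_DEPARTEMENT TEXT);
CREATE TABLE work (ID_WORK INTEGER PRIMARY KEY, Name_WORK TEXT);

-- The split itself: one INSERT ... SELECT per target table.
INSERT INTO student (ID_Student, Name_Student, is_Student_works, Location)
SELECT ID_Student, Name_Student, is_Student_works, Location
FROM big_table;

-- GROUP BY removes duplicates; ORDER BY keeps the numbering stable.
INSERT INTO departement (Name_DEPARTEMENT)
SELECT Departement_Student
FROM big_table
GROUP BY Departement_Student
ORDER BY MIN(ID_Student);

INSERT INTO work (Name_WORK)
SELECT job_title
FROM big_table
GROUP BY job_title
ORDER BY MIN(ID_Student);
""")

students = conn.execute("SELECT * FROM student ORDER BY ID_Student").fetchall()
departements = conn.execute("SELECT * FROM departement ORDER BY ID_DEPARTEMENT").fetchall()
works = conn.execute("SELECT * FROM work ORDER BY ID_WORK").fetchall()
```

Note that this sketch does not keep any link from a student to a departement or work row, because the target tables shown in the question have no such column.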
I am trying to understand how to pivot data within T-SQL but can't seem to get it working. I have the following table structure
+-------------------+-----------------------+
| Name | Value |
+-------------------+-----------------------+
| TaskId | 12417 |
| TaskUid | XX00044497 |
| TaskDefId | 23 |
| TaskStatusId | 4 |
| Notes | |
| TaskActivityIndex | 0 |
| ModifiedBy | Orange |
| Modified | /Date(1554540200000)/ |
| CreatedBy | Apple |
| Created | /Date(2121212100000)/ |
| TaskPriorityId | 40 |
| OId | 2 |
+-------------------+-----------------------+
I want to pivot the Name column so that each name becomes a column. Expected output:
+--------+------------------------+-----------+--------------+-------+-------------------+------------+-----------------------+-----------+-----------------------+----------------+-----+
| TASKID | TASKUID | TASKDEFID | TASKSTATUSID | NOTES | TASKACTIVITYINDEX | MODIFIEDBY | MODIFIED | CREATEDBY | CREATED | TASKPRIORITYID | OID |
+--------+------------------------+-----------+--------------+-------+-------------------+------------+-----------------------+-----------+-----------------------+----------------+-----+
| | | | | | | | | | | | |
| 12417 | XX00044497 | 23 | 4 | | 0 | Orange | /Date(1554540200000)/ | Apple | /Date(2121212100000)/ | 40 | 2 |
+--------+------------------------+-----------+--------------+-------+-------------------+------------+-----------------------+-----------+-----------------------+----------------+-----+
Is there an easy way of doing it? The columns are fixed (not dynamic).
Any help appreciated
Try this:
select * from yourtable
pivot
(
min(value)
for Name in ([TaskID],[TaskUID],[TaskDefID]......)
) as pivotable
You can also use CASE expressions.
PIVOT requires an aggregate function, which is why min(value) is used above.
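The CASE alternative can be sketched as conditional aggregation: one MIN(CASE WHEN ...) per output column. The example below runs it through Python's sqlite3 module only so that it is self-contained and runnable; SQLite stands in for SQL Server here, and the table name yourtable is taken from the query above. Only three of the columns are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (Name TEXT, Value TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [
        ("TaskId", "12417"),
        ("TaskUid", "XX00044497"),
        ("TaskDefId", "23"),
        ("TaskStatusId", "4"),
    ],
)

# Each CASE yields the value for one name and NULL otherwise;
# MIN ignores the NULLs, so exactly one value per column survives.
pivot_sql = """
SELECT
    MIN(CASE WHEN Name = 'TaskId'    THEN Value END) AS TaskId,
    MIN(CASE WHEN Name = 'TaskUid'   THEN Value END) AS TaskUid,
    MIN(CASE WHEN Name = 'TaskDefId' THEN Value END) AS TaskDefId
FROM yourtable
"""
row = conn.execute(pivot_sql).fetchone()
```

Names that are not listed (TaskStatusId in this sample) are simply left out of the result.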
If you want to learn more, here is the reference:
https://learn.microsoft.com/en-us/sql/t-sql/queries/from-using-pivot-and-unpivot?view=sql-server-2017
Output (I only tried three columns):
DB<>Fiddle
I have a MySQL table with a text field "email", which can contain either a single address such as "user#example.com" or several, such as "user1#example.com;user2#example.com;user3#example.com".
| Name | Email |
| user | user#example.com |
| user1 | user1#example.com;user2#example.com;user3#example.com |
How can I produce output like this with Talend?
| Name | Email |
| user | user#example.com |
| user1 | user1#example.com |
| user1 | user2#example.com |
| user1 | user3#example.com |
The tNormalize component does exactly this. You can provide a character for separation, in your case ; and get rows as result afterwards.
EDIT
AxelH pointed out that the separator can also be a String, not just a single Character.
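For readers who want to see the row-level effect, here is a small Python sketch of the same normalization. It is not Talend code and says nothing about how tNormalize is implemented; the function name normalize and its parameters are made up for illustration.

```python
def normalize(rows, column="Email", separator=";"):
    """Emit one output row per separated item in the given column."""
    out = []
    for row in rows:
        for item in row[column].split(separator):
            item = item.strip()
            if item:  # skip empty pieces, e.g. from a trailing separator
                out.append({**row, column: item})
    return out


rows = [
    {"Name": "user", "Email": "user#example.com"},
    {"Name": "user1", "Email": "user1#example.com;user2#example.com;user3#example.com"},
]
normalized = normalize(rows)
```

Because str.split accepts a separator of any length, the same sketch also covers the multi-character separator case mentioned in the edit.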
I can see that it is possible to add metadata to a Rackspace virtual machine instance.
I want to get a list of running instances, filtered by a particular metatag value.
However, I can't see how to do so in the documentation.
Is it possible?
You should be able to do so using the openstack client... but it depends on which metatag you're interested in.
You can get a list of all servers:
openstack server list
This will print something like
+--------------------------------------+------------------+--------+-----------------------------------------------------------------------------------------------------------+
| ID | Name | Status | Networks |
+--------------------------------------+------------------+--------+-----------------------------------------------------------------------------------------------------------+
| 97606ae9-7f18-4a3c-903a-1583d446119b | trysmallwin | ERROR | |
| cb78b8d5-2f03-4a3f-ab26-f389acbd0b76 | Win-try again | ERROR | public=2607:f298:5:101d:f816:3eff:fe9e:5cd4, 208.113.133.90, 2607:f298:5:101d:f816:3eff:fe36:da45, |
| | | | 208.113.133.93, 2607:f298:5:101d:f816:3eff:fe40:57d5, 208.113.133.95 |
| 040751d1-c4c5-47aa-8dec-1d69a468be1c | hnxhdkwskrvwvdwr | ACTIVE | public=2607:f298:5:101d:f816:3eff:fe60:324, 208.113.130.52 |
+--------------------------------------+------------------+--------+-----------------------------------------------------------------------------------------------------------+
Note the ID of the server and investigate further:
openstack server show 040751d1-c4c5-47aa-8dec-1d69a468be1c
+--------------------------------------+------------------------------------------------------------+
| Field | Value |
+--------------------------------------+------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | iad-2 |
| OS-EXT-STS:power_state | Running |
| OS-EXT-STS:task_state | None |
| OS-EXT-STS:vm_state | active |
| OS-SRV-USG:launched_at | 2016-07-26T17:32:01.000000 |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | public=2607:f298:5:101d:f816:3eff:fe60:324, 208.113.130.52 |
| config_drive | True |
| created | 2016-07-26T17:31:51Z |
| flavor | gp1.semisonic (50) |
| hostId | e1efd75d1e8f6a7f5bb228a35db13647281996087d39c65af8ce83d9 |
| id | 040751d1-c4c5-47aa-8dec-1d69a468be1c |
| image | Ubuntu-14.04 (03f89ff2-d66e-49f5-ae61-656a006bbbe9) |
| key_name | stef |
| name | hnxhdkwskrvwvdwr |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| project_id | d2fb6996496044158cf977c2129c8660 |
| properties | |
| security_groups | [{u'name': u'default'}] |
| status | ACTIVE |
| updated | 2016-07-26T17:32:01Z |
| user_id | 5b2ca246f39a425f9a833460bf322603 |
+--------------------------------------+------------------------------------------------------------+
Adding -f json to these openstack commands will output the same data in JSON format, which you can more easily manipulate programmatically.
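As a rough idea of that programmatic step, the Python sketch below filters a JSON server list by a metadata value. The sample payload is made up: the "Properties" key, its dict shape, and the "role" metatag are assumptions, and the real field name and format depend on the client version and on which columns the command prints. The function name filter_by_metadata is hypothetical.

```python
import json

# Made-up sample standing in for captured JSON output of the server list.
sample_output = """
[
  {"ID": "97606ae9-7f18-4a3c-903a-1583d446119b", "Name": "trysmallwin",
   "Status": "ERROR", "Properties": {}},
  {"ID": "cb78b8d5-2f03-4a3f-ab26-f389acbd0b76", "Name": "Win-try again",
   "Status": "ERROR", "Properties": {"role": "web"}},
  {"ID": "040751d1-c4c5-47aa-8dec-1d69a468be1c", "Name": "hnxhdkwskrvwvdwr",
   "Status": "ACTIVE", "Properties": {"role": "web"}}
]
"""


def filter_by_metadata(servers, key, value, status="ACTIVE"):
    """Keep servers in the given status whose metadata has key == value."""
    return [
        server
        for server in servers
        if server.get("Status") == status
        and server.get("Properties", {}).get(key) == value
    ]


servers = json.loads(sample_output)
matches = filter_by_metadata(servers, "role", "web")
match_names = [server["Name"] for server in matches]
```

In practice sample_output would be replaced by the text captured from the command, and the key names adjusted to whatever that output actually contains.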
HTH