How to filter rows based on a condition and if the condition isn't met, grab another row in Talend? - talend

It was hard to think of a title for this question, so hopefully it makes sense.
I will explain further. I have a flow of data from an Excel file, and each row has one of two words in the last column: either "Open" or "Current".
So let's say I have an input that looks like this:
NAME | SSN | TYPE
John | 12345| Current
Katy | 99999| Current
Sam | 33333| Current
John | 12345| Open
Cody | 55555| Open
The goal is to grab each person only once. Each person is uniquely identified by their SSN. If both an Open and a Current row exist for a person, I want to grab the Open row. If only Current exists, then grab that.
So the final output should look like this:
NAME | SSN | TYPE
Katy | 99999| Current
Sam | 33333| Current
John | 12345| Open
Cody | 55555| Open
NOTE: As you can see, the first entry for John has been removed since he had an Open row.
I have attempted this already but it is sloppy and I figure there must be a better way. Here is an image of what I have done:
[image: Talend flow]

Here's how you can do it:
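First sort the data by name ascending and type descending. This is important: "Open" sorts after "Current" alphabetically, so descending order puts each person's Open record on top. Then filter it in the tMap like this:
Numeric.sequence(row2.name, 1, 1) == 1
Numeric.sequence keeps a separate counter for each sequence name (here, the person's name), so the expression is true only the first time we see that name, and only that record is let through. If two different people can share a name, sort and key on the SSN instead, since that is the unique id.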
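To make the sort-then-keep-first idea concrete outside Talend, here is a minimal Python sketch. It is not Talend code: the function name keep_preferred and the tuple layout are made up for illustration, and it keys on SSN (the unique id from the question) rather than on name.

```python
# Sample rows from the question: (name, ssn, type).
rows = [
    ("John", "12345", "Current"),
    ("Katy", "99999", "Current"),
    ("Sam", "33333", "Current"),
    ("John", "12345", "Open"),
    ("Cody", "55555", "Open"),
]

def keep_preferred(rows):
    """Keep one row per SSN, preferring "Open" over "Current"."""
    # Type descending: "Open" sorts before "Current".
    ordered = sorted(rows, key=lambda r: r[2], reverse=True)
    # Stable sort by SSN keeps the Open row first within each person.
    ordered = sorted(ordered, key=lambda r: r[1])
    seen = set()
    result = []
    for name, ssn, row_type in ordered:
        if ssn not in seen:  # same role as Numeric.sequence(...) == 1
            seen.add(ssn)
            result.append((name, ssn, row_type))
    return result

result = keep_preferred(rows)
# John's "Current" row is dropped; everyone else is kept once.
```

The output comes back ordered by SSN rather than in the original input order; if the original order matters, an extra sequence column would have to be carried through and sorted on at the end.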

Related

Lookup in LibreOffice Calc to match partial string of the search criterion

I'm putting my banking information in a LibreOffice Calc spreadsheet.
I have columns as follows, that I imported from my bank account:
Date | USD | Description | Category
----------+-------+---------------------------------------------+----------
2/28/2019 | 44.00 | POS 0123 2345 123456 FRED-MEYER #02 | groceries
2/27/2019 | 2.50 | PANDA EXPRESS #123 TIGARD OR - 123546789012 | lunch
These descriptions are very unpredictable, they're whatever merchants want them to be, but they can tell me useful information about what I spent my money on. I've used this information to manually enter the categories. But I'm looking to automate this for common things.
So, I created a separate sheet with lookup values, like so:
Expression | Category
-----------+----------
FRED-MEYER | groceries
PANDA | lunch
I am looking for a formula in the "Category" column of the first sheet to automatically determine categories, based on the lookup table in the second sheet. (Obviously I don't plan for the lookup table to be exhaustive, but whatever I don't put in there, I can enter in the first sheet manually, thus overwriting the formulas.)
I had this working fine in Excel using a nifty construct of SEARCH and MATCH. (I don't even understand it anymore and I don't have Excel to check.) But since I'm now a Linux user, I'm trying to use LibreOffice, and I've not been able to make this work with formulas. I tried SEARCH, MATCH, LOOKUP, VLOOKUP, FIND, with and without regexes and with different options on/off. But no success so far.
I think this is very similar to this question, though it was only answered for Excel (I'm using Calc).
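For clarity, the behaviour being asked for (not a Calc formula) can be sketched in Python roughly as follows. The function name categorize is hypothetical, and case-insensitive matching with "first match wins" is an assumption about what the lookup should do.

```python
# Lookup table from the second sheet: (expression, category).
LOOKUP = [
    ("FRED-MEYER", "groceries"),
    ("PANDA", "lunch"),
]

def categorize(description, lookup=LOOKUP):
    """Return the category of the first expression contained in the description."""
    text = description.upper()
    for expression, category in lookup:
        if expression.upper() in text:  # partial (substring) match
            return category
    return None  # no match: the category is entered manually

first = categorize("POS 0123 2345 123456 FRED-MEYER #02")
second = categorize("PANDA EXPRESS #123 TIGARD OR - 123546789012")
```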

Is it possible to make multiple fields default to the same date, but also be individually editable?

I am VERY new to Access - I was sort of thrust into designing a database for a research project I'm involved in. So, please bear with me because I know next to nothing :) The problem I am having is thus:
My database is for a medical research project, and is very time and date dependent, by which I mean I need to capture the date and time for each piece of data so that we end up with a sort of timeline of events for each subject.
As is, I have something like the following for each piece of data (each in its own field):
ArrivalDate
ArrivalTime
HeartRateDate
HeartRateTime
HeartRateData
TemperatureDate
TemperatureTime
TemperatureData
BloodPressureDate
BloodPressureTime
BloodPressureData
There are around 200 similar pieces of data that I need to collect for each patient. To avoid having to re-enter the same data over and over, and also to reduce the potential for error, I would like to have all of the date fields in a given patient record default to the first one that is entered, in this case "Arrival Date". However, I also need each date field to be editable without affecting the others. The reason for this is that in the event that a patient's visit occurs over the span of a few days we can accurately record that.
I have tried messing around with the default value setting, as well as setting the control source to reference the "Arrival Date" field, but then of course any changes to one field affect them all. I am not even sure that what I am trying to do is possible but I will appreciate any help and/or suggestions!
Thank you in advance
Having all this data in separate columns of a big table isn't going to work. You don't measure things like temperature or blood pressure only once per patient, do you?
This is a classic one-to-many relation.
You should have a separate Measurements table, looking e.g. like this:
+--------+-----------+---------------+------------------+-----------+
| MeasID | PatientID | MeasType | MeasDateTime | MeasValue |
+--------+-----------+---------------+------------------+-----------+
| 1 | 1 | Temperature | 2017-05-17 14:30 | 38.2 |
| 2 | 1 | BloodPressure | 2017-05-17 14:30 | 130/90 |
| 3 | 1 | Temperature | 2017-05-17 18:00 | 38.5 |
| 4 | 2 | Temperature | etc. | |
+--------+-----------+---------------+------------------+-----------+
As Barmar wrote, there is no reason to have separate columns for date and time.
In the form where measurements are entered, you can use the BeforeInsert event to set MeasDateTime to the current time, with the Now() function.
So the user never has to enter it manually, but they can edit it if the measurement was at a different time than entering the data.
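As a rough illustration of that table design, here is a small Python sketch using SQLite as a stand-in for Access (an assumption made purely so the example is runnable). A column default plays the role of the BeforeInsert/Now() step: the timestamp is filled in automatically but can still be supplied or edited per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Measurements (
        MeasID       INTEGER PRIMARY KEY,
        PatientID    INTEGER NOT NULL,
        MeasType     TEXT NOT NULL,
        MeasDateTime TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        MeasValue    TEXT
    )
""")

# No timestamp given: the default fills in the current time (UTC in SQLite).
conn.execute(
    "INSERT INTO Measurements (PatientID, MeasType, MeasValue) "
    "VALUES (1, 'Temperature', '38.2')"
)
# The measurement happened earlier, so the timestamp is entered explicitly.
conn.execute(
    "INSERT INTO Measurements (PatientID, MeasType, MeasDateTime, MeasValue) "
    "VALUES (1, 'BloodPressure', '2017-05-17 14:30', '130/90')"
)

rows = conn.execute(
    "SELECT MeasType, MeasDateTime, MeasValue FROM Measurements ORDER BY MeasID"
).fetchall()
```

Each new kind of measurement is then just another MeasType value instead of three more columns.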

Crystal Reports 2011 - Suppressing Information Based on Certain Criteria

I'm going to attempt to word this question without being too confusing.
We have a report in which we want to show each patient and their insurance. Each insurance in the patient's record is numbered by an Order Number. However, we don't only want to show that; I want to add criteria so that if Insurance A has Order Number 1 under the Patient ID, the report shows all of this patient's insurance. If the patient does not have Insurance A in Order Number 1, the report should not show this patient or any of their information.
In the code below, guarantor is referring to insurance. So order number and guarantor name is what we're focusing on. Here's the code I've put into Section Expert for the Suppress option. What I assume is if it meets the criteria, TRUE will suppress the information, else FALSE will allow the information. However, this is not sufficient as it suppresses all of the other information.
if {billing_guar_order_no_ep.guarantor_order_number} = "1" AND
{billing_guar_order_no_ep.guarantor_name} = "Medicare" then
false
else
true
What I'm assuming is it will need to iterate or loop through every patient and if it finds this information, list ALL of the patient's information and move forward, else suppress and move forward. I hope this makes sense.
Example:
|Patient ID|Order Number|Guarantor Name|
| -------- | ---------- | ------------ |
|1 | 1|Medicare |
|1 | 2|Medicaid |
|2 | 1|Medicaid |
|2 | 2|Medicare |
In the above example, what I want is for the report to show everything from Patient #1 (including all order numbers) and to not even show Patient #2 in the report. However, what's happening is Patient #1 does show up, but only Order Number 1; it suppresses all the other information.
What am I missing?
The query that you want will be an adaptation of this:
select *
from data d
where not exists (
    select 1
    from data
    where pat_id = d.pat_id
      and order_id = 1
      and guarantor_name = 'Medicaid'
)
The 'Linking Expert' doesn't support this syntax, so you'll need to use a Command instead.
Process:
get the current query by selecting Database | Show SQL Query ...
create a new report
select 'add command' from within the database expert
paste the query, then adapt it
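One possible adaptation for the example in the question, sketched in Python with SQLite standing in for the real database (an assumption; the table and column names data, pat_id, order_id and guarantor_name are taken from the query above, not from the actual schema). It uses EXISTS to keep every row of a patient who has Medicare as order number 1:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (pat_id INTEGER, order_id INTEGER, guarantor_name TEXT)"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        (1, 1, "Medicare"),
        (1, 2, "Medicaid"),
        (2, 1, "Medicaid"),
        (2, 2, "Medicare"),
    ],
)

query = """
    SELECT d.pat_id, d.order_id, d.guarantor_name
    FROM data d
    WHERE EXISTS (
        SELECT 1
        FROM data x
        WHERE x.pat_id = d.pat_id
          AND x.order_id = 1
          AND x.guarantor_name = 'Medicare'
    )
    ORDER BY d.pat_id, d.order_id
"""
shown = conn.execute(query).fetchall()
# Patient 1 appears with both order numbers; patient 2 does not appear at all.
```

Because the filtering happens per patient in the query, nothing needs to be suppressed per row in the Section Expert.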

merging rows in postgres with matching fields

I have a table format that appears in this type of format:
email | interest | major | employed |inserttime
jake#example.com | soccer | | true | 12:00
jake#example.com | | CS | true | 12:01
Essentially, this is a survey application and users sometimes hit the back button to add new fields. I later changed the INSERT logic to UPSERT so it just updates the row where email=currentUsersEmail; however, for the data inserted prior to this code change there are many duplicate entries for single users. I have tried some GROUP BYs with no luck, as it continually says:
ID column must appear in the GROUP BY clause or be used in an aggregate function.
Certainly there will be edge cases with clashing data; for example, a user may have entered true for the employed column in the first row and false in the second. For now I am not going to take this into account.
I simply want to merge or flatten these values into a single row, in this case it would look like:
email | interest | major | employed |inserttime
jake#example.com | soccer | CS | true | 12:01
I am guessing I would take the most recent inserttime. I have been writing the web application in Scala/Play, but for this task I think a language like Python might be easier if I cannot do it directly through psql.
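You can GROUP BY email and flatten the other columns using MAX(). MAX() skips NULLs (and an empty string sorts below any other text), so each column ends up with its filled-in value:
SELECT email,
       MAX(interest)   AS interest,
       MAX(major)      AS major,
       MAX(employed)   AS employed,
       MAX(inserttime) AS inserttime
FROM your_table
GROUP BY email;
Note that if employed is a boolean column, PostgreSQL has no MAX() for booleans; use bool_or(employed) for that column instead.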
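Here is a small runnable sketch of the same flattening, using Python with SQLite as a stand-in for PostgreSQL (an assumption made only so the example is self-contained). The table name survey is hypothetical, the blank cells are stored as NULL, and employed is stored as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE survey "
    "(email TEXT, interest TEXT, major TEXT, employed TEXT, inserttime TEXT)"
)
conn.executemany(
    "INSERT INTO survey VALUES (?, ?, ?, ?, ?)",
    [
        # Email written exactly as it appears in the question.
        ("jake#example.com", "soccer", None, "true", "12:00"),
        ("jake#example.com", None, "CS", "true", "12:01"),
    ],
)

merged = conn.execute(
    """
    SELECT email,
           MAX(interest),
           MAX(major),
           MAX(employed),
           MAX(inserttime)
    FROM survey
    GROUP BY email
    """
).fetchall()
# One row per email, with the non-NULL value from each column
# and the latest inserttime.
```

When two rows hold different non-empty values in the same column, MAX() simply picks the larger one, so the clashing-data edge case mentioned in the question would still need its own rule.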

How to create a form for a 2D/multi-dimensional table?

I need to create a form in CakePHP to allow users to enter data (numbers) into the cells of the table below. The input screen needs to look like that table. Users should be able to select any cell whose value they want to enter or update, type in the value, and click submit to save it. Each metric section (e.g. Metric A, Metric B, ...) will have its own submit button so users can edit/update each section of the table separately.
+-----------+------+------+------+
|           | 2006 | 2007 | 2008 |
+-----------+------+------+------+
| METRIC A  |      |      |      |
| item A1   |    1 |    5 |    7 |
| item A2   |   15 |   18 |   21 |
| item A3   |    3 |    6 |   11 |
| item A4   |    1 |    1 |    3 |
+-----------+------+------+------+
| METRIC B  |      |      |      |
| item B1   |   12 |   18 |   31 |
| item B2   |    1 |    4 |    6 |
| item B3   |    0 |    0 |    2 |
+-----------+------+------+------+
As you can see each metric section is a two dimensional table. So I would like to capture the input data in a 2D array. Currently I have successfully created the display for the data (which was much easier). I simply created an array of metrics, which is an array of 2D arrays. Then I passed that array of 2D arrays to the view file to display the table.
I am kind of lost on how to get the user input for this table. Anyone had any similar experiences? Any suggestion will be greatly helpful to me.
You might want to start by researching some JavaScript grid solutions with editing capabilities. Ext's Grid comes to mind, but there will most definitely be alternatives for your JavaScript framework of choice (e.g. jQuery). This will handle all of the onclick goodness on the client side, leaving you to implement an AJAX action that the form can submit data to. There it is up to you to determine which models you update with each part of the data.
In case this becomes useful to anyone else, I'm posting my solution here.
I managed to tackle this one by using 'parallel two-dimensional arrays'.
It's a simple trick. You create the 2D array for the display of data, so you have an array for each row in the table. For example, if the columns are 'Year 2006', 'Year 2007' and 'Year 2008' like in my question above, you create a row of data with a value for each year.
$data_for_row_array = array('10', '15', '35');
In the same way, you create a data array for each row in your table, and you get an array of rows:
$rows_array = array($data_for_row_1_array, $data_for_row_2_array); // ...and so on for each row
This will do for the display array. Now, if you want to capture the user's input for each of those cells, all you need to do is create a second 2D array of the same shape holding the id of each data cell, taken from the database. If an id does not exist yet, just leave it blank.
When the user enters data and submits it, loop through the two 2D arrays together and pair the id from one array with the data from the other, because in both arrays a matching id and its data value are at the same position. So if you find an id in the "ids array" and a data value at the same position in the "data array", you use them to update the database table. This is the concept of 'parallel two-dimensional arrays'.
And if you find an empty id in the ids array but some data value at the same position in the data array, that means the user has entered a completely new value, so you save it as a new data cell in the database.
Hope this gives you some idea. If it isn't clear, just let me know and I will explain it in more detail.
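For anyone who prefers to see the loop written out, here is a minimal sketch of the parallel-arrays idea in Python rather than CakePHP/PHP. The function name plan_saves, the sample ids, and the use of None for "no id yet" and "" for "nothing entered" are all assumptions for illustration.

```python
def plan_saves(ids, values):
    """Walk two same-shaped 2D arrays and split cells into updates and inserts."""
    updates = []  # (existing id, new value)
    inserts = []  # (row index, column index, value) for cells with no id yet
    for r, (id_row, value_row) in enumerate(zip(ids, values)):
        for c, (cell_id, value) in enumerate(zip(id_row, value_row)):
            if value == "":
                continue  # the user left this cell empty
            if cell_id is None:
                inserts.append((r, c, value))
            else:
                updates.append((cell_id, value))
    return updates, inserts

# ids[r][c] is the database id of the cell shown at values[r][c].
ids = [[101, 102, None], [None, 104, 105]]
values = [["10", "15", "35"], ["", "18", "21"]]

updates, inserts = plan_saves(ids, values)
# updates -> [(101, "10"), (102, "15"), (104, "18"), (105, "21")]
# inserts -> [(0, 2, "35")]
```

The row and column indexes kept for inserts are what let you work out which item and year the new cell belongs to.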