I am working on upgrading Sybase from 12.5 to 15.7 and have encountered a peculiar problem with the query below:
select rs.EmpId, rs.Date, rs.Currency, rs.Salary,
from #Results rs, #EmpSort es
where rs.EmpId = es.EmpId
order by es.EmpCode, rs.Currency
When executed on Sybase 15.7, the result is grouped by currency and sorted by salary descending, whereas on Sybase 12.5 the result is grouped by currency and sorted by date ascending. I am not sure why this is happening.
I detected the following issues
1.- Remove the last comma from the first line
select rs.EmpId, rs.Date, rs.Currency, rs.Salary (,)
2.- You are ordering by es.EmpCode, but the fields you are listing are
rs.EmpId, rs.Date, rs.Currency, rs.Salary,
Add es.EmpCode to the fields you are listing.
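A further possibility, not confirmed anywhere in this thread: ORDER BY es.EmpCode, rs.Currency says nothing about how rows with the same EmpCode and Currency are ordered, so each server version is free to return them in whatever order its query plan produces. If date order is what you need, it has to be requested explicitly. Below is a minimal sketch using Python's built-in sqlite3 instead of Sybase; the table names (Results, EmpSort, standing in for the temp tables) and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for Sybase; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Results (EmpId INTEGER, Date TEXT, Currency TEXT, Salary REAL);
    CREATE TABLE EmpSort (EmpId INTEGER, EmpCode TEXT);
    INSERT INTO EmpSort VALUES (1, 'A01');
    INSERT INTO Results VALUES
        (1, '2020-03-01', 'USD', 100.0),
        (1, '2020-01-01', 'USD', 300.0),
        (1, '2020-02-01', 'USD', 200.0),
        (1, '2020-01-15', 'EUR', 250.0);
""")

# rs.Date is added as an explicit tiebreaker, so the order within each
# currency no longer depends on the optimizer's choice of plan.
rows = conn.execute("""
    SELECT rs.EmpId, rs.Date, rs.Currency, rs.Salary
    FROM Results rs
    JOIN EmpSort es ON rs.EmpId = es.EmpId
    ORDER BY es.EmpCode, rs.Currency, rs.Date
""").fetchall()

for row in rows:
    print(row)
```

The same idea should carry over to the original query by appending rs.Date to its ORDER BY list.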
Context: I'm fairly new to coding as a whole and am learning SQL. This is one of my practice/training sessions.
I'm trying to create a Dimension Table called "Employee Info" using the AdventureWorks2019 public database. Below is my attempt at a query to fetch all the data needed for this table.
SELECT
e.BusinessEntityID AS EmployeeID,
EEKey = ROW_NUMBER() OVER(ORDER BY(SELECT NULL)),
p.FirstName,
p.MiddleName,
p.LastName,
p.PersonType,
e.Gender,
e.JobTitle,
ep.Rate,
ep.PayFrequency,
e.BirthDate,
e.HireDate,
ep.RateChangeDate AS PayFrom,
e.MaritalStatus
From HumanResources.Employee AS e FULL JOIN
Person.Person AS p ON p.BusinessEntityID = e.BusinessEntityID FULL JOIN
Person.BusinessEntityAddress AS bea ON bea.BusinessEntityID = e.BusinessEntityID FULL JOIN
HumanResources.EmployeePayHistory AS ep ON ep.BusinessEntityID = e.BusinessEntityID
Where
PersonType='SP'
OR PersonType='EM'
ORDER BY EmployeeID;
Query result
Each employee (EE for short) will have a unique [EmployeeID]. The [EEKey] is simply used to mark ordinal numbers of each record.
EEs are paid different rates shown in the [Rate] column. There will be duplicate records if any EE receives a change in his/her pay rate.
There is currently a [PayFrom] column indicating the first date a pay rate is being applied to each record.
Current requirements: Create a [PayTo] column to the right of [PayFrom] that returns the last date on which each EE is paid the corresponding pay rate. There are 2 scenarios:
If the EE being checked has multiple records, meaning his/her pay rate was adjusted at some point, [PayTo] will return the [PayFrom] date of the next record minus 1 day.
If the EE being checked does not have any additional record indicating a pay rate change, [PayTo] will return a fixed, specified date (say 31/12/2070).
Example:
[EmployeeID] no. 4, Rob Walters, has 3 consecutive records in Lines 4, 5, and 6. In Line 4, the [PayTo] column is expected to return the [PayFrom] date of Line 5 minus 1 day (2010-05-30). The same rule should be applied to Line 5, returning (2011-12-14).
As for Line 6, since there is no further record for this EE to fetch data from, it will return the specified date (2070-12-31), following the same rule as every single-record EE.
As I have mentioned, I am a fresher and completely new to coding, so my interpretation and method might be off. If you can kindly point out what I'm doing wrong or show me what I should do to solve this issue, it will be much appreciated.
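One common way to express the [PayTo] rule is the LEAD window function, which reads a value from the next row within a partition. The sketch below runs on Python's built-in sqlite3 rather than SQL Server, so the date arithmetic uses SQLite's date(); the PayHistory table and its rates are hypothetical, and the first PayFrom date for employee 4 is an assumption (only the second and third dates follow from the example above). In T-SQL, the equivalent expression would presumably look like ISNULL(DATEADD(day, -1, LEAD(ep.RateChangeDate) OVER (PARTITION BY e.BusinessEntityID ORDER BY ep.RateChangeDate)), '2070-12-31'), but treat that as an untested assumption.

```python
import sqlite3

# Window functions need SQLite 3.25 or newer (bundled with recent Pythons).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PayHistory (EmployeeID INTEGER, Rate REAL, PayFrom TEXT);
    INSERT INTO PayHistory VALUES
        (4, 10.0, '2007-12-05'),   -- hypothetical rate and first date
        (4, 20.0, '2010-05-31'),
        (4, 30.0, '2011-12-15'),
        (5, 15.0, '2008-01-06');   -- hypothetical single-record employee
""")

# LEAD(PayFrom) fetches the next PayFrom for the same employee;
# it is NULL on the employee's last row, so COALESCE supplies the fixed date.
rows = conn.execute("""
    SELECT EmployeeID, Rate, PayFrom,
           COALESCE(
               date(LEAD(PayFrom) OVER (PARTITION BY EmployeeID
                                        ORDER BY PayFrom), '-1 day'),
               '2070-12-31'
           ) AS PayTo
    FROM PayHistory
    ORDER BY EmployeeID, PayFrom
""").fetchall()

for row in rows:
    print(row)
```

Partitioning by employee is what keeps one employee's last record from borrowing the next employee's first date.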
I have the following sample from a table of student results, with dates, for a school entry exam
First student passed the exam - This is the most common record found for most students
Second student failed the entry exam the 1st time and passed the second time, based on the date
3rd student had an incorrect entry that was corrected, based on the Version
I need the results to look like the picture above, so we take the record with the latest date and the highest version!
My basic query thus far is
select studentid
,examdate --(Date)
,result -- (charvar)
from StudentEntryExam
How should I approach this issue?
SELECT DISTINCT ON (studentid)
*
FROM mytable
ORDER BY studentid, examdate DESC, version DESC
DISTINCT ON returns the first record of each ordered group. In this case the groups are the studentids. You must choose an ordering that puts the required record first in its group. So you need to order by studentid, of course. Then you need the most recent examdate first, which can be achieved with DESC order. If there are two records on the same date, you also need the highest version first, again using the DESC modifier.
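DISTINCT ON is specific to PostgreSQL. On other databases, the same "first row per ordered group" idea can be sketched with ROW_NUMBER(). The example below uses Python's built-in sqlite3 so it can be run anywhere; the rows in mytable are invented to match the three scenarios described in the question.

```python
import sqlite3

# Window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (studentid INTEGER, examdate TEXT,
                          version INTEGER, result TEXT);
    INSERT INTO mytable VALUES
        (1, '2021-01-10', 1, 'Pass'),   -- passed first time
        (2, '2021-01-10', 1, 'Fail'),   -- failed, then passed later
        (2, '2021-02-15', 1, 'Pass'),
        (3, '2021-01-10', 1, 'Fail'),   -- wrong entry, corrected in version 2
        (3, '2021-01-10', 2, 'Pass');
""")

# Number the rows of each student, newest date and highest version first,
# then keep only row number 1 of every group.
rows = conn.execute("""
    SELECT studentid, examdate, version, result
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY studentid
                                  ORDER BY examdate DESC, version DESC) AS rn
        FROM mytable
    ) AS ranked
    WHERE rn = 1
    ORDER BY studentid
""").fetchall()

for row in rows:
    print(row)
```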
When I aggregate values in Google Data Studio with a date dimension on a PostgreSQL Connector, I see buggy behaviour. The symptom is that performing COUNT(DISTINCT) returns the same value as COUNT():
My theory is that it has something to do with the aggregation on the data occurring after the count has already happened. If I attempt the exact same aggregation on the same data in an exported CSV instead of directly from a PostgreSQL Connector Data Source, the issue does not reproduce:
My PostgreSQL Connector is connecting to Amazon Redshift (jdbc:postgresql://*******.eu-west-1.redshift.amazonaws.com) with the following custom query:
SELECT
userid,
submissionid,
date
FROM mytable
Workaround
If I stop using the default date field for the Date Dimension and aggregate my own dates directly within the SQL query (date_byweek), the COUNT(DISTINCT) aggregation works as expected:
SELECT
userid,
submissionid,
to_char(date,'YYYY-IW') as date_byweek
FROM mytable
While this workaround solves my immediate problem, it sucks because I miss out on all the date functionality provided by Data Studio (Hierarchy Drill Down, Date Range filtering, etc.). Not to mention that it reduces my confidence in the product, since I don't know what else may be "buggy" 😞
How to Reproduce
If you'd like to re-create the issue, using the following data as a PostgreSQL Data Source should suffice:
> SELECT * FROM mytable
userid submissionid
-------- -------------
1 1
2 2
1 3
1 4
3 5
> COUNT(DISTINCT userid) -- ERROR: Returns 5 when data source is PostgreSQL
> COUNT(DISTINCT userid) -- EXPECTED: Returns 3 when data source is CSV (exported from same PostgreSQL query above)
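To illustrate the theory above (it is only a theory; nothing here is confirmed behaviour of Data Studio): if the distinct count were computed per fine-grained timestamp and the per-timestamp counts then summed, the result would equal the row count. A minimal sketch with Python's built-in sqlite3, using the five rows above plus made-up timestamps in the date column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (userid INTEGER, submissionid INTEGER, date TEXT);
    INSERT INTO mytable VALUES          -- timestamps are hypothetical
        (1, 1, '2020-10-01 09:00:00'),
        (2, 2, '2020-10-01 10:00:00'),
        (1, 3, '2020-10-02 11:00:00'),
        (1, 4, '2020-10-03 12:00:00'),
        (3, 5, '2020-10-03 13:00:00');
""")

# Distinct users over the whole table: the expected answer.
expected = conn.execute(
    "SELECT COUNT(DISTINCT userid) FROM mytable"
).fetchone()[0]

# Distinct users per exact timestamp, summed afterwards: every timestamp
# is unique here, so each group contributes 1 and the sum is the row count.
summed = conn.execute("""
    SELECT SUM(c)
    FROM (SELECT COUNT(DISTINCT userid) AS c FROM mytable GROUP BY date)
""").fetchone()[0]

print(expected, summed)
```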
I'm happy to report that as of Sep 17 2020, there's a workaround.
Data Studio added the DATETIME_TRUNC function (see here https://support.google.com/datastudio/answer/9729685?), which lets you add a custom field that truncates the original date to whatever granularity you want, without causing the distinct bug.
Attempting to set the display granularity in the report still causes the bug (i.e., you'll still see Oct 1 2020 12:00:00 instead of Oct 2020).
This can be solved by creating a SECOND custom field that just returns the first one; you can then add it to the report, change the display granularity, and everything will work OK.
I have the same issue with the MySQL Connector. My problem was solved when I changed the date field format in the DB from DATETIME (YYYY-MM-DD HH:MM:SS) to INT (Unix timestamp). After connecting this table to Google Data Studio, I set the type of this field to Date (YYYYMMDD) and everything works as expected. Hope this helps you :)
In this Google forum there is a curious solution by Damien Choizit that involves combining your data source with itself. It works well for me.
https://support.google.com/datastudio/thread/13600719?hl=en&msgid=39060607
It says:
I figured out a solution in my case: I used a Blend Data joining twice the same data source with corresponding join key(s), then I specified a data range dimension only on the left side and selected the columns I wanted to CTD aggregate as "dimensions" (and not metric!) on the right side.
We are using Microsoft Reporting to generate a daily report. I want to add another column to one of the tables we have. Initially, I had set this up correctly and the report worked fine. However, due to technicalities I have to use a different table (with exactly the same data), so I edited the query, and once I do that I start getting "#Error" in the cell values of my column.
The cell expression:
=Lookup(Fields!fldFlight.Value, Fields!OutboundFlightNumber.Value, Fields!OnTime.Value, "DataSet")
I use the following query to form DataSet:
SELECT
turnarounds_staging.OutboundFlightNumber
,turnarounds_staging.VisitDatabaseID AS [turnarounds_staging VisitDatabaseID]
,turnarounds_staging.STDDate
,events_staging.VisitDatabaseID AS [events_staging VisitDatabaseID]
,events_staging.OnTime
,events_staging.Event
FROM
turnarounds_staging
LEFT OUTER JOIN events_staging
ON turnarounds_staging.VisitDatabaseID = events_staging.VisitDatabaseID
WHERE
events_staging.Event = 'PDC' AND
turnarounds_staging.STDDate= #Date
Where #Date is a parameter indicating yesterday.
If I change the query back to the original (identical) table, it works fine.
Any ideas why this happens when turnarounds_staging is identical to the original table?
I have a report that I am in the process of converting from Crystal Reports to JasperReports. I am designing the report using iReport 4.5 and JasperReports Server 4.5. An Oracle stored procedure that returns REF CURSORs is used to populate the data. Below is a sample SQL:
Select First_Name, Last_Name, DOB, City From PPL Order By DOB;
I use this SQL in the report designer to design a report, and I create groups as follows:
Parent Group is First_Name
Second Group is City
Basically, I want to group all the people with the same first name, and within that by each city they appear in.
Expected results:
First_Name Last_Name DOB City
Alan Kum 10/01/2010 Mumb
Alan Boss 01/10/2001 Mumb
Alan Cross 10/10/2000 Irvn
But since the SQL has an ORDER BY clause, my data is not displayed in the expected manner shown above. How do I overcome this issue?
The issue is that I cannot change the procedure, as it's being used in the application, and an Excel version of the report also uses the same query, where they want to see the data ordered by DOB.
Well... the "correct" solution is to change the query to order by the fields you want to order by: First_Name, City (and then perhaps by DOB or Last_Name to have a fully deterministic ordering).
But since you don't have that option available to you, you can instead do the sorting in JasperReports. Edit your query and then click the button "Sort options...". This should allow you to re-sort the data as you like. It will be slower sorting in the report engine, but slower and correct is far better than a quick result which doesn't meet your needs.
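To show what re-sorting after the query amounts to, here is a rough sketch in Python (not JasperReports code). The three Alan rows come from the question; the "Bob" row is hypothetical and only there to show a group being broken up, and the incoming order is assumed to be ascending by year of DOB.

```python
# Rows as they might arrive from the procedure, ordered by DOB.
rows = [
    {"First_Name": "Alan", "Last_Name": "Cross", "DOB": "10/10/2000", "City": "Irvn"},
    {"First_Name": "Alan", "Last_Name": "Boss", "DOB": "01/10/2001", "City": "Mumb"},
    {"First_Name": "Bob", "Last_Name": "Hale", "DOB": "05/05/2005", "City": "Irvn"},  # hypothetical
    {"First_Name": "Alan", "Last_Name": "Kum", "DOB": "10/01/2010", "City": "Mumb"},
]

# Re-sort by the grouping fields. Python's sort is stable, so rows that share
# First_Name and City keep the DOB order they arrived in.
sorted_rows = sorted(rows, key=lambda r: (r["First_Name"], r["City"]))

for r in sorted_rows:
    print(r["First_Name"], r["Last_Name"], r["DOB"], r["City"])
```

After the sort, all the Alan rows are contiguous and grouped by city, which is what the report groups need; the original query and the Excel report stay untouched.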