Issue with displaying of information in a Tables/Matrix visual - Power BI - visualization

Hi, I'm new to Power BI Desktop and have come across an issue when displaying information. Hopefully it's just down to my lack of knowledge, but I can't find a way to display values in rows one after the other, similar to pivot table functionality.
For example, if I had the following table:
Location | Salary | Number
A | 100 | 1
A | 200 | 2
B | 100 | 3
B | 400 | 4
C | 400 | 5
D | 800 | 6
What I'd like to produce is something like this:
A | B | C | D
300 | 500 | 400 | 800 <-- Salary Sum
3 | 7 | 5 | 6 <-- Number Sum
I have a direct link with my data source. Please suggest a way to display this with a table/matrix visual.
Thank you in advance

Unfortunately this is currently not supported in Power BI, but maybe there is some light at the end of the tunnel... The Power BI team have started working on this much requested feature. See here

As Tom said, this is available with the August release. You can check which version you have by going to File -> Help -> About. If you have an older version, you can go here to download the right one for you (32-bit vs 64-bit).
Once you have made sure you are running the August version, simply create a matrix with Location in the Columns field and Salary and Number in the Values field. Then go into the formatting pane and under Values, turn Show on rows to on.
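For reference, the layout that "Show on rows" is meant to produce corresponds to the aggregation below. This is only a minimal Python sketch of the arithmetic, using the sample rows from the question; the function name pivot_measures_on_rows is made up for illustration and has nothing to do with Power BI itself.

```python
from collections import defaultdict

# Sample rows from the question: (Location, Salary, Number)
rows = [
    ("A", 100, 1), ("A", 200, 2),
    ("B", 100, 3), ("B", 400, 4),
    ("C", 400, 5), ("D", 800, 6),
]

def pivot_measures_on_rows(rows):
    """Sum each measure per location; locations become columns, measures become rows."""
    salary = defaultdict(int)
    number = defaultdict(int)
    for location, sal, num in rows:
        salary[location] += sal
        number[location] += num
    locations = sorted(salary)
    return locations, {
        "Salary Sum": [salary[loc] for loc in locations],
        "Number Sum": [number[loc] for loc in locations],
    }

locations, matrix = pivot_measures_on_rows(rows)
print(" | ".join(locations))
for label, values in matrix.items():
    print(" | ".join(str(v) for v in values), "<--", label)
```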

Try this: go to the Query Editor, select the first column of the desired table (Location in your case), and from the Transform tab select Unpivot Other Columns.
That's it! Now go and drop your visual.

Related

Is it possible to make multiple fields default to the same date, but also be individually editable?

I am VERY new to Access - I was sort of thrust into designing a database for a research project I'm involved in. So, please bear with me because I know next to nothing :) The problem I am having is thus:
My database is for a medical research project, and is very time and date dependent, by which I mean I need to capture the date and time for each piece of data so that we end up with a sort of timeline of events for each subject.
As is, I have something like the following for each piece of data (each in its own field):
ArrivalDate
ArrivalTime
HeartRateDate
HeartRateTime
HeartRateData
TemperatureDate
TemperatureTime
TemperatureData
BloodPressureDate
BloodPressureTime
BloodPressureData
There are around 200 similar pieces of data that I need to collect for each patient. To avoid having to re-enter the same data over and over, and also to reduce the potential for error, I would like to have all of the date fields in a given patient record default to the first one that is entered, in this case "Arrival Date". However, I also need each date field to be editable without affecting the others. The reason for this is that in the event that a patient's visit occurs over the span of a few days we can accurately record that.
I have tried messing around with the default value setting, as well as setting the control source to reference the "Arrival Date" field, but then of course any changes to one field affect them all. I am not even sure that what I am trying to do is possible but I will appreciate any help and/or suggestions!
Thank you in advance
Having all this data in separate columns of a big table isn't going to work. You don't measure things like temperature or blood pressure only once per patient, do you?
This is a classic one-to-many relation.
You should have a separate Measurements table, looking e.g. like this:
+--------+-----------+---------------+------------------+-----------+
| MeasID | PatientID | MeasType | MeasDateTime | MeasValue |
+--------+-----------+---------------+------------------+-----------+
| 1 | 1 | Temperature | 2017-05-17 14:30 | 38.2 |
| 2 | 1 | BloodPressure | 2017-05-17 14:30 | 130/90 |
| 3 | 1 | Temperature | 2017-05-17 18:00 | 38.5 |
| 4 | 2 | Temperature | etc. | |
+--------+-----------+---------------+------------------+-----------+
As Barmar wrote, there is no reason to have separate columns for date and time.
In the form where measurements are entered, you can use the BeforeInsert event to set MeasDateTime to the current time, with the Now() function.
So the user never has to enter it manually, but they can edit it if the measurement was at a different time than entering the data.
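The table layout and the "default to now, but editable" behaviour can be tried out in a rough sketch like the following. It uses Python with an in-memory SQLite database as a stand-in for Access, so it is an assumption-laden illustration rather than Access code: the column DEFAULT plays the role that the BeforeInsert event with Now() plays in the form, and the one-column Patients table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Patients (
    PatientID INTEGER PRIMARY KEY
);
CREATE TABLE Measurements (
    MeasID       INTEGER PRIMARY KEY,
    PatientID    INTEGER NOT NULL REFERENCES Patients(PatientID),
    MeasType     TEXT NOT NULL,
    -- Defaults to the time of entry, but can be overridden per row.
    MeasDateTime TEXT NOT NULL DEFAULT (datetime('now', 'localtime')),
    MeasValue    TEXT NOT NULL
);
""")

conn.execute("INSERT INTO Patients (PatientID) VALUES (1)")

# Timestamp omitted: the default fills in the current date and time.
conn.execute(
    "INSERT INTO Measurements (PatientID, MeasType, MeasValue) "
    "VALUES (1, 'Temperature', '38.2')"
)
# Timestamp supplied: the measurement was taken at a different time.
conn.execute(
    "INSERT INTO Measurements (PatientID, MeasType, MeasDateTime, MeasValue) "
    "VALUES (1, 'BloodPressure', '2017-05-17 14:30', '130/90')"
)

# One patient's timeline of events, oldest first.
timeline = conn.execute(
    "SELECT MeasType, MeasDateTime, MeasValue FROM Measurements "
    "WHERE PatientID = 1 ORDER BY MeasDateTime"
).fetchall()
for row in timeline:
    print(row)
```

Each new kind of measurement is then just another MeasType value, not another three columns.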

Time-series Stock Data in Matlab

I'm a MATLAB beginner, and have no idea what I'm doing.
I have stock data in CSV format which is something like this:
+--------+--------+------+------+-----+-------+
| Ticker | Date | Open | High | Low | Close |
+--------+--------+------+------+-----+-------+
| AAPL | 25-Oct | 10 | 12 | 9 | 12 |
| XYZ | 25-Oct | 10 | 12 | 9 | 12 |
| AAPL | 26-Oct | 12 | 15 | 10 | 15 |
+--------+--------+------+------+-----+-------+
There are many stock tickers each day. The file is many rows long, listing daily stock prices for each ticker on a particular stock exchange.
I'm aiming to do some fun time-series analysis on the 'close' price for each ticker.
To start with making simple charts of a single ticker over time, or multiple tickers over time would be awesome.
Questions:
1. Best way to import data.
I have a big long CSV. But am lost as to which import method is best. Column Vectors, Numeric Matrix, Cell Array or Table?
2. I need to create a time-series object for each ticker, right?
How would one go about that? I've been looking at this guide, but I'm unsure how to make an object for each ticker, over the span of time defined in the file.
http://www.mathworks.com/help/matlab/ref/timeseries-class.html
Any advice, pointers and resources that are good for beginners are appreciated massively!
Thanks!
There are a ton of ways to import data into MATLAB. Before you import data, I would make sure numeric columns hold ONLY numeric data or MATLAB can complain. Some options in my personal order of preference:
1. d = readtable('mycsvfile.csv'); % puts data in nice table datatype. I find it makes code more readable.
2. d = csvread('myfile.csv',1,0); % the 1 skips the first row which is probably header names for the csv file. Puts all the data in a matrix and you have to keep track of what column is what.
3. xlsread is good for reading Excel files.
4. Copy and paste the data into a variable in your workspace. Do save blahblah.mat so you can easily load the data later.
I personally wouldn't bother with financial time series objects. It's just going to complicate your life if you're new to MATLAB. If you loaded the data using readtable (i.e. option 1) you can then execute something like:
aapl_indicator = strcmp(d.Ticker, 'AAPL');
to get a vector indicating whether a row in your table is AAPL or not. Then:
close_price_aapl = d.Close(aapl_indicator);
will give you a vector of Apple's closing prices.
When you get down to doing math, you want to be using the matrices.
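For comparison, the same "filter the rows by ticker, then take the Close column" idea is sketched below in plain Python. It is not MATLAB and not a replacement for the commands above. The inline CSV text is a stand-in modelled on the question's table, and close_prices_by_ticker is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Stand-in for the CSV file; in practice open the real file instead.
csv_text = """Ticker,Date,Open,High,Low,Close
AAPL,25-Oct,10,12,9,12
XYZ,25-Oct,10,12,9,12
AAPL,26-Oct,12,15,10,15
"""

def close_prices_by_ticker(csv_file):
    """Group (date, close) pairs by ticker, keeping file order within each ticker."""
    series = defaultdict(list)
    for row in csv.DictReader(csv_file):
        series[row["Ticker"]].append((row["Date"], float(row["Close"])))
    return dict(series)

series = close_prices_by_ticker(io.StringIO(csv_text))
aapl_close = [close for _, close in series["AAPL"]]
print(aapl_close)
```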

PostgreSQL Fuzzy Searching multiple words with Levenshtein

I am working out a PostgreSQL query to allow for fuzzy searching capabilities when searching for a company's name in an app that I am working on. I have found and have been working with Postgres' levenshtein function (part of the fuzzystrmatch module) and for the most part it is working. However, it only seems to work when the company's name is one word, for example:
With Apple (which is stored in the database as simply apple) I can run the following query and have it work near perfectly (it returns a levenshtein distance of 0):
SELECT * FROM contents
WHERE levenshtein(company_name, 'apple') < 4;
However when I take the same approach with Sony (which is stored in the database as Sony Electronics INC) I am unable to get any useful results (entering Sony gives a levenshtein distance of 16).
I have tried to remedy this problem by breaking the company's name down into individual words and inputting each one individually, resulting in something like this:
user input => 'sony'
SELECT * FROM contents
WHERE levenshtein('Sony', 'sony') < 4
OR levenshtein('Electronics', 'sony') < 4
OR levenshtein('INC', 'sony') < 4;
So my question is this: is there some way that I can accurately implement a multi-word fuzzy search with the current general approach that I have now, or am I looking in the complete wrong place?
Thanks!
Given your data and the following query with wild values for the Levenshtein Insertion (10000), Deletion (100) and Substitution (1) cost:
with sample_data as (
    select 101 "id", 'Sony Entertainment Inc' as "name"
    union
    select 102 "id", 'Apple Corp' as "name"
)
select sample_data.id, sample_data.name, components.part,
       levenshtein(components.part, 'sony', 10000, 100, 1) ld_sony
from sample_data
inner join (
    select sd.id,
           lower(unnest(regexp_split_to_array(sd.name, E'\\s+'))) part
    from sample_data sd
) components on components.id = sample_data.id;
The output is:
id | name | part | ld_sony
-----+------------------------+---------------+---------
101 | Sony Entertainment Inc | sony | 0
101 | Sony Entertainment Inc | entertainment | 903
101 | Sony Entertainment Inc | inc | 10002
102 | Apple Corp | apple | 104
102 | Apple Corp | corp | 3
(5 rows)
Row 1 - no changes
Row 2 - 9 deletions and 3 changes
Row 3 - 1 insertion and 2 changes
Row 4 - 1 deletion and 4 changes
Row 5 - 3 changes
I've found that splitting the words out causes a lot of false positives when you apply a threshold. You can order by the Levenshtein distance to position the better matches close to the top. Maybe tweaking the Levenshtein cost parameters will help you to order the matches better. Sadly, Levenshtein doesn't weight earlier changes differently than later changes.
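To make the per-word scoring and ordering concrete, here is a minimal Python sketch. The levenshtein function is a hand-written dynamic-programming version that takes the same argument order as the fuzzystrmatch call used in the query (source, target, insertion cost, deletion cost, substitution cost); it is an illustration, not the PostgreSQL implementation, and best_word_distance is a hypothetical helper.

```python
def levenshtein(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Weighted edit distance for turning source into target."""
    # prev[j] = cost of turning the empty prefix of source into target[:j]
    prev = [j * ins_cost for j in range(len(target) + 1)]
    for i, s_ch in enumerate(source, start=1):
        curr = [i * del_cost]  # turn source[:i] into the empty string
        for j, t_ch in enumerate(target, start=1):
            curr.append(min(
                prev[j] + del_cost,        # delete s_ch
                curr[j - 1] + ins_cost,    # insert t_ch
                prev[j - 1] + (0 if s_ch == t_ch else sub_cost),  # keep or substitute
            ))
        prev = curr
    return prev[-1]

def best_word_distance(name, query):
    """Smallest distance between the query and any single word of the name."""
    query = query.lower()
    return min(levenshtein(word, query) for word in name.lower().split())

names = ["Apple Corp", "Sony Electronics INC"]
# Order by distance so the better matches come first.
ranked = sorted(names, key=lambda name: best_word_distance(name, "sony"))
print(ranked)
```

With unit costs, "corp" is only 3 edits from "sony", so a threshold of 4 would still let Apple Corp through. That is the kind of false positive described above, and it is why ordering by distance tends to be more useful than a fixed cut-off.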

Calculate median and average in a partition in Tableau using table calculation

I have a detail table of posts and subjects scraped from a forum. Each row is a single subject of a post (i.e. post.ID and subject.ID together form the primary key of the table), and I have some measures at subject level and some at post level. For example:
+---------+-------------+--------------+------------+--------------+--------+
| post.ID | post.Author | post.Replies | subject.ID | subject.Rank | year |
+---------+-------------+--------------+------------+--------------+--------+
| 1 | mike | 10 | movie | 4 | 1990 |
| 1 | mike | 10 | comics | 6 | 1990 |
| 2 | sarah | 0 | tv | 10 | 2001 |
| 3 | tom | 4 | tv | 10 | 2003 |
| 3 | tom | 4 | comics | 6 | 2003 |
| 4 | mike | 1 | movie | 4 | 2008 |
+---------+-------------+--------------+------------+--------------+--------+
I want to study the trend of posts and subjects by year and color it by subject.Rank.
The first two are easily measured by putting COUNTD(post.ID) and COUNTD(subject.ID) on Rows and 'year' on Columns.
But if I drag MEDIAN(subject.Rank) onto Color, I get the wrong result: it is calculated at row level, not at distinct subject.ID level.
I think I can accomplish it using table calculation features, but I have no idea on how to proceed.
It sounds like you are trying to treat Subject.Rank as a dimension, instead of as a measure. If so, just convert it to a dimension on the worksheet in question by right clicking on the field and choosing dimension. You can also convert it to a dimension in the data pane by dragging the field from the measures section up to the dimensions section. That will tell Tableau to treat that field as a dimension by default in the future.
A field can be treated as a dimension in some cases, and as a measure in others. It depends on what you are trying to achieve. If you are familiar with SQL, dimensions are used to partition data rows for aggregation using the GROUP BY clause.
Finally, count distinct (COUNTD) can be expensive on large datasets. Often, you can get the same result another way. So try to think of other approaches and save COUNTD for when you really need it.
Try using {fixed [1st LEVEL],[2nd level]: median()}
or
Table calculation approach
When you put in the median, there is an Edit Table Calculation option. Under Compute Using, choose Advanced and put your fields in there (make sure they are ordered the way you want the calculation to run when you select them), then click OK and select the "At the level" and "Restarting every" settings.
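The difference between a row-level median and a median over distinct subjects can be checked by hand with a small Python sketch. It uses the rows from the question and partitions by post.Author, not by year, because in the sample no subject repeats within a single year. The two function names are made up for illustration. The second function is roughly what fixing the level of detail at (partition, subject.ID) is meant to achieve.

```python
from statistics import median

# Rows from the question: (post_id, author, subject_id, subject_rank, year)
rows = [
    (1, "mike", "movie", 4, 1990),
    (1, "mike", "comics", 6, 1990),
    (2, "sarah", "tv", 10, 2001),
    (3, "tom", "tv", 10, 2003),
    (3, "tom", "comics", 6, 2003),
    (4, "mike", "movie", 4, 2008),
]

def row_level_median(rows, author):
    """Median over every row of the partition (repeated subjects count repeatedly)."""
    return median(rank for _, a, _, rank, _ in rows if a == author)

def subject_level_median(rows, author):
    """Median over distinct subjects of the partition (each subject counts once)."""
    ranks = {subject: rank for _, a, subject, rank, _ in rows if a == author}
    return median(ranks.values())

print(row_level_median(rows, "mike"))      # movie is counted twice
print(subject_level_median(rows, "mike"))  # movie is counted once
```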

Cross tab summary fields don't restrict by column

I'm working with Crystal Reports 10 and was looking at the cross-tab to try to get a nice, neat table of my information. I'm trying to make a report where each item is a row, the columns are the different sizes it comes in, and the value of each cell is the quantity.
So something that looks like this:
       | Small | Medium | Large
Item 1 | 1     | 5      | 10
Item 2 | 5     | 10     | 15
Using the cross tab though, the quantity field I have has to be totalled, averaged, etc. so I can't get the specific breakdown for each size in a nice table like that. Is there any way to tweak the Cross Tab to do this or is there another tool in Crystal Reports that would let me have the quantities per size in that organized fashion?
Thanks for any help you guys can give.
Update:
The cross tab I have tried gives me something that looks like this
       | Small | Medium | Large
Item 1 | 16    | 16     | 16
Item 2 | 30    | 30     | 30
If I put the values in the details section as separate fields, I'm able to get the values to match up properly, but it's not the right format. It comes out like this:
Item 1 | Small | 1
Item 1 | Medium| 5
Item 1 | Large | 10
Create a Cross-tab
Add {table.size} to the Columns
Add {table.item} to the Rows
Add {table.quantity} to Summarized Fields
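What this cross-tab should compute can be sketched in Python as a sum of quantity grouped by both item and size. This is only an illustration of the arithmetic, not Crystal Reports code. The detail rows are taken from the desired table in the question, and cells and row_totals are names made up for the example.

```python
from collections import defaultdict

# Detail rows implied by the desired table: (item, size, quantity)
details = [
    ("Item 1", "Small", 1), ("Item 1", "Medium", 5), ("Item 1", "Large", 10),
    ("Item 2", "Small", 5), ("Item 2", "Medium", 10), ("Item 2", "Large", 15),
]

cells = defaultdict(int)       # summed per (row, column) pair
row_totals = defaultdict(int)  # summed per row only
for item, size, quantity in details:
    cells[(item, size)] += quantity
    row_totals[item] += quantity

sizes = ["Small", "Medium", "Large"]
for item in sorted(row_totals):
    print(item, "|", " | ".join(str(cells[(item, size)]) for size in sizes))
```

With one detail row per item and size, the sum in each cell is just that row's quantity, so summarizing does not lose the breakdown. The 16 and 30 in the output shown in the question equal the per-item row totals (1 + 5 + 10 and 5 + 10 + 15), which suggests the summary there was grouped by item only and not restricted by the size column.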