Tableau Map Report - tableau-api

I am working on creating a map sales report to show the sales by product for various territories. The territories are based upon zip codes and are custom territories that overflow into multiple states or are partially in a state. I have gotten everything set up and it looks good for the most part...EXCEPT 2 areas.
1.) One of the sales numbers shows up in Alaska, which is not visible when a user is zoomed in on the contiguous USA (we are US-based, so that is the only view that matters anyway). Is there a way to force a sales number to show up at a user-defined location? For instance, can I show it on the State of Washington instead of Alaska, or can it only default to the largest (by area) part of a user-created territory map?
2.) Since we are US-based, is there a way to move Alaska and Hawaii closer to the contiguous US? I know that using a dashboard is a workaround, but it does not look good.

I'm not sure this could be a complete answer, but I think this question has more than one take.
That being said, if your worksheet relies on zip codes to create the map, I don't think you can force Tableau to draw data away from the position implied by its geographic role.
The only thing that comes to mind is switching your approach from a geographic role (country, state, city, zip, etc.) to generic lat/long coordinates.
That way, you can manually map your Alaska zip codes to lat/long coordinates in more "continental" areas.
However, this would require a lot of data manipulation before the data reaches Tableau.
An alternative way of accomplishing something similar to your second point is to use 3 separate worksheets in a single dashboard: continental, Alaska, Hawaii.
I did something on US data and I was facing the same problem for Hawaii, so I decided to use a floating worksheet putting it on the bottom left corner of the continental map.
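The lat/long idea mentioned above could be prepared outside Tableau along these lines. This is only a minimal sketch: the territory names, the coordinates, and the `label_position` helper are all hypothetical, and it assumes the resulting columns would be given Latitude/Longitude roles in Tableau.

```python
# Hypothetical default label points per territory as (lat, lon).
# All names and coordinates here are illustrative, not real data.
territory_points = {
    "Northwest": (61.2, -149.9),  # label currently lands in Alaska
    "Southwest": (34.0, -112.0),
}

# Hypothetical manual overrides: where a label should be drawn instead.
label_overrides = {
    "Northwest": (47.4, -120.5),  # a point inside Washington State
}


def label_position(territory):
    """Return the override point if one exists, else the default point."""
    return label_overrides.get(territory, territory_points[territory])


# Rows that could be exported (e.g. to CSV) and joined to the sales data.
rows = [
    {"territory": name, "lat": label_position(name)[0], "lon": label_position(name)[1]}
    for name in sorted(territory_points)
]
```

Only the territories listed in `label_overrides` move; everything else keeps its original point, which keeps the manual work limited to the problem cases.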

Related

How do I standardize geographical data where the format (city/state/country) does not exist?

I am trying to show the current distribution of individuals across a world map, but I am running into trouble trying to standardize location data.
This is a simple feat with the locations of the American and Canadian individuals as they all follow a similar structure (City -> State -> Country). I would ideally like to show these on the city level so that each state with multiple individuals doesn't only contain one dot.
However, there are cities such as Kampala, Bicester, or Bucharest that do not have a state or province and the next largest region is the country itself.
If I use the city as the level of granularity, I see what I'm looking for in the US/Canada area but miss all of the geographical areas without a state. If I show only the State/Country, not only do I miss those without a state, but I also am only seeing one dot for each state whereas I want to see a dot for each city in the state that an individual resides in.
I tried to edit the unknown locations but was still unable to resolve the state/province conundrum. I am unsure how to get around this and can't find any resources discussing this issue. Can anyone point me in the right direction on how to standardize inconsistently formatted location data?

How to find relation between two columns of csv (containing labels and related data) file using doc2vec?

I am working on a problem related to doc2vec where I need to find labels that are related to a particular word. For example (csv file):
Data | Label / Tags
In a future world devastated by disease, a convict is sent back in time to gather information about the man-made virus that wiped out most of the human population on the planet. | sci-fi
You have slipped under my skin, invaded my blood and seized my heart. That sounds more like a poison than a person,” was all I could say. His confession had both shocked and thrilled me. | action
Plenty of data like this is available on which the model can be trained. Now, I want results such that, when I enter a particular word like virus, it gives me the corresponding labels (sci-fi) wherever the word is used, and also gives those labels (action) where the word virus itself is not present but its semantically related words (like poison, poisonous) are. The semantically related words can easily be fetched from the model. I just want to list the labels.
I want to know if something could be applied rather than using keyword search. Is there any particular method that could help me solve this problem?
Thanks

Determining canonical classes with text data

I have a unique problem and I'm not aware of any algorithm that can help me. Maybe someone on here does.
I have a dataset compiled from many different sources (teams). One field in particular is called "type". Here are some example values for type:
aple, apples, appls, ornge, fruits, orange, orange z, pear,
cauliflower, colifower, brocli, brocoli, leeks, veg, vegetables.
What I would like to be able to do is to group them together into e.g. fruits, vegetables, etc.
Put another way I have multiple spellings of various permutations of a parent level variable (fruits or vegetables in this example) and I need to be able to group them as best I can.
The only other potentially relevant feature of the data is the team that entered it, assuming some consistency in the way each team enters their data.
So, I have several million records of multiple spellings and short spellings (e.g. apple, appls) and I want to group them together in some way. In this example by fruits and vegetables.
Clustering would be challenging since each entry is most often 1 or two words, making it tricky to calculate a distance between terms.
Short of creating a massive lookup table created by a human (not likely with millions of rows), is there any approach I can take with this problem?
You will first need to solve the spelling problem, unless you have Google-scale data that would allow you to learn to fix spelling with Google-scale statistics.
Then you will still have the problem that "Apple" could be a fruit or a computer. "Apple" and "Granny Smith" will be completely different. Your best guess at this second stage is something like word2vec trained on massive data. Then you get high-dimensional word vectors, and can finally try to solve the clustering challenge, if you ever get that far with decent results. Good luck.
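As a rough illustration of the first stage only (spelling), here is a minimal sketch that uses fuzzy string matching from Python's standard `difflib` module. This is a swapped-in technique, not the word2vec approach described above, and it assumes a small hand-made `CANONICAL` table; the table contents, the `classify` name, and the 0.6 cutoff are all assumptions.

```python
from difflib import get_close_matches

# Hypothetical seed vocabulary: correctly spelled term -> parent class.
CANONICAL = {
    "apple": "fruits",
    "orange": "fruits",
    "pear": "fruits",
    "fruits": "fruits",
    "cauliflower": "vegetables",
    "broccoli": "vegetables",
    "leek": "vegetables",
    "vegetables": "vegetables",
}


def classify(raw, cutoff=0.6):
    """Map a raw 'type' value to a parent class, or None if nothing is close."""
    term = raw.strip().lower()
    # Pick the single most similar canonical spelling above the cutoff.
    match = get_close_matches(term, list(CANONICAL), n=1, cutoff=cutoff)
    return CANONICAL[match[0]] if match else None


for value in ["aple", "appls", "ornge", "orange z", "colifower", "brocli"]:
    print(value, "->", classify(value))
```

Short abbreviations such as "veg" are too dissimilar to any seed term at this cutoff and come back as `None`, so they would still need a manual alias or a semantic model. With millions of rows, it would make sense to run this on the distinct values rather than on every record.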

Creating Dynamic Filters in Tableau

I'm working in Tableau to help my school district visualize discipline data. I want to be able to disaggregate and filter by quite a few different measures (at least 13).
In the past, if I wanted to be able to disaggregate by a number of measures, I would make a parameter with a list of possible outputs, display each output as the name of a measure, then create a calculated field that returned the value from a given measure based on that parameter. This works fine for disaggregating.
However, filtering based on these values presents a challenge. The problem is that I'm not filtering based on any given measure, I'm filtering on a calculated field that returns the value in that measure. If my parameter is set to "Day" for instance, and I filter to Tuesday, but then switch to "Race", everything vanishes, because now my calculated field is returning race. What I want to create is a dropdown menu that lets you select from a number of different measures to filter by.
Below is a link to a packaged workbook that can help illustrate the problem that I'm dealing with.
I feel like something like this should be possible in Tableau, but there's some little trick that I'm missing. When I contacted their support team, their solutions were both only viable due to the limited number of measures I was using in the dummy data. The support team felt that this was possible as well, but they didn't know how.
https://public.tableau.com/profile/publish/DynamicFiltersUsingParameters/Sheet1#!/publish-confirm
You could create a Filter Action on the Tableau dashboard which carries over the 'Day' filter to give a smaller subset of data to work with for the next filter.

What do I need to find out council map boundaries in Australia for iPhone app?

There are probably about 600 councils in Australia. I need to work out how to create boundaries for them all within my iPhone application so that when a user is in a certain area the application will know which council the user is in.
I probably can get a lot of this information from councils, however what information would I need to ask for? Is boundary information enough? And then how should my developer use that?
Thanks,
It sounds like what you're asking about is how to define the boundary of a council. Generally the boundary of a council (or country, or any other geographic region) can be defined by an ordered series of latitude, longitude pairs which represent points on the surface of the Earth; the border is the line that connects them.
Such a series might look a bit like the following:
Region 1:
64.222, 41.135
64.161, 41.143
64.114, 41.080
...
Region 2:
64.114, 41.080
64.008, 41.090
64.008, 40.902
...
Given such a series of border points there are established algorithms for determining whether a given point is within the region (if you're curious you can read about them here). I'm not sure whether there are more efficient algorithms for determining which of several regions a point is in, but that's for your developer to figure out.
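One such algorithm is ray casting (the even-odd rule). The sketch below is a minimal version that treats latitude/longitude as flat plane coordinates, which is an approximation that is reasonable for council-sized areas but not for regions crossing the antimeridian. The function names and the square example regions are hypothetical.

```python
def point_in_polygon(lat, lon, boundary):
    """Even-odd ray casting: True if (lat, lon) lies inside the boundary.

    boundary is an ordered list of (lat, lon) pairs; the last point is
    implicitly connected back to the first.
    """
    inside = False
    n = len(boundary)
    for i in range(n):
        lat1, lon1 = boundary[i]
        lat2, lon2 = boundary[(i + 1) % n]
        # Does this edge straddle the line of constant latitude through the point?
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which the edge crosses that latitude.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            # Count only crossings on one side of the point; an odd count means inside.
            if lon < cross_lon:
                inside = not inside
    return inside


def find_region(lat, lon, regions):
    """Return the name of the first region containing the point, else None."""
    for name, boundary in regions.items():
        if point_in_polygon(lat, lon, boundary):
            return name
    return None


# Hypothetical square regions, for illustration only.
example_regions = {
    "Region 1": [(0, 0), (0, 10), (10, 10), (10, 0)],
    "Region 2": [(0, 10), (0, 20), (10, 20), (10, 10)],
}
print(find_region(5, 15, example_regions))
```

Checking every region in turn is linear in the number of councils; with around 600 of them, a cheap bounding-box test before the full polygon test would likely be enough of a speed-up.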
I'll answer your two questions separately:
1. Where do I get council map boundaries for Australia?
The Australian Bureau of Statistics publish this data in ESRI Shapefile and MapInfo format. The areas are known as "Local Government Areas". The 2010 data set is available at http://www.abs.gov.au/AUSSTATS/abs#.nsf/DetailsPage/1259.0.30.001July%202010?OpenDocument
2. How do I use geospatial data?
The ESRI Shapefile format can be read by pretty much every spatial data package under the sun. I have some favourites however:
On the client side my favourite library is GDAL, a translator library with an X/MIT style Open Source license. It comes with C, C++, Python and C# bindings. Or if this is too heavyweight, you might prefer to directly use Shapelib, an MIT licensed C library used by GDAL.
On the server side you can't go past PostGIS. If you are sending your latitude/longitude pair to a web server, consider installing these spatial extensions for the PostgreSQL server. You can load a shapefile into the database using the bundled shp2pgsql utility. Then, to find the LGA your lat/lon pair falls into, query the database like this:
SELECT * FROM lga2010
WHERE ST_Intersects(lga2010.the_geom,
ST_SetSRID(ST_MakePoint(your_longitude, your_latitude),4326))