InputObject Properties Creation - powershell

I need some assistance creating a hashtable of users to use with Get-MGBetaUser.
The Microsoft documentation (https://learn.microsoft.com/en-us/powershell/module/microsoft.graph.users/get-mguser?view=graph-powershell-1.0) gives you the parameter it's looking for (UserId), but I can't find any other articles online with an exact use case such as this.
Currently I can get one object into the hash, and I have to access it directly by asking for its key:
$Users = @{}
$Users['UserID'] = @{
    UserId = "<IDOfUser>"
}
Get-MGBetaUser -InputObject $Users.UserID
If I pipe this same hash into Get-MGBetaUser, I'll get the error:
Line |
6 | $Users | Get-MGBetaUser
| ~~~~~~~~~~~~~~~~~~~~~~~
| Resource 'System.Collections.Hashtable' does not exist or one of its queried reference-property objects
| are not present.
The hash will have approx. 15-20k user IDs which will need to be added, and they'll be coming from a CSV.
It looks like Microsoft will only accept pipeline input to this cmdlet through a hash like this. Everything else I've worked with will allow piping an array of IDs into it.
Thank you in advance for any assistance.

Thanks all for your responses. As it seems the answer is that you can't supply the Graph SDK with an array or hash of users as originally intended, I opted to go a different route. The example I gave was meant to spin off x number of jobs based on the number of hashes/files I had, so that I could limit each job's scope and create realistic timeframes for data pulls.
I decided to create a hash table of filters which limits each job's exposure. In my case, we have 140k+ students, so I created a filter for surnames ending in A, B, C... Spinning these into 27 jobs (A-Z) I can get complete results back within 30 minutes. In my testing, trying to do one big pull of students took hours. The reason for doing this is that I am also opting to get licensing information along with sign-in activity (which requires you to supply a GUID if you use a foreach).
Allowing Graph to apply the filter enables the SDK to get the data in batches, rather than the other way around.
If anyone has further insight into this, or sees a better way (other than foreach), feel free to let me know; otherwise I'm marking this closed!
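For illustration, a minimal sketch of the per-letter job approach, assuming PowerShell 7 with the ThreadJob and Graph Beta modules available and an existing Connect-MgGraph session; the startswith(surname, ...) filter and the property list are assumptions, not the exact filters used above:

$letters = 65..90 | ForEach-Object { [char]$_ }  # A-Z
$jobs = foreach ($letter in $letters) {
    Start-ThreadJob -ScriptBlock {
        param($l)
        # Each job pulls only its slice of users; Graph pages the results server-side.
        Get-MgBetaUser -All -Filter "startswith(surname, '$l')" `
            -Property Id, DisplayName, Surname, AssignedLicenses, SignInActivity
    } -ArgumentList $letter
}
$results = $jobs | Receive-Job -Wait -AutoRemoveJob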

Related

How to read paged records in mongo when record creator is deleted

Maybe the title is not very clear, but I'll try to explain.
There are two collections in mongo:
groups
users
Groups are created by users.
The UI sends /groups/1/10 to read the first 10 groups. We don't want to return groups whose creators (users) are deleted.
Example:
UI makes call: /groups/1/10
Let's say only 8 records are returned because 2 users were deleted from the system; hence their groups are not available.
What should we do?
Should the UI make another request like /groups/1/2?
Should we read, let's say, 20 groups, take the first 10 available groups, and return them? This may not be very good for the second or third pages.
There is not enough information here to give a specific answer; specifically, we need to know more about the schema that you are using. We'll try to give some general details that might point things in the right direction. We are also assuming that your endpoints are structured as /groups/<pageNumber>/<pageSize>.
Broadly speaking, if the client calls /groups/1/10 and there are (at least) 10 valid matching results, then the system should return 10 results.
It's not clear what you mean when you say:
only 8 records are available because 2 users are deleted from the system, hence their groups are not available ... Should UI make another request like: /groups/1/2 ?
The first part of that statement implies that there are only 8 valid results, but the second part implies that there are at least 2 more valid results that can be retrieved. If there are 10 valid results then they should all be returned.
How you accomplish this depends on how invalid groups and/or deleted users are represented in your system. If, for example, the documents in your groups collection have some sort of valid field that becomes false when the user who created it gets deleted, then you should be applying a filter to remove those results such as:
db.groups.find({ valid: true }).limit(10)
If instead each group document references the user who created it, then you may need to do something a bit more complex. That may be along the lines of an aggregation that does a $lookup on the users collection and then performs a subsequent $match to remove the groups whose creators have been deleted.
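As a hypothetical sketch, assuming each group document stores a creatorId that references users._id:

db.groups.aggregate([
  // join each group with its creator; 'creator' is empty if the user was deleted
  { $lookup: { from: "users", localField: "creatorId", foreignField: "_id", as: "creator" } },
  // keep only groups whose creator still exists
  { $match: { creator: { $ne: [] } } },
  // page 1 with a page size of 10
  { $skip: 0 },
  { $limit: 10 }
])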
While there are many approaches to this problem, the only one that I would consider incorrect would be to force the client to perform the group validity check and/or force the client to make multiple requests.

IBM Cloudant DB - get historical data - best way?

I'm pretty confused by this hip thing called NoSQL, especially Cloudant DB by Bluemix. As you know, this DB doesn't store the values chronologically. It's the programmer's task to sort the entries in case he wants the data to... well... be sorted.
What I'm trying to achieve is to simply get the last, let's say, 100 values a sensor has sent to Watson IoT (which saves everything in the connected Cloudant DB) in an ORDERED way. In the end it would be nice to show them in a D3.js-style graph, but that's another task. I first need the values in an ordered array.
What I tried so far: I used curl to get the data via PHP from https://averylongID-bluemix.cloudant.com/iotp_orgID_iotdb_2018-01-25/_all_docs?limit=20&include_docs=true
What I get is an unsorted array of 20 row entries with random timestamps: the last 20 entries in the DB, but not in terms of timestamps.
My question is now: Do you know of a way to get the "last" 20 entries? Sorted by timestamp? I did a POST request with a JSON string where I wanted the data to be sorted by the timestamp, but that doesn't work, maybe because of the ISO timestamp string.
Do I really have to write a JavaScript or PHP script to get ALL the database entries and then look for the last 20 or 100 entries by parsing the timestamps, sorting the array again, and then getting the (now really) last entries? I can't believe that.
Many thanks in advance!
I finally found out how to get the data in a nice, ordered way. The key is to use the _design API together with the _view API.
So a curl request with the following URL / attributes and a query string did the job:
https://alphanumerical_something-bluemix.cloudant.com/iotp_orgID_iotdb_2018-01-25/_design/iotp/_view/by-date?limit=120&q=name:%27timestamp%27
The curl result gets me the first (in terms of time) 120 entries. I just have to find out how to get the last entries, but that's already a pretty good result. I can now pass the data on to a nice JS chart and display it.
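For what it's worth, CouchDB/Cloudant views also accept a descending=true query parameter, which reverses the key order; assuming the by-date view emits the timestamp as its key, something like the following should return the newest 120 entries first:

https://alphanumerical_something-bluemix.cloudant.com/iotp_orgID_iotdb_2018-01-25/_design/iotp/_view/by-date?limit=120&descending=true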
One option may be to include the timestamp as part of the ID. The _all_docs query returns documents in order by id.
If that approach does not work for you, you could look at creating a secondary index based on the timestamp field. One type of index is Cloudant Query:
https://console.bluemix.net/docs/services/Cloudant/api/cloudant_query.html#query
Cloudant query allows you to specify a sort argument:
https://console.bluemix.net/docs/services/Cloudant/api/cloudant_query.html#sort-syntax
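A minimal sketch of such a query, assuming each document has a timestamp field covered by a type "json" index (the field and database names here are placeholders):

POST https://averylongID-bluemix.cloudant.com/iotp_orgID_iotdb_2018-01-25/_find
{
  "selector": { "timestamp": { "$gt": null } },
  "sort": [ { "timestamp": "desc" } ],
  "limit": 20
}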
Another approach that may be useful for you is the _changes api:
https://console.bluemix.net/docs/services/Cloudant/api/database.html#get-changes
The changes API allows you to receive a continuous feed of changes in your database. You could feed these changes into a D3 chart for example.
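For example, a simple continuous request against the changes feed (database name as above) might look like:

curl "https://averylongID-bluemix.cloudant.com/iotp_orgID_iotdb_2018-01-25/_changes?feed=continuous&include_docs=true&since=now"

Each change includes the full document (thanks to include_docs=true), so new sensor readings can be appended to the chart as they arrive.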

REST: where should the endpoint go?

Suppose there are USERS and ORDERS.
For a specific user's order list you could do:
/user/3/order_list
/order/?user=3
Which one is preferred, and why?
Optional parameters tend to be easier to put in the query string.
If you want to return a 404 error when the parameter value does not correspond to an existing resource then I would tend towards a path segment parameter. e.g. /customer/232 where 232 is not a valid customer id.
If, however, you want to return an empty list when the parameter is not found, use query string parameters. e.g. /contacts?name=dave
If a parameter affects an entire URI structure, use a path. e.g. a language parameter: /en/document/foo.txt versus /document/foo.txt?language=en
Unique identifiers tend to belong in a path rather than in a query parameter.
Paths are friendlier for search engines, browser history, and navigation.
When I started to create an API, I was thinking about the same question.
A video from Apigee helped me a lot.
In a nutshell, when you decide to build an API, you should decide which entity is independent and which is only related to another.
For example, if you have a specific endpoint for orders with create/update/delete operations, then it will be fine to use the second approach: /order/?user=3.
The other way around, if orders have only one representation, depend on a user, and don't have any special interactions of their own, then you could use the first approach.
There is also a nice article about best practices.
The whole point of REST is resources. You should try and map them as closely as possible to the actual requests you're going to get. I'd definitely not call it order_list because that looks like an action (you're "listing" the orders, while GET should be enough to tell you that you're getting something)
So, first of all I think you should have /users instead of /user. Then consider it as a tree structure:
A seller (for lack of a better name) can have multiple users
A user can have multiple orders
An order can have multiple items
So, I'd go for something like:
The seller can see its users with yourdomain.com/my/users
The details of a single user can be seen with yourdomain.com/my/users/3
The orders of a single user can be seen with yourdomain.com/my/users/3/orders
A single order (and its items) can be seen with yourdomain.com/my/users/3/orders/5
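For illustration, a minimal Express-style sketch of the nested routes above (handler bodies and names are assumptions, not a definitive implementation):

const express = require('express');
const app = express();

// the orders of a single user: GET /my/users/3/orders
app.get('/my/users/:userId/orders', (req, res) => {
  // look up orders for req.params.userId here
  res.json([]);
});

// a single order: GET /my/users/3/orders/5
app.get('/my/users/:userId/orders/:orderId', (req, res) => {
  // look up the order identified by req.params.orderId here
  res.json({});
});

app.listen(3000);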

How to optimize collection subscription in Meteor?

I'm working on a filtered live search module with Meteor.js.
Use case & problem:
A user wants to search through all the users to find friends. But I can't afford to send the complete users collection to each client. The user filters the search using checkboxes, and I'd like to subscribe to just the matched users. What is the best way to do it?
I guess it would be better to create the query client-side, then send it to the method to get back the desired set of users. But I wonder: when the filtering criteria change, does the new subscription erase all of the old one? Because if I do a first search which returns [usr1, usr3, usr5], and after that a search that returns [usr2, usr4], the best would be to keep the first set and simply add the new one to it in the client-side subscribed collection.
And, in addition, if I then do a third search which should return [usr1, usr3, usr2, usr4], the autorun subscription would not need to send me anything, as I already have the whole result set in my collection.
The goal is to spare processing and data transfer on the server.
I have some ideas, but I haven't coded enough of it yet to share it in an easily comprehensible way.
How would you advise me to go about this to save as much time and processing as possible?
Thank you all.
David
It depends on your application, but you'll probably send a non-empty string to a publisher which uses that string to search the users collection for matching names. For example:
Meteor.publish('usersByName', function(search) {
  check(search, String);

  // make sure the user is logged in and that search is sufficiently long
  if (!(this.userId && search.length > 2))
    return [];

  // search by case-insensitive regular expression
  var selector = {username: new RegExp(search, 'i')};

  // only publish the necessary fields
  var options = {fields: {username: 1}};

  return Meteor.users.find(selector, options);
});
Also see common mistakes for why we limit the fields.
performance
Meteor is clever enough to keep track of the current document set that each client has for each publisher. When the publisher reruns, it knows to only send the difference between the sets. So the situation you described above is already taken care of for you.
If you were subscribed for users: 1,2,3
Then you restarted the subscription for users 2,3,4
The server would send a removed message for 1 and an added message for 4.
Note this will not happen if you stopped the subscription prior to rerunning it.
To my knowledge, there isn't a way to avoid removed messages when modifying the parameters for a single subscription. I can think of two possible (but tricky) alternatives:
Accumulate the intersection of all prior search queries and use that when subscribing. For example, if a user searched for {height: 5} and then searched for {eyes: 'blue'} you could subscribe with {height: 5, eyes: 'blue'}. This may be hard to implement on the client, but it should accomplish what you want with the minimum network traffic.
Accumulate active subscriptions. Rather than modifying the existing subscription each time the user modifies the search, start a new subscription for the new set of documents, and push the subscription handle to an array. When the template is destroyed, you'll need to iterate through all of the handles and call stop() on them. This should work, but it will consume more resources (both network and server memory + CPU).
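A rough sketch of that second alternative (the template, form selector, and publication wiring are hypothetical), reusing the usersByName publisher from above:

Template.search.onCreated(function () {
  this.handles = [];
});

Template.search.events({
  'submit .search-form': function (event, instance) {
    event.preventDefault();
    var search = event.target.search.value;
    // start a new subscription for each search instead of replacing the old one
    instance.handles.push(Meteor.subscribe('usersByName', search));
  }
});

Template.search.onDestroyed(function () {
  // stop every accumulated subscription when the template goes away
  this.handles.forEach(function (handle) {
    handle.stop();
  });
});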
Before attempting either of these solutions, I'd recommend benchmarking the worst case scenario without using them. My main concern is that without fairly tight controls, you could end up publishing the entire users collection after successive searches.
If you want to go easy on your server, you'll want to send as little data to the client as possible. That means every document you send to the client that is NOT a friend is waste. So let's eliminate all that waste.
Collect your filters (e.g. filters = {sex: 'Male', state: 'Oregon'}). Then call a method to search based on your filters (e.g. Users.find(filters)). Additionally, you can run your own proprietary ranking algorithm to determine the % chance that a person is a friend. Maybe base it off of distance from IP address (or from phone GPS history), mutual friends, etc. This will pay dividends in efficiency in a bit. Index things like GPS coords or other highly unique attributes, and maybe try out composite indexes. But remember: more indexes means slower writes.
Now you've got a cursor with all possible friends, ranked from most likely to least likely.
Next, change your subscription to match those friends, but put a limit: 20 on it. Also, only send over the fields you need. That way, if a user wants to skip this step, you've only wasted sending 20 partial docs over the wire. Then, have an infinite scroll or a 'load more' button the user can click. When they load more, it's an additive subscription, so it's not resending duplicate info. Discover Meteor describes this pattern in great detail, so I won't.
After a few clicks/scrolls, the user won't find any more friends (because you were smart & sorted them) so they will stop trying & move on to the next step. If you returned 200 possible friends & they stop trying after 60, you just saved 140 docs from going through the pipeline. There's your efficiency.
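A minimal sketch of that limited, additive subscription (the rankScore field, publication name, and getFilters() helper are assumptions; in practice you'd validate the incoming filters rather than passing them straight into the query):

// server: publish top-ranked candidates, trimmed and capped
Meteor.publish('candidateFriends', function (filters, limit) {
  check(filters, Object);
  check(limit, Number);
  return Meteor.users.find(filters, {
    sort: { rankScore: -1 },      // hypothetical ranking field
    limit: Math.min(limit, 100),  // cap what a client may request
    fields: { username: 1, rankScore: 1 }
  });
});

// client: bump the limit when the user clicks 'load more'
Session.setDefault('friendLimit', 20);
Tracker.autorun(function () {
  Meteor.subscribe('candidateFriends', getFilters(), Session.get('friendLimit'));
});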

Multiple tags / folders in Google Reader

I want to be able to grab data from multiple tags/folders in a user's Google Reader.
I know how to do one (http://www.google.com/reader/atom/user/-/label/SOMELABEL), but how would you do two or three or ten?
It doesn't look like you can get multiple tags/folders in one request. If it's feasible, you should iterate over the different tags/folders and aggregate them in your application.
[edit]
Since it looks like you have a large list of tags/folders you need to query, an alternative is to get the full list of entries and then sort out the ones the user wants. It looks like each entry has a category element that will tell you which tag is associated with it. This might be feasible in your case.
(Source: http://code.google.com/p/pyrfeed/wiki/GoogleReaderAPI)
(Source: http://www.google.com/reader/atom/user/-/state/com.google/starred)
I think you cannot get aggregated data as you hope to. If you think about it, even Google only lets you browse folders or tags one at a time, and does not aggregate a subset of them.
You can choose to have a list of all the items (for each one of their available statuses) or a list of a particular tag/folder.
You could do it in 2 requests. First, perform a GET request to http://www.google.com/reader/stream/items/ids. It supports several parameters, like:
s (required; the stream id to fetch; may be given more than once),
n (required; the number of items to fetch),
r for ranking (optional),
and others (see the /ids section for more).
Then perform a POST request (because there could be a lot of ids, a GET request could get cut off) to http://www.google.com/reader/api/0/stream/items/contents. The required parameter is i, which holds a feed item identifier and may be given more than once.
This should return data from several feeds (it did for me).
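As a rough curl sketch of those two requests (the labels and item ids are placeholders, and an authenticated Reader session is assumed):

# 1) fetch item ids from several streams at once
curl "http://www.google.com/reader/stream/items/ids?s=user/-/label/LABEL1&s=user/-/label/LABEL2&n=50"

# 2) POST the returned ids to fetch their contents
curl -X POST "http://www.google.com/reader/api/0/stream/items/contents" -d "i=ITEM_ID_1" -d "i=ITEM_ID_2"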