FB Deduplication 3 Ways: event_id, fbp, external_id - facebook

There are three ways in which FB browser and server pixel events can be deduplicated, i.e. event_id, fbp, and external_id.
Which one should be used to get a higher number of Results in FB?
What effect will it have if we use all three values in both the FB Pixel and the Conversions API?
Thanks

Related

How to implement cursors for pagination in an api

This is similar to this question, which doesn't have any answers. I've read all about how to use cursors with the Twitter, Facebook, and Disqus APIs, and also this article about how Disqus generally built their cursors, but I still cannot seem to grok the concept of how they work and how to implement a similar solution in my own projects. Can someone explain specifically the different techniques and concepts behind them?
Let's first understand, with an example, why offset pagination fails for large data sets.
Clients provide two parameters: limit for the number of results, and offset for the number of items to skip.
For example, with offset = 40, limit = 20, we can tell the database to return the next 20 items, skipping the first 40.
Drawbacks:
Using LIMIT OFFSET doesn’t scale well for large datasets. As the offset increases the farther you go within the dataset, the database still has to read up to offset + count rows from disk, before discarding the offset and only returning count rows.
If items are being written to the dataset at a high frequency, the page window becomes unreliable, potentially skipping or returning duplicate results.
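The second drawback can be reproduced with a small Python sketch. It models the table as an in-memory list rather than a real database, and the function name and sample ids are hypothetical, chosen only for illustration:

```python
def offset_page(items, offset, limit):
    """Return one page the way LIMIT/OFFSET would: skip `offset` rows, take `limit`."""
    return items[offset:offset + limit]

# Newest-first feed of ids (made-up sample data).
feed = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

# The client reads the first page.
page1 = offset_page(feed, offset=0, limit=3)

# A new item is written before the client asks for the second page.
feed.insert(0, 11)

# The window has shifted by one row, so the last item of page 1 comes back again.
page2 = offset_page(feed, offset=3, limit=3)

duplicates = set(page1) & set(page2)
print(page1, page2, duplicates)
```

If an item were deleted between the two requests instead, the window would shift the other way and one item would be skipped.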
How do cursors solve this?
Cursor-based pagination works by returning a pointer to a specific item in the dataset. On subsequent requests, the server returns results after the given pointer.
In this case the client provides next_cursor along with limit as parameters.
Let’s assume we want to paginate from the most recent user to the oldest user. When the client makes its first request, suppose we select the first page with the query:
SELECT * FROM users
WHERE team_id = %team_id
ORDER BY id DESC
LIMIT %limit
Here %limit is the client's limit plus one, so that we fetch one more result than the count specified by the client. The extra result isn’t returned in the result set, but we use its ID as the next_cursor.
The response from the server would be:
{
"users": [...],
"next_cursor": "1234", # the user id of the extra result
}
The client would then provide next_cursor as cursor in the second request.
SELECT * FROM users
WHERE team_id = %team_id
AND id <= %cursor
ORDER BY id DESC
LIMIT %limit
With this, we’ve addressed the drawbacks of offset based pagination:
Instead of the window being calculated from scratch on each request based on the total number of items, we’re always fetching the next count rows after a specific reference point. If items are being written to the dataset at a high frequency, the overall position of the cursor in the set might change, but the pagination window adjusts accordingly.
This will scale well for large datasets. We’re using a WHERE clause to fetch rows with id values less than the last id from the previous page. This lets us leverage the index on the column and the database doesn’t have to read any rows that we’ve already seen.
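A minimal runnable sketch of the two queries above, using Python's built-in sqlite3 module. The function name fetch_page, the name column, and the sample rows are assumptions made for illustration; only the users table, team_id, id, limit, and the cursor idea come from the explanation above.

```python
import sqlite3

def fetch_page(conn, team_id, limit, cursor=None):
    """Return (rows, next_cursor) for one page, newest id first."""
    sql = "SELECT id, name FROM users WHERE team_id = ?"
    params = [team_id]
    if cursor is not None:
        # The cursor is the id of the extra row from the previous page,
        # so it is included here (<=) as the first row of this page.
        sql += " AND id <= ?"
        params.append(cursor)
    # Fetch one row more than requested to learn whether another page exists.
    sql += " ORDER BY id DESC LIMIT ?"
    params.append(limit + 1)
    rows = conn.execute(sql, params).fetchall()
    next_cursor = rows[limit][0] if len(rows) > limit else None
    return rows[:limit], next_cursor

# Hypothetical sample data: seven users on team 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, team_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, team_id, name) VALUES (?, ?, ?)",
    [(i, 1, f"user{i}") for i in range(1, 8)],
)

page1, cursor1 = fetch_page(conn, team_id=1, limit=3)
page2, cursor2 = fetch_page(conn, team_id=1, limit=3, cursor=cursor1)
page3, cursor3 = fetch_page(conn, team_id=1, limit=3, cursor=cursor2)
```

A next_cursor of None signals that the client has reached the last page. In a public API the cursor would often be encoded as an opaque string rather than exposed as a raw id.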
For a detailed explanation, see this wonderful engineering article from Slack!
Here is an article about pagination: paginating-real-time-data-cursor-based-pagination
Cursors – we need to have at least one column with unique sequential values to implement cursor based pagination. This can be similar to Twitter’s max_id parameter or Facebook’s after parameter.
In general you should pass the current item or page number in the request as a parameter. The other usual parameter is the batch size of the page. Then, on the server side, you select and return the proper dataset, with an SQL query for example.
Here's what I ended up with. The cursor works as a pointer to a specific index, and limit picks that many rows starting from the pointer. Let's say we are given id 10 and limit 5: the query goes to id 10 and picks 5 elements from there.
Some Graph API connections use cursors by default. You can use the 'limit' and 'before'/'after' parameters in your call. If you are still not clear, you can post your code here and I can explain with it.

FQL, search for non-friend people and sort by mutual friends

I look for facebook users (any user, not just my friends or the friends of a specific user) with a pattern in the name, like this:
SELECT first_name,uid FROM user WHERE contains('pete')
This returns an array of people whose name contains "pete" (i.e. also "peter"). The number is quite high, so the results are of course truncated. Within this set (before the server-side truncation), I would like to find who has the most mutual friends with me (or with an arbitrary user id). However, issuing
SELECT first_name,uid FROM user WHERE contains('pete') AND mutual_friend_count > 1
gives an empty set (and no errors). I think the server truncates before applying the latter selection, unlike what would happen in normal SQL, for performance reasons (or because of explicit limits I missed in the documentation).
I made a second attempt by trying to retrieve all the friends of friends of a userid, and then select on the name. However, retrieving friends of friends is not allowed (and technically difficult because some friends hide their list of friends, and this breaks the query).
I can see no other way to achieve my goal... is there any? (I mean: given a userid, search for users with highest number of mutual friends).
Your best bet, since there's no FQL solution, would be to begin a database of your own following Facebook's best practices and policies of course. That way you can structure your data. Data can be kept current by setting up Facebook's Realtime updates API (https://developers.facebook.com/docs/reference/api/realtime/) to capture when new friendships are made.
I just tried it with "order by mutual_friend_count desc" and it worked. So the query needs to be:
SELECT first_name,uid FROM user WHERE contains('pete') order by mutual_friend_count desc

Facebook IN clause in FQL query

I have a big problem with Facebook FQL. If I do
select message from stream where source_id=175102475936031
I get the correct data. If I do this
select message from stream where source_id IN(175102475936031,etc,etc,etc)
I get an empty JSON. Why? Is there any alternative to this select? I have to read the messages from multiple events.
Is your problem solved?
The "IN" query is a supported feature of FQL. I have tried it multiple times.
It only requires that:
the column is indexable
you have permission to access the data
data is present for at least one column :P
select message from stream where source_id IN( 175102475936031, etc, etc, etc)
"NOT IN" is not a supported feature of FQL. Ref: http://forum.developers.facebook.net/viewtopic.php?id=1420

How to get (better) demographics for fans of a Facebook page?

I'm trying to get demographics for fans of a page on Facebook - mostly country and city, but age and gender as secondary.
The primary way to do it is using FQL and doing a query in the insights table. Like so:
FB.api({
method: 'fql.query',
query: "SELECT metric, value FROM insights WHERE object_id='288162265211' AND metric='page_fans_city' AND end_time=end_time_date('2011-04-16') AND period=period('lifetime')"
}, callback);
The problem with this, however, is that the table returns a maximum of 19 records only, both for the country and the city stats. The response for a page I'm testing is as such:
[
{
"metric": "page_fans_city",
"value": {
"dallas": "12345",
"atlanta": "12340",
(...)
"miami": "12300"
}
}
]
So I'd like to know if there's any alternative to that -- to get demographics of the current fans of a page (no snapshot necessary).
Things I've tried:
Using LIMIT and OFFSET on the query does nothing (other than, sometimes, giving me an empty list).
One alternative that has been discussed in the past is to use the "/members" method from the Graph API (more here) to get a list of all users, and then parse through that list. That simply doesn't work - a method exists, and it may have worked in the past, but it's not valid anymore (disabled?).
Request:
https://graph.facebook.com/platform/members?access_token=...
Response:
{"error":
{
"type":"OAuthException",
"message":"(#604) Your statement is not indexable. The WHERE clause must contain an indexable column. Such columns are marked with * in the tables linked from http:\/\/developers.facebook.com\/docs\/reference\/fql "
}}
Another solution was to query the page_fan table, filtering by page_id. This doesn't work either; it may have worked in the past, but now it says that the page_id column is not indexable and therefore cannot be used (same error as above, which leads me to believe /members uses the same internal API that has been disabled). A page_fan query is only useful to check whether individual users are fans of a page.
There's also the like table, but that's only useful for Facebook items (like posts, photos, links, etc), and not Facebook Pages.
Going to the insights website about the Page, you can see the data in some nice graphs and tables, and download an Excel/CSV spreadsheet with the historic demographics data... however, it also limits the data to 19 entries (sometimes 20 with a few holes in there as cities trade top positions though).
Any other hint on how to get that data? I'd either like the insights query with more results, or at least a way to get all the page fans so I could do the location query myself later (even if the page I want to get it from has almost 5 million fans... gulp).
The data pipeline for this metric is currently limited to 20 items. This is a popular feature request and something Facebook hopes to improve soon.

Facebook - max number of parameters in “IN” clause?

In Facebook Query Language (FQL), you can specify an IN clause, like:
SELECT uid1, uid2 FROM friend WHERE uid1 IN (1000, 1001, 1002)
Does anyone know what's the maximum number of parameters you can pass into IN?
I think the maximum result set size is 5000.
It may seem like an odd number (so perhaps I miscounted, but it's within about 1), but I cannot seem to query more than 73 IN items. This is my query:
SELECT object_id, metric, value
FROM insights
WHERE object_id IN ( ~73 PAGE IDS HERE~ )
AND metric='page_fans'
AND end_time=end_time_date('2011-06-04')
AND period=period('lifetime')
This is using the JavaScript FB.api() call.
I'm not sure whether there is an undocumented limit or whether it is an issue of a Facebook FQL server timeout.
You should check whether the FB web server returns an error 500, which might indicate that your GET request is too long (see Facebook query language - long query).
I realized my GET was too long, so instead of putting many numbers in the IN statement, I put a sub-query there that fetches those numbers from FB FQL. Unfortunately, it looks like FB couldn't handle the query and returned an 'unknown error' in the JSON, which really doesn't help us understand the problem.
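One possible workaround for an over-long GET, sketched here in Python, is to split the ID list into smaller batches and issue one query per batch. The helper names and the batch size of 50 are assumptions for illustration, not documented Facebook limits; the query text mirrors the insights query quoted earlier in this thread.

```python
def chunked(ids, size):
    """Yield consecutive slices of `ids`, each with at most `size` elements."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]

def build_insights_queries(page_ids, batch_size=50):
    """Build one FQL query string per batch of page ids (batch size is a guess)."""
    queries = []
    for batch in chunked(page_ids, batch_size):
        id_list = ", ".join(str(page_id) for page_id in batch)
        queries.append(
            "SELECT object_id, metric, value FROM insights "
            f"WHERE object_id IN ({id_list}) "
            "AND metric='page_fans' "
            "AND end_time=end_time_date('2011-06-04') "
            "AND period=period('lifetime')"
        )
    return queries

# Hypothetical page ids 1..120, split into batches of 50, 50, and 20.
queries = build_insights_queries(list(range(1, 121)), batch_size=50)
```

Each query string would then be sent as its own request and the results merged on the client.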
There should not be a maximum number of parameters, as there isn't one for the SQL IN operator as far as I know.
http://www.sql-tutorial.net/SQL-IN.asp
Just don't use more parameters than you have values for the function to check, because you will not get any results (I don't know if it will raise an error, as I never tried).