How do I index more than 1 million product pages in Google Search Console? - google-search-console

I have a webshop with over 1 million unique products (electronic parts). Now we are running into a problem: getting them all indexed in Google.
Somehow it is too much for Google to index.
We tried writing our own bot that scripts everything, but after indexing about 400 to 600k pages it stopped working.
So we are looking for other ways to get Google to index 1 million unique product pages.
Do you have any idea?

Related

MongoDB Atlas Serverless Database | Serverless Instance Costs | RPU cost explanation

Can someone explain, with an example, how RPUs are calculated?
Let's say I have a Mongo collection that has 5 million documents. If I do a findOne on the collection, would that generate 5M RPUs or 1?
It depends on a crucially important step that I could not find anywhere in Mongo's pricing information.
I had 4000 daily visits with a user database of 25,000 users. I was using findOne on the database two times per page load. What I thought would be 8000 RPUs turned out to be 170 million RPUs. That's right - 42,500 RPUs per page visit.
So the critically important step is: add indexes for the keys you're matching on.
Yes, this is an obvious performance optimization, but it's also easy to overlook the first time you set up a database.
I was in the MongoDB dashboard daily to check user stats and was never alerted to any abnormalities. After searching online for over an hour, I could not find any mention of this from Mongo. But when I reached out to support, this was the first thing they pointed out. So they know it's an issue. They know it's a gotcha. They know how simple it would be to point out that a single call is resulting in 40,000 RPUs. It doesn't seem that they care. Rather, it seems to be a sleazy business model they've intentionally adopted.
So if you do a findOne on a 5 million document database, will you get 1 RPU charged or 5 Million RPUs charged? Depends if you remembered to index.
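For reference, the fix support pointed to is a single shell command. A minimal sketch, with placeholder collection and field names standing in for whatever key your findOne matches on:
// Index the key the findOne filters on, so the lookup becomes an index seek
// (roughly 1 document read) instead of a scan of all 5 million documents.
db.users.createIndex({ email: 1 })
db.users.findOne({ email: "someone@example.com" })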

Create a sort index for millions of records

I have an application that runs a MongoDB database. This database will store 5 million documents per user. The web application that uses this database will display 5 thousand of these documents on a page at any given time, as a selection from the 5 million. Each document must have a sort order (or rank) so that the web page can allow the user to sort their 5 thousand records by dragging items up/down as they see fit (a sortable list).
I have read articles about how Trello uses a double-precision floating-point number to change the position of a sorted item in a list, but this seems to allow only 50-odd worst-case sorts, so it will not accommodate the large number of items in the user's list. My question is: how do I do this?
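For context, a minimal sketch of the fractional-rank mechanism the Trello article describes, in the mongo shell with made-up collection and field names (it shows the midpoint trick, not the rebalancing that the precision limit eventually forces):
// Each document carries a numeric rank; the page sorts on it.
db.items.createIndex({ userId: 1, rank: 1 })
// Dragging an item between two neighbours sets its rank to their midpoint.
// draggedId, prevRank and nextRank are placeholders coming from the UI event.
var prevRank = 3.0, nextRank = 4.0;
db.items.updateOne({ _id: draggedId }, { $set: { rank: (prevRank + nextRank) / 2 } })
// After enough midpoints in the same spot the doubles run out of precision
// (the ~50 worst-case sorts mentioned above) and that range has to be re-spread.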

Weird performance drop

Welcome everyone!
I am facing a weird performance drop in my Symfony2 app using MongoDB.
The query that creates the problem is the following:
$posts = $dm->getRepository('ngNearBundle:Posts')
    ->findBy(array('author' => new \MongoId($userId)), array('date' => -1), 5);
For one user (the first ever in the database), the whole page loads in 100 ms. For another user, it takes 500 ms.
MY ASSUMPTION
The first user has almost 140K posts, whereas the second has barely 3, so in order to reach the limit (5 posts) the cursor has to scan the whole collection looking for 5 posts but only ever finds 3.
If you agree with my assumption, how can I fix this problem? Can indexes help me out here?
I already have an index on the author field, though... Is it because 99% of the posts were written by user 1, so it's easy for Mongo to fetch data for that user?
Please enlighten me.
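If it helps, the usual fix is a compound index that matches both the filter and the sort of that query, so MongoDB can stop after reading 5 index entries for either user. A sketch, assuming the collection backing those documents is named Posts:
// Matches the findBy criteria (author) and the sort (date descending).
db.Posts.createIndex({ author: 1, date: -1 })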

How to merge perl Net::LDAP search objects?

I have a task where I need to compare some employee IDs in a DB with the official source of current employee IDs that lives in an LDAP DB.
So I have a list of 3700 company IDs, which are called uid in the LDAP schema.
I realised that it might not be A Good Thing to just create a huge query like
(|(uid=foo1)(uid=foo2)....(uid=foo3700))
and submit it. So I came up with the idea of chopping my list into chunks. I did some timing: for chunks of 50 the queries ran in 7 minutes, for chunks of 100 in 4 minutes, and for chunks of 200 in 3.5 minutes.
But now I have several search objects, one per query.
I imagine I can:
1. loop over the chunks, query, process the objects, and store the results for reporting later; or
2. store the search results in some sort of array or hash; or
3. somehow combine the search results into one big search object.
I kind of like the idea of 3., because it strikes me as the most generic solution, but I have no idea how to do it.
So is 3. a good approach?
If so, how to do it?
Otherwise, what are better approaches to this problem?

Caching Array from DB (MongoDB) in Node.js / Express.js

I wanted to add user search auto-complete (like Facebook's) to my Rails app on Heroku, and I chose to write it in Node.js because of the concurrency requirements. The search first pulls a user's friend list (of ids, which includes all Twitter friends, not just their friends on our site) from Mongo, then searches for users in that list, then does another search for any other users that match the query but weren't in the results returned by the friends search.
This was fairly fast at first (~150 ms), but for users with more friends (above, say, 100 total), loading their friends array ended up being a huge bottleneck, linearly slowing down the search to a maximum of around 1500 ms for users with 1,000 friends (the maximum number supported for autocomplete friends search).
The problem is, I'm completely new to Node.js and Express (its Sinatra-like web framework), and I have no idea how to cache the friends array so I only need to load it once (ideally into memory). In Rails on Heroku I'd simply load the array into Memcache, but I'm not even sure how to configure Memcache in Node/Express, let alone if that's supported on Heroku.
Any ideas?
(Also note, I'm using multikey indexes for all these queries, including the friends' ids.)
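For the caching part of the question, a minimal in-process sketch, assuming a single Node process and a hypothetical loadFriendIds() wrapping the existing Mongo lookup (not production-grade, just the shape of it):
// In-process cache: userId -> { ids: [...], expires: timestamp }.
// Only works per process/dyno; a shared store like Memcache is needed across dynos.
var friendCache = {};
var TTL_MS = 5 * 60 * 1000; // keep a friend list for 5 minutes
function getFriendIds(userId, cb) {
    var entry = friendCache[userId];
    if (entry && entry.expires > Date.now()) {
        return cb(null, entry.ids);                // cache hit: no Mongo round trip
    }
    loadFriendIds(userId, function (err, ids) {    // cache miss: load once, then reuse
        if (err) return cb(err);
        friendCache[userId] = { ids: ids, expires: Date.now() + TTL_MS };
        cb(null, ids);
    });
}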
I imagine MongoDB would be the place to have the matching done. It seems like you are trying to get all of the results back into your own code and then match them yourself in an array. You will probably find it faster to ask MongoDB to filter down to the top 10 matching results for you and then send those directly to the client.
The best part about databases is that they can do this filtering for you, and quickly, and it should scale well beyond other solutions. Trust the database: the whole point of MongoDB is that the query should be blazingly fast, close to the speed of memcache. You just need to ask it the right question. You can hammer the database fairly hard, but make sure to request only the exact number of matches you intend to use.
To match "John Smi...", maybe something like this (I just made this up to show the idea):
// friendIdList is assumed to be a simple array of ids from your app
var matchFriends = db.people.find(
    { person_id: { $in: friendIdList }, name: /john smi.*/i }
).sort({ name: 1 }).limit(10);
See the mongodb docs on regular expression queries
Hope this helps. I am just learning about MongoDB and am not an expert, but this is how I would approach the problem on other databases.
I know very little about Node.js or Express. However, I can tell you that you probably want to do this client-side (i.e. store the friends list in a cookie on the client and use JavaScript to search).
If you look at FB's implementation, this is what they're doing (at least they were several months ago).
I would suggest that, if you aren't going to preload all the names into the client side, you would be better off doing the search after the first character is input. This reduces the number of names you need to search to a fraction, and you then submit that single request to the DB. You can return those results in alphabetical order once; as more characters are typed you can filter without re-sorting. Every request should then meet your 150 ms target, as long as a user doesn't have thousands of friends called "David Smith".
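A rough sketch of that approach (the /autocomplete endpoint, render() and the jQuery call are all assumptions, just to show the fetch-once-then-filter-locally shape):
var candidates = []; // filled once from the first-character request
function onInput(query) {
    if (query.length === 1) {
        // one server request per search: names starting with the first letter, pre-sorted
        $.getJSON('/autocomplete?q=' + encodeURIComponent(query), function (names) {
            candidates = names;
            render(filterNames(query));
        });
    } else {
        render(filterNames(query)); // later keystrokes filter locally, no server hit
    }
}
function filterNames(query) {
    var q = query.toLowerCase();
    return candidates.filter(function (name) {
        return name.toLowerCase().indexOf(q) === 0;
    }).slice(0, 10);
}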