Dynamically creating static routes from database using Next JS - server

I'm trying to understand how Next.js does dynamic routing, and I'm a little confused about how to properly implement it on my own website. Basically, I have a MySQL database of content that keeps growing; let's say they're blog posts, with images stored in GCS. From what I understand, you can create a pages/[id].js file in your pages folder that handles dynamically creating routes for new pages. But in order to get a good SEO score, Google's crawlers need to see your content before any JavaScript or data requests are made, so the pages have to be physically available for the content to appear instantly on load. So if I have pages/[id].js and content is added to the database daily, how are physical content files supposed to spontaneously populate the pages folder? And if page files keep getting created, how do I prevent my disk from running out of space? I'm sure there is something I'm not understanding.
I read on nextjs.org that you can have a function getStaticPaths that needs to return a list of possible values for 'id'. I'm wondering, if my site is live and new content (pages) is constantly being added to the database with their own unique ids, how is it "aware" of those ids? Do I need to write a program or message queue system that constantly appends new ids to a file that is read by getStaticPaths? What if I have to duplicate my site on multiple servers around the world, or if a server dies, do I have to keep track of the file's contents in order to boot up a new server with the same content?
From what I understand, in order for Google to see any sort of content on your website, the page's text (content) needs to be static and quickly available via physical files. Images and everything else can be loaded later, since Google's crawlers mainly care about text. So if every post needs to be a physical file in your app's pages folder, how do new page files get created when the content is added to the database?
TL;DR: My main concern is having my content readily available for Google's crawlers so that my website gets a good score. How do I achieve that if content is constantly being added to my database?

As you stated before, you can set up getStaticPaths to provide a list of values for id at build time. If I understand correctly, you are most concerned about what happens to new content added after the initial build.
For this you have to return the fallback key from getStaticPaths.
If the fallback key is false, then all IDs not specified initially will go to 404 and you’d need to rebuild the app every time you add new content. This is what you don't want.
If you set it to true, then the initial values will be prerendered just like before, but new values will NOT go to a 404. Instead, the first user visiting a path with a new id will trigger the rendering of that new page. This lets you dynamically check for new content whenever a request hits an id that wasn't available at build time.
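A minimal sketch of what that could look like in pages/[id].js follows; getPostIdsFromDb and getPostById are hypothetical helpers standing in for your own MySQL queries.

```js
// pages/[id].js: minimal sketch. getPostIdsFromDb / getPostById are
// hypothetical placeholders for your own MySQL queries.
export async function getStaticPaths() {
  const ids = await getPostIdsFromDb(); // e.g. only the posts you want prerendered at build time
  return {
    paths: ids.map((id) => ({ params: { id: String(id) } })),
    // fallback: true means ids NOT returned here are rendered on demand
    // on their first request instead of returning a 404.
    fallback: true,
  };
}

export async function getStaticProps({ params }) {
  // Runs at build time for the prerendered ids, and on the first request
  // for any new id added to the database later.
  const post = await getPostById(params.id);
  if (!post) {
    // The id does not exist in the database either, so return a 404.
    return { notFound: true };
  }
  return { props: { post } };
}
```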
Interestingly, the first visitor will temporarily see a 'fallback' version of the page while Next.js processes the request. On that fallback you would usually just show a loading spinner; the server then passes the data to the client so it can properly render the full page. So in practice, the user first sees a loading indicator, then the page updates itself with the actual content. Subsequent visitors will get the now-prerendered result immediately.
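In the page component itself, Next.js exposes that state via router.isFallback, so the loading indicator can be as simple as this (continuing the sketch above):

```js
// pages/[id].js (continued): the component rendered for each post.
import { useRouter } from 'next/router';

export default function Post({ post }) {
  const router = useRouter();

  // True only for the very first request to a not-yet-generated id,
  // while getStaticProps is still running on the server.
  if (router.isFallback) {
    return <div>Loading…</div>;
  }

  return (
    <article>
      <h1>{post.title}</h1>
      <div>{post.body}</div>
    </article>
  );
}
```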
You may now be worried about crawlers hitting that fallback page and not getting SEO content. This concern has been addressed here: https://github.com/vercel/next.js/discussions/12482
Apart from being able to serve new pages after build, the fallback strategy has another use in that it allows you to prerender only a small subset of your website (like your most visited pages), while the other pages will be generated only when necessary.
From the docs: When is fallback: true useful?
You may statically generate a small subset of pages and use fallback: true for the rest. When someone requests a page that’s not generated yet, the user will see the page with a loading indicator. Shortly after, getStaticProps finishes and the page will be rendered with the requested data. From now on, everyone who requests the same page will get the statically pre-rendered page.
This ensures that users always have a fast experience while preserving fast builds and the benefits of Static Generation.

Related

Getting page views and other data for a single URL (or similar URLs)

We're using the reporting APIs to access page views for specific page paths (using a regex match), but we're getting a lot of HTTP 500s and 503s, presumably because the requests are timing out due to the volume of data that needs to be scanned.
So, my question is, is there a way to access aggregate data for a URL? (for example, all page views, or all referrers)
Secondly, since a single resource can have multiple URLs (e.g. if the title forms part of the URL and the title is changed), we would need to be able to get aggregate data for a set of URLs (using a regex, for example).
Is this possible?
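For reference, the kind of single aggregated query being described (a path regex filter with no per-URL dimension) might look roughly like this against the v3 Core Reporting API; the view ID, dates, token, and regex below are placeholders:

```js
// Hypothetical sketch of one aggregated query instead of per-URL rows.
const params = new URLSearchParams({
  ids: 'ga:12345678',               // placeholder GA view (profile) ID
  'start-date': '2011-01-01',
  'end-date': '2011-12-31',
  metrics: 'ga:pageviews',
  // =~ is the regex-match operator; with no `dimensions` parameter the
  // response is a single aggregated total rather than one row per URL.
  filters: 'ga:pagePath=~^/articles/some-post',
  access_token: 'YOUR_OAUTH_TOKEN', // placeholder
});

fetch('https://www.googleapis.com/analytics/v3/data/ga?' + params)
  .then((res) => res.json())
  .then((data) => console.log(data.totalsForAllResults));
```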

Is it possible to add adverts to a custom Facebook Page Tab app?

I need to create a custom Facebook Page Tab app which will show an external site in an iframe. This needs to have adverts on it, but I'm not sure if that is possible since the site is hosted externally.
I'm not sure if I need to sign up to the Facebook Audience Network to get approved etc. either?
Any help or advice would be great.
Many sites do not allow themselves to be shown inside an iframe, and browsers enforce that restriction. Imagine working hard to create a site only to have others display all your content inside their iframes; that is naturally frustrating.
However, there is a candidate solution. Suppose you create a page which sends a request to the other site and appends all of its content into the body and head of your own page. This is very much possible, so the solution (a rough code sketch follows below) is to:
1. Create a page in your site; let's call it outsider.
2. In the server-side code of your outsider page, send a request to the desired page to be shown.
3. You will get the HTML of that page. Process it and include its content in the head and body of outsider. This includes:
3.1. Checking that all the CSS can be reached: the target page might refer to relative CSS paths that are unreachable from your end, so process (rewrite) the URLs of the CSS files.
3.2. Checking that all the JavaScript can be reached: the target page might refer to relative JS paths that are unreachable from your end, so process the URLs of the JS files.
3.3. Applying the idea described in 3.1 and 3.2 to other resources, like images, until you are satisfied with the content of outsider.
4. Create an iframe whose src points to outsider. Since outsider is within your own domain, it can be shown.
NOTE: If the site that owns the target page does not like the possibility of you showing its content inside iframes, it might protect against this with, say, JavaScript that checks whether the page is inside an iframe. Remove that code while processing the response to your request. If nothing else prevents you from showing the page in an iframe, then you should achieve success.
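A very rough sketch of the outsider idea in Node/Express; the target URL is a placeholder, the URL rewriting is deliberately naive, and a real implementation would want a proper HTML parser:

```js
// Naive "outsider" proxy page, sketch only. Requires Node 18+ for global fetch.
const express = require('express');
const app = express();

const TARGET = 'https://example.com/some-page'; // placeholder target page

app.get('/outsider', async (req, res) => {
  const response = await fetch(TARGET);
  let html = await response.text();

  // Steps 3.1-3.3: turn root-relative CSS/JS/image URLs into absolute ones
  // so they still resolve when the copy is served from your own domain.
  const origin = new URL(TARGET).origin;
  html = html.replace(/(href|src)="\/(?!\/)/g, `$1="${origin}/`);

  // Per the NOTE above: strip obvious frame-busting code.
  html = html.replace(/top\.location\s*=\s*[^;]+;?/g, '');

  res.send(html);
});

app.listen(3000);
```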

Adsense with dynamic content

I know that this topic has been discussed before in varying extent but I have some specific queries. I will use an example for this case and would like to request you for your views.
Example: a home finance management website. There are two pages. The basic page after login is an empty page with a text box. Type in "Rent" and rent details and trends pop up. Type in "Bills" and bill details and history pop up. The data shown to each user is of course different.
Now -
1. If I place an Adsense script on the basic home page, where I just have a text box, will it be disqualified for not having enough content?
2. Even if the content changes (via AJAX), does the ad change to suit the content? Does the crawler keep re-indexing the page at defined intervals, keeping whatever it finds there and searching it for keywords? The same page may show different content to different users and hence have different keywords. (Also, since login would be cookie based, how does the crawler see this page?)
Edit -
I know from HERE that Google does take AJAX calls into account, but since the results are populated dynamically from a database, with data unique to each user, the bot looking at the form's action page doesn't help much, does it?
3. Google prefers the GET method. So if I go with something like xyz.com?show=rent / xyz.com?show=bills, the page is regenerated and the script reloaded, but each time the crawler sniffs either of the two pages it might see different content than different users do. What does it do?
4. If I do not reload the page via form submission and the page is not regenerated every time, can I call a function to document.write the div I am putting the ad in? Would that make it re-sniff the page?
Any help is much appreciated.

facebook iframe app - how to organize and write code for faster page loading - PHP SDK

I am writing an app within a facebook iframe and am unsure how best to write this. I originally wrote all the code within the main canvas.php file but found everything was running too slow before results were being loaded into the iframe.
I then tried using the PHP header location method to load different pages into the iframe, thus reducing page load time. However, the header location is ignored.
I have also tried using JavaScript to get the page to load within the iframe instead; this does load the new page, but the page experiences lots of problems. It will not pass parameters to itself using $_GET.
Basically, I need to perform some checks when the canvas page is first loaded in the iframe and then redirect to another file to avoid the checks being performed on every page load, as this seriously slows everything down. I then need to reload the page with different parameters in the URL to populate the iframe with different results; again this is very slow, as it has to perform all the checks again.
Therefore, how can I achieve a smooth workflow as a normal site within a facebook iframe?
[EDIT] Just a thought: is Ajax a valid option?
Many thanks in advance.
Most people experience slow response times due to not having a channelUrl specified. See http://developers.facebook.com/docs/reference/javascript/
Channel File
The channel file addresses some issues with cross-domain communication in certain browsers. The contents of the channel.html file can be just a single line: a script tag that loads //connect.facebook.net/en_US/all.js.
It is important for the channel file to be cached for as long as possible. When serving this file, you must send valid Expires headers with a long expiration period. This will ensure the channel file is cached by the browser, which is important for a smooth user experience. Without proper caching, cross-domain communication will become very slow and users will suffer a severely degraded experience. A simple way to do this in PHP is to send far-future Cache-Control and Expires headers before outputting that script tag.
The channelUrl parameter is optional, but recommended. Providing a channel file can help address three specific known issues. First, pages that include code to communicate across frames may cause Social Plugins to show up as blank without a channelUrl. Second, if no channelUrl is provided and a page includes auto-playing audio or video, the user may hear two streams of audio because the page has been loaded a second time in the background for cross domain communication. Third, a channel file will prevent inclusion of extra hits in your server-side logs. If you do not specify a channelUrl, you can remove page views containing fb_xd_bust or fb_xd_fragment parameters from your logs to ensure proper counts.
The channelUrl must be a fully qualified URL matching the page on which you include the SDK. In other words, the channel file domain must include www if your site is served using www, and if you modify document.domain on your page you must make the same document.domain change in the channel.html file as well. The protocols must also match. If your page is served over https, your channelUrl must also be https. Remember to use the matching protocol for the script src as well. The sample code above uses protocol-relative URLs which should handle most https cases properly.
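For reference, wiring the channel file into the (old) JavaScript SDK initialisation looked roughly like this; the app ID and domain below are placeholders:

```js
window.fbAsyncInit = function () {
  FB.init({
    appId: 'YOUR_APP_ID',
    channelUrl: '//www.example.com/channel.html', // protocol-relative, matching your page's domain
    status: true,
    cookie: true,
    xfbml: true,
  });
};
```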

webkit .appcache file caches dynamic page

The main page of my mobile web app is a .jsp page. My app requires login (Google App Engine), so there is a Log In button when the user is not logged in and a Log Out button when the user is logged in, all handled by code on the .jsp page.
I load a lot of JS code on the page, so I used a .appcache file to cache that. Unfortunately, even though I added my .jsp page to the NETWORK section, the page is being cached in a funny way, ignoring the content served by the server. That means my Log Out button shows when users are logged out and vice versa.
I tried to add no-cache directives as meta tags, but they are all being ignored.
Ideas?
According to Dive Into HTML5, the page that references the manifest is automatically cached, even if it is not listed in the manifest itself.
http://diveintohtml5.ep.io/offline.html
Q: Do I need to list my HTML pages in my cache manifest?
A: Yes and no. If your entire web application is contained in a single page, just make sure that page points to the cache manifest using the manifest attribute. When you navigate to an HTML page with a manifest attribute, the page itself is assumed to be part of the web application, so you don’t need to list it in the manifest file itself. However, if your web application spans multiple pages, you should list all of the HTML pages in the manifest file, otherwise the browser would not know that there are other HTML pages that need to be downloaded and cached.
I have a similar issue, and I think I will end up loading the contents of the page via AJAX.
Caching in appCache is a two-stage process: first the cache manifest is checked (in this case, as the page is loading), and then, if its contents have changed, that content is re-downloaded. However, in your case, by that time the stale page has already been loaded and displayed.
The easiest fix would be to specifically exclude the page (but not the .js) from the appCache, so that only the JS is cached and not the page. It sounds like you might have figured that out, since you are trying to do exactly that by putting the page in the NETWORK section. Check that the exclusion is correct, as that sounds like the problem, and that the cache-related attributes on that page's HTML are set correctly.
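If it helps, a manifest along these lines caches only the scripts and sends everything else to the network (file names are placeholders). Keep in mind, per the quote above, that whichever page carries the manifest attribute is still cached implicitly, so that attribute should not sit on the dynamic .jsp itself:

```
CACHE MANIFEST
# v1 - change this comment to force clients to re-download the cached JS

CACHE:
/js/app.js
/js/libs.js

NETWORK:
*
```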