I am working with the link checker and want to know when AEM saves URLs in /var/linkchecker, and on what basis.
Does it save a link when I open it, or does it poll, i.e. traverse the complete content and put the links in /var/linkchecker?
Which Java class stores valid or invalid links in its storage directory?
LinkChecker is based on an event handler that listens for creates and updates on /content (and child) nodes. All content is parsed and links are validated against allowed protocols and (configurable) external site links.
External Links
All the validation is done asynchronously in the background and the HTML is updated based on verification results.
/var/linkchecker is the cache for external links. The results are based on simple GET requests to the external links. An HTTP 200/30x response means that the link is valid. AEM looks at this cache before requesting validation of an external link in order to optimize page processing. This also means that link validation is NOT real time, and the delay is proportional to the load on your server.
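The cache-then-validate flow described above can be sketched roughly as follows. This is only an illustration, not AEM's actual implementation: the function names, the Map standing in for /var/linkchecker, and the injected fetchStatus callback are all assumptions made for the example.

```javascript
// Hypothetical sketch; not AEM's actual link checker code.
// A status counts as valid when it is 200 or in the 30x range.
function isValidStatus(status) {
  return status === 200 || (status >= 300 && status < 400);
}

// cache: a Map from URL to { valid, status }, standing in for /var/linkchecker.
// fetchStatus: caller-supplied function that performs the GET and returns the HTTP status.
function checkExternalLink(url, cache, fetchStatus) {
  if (cache.has(url)) {
    // Cache hit: reuse the stored result, no new request is made.
    return cache.get(url);
  }
  const status = fetchStatus(url);
  const entry = { valid: isValidStatus(status), status };
  cache.set(url, entry);
  return entry;
}
```

Passing the request function in as a parameter keeps the sketch independent of any particular HTTP client.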
All the links that have been checked can be seen on the /etc/linkchecker.html screen, where you can request revalidation and refresh the status of the links.
You can configure the frequency of this background check via the Day CQ Link Checker Service configuration under /system/console/configMgr. The default interval is 5 seconds (scheduler.period parameter).
Under the config manager /system/console/configMgr you will find a lot of other Day CQ Link * configurations that control this feature.
For example, Day CQ Link Checker Transformer contains config for all the elements that need to be transformed by the link checker.
Similarly Day CQ Link Checker Info Storage Service configures the link cache.
Internal Links
Internal links are ignored unless they are written as external URLs with an FQDN (which is not normally the case on author). The only exception is a multi-tenant environment where a page from one site links to another site and all the mapping information is stored in Sling mappings.
I'm trying to understand how Next.js does dynamic routing and I'm a little confused about how to properly implement it in my own website. Basically, I have a database (MySQL) of content that keeps growing, let's say blog posts, with images stored in GCS. From what I understand, you can create a pages/[id].js file in your pages folder that can handle dynamically creating routes for new pages, but in order to get a good SEO score, the Google crawlers need to see your content before any JavaScript or data requests are made. So the pages have to be physically available for the content to appear instantly upon loading. So if I have pages/[id].js and content is added to the database daily, how are physical content files supposed to spontaneously populate the pages folder? And if page files keep getting created, how do I prevent my disk from running out of space? I'm sure there is something I'm not understanding.
I read on nextjs.org that you can have a function getStaticPaths that needs to return a list of possible values for 'id'. I'm wondering, if my site is live and new content (pages) is constantly being added to the database with their own unique ids, how is it "aware" of those ids? Do I need to write a program or message queue system that constantly appends new ids to a file that is read by getStaticPaths? What if I have to duplicate my site on multiple servers around the world, or if a server dies, do I have to keep track of the file's contents in order to boot up a new server with the same content?
From what I understand, in order for Google to see any sort of content on your website, the pages text (content) needs to be static and quickly available via physical files. Images and everything else can be loaded later since Google's crawlers mainly care about text. So if every post needs to be a physical file in your app's pages folder, how do new pages files get created if the content is added to the database?
TL;DR: My main concern is having my content readily available for Google crawlers in order to get a good score for my website. How do I achieve that if content is added to my database?
As you stated before, you can set up getStaticPaths to provide a list of values for id at build time. If I understand correctly, you are most concerned about what happens to new content added after the initial build.
For this you have to return the fallback key from getStaticPaths.
If the fallback key is false, then all IDs not specified initially will go to 404 and you’d need to rebuild the app every time you add new content. This is what you don't want.
If you set it to true, then the initial values will be prerendered just like before, but new values will NOT go to 404. Instead, the first user visiting a path with a new id will trigger the rendering of that new page. This allows you to dynamically check for new content if a request hits an id that wasn't available at build time.
What is interesting here is that the first visitor will temporarily see a 'fallback' version of the page while Next.js processes the request. On that fallback, you would usually just show a loading spinner. The server then passes the data to the client in order to properly render the full page. So in practice, the user will first see a loading indicator, then the page updates itself with the actual content. Subsequent visitors will get the now prerendered result immediately.
You may now be worried about crawlers hitting that fallback page and not getting SEO content. This concern has been addressed here: https://github.com/vercel/next.js/discussions/12482
Apart from being able to serve new pages after build, the fallback strategy has another use in that it allows you to prerender only a small subset of your website (like your most visited pages), while the other pages will be generated only when necessary.
From the docs: When is fallback: true useful?
You may statically generate a small subset of pages and use fallback:
true for the rest. When someone requests a page that’s not generated
yet, the user will see the page with a loading indicator. Shortly
after, getStaticProps finishes and the page will be rendered with the
requested data. From now on, everyone who requests the same page will
get the statically pre-rendered page.
This ensures that users always have a fast experience while preserving
fast builds and the benefits of Static Generation.
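As a rough sketch of what this answer describes, the two data functions for pages/[id].js could look something like the following. The in-memory posts Map and the listPostIds / findPost helpers are hypothetical stand-ins for the MySQL queries, and in a real page file both getStaticPaths and getStaticProps would be prefixed with export.

```javascript
// Hypothetical stand-in for the MySQL table of posts.
const posts = new Map([
  ["1", { id: "1", title: "First post" }],
  ["2", { id: "2", title: "Second post" }],
]);

// Hypothetical data-access helpers; a real app would query the database here.
async function listPostIds() {
  return [...posts.keys()];
}
async function findPost(id) {
  return posts.get(id) || null;
}

// Runs at build time: prerender the ids known right now and
// let ids added later be rendered on their first request (fallback: true).
async function getStaticPaths() {
  const ids = await listPostIds();
  return {
    paths: ids.map((id) => ({ params: { id } })),
    fallback: true,
  };
}

// Runs at build time for known ids, and on the first request for new ones.
async function getStaticProps({ params }) {
  const post = await findPost(params.id);
  if (!post) {
    // Unknown id: have Next.js respond with its 404 page.
    return { notFound: true };
  }
  return { props: { post } };
}
```

With this shape no id list has to be maintained in a file: an id that did not exist at build time is simply looked up when the first request for it arrives. The page component itself can check the router's isFallback flag to decide when to show the loading state.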
I am reading through the AEM WCM documentation and have a question: after creating a page in a lower environment and publishing it, where do I check its look and feel? Is there a URL to check, or do I check within AEM only?
Can anyone give an example URL format?
Thanks
Have a look at the Adobe authoring documentation to understand the concepts and architecture of AEM. Because AEM follows REST principles, your page's content path is also the page URL (unless you have Sling internal redirects or Dispatcher-level URL hiding in place).
To explain, take the out-of-the-box Geometrixx website as an example.
Working on the author side in a local instance at port 4502:
For example, if you have created a test page under /content/geometrixx/en/toolbar/,
the test page URL will be http://localhost:4502/cf#/content/geometrixx/en/toolbar/testpage.html
Preview mode can be tested by appending wcmmode=disabled to the end of your URL, as shown below,
http://localhost:4502/cf#/content/geometrixx/en/toolbar/testpage.html?wcmmode=disabled
or by using the preview option in the sidekick.
If you have published the page (assuming your publish instance is
running at 4503 locally),
the page URL will be http://localhost:4503/content/geometrixx/en/toolbar/testpage.html
If you are using touch UI then you can see the preview mode by:
Edit link: http://localhost:4502/editor.html/content/geometrixx/en/toolbar/testpage.html
Preview Link:
http://localhost:4502/content/geometrixx/en/toolbar/testpage.html?wcmmode=disabled
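The URL patterns above all derive from the content path, which a small helper can make explicit. This is just a sketch: the function name is made up, and the ports assume the default local author (4502) and publish (4503) instances used in this answer.

```javascript
// Hypothetical helper; assumes local author on 4502 and publish on 4503.
function aemPageUrls(contentPath, host = "localhost") {
  const author = `http://${host}:4502`;
  const publish = `http://${host}:4503`;
  return {
    // Classic UI editing (content finder)
    classicEdit: `${author}/cf#${contentPath}.html`,
    // Touch UI editing
    touchEdit: `${author}/editor.html${contentPath}.html`,
    // Author-side preview without editing chrome
    authorPreview: `${author}${contentPath}.html?wcmmode=disabled`,
    // Published page
    published: `${publish}${contentPath}.html`,
  };
}
```

For example, aemPageUrls("/content/geometrixx/en/toolbar/testpage") yields the four URLs listed in this answer.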
You need a replication agent, with its configuration set up, to publish the pages. AEM comes with a default agent that you can use to publish changes from the author to the publish environment.
I need to create a page in the Alfresco Share context that should be accessible without authentication. When using the page framework it seems pretty straightforward, since you can add <authentication>none</authentication> to the page definition.
When using Aikau the page definition is gone and I'm left with the get.desc.xml webscript file which, to my knowledge, does not support the authentication element. Does anyone have an idea?
It looks like you are accessing your webscript through the auth-page url:
http://<ip>:<port>/<context>/page/ap/ws/<webscript>
Please note that ap in the URL stands for the authenticated page defined under the directory:
/<project-name>/src/main/webapp/WEB-INF/surf-config/pages/auth-page.xml
This section :
<config evaluator="string-compare" condition="UriTemplate">
<uri-templates>
<uri-template id="remote-node-page">/{pageid}/p/{pagename}/{store_type}/{store_id}/{id}</uri-template>
<uri-template id="remote-site-page">/site/{site}/{pageid}/p/{pagename}</uri-template>
<uri-template id="remote-page">/{pageid}/p/{pagename}</uri-template>
<uri-template id="sitepage">/site/{site}/{pageid}/ws/{webscript}</uri-template>
<uri-template id="userpage">/user/{userid}/{pageid}/ws/{webscript}</uri-template>
<uri-template id="page">/{pageid}/ws/{webscript}</uri-template> <!-- this template matches your URI which means the resolution of which page/webscript would be accessed will rely fully on it -->
</uri-templates>
</config>
of your
/<project-name>/src/main/webapp/WEB-INF/surf.xml
defines the page/webscript resolution policy based on URI templates. For further information on how to set up and use page URI templates, please visit this tutorial
The auth-page has authentication set to USER, as shown here, which results in a prompt for authentication before any attempt is made to resolve the webscript.
So if you want to access some aikau page in un-authenticated mode (as a guest user) you should be using the noauth-page like this:
http://<ip>:<port>/<context>/page/na/ws/<webscript>
FYI: You do not have to set your webscript authentication at all as it defaults to none when the authentication tag is not present
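To make the role of the "page" URI template more concrete, here is a small illustration of how a path such as /na/ws/<webscript> lines up with /{pageid}/ws/{webscript}. It is not Surf's actual resolver, only a sketch; the function name and the my-page webscript id are hypothetical, and it assumes the templates contain no regex special characters outside the {name} tokens.

```javascript
// Illustration only; not Surf's actual URI template resolver.
// Turns each {name} token into a capture group that matches one path segment.
function matchUriTemplate(template, path) {
  const names = [];
  const pattern = template.replace(/\{(\w+)\}/g, (_, name) => {
    names.push(name);
    return "([^/]+)";
  });
  const match = new RegExp(`^${pattern}$`).exec(path);
  if (!match) {
    return null;
  }
  const params = {};
  names.forEach((name, i) => {
    params[name] = match[i + 1];
  });
  return params;
}
```

Matching "/na/ws/my-page" against "/{pageid}/ws/{webscript}" gives a pageid of na and a webscript of my-page, so the request is handled by the noauth-page rather than the auth-page.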
It's worth being aware that you can create your own template pages for Aikau. You aren't limited to the pages that are defined in either Share or clients created via the Aikau Maven Archetype (see https://github.com/Alfresco/Aikau/blob/master/tutorial/chapters/Tutorial1.md).
In Share for example you have 4 templates available out-of-the-box:
- dp (Dynamic Page) - what you should use in most cases
- hdp (Hybrid Dynamic Page) - where the header and footer are rendered above and below the page
- rp (Remote Page) - accesses a page stored on the Alfresco Repository
- hrp (Hybrid Remote Page) - accesses a remote page stored on the Alfresco Repository and renders it between the standard header and footer
In clients created by the Aikau Maven Archetype you have:
- na (Not Authenticated) - renders a page but doesn't require a user to be authenticated
- ap (Aikau Page) - renders a page for authenticated users.
Aikau pages make use of URI templates to reduce the amount of Surf objects that are required to build a page - however you always have the option of building your own pages.
See the examples in the archetype project for reference, the no-authentication page is defined here
Both this page and the standard authenticated page re-use the standard template type, which ultimately maps to the standard page FreeMarker template
However, if you want to build your own pages and templates you can - you're not limited to using what is provided by default.
I am a newbie to Adobe DTM (Dynamic Tag Management) and have not had any training on it. However, I have been given a requirement to integrate DTM with AEM 6. I have some requirements related to Omniture where certain events on the website are tracked and that information needs to be sent to DTM. I have followed the steps described on this blog (http://blogs.adobe.com/aemtutorials/2013/07/24/customize-the-client-context/) to customize the client context by creating a new session store and storing some sample data inside it. The next part is to retrieve this data in DTM, which I do not know how to do. What I need to achieve in particular is to create a new data element, as shown in the screenshot below, and write some custom JavaScript to access the data stored inside the client context (which is present in the session store), as explained in the blog mentioned.
I have no idea how to integrate DTM with an AEM instance or how to get hold of the required data using the script. I could not find any information about this on the internet, so I would appreciate help from anybody who has worked on such a requirement before. Any help is highly appreciated.
Step 1 - Set up DTM cloud services configuration in AEM. You may find cloud services config at /etc/cloudservices/dynamictagmanagement.html
Step 2 - Apply the above cloud config to the root of your website using the page property. This will insert the required JS scripts and JS object into the DOM. You could also do step (1) & (2) together by manually inserting header and footer code (from DTM) into the template.
Step 3 - Supply data to the DTM JS object. You could populate the data from the server side or on the client side using JS. You could leverage the client context as well; JS APIs are available to query the client context.
PS: Am also a learner on this.
Helpful links:
http://blogs.adobe.com/experiencedelivers/experience-management/integrating-dtm-custom-aem6-page-template/
http://docs.adobe.com/docs/en/aem/6-0/administer/integration/marketing-cloud/dtm.html
You can use data elements with custom script like this:
e.g. dataElement authorizableId is custom script with content
return CQ_Analytics.ClientContext.get("/profile/authorizableId");
or
dataElement pageTitle
return CQ_Analytics.PageDataMgr.getProperty("title");
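A data element script like the ones above will throw if the client context has not loaded on a page. A slightly more defensive variant might look like this sketch. The readClientContext helper is a made-up name and would have to be defined inside the custom script itself; only the CQ_Analytics.ClientContext.get call comes from the snippets above.

```javascript
// Hypothetical helper; would be defined inside the data element's custom script.
// win is the browser window (passed in so the function is easy to test).
function readClientContext(win, path) {
  const cq = win && win.CQ_Analytics;
  if (!cq || !cq.ClientContext || typeof cq.ClientContext.get !== "function") {
    // Client context is not available on this page: fall back to an empty value.
    return "";
  }
  const value = cq.ClientContext.get(path);
  return value == null ? "" : value;
}
```

In the data element the script would then end with return readClientContext(window, "/profile/authorizableId");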
This is how I implemented it. Please note that this implementation is for integrating with flat HTML files, where the scripts need to be added to the head tag.
Pre-requisites:
1. We need login credentials for the DTM website.
2. We need admin rights.
3. We need sc3.omniture site credentials. These are usually provided by the Adobe team.
4. From the Omniture console we need to generate the AppMeasurement.js file.
5. From the AppMeasurement.js file we need to get important details such as:
a. Tracking Server Name
b. s_account name
c. Visitor namespace
Connecting HTML files to DTM:
1. Log in to https://dtm.adobe.com with admin credentials.
2. Click on the project dashboard.
3. Click on the Embed tab in the top navigation.
4. Enable Host on Akamai.
5. Expand the Header Code widget and copy the code.
6. Paste that code in the <head> tag of your HTML.
7. Go back to DTM. Expand the Footer Code widget and copy the code.
8. Paste that code just before the closing </body> tag of your HTML.
Configuring DTM for Direct Call Rules:
1. Go to the Rules tab in the top navigation.
2. Click on Direct Call Rules in the left navigation.
3. Click on Create Rule.
4. Give it a name in the Name section.
5. Expand the Conditions widget.
6. Pay close attention to the Conditions textbox. Direct Call Rules are fired using the _satellite.track() method, and the text you enter in the Conditions textbox is the argument you pass to this method. We entered "change-offer-submit", so to fire this Direct Call Rule we use _satellite.track("change-offer-submit"), as you will see in the code below.
7. Now use the Adobe Analytics section to set up a custom link.
Below is the code that sets up our form and its validation. Notice the way the DTM _satellite.track() method is used. Each argument passed to _satellite.track() matches the Conditions textbox of a separate DTM rule.
HTML
<div class="outer-btn">
<input class="input-btn analyticsEvent" type="button" value="Submit" data-eventName="change-offer-submit">
</div>
JavaScript
<script>
jQuery('.analyticsEvent').on('click',function() {
window.console.log('Logged Event: ' + jQuery(this).attr('data-eventName'));
_satellite.track(jQuery(this).attr('data-eventName'));
location.href='./landingPage.html';
});
</script>
This has been superseded by a tool added to DTM in the June 2016 release. ContextHub was added in 6.1 as a beta, and in 6.2 it reached feature parity with ClientContext. It saves a lot of time building data layers, as much of the data will already be there.
I want to know the difference between Follow Redirects and Redirect Automatically while recording with JMeter.
Also, what effect will each of these have when used with Retrieve All Embedded Resources from HTML?
Redirect Automatically does not treat a redirect as a separate request,
whereas Follow Redirects treats each redirection as a separate request.
This difference can be seen in the View Results Tree listener.
If Retrieve All Embedded Resources from HTML is checked, it gives you the page load time: in addition to the response time, it keeps measuring until all the supporting files of the HTML page (CSS, images, JavaScript files, etc.) have been downloaded locally.
Also, if any values need to be captured from a redirect response, you need to enable Follow Redirects; otherwise you will not be able to capture that data using extractors (Set-Cookie values, for example).
Hope this will help.