How to know whether a site is running on ATG?

Is there any way to know whether a site is running on ATG or not? For example, by viewing the page source or something like that.

You can start off by inspecting the response headers and looking for X-ATG-Version:
X-ATG-Version:version=QVRHUGxhdGZvcm0vOS4yIFsgRFBTTGljZW5zZS8wIEIyQ0xpY2Vuc2UvMCAgXQ==
That normally indicates that a site is running ATG. That said, a lot has been said about sites hiding their response headers for security purposes (as suggested in RFC2068):
Revealing the specific software version of the server may allow the
server machine to become more vulnerable to attacks against software
that is known to contain security holes. Implementers SHOULD make the
Server header field a configurable option.
If you don't already do this, you probably should.
Beyond the response header, the other telltale sign that a site is using ATG is generally found by looking at the page source for the ever-present hidden form handlers:
<input value="" type="hidden" name="/atg/commerce/order/purchase/CartModifierFormHandler.someFormElement">
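The value after version= in the header above looks like base64. A minimal Python sketch for decoding it, assuming the header always has the form version=<base64> as in the sample (the function name decode_atg_version is made up for illustration):

```python
import base64


def decode_atg_version(header_value):
    """Decode the base64 payload of an X-ATG-Version header value."""
    # Assumption: the value looks like "version=<base64>". partition() splits
    # on the first "=" only, so the "==" base64 padding stays intact.
    _, _, encoded = header_value.partition("=")
    return base64.b64decode(encoded).decode("ascii")


if __name__ == "__main__":
    sample = "version=QVRHUGxhdGZvcm0vOS4yIFsgRFBTTGljZW5zZS8wIEIyQ0xpY2Vuc2UvMCAgXQ=="
    print(decode_atg_version(sample))
```

For the sample header this prints a string beginning with ATGPlatform/9.2, which is exactly the kind of version detail the RFC quote warns about exposing.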

Related

Coldfusion - Redirect website if it hits /folder/index.cfm?

Very new to Coldfusion, but not to web development so hopefully this is an easy question.
We recently changed a link on our website that took us to /folder/index.cfm. I want to make sure that when someone types www.ourwebsite.com/folder, it doesn't take them to /folder/index.cfm and instead redirects them to another website.
Any pointers?
There are at least three ways to do this.
1. Don't even bother with ColdFusion. Have your web server do the redirect. You will need to know whether it is Apache, IIS, or something else; you can then search for how that web server does it.
This might help you with some of that: Custom 404 error page not working on IIS 8.5
2. You can make a file at /folder/index.cfm that contains
<cflocation url="newpage.cfm" addtoken="false" statuscode="301">
OR with cfscript
<cfscript>
location("newpage.cfm", false, 301);
</cfscript>
Note that addtoken and statuscode are optional. Setting addtoken to false helps because almost no CF website uses this kind of token. The status code helps because it tells the browser that this is a permanent move.
3. You could intercept the request in application.cfc. In fact, in some systems all requests are checked for validity in application.cfc. You might still need a blank page at the target, but at least some ColdFusion is processed.
Of all the options, 1 is my favorite, because there really isn't a lot that can be done with requests to missing pages. And the list of potential missing pages is unlimited.

How to prevent Google from indexing redirect URL I do not own

A domain name that I do not own is redirecting to my domain. I don't know who owns it or why it is redirecting to my domain.
This domain, however, is showing up in Google's search results. A whois lookup also returns this message:
"Domain:http://[baddomain].com webserver returns 307 Temporary Redirect"
Since I do not own this domain I cannot set a 301 redirect, or disable it. When clicking the baddomain in Google it shows the content of my website but the baddomain.com stays visible in the URL bar.
My question is: How can I stop Google from indexing and showing this bad domain in the search results and only show my website instead?
Thanks.
Some thoughts:
You cannot directly stop Google from indexing other sites, but what you could do is add the canonical tag to your pages so Google can see that the original content is located on your domain and not on 'baddomain'.
For example check out : https://support.google.com/webmasters/answer/139394?hl=en
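As a rough illustration of the canonical tag, here is a small Python sketch that builds the link element for a page's head. The helper name canonical_link and the example.com URL are placeholders, not anything from the question:

```python
from html import escape


def canonical_link(url):
    """Return a <link rel="canonical"> element pointing at the given URL."""
    # Escape the URL so characters such as & and " are safe inside the attribute.
    return '<link rel="canonical" href="{}">'.format(escape(url, quote=True))


if __name__ == "__main__":
    # Hypothetical URL on your own domain.
    print(canonical_link("https://www.example.com/products?page=2&sort=asc"))
```

Each page would emit its own URL on your domain, so a copy served under another hostname still points back to the original.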
Other actions can be taken SEO wise if the 'baddomain' is outscoring you in the search rankings, because then it sounds like your site could use some optimizing.
The better your site and domain rank in the SERPs, the less likely it is that people will see the scraped content and 'baddomain'.
You could however also look at the referrer for the request and if it is 'bad domain' you should be able to do a redirect to your own domain, change content etc, because the code is being run from your own server.
But that might be more trouble than it's worth, as you'd need to investigate how the 'baddomain' is doing things and code accordingly (probably an iframe or similar from what you describe, but that can still be circumvented using scripts).
Depending on what country you and 'baddomain' are located in, there are also legal actions, such as so-called DMCA complaints. This, however, can also be quite a task, and it's often not worth it because a new domain will just pop up.

Why don't browsers support PUT and DELETE requests and when will they?

I'm seeing many frameworks recently that have decided to "fake" PUT and DELETE requests in form submissions (not ajax). Like Ruby on Rails. They seem to be waiting for browsers to catch up. Are they waiting in vain?
Is this even slated to be implemented anywhere?
Browsers do support PUT and DELETE, but it's HTML that doesn't.
For example, a browser will initiate a PUT request via Javascript (AJAX), but not via HTML <form> submission.
This is because HTML 4.01 and the final W3C HTML 5.0 spec both say that the only HTTP methods that their form elements should allow are GET and POST.
There was much discussion about this during the development of HTML 5, and at one point they got added to HTML 5, only to be removed again. The reason the additional methods were removed from the HTML 5 spec is because HTML 4-level browsers could never support them (not being part of HTML at the time they were made); and there is no way to allow them to do so without a JavaScript shim; thus, you may as well use AJAX.
Web pages trying to use forms with method="PUT" or method="DELETE" would fall back to the default method, GET for all current browsers. This breaks the web applications' attempts to use appropriate methods in HTML forms for the intended action, and ends up giving a worse result — GET being used to delete things! (hello crawler. oh, whoops! there goes my database)
Changing the default method for HTML <form> elements to POST would help (IMO the default should have always been POST, ever since Mosaic* debuted forms in 1993), but to change the default would take at least a decade to percolate through the installed base. So in two words: ‘because legacy’. :-(
To support current browsers, authors will have to fake it with an override. I recommend authors use the widely known _method argument by including <input type=hidden name=_method value=DELETE> in their HTML; switch the form method to POST (since the request is unsafe); then add recognition of _method on the server side, which should then do whatever's necessary to mutate the request and forward it on as if it were a real DELETE request.
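The server-side half of that override could look something like the following Python sketch, written as WSGI middleware. This is only one possible shape, under stated assumptions: the class name MethodOverrideMiddleware and the set of allowed methods are invented for the example, and only urlencoded form bodies are inspected.

```python
import io
from urllib.parse import parse_qs

# Assumption: only these methods may be requested through the override.
ALLOWED_OVERRIDES = {"PUT", "DELETE", "PATCH"}


class MethodOverrideMiddleware:
    """Rewrite POST requests carrying a _method form field to that method."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        content_type = environ.get("CONTENT_TYPE", "")
        is_form = content_type.startswith("application/x-www-form-urlencoded")
        if environ.get("REQUEST_METHOD") == "POST" and is_form:
            length = int(environ.get("CONTENT_LENGTH") or 0)
            body = environ["wsgi.input"].read(length)
            # Put the body back so the wrapped application can still read it.
            environ["wsgi.input"] = io.BytesIO(body)
            fields = parse_qs(body.decode("utf-8", errors="replace"))
            override = fields.get("_method", [""])[0].upper()
            if override in ALLOWED_OVERRIDES:
                environ["REQUEST_METHOD"] = override
        return self.app(environ, start_response)
```

Only POST is ever upgraded, so a crawler following a plain GET link cannot trigger a DELETE through this path.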
Note also that, since web browsers are the ultimate HATEOAS client, they need to have a new state transferred to them for DELETE requests. Existing APIs often return 204 No Content for such requests. You should instead send back a hypermedia response with links so that the user can progress their browser state.
Also see the answers to these similar/identical questions:
Why are there are no PUT and DELETE methods on HTML forms?
Are the PUT, DELETE, HEAD, etc methods available in most web browsers?
Using PUT method in HTML form
Do Browsers support PUT requests with multipart/form data
* Mosaic, created by Marc Andreessen, also introduced the compound mistake of the <img src=…> tag — it should have been <image source=…>fallback</image>.

GET html using WWW::Mechanize causes "Forbidden"

I want to get the content of a film of imdb by using WWW::Mechanize. First of all, I have to find a way to find a respective /title/tt* url. When I have, e.g., a movie called fight club, I want to visit this link:
*ttp://www.imdb.com/find?s=all&q=fight+club
For some reason, this fails already. Here's the line that causes the error:
$mech->get('http://www.imdb.com/find?s=all&q=fight+club');
error message:
Error GETing
http://www.imdb.com/find?s=all&q=fight+club:
Forbidden
If I write something like get('http://www.google.com'), it works fine. What's the difference when using IMDb? Any proposal for an alternative solution?
IMDb probably sniffs the User-Agent string and rejects WWW::Mechanize requests. The "solution" is to respect their wish to block you from interacting with the site in an automated fashion.
(Or you could read their terms and conditions very, very carefully and then change the user agent string)
Licensing IMDb Content; Consent to Use Robots and Crawlers: If you are interested in receiving our express written permission to use IMDb content for your non-personal (including commercial) use, please visit our Content Licensing section or contact our Licensing Department. We do allow the limited use of robots and crawlers, such as those from certain search engines, with our express written consent. If you are interested in receiving our express written permission to use robots or crawlers on our site, please contact our Licensing Department.
David is right, that's probably what's happening.
But did you know lots of information is available from IMDB via FTP? And that they have a number of tools you can use to get at their information other than scraping?
See http://www.imdb.com/interfaces

Is it a good practice to use an empty URL for a HTML form's action attribute? (action="")

I am wondering if anyone can give a "best practices" response to using blank HTML form actions to post back to the current page.
There is a post asking what a blank HTML form action does here and some pages like this one suggest it is fine but I'd like to know what people think.
The best thing you can do is leave out the action attribute altogether. If you leave it out, the form will be submitted to the document's address, i.e. the same page.
It is also possible to leave it empty, and any browser implementing HTML's form submission algorithm will treat it as equivalent to the document's address, which it does mainly because that's how browsers currently work:
8. Let action be the submitter element's action.
9. If action is the empty string, let action be the document's address.
Note: This step is a willful violation of RFC 3986, which would require base URL processing here. This violation is motivated by a desire for compatibility with legacy content. [RFC3986]
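The two quoted steps can be sketched in Python as follows. This is a simplified illustration, not the full submission algorithm; the function name resolve_form_action and the example URLs are made up, and a missing attribute is modelled as None.

```python
from urllib.parse import urljoin


def resolve_form_action(action, document_url):
    """Return the URL a form would submit to, following the quoted steps."""
    # A missing attribute (None) or an empty string both fall back to the
    # document's address, query string included.
    if action is None or action == "":
        return document_url
    # Otherwise resolve the value relative to the document; surrounding
    # spaces are permitted by the author requirement, so strip them.
    return urljoin(document_url, action.strip())


if __name__ == "__main__":
    page = "http://example.com/shop/cart?step=2"  # hypothetical page URL
    print(resolve_form_action(None, page))
    print(resolve_form_action("", page))
    print(resolve_form_action("submit.php", page))
```

The first two calls return the page URL unchanged, which is why omitting the attribute and leaving it empty behave the same in browsers that implement these steps.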
This definitely works in all current browsers, but may not work as expected in some older browsers ("browsers do weird things with an empty action="" attribute"), which is why the spec strongly discourages authors from leaving it empty:
The action and formaction content attributes, if specified, must have a value that is a valid non-empty URL potentially surrounded by spaces.
Actually, the Form Submission subsection of the current HTML5 draft does not allow action="". It is against the spec.
The action and formaction content attributes, if specified, must have a value that is a valid non-empty URL potentially surrounded by spaces. (emphasis added)
The quoted section in mercator's answer is a requirement on implementations, not authors. Authors must follow the author requirements. To quote How to read this specification:
In particular, there are conformance requirements that apply to producers, for example authors and the documents they create, and there are conformance requirements that apply to consumers, for example Web browsers. They can be distinguished by what they are requiring: a requirement on a producer states what is allowed, while a requirement on a consumer states how software is to act.
The change from HTML4, which did allow an empty URL, was made because “browsers do weird things with an empty action="" attribute”. Considering the reason for the change, it's probably best not to do that in HTML4 either.
Not including the action attribute opens the page up to iframe clickjacking attacks, which involve a few simple steps:
An attacker wraps your page in an iframe
The iframe URL includes a query param with the same name as a form field
When the form is submitted, the query value is inserted into the database
The user's identifying information (email, address, etc) has been compromised
References
Bypassing CSRF protections with ClickJacking and HTTP Parameter Pollution
This will validate with HTML5.
<form action="#">
IN HTML 5 action="" IS NOT SUPPORTED SO DON'T DO THIS. BAD PRACTICE.
If you instead omit action altogether, the form will submit to the same page by default. I believe this is the best practice:
<form>This will submit to the current page</form>
If you are submitting the form using PHP, you may want to consider the following. Read more about it here.
<form method="post" action="<?php echo htmlspecialchars($_SERVER["PHP_SELF"]);?>">
Alternatively you could use #. Bear in mind, though, that this will act like an anchor and scroll to the top of the page.
<form action="#">
I think it's best to explicitly state where the form posts. If you want to be totally safe, enter the same URL the form is on in the action attribute if you want it to submit back to itself. Although mainstream browsers evaluate "" to the same page, you can't guarantee that non-mainstream browsers will.
And of course, the entire URL including GET data like Juddling points out.
Just use ?
<form action="?" method="post" enctype="multipart/form-data" name="myForm" id="myForm">
It doesn't violate HTML5 standards.
I used to do this a lot when I worked with Classic ASP. Usually I used it when server-side validation of some sort was needed for the input (before the days of AJAX). The main drawback I see is that it doesn't separate programming logic from the presentation at the file level.
I used to not specify the action attribute at all. That is actually how my framework is designed: all pages get submitted back to exactly the same address. But today I discovered a problem. Sometimes I borrow the action attribute value to make a background call (I guess some people call that AJAX). I found that IE keeps the action attribute value as empty if the action attribute wasn't specified. That seems a bit odd to me, since if no action attribute is specified, the JavaScript counterpart should at least be undefined. Anyway, my point is that before you choose a best practice you need to understand more context, such as whether you will use the attribute in JavaScript or not.
When you use an empty action, some security filters consider it malicious or phishing, and they can block your page. So it's advisable not to leave action= blank.