How do I reduce the number of 301 redirect entries using wildcards and variables in Squarespace?

I recently renamed all of the URLs that make up my blog... and have written redirects for almost every page... using wildcards where I can... keeping in mind... all that I know is the * wildcard at this time...
Here is an example of what I have...
/season-1/2017/1/1/snl-s01e01-host-george-carlin -> /season-1/snl-s01e01-george-carlin 301
I want to write a catch-all that will redirect all 38 seasons of reviews with one redirect entry... but I can't figure out how to get rid of just the word "host" between s01e01- and -george-carlin... and was thinking it would work something like this...
/season-*/*/*/*/snl-s*e*-host-*-* -> /season-*/snl-s*e*[code to remove the word "host"]-*-* 301
Is that even close to being correct? Do I need that many *s
Thanks in advance for any help...

Unfortunately, you won't be able to reduce the number of individual redirect entries using the redirect features that Squarespace has to offer, namely the wildcard (*) and a single variable ([name]). Multiple variables would be needed, but only [name] is supported.
The closest you can get is:
/season-1/*/*/*/snl-s01e01-host-[name] -> /season-1/snl-s01e01-[name] 301
But, if I'm understanding things, while the above redirect appears more general, it would still need to be copy/pasted for each post individually. So although it demonstrates the best that could be achieved, it is not a technical improvement.
Therefore, there are only two alternatives:
Create a Google Sheet (or other spreadsheet) where the old URLs are pasted into column one, a formula using ARRAYFORMULA and regular expressions parses each old URL and generates the new URL in column two, and a formula in column three joins the two cells with -> and 301. With that done, you can select all the cells in column three, copy them, and paste them into the "URL Shortcuts" text area in Squarespace. It can be quite time-consuming to figure out, write and test the correct formula, but it does avoid having to manually type out every redirect. Whether it takes less time and effort in total depends on the number of redirects and one's proficiency with spreadsheet formulas. The redirect pattern above might also simplify the formula that needs to be written, which may save some time.
Another alternative would be to remove your redirects and instead handle the redirect via JavaScript added to the 404/Page-not-found page. Because it sounds like you already have all of the redirects in place but are simply trying to reduce the overall number, I wouldn't recommend changing to a JavaScript-based approach. There are other drawbacks to using JavaScript, in any case.
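If a spreadsheet formula feels awkward, the same "generate the redirect lines from the old URLs" idea from the first alternative can be sketched as a short script. This is only a sketch: the regular expression assumes every old URL follows the /season-N/YYYY/M/D/snl-sNNeNN-host-name shape shown in the question, and the function name redirect_line is made up for illustration.

```python
import re

# Assumed shape of an old URL, based on the single example in the question:
# /season-1/2017/1/1/snl-s01e01-host-george-carlin
OLD_URL = re.compile(r"^/(season-\d+)/\d+/\d+/\d+/(snl-s\d+e\d+)-host-(.+)$")


def redirect_line(old_url):
    """Return one 'old -> new 301' line, or None if the URL has another shape."""
    match = OLD_URL.match(old_url)
    if match is None:
        return None
    season, episode, host = match.groups()
    return f"{old_url} -> /{season}/{episode}-{host} 301"


if __name__ == "__main__":
    old_urls = ["/season-1/2017/1/1/snl-s01e01-host-george-carlin"]
    for url in old_urls:
        print(redirect_line(url))
```

The printed lines can then be pasted into the "URL Shortcuts" text area, exactly as with the spreadsheet output.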

RegEx for google event

Hello, I would like help writing a regex that captures all URLs with the word confirmation in them.
Ex:
https://example.com/this-is-a-confirmation
https://example.com/confirmation
https://example.com/folder/this-is-confirmation
I am trying to set up a Goal that captures all visits to any confirmation page on the website, since a visitor who reaches that page has most likely filled out a form to download an asset.
Thanks!
So, basically all links end with confirmation?
Then you could just use a very stupid and simple regex like:
confirmation$
If you only want the URL to contain confirmation:
confirmation
is already enough (Demo). That's basically the same as using String.endsWith(str) and String.contains(str), respectively.
Depending on the way you're evaluating this, you may need to allow chars before the search term to produce a full match (not only a partial match):
.*confirmation
or
.*confirmation.*
if you want to allow any text after the search term.
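The difference between a partial match and a full match can be checked quickly outside the analytics tool. The sketch below uses Python's re module purely as an illustration; the tool you configure the Goal in may evaluate patterns differently, and the helper names are made up.

```python
import re


def ends_with_confirmation(url):
    # Partial match anchored at the end of the string.
    return re.search(r"confirmation$", url) is not None


def contains_confirmation(url):
    # Partial match anywhere in the string.
    return re.search(r"confirmation", url) is not None


def full_match_confirmation(url):
    # Full match: the pattern must cover the whole string,
    # so it needs .* on both sides of the search term.
    return re.fullmatch(r".*confirmation.*", url) is not None


if __name__ == "__main__":
    for url in [
        "https://example.com/this-is-a-confirmation",
        "https://example.com/confirmation",
        "https://example.com/folder/this-is-confirmation",
    ]:
        print(url, ends_with_confirmation(url), full_match_confirmation(url))
```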

How to process a simple loop in Perl's WWW::Mechanize?

Especially interesting for me as a PHP/Perl-beginner is this site in Switzerland:
See this link: http://www.edi.admin.ch/esv/00475/00698/index.html?lang=de&webgrab_path=http://esv2000.edi.admin.ch/d/entry.asp?Id=1308
Which has a dataset of 2700 foundations. All the data are free to use with no limitations copyrights on it.
What we have so far: the harvesting task should be no problem if I use WWW::Mechanize, particularly for doing the form-based search and selecting the individual entries. I guess the algorithm would basically be two nested loops: the outer loop runs the form-based search, and the inner loop processes the search results.
The outer loop would use the select() and submit_form() functions on the second search form on the page. Can we use DOM processing here? And how can we get the selection values?
The inner loop over the results would use the follow_link function to get to the actual entries, using the following call:
$mech->follow_link(url_regex => qr/webgrab_path=http:\/\/esv2000.*\?Id=\d+$/, n => $result_nbr);
This would forward our Mechanize browser to the entry page. Basically, the URL regex looks for links that match the webgrab_path ... Id pattern, which is unique for each database entry. The $result_nbr variable tells Mechanize which of the results it should follow next.
If there are several result pages, we would use the same trick to traverse them. For the semantic extraction of the entry information, we could parse the content of the actual entries with XML::LibXML's HTML parser (which works fine on this page), because it provides powerful DOM selection methods using XPath.
The actual looping through the pages should be doable in at most 20 lines of Perl, likely fewer.
But wait: the processing of the entry pages will then be the most complex part of the script.
Approaches: In principle we could do the same algorithm with a single while loop if we use the back() function smartly.
Can you give me a hint for the beginning, the processing of the entry pages, doing this in Perl with WWW::Mechanize?
"Which has a dataset of 2700 foundations. All the data are free to use with no limitations copyrights on it."
Not true. See http://perlmonks.org/?node_id=905767
"The data is copyrighted even though it is made available freely: "Downloading or copying of texts, illustrations, photos or any other data does not entail any transfer of rights on the content." (and again, in German, as you've been scraping some other German list to spam before)."

Which hash function can this be?

I have some strings and some hashes of them, but I don't know which hash function is used. Any idea?
String hash
NN34W f8b46bcdc3b3c92
EM3M3 d8015ca876fd051
HXDKD a740e97464e5dfe
AKREJ aa7aa2dadfcbe53
3bNMK 0f11440639191d9
Edit:
Thanks for the answers; it's the hash of a CAPTCHA.
https://registracia.azet.sk/
If you check the URL of the CAPTCHA image, the hash value is at the end.
The HTTP POST sent to the server contains the TEXT (P92M4), the HASH (72fec89a2e0ade2) and other values.
I would like to know how to compute the hash of the text P92M4 and check it against the HASH value that is sent to the server.
I want to build my own CAPTCHA system for a school project, so I am first analyzing the situation and its weaknesses.
As I understand your situation, a POST request sends both the "text" and the "hash" to the CAPTCHA server. This then uses whatever hash function they use to hash your text, checks to see if it matches the hash, and decides whether or not you succeeded. Presumably, the server sends you the image, as well as the hash, and then you enter the text.
As such, if you figured out the hashing function, you'd have completely broken this CAPTCHA system: All you would need to do is hash any string using their hashing function, and then when sending your POST request, ignore the hash they sent you and merely send them your computed text and hash pair. Thus, you could very easily automate successfully passing the CAPTCHA challenge.
To illustrate how difficult "reversing" the hash might be, consider the following hash that they very well might use:
Split the TEXT up alternating letters: thus ABCDE becomes ACE and BD
md5 the two halves using salts "fj49w0utw4a" and "r8h3wlsd"
md5("fj49w0utw4a"."ACE") is 115c05f0e5300f958ba01caa64b989f
md5("r8h3wlsd"."BD") is 74eecae86ef46382eb95443a1b1fa8f5
Take every 3rd char of the first string and every 4th char of the second, and alternate them until you have 15 chars
115c05f0e5300f958ba01caa64b989f becomes 55e09b1ab9
74eecae86ef46382eb95443a1b1fa8f5 becomes e8425af5
Final hash value for "ABCDE": 5e58e40295ba1fa
There is really no way you are ever going to reverse engineer that.
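For the curious, the hypothetical scheme above is easy to write down, which is part of the point: it is trivial to build and practically impossible to guess from a handful of text/hash pairs. The sketch below is one possible reading of the steps. The order in which the picked characters are interleaved is an assumption, and the digests quoted above are illustrative, so this code is not expected to reproduce them.

```python
import hashlib

# Salts taken from the hypothetical example above.
SALT_A = "fj49w0utw4a"
SALT_B = "r8h3wlsd"


def toy_captcha_hash(text):
    """One reading of the made-up scheme; not the hash the site really uses."""
    # Split the text into alternating letters: "ABCDE" -> "ACE" and "BD".
    first, second = text[0::2], text[1::2]
    digest_a = hashlib.md5((SALT_A + first).encode()).hexdigest()
    digest_b = hashlib.md5((SALT_B + second).encode()).hexdigest()
    # Every 3rd character of the first digest, every 4th of the second.
    picked_a = digest_a[2::3]
    picked_b = digest_b[3::4]
    # Alternate between the two picks (assumed order: a, b, a, b, ...).
    mixed = []
    for i in range(max(len(picked_a), len(picked_b))):
        if i < len(picked_a):
            mixed.append(picked_a[i])
        if i < len(picked_b):
            mixed.append(picked_b[i])
    # Keep the first 15 characters, like the hashes in the question.
    return "".join(mixed)[:15]


if __name__ == "__main__":
    print(toy_captcha_hash("ABCDE"))
```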
UPDATE
Note that CAPTCHAs as described above (and implemented on that site) are extremely insecure, as they only require one valid text/hash combination to be known.
To demonstrate, use Firebug or equivalent and navigate to the CAPTCHA area of the form. We will be editing some hidden values.
Change the form[captcha_url] value from https://pokec.azet.sk/sluzby/system/captcha/[somehash] to https://pokec.azet.sk/sluzby/system/captcha/ee2be1f239e5d17
Change the form[captcha_hash] value from [somehash] to ee2be1f239e5d17
Regardless of what the picture says, type "P22KD" for the CAPTCHA
There are several ways to mitigate this vulnerability. As Tangrs suggested, you can store the hash value in a session variable so that it cannot be manipulated by the client. Less elegant but also effective is to store the submitted CAPTCHA in a database and not allow duplicate CAPTCHAs, as is implemented on the link in the question. This is fine, until you start running out of unused CAPTCHAs and end up getting collisions.
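The session-based mitigation can be sketched as follows. This is a minimal illustration, assuming an in-memory dictionary stands in for a real session store; the function names are hypothetical. The key idea is that the expected text never leaves the server, and each challenge can be answered only once.

```python
import hmac
import secrets

# Hypothetical in-memory session store: session id -> expected CAPTCHA text.
_sessions = {}


def issue_captcha(text):
    """Remember the expected text server-side and return an opaque session id."""
    session_id = secrets.token_hex(16)
    _sessions[session_id] = text
    return session_id


def verify_captcha(session_id, submitted):
    """Check a submission; popping the entry makes every challenge one-time."""
    expected = _sessions.pop(session_id, None)
    if expected is None:
        return False
    # Case-insensitive, constant-time comparison on bytes.
    return hmac.compare_digest(
        expected.upper().encode(), submitted.upper().encode()
    )


if __name__ == "__main__":
    sid = issue_captcha("P22KD")
    print(verify_captcha(sid, "p22kd"))  # first attempt
    print(verify_captcha(sid, "P22KD"))  # replay of the same session id
```

Because the client only ever sees the session id, swapping in a known text/hash pair no longer works, and there is no pool of CAPTCHAs to run out of.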
Seems smaller than any industry hash... possibly it's proprietary?
A bit more info would help though, what language, where did you get it from?

How do I create a manual link on a tree in Oracle APEX when Session State Protection is turned on?

Friends,
I'm facing another challenge in APEX and I hope you can help.
I have created a tree using the method described in John & Scott's superb book, "Pro Application Express" whereby the page link is stored in a table. Below is an example:
go to a page passing some parameters
f?p=&APP_ID.:3:&SESSION.::::P3_IDENTIFIER,P3_FAMILY_NAME:&P2_IDENTIFIER.,&P2_FAMILY_NAME.
When the page is run this works as expected. I can expand the tree and navigate to the page passing parameters if required.
However, when I turned on session state protection these "hand crafted" links stopped working, which I expected because the links contain no checksum.
After some investigation I see I have to use APEX_UTIL.PREPARE_URL to generate the URL with a checksum. Unfortunately this is where I run into problems. I can't seem to be able to pass the parameters values to the calling page.
The original tree query was:
select "IDENTIFIER" id,
       "PARENT_IDENTIFIER" pid,
       "TITLE" name,
       "LINK" link,
       null a1,
       null a2
from <some table>
I then changed this to use APEX_UTIL.PREPARE_URL:
....
APEX_UTIL.PREPARE_URL('f?p='||:APP_ID||':3:'||:APP_SESSION||'::::P3_IDENTIFIER,P3_FAMILY_NAME:&P2_IDENTIFIER.,&P2_FAMILY_NAME.') link,
...
and this works, the page is called and I can see the values of the parameters passed. But I can't use this method as it is restricted to the one page!
Finally I tried storing the page number, parameters and parameter values in different columns of the table that the tree is based on and then bring them together:
...
APEX_UTIL.PREPARE_URL('f?p='||:APP_ID||':'||navigate_to_page||':'||:APP_SESSION||'::::'||parameters||':'||parameter_values) link,
...
Where:
navigate_to_page has the value of: 3
parameters has the value of: P3_IDENTIFIER,P3_FAMILY_NAME
parameter_values has the values of: &P2_IDENTIFIER.,&P2_FAMILY_NAME.
This now calls the page, but the parameter values have become literals, so where I'm expecting an identifier I see &P2_IDENTIFIER and ditto for family name.
What am I doing wrong? How can I pass values to my called page using APEX_UTIL.PREPARE_URL?
In case it's needed, my environment details are: APEX 3.2.1, Oracle Application Server 10.1.2.3, Oracle Database 10.2.0.3.
Thanks in advance for any help you may be able to provide.
I think you'll need to resolve those variables, using the v() function:
APEX_UTIL.PREPARE_URL('f?p='||:APP_ID
    ||':'||navigate_to_page
    ||':'||:APP_SESSION
    ||'::::'||parameters
    ||':'||v('P2_IDENTIFIER')||','||v('P2_FAMILY_NAME')) link,
On a side note, you might need to be careful about P2_FAMILY_NAME since it's being used in the url; it sounds like a plain text field which contains user-entered data?

RESTful URL design for search

I'm looking for a reasonable way to represent searches as RESTful URLs.
The setup: I have two models, Cars and Garages, where Cars can be in Garages. So my urls look like:
/car/xxxx
xxxx = car id
returns car with given id
/garage/yyy
yyy = garage id
returns garage with given id
A Car can exist on its own (hence the /car), or it can exist in a garage. What's the right way to represent, say, all the cars in a given garage? Something like:
/garage/yyy/cars ?
How about the union of cars in garage yyy and zzz?
What's the right way to represent a search for cars with certain attributes? Say: show me all blue sedans with 4 doors :
/car/search?color=blue&type=sedan&doors=4
or should it be /cars instead?
The use of "search" seems inappropriate there - what's a better way / term? Should it just be:
/cars/?color=blue&type=sedan&doors=4
Should the search parameters be part of the PATHINFO or QUERYSTRING?
In short, I'm looking for guidance for cross-model REST url design, and for search.
[Update] I like Justin's answer, but he doesn't cover the multi-field search case:
/cars/color:blue/type:sedan/doors:4
or something like that. How do we go from
/cars/color/blue
to the multiple field case?
For the searching, use querystrings. This is perfectly RESTful:
/cars?color=blue&type=sedan&doors=4
An advantage to regular querystrings is that they are standard and widely understood and that they can be generated from form-get.
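Because the format is standard, most languages can parse it without custom code. As a small illustration (Python's standard library is used here only as an example, and the helper name car_filters is made up):

```python
from urllib.parse import parse_qs, urlsplit


def car_filters(url):
    """Turn the query string of a /cars URL into a plain dict of filters."""
    query = urlsplit(url).query
    # parse_qs returns a list per name; keep the first value of each.
    return {name: values[0] for name, values in parse_qs(query).items()}


if __name__ == "__main__":
    print(car_filters("/cars?color=blue&type=sedan&doors=4"))
```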
RESTful pretty-URL design is about addressing a resource through a structure (a directory-like structure, a date such as articles/2005/5/13, an object and its attributes, ...). The slash / indicates hierarchical structure; where there is no hierarchy, use the -id form instead.
Hierarchical structure
I would personally prefer:
/garage-id/cars/car-id
/cars/car-id #for cars not in garages
If a user removes the /car-id part, they get the list of cars, which is intuitive. The user knows exactly where in the tree they are and what they are looking at, and can see at first glance that garages and cars are related. /car-id also signals that the name and the id belong together, unlike /car/id.
Searching
The search query is fine as it is; which form to use is only a matter of preference. The fun part comes when joining searches (see below).
/cars?color=blue;type=sedan #most preferred by me
/cars;color-blue+doors-4+type-sedan #looks good when using car-id
/cars?color=blue&doors=4&type=sedan #also possible, but & blends in with text
Or basically anything that isn't a slash, as explained above.
The formula: /cars[?;]color[=-:]blue[,;+&], though I wouldn't use the & sign, as it is hard to tell apart from the surrounding text at first glance.
Did you know that passing a JSON object in a URI is RESTful?
Lists of options
/cars?color=black,blue,red;doors=3,5;type=sedan #most preferred by me
/cars?color:black:blue:red;doors:3:5;type:sedan
/cars?color(black,blue,red);doors(3,5);type(sedan) #does not look bad at all
/cars?color:(black,blue,red);doors:(3,5);type:sedan #little difference
possible features?
Negate search strings (!)
To search any cars, but not black and red:
?color=!black,!red
color:(!black,!red)
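To make the list and negation syntax concrete, here is a rough sketch of how a server might interpret the name=value1,value2;name=... form, including the ! prefix. It is only one possible interpretation; the function names and the include/exclude representation are assumptions for illustration.

```python
def parse_filters(query):
    """Parse 'color=black,blue;doors=3,5' into include/exclude lists per field."""
    filters = {}
    for pair in query.split(";"):
        if not pair:
            continue
        name, _, raw = pair.partition("=")
        include, exclude = [], []
        for value in raw.split(","):
            if value.startswith("!"):
                exclude.append(value[1:])  # negated option
            elif value:
                include.append(value)
        filters[name] = {"include": include, "exclude": exclude}
    return filters


def matches(car, filters):
    """True if the car passes every field's include and exclude lists."""
    for name, rule in filters.items():
        value = str(car.get(name))
        if value in rule["exclude"]:
            return False
        if rule["include"] and value not in rule["include"]:
            return False
    return True


if __name__ == "__main__":
    rules = parse_filters("color=black,blue,red;doors=3,5;type=sedan")
    print(matches({"color": "blue", "doors": 3, "type": "sedan"}, rules))
```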
Joined searches
Search red or blue or black cars with 3 doors in garages id 1..20 or 101..103 or 999 but not 5
/garage[id=1-20,101-103,999,!5]/cars[color=red,blue,black;doors=3]
You can then construct more complex search queries. (Look at CSS3 attribute matching for the idea of matching substrings, e.g. user*=bar to search for users containing "bar".)
Conclusion
This might be the most important part for you: you can do it however you like, after all. Just keep in mind that a RESTful URI represents a structure that is easily understood, e.g. directory-like /directory/file, /collection/node/item, or dates /articles/{year}/{month}/{day}. When you omit any of the last segments, you immediately know what you get.
So.., all these characters are allowed unencoded:
unreserved: a-zA-Z0-9_.-~
Typically allowed both encoded and not, both uses are then equivalent.
special characters: $-_.+!*'(),
reserved: ;/?:@=&
May be used unencoded for the purpose they represent, otherwise they must be encoded.
unsafe: <>"#%{}|^~[]`
Why unsafe and why should rather be encoded: RFC 1738 see 2.2
Also see RFC 1738#page-20 for more character classes.
RFC 3986 see 2.2
Despite what I said previously, here is a common distinction between delimiters, meaning that some are "more important" than others.
generic delimiters: :/?#[]@
sub-delimiters: !$&'()*+,;=
More reading:
Hierarchy: see 2.3, see 1.2.3
url path parameter syntax
CSS3 attribute matching
IBM: RESTful Web services - The basics
Note: RFC 1738 was updated by RFC 3986
Although having the parameters in the path has some advantages, there are, IMO, some outweighing factors.
Not all characters needed for a search query are permitted in a URL. Most punctuation and Unicode characters would need to be URL encoded as a query string parameter. I'm wrestling with the same problem. I would like to use XPath in the URL, but not all XPath syntax is compatible with a URI path. So for simple paths, /cars/doors/driver/lock/combination would be appropriate to locate the 'combination' element in the driver's door XML document. But /car/doors[id='driver' and lock/combination='1234'] is not so friendly.
There is a difference between filtering a resource based on one of its attributes and specifying a resource.
For example, since
/cars/colors returns a list of all colors for all cars (the resource returned is a collection of color objects)
/cars/colors/red,blue,green would return a list of color objects that are red, blue or green, not a collection of cars.
To return cars, the path would be
/cars?color=red,blue,green or /cars/search?color=red,blue,green
Parameters in the path are more difficult to read because name/value pairs are not isolated from the rest of the path, which is not name/value pairs.
One last comment. I prefer /garages/yyy/cars (always plural) to /garage/yyy/cars (perhaps it was a typo in the original answer) because it avoids changing the path between singular and plural. For words with an added 's', the change is not so bad, but changing /person/yyy/friends to /people/yyy seems cumbersome.
To expand on Peter's answer - you could make Search a first-class resource:
POST /searches # create a new search
GET /searches # list all searches (admin)
GET /searches/{id} # show the results of a previously-run search
DELETE /searches/{id} # delete a search (admin)
The Search resource would have fields for color, make, model, garaged status, etc. and could be specified in XML, JSON, or any other format. Like the Car and Garage resources, you could restrict access to Searches based on authentication. Users who frequently run the same Searches can store them in their profiles so that they don't need to be re-created. The URLs will be short enough that in many cases they can be easily traded via email. These stored Searches can be the basis of custom RSS feeds, and so on.
There are many possibilities for using Searches when you think of them as resources.
The idea is explained in more detail in this Railscast.
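As a rough sketch of what "Search as a resource" means on the server side, the in-memory store below mirrors the create/show/delete routes listed above. It is an illustration only: the class and method names are invented, matching is plain equality on each field, and a real service would add persistence and authentication.

```python
import itertools


class SearchStore:
    """Toy in-memory stand-in for the /searches resource."""

    def __init__(self, cars):
        self._cars = cars
        self._searches = {}
        self._ids = itertools.count(1)

    def create(self, criteria):
        # POST /searches: store the criteria and hand back an id.
        search_id = next(self._ids)
        self._searches[search_id] = dict(criteria)
        return search_id

    def results(self, search_id):
        # GET /searches/{id}: re-run the stored criteria against the cars.
        criteria = self._searches[search_id]
        return [
            car for car in self._cars
            if all(car.get(field) == value for field, value in criteria.items())
        ]

    def delete(self, search_id):
        # DELETE /searches/{id}
        del self._searches[search_id]


if __name__ == "__main__":
    store = SearchStore([{"color": "blue", "type": "sedan"}, {"color": "red", "type": "coupe"}])
    saved = store.create({"color": "blue"})
    print(saved, store.results(saved))
```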
Justin's answer is probably the way to go, although in some applications it might make sense to consider a particular search as a resource in its own right, such as if you want to support named saved searches:
/search/{searchQuery}
or
/search/{savedSearchName}
I use two approaches to implement searches.
1) Simplest case, to query associated elements, and for navigation.
/cars?q.garage.id.eq=1
This means, query cars that have garage ID equal to 1.
It is also possible to create more complex searches:
/cars?q.garage.street.eq=FirstStreet&q.color.ne=red&offset=300&max=100
Cars in all garages in FirstStreet that are not red (3rd page, 100 elements per page).
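To show how such parameter names can be taken apart, here is a minimal sketch of a parser for the q.<path>.<operator>=value convention. It assumes that every non-q parameter is a numeric paging option, and the function name is made up; the actual implementation behind these URLs is not shown in this answer.

```python
from urllib.parse import parse_qsl


def parse_query(query):
    """Split a query string into (path, operator, value) conditions and paging."""
    conditions, paging = [], {}
    for name, value in parse_qsl(query):
        if name.startswith("q."):
            # The last dotted part is the operator, the rest is the property path.
            *path, operator = name[2:].split(".")
            conditions.append((".".join(path), operator, value))
        else:
            # Assumption: anything else is a numeric paging option.
            paging[name] = int(value)
    return conditions, paging


if __name__ == "__main__":
    print(parse_query("q.garage.street.eq=FirstStreet&q.color.ne=red&offset=300&max=100"))
```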
2) Complex queries are treated as regular resources that are created and can be retrieved later.
POST /searches => Create
GET /searches/1 => Recover search
GET /searches/1?offset=300&max=100 => pagination in search
The POST body for search creation is as follows:
{
  "$class": "test.Car",
  "$q": {
    "$eq": { "color": "red" },
    "garage": {
      "$ne": { "street": "FirstStreet" }
    }
  }
}
It is based on the Grails criteria DSL: http://grails.org/doc/2.4.3/ref/Domain%20Classes/createCriteria.html
This is not REST. You cannot define URIs for resources inside your API. Resource navigation must be hypertext-driven. It's fine if you want pretty URIs and heavy amounts of coupling, but just do not call it REST, because it directly violates the constraints of RESTful architecture.
See this article by the inventor of REST.
In addition, I would also suggest:
/cars/search/all{?color,model,year}
/cars/search/by-parameters{?color,model,year}
/cars/search/by-vendor{?vendor}
Here, Search is considered as a child resource of Cars resource.
There are a lot of good options for your case here. Still, you should consider using the POST body.
The query string is perfect for your example, but if you have something more complicated, e.g. an arbitrarily long list of items or boolean conditionals, you might want to define the search as a document that the client sends over POST.
This allows a more flexible description of the search and avoids the server's URL length limit.
REST conventions discourage verbs in URLs, so /cars/search is not RESTful. The right way to filter, search, or paginate your APIs is through query parameters. However, there may be cases where you have to break the norm. For example, if you are searching across multiple resources, you have to use something like /search?q=query
You can go through http://saipraveenblog.wordpress.com/2014/09/29/rest-api-best-practices/ to understand the best practices for designing RESTful APIs
Though I like Justin's response, I feel it more accurately represents a filter rather than a search. What if I want to know about cars with names that start with cam?
The way I see it, you could build it into the way you handle specific resources:
/cars/cam*
Or, you could simply add it into the filter:
/cars/doors/4/name/cam*/colors/red,blue,green
Personally, I prefer the latter, however I am by no means an expert on REST (having first heard of it only 2 or so weeks ago...)
My advice would be this:
/garages
Returns list of garages (think JSON array here)
/garages/yyy
Returns specific garage
/garage/yyy/cars
Returns list of cars in garage
/garages/cars
Returns list of all cars in all garages (may not be practical of course)
/cars
Returns list of all cars
/cars/xxx
Returns specific car
/cars/colors
Returns a list of all possible colors for cars
/cars/colors/red,blue,green
Returns list of cars of the specific colors (yes commas are allowed :) )
Edit:
/cars/colors/red,blue,green/doors/2
Returns list of all red,blue, and green cars with 2 doors.
/cars/type/hatchback,coupe/colors/red,blue,green/
Same idea as the above but a little more intuitive.
/cars/colors/red,blue,green/doors/two-door,four-door
All cars that are red, blue, green and have either two or four doors.
Hopefully that gives you the idea. Essentially, your REST API should be easily discoverable and should enable you to browse through your data. Another advantage of using URLs rather than query strings is that you can take advantage of the native caching mechanisms that exist on the web server for HTTP traffic.
Here's a link to a page describing the evils of query strings in REST: http://web.archive.org/web/20070815111413/http://rest.blueoxen.net/cgi-bin/wiki.pl?QueryStringsConsideredHarmful
I used Google's cache because the normal page wasn't working for me; here's that link as well:
http://rest.blueoxen.net/cgi-bin/wiki.pl?QueryStringsConsideredHarmful