IM (Instant Message) Address Standard - email

Dear stack overflowers,
I am not sure if this is the best place for this question, but I figured I'd give it a shot.
I am currently working on an API that will allow consumers to read/write data about users. i.e. name, emails, phoneNumbers, etc. And, as you could guess by the title, I am also storing ims.
Since users may contain multiple im addresses that belong to different services (e.g. skype, google talk, AIM, etc.), there is a type attribute associated with each im address.
I am at the point where I am attempting to validate the user attributes, and when I arrived to ims I was unable to find a formal specification, or normative document that dictates how these should be formatted/validated.
My question is the following:
Is there a general format that im URI's follow?
*note:*I have stumbled upon RFC 3861 that touches on im addresses. But it seems like this isn't a standard. Additionally, there is only one example here that has the following format:
im:fred#example.com
Since emails are effectively unique identifiers, it seems reasonable that they could be represented in this way.
Could anyone shed light on this?

After looking in several sites, I was unable to find a standard that applies to all IM providers. I even looked in some API documentation (Yahoo and Jabber) without any luck. If anyone else finds anything that leads them to think any different, please share the knowledge. But as for right now, it appears I am out of luck...

Related

Am I Being Hacked by Redirection?

I don't really know how to explain this, so bare with me. But our Facebook pixel detected traffic from another domain. We only have one domain. We went to see what other domain it could possibly be referencing. It turns out, this other domain was a carbon copy of our site. The only thing that was different was the web address. Does anyone have a clue what is going on? It's as though someone is retargeting our customers to a mirrored website.
We tested the foreign site by placing an order using store credit given to ourselves on the backend of our site. The order went through and instead of showing the order was placed in the US, it said it was placed in Turkey.
This is over my head and I have no clue where to start solving this issue.
I've actually seen this happen to someone else before. I'm not sure what the motive behind doing something like this is - but if the orders from the cloned store are being paid to your gateway, then the upside is that you're not losing money over it. However, I do believe that the intent is somewhat malicious.
The most logical reason I have been able to come up with is that if your store has high amounts of traffic, is well known, and has a good SEO rating, the people that are cloning your store are trying to "SEO-Hijack" you in a sense. Essentially piggybacking off of your site because of the SEO ratings it already has in order to boost their own and potentially turn it into a separate store/website later.
This isn't necessarily something that can be fixed by BigCommerce since the copy of your store isn't on the platform whatsoever, since they are essentially just piggybacking off of your SEO rating. The best option here would be to do a domain WHOIS lookup for their domain and report it as fraud to their registrar as an attempt to get legal action to be taken or a cease & desist.
Sorry that this is happening to you!
Here's a helpful explanation that I was able to find and a helpful blog post on how to prevent it and the steps to take.
Oh no, I'm sorry to hear about this! As blurfus suggested above -- Please the BigCommerce Support team to report this as soon as you can. You can find their contact information here: https://support.bigcommerce.com/s/#contact

REST HATEOAS - How does the client know link semantics?

Imagine I have a fully implemented REST API that offers HATEOAS as well.
Let's assume I browse the root and besides the self link two other links (e.g. one for /users and one for /orders) are returned. As far as I have heard, HATEOAS eliminates the need for out-of-band information. How should a client know what users means? Where are the semantics stored?
I know that is kind of a stupid question, but I really would like to know that.
Suppose you've just discovered Twitter and are using it for the very first time. In your Web browser you see a column of paragraphs with a bunch of links spread around the page. You know there's a way to do something with this, but you don't know specifically what actions are available. How do you figure out what they are?
Well, you look at the links and consider what their names mean. Some you recognize right away based on convention: As an experienced Web user, you have a pretty good idea what clicking on the "home", "search" and "sign out" links is meant to accomplish.
But other links have names you don't recognize. What does "retweet" do? What does that little star icon do?
There are basically two ways you, or anyone, will figure this out:
Through experimentation, which is to say, clicking on the links and seeing what happens, then deriving a meaning for each link from the results.
Through some source of out-of-band information, such as the online help, a tutorial found through a Google search or a friend sitting next to you explaining how the site works.
It's the same with REST APIs. (Recall that REST is intended to model the way the Web enables interaction with humans.)
Although in principle computers (or API-client developers) could deduce the semantics of link relations through experimentation, obviously this isn't practical. That leaves
Convention, based on for instance the IANA 's list of standardized link relations and their meanings.
Out-of-band information, such as API documentation.
There is nothing inconsistent in the notion of REST requiring client developers to rely on something beyond the API itself to understand the meaning of link relations. This is standard practice for humans using websites, and humans using websites is what REST models.
What REST accomplishes is removing the need for out-of-band information regarding the mechanics of interacting with the API. Going back to the Twitter example, you probably had to have somebody explain to you at some point what, exactly, the "retweet" link does. But you didn't have to know the specific URL to type in to make the retweet happen, or the ID number of the tweet you wanted to act on, or even the fact that tweets have unique IDs. The Web's design meant all this complexity was taken care of for you once you figured out which link you wanted to click.
And so it is with REST APIs. It's true that in most cases, the computer or programmer will just need to be told what each link relation means. But once they have that information, they can navigate through the entire API without needing to know anything else about the details of how it's all put together.
REST doesn't eliminate the need for out-of-band information. You still have to document your media-types. REST eliminates the need for out-of-band information in the client interaction with the API underlying protocol.
The semantics are documented by the media-type. Your API root is a resource of a media-type, let's say something like application/vnd.mycompany.dashboard.v1+json, and the documentation for that media type would explain that the link relation users leads to a collection of application/vnd.mycompany.user.v1+json related to the currently authenticated user, and orders leads to a collection of application/vnd.mycompany.order.v1+json.
The library analogy works here. When you enter a library after a book, you know how to read a book, you know how to walk to a bookshelf and pick up the book, and you know how to ask the librarian for directions. Each library may have a different layout and bookshelves may be organized differently, but as long as you know what you're looking for and you and the librarian speak the same language, you can find it. However, it's too much to expect the librarian to teach you what a book is.

SNDSMTPEMM NOTE Limited to 400 Characters

I am trying to get our iSeries 6.1 machine to send email through our Exchange server. I can do it with SNDDST and with SNDSMTPEMM, but both are very limiting. I need support for basic HTML, and for PDF attachments. I thought I could get them both from SNDSMTPEMM, but now I see that the body parameter for SNDSMTPEMM (NOTE) is limited to 400 characters. Is it possible that this command allows 10 attachments but less than a paragraph of text?
I would like to know if anyone is using this command, and if I am missing something about it that would allow me to create an actual email message.
If indeed I can't put more than 400 characters into the body of an email with this command, I have read about MMAIL and MAILTOOL and I am curious if anyone knows if this message length restriction exists for those as well?
It will be a very hard sell for our main programmer to install any third-party anything to get this working, so I would love to be able to do it with SNDDST of SNDSMTPEMM (or some other built in I haven't found yet).
I don't currently need to be able to send to multiple recipients, but I do need to be able to attach a couple of attachments (where SNDDST fails for me). I also can't use attachments with an *LMSG.
I'm sorry if this is the wrong place for this kind of post - I find it very difficult to find the right place.
The SNDSMTPEMM command is indeed limited to 400 characters in the message body, according to the documentation.
Where I work, we still mainly use MMAIL, which used to be free but now requires a $50 "donation" (and lots of hoops to jump through just to register). It doesn't have that message length limitation. It comes with several commands for ease of use, and a service program for more fine-grained control over how the message is built. Once you download it, you have access to the source, so you can really muck around with it if you have to. (The donation also allows you to download a multitude of other utilities from Easy400.net.)
A better but more expensive option is Bradley Stone's MAILTOOL. It's still competitively priced, as far as commercial IBM midrange software goes. If you go that route, it's probably worth getting the Plus! add-on, which side-steps IBM's native SMTP, a recurring source of headaches. (MMAIL and the basic MAILTOOL rely on native SMTP.)
The best place for this kind of post, at least for now, is the Midrange-L mailing list at midrange.com. When it comes to AS/400, iSeries, and IBM i stuff, that community is currently much more active than Stack Overflow, and they welcome open-ended discussion and "what do you recommend?" posts, which are discouraged here. You can find some discussion on the command you mentioned, and some alternatives, in this thread.

HATEOAS Client Design

I have read a lot of discussions here on SO, watched Jon Moore's presentation (which explained a lot, btw) and read over Roy Fielding's blog post on HATEOAS but I still feel a little in the dark when it comes to client design.
API Question
For now, I'm simply returning xhtml with forms/anchors and definition lists to represent the resources. The following snippet details how I lay out forms/anchors/lists.
# anchors
<li class='docs_url/#resourcename'>
<a rel='self' href='resource location'></a>
</li>
# forms
<form action='action_url' method='whatever_method' class='???'></form>
# lists
<dl class='docs_url/#resourcename'>
<dt>property</dt>
<dd>value</dd>
</dl>
My question is mainly for forms. In Jon's talk he documents form types such as (add_location_form) etc. and the required inputs for them. I don't have a lot of resources but I was thinking of abstract form types (add , delete, update, etc) and just note in the documentation that for (add, update) that you must send a valid representation of the target resource and with delete that you must send the identifier.
Question 1: With the notion of HATEOAS, shouldn't we really just make the client "discover" the form (by classing them add,delete,update etc) and just send back all the data we gave them? My real question here (not meant to be a discussion) is does this follow good practice?
Client Question
Following HATEOAS, with our actions on resources being discover-able, how does this effect client code (consumers of the api) and their ui. It sounds great that following these principals that the UI should only display actions that are available but how is that implemented?
My current approach is parsing the response as xml and usin xpath to look for the actions which are known at the time of client development (documented form classes ie. add,delete,update) and display the ui controls if they are available.
Question 2: Am I wrong in my way of discovery? Or is this too much magic as far as the client is concerned ( knowing the form classes )? Wouldn't this assume that the client knows which actions are available for each resource ( which may be fine because it is sort of a reason for creating the client, right? ) and should the mapping of actions (form classes) to resources be documented, or just document the form classes and allow the client (and client developer) to research and discover them?
I know I'm everywhere with this, but any insight is much appreciated. I'll mark answered a response that answers any of these two questions well. Thanks!
No, you're pretty much spot on.
Browsers simply render the HTML payload and rely on the Human to actually interpret, find meaning, and potentially populate the forms appropriately.
Machine clients, so far, tend to do quite badly at the "interpret" part. So, instead developers have to make the decisions in advance and guide the machine client in excruciating detail.
Ideally, a "smart" HATEOS client would have certain facts, and be aware of context so that it could better map those facts to the requirements of the service.
Because that's what we do, right? We see a form "Oh, they want name, address, credit card number". We know not only what "name", "address", and "credit card" number mean, we also can intuit that they mean MY name, or the name of the person on the credit card, or the name of the person being shipped to.
Machines fail pretty handily at the "intuit" part as well. So as a developer, you get to code in the logic of what you think may be necessary to determine the correct facts and how they are placed.
But, back to the ideal client, it would see each form, "know" what the fields wanted, consult its internal list of "facts", and then properly populate the payload for the request and finally make the request.
You can see that a trivial, and obviously brittle, way to do that is to simply map the parameter names to the internal data. When the parameter name is "name", you may hard code that to something like: firstName + " " + lastName. Or you may consider the actual rel to "know" they're talking about shipping, and use: shipTo.firstName + " " + shipTo.lastName.
Over time, ideally you could build up a collection of mappings and such so that if suddenly a payload introduced a new field, and it happened to be a field you already know about, you could fill that in as well "automatically" without change to the client.
But the simply truth is, that while this can be done, it's pretty much not done. The semantics are usually way to vague, you'd have to code in new "intuition" each time for each new payload anyway, so you may as well code to the payload directly and be done with it.
The key thing, though, especially about HATEOS, is that you don't "force" your data on to the server. The server tells you what it wants, especially if they're giving you forms.
So the thought process is not "Oh, if I want a shipping invoice, I see that, right now, they want name, address and order number, and they want it url encoded, and they want it sent to http://example.com/shipping_invoice. so I'll just always send: name + "&" + address + "&" + orderNumber every time to http://example.com/shipping_invoice. Easy!".
Rather what you want to do is "I see they want a name, address, and order number. So what I'll do is for each request, I will read their form. I will check what fields they want each time. If they want name, I will give them name. If they want address, I will give them address. If they want order number, I will give them order number. And if they have any PRE-POPULATED fields (or even "hidden" fields), I will send those back too, and I will send it in the encoding they asked for, assuming I support it, to the URL I got from the action field of the FORM tag.".
You can see in the former case, you're ASSUMING that they want that payload every time. Just like if you were hard coding URLs. Whereas with the second, maybe they decided that the name and address are redundant, so they don't ask for it any more. Maybe they added some nice defaults for new functionality that you may not support yet. Maybe they changed the encoding to multi-part? Or changed the endpoint URL. Who knows.
You can only send what you know when you code the client, right? If they change things, then you can only do what you can do. If they add fields, hopefully they add fields that are not required. But if they break the interface, hey, they break the interface and you get to log an error. Not much you can do there.
But the more that you leverage HATEOS part, the more of it they make available to you so you can be more flexible: forms to fill out, following redirects properly, paying attention to encoding and media types, the more flexible your client becomes.
In the end, most folks simply don't do it in their clients. They hard code the heck out of them because it's simple, and they assume that the back end is not changing rapidly enough to matter, or that any downtime if such change does happen is acceptable until they correct the client. More typically, especially with internal systems, you'll simply get an email from the developers "hey were changing XYZ API, and it's going live on March 1st. Please update your clients and coordinate with the release team during integration testing. kthx".
That's just the reality. That doesn't mean you shouldn't do it, or that you shouldn't make your servers more friendly to smarter clients. Remember a bad client that assumes everything does not invalidate a good REST based system. These systems work just fine with awful clients. wget ftw, eh?

Email obfuscation question

Yes, I realize this question was asked and answered, but I have specific questions about this that I feel were not clear on that thread and I'd prefer not to get lost in the shuffle on another thread as well.
Previous threads said that rendering the email address to an image the way Facebook does is overkill and unprofessional user experience for business/professional websites. And it seems that the general consensus is to use a JavaScript document.write solution using html entities or some other method that breaks up and/or makes the string unreadable by a simple bot. The application I'm building doesn't even need the "mailto:" functionality, I just need to display the email address. Also, this is a business web application, so it needs to look/act as professional as possible. Here are my questions:
If I go the document.write route and pass the html entity version of each character, are there no web crawlers sophisticated enough to execute the javascript and pull the rendered text anyway? Or is this considered best practice and completely (or almost completely) spammer proof?
What's so unprofessional about the image solution? If Facebook is one of the highest trafficked applications in the world and not at all run by amateurs, why is their method completely dismissed in the other thread about this subject?
If your answer (as in the other thread) is to not bother myself with this issue and let the users' spam filters do all the work, please explain why you feel this way. We are displaying our users' email addresses that they have given us, and I feel responsible to protect them as much as I can. If you feel this is unnecessary, please explain why.
Thanks.
It is not spammer proof. If someone looks at the code for your site and determines the pattern that you are using for your email addresses, then specific code can be written to try and decipher that.
I don't know that I would say it is unprofessional, but it prevents copy-and-paste functionality, which is quite a big deal. With images, you simply don't get that functionality. What if you want to copy a relatively complex email address to your address book in Outlook? You have to resort to typing it out which is prone to error.
Moving the responsibility to the users spam filters is really a poor response. While I believe that users should be diligent in guarding against spam, that doesn't absolve the person publishing the address from responsibility.
To that end, trying to do this in an absolutely secure manner is nearly impossible. The only way to do that is to have a shared secret which the code uses to decipher the encoded email address. The problem with this is that because the javascript is interpreted on the client side, there isn't anything that you can keep a secret from scrapers.
Encoders for email addresses nowadays generally work because most email bot harvesters aren't going to concern themselves with coding specifically for every site. They are going to try and have a minimal algorithm which will get maximum results (the payoff isn't worth it otherwise). Because of this, simple encoders will defeat most bots. But if someone REALLY wants to get at the emails on your site, then they can and probably easily as well, since the code that writes the addresses is publically available.
Taking all this into consideration, it makes sense that Facebook went the image route. Because they can alter the image to make OCR all but impossible, they can virtually guarantee that email addresses won't be harvested. Given that they are probably one of the largest email address repositories in the world, it could be argued that they carry a heavier burden than any of us, and while inconvenient, are forced down that route to ensure security and privacy for their vast user base.
Quite a few reasons Javascript is a good solution for now (that may change as the landscape evolves).
Javascript obfuscation is a better mouse trap for now
You just need to outrun the others. As long as there are low hanging fruit, spammers will go for those. So unless everyone starts moving to javascript, you're okay for now at least
most spammers use http based scripts which GET and parse using regex. using a javascript engine to parse is certainly possible but will slow things down
Regarding the facebook solution, I don't consider it unprofessional but I can clearly see why purists may disagree.
It breaks accessibility standards (cannot be parsed by browsers, voice readers or be clicked.
It breaks semantic construct (it's an image, not a mailto link anymore)
It breaks the presentational layer. If you increase browser default font size or use high contrast custom CSS, it won't apply to the email.
Here is a nice blog post comparing a few methods, with benchmarks.
http://techblog.tilllate.com/2008/07/20/ten-methods-to-obfuscate-e-mail-addresses-compared/