Salesforce.com Id attribute seems to have a 15 and 18 character value, what's the difference?

When using the SOAP API to work with salesforce.com (SFDC), it seems that the primary key in the underlying database is Id. There seem to be two representations of this value: a 15-character version and an 18-character version.
I have been using the 18-character version since it is clearly more specific, but what is contained in the last three characters, given that they can seemingly be dropped?
Does anyone understand what this is all about?

From the Web Services API Developer's Guide:
ID fields in the Salesforce.com user interface contain 15-character, base-62, case-sensitive strings. Each of the 15 characters can be a numeric digit (0-9), a lowercase letter (a-z), or an uppercase letter (A-Z). Two unique IDs may only be different by a change in case.
Because there are applications like Access which do not recognize that 50130000000014c is a different ID from 50130000000014C, an 18-digit, case-safe version of the ID is returned by all API calls. The 18 character IDs have been formed by adding a suffix to each ID in the Force.com API. 18-character IDs can be safely compared for uniqueness by case-insensitive applications, and can be used in all API calls when creating, editing, or deleting data. If you need to convert the 18-character ID to a 15-character version, truncate the last three characters. Salesforce.com recommends that you use the 18-character ID.
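The guide excerpt does not say how the three-character suffix is built. A minimal sketch of the commonly described scheme follows, under the assumption that each suffix character records which of five ID characters are uppercase, with the first character of each group as the lowest bit. The function names are hypothetical and not part of any Salesforce API.

```python
# Suffix alphabet: 5 bits of case information map to one of 32 characters.
SUFFIX_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"


def to_18_char_id(id15: str) -> str:
    """Append the 3-character case-safe suffix to a 15-character ID."""
    if len(id15) != 15 or not (id15.isascii() and id15.isalnum()):
        raise ValueError("expected 15 base-62 characters")
    suffix = ""
    for start in (0, 5, 10):
        chunk = id15[start:start + 5]
        # Bit i is set when the i-th character of the chunk is uppercase
        # (assumed bit order: first character is the least significant bit).
        bits = sum(1 << i for i, ch in enumerate(chunk) if ch.isupper())
        suffix += SUFFIX_ALPHABET[bits]
    return id15 + suffix


def to_15_char_id(id18: str) -> str:
    """Drop the suffix, as the guide describes."""
    return id18[:15]


print(to_18_char_id("50130000000014c"))  # 50130000000014cAAA
print(to_18_char_id("50130000000014C"))  # 50130000000014CAAQ
```

Under this scheme the two IDs from the guide, which differ only in the case of the last character, get different suffixes, so they still compare as different after case folding.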

I know this is an old post, but just in case it is useful to someone...
If you want to do ad hoc conversions of IDs, rather than programmatic ones, this Chrome extension makes it easy:
https://chrome.google.com/webstore/detail/sf-15-to-18/cogllpmaoflgaekieefhmglbpgdgmoeg
FYI - I'm the developer. Please use the feedback form on the app if you'd like to suggest any improvements or additional functionality.
Thanks!

Related

How to implement MongoDB ObjectId validation from scratch?

I'm developing a front-end app where I want to support searching data by the id, so I'm going to have an "object id" field. I want to validate the object id to make sure it's a valid MongoDB ObjectId before sending it to the API.
So I searched for how to do it and I found this thread, where all the answers suggest using an implementation provided by a MongoDB driver or ORM such as mongodb or mongoose. However, I don't want to go that way, because I don't want to install an entire database driver/ORM in my front-end app just to do some id validation. I'd rather implement the validation myself.
Unfortunately, I couldn't find an existing implementation. Then I tried checking the ObjectId spec and implementing the validation myself, but that didn't work out either.
The specification says...
The ObjectID BSON type is a 12-byte value consisting of three different portions (fields):
a 4-byte value representing the seconds since the Unix epoch in the highest order bytes,
a 5-byte random number unique to a machine and process,
a 3-byte counter, starting with a random value.
Which doesn't make much sense to me. When it says the ObjectId has 12 bytes, it makes me think that the string representation is going to have 12 characters (1 byte = 1 char), but it doesn't. Most object ids have 24 characters.
Finally, I searched mongodb's and mongoose's source code, but I didn't have much luck with that either. The best I could do was find this line of code, but I don't know where to go from there.
TL;DR: What is the actual algorithm to check if a given string is a valid MongoDB Object Id?
Your find is correct; you just stopped too early. The isValid comes from the underlying bson library: https://github.com/mongodb/js-bson/blob/a2a81bc1bc63fa5bf3f918fbcaafef25aca2df9d/src/objectid.ts#L297
And yes, you got it right: there is not much to validate. Any 12 bytes can be an ObjectId. The reason you see 24 characters is that not all 256 byte values map to printable characters, so the ObjectId is usually presented in hex format, 2 characters per byte. A regexp to validate the 12-byte hex representation would be /^[0-9a-f]{24}$/i (anchored, so that longer strings that merely contain 24 hex characters are rejected).
TL;DR: check the constructor of ObjectId in the bson library for the official validation algorithm
Hint: you don't need most of it, as you are limited to string input on frontend.
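For string input, the check described above reduces to "exactly 24 hex digits". Here is a minimal sketch of that check, written in Python purely to show the logic; the function name is hypothetical, and this is not the bson library's own code.

```python
import re

# Exactly 24 hex digits, i.e. the hex encoding of 12 bytes.
OBJECT_ID_HEX = re.compile(r"[0-9a-fA-F]{24}")


def is_valid_object_id(value: object) -> bool:
    """Return True if value is a 24-character hex string."""
    # fullmatch requires the whole string to match, not just a substring.
    return isinstance(value, str) and OBJECT_ID_HEX.fullmatch(value) is not None


print(is_valid_object_id("5063a36bdeb13f7505000630"))   # True
print(is_valid_object_id("5063a36bdeb13f750500063"))    # False (23 characters)
print(is_valid_object_id("x5063a36bdeb13f7505000630"))  # False (stray prefix)
```

In a front-end app the same idea carries over to a JavaScript regular expression, as long as it is anchored at both ends.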

UUID for page content in AEM differs across author and its associated publish servers

A page in author with UUID(jcr:uuid) is activated and its content is replicated onto the 3 associated publish servers.
The content available in all the 3 publish servers has different UUIDs. So, considering the same content across all the 4 instances on AEM (1 author + 3 publish), how to associate with something unique?
I'm implementing a solution where I need to associate a unique id that can be mapped to the individual content across all the instances.
Approaches that I've tried till now:
Used the content path - to generate a unique id - by removing the '/' & '-' in the path.
The issue faced - For some paths this can be more than 128 chars which is the limit for the service to accept a unique id.
If I generate a unique id programmatically it will work, but how can I use it to track back to the content, given that I cannot store this programmatically created id on jcr:content and activate the page?
Issues - If I replicate the page, it will change the activation date as well- which is also important metadata for the content.
What can be the most feasible solution for the use case? Kindly help with suggestions and possible solutions.
You could use a hash of the content path. Easiest way to get a hash is using hashCode(). For compactness, use the Base64 representation of the hash bytes and truncate after a predetermined number of chars.
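A minimal sketch of the hash-and-truncate idea follows. Note that it swaps in SHA-256 for hashCode(): hashCode() yields only 32 bits, which collides easily across many paths, so the choice of digest here is an assumption rather than what the answer prescribes. The function name, the example path and the default length of 22 characters are likewise illustrative.

```python
import base64
import hashlib


def content_id(path: str, length: int = 22) -> str:
    """Derive a short, stable identifier from a content path."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    # URL-safe Base64 keeps the result to letters, digits, '-' and '_'.
    encoded = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    # Truncate to a fixed length, well under the 128-character limit.
    return encoded[:length]


# Hypothetical path; the same path yields the same id on every instance.
print(content_id("/content/site/en/products/some-page"))
```

Because the id depends only on the path, author and all publish instances compute the same value without storing anything on the page. The trade-off is that the id changes if the page is moved or renamed.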

Is it legitimate to insert UUIDs into Postgres that have been generated by a client application?

The normal MO for creating items in a database is to let the database control the generation of the primary key (id). That's usually true whether you're using auto-incremented integer ids or UUIDs.
I'm building a client-side app (Angular, but the tech is irrelevant) that I want to be able to build offline behaviour into. In order to allow offline object creation (and association), I need the client application to generate primary keys for new objects. This is both to allow for associations with other objects created offline and also to allow for idempotence (making sure I don't accidentally save the same object to the server twice due to a network issue).
The challenge, though, is what happens when that object gets sent to the server. Do you use a temporary client-side ID which you then replace with the ID that the server subsequently generates, or do you use some sort of ID translation layer between the client and the server? The latter is what Trello did when building their offline functionality.
However, it occurred to me that there may be a third way. I'm using UUIDs for all tables on the back end. And so this made me realise that I could in theory insert a UUID into the back end that was generated on the front end. The whole point of UUIDs is that they're universally unique so the front end doesn't need to know the server state to generate one. In the unlikely event that they do collide then the uniqueness criteria on the server would prevent a duplicate.
Is this a legitimate approach? The risks seem to be 1. collisions and 2. any form of security issue that I haven't anticipated. Collisions seem to be taken care of by the way that UUIDs are generated, but I can't tell if there are risks in allowing a client to choose the ID of an inserted object.
Yes, this is fine. Postgres even has a UUID type.
Set the default ID to be a server-generated UUID if the client does not send one.
On collisions: UUIDs are designed not to collide.
On "any form of security that I haven't anticipated": avoid UUIDv1 because...
This involves the MAC address of the computer and a time stamp. Note that UUIDs of this kind reveal the identity of the computer that created the identifier and the time at which it did so, which might make it unsuitable for certain security-sensitive applications.
You can instead use uuid_generate_v1mc which obscures the MAC address.
Avoid UUIDv3 because it uses MD5. Use UUIDv5 instead.
UUIDv4 is the simplest: it's a 122-bit random number and is built into Postgres (the others are in the commonly available uuid-ossp extension). However, it depends on the strength of the random number generator of each client. But even a bad UUIDv4 generator is better than incrementing an integer.
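As a rough illustration of the advice above, the sketch below generates a version 4 UUID on the client and has the server accept only well-formed version 4 values before they reach the database. The function names are hypothetical, and rejecting other versions is an assumed policy rather than something Postgres requires.

```python
import uuid


def new_client_id() -> str:
    """Generate a random (version 4) UUID on the client."""
    return str(uuid.uuid4())


def parse_client_id(raw: str) -> uuid.UUID:
    """Server-side check: accept only well-formed version 4 UUIDs."""
    parsed = uuid.UUID(raw)  # raises ValueError on malformed input
    if parsed.version != 4:
        raise ValueError("only version 4 UUIDs are accepted")
    return parsed


client_id = new_client_id()
print(parse_client_id(client_id) == uuid.UUID(client_id))  # True
```

The primary key or unique constraint on the server remains the final guard against duplicates, which also makes a retried insert of the same object easy to detect.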

Storing facebook id field in postgres database

I use FB APIs to get all the posts for a message.
While reviewing the return data, I noticed it in the following format.
88887444189_99993647936419190
Is there any standard data field type within POSTGRES that I can use by default?
It looks somewhat like a UUID, but I'm not sure.
Use text.
You could change the underscore to a period and jam it into a numeric, but that'd be horrible.
Unless FB's API docs specify its structure and meaning, treat it as an arbitrary identifier and store it as text. Even if it looked like something recognisable, say a UUID, you should still use text unless it's documented to be a UUID or whatever. Otherwise your code could break later when it changes without warning because you relied on undocumented behaviour in the API.

Am I exposing sensitive data if I put a BSON ID in a URL?

Say I have a Products array in my Mongodb. I'd like users to be able to see each product on their own page: http://www.mysite.com/product/12345/Widget-Wodget. Since each Product doesn't have an incremental integer ID (12345) but instead it has a BSON ID (5063a36bdeb13f7505000630), I'd need to either add the integer ID or use the BSON ID.
Since BSON ID's include the PID:
4-byte timestamp,
3-byte machine identifier,
2-byte process id,
3-byte counter.
Am I exposing secure information to the outside world if I use the BSON ID in my url?
I can't think of any use to gain privileges on your machines, however using ObjectIds everywhere discloses a lot of information nonetheless.
By crawling your website, one could:
find out about some hidden objects: for instance, if the counter part goes from 0x....b1 to 0x....b9 between times t1 and t2, one can guess ObjectIds within these intervals. However, guessing ids is most likely useless if you enforce access permissions
know the signup date of each user (not very sensitive info but better than nothing)
deduce actual (as opposed to publicly available) business hours from the timestamps of objects created by the staff
deduce in which timezones your audience lives from the timestamps of user-generated objects: if your website is one which people use mostly at lunchtime, then one could measure peaks of ObjectIds and deduce that a peak at 8 PM UTC means the audience was on the US West coast
and more generally, by crawling most of your website, one can build a timeline of the success of your service, having for any given time knowledge of: your user count, levels of user engagement, how many servers you've got, how often your servers are restarted. PID changes occurring on weekends are more likely crashes, whereas those on business days are more likely crashes + software revisions
and probably find other info specific to your business processes and domain
To be fair, even with random ids one can infer a lot. The main issue is that you need to prevent anyone from scraping a statistically significant part of your site. But if someone is determined, they'll succeed eventually, which is why providing them with all of this extra, timestamped info seems wrong.
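To make concrete how readable these fields are, here is a minimal sketch that splits an ObjectId according to the legacy 4-3-2-3 byte layout quoted in the question. The function name is hypothetical, and reading the pid as big-endian is an assumption. Newer versions of the ObjectId format replace the machine and pid fields with a 5-byte random value, but the leading timestamp is still there.

```python
from datetime import datetime, timezone


def decode_legacy_object_id(hex_id: str) -> dict:
    """Split a 24-character hex ObjectId into the legacy 4-3-2-3 byte fields."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes")
    # First 4 bytes: seconds since the Unix epoch, big-endian.
    seconds = int.from_bytes(raw[0:4], "big")
    return {
        "created": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "machine": raw[4:7].hex(),
        "pid": int.from_bytes(raw[7:9], "big"),  # byte order assumed
        "counter": int.from_bytes(raw[9:12], "big"),
    }


fields = decode_legacy_object_id("5063a36bdeb13f7505000630")
print(fields["created"].isoformat())
print(fields["machine"], fields["pid"], fields["counter"])
```

Anyone who can see the id in a URL can run the same decoding, which is how the creation-time inferences listed above become possible.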
Sharing the information in the ObjectID will not compromise your security. Someone could infer minor details such as when the ObjectID was created (timestamp), but none of the ObjectID components should be tied to authentication or authorization.
If you are building an e-commerce site, SEO is typically a strong consideration for public URLs. In this case you normally want to use a friendlier URL with shorter and more semantic path components than an ObjectID.
Note that you do not have to use the default ObjectID for your _id field, so you could always generate something more relevant for your application. The default ObjectID does provide a reasonable guarantee of uniqueness, so if you implement your own _id allocation you will have to take this into consideration.
See also:
Create an Auto-Incrementing Sequence Field
As @Stennie said, not really.
Let's start with the pid: most hackers wouldn't bother looking for a pid. On Linux, say, they would instead just run:
ps aux | grep mongod
or something similar. Of course, this requires the hacker to have actually hacked your server. I know of no public hack based on the pid alone. Considering the pid will change when you restart the machine or mongod, this information is utterly useless to anyone trying to spy.
The machine id is another bit of data that is quite useless publicly and, to be honest, they would get a better understanding of your network using ping or dig than they would through the machine id alone.
So to answer the question: No, there is no real security threat and the information you are displaying is of no use to anyone except MongoDB really.
I also agree with @Stennie on using SEO-friendly URLs. An example I commonly use for e-commerce is /product/product_title_ with a smaller random id (maybe Base64-encode the _id) or an auto-incrementing id, with .html on the end.
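A minimal sketch of the "Base64-encode the _id" idea: the 12 bytes of an ObjectId fit in 16 URL-safe Base64 characters instead of 24 hex characters. The function names are hypothetical. The encoding is reversible, so it shortens the URL but hides nothing.

```python
import base64


def short_id(hex_id: str) -> str:
    """Encode a 24-character hex ObjectId as 16 URL-safe Base64 characters."""
    raw = bytes.fromhex(hex_id)
    # 12 bytes encode to exactly 16 characters, so no '=' padding appears.
    return base64.urlsafe_b64encode(raw).decode("ascii")


def from_short_id(token: str) -> str:
    """Reverse short_id, returning the original hex string."""
    return base64.urlsafe_b64decode(token).hex()


token = short_id("5063a36bdeb13f7505000630")
print(len(token))            # 16
print(from_short_id(token))  # 5063a36bdeb13f7505000630
```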