Please see screen grab.
I am using a finance application and T-SQL. I would just like to understand why data would be stored in a database table in hexadecimal rather than as an integer, and why it would follow the format of 8-4-4-4-12 characters (total: 32).
Thanks in advance.
It's a UUID (or, in Microsoft parlance, a GUID). The 8-4-4-4-12 layout is the standard text representation of a 128-bit value as 32 hexadecimal digits. UUIDs are normally used to keep primary keys unique across different servers in a distributed system. If your system isn't distributed, there's no reason not to use a simple auto-generated integer as the primary key.
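To make the relationship between the hexadecimal text and an integer concrete, here is a minimal Python sketch using the standard `uuid` module. The sample value is an arbitrary UUID chosen for illustration, not one taken from the application in the question.

```python
import uuid

# Arbitrary sample value, used only for illustration.
text = "e6709870-5cbc-11df-a08a-0800200c9a66"
parsed = uuid.UUID(text)

# The text form is five groups of hexadecimal digits: 8-4-4-4-12.
group_lengths = [len(part) for part in text.split("-")]
hex_digits = sum(group_lengths)

# Each hexadecimal digit carries 4 bits, so 32 digits describe 128 bits.
bit_length = hex_digits * 4

# The same value viewed as a single (very large) integer.
as_integer = parsed.int

print(group_lengths, hex_digits, bit_length)
print(as_integer)

# A freshly generated random UUID has the same shape.
print(uuid.uuid4())
```

So the column is holding one 128-bit number; the hexadecimal grouping is only how that number is displayed.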
I want to store the UIDs generated by Firebase Auth in a Postgres database. As they are not valid UUIDs, I am not sure which data type to choose. Mainly, I am not sure whether I should use char or varchar.
I would say use varchar to allow for the UID changing over time. From the Postgres end there is really no difference, as the PostgreSQL documentation on character types notes:
Tip
There is no performance difference among these three types, apart from increased storage space when using the blank-padded type, and a few extra CPU cycles to check the length when storing into a length-constrained column. While character(n) has performance advantages in some other database systems, there is no such advantage in PostgreSQL; in fact character(n) is usually the slowest of the three because of its additional storage costs. In most situations text or character varying should be used instead.
Firebase Authentication UIDs are just strings. The strings don't contain any data - they are just random. A varchar seems appropriate.
I could have posted this to an SQL forum, but I am really looking for an idea or best practice, which is why I chose this forum.
I have an integer column in SQL called Payroll Number, and it is unique to each employee. We will be querying employee information from this system via SQL views and loading it into another system, but we don't want payroll numbers to appear there as they are in this system. Therefore, we need to hash those payroll numbers in SQL so that the views serve hashed, user-friendly numbers.
I have spent quite a lot of time reading about encryption techniques in SQL, but they use complex algorithms to hash data and they produce binary output. What I am after is less complex: obfuscating a number rather than hashing it.
For instance, if the payroll number is 6 digits long (145674), I want to be able to generate a 9 to 10 digit integer from it and use that on other systems.
I had a look at XOR'ing, but I need something more robust and elegant.
How do you do these things? Do you write your own simple algorithm to obfuscate your integers? I need to do this at the SQL level; what do you suggest?
Thanks for your help
Regards
It is not hard to hash a value, but it is hard to hash a value, be sure of uniqueness, and have the result be a number. However, I do have a cross-database solution.
Make a new table with two columns: id (auto-generated from a random starting point) and payroll id.
Every time you need to use a user externally, insert them into this table. This will give you a local unique id you can use (internally and externally), but it is not the payroll id.
In fact, if you already have an internal id (e.g. the user id from the user table), just use that. There is no advantage to hashing this value if it is never decoded. Otherwise, you can use the auto-generated id as your random unique "hash"; it has all the properties you need.
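The mapping-table idea can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` with an in-memory database instead of SQL Server, and the table name `external_ids`, the column names, the helper `external_id_for`, and the 9-digit starting range are all made up for the example. In T-SQL the equivalent would be an identity column with a chosen seed.

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")

# Two columns: an auto-generated id and the real payroll number.
conn.execute(
    "CREATE TABLE external_ids ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " payroll_number INTEGER NOT NULL UNIQUE)"
)

# Pick a random 9-digit starting point (hypothetical range) and seed
# SQLite's AUTOINCREMENT counter with it, so ids continue from start + 1.
start = 100_000_000 + secrets.randbelow(800_000_000)
conn.execute(
    "INSERT INTO sqlite_sequence (name, seq) VALUES ('external_ids', ?)",
    (start,),
)


def external_id_for(connection: sqlite3.Connection, payroll_number: int) -> int:
    """Return the external id for a payroll number, creating it on first use."""
    row = connection.execute(
        "SELECT id FROM external_ids WHERE payroll_number = ?",
        (payroll_number,),
    ).fetchone()
    if row is not None:
        return row[0]
    cursor = connection.execute(
        "INSERT INTO external_ids (payroll_number) VALUES (?)",
        (payroll_number,),
    )
    return cursor.lastrowid


first = external_id_for(conn, 145674)
second = external_id_for(conn, 145675)
again = external_id_for(conn, 145674)
print(first, second, again)
```

The external id reveals nothing about the payroll number because the only link between them is the row in the mapping table, which stays inside the source system.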
We are currently using VARCHAR for storing text data in DB2. However, we are hitting the problem that the specified VARCHAR length is not the same as the length of the text: in DB2 the specified length is the length of the UTF-8 data in bytes, which varies depending on the stored text. For example, some texts contain characters from different languages, and because of this some texts with 500 characters can't be saved in a VARCHAR(500) column.
Now we are planning to migrate to VARGRAPHIC. I need to know what the limitations are of using VARGRAPHIC for storing Unicode text data in DB2.
Are there any problems with using VARGRAPHIC?
DB2 doesn't check that the data is in fact a double-byte string; it assumes it must be. Usually the drivers will do the proper conversions for you, but you might one day bump into a bug. It is unlikely, though.
If you use federated databases, VARGRAPHIC support in queries might fail completely. Overall, the number of bug reports for the VARGRAPHIC data types is somewhat high. Support for them probably isn't as well tested as for other data types.
In a Unicode database (i.e. one where UTF-8 is a requirement), VARGRAPHIC uses big-endian UCS-2, meaning your space requirements for those columns double. VARGRAPHIC is a DB2 proprietary data type, so if you migrate off DB2 some day you will have to do an extra conversion.
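As a rough illustration of the byte counts involved (this is plain Python, not DB2), the sketch below compares the number of characters in a string with its encoded size. Python's `utf-16-be` codec is used as a stand-in for big-endian UCS-2, which is an assumption that holds for characters in the Basic Multilingual Plane. The sample strings and the helper name `byte_lengths` are made up for the example.

```python
def byte_lengths(text: str) -> tuple[int, int, int]:
    """Return (characters, UTF-8 bytes, UTF-16-BE bytes) for the given text."""
    return (
        len(text),
        len(text.encode("utf-8")),
        len(text.encode("utf-16-be")),
    )


# Three made-up 500-character samples in different scripts.
samples = {
    "latin": "a" * 500,
    "cyrillic": "я" * 500,
    "cjk": "語" * 500,
}

for name, text in samples.items():
    chars, utf8_bytes, utf16_bytes = byte_lengths(text)
    print(f"{name}: {chars} characters, {utf8_bytes} UTF-8 bytes, "
          f"{utf16_bytes} UTF-16-BE bytes")
```

Only the all-ASCII sample fits within a 500-byte limit in UTF-8, which matches the problem described in the question. The two-byte encoding is a fixed size per character for these samples, so it doubles the storage for ASCII text while being predictable in length.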
I've always used either auto_generated or Sequences in the past for my primary keys. With the current system I'm working on there is the possibility of having to eventually partition the data which has never been a requirement in the past. Knowing that I may need to partition the data in the future, is there any advantage of using UUIDs for PKs instead of the database's built-in sequences? If so, is there a design pattern that can safely generate relatively short keys (say 6 characters instead of the usual long one e6709870-5cbc-11df-a08a-0800200c9a66)? 36^6 keys per-table is more than sufficient for any table I could imagine.
I will be using the keys in URLs so conciseness is important.
There is no pattern that reduces a 128-bit UUID to 6 characters, because information would be lost.
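The size mismatch is easy to check with a little arithmetic. The following Python sketch compares the number of distinct 6-character keys over a 36-symbol alphabet (the figure from the question) with the number of distinct 128-bit values.

```python
import math

# Distinct 6-character keys over a 36-symbol alphabet (0-9, a-z).
short_key_space = 36 ** 6

# How many bits of information such a key can carry.
short_key_bits = math.log2(short_key_space)

# Distinct values representable in 128 bits.
uuid_space = 2 ** 128

print(short_key_space)
print(round(short_key_bits, 2))
print(uuid_space // short_key_space)
```

A 6-character key carries roughly 31 bits, so any mapping from UUIDs to such keys must send a vast number of different UUIDs to the same key.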
Almost all databases implement a surrogate key strategy called incremental keys.
Postgres and Informix have serials, MySQL has auto_increment, and Oracle offers sequence generators. In your case I think it would be safe to use integer IDs.
See this article: Choosing a Primary Key: Natural or Surrogate? for a discussion of available techniques.
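If integer IDs are used and short URL keys are still wanted, one possible approach (an assumption on my part, not something the article above prescribes) is to display the integer in base 36. The helper names `to_base36` and `from_base36` below are hypothetical.

```python
import string

# 36 symbols: the digits 0-9 followed by the letters a-z.
ALPHABET = string.digits + string.ascii_lowercase


def to_base36(number: int) -> str:
    """Encode a non-negative integer as a base-36 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return "0"
    digits = []
    while number:
        number, remainder = divmod(number, 36)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def from_base36(key: str) -> int:
    """Decode a base-36 string back to the integer it represents."""
    return int(key, 36)


for row_id in (1, 145674, 36 ** 6 - 1):
    key = to_base36(row_id)
    print(row_id, key, from_base36(key))
```

Every integer below 36^6 encodes to at most 6 characters, so the database keeps a plain sequence-generated key and only the URL shows the shorter form. Note that such keys are sequential and therefore guessable.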
I'm not sure what type of partitioning you are planning, but I don't see why you would change the primary key design. Even if the old partitioned tables are "alive" (i.e., you might insert rows into any partitioned table), there is no problem in sharing the sequence among several tables.
What are ROWID and RECID in Progress? Can we use RECID instead of ROWID? What is the difference between them?
Both RECID and ROWID are unique pointers to a specific record in the database.
Both are more or less physical pointers into the database itself, except for non-OpenEdge tables where there is no equivalent on the underlying platform. In those cases, the pointer may be composed of the values making up the primary key.
RECIDs were 32-bit integers up through 10.1A, and were fine when the database was an OpenEdge database and had only one area. From 10.1B forward they were upgraded to 64-bit integers.
In v6 the capability was added to connect to non-OpenEdge databases, and in v8 to create OpenEdge databases with more than one storage area. At that point, RECIDs were insufficient to address all of the records in a table uniquely in all circumstances.
So the ROWID structure was born. Its actual architecture depends on the type of database underneath, but it does not suffer from the limitations of being an integer.
The documentation is fairly clear in stating that RECIDs should not be used going forward, except for code that manipulates the OpenEdge database metaschema.
RECID has been deprecated for a couple of versions now, and ROWID is its replacement. I understand that what it actually returns is the physical address of the DB block containing your record. From memory, ROWID was introduced when they wanted to support different DB engines (Oracle, SQL Server, et al.) from the 4GL, which implies that there is more in a ROWID than in a RECID.
I'd stay away from RECID. You might get away with it in the short term, but you're giving yourself a potential problem that you could avoid altogether.