Celery - Task Id max length?

I would like to store the Celery task id in a CharField on a model in my Django database. I am required to specify a max length. What is the max length of a Celery task id?

Celery uses Python's standard library uuid module ( https://docs.python.org/3/library/uuid.html ) to generate task IDs. This module generates standard RFC 4122 UUIDs, that is 32 hexadecimal digits separated by 4 dashes, so the Python-generated UUID strings are always 36 characters long (example: 6f726825-ccef-4be1-b64c-ae13605d48db).
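So max_length=36 is enough. A minimal Django sketch (the model and field names here are just examples, not from the question):

from django.db import models

class TaskResult(models.Model):
    # Celery task IDs are UUID strings like "6f726825-ccef-4be1-b64c-ae13605d48db":
    # 32 hex digits plus 4 dashes = 36 characters.
    task_id = models.CharField(max_length=36, unique=True)

Alternatively, models.UUIDField stores the same value in a native uuid column and sidesteps the length question entirely.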

Related

How to capture the null values (custname) and their respective CustIDs in a separate file, and the rest of the CustIDs in another file

CustID   CustNAme
------   --------
10       Ally
20       null
30       null
40       Liza
50       null
60       Mark
You need to generate an artificial key (e.g. line number) on each file. Then have the source of CustID as the stream input to a Lookup stage, and the source of CustName as the reference input of the Lookup stage, where the lookup key is LineNumber. Set the Lookup Failed rule to suit your own needs.
One way to generate line numbers is a Column Generator stage operating in sequential mode.
You can use a Transformer Stage with two output links. Use the output link constraints to check for null values and split the stream.
As the constraints, just write IsNull(DSLink2.CustNAme) or IsNotNull(DSLink2.CustNAme) respectively.
Note: You can also write !IsNull(col) or Not(IsNull(col)) for IsNotNull(col)
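Outside DataStage, the pair-by-line-number-and-split-on-null logic looks roughly like this in Python (the file names and the literal "null" marker are assumptions):

# pair the two input files by line number, then split on missing CustNAme
with open("custid.txt") as f_id, open("custname.txt") as f_name, \
     open("null_custids.txt", "w") as nulls, open("other_custids.txt", "w") as others:
    for cust_id, cust_name in zip(f_id, f_name):
        cust_id, cust_name = cust_id.strip(), cust_name.strip()
        if not cust_name or cust_name.lower() == "null":
            nulls.write(f"{cust_id}\n")                  # null CustNAme -> one file
        else:
            others.write(f"{cust_id},{cust_name}\n")     # the rest -> the other file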

What's a best practice for saving a unique, random, short string to db?

I have a table with a varchar column named key, which is supposed to hold a unique, 8-char random string that will be used as a unique identifier by users. This field should be generated and saved on creation of objects. My question is how to create it:
Most recommendations point to a UUID field, but that's not applicable for me because it's too long, and if I just take a subset of it there's no guarantee of uniqueness.
Currently I've implemented a loop in my backend (not the DB), which generates a random string and tries to insert it into the DB, and retries if the string turns out not to be unique. But I feel that this is just really bad practice.
What's the best way to do this?
I'm using Postgresql 9.6
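For reference, the retry loop described above looks roughly like this (the table name, column name and psycopg2 usage are assumptions, not from the question):

import secrets
import string
import psycopg2

ALPHABET = string.ascii_letters + string.digits   # 62 possible characters

def create_object_key(conn):
    # keep generating random 8-char strings until one inserts without a collision
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(8))
        try:
            with conn.cursor() as cur:
                cur.execute("INSERT INTO objects (key) VALUES (%s)", (candidate,))
            conn.commit()
            return candidate
        except psycopg2.IntegrityError:
            conn.rollback()        # key already taken: try another random string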
UPDATE:
My main concern is to remove the loop that retries until it finds a random, short string (or number, doesn't matter) that is unique in that table. AFAIK the solution should be a way to generate the string in the DB itself. The only thing I can find for PostgreSQL is uuid and uuid-ossp, which does something like this, but a uuid is way too long for my application, and I don't know of any way to get a shorter representation of a uuid without compromising its uniqueness (and I don't think it's possible theoretically).
So, how can I remove the loop and its back-and-forth to the DB?
Encryption is guaranteed to produce unique outputs; it has to, otherwise decryption would not work. Provided you encrypt unique inputs, such as 0, 1, 2, 3, ..., you are guaranteed unique outputs.
You want 8 characters. You have 62 characters to play with: A-Z, a-z, 0-9, so convert the binary output from the encryption to a base-62 number.
You may need to use the cycle walking technique from Format-preserving encryption to handle a few cases.
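A minimal Python sketch of the counter -> encrypt -> base-62 idea. The affine permutation and its constants below are only placeholders so the example stays self-contained; a real system would use a proper keyed cipher (format-preserving encryption, or a block cipher plus cycle walking):

import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase  # 62 symbols
N = 62 ** 8                                   # size of the 8-character key space

# Toy stand-in for encryption: an affine permutation of 0..N-1.
# Any A with gcd(A, N) == 1 (odd and not divisible by 31) gives a bijection.
A = 1_134_979_744_829
B = 98_765_432_101

def to_base62(n, width=8):
    # convert an integer to a fixed-width base-62 string
    chars = []
    for _ in range(width):
        n, r = divmod(n, 62)
        chars.append(ALPHABET[r])
    return "".join(reversed(chars))

def short_key(counter):
    # map a sequential counter (0, 1, 2, ...) to a unique 8-character key
    return to_base62((A * counter + B) % N)

Because the map is a bijection on 0..62**8-1, distinct counters always give distinct keys, so no uniqueness check against the table is needed.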

What is the meaning of NUM_LOADER_THREADS and NUM_WRITER_THREADS in sysbench-mongodb?

I use sysbench-mongodb to test the performance of MongoDB, and the tool has a config file named config.bash. This file has two parameters:
# total number of simultaneous insertion threads (for loader)
# valid values : integer > 0
export NUM_LOADER_THREADS=8
# total number of simultaneous benchmark threads
# valid values : integer > 0
export NUM_WRITER_THREADS=64
I do not know what the difference between these two parameters is. Has anyone used this tool? Can anyone help me? Thank you.

database design for a new system with a legacy dependency

We are planning a new project (complete relaunch) of a web application in PHP (Symfony 2) and PostgreSQL. Currently we use PHP and MySQL (MyISAM). -> webapp
The current and the new webapp depend on another system (.NET) with its own database (MS SQL 8 / 2000), which will not be modified (changed, or the databases merged) anytime soon, because there is a complex workflow around the whole megillah. -> legacy system
BTW: the biggest table has 27 million rows in total
Most of the data/tables will be transferred multiple times per day from the legacy database to the webapp database. For the new webapp we have already redesigned most of the database schema, so we now have an almost normalised schema (the schema of the legacy database is massively redundant and really messy).
Currently the transfer job tries to insert data. When there is an exception with a specific code, we know the row is already there and do an update instead. This is for performance (no select before update).
For the new webapp schema we still want to use the same primary IDs as in the legacy database. But there are some problems, one of them: some tables have primary keys that look like integers, but they aren't. Most of the rows have integers like 123456, but then there are some rows with a character in them, like 123456P32.
Now there are two options for the new schema:
Use string type for PK and risk performance issues
Use integer type for PK and make a conversion
The conversion could look like this (character based)
legacy   new
------   ---
0        10
1        11
2        12
.        ..
9        19
a        20
b        21
.        ..
y        45
z        46
A        50   (not 47, because the arity of the second digit is 'clean' with 50)
B        51
.        ..
Z        76
The legacy PK 123 would be converted into 111213, so the length is double the original. Another example: 123A9 -> 1112135019. Because every character becomes two digits, the value can also be converted back.
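A rough Python sketch of this mapping, assuming the a -> 20 and A -> 50 anchors and the worked examples above (with 26 letters that gives z -> 45 and Z -> 75; the z/Z rows in the table look off by one from this):

def legacy_to_int(key):
    # two new digits per legacy character: '0'-'9' -> 10-19, 'a'-'z' -> 20-45, 'A'-'Z' -> 50-75
    parts = []
    for ch in key:
        if ch.isdigit():
            parts.append(str(10 + int(ch)))
        elif ch.islower():
            parts.append(str(20 + ord(ch) - ord('a')))
        elif ch.isupper():
            parts.append(str(50 + ord(ch) - ord('A')))
        else:
            raise ValueError(f"unexpected character {ch!r} in key {key!r}")
    return int("".join(parts))

def int_to_legacy(n):
    # invert the mapping by reading the decimal digits two at a time
    digits = str(n)
    chars = []
    for i in range(0, len(digits), 2):
        code = int(digits[i:i + 2])
        if 10 <= code <= 19:
            chars.append(str(code - 10))
        elif 20 <= code <= 45:
            chars.append(chr(ord('a') + code - 20))
        elif 50 <= code <= 75:
            chars.append(chr(ord('A') + code - 50))
        else:
            raise ValueError(f"invalid two-digit code {code}")
    return "".join(chars)

legacy_to_int("123")       # 111213
legacy_to_int("123A9")     # 1112135019
int_to_legacy(1112135019)  # '123A9'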
My first doubt was that the sparse PKs would bring some performance issues, but when using a b-tree (self-balancing) index, which is the default index type in Postgres, it should be fine.
What do you think? Have you some experience with similar systems with legacy dependencies?
PostgreSQL performance with text PK isn't that bad — I'd go with it for simplicity.
You didn't tell us how long these keys can be. Using your conversion, an ordinary integer would be enough only for a 4-character key (8 digits), and a bigint only for a 9-character key (18 digits).
Use CREATE DOMAIN to isolate the proposed data types. Then build and test a prototype. You're lucky; you have no shortage of valid test data.
create domain legacy_key as varchar(15) not null;
create table your_first_table (
    new_key_name legacy_key primary key
    -- other columns go here.
);
To test a second database using integer keys, dump the schema, change that one line (and the name of the database if you want to have them both at the same time), and reload.
create domain legacy_key as bigint not null;
You should think hard about storing the legacy system's primary keys exactly as they are. Nothing to debug, and great peace of mind. If you must convert, be careful with values like '1234P45': if that letter happens to be an E or a D, some applications will interpret it as indicating an exponent.
You shouldn't have performance problems due to key length if you're using varchar() keys of 10 or 15 characters, especially with version 9.2. Read the documentation about indexes before you start. PostgreSQL supports more kinds of indexes than most people realize.

Cassandra CLI: RowKey with RandomPartitioner = Show Key in Cleartext?

Using the Cassandra CLI gives me the following output:
RowKey: 31307c32333239
=> (super_column=3f18d800-fed5-17cf-b91a-0016e6df0376,
(column=date, value=1312289229287, timestamp=1312289229647000)
I am using RandomPartitioner. Is it somehow possible to get the RowKey (from the CLI) in cleartext? Ideally in the CLI, but if there is a helper class to convert it back into a String, that would also be ok.
I know the key is somehow hashed. If the key cannot be "retrieved" (which I have to assume), is there a helper class exposed in Cassandra that I can use to generate the key based on my original String, so I can compare them?
My problem: I have stored records in Cassandra, but using a key like "user1|order1" I am not able to retrieve the records. Cassandra does not find any records. I assume that somehow my keys are wrong and I need to compare them and find out where the problem is...
Thanks very much! Jens
This question is highly relevant: Cassandra cli: Convert hex values into a human-readable format
The only difference is that in Cassandra 0.8, there is now a "key_validation_class" attribute per column family, and it defaults to BytesType. When the CLI sees that it's BytesType, it represents the key in hex.
Cassandra is treating your RowKey as hex bytes: 31 30 7c 32 33 32 39 in hex is
10|2329 in ASCII, which I guess is your key.
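To check the translation by hand, a couple of lines of plain Python (nothing Cassandra-specific) do the same hex/ASCII conversion:

# decode a RowKey shown by the CLI in hex back to text
bytes.fromhex("31307c32333239").decode("ascii")   # -> '10|2329'
# encode your own key to hex to compare it against the CLI output
"user1|order1".encode("ascii").hex()              # -> '75736572317c6f7264657231'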
If you want to see your plaintext keys in the CLI, then you need to give the command
assume MyColumnFamily keys as ascii;
which will make the CLI automatically translate between ASCII and hex for you in the row keys. Or, when you create the column family, use
create column family MyColumnFamily with key_validation_class = AsciiType;
which will set the preference permanently.
The hashing occurs at a lower level and doesn't affect your problem.