User profile database design - database-schema

I have to design user account/profile tables for a university project. The basic idea I have is the following:
a table for user account (email, username, pwd, and a bunch of other fields)
a user profile table.
It seems to me that there are two ways to model a user profile table:
put all the fields in a table
[UserProfileTable]
UserAccountID (FK)
UserProfileID (PK)
DOB Date
Gender (the id of another table which lists the possible genders)
Hobby varchar(200)
SmallBio varchar(200)
Interests varchar(200)
...
Put the common fields in a table and design a ProfileFieldName table that will list
all fields that we want. For example:
[ProfileFieldNameTable]
ProfileFieldID int (PK)
Name varchar
Name will be 'hobby', 'bio', 'interests', etc. Finally, we will have a table that will associate profiles with profile fields:
[ProfileFieldTable]
ProfileFieldID int (PK)
UserProfileID (FK)
FieldContent varchar
'FieldContent' will store a small text about hobbies, the bio of the user, his interests and so on.
This approach is extensible: adding more fields corresponds to an INSERT.
What do you think about this schema?
One drawback is that to gather all the profile information of a single user I now have to do a join.
The second drawback is that the field 'FieldContent' is of type varchar. What if I want it to be of another type (int, float, a date, a FK to another table for list boxes, etc.)?
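To make the second design concrete, here is a minimal sketch of it. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for the real database, and the snake_case table names, the composite primary key and the sample rows are illustrative choices, not part of the schema above. It shows that a new kind of field is an INSERT and that one join gathers a whole profile.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real DBMS;
# table/column names and sample data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_profile (
        user_profile_id INTEGER PRIMARY KEY,
        dob             TEXT
    );
    CREATE TABLE profile_field_name (
        profile_field_id INTEGER PRIMARY KEY,
        name             TEXT NOT NULL UNIQUE
    );
    CREATE TABLE profile_field (
        user_profile_id  INTEGER NOT NULL REFERENCES user_profile,
        profile_field_id INTEGER NOT NULL REFERENCES profile_field_name,
        field_content    TEXT,
        PRIMARY KEY (user_profile_id, profile_field_id)
    );
    INSERT INTO user_profile VALUES (1, '1990-05-17');
    INSERT INTO profile_field_name VALUES (1, 'hobby'), (2, 'bio');
    INSERT INTO profile_field VALUES (1, 1, 'chess'), (1, 2, 'CS student');
""")

# Adding a new kind of field is an INSERT, not an ALTER TABLE.
conn.execute("INSERT INTO profile_field_name VALUES (3, 'interests')")
conn.execute("INSERT INTO profile_field VALUES (1, 3, 'databases')")


def load_profile(profile_id):
    """Gather every field of one profile with a single join."""
    rows = conn.execute(
        """
        SELECT n.name, f.field_content
        FROM profile_field AS f
        JOIN profile_field_name AS n USING (profile_field_id)
        WHERE f.user_profile_id = ?
        ORDER BY n.name
        """,
        (profile_id,),
    ).fetchall()
    return dict(rows)


profile = load_profile(1)
print(profile)
```

Note that every value comes back as text here, which is exactly the second drawback mentioned above.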

I suggest the 2nd option would be better.
The drawbacks you mentioned are not real drawbacks:
1) Using JOINs is simply one of the normal ways of retrieving data from one or more tables.
2) 'FieldContent' is of type varchar: I understand that you are going to create a single 'FieldContent' column for all your fields.
In that case, I suggest that you have a separate 'FieldContent' column for each kind of field, so that you can give each one whatever data type you wish.
Coming to the first option:
1) Putting all the fields in one table may lead to a lot of confusion and makes the design harder to extend later if the requirements change.
2) There will be a lot of redundancy as well.

Would this PostgreSQL model work for long-term use and security?

I'm making a real-time chat app and am stuck figuring out what the DB model should look like. I've made this diagram, but would this work? My issue is more to do with foreign keys.
I know this is a very vague question, but I have been struggling with this model for a while now. This is the first database I'm setting up, so it's probably got a load of errors.
Actually you are fairly close, but you over-complicated it a bit. At the conceptual/logical level you have just 2 entities, Users and Messages, with a many-to-many relationship. At the physical level the Channels table resolves the M:M into the 2 one-to-many relationships you have described. But viewing it this way reveals a couple of issues. The attribute user is not required in the Messages table, and if physically implemented it requires a validation, not easily done, that the user there exists in the Channels table. Further, everything that the Message:User relationship provides is available via the Users:Channels:Messages relationship. A similar argument applies to the Channels column in Users: it is completely resolved by the resolution table. Suggestion: drop user from the messages table and channels from users.
Now let's look at the columns of Channels. It looks like you are using a boilerplate for created_at and updated_at, but are they necessary? Well, at least for updated_at, no. What can be updated? If either User or Message is updated you have a brand new entry. Yes, it may seem like the same physical row (actually it is not), but the meaning is completely different. Well, how about last message? What is it trying to indicate that the max value of created_at for the user does not give you? I cannot see anything. I guess you could change created_at, but what is the point of tracking when I changed that column? Suggestion: drop last message sent and updated_at (unless required by institution standards) from the channels table.
That leaves the Users table itself. Besides Channels, mentioned above, there is the Contacts column. Physically, as an array, it violates 1NF and becomes difficult to manage (as does validating that the contact is in fact a user). Logically it is creating a M:M on User:User. So resolve it the same way as User:Messages: pull it out into another table, say User_Contacts, with 2 attributes referencing the Users table. Suggestion: drop contacts from the users table and create a resolution table.
Unfortunately, I do not have a good ERD diagrammer, so I will just provide the DDL.
create table users (
user_id integer generated always as identity primary key
, name text
, phone_number text
, last_login timestamptz
, created_at timestamptz
, updated_at timestamptz
) ;
create type message_type as enum ('short', 'long'); -- list all values
create table messages(
msg_id integer generated always as identity primary key
, msg_type message_type
, message text
, created_at timestamptz
, updated_at timestamptz
);
create table channels( -- resolves M:M Users:Messages
user_id integer
, msg_id integer
, created_at timestamptz
, constraint channels_pk
primary key (user_id, msg_id)
, constraint channels_2_users_fk
foreign key (user_id)
references users(user_id)
, constraint channels_2_messages_fk
foreign key (msg_id)
references messages(msg_id )
);
create table user_contacts( -- resolves M:M Users:Users
user_id integer
, contact_id integer
, created_at timestamptz
, constraint user_contacts_pk
primary key (user_id, contact_id)
, constraint user_2_users_fk
foreign key (user_id)
references users(user_id)
, constraint contact_2_user_fk
foreign key (contact_id)
references users(user_id)
, constraint contact_not_me_check check (user_id <> contact_id)
);
Notes:
Do not use text as a PK; use either integer (bigint) or UUID, and generate them during insert.
Caution on ENUM. In Postgres you can add new values, but you cannot remove a value. Depending upon the number of values and how often they change, consider creating a lookup/reference table for them.
Do not use the data type TIME. It is really not that useful without the date. Simple example: I log in today at 15:00, you log in tomorrow at 13:00. Now, from the database itself, which of us logged in first?
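As a rough illustration of what the user_contacts resolution table buys you, the sketch below assumes SQLite via Python's sqlite3 module instead of Postgres, trims the tables down to the relevant columns, and uses a hypothetical helper try_add_contact. With the foreign key on contact_id and the check constraint in place, the database itself rejects unknown users, self-contacts and duplicates, which an array column cannot do.

```python
import sqlite3

# Assumption: in-memory SQLite replaces Postgres for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        name    TEXT
    );
    CREATE TABLE user_contacts (
        user_id    INTEGER NOT NULL REFERENCES users(user_id),
        contact_id INTEGER NOT NULL REFERENCES users(user_id),
        PRIMARY KEY (user_id, contact_id),
        CHECK (user_id <> contact_id)
    );
    INSERT INTO users (user_id, name) VALUES (1, 'ann'), (2, 'bob');
""")


def try_add_contact(user_id, contact_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO user_contacts (user_id, contact_id) VALUES (?, ?)",
                (user_id, contact_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "valid contact": try_add_contact(1, 2),   # both users exist
    "unknown user": try_add_contact(1, 99),   # violates the foreign key
    "self contact": try_add_contact(1, 1),    # violates the check constraint
    "duplicate": try_add_contact(1, 2),       # violates the primary key
}
print(results)
```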

Do I really need an individual table for each of my three types of users?

If I have three types of users, let's say sellers, consumers, and sales persons, should I make a single table for their details (name, email, passwords and all other credentials, etc.) with a role_type table, or a separate table for each of them? Which is the best approach for a large project, considering all engineering principles for DBMS like normalization etc.?
Also, does it affect the performance of the app if I have lots of joins between tables to perform certain operations?
If the only thing that distinguishes those people is the role but all details are the same, then I would definitely go for a single table.
The question is however, can a single person have more than one role? If that is never the case, then add a role_type column to the person table. Depending on how fixed those roles are maybe use a lookup table and a foreign key, e.g.:
create table role_type
(
id integer primary key,
name varchar(20) not null unique
);
create table person
(
id integer primary key,
.... other attributes ...,
role_id integer not null references role_type
);
However, in my experience the restriction to exactly one role per person usually doesn't hold, so you would need a many-to-many relationship:
create table role_type
(
id integer primary key,
name varchar(20) not null unique
);
create table person
(
id integer primary key,
.... other attributes ...
);
create table person_role
(
person_id integer not null references person,
role_id integer not null references role_type,
primary key (person_id, role_id)
);
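A quick sketch of how the many-to-many version is used, assuming SQLite through Python's sqlite3 module as a stand-in, a made-up name column in place of the elided attributes, and a hypothetical helper roles_of. One person can hold several roles, and a single join lists them.

```python
import sqlite3

# Assumption: in-memory SQLite; the "name" column and sample rows are
# placeholders for the attributes elided in the DDL above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE role_type (
        id   INTEGER PRIMARY KEY,
        name VARCHAR(20) NOT NULL UNIQUE
    );
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE person_role (
        person_id INTEGER NOT NULL REFERENCES person,
        role_id   INTEGER NOT NULL REFERENCES role_type,
        PRIMARY KEY (person_id, role_id)
    );
    INSERT INTO role_type VALUES (1, 'seller'), (2, 'consumer'), (3, 'sales person');
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO person_role VALUES (1, 1), (1, 2), (2, 2);
""")


def roles_of(person_id):
    """List the role names of one person, alphabetically."""
    rows = conn.execute(
        """
        SELECT r.name
        FROM person_role AS pr
        JOIN role_type AS r ON r.id = pr.role_id
        WHERE pr.person_id = ?
        ORDER BY r.name
        """,
        (person_id,),
    ).fetchall()
    return [name for (name,) in rows]


ann_roles = roles_of(1)
bob_roles = roles_of(2)
print(ann_roles, bob_roles)
```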
It sounds like this is a case of trying to model inheritance in your relational database. Complex topic, discussed here and here.
It sounds like your "seller, consumer, sales person" will need lots of different attributes and relationships. A seller typically belongs to a department, has targets, is linked to sales. A consumer has purchase history, maybe a credit limit, etc.
If that's the case, I'd suggest "class table inheritance" might be the right solution.
That might look something like this.
create table user_account
(id int primary key,
username varchar not null,
password varchar not null
-- ....
);
create table buyer
(id int primary key,
user_account_id int not null references user_account(id),
credit_limit float not null
-- ....
);
create table seller
(id int primary key,
user_account_id int not null references user_account(id),
sales_target float
-- ....
);
To answer your other question - relational databases are optimized for joining tables. Decades of research and development have gone into this area, and a well-designed database (with indexes on the columns you're joining on) will show no noticeable performance impact due to joins. From practical experience, queries with hundreds of millions of records and ten or more joins run very fast on modern hardware.
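For illustration, a minimal sketch of the class table inheritance layout with an index on the join column. It assumes SQLite via Python's sqlite3 module rather than any particular production DBMS; the index name and the sample rows are made up. It only shows the shape of the join, not a performance measurement.

```python
import sqlite3

# Assumption: in-memory SQLite; index name and data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_account (
        id       INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE
    );
    CREATE TABLE buyer (
        id              INTEGER PRIMARY KEY,
        user_account_id INTEGER NOT NULL REFERENCES user_account(id),
        credit_limit    REAL NOT NULL
    );
    CREATE TABLE seller (
        id              INTEGER PRIMARY KEY,
        user_account_id INTEGER NOT NULL REFERENCES user_account(id),
        sales_target    REAL
    );
    -- Index the columns used in the joins.
    CREATE INDEX buyer_user_account_idx  ON buyer (user_account_id);
    CREATE INDEX seller_user_account_idx ON seller (user_account_id);

    INSERT INTO user_account VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO seller VALUES (1, 1, 50000.0), (2, 3, 75000.0);
    INSERT INTO buyer  VALUES (1, 2, 1000.0);
""")

# Shared attributes come from user_account, seller-only ones from seller.
sellers = conn.execute(
    """
    SELECT u.username, s.sales_target
    FROM user_account AS u
    JOIN seller AS s ON s.user_account_id = u.id
    ORDER BY u.username
    """
).fetchall()
print(sellers)
```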

Create/alter table for each new user/project

I am building a platform with two kinds of users: Users_A create projects with unique virtual coins associated, and Users_B can buy and exchange these coins.
The problem:
Approach 1: if I use one single table as a virtual wallet, each User_B ID will be a row, and each coin will be a column. This way, I have to ALTER the table each time a new project is created.
Approach 2: I create an electronic wallet (table) for every single User_B.
Which one of the two is worse/better in terms of performance?
Is there any other possible approach?
It's a bit unclear to me what exactly you are trying to model. But any model that requires ALTERing a table because you add new content to the database is flawed.
That sounds like a basic many-to-many relationship to me:
You definitely need a table for the users:
create table users
(
user_id integer primary key,
... other columns ...
);
and one for the different coins:
create table coin
(
coin_id integer primary key,
... other columns ...
);
You need a table for the projects. You said "unique virtual coins associated", so I assume one project deals with exactly one type of coins:
create table project
(
project_id integer primary key,
owner_user_id integer not null references users,
coin_id integer not null references coin,
... other columns ...
);
I am not sure what exactly you mean with "buy and exchange" coins, but you probably need something like a transfer table:
create table coin_transfer
(
from_user_id integer not null references users,
to_user_id integer not null references users,
project_id integer not null references project,
transfer_type text not null check (transfer_type in ('buy', 'exchange')),
amount numeric not null
);
You also mention a "wallet" that belongs to a user. You would never create one table for each wallet, instead a table that contains the information which user owns the wallet. Assuming each user would have one wallet for each coin type you'd need something like this:
create table wallet
(
wallet_id integer primary key,
owner_user_id integer not null references users,
coin_id integer not null references coin,
... other columns ...
);
The above is only a very rough sketch, because there is a lot of information missing from your question.
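To show why no ALTER is needed, here is a small sketch under a few assumptions: SQLite through Python's sqlite3 module stands in for the real database, the coin table gets a made-up name column, and the helpers create_project, transfer and balance are hypothetical. A new project with its coin is just two INSERTs, and a balance can be derived from the transfer rows.

```python
import sqlite3

# Assumption: in-memory SQLite; helper names and sample data are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY);
    CREATE TABLE coin (
        coin_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE project (
        project_id    INTEGER PRIMARY KEY,
        owner_user_id INTEGER NOT NULL REFERENCES users,
        coin_id       INTEGER NOT NULL REFERENCES coin
    );
    CREATE TABLE coin_transfer (
        from_user_id  INTEGER NOT NULL REFERENCES users,
        to_user_id    INTEGER NOT NULL REFERENCES users,
        project_id    INTEGER NOT NULL REFERENCES project,
        transfer_type TEXT NOT NULL CHECK (transfer_type IN ('buy', 'exchange')),
        amount        NUMERIC NOT NULL
    );
    INSERT INTO users VALUES (1), (2), (3);  -- 1 owns projects, 2 and 3 trade
""")


def table_count():
    return conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'"
    ).fetchone()[0]


def create_project(project_id, owner_id, coin_name):
    """A new project and its coin are plain rows; the schema is untouched."""
    with conn:
        cur = conn.execute("INSERT INTO coin (name) VALUES (?)", (coin_name,))
        conn.execute(
            "INSERT INTO project VALUES (?, ?, ?)",
            (project_id, owner_id, cur.lastrowid),
        )


def transfer(from_id, to_id, project_id, kind, amount):
    with conn:
        conn.execute(
            "INSERT INTO coin_transfer VALUES (?, ?, ?, ?, ?)",
            (from_id, to_id, project_id, kind, amount),
        )


def balance(user_id, project_id):
    """Coins received minus coins sent for one user and one project."""
    return conn.execute(
        """
        SELECT COALESCE(SUM(CASE WHEN to_user_id = :u THEN amount
                                 ELSE -amount END), 0)
        FROM coin_transfer
        WHERE project_id = :p AND (to_user_id = :u OR from_user_id = :u)
        """,
        {"u": user_id, "p": project_id},
    ).fetchone()[0]


tables_before = table_count()
create_project(10, 1, "alpha")
create_project(11, 1, "beta")
tables_after = table_count()

transfer(1, 2, 10, "buy", 100)
transfer(2, 3, 10, "exchange", 30)
transfer(1, 3, 11, "buy", 5)
print(balance(2, 10), balance(3, 10), balance(3, 11))
```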

Many-to-Many in Postgres?

I went with PostgreSQL because it is an ORDBMS rather than a standard relational DBMS. I have a class/object (below) that I would like to implement in the database.
class User{
int id;
String name;
ArrayList<User> friends;
}
Now, a user has many friends, so, logically, the table should be declared like so:
CREATE TABLE user_table(
id INT,
name TEXT,
friends TYPEOF(user_table)[]
)
However, to my knowledge, it is not possible to use a row of a table as a type (-10 points for PostgreSQL), so, instead, my array of friends is stored as integers:
CREATE TABLE user_table(
id INT,
name TEXT,
friends INT[]
)
This is an issue because elements of an array cannot reference other rows (only the array itself can). Added to this, there seems to be no way to load the whole user (that is to say, the user and all the user's friends) without doing multiple queries.
Am I using PostgreSQL wrong? It seems to me that the only efficient way to use it is by using a relational approach.
I want a cleaner object-oriented approach similar to that of Java.
I'm afraid you are indeed using PostgreSQL wrong, and possibly misunderstanding the purpose of Object-relational databases as opposed to classic relational databases. Both classes of database are still inherently relational, but the former provides allowances for inheritance and user-defined types that the latter does not.
This answer to one of your previous questions provides you with some great pointers to achieve what you're trying to do using the Postgres pattern.
Well, first off PostgreSQL absolutely supports arrays of complex types like you describe (although I don't think it has a TYPEOF operator). How would the declaration you describe work, though? You are trying to use the table type in the declaration of the table. If what you want is a composite type in an array (and I'm not really sure that it is) you would declare this in two steps:
CREATE TYPE ima_type AS ( some_id integer, some_val text);
CREATE TABLE ima_table
( some_other_id serial NOT NULL
, friendz ima_type []
)
;
That runs fine. You can also create arrays of table types, because every table definition is a type definition in Postgres.
However, in a relational database, a more traditional model would use two tables:
CREATE TABLE persons
( person_id serial NOT NULL PRIMARY KEY
, person_name text NOT NULL
)
;
CREATE TABLE friend_lookup
( person_id integer REFERENCES persons(person_id)
, friend_id integer REFERENCES persons(person_id)
, CONSTRAINT uq_person_friend UNIQUE (person_id, friend_id)
)
;
Ignoring the fact that the persons table has absolutely no way to prevent duplicate persons (what about misspellings, middle initials, spacing, honorifics, etc?; also two different people can have the same name), this will do what you want and allow for a simple query that lists all friends.
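As a sketch of that simple query, the example below loads a person together with all of their friends in one round trip. It assumes SQLite via Python's sqlite3 module in place of PostgreSQL, uses INTEGER PRIMARY KEY where the DDL above uses serial, and the helper load_user and the sample names are made up.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (
        person_id   INTEGER PRIMARY KEY,
        person_name TEXT NOT NULL
    );
    CREATE TABLE friend_lookup (
        person_id INTEGER NOT NULL REFERENCES persons(person_id),
        friend_id INTEGER NOT NULL REFERENCES persons(person_id),
        CONSTRAINT uq_person_friend UNIQUE (person_id, friend_id)
    );
    INSERT INTO persons VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cat');
    INSERT INTO friend_lookup VALUES (1, 2), (1, 3), (2, 1);
""")


def load_user(person_id):
    """Fetch one person and the names of all their friends with one query."""
    rows = conn.execute(
        """
        SELECT p.person_name, f.person_name
        FROM persons AS p
        LEFT JOIN friend_lookup AS fl ON fl.person_id = p.person_id
        LEFT JOIN persons AS f ON f.person_id = fl.friend_id
        WHERE p.person_id = ?
        ORDER BY f.person_name
        """,
        (person_id,),
    ).fetchall()
    if not rows:
        return None  # no such person
    return {
        "name": rows[0][0],
        # A person without friends yields one row with a NULL friend name.
        "friends": [friend for _, friend in rows if friend is not None],
    }


ann = load_user(1)
cat = load_user(3)
print(ann, cat)
```

The result maps back onto the User class from the question: name plus a list of friends, without an array column in the table.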

Is this kind of DB relation design favourable and correct? Should it be converted to a no-sql solution?

First of all, I did my research, but being rather a newbie I am not that well acquainted with the terminology, so I might have failed to find the correct terms. I beg your pardon in case of a possible duplicate.
Question #1:
I have a table consisting of ID [PK] and LABEL [Varchar 128]. Each record (row) here is unique. What I want is, to define relations between these LABELS.
Requisite:
There will be n groups, each group containing one or more of these LABELS. In each group, each LABEL can either exist or not exist (meaning a group does not have 2x of the same LABEL).
How should I define this relation?
I thought of creating another table with ID [PK] - Group ID [randomly assigned unique key] - LABEL_ID [ID of Labels table pointing to a single Label]
Is this correct and favourable? If a group has 10 LABELS then there will be 10 records with unique ID, same uniquely assigned Group ID and LABEL_ID pointing to LABELS table.
Question #2:
Should I let go of the relational solution (as described above) and opt for a NoSQL solution, where each group is stored on its own as a single entry in the database with an ID [PK] - Data [containing either labels or IDs of labels pointing to the Label table]?
If NoSQL is the way to go, how should I store this data?
a) Should I have ID - Data (containing Labels)?
b) ID - Data (containing IDs of Labels)?
Question #3:
If NoSQL solution here is the best way, which NoSQL database should I choose for this use case?
Thank you.
There's no real need for an ID column in this GroupLabels table:
CREATE TABLE GroupLabels (
GroupID int not null,
LabelID int not null,
constraint PK_GroupLabels PRIMARY KEY (GroupID,LabelID),
constraint FK_GroupLabels_Groups FOREIGN KEY (GroupID) references Groups,
constraint FK_GroupLabels_Labels FOREIGN KEY (LabelID) references Labels
)
By doing the above, we've automatically achieved a constraint - that the same label can't be added to the same group more than once.
With the above, I'd say it's a reasonably common SQL solution.
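A small sketch of that constraint in action, assuming SQLite through Python's sqlite3 module as a stand-in, a minimal Groups and Labels table (column names guessed from the question), and a hypothetical helper add_label. The composite primary key rejects the second attempt to put the same label into the same group, while the same label in another group is fine.

```python
import sqlite3

# Assumption: in-memory SQLite; Groups/Labels are reduced to what the
# example needs. "Groups" is quoted defensively because GROUPS is an SQL keyword.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE "Groups" (GroupID INTEGER PRIMARY KEY);
    CREATE TABLE Labels (
        ID    INTEGER PRIMARY KEY,
        Label VARCHAR(128) NOT NULL UNIQUE
    );
    CREATE TABLE GroupLabels (
        GroupID INTEGER NOT NULL REFERENCES "Groups",
        LabelID INTEGER NOT NULL REFERENCES Labels,
        CONSTRAINT PK_GroupLabels PRIMARY KEY (GroupID, LabelID)
    );
    INSERT INTO "Groups" VALUES (1), (2);
    INSERT INTO Labels VALUES (1, 'red'), (2, 'green');
""")


def add_label(group_id, label_id):
    """Return True if the label was added, False if a constraint rejected it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO GroupLabels (GroupID, LabelID) VALUES (?, ?)",
                (group_id, label_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first_time = add_label(1, 1)    # 'red' into group 1
second_time = add_label(1, 1)   # same label, same group: rejected
other_group = add_label(2, 1)   # same label, different group: allowed
print(first_time, second_time, other_group)
```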
There is too little information here to make recommendations on the question of "to SQL or not to SQL".
However, the relational approach would be as you describe, I think.
CREATE TABLE Group
(
GroupId int PRIMARY KEY
)
CREATE TABLE GroupLabel
(
GroupId int FOREIGN KEY REFERENCES Group,
LabelId int FOREIGN KEY REFERENCES Label,
UNIQUE (GroupId, LabelId)
)
CREATE TABLE Label
(
LabelId int PRIMARY KEY,
Value varchar(100) UNIQUE
)
Here, every label is unique, Many labels may be in each group and each label may be in many groups but each label can only be in each group once.
As @Damien_The_Unbeliever indicates, the Group table can be omitted if you don't need to store any additional attributes about each group by making the GroupId column on the GroupLabels table solely unique.
You might need to change the syntax slightly for whatever RDBMS you're using.