I would like to update a column in a table that contains names with several characters appended as a suffix. Is there a way to remove these characters from this specific column, given that there is no other condition to go on?
For example, I have names such as:
Joe Doe Nm, which I want to update to Joe Doe
John Crow Nkk, which I want to update to John Crow
George Stavrou_Ngf, which I want to update to George Stavrou
and other names in a similar style.
Regards
I am currently studying SQL normal forms.
Let's say I have the following table, where the primary key is userid:
userid  FirstName  LastName  Phone
1       John       Smith     555-555
1       Tim        Jack      432-213
2       Sarah      Mit       454-541
3       Tom        jones     987-125
The book I'm reading states that the following conditions must be true in order for a table to be in 1st normal form:
1. Rows contain data about an entity.
2. Columns contain data about attributes of the entities.
3. All entries in a column are of the same kind.
4. Each column has a unique name.
5. Cells of the table hold a single value.
6. The order of the columns is unimportant.
7. The order of the rows is unimportant.
8. No two rows may be identical.
9. A primary key must be assigned.
I'm not sure whether my table violates the 8th rule, "No two rows may be identical".
The first two records in my table
1 John Smith 555-555
1 Tim Jack 432-213
share the same userid. Does that mean that they are considered duplicate rows?
Or does "duplicate records" mean that every piece of data in the row has to be the same for the record to be considered a duplicate row, as in the example below?
1 John Smith 555-555
1 John Smith 555-555
EDIT1: Sorry for the confusion.
The question I was trying to ask is simple:
Is the table below in 1st normal form?
userid  FirstName  LastName  Phone
1       John       Smith     555-555
1       Tim        Jack      432-213
2       Sarah      Mit       454-541
3       Tom        jones     987-125
Based on the 9 rules given in the textbook, I think it is, but I wasn't sure whether rule 8, "No two rows may be identical", was being violated because two records use the same primary key.
The class textbook and the professor aren't really clear on this subject, which is why I am asking this question.
Or does "duplicate records" mean that every piece of data in the row has to be the same for the record to be considered a duplicate row, as in the example below?
They mean the latter of your two choices. Entire rows are what must be "identical". It's OK if two rows share the same values for one or more columns, as long as one or more columns differ.
That's because a relation holds a set of values that are tuples/rows/records, and a set is a collection of values that are all different.
But SQL & some relational algebras have different notions of "identical" in the case of NULLs compared to the relational model without NULLs. You should read what your textbook says about it if you want to know exactly what they mean by it. For example, under SQL's = comparison two rows that have NULL in the same column do not compare as equal, even though SELECT DISTINCT and UNION treat them as duplicates. (Point 9 might be summarizing something involving NULLs. It depends on the explanation in the book.)
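The set-based notion of "identical" described above can be sketched in Python. This is only an illustration under simplifying assumptions: rows are modelled as tuples, NULLs are ignored entirely, and the helper names `has_identical_rows` and `has_duplicate_key` are made up for this example.

```python
# Each row of the question's table is modelled as a tuple.
rows = [
    (1, "John", "Smith", "555-555"),
    (1, "Tim", "Jack", "432-213"),
    (2, "Sarah", "Mit", "454-541"),
    (3, "Tom", "jones", "987-125"),
]


def has_identical_rows(rows):
    """True if at least two rows agree on every column (rule 8 is violated)."""
    return len(set(rows)) < len(rows)


def has_duplicate_key(rows, key_index=0):
    """True if at least two rows share the value in the chosen key column."""
    keys = [row[key_index] for row in rows]
    return len(set(keys)) < len(keys)


print(has_identical_rows(rows))  # False: no two whole rows are the same
print(has_duplicate_key(rows))   # True: userid 1 appears twice

# Adding an exact copy of the first row is what rule 8 forbids.
with_copy = rows + [(1, "John", "Smith", "555-555")]
print(has_identical_rows(with_copy))  # True
```

The two checks are deliberately separate: sharing a userid does not make two rows identical, although it does mean that userid alone cannot uniquely identify a row in this data.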
PS
There's no single notion of what a relation is. There is no single notion of "identical". There is no single notion of 1NF.
Points 3-8 are better described as (poor) ways of restricting how to interpret a picture of a table to get a relation. Your textbook seems to be (strangely) making "1NF" a property of such an interpretation of a picture of a table. Normally we simply define a relation to be a certain thing, so if you have one then it has to have the defined properties. Then "in 1NF" applies to a relation & either means "is a relation" & isn't used further, or it means that certain further restrictions hold. A relation is a set of tuples/rows/records. In the kind of relation your points 3-8 describe, the tuples are sets of attribute/column/field name-value pairs, & the value paired with a name has to be of the type paired with that name in some schema/heading, which is a set of name-type pairs defined either as part of the relation or external to it.
Your textbook doesn't seem to present things clearly. Its definition of "1NF" is also idiosyncratic in that although 3-8 are mathematical, 1 & 2 are informal/heuristic (& 9 could be either or both).
I am trying to set up a regexp mapping so that when a user enters a specific word, the search is forced onto a specific other term in the table.
For instance, here are some simple example field values:
Bob
Bob Smith
Bob Jones
Sally
Sally Smith
Sally Jones
If I set
regexp_filter=Bob=>Bob Smith
to make sure that when a user simply enters Bob I push Bob Smith instead, and then run a SphinxQL search:
Select * from Index where MATCH('Bob')
I still get all the Bob records (in other words, it was not interpreted as Bob Smith).
However, if I set
regexp_filter=Bob=>Sally
Then Select * from Index where MATCH('Bob') returns all the Sally records.
I am "simply" trying to force the index to return the Bob Jones record should a user simply search on Bob.
FYI I did try
Select * from Index where MATCH('^Bob$')
and that returned NULL
I still get all the Bob records (in other words, it was not interpreted as Bob Smith).
Because the regex is applied to BOTH the query text and the document text during indexing. So your query becomes MATCH('Bob Smith'), but your documents also become
Bob Smith
Bob Smith Smith
Bob Smith Jones
... so your query matches all three documents, the same as if you didn't have the regex!
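The effect described above can be imitated in a few lines of Python. This is a rough simulation, not Sphinx itself: `apply_filter`, `matches` and `search` are hypothetical helpers, `re.sub` stands in for regexp_filter, and matching is reduced to "every query keyword appears in the document", which ignores Sphinx's real tokenization and ranking.

```python
import re

documents = ["Bob", "Bob Smith", "Bob Jones", "Sally", "Sally Smith", "Sally Jones"]


def apply_filter(text, pattern, replacement):
    """Stand-in for a regexp_filter rule: pattern => replacement."""
    return re.sub(pattern, replacement, text)


def matches(query, document):
    """Simplified MATCH(): every keyword in the query must occur in the document."""
    doc_words = set(document.split())
    return all(word in doc_words for word in query.split())


def search(query, documents, pattern, replacement):
    """Apply the same filter to the documents (at index time) and to the query."""
    filtered_query = apply_filter(query, pattern, replacement)
    return [
        doc
        for doc in documents
        if matches(filtered_query, apply_filter(doc, pattern, replacement))
    ]


# Bob => Bob Smith: the query becomes "Bob Smith", but so does every "Bob" document.
print(search("Bob", documents, "Bob", "Bob Smith"))
# ['Bob', 'Bob Smith', 'Bob Jones']

# Bob => Sally: the query becomes "Sally" and every "Bob" document is indexed as "Sally ...".
print(search("Bob", documents, "Bob", "Sally"))
```

In this model the second call returns all six documents, because the former Bob records are indexed under Sally as well.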
I am "simply" trying to force the index to return the Bob Jones record should a user simply search on Bob.
I wonder if you mean that it should NOT return that record?
... in which case you can use MATCH('^Bob$'), but remember to remove the regex. Or you could perhaps use MATCH('"^Bob$"'), with an extra set of quotes to use phrase mode.
Also try MATCH('^Bob$ ') with a space! Some versions of Sphinx have a bug unless there is a space after the $.
I have a field from the data I am reading in that can contain multiple values. They are essentially tags.
For example, there could be a column called "persons responsible". This could read "Joe; Bob; Sue" or "Sue" for a given row.
Is it possible from within Tableau to read these in as separate categories? So that for this sample data:
Project | Persons
---------------------------
Zeta | Bob; Sue; Joe
Enne | Sue
Doble Ve | Bob
There could be a count of Bob (2), Sue (2), Joe (1)?
I am working on getting better data inputs, but I was wondering if there was a temporary solution at this level.
I would definitely work towards normalizing your schema.
In the meantime, there is a workaround that is almost reasonable if there is a small set of possible values for the tags (persons in your example).
If Bob, Sue and Joe are the only people in the system, you can use the contains() function to define a boolean calculated field for each person, e.g. Bob_Is_Responsible = contains(Persons, 'Bob'), and similar fields for Sue and Joe. Then you could use those as building blocks, possibly with sets, to break the data up in different ways.
Of course, this approach gets cumbersome fast if the number of tags grows, or if it is unconstrained. But you asked for a temporary solution ...
If the number of elements is small, you can write and union several queries, each one returning the project and the nth element.
Ideally, you'd reshape your data to look like this, either in the database or with the above-mentioned union technique. Then you could count() or countd() the elements by project.
Project | Persons
---------------------------
Zeta | Bob
Zeta | Sue
Zeta | Joe
Enne | Sue
Doble Ve | Bob
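If the reshaping has to happen before the data reaches Tableau, one possible preprocessing step looks like the sketch below. It is written in Python purely as an illustration; the function name `split_tags` and the assumption that tags are separated by semicolons with optional spaces are mine, based on the sample data in the question.

```python
from collections import Counter

# Sample data from the question: one row per project, tags packed into one field.
data = [
    ("Zeta", "Bob; Sue; Joe"),
    ("Enne", "Sue"),
    ("Doble Ve", "Bob"),
]


def split_tags(rows, sep=";"):
    """Return one (project, person) row per tag, dropping empty entries."""
    long_rows = []
    for project, persons in rows:
        for person in persons.split(sep):
            person = person.strip()
            if person:
                long_rows.append((project, person))
    return long_rows


long_rows = split_tags(data)
for project, person in long_rows:
    print(f"{project} | {person}")

# Counting people across projects is then a plain tally over the long table.
counts = Counter(person for _, person in long_rows)
print(counts["Bob"], counts["Sue"], counts["Joe"])  # 2 2 1
```

The long table could then be written out (for example as CSV) and used as the data source instead of the original one.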
There are several transposing questions on Stack Overflow, but having looked at a few, none of them is really similar to my problem. The main difference is that they have a predefined set of columns.
Let's say my table looks like this:
ID Name Value
---------------
1 Set Mitch
2 Get Jane
3 Push Dave
4 Pull Mike
5 Dummy John
...
I'd like to transpose it to become:
Set    Get   Push  Pull  Dummy  ...
----------------------------------
Mitch  Jane  Dave  Mike  John   ...
It looks like you're looking for a "dynamic pivot table". See the example here, or Google that term for more information:
http://www.kodyaz.com/articles/t-sql-pivot-tables-in-sql-server-tutorial-with-examples.aspx
Do you need to do this in SQL? It seems pretty trivial if you can just do it after loading the result of a SELECT * query into an array that you can manipulate at will.
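As a rough sketch of doing it in application code, here is how the transposition might look in Python once the rows are in memory. The list `rows` stands in for the result of the SELECT, and the helper name `transpose` is made up; it assumes the rows arrive in the desired column order and that the Name values are unique.

```python
# Stand-in for the rows fetched by SELECT * (ID, Name, Value).
rows = [
    (1, "Set", "Mitch"),
    (2, "Get", "Jane"),
    (3, "Push", "Dave"),
    (4, "Pull", "Mike"),
    (5, "Dummy", "John"),
]


def transpose(rows):
    """Turn (ID, Name, Value) rows into a header list and a single row of values."""
    header = [name for _, name, _ in rows]
    values = [value for _, _, value in rows]
    return header, values


header, values = transpose(rows)

# Pad each column to the wider of its header and its value so the output lines up.
widths = [max(len(h), len(v)) for h, v in zip(header, values)]
print("  ".join(h.ljust(w) for h, w in zip(header, widths)))
print("  ".join(v.ljust(w) for v, w in zip(values, widths)))
```

Because the header is built from the data, no column names have to be known in advance, which is the part a fixed PIVOT clause cannot do.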
I have created a crosstab. However, if I include a first name field, it merges the rows together when two people have the same first name. How do you get it to display the name in each row? In the example below, Sarah is not displayed twice because the cells are merged together.
Firstname  Lastname
Judy       Collins
Sarah      Dane
           Smith
Joe        Dine
Mary       Lane
It sounds like you grouped your crosstab on the first name only. I would recommend:
Make a new Formula (Call it "FullName")
In the formula, combine the first name and last name, e.g. something like {First Name}&{Last Name}
Edit your crosstab to group by the FullName formula instead.
Does this help?
EDIT
Based on your comment, I don't think your problem is with Crystal. You need a "unique ID" of some sort, a distinct number for each person. My original suggestion was trying to use the user's full name as a unique ID, but that won't work if your dataset is big enough to include multiple people with the same name. Does your dataset have any kind of unique ID? What is this crosstab trying to display? There might be a better way.
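The difference between the two grouping keys can be shown outside Crystal with a small Python sketch. It is only an analogy for how a crosstab groups rows: `group_rows` is a hypothetical helper, and the second Sarah is assumed to be "Sarah Smith", which is how I read the merged cell in the question.

```python
from collections import defaultdict

# Names from the question's example, with the merged cell read as "Sarah Smith".
people = [
    ("Judy", "Collins"),
    ("Sarah", "Dane"),
    ("Sarah", "Smith"),
    ("Joe", "Dine"),
    ("Mary", "Lane"),
]


def group_rows(rows, key):
    """Group rows under the value returned by key(row), keeping first-seen order."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return dict(groups)


by_first = group_rows(people, key=lambda row: row[0])
by_full = group_rows(people, key=lambda row: row[0] + " " + row[1])

print(len(by_first))      # 4 groups: both Sarahs share one group
print(by_first["Sarah"])  # [('Sarah', 'Dane'), ('Sarah', 'Smith')]
print(len(by_full))       # 5 groups: one per person
```

Grouping on the full name would still merge two people who share both names, which is why a real unique ID is the safer key.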