CouchDB Merging Revisions (conflict) - merge

I'm deliberately causing a conflict by following the example on this page: http://guide.couchdb.org/draft/conflicts.html (Working with Conflicts).
Now there are two revisions, and CouchDB uses its own algorithm to decide which one wins, but I would like to keep both revisions and merge them.
For example, I have Phonenumber: 111 and Name: Jules on database A, and Phonenumber: 222 and Name: Jules on database B.
Is it possible to create a new document which keeps all information from the old and the new revision?
Or a new field like "NewPhonenumber: 222" alongside the other fields Phonenumber: 111 and Name: Jules?
I just want to keep both revisions no matter how.
I tried to write a view function, but I don't know how to get at the data from the conflicting revision.
function(doc) {
  if (doc._conflicts)
    emit(doc._id, doc);
  emit(doc._id, {oldNumber: doc.phonenumber, newNumber: doc.phonenumber, name: doc.name});
}
How can I replace "oldNumber: doc.phonenumber" with the number from the old revision?
Thanks!

The property doc._conflicts lists the revision IDs of the conflicting revisions of the doc, not their contents. You have to iterate over that list and fetch each revision (for example with ?rev=<revision id>) to grab the value you want.
If you do this in the view, your conflict resolution will never be stored, and a view function only sees the winning revision anyway. You have to send the resolved version of the doc to CouchDB as a new revision.
You can request the doc together with its conflicts by using the query parameter ?conflicts=true (more in the CouchDB documentation) and then store your decision as a new revision.
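The resolution step described above could be sketched roughly as follows. This assumes the winning revision was fetched with ?conflicts=true and each losing revision with ?rev=...; buildResolution and the otherNumbers field are hypothetical names chosen for illustration, not part of CouchDB. The function only builds the body for a POST to /<db>/_bulk_docs; sending it is left to whatever HTTP client is in use.

```javascript
// Sketch only: buildResolution and otherNumbers are made-up names.
// winner: the winning revision, fetched with ?conflicts=true
// losers: the losing revisions, each fetched with ?rev=<revision id>
function buildResolution(winner, losers) {
  // Copy the winner without the read-only _conflicts list.
  const { _conflicts, ...merged } = winner;

  // Keep every phone number that differs from the winner's.
  const numbers = losers
    .map((doc) => doc.phonenumber)
    .filter((n) => n !== undefined && n !== winner.phonenumber);
  merged.otherNumbers = [...new Set(numbers)];

  // Deleting the losing revisions is what actually clears the conflict.
  const deletions = losers.map((doc) => ({
    _id: doc._id,
    _rev: doc._rev,
    _deleted: true,
  }));

  // Body for POST /<db>/_bulk_docs
  return { docs: [merged, ...deletions] };
}

const winner = {
  _id: "jules",
  _rev: "2-bbb",
  name: "Jules",
  phonenumber: 222,
  _conflicts: ["2-aaa"],
};
const losers = [{ _id: "jules", _rev: "2-aaa", name: "Jules", phonenumber: 111 }];

const payload = buildResolution(winner, losers);
console.log(JSON.stringify(payload, null, 2));
```

Because the merged doc carries the winner's _rev, CouchDB stores it as the next revision on that branch, while the _deleted stubs close the losing branches.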

Related

How to handle future-dated records in PostgreSQL using EF Core

I am working on a microservices architecture for a payroll application.
ORM: EF Core
I have an Employee table where employee details (firstname, lastname, department, etc.) are stored in a jsonb column in PostgreSQL.
One of the use cases is that I may receive requests for future-dated changes. Example: an employee's designation changes next month, but I receive the request for that change in the current month.
I have two approaches to handle this scenario.
Approach 1:
When I get a future-dated record (effective date > current date), I will store it in a separate table, not in the employee master table.
I will create a console application that runs every day (cron), picks up the records that are due (effectivedate == currentdate), and updates the employee master table.
Approach 2:
Almost the same as approach 1, but instead of using a separate table for future-dated records, I will store them in the employee master table.
If I go with approach 2:
I need to delete the existing record when the effective date becomes the current date.
When I handle a GET request, I should return only the current record, not the future one. To achieve this, I need to add a condition that checks the effective date. All employee details are stored in the jsonb column, so I need to fetch all records, current and future-dated, and then filter out everything but the current one.
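As a rough illustration of the effective-date filter that approach 2 needs, here is a minimal sketch in plain JavaScript rather than EF Core. The record shape (effectiveDate as an ISO "YYYY-MM-DD" string, plus a details object) and the name currentRecord are assumptions made for the example. It picks the latest record whose effective date is not in the future, which is one possible reading of "current record".

```javascript
// Sketch only: the record shape and currentRecord are hypothetical.
// ISO "YYYY-MM-DD" dates sort correctly as plain strings.
function currentRecord(records, today) {
  // Drop everything that is still in the future.
  const due = records.filter((r) => r.effectiveDate <= today);
  if (due.length === 0) return undefined;
  // Of the remaining records, the most recent one is the current one.
  return due.reduce((latest, r) =>
    r.effectiveDate > latest.effectiveDate ? r : latest
  );
}

const history = [
  { effectiveDate: "2024-01-01", details: { designation: "Engineer" } },
  { effectiveDate: "2024-07-01", details: { designation: "Senior Engineer" } },
];

const before = currentRecord(history, "2024-06-15");
const after = currentRecord(history, "2024-07-01");
console.log(before.details.designation, "/", after.details.designation);
```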
I feel approach 1 is better. Please help me with this. I would also like to know about other approaches that may fit this use case.
Thanks,
Revathi

Relational or full object in MongoDB documents

I have a general MongoDB question as I have recently found an issue with how I store things.
Currently, there is a collection called spaces like this:
{
  _id: 5e1c4689429a8a0decf16f69,
  challengers: [
    5dfa24dce9cbc0180fb60226,
    5dfa26f46719311869ac1756,
    5dfa270c6719311869ac1757
  ],
  tasks: [],
  owner: 5dfa24dce9cbc0180fb60226,
  name: 'testSpace',
  description: 'testSpace'
}
As you can see, this has a challengers array, in which we store the ID of each User.
Would it be okay if, instead of storing the ID, I stored the entire User object, minus fields such as the password?
Or should I continue down this reference path of referring to the IDs of other documents?
The problem I have with this is that when I go through all the spaces that a user has, I want to see which members are part of each space (the challengers array). However, I obviously receive the IDs instead of names and emails. I am therefore struggling to send the correct data to the frontend (I have tried some manual manipulation without luck).
So, if I have to continue down the reference path, I will need to solve this problem somehow.
If it is okay to store the entire object in the array, it would be a lot easier.
HOWEVER, I want to do what is best practice.
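If the references are kept, the member details can be joined in just before the data goes to the frontend. MongoDB's $lookup aggregation stage or Mongoose's populate() can do this on the database side; the sketch below only shows the idea in plain JavaScript on in-memory data. The helper name withMembers and the user fields (name, email, password) are assumptions for illustration.

```javascript
// Sketch only: withMembers is a hypothetical helper working on in-memory data.
// usersById maps the string form of a user id to the user document.
function withMembers(space, usersById) {
  return {
    ...space,
    challengers: space.challengers.map((id) => {
      const user = usersById.get(String(id));
      // Expose only the safe fields; fall back to the bare id if the user is unknown.
      return user ? { _id: id, name: user.name, email: user.email } : { _id: id };
    }),
  };
}

const users = new Map([
  ["5dfa24dce9cbc0180fb60226", { name: "Ann", email: "ann@example.com", password: "secret" }],
  ["5dfa26f46719311869ac1756", { name: "Bob", email: "bob@example.com", password: "secret" }],
]);

const space = {
  name: "testSpace",
  challengers: [
    "5dfa24dce9cbc0180fb60226",
    "5dfa26f46719311869ac1756",
    "5dfa270c6719311869ac1757",
  ],
};

const hydrated = withMembers(space, users);
console.log(JSON.stringify(hydrated, null, 2));
```

The stored document keeps only IDs, so a user who changes their name or email is updated in one place.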
Thank you everyone!

TYPO3: How to check if a record is new or just a copy

I implemented two hooks (processDatamap_afterDatabaseOperations and processDatamap_postProcessFieldArray) to manipulate any record after saving.
My question is:
Every time I copy or create a record, the hooks are called with a "status" parameter that is always "new", no matter whether the record is actually new or just a copy of an existing record.
It seems like TYPO3 handles copies as new records.
How can I check if a record is actually a copy or a new record?
I am currently working with TYPO3 version 8.7.9.

You can use the t3_origuid field.
It should be added to your Extbase domain model.
After the "copy" command has been handled, the id of the original record is copied into this field.
So you can access it in the hooks processDatamap_preProcessFieldArray or processDatamap_postProcessFieldArray.
For example:
if (isset($fieldArray['t3_origuid'])) {
    // <your_code>
}

Should a tag be its own resource or a nested property?

I am at a crossroads deciding whether tags should be their own resource or a nested property of a note. This question touches a bit on RESTful design and database storage.
Context: I have a note resource. Users can have many notes. Each note can have many tags.
Functional Goals:
I need to create routes to do the following:
1) Fetch all user tags. Something like: GET /users/:id/tags
2) Delete tag(s) associated with a note.
3) Add tag to a specific note.
Data/Performance Goals
1) Fetching user tags should be fast. This is for the purpose of "autosuggest"/"autocomplete".
2) Prevent duplicates (as much as possible). I want tags to be reused as much as possible for the purpose of being able to query data by tag. For example, I'd like to mitigate scenarios where the user types a tag such as "superheroes" when the tag "superhero" already exists.
That being said, the way I see it, there are two approaches to storing tags on a note resource:
1) tags as a nested property. For example:
type: 'notes',
attributes: {
  id: '123456789',
  body: '...',
  tags: ['batman', 'superhero']
}
2) tags as their own resource. For example:
type: 'notes',
data: {
  id: '123456789',
  body: '...',
  tags: [1,2,3] // <= Tag IDs instead of strings
}
Either of the approaches above could work, but I am looking for a solution that allows for scalability and data consistency (imagine a million notes and ten million tags). At this point, I am leaning toward option #1 since it is easier to cope with code-wise, but it may not necessarily be the right option.
I am very interested in hearing some thoughts about the different approaches, especially since I cannot find similar questions on SO about this topic.
Update
Thank you for the answers. One of the most important things for me is identifying why using one over the other is advantageous. I'd like the answer to include somewhat of a pro/con list.
tl;dr
Considering your requirements, IMO you should store tags as resources and your API should return the notes with the tags as embedded properties.
Database design
Keep notes and tags as separate collections (or tables). Since you have many notes and many tags and considering the fact that the core functionality is dependent on searching/autocomplete on these tags, this will improve performance when searching for notes for particular tags. A very basic design can look like:
notes
{
  'id': 101, // noteid
  'title': 'Note title',
  'body': 'Some note',
  'tags': ['tag1', 'tag2', ...]
}
tags
{
  'id': 'tag1', // tagid
  'name': 'batman',
  'description': 'the dark knight',
  'related': ['tagx', 'tagy', ...],
  'notes': [101, 103, ...]
}
You can use the related property to handle duplicates: fill it with the ids of similar tags in place of tagx, tagy.
API Design
1. Fetching notes for user:
GET /users/{userid}/notes
Embed the tags within the notes object when you handle this route in the backend. The notes object your API sends should look something like this:
{
  'id': 101,
  'title': 'Note title',
  'body': 'Some note',
  'tags': ['batman'] // replacing tag1 with its name from the tags collection
}
2. Fetching tags for user:
GET /users/{userid}/tags
If it's not required, you can skip sending the notes property, which contains the ids of the notes.
3. Deleting tags for notes:
DELETE /users/{userid}/{noteid}/{tag}
4. Adding tags for notes:
PUT /users/{userid}/{noteid}/{tag}
As for performance, fetching tags for a user should be fast because the tags live in their own collection. Handling duplicates is also simpler, because you can just add the similar tags (by id or name) to the related array. Hope this was helpful.
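The "embed the tags when handling the route" step might look roughly like this. It is a sketch on in-memory data, assuming the notes/tags shapes from the design above; toApiNote is a hypothetical helper name, and a real backend would load the tags from its own store first.

```javascript
// Sketch only: toApiNote is a made-up helper; tagsById is a Map of tag id -> tag document.
function toApiNote(note, tagsById) {
  return {
    ...note,
    // Swap each stored tag id for the tag's name; keep the id if the tag is missing.
    tags: note.tags.map((id) => (tagsById.has(id) ? tagsById.get(id).name : id)),
  };
}

const tagsById = new Map([
  ["tag1", { id: "tag1", name: "batman", related: [], notes: [101] }],
  ["tag2", { id: "tag2", name: "superhero", related: [], notes: [101] }],
]);

const storedNote = { id: 101, title: "Note title", body: "Some note", tags: ["tag1", "tag2"] };

const apiNote = toApiNote(storedNote, tagsById);
console.log(apiNote.tags);
```

Renaming a tag then touches only its document in the tags collection; every note picks up the new name the next time it is served.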
Why not to keep tags as nested property
The design is not as scalable as the previous one. If the tags are a nested property and a tag has to be edited or some information has to be added to it, every note has to change, since multiple notes can contain the same tag. If the tags are kept as resources instead, the notes reference them by id and a single change in the tags collection/table is enough.
Handling duplicate tags might not be as simple as when they are kept as separate resources.
When searching for tags, you will need to search through all the tags embedded inside every note. This adds overhead.
The only advantage of using tags as a nested property, IMO, is that it makes it easier to add or delete tags for a particular note.
It might be a little bit complicated, so I can only share my experience of working with tags (in our case, they were a main feature of a VoIP app).
In any case, every tag will be a unique object that carries a lot of information. That makes it more complicated to transfer, but you will need this information, as in the example below. And sure, JSON is the fastest solution.
type: 'notes',
data: {
  id: '123456789',
  body: '...',
  tags: [UUID1, UUID2, UUID3]
}
Just as an example of how much information you might need: when you want to change the color or size of a tag based on its rating, its color based on usage count, linked (not identical) tags, duplicates, and so on.
type: 'tag',
data: {
  uuid: '234-se-324',
  body: 'superhero',
  linked: [UUID3, UUID4],
  rate: 4.6,
  usage: 4323,
  duplicate: ['superheros', 'suppahero']
}
As you can see, we even store duplicates, just to keep every tag unique. We also have logic that filters on word roots, but as the example above shows, we additionally use the duplicate values for special roots like "Superhero" and "Suppahero", which are the same for us.
You might think this is a lot of information for "autosuggest" or "autocomplete", but we never faced performance issues (as long as the server side is sane). All of this information matters for every use, and for notes in this case too.
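To make the role of the duplicate list concrete, here is a small sketch of how a typed tag could be mapped back to its canonical tag object. It assumes the tag shape from the example above (uuid, body, duplicate); findCanonicalTag is a hypothetical helper, and the word-root filtering mentioned above is not shown.

```javascript
// Sketch only: findCanonicalTag is a made-up helper.
// Returns the tag whose body or one of whose duplicates matches the input, else undefined.
function findCanonicalTag(input, tags) {
  const needle = input.trim().toLowerCase();
  return tags.find(
    (tag) =>
      tag.body.toLowerCase() === needle ||
      tag.duplicate.some((d) => d.toLowerCase() === needle)
  );
}

const knownTags = [
  { uuid: "234-se-324", body: "superhero", duplicate: ["superheros", "suppahero"] },
  { uuid: "777-ba-001", body: "batman", duplicate: [] },
];

const hit = findCanonicalTag(" Suppahero ", knownTags);
const miss = findCanonicalTag("joker", knownTags);
console.log(hit && hit.uuid, miss);
```

A note tagged with "Suppahero" would then store the uuid of "superhero" instead of creating a second tag.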
Saving tags as a nested property makes sense if you want to have all the data in the same row. Let me give you an example.
On an invoice you add items:
Title, description, price, qty, tax, ...
Tax in this case could be VAT 20%, so you calculate the invoice with 20%. But one day the tax changes to 22%, and every invoice already saved in the DB would come out 2% higher. To avoid this, you add a new column and save the rate as a raw number, 20; when you read that invoice from the DB, you get all the data from one row instead of calculating it from different tables or variables.
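The invoice example can be worked through in a few lines. This is only a sketch with made-up field names (priceCents, qty, taxPercent); amounts are kept in integer cents to avoid floating-point rounding.

```javascript
// Sketch only: field names are hypothetical; amounts are integer cents.
function invoiceTotalCents(invoice) {
  const net = invoice.items.reduce((sum, item) => sum + item.priceCents * item.qty, 0);
  // Use the rate stored on the invoice row, not a global "current" rate.
  return net + Math.round((net * invoice.taxPercent) / 100);
}

const savedInvoice = {
  items: [{ title: "Widget", priceCents: 10000, qty: 1 }],
  taxPercent: 20, // rate copied onto the invoice when it was created
};
const currentTaxPercent = 22; // the rate after the tax change

const stored = invoiceTotalCents(savedInvoice);
const recomputed = invoiceTotalCents({ ...savedInvoice, taxPercent: currentTaxPercent });
console.log(stored, recomputed);
```

With the rate stored on the row, the old invoice keeps its original total; recomputing it from the current rate would silently change it.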
The same goes for tags. If you ever want to merge duplicates, it's easier to do with IDs than with strings.
There are also some other factors you might consider.
In a social network, a user might have tags that are called skills, interests, sports, and more. There is no real way to differentiate between tags (from https://github.com/mbleigh/acts-as-taggable-on).
So if you are creating tags that will be attached to many things, you have to use IDs.

Can't access record id field

I've imported an Excel spreadsheet that has one-to-many relationship entries. For example, a business has one legal name but multiple locations with a DBA name for each location. There is a record for each DBA location. I'm filtering through the input data, creating a single entry for each legal business name in one table and a business location entry for each DBA location. I'm trying to manually assign the record ID of a legal business to each of its DBA business location records.
Here is my problem. When I try the following:
@dba_business.legal_business_id = @legal_business.id
I get the following error.
undefined method `id' for #<LegalBusiness::ActiveRecord_Relation:0x007fe1f2cc3770>
I tried the following, but @dba_business.legal_business_id ends up blank instead of holding the record ID value.
@dba_business.legal_business_id = @legal_business
Legal_Business is set up with has_many :dba_business and DBA_business with belongs_to :legal_business.
I used inspect to look at the attributes, logger.debug "LEGAL BUSINESS: #{@legal_business.inspect}", and you can clearly see the ID field defined as an attribute.
LEGAL BUSINESS: #<ActiveRecord::Relation [#<LegalBusiness id: 58722, user_id: nil, legal_name:.........
I'm using PostgreSQL 9.3, Rails 4.1, Ruby 2.1.1 with rvm. Any suggestions appreciated.
I recommend reading this: http://nofail.de/2013/10/debugging-rails-applications-in-development/
and then looking at your debug message:
LEGAL BUSINESS: #<ActiveRecord::Relation [#<LegalBusiness id: 58722, user_id: nil, legal_name:.........
HINT: where are brackets [] usually used?
I solved the problem. It was the result of a quick copy-and-paste: I had an "@" in the wrong place.
@object = @Model.create()
and it should have been
@object = Model.create()