Race condition in a URL shortener algorithm (Ruby/Rails/MongoDB/MongoMapper)

I created a URL shortener algorithm with Ruby + MongoMapper.
It's a simple URL shortener with a maximum of 3 characters:
http://pablocantero.com/###
Where each # can be [a-z], [A-Z] or [0-9], i.e. 62 possibilities per position, for 62^3 = 238,328 possible codes.
For this algorithm, I need to persist four attributes in MongoDB (through MongoMapper):
class ShortenerData
  include MongoMapper::Document

  # ('0'..'9') rather than (0..9), so VALUES holds only characters
  VALUES = ('a'..'z').to_a + ('A'..'Z').to_a + ('0'..'9').to_a

  key :col_a, Integer
  key :col_b, Integer
  key :col_c, Integer
  key :index, Integer
end
I created another class to manage ShortenerData and to generate the unique identifier:
class Shortener
  include Singleton

  def get_unique
    unique = nil
    # shortener_data.reload
    # some operations that can increment the attributes col_a, col_b, col_c and index
    # ...
    # shortener_data.save
    unique
  end
end
The Shortener usage:
Shortener.instance.get_unique
My question is: how can I make get_unique synchronized? My app will be deployed on Heroku, and concurrent requests can call Shortener.instance.get_unique.

I changed the behaviour to get the base-62 id. I created an auto-increment gem for MongoMapper, and with the auto-incremented id I encode to base 62.
The gem is available on GitHub: https://github.com/phstc/mongomapper_id2
# app/models/movie.rb
class Movie
  include MongoMapper::Document

  key :title, String

  # Here is the mongomapper_id2
  auto_increment!
end
Usage:
movie = Movie.create(:title => 'Tropa de Elite')
movie.id # BSON::ObjectId('4d1d150d30f2246bc6000001')
movie.id2 # 3
movie.to_base62 # d
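For reference, here is a minimal sketch of the base-62 encoding idea, using the same character order as the VALUES constant above so that 3 maps to "d" (an illustration only, not necessarily the gem's exact implementation):
# Same alphabet order as VALUES above: a-z, A-Z, 0-9
ALPHABET = ('a'..'z').to_a + ('A'..'Z').to_a + ('0'..'9').to_a

# Encode a non-negative integer into its base-62 string
def encode_base62(number)
  return ALPHABET.first if number.zero?
  encoded = ''
  while number > 0
    encoded = ALPHABET[number % 62] + encoded
    number /= 62
  end
  encoded
end

encode_base62(3) # => "d"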
Short url:
# app/helpers/application_helper.rb
def get_short_url(model)
  "http://pablocantero.com/#{model.class.name.downcase}/#{model.to_base62}"
end
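For the movie above, this gives:
get_short_url(movie) # => "http://pablocantero.com/movie/d"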
I solved the race condition with MongoDB's find_and_modify command http://www.mongodb.org/display/DOCS/findAndModify+Command:
model = MongoMapper.database.collection(:incrementor).
  find_and_modify(
    :query  => {'model_name' => 'movies'},
    :update => {'$inc' => {:id2 => 1}},
    :new    => true)

model[:id2] # returns the auto-incremented id
With this new behaviour I solved the race condition problem!
If you liked this gem, please help to improve it. You're welcome to make your contributions and send them as a pull request, or just send me a message: http://pablocantero.com/blog/contato

Related

How to end up with a new Postgresql column value if there is a mismatch

INSERT INTO main_parse_user ("user_id", "group_id", "username", "bio", "first_name")
VALUES (%s, %s, %s, %s, %s)
ON CONFLICT (user_id) DO UPDATE
SET (group_id, username, bio, first_name) =
    (EXCLUDED.group_id,
     EXCLUDED.username,
     coalesce(main_parse_user.bio, EXCLUDED.bio),
     EXCLUDED.first_name)
Here is the code I have now; in case of a conflict, it updates everything except bio (if the existing bio is empty, it updates it too).
There is a new requirement: when a new batch of data arrives, compare it with the old values; if the values differ, supplement the old value, and if the values do not differ, just leave it as it is.
EXAMPLE
OLD
id  bio
1   qwerty
NEW
id  bio
1   qwerty1
AFTER
id  bio
1   qwerty | 1
And if both bios in the old and new tables are the same, then do not touch the row.
What you are getting is precisely what the coalesce() function does. Since the new requirement is to supplement (I assume that means appending to the existing value), you can replace the coalesce(...) with:
case when main_parse_user.bio is distinct from EXCLUDED.bio
          and EXCLUDED.bio is not null
     then concat(trim(main_parse_user.bio), ' ', trim(EXCLUDED.bio))
     else main_parse_user.bio
end
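Putting it together, the full statement would look something like this (a sketch assembled from the two snippets above; untested against your schema):
INSERT INTO main_parse_user ("user_id", "group_id", "username", "bio", "first_name")
VALUES (%s, %s, %s, %s, %s)
ON CONFLICT (user_id) DO UPDATE
SET (group_id, username, bio, first_name) =
    (EXCLUDED.group_id,
     EXCLUDED.username,
     case when main_parse_user.bio is distinct from EXCLUDED.bio
               and EXCLUDED.bio is not null
          then concat(trim(main_parse_user.bio), ' ', trim(EXCLUDED.bio))
          else main_parse_user.bio
     end,
     EXCLUDED.first_name)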

How to get record by hash digest in Aerospike

Can I retrieve a record from the Aerospike database by a previously saved hash digest?
Here's an example of how to do it with the Aerospike client for Python. Client.get needs a valid key tuple, which can be (namespace, set, None, digest) instead of the more standard (namespace, set, primary-key).
>>> client = aerospike.client(config).connect()
>>> client.put(('test','demo','oof'), {'id':0, 'a':1})
>>> (key, meta, bins) = client.get(('test','demo','oof'))
>>> key
('test', 'demo', None, bytearray(b'\ti\xcb\xb9\xb6V#V\xecI#\xealu\x05\x00H\x98\xe4='))
>>> (key2, meta2, bins2) = client.get(key)
>>> bins2
{'a': 1, 'id': 0}
>>> client.close()
You need three things to locate a record in Aerospike: the namespace, the set name (if used; it can be null) and your key (the one you used initially, say a string or an integer). The Key object you pass to the get call comprises these three entities. The client library computes the hash from set + your key, then additionally uses the namespace to get the record. Aerospike only stores the hash (unless sendKey is set to true), but you need the namespace as well. So in your case, you can create the Key object that is passed to get() by specifying a namespace and the hash, and then pass that key object to get(), but you cannot use get() with just the hash and no namespace.
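For example, assuming you saved the digest bytearray from the earlier put as saved_digest (a name used only for this illustration), a later lookup needs just the namespace and set alongside it:
>>> # Rebuild a key tuple from the stored digest; the primary-key slot is
>>> # None because only the digest is known at this point
>>> key_with_digest = ('test', 'demo', None, saved_digest)
>>> (key, meta, bins) = client.get(key_with_digest)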

pymongo - ensureIndex and upserts

I have a simple dict that defines a base record as shown below:
record = {
    'h': site_hash,       # combination of date (below) and site id hashed with md5
    'dt': d,              # date - YYYYMMDD
    'si': data['site'],   # site id
    'cl': data['client'], # client id
    'nt': data['type'],   # site type
}
Then I call the following to update the record, inserting it if it doesn't exist:
collection.update(
    record,
    {'$inc': updates}, # updates contains values that increase, such as events: 1, actions: 1, etc.
    True # do upsert
)
I was wondering whether changing the above to the following would give better performance, since the code below only looks up existing 'h' values instead of h/dt/si/cl/nt, and I'd only need ensureIndex on the 'h' field. However, the $set would obviously execute every time, causing more writes to the record, as opposed to just the $inc.
record = {
    'h': site_hash, # combination of date (below) and site id hashed with md5
}
values = {
    'dt': d,              # date - YYYYMMDD
    'si': data['site'],   # site id
    'cl': data['client'], # client id
    'nt': data['type'],   # site type
}
collection.update(
    record,
    {'$inc': updates, '$set': values},
    True # do upsert
)
Does anyone have any tips or suggestions on best practice here?
If 'h' is already unique, then you can just create an index on 'h'; there's no need to index 'dt', 'si', etc. In that case I expect your first example to be a little more performant under very heavy load, for the somewhat obscure reason that it will create smaller entries in the journal.
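For example, the index for the second approach would be just (a minimal sketch; unique=True assumes 'h' really is unique per record):
collection.ensure_index('h', unique=True)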

Mongoid where greater than sum of two fields

Hi, I'm using mongoid (MongoDB) to do a greater-than criteria:
Account.where(:field1.gt => 10)
But I was wondering if it is possible to do a criteria where the sum of two fields is greater than some number. Maybe something like this (which doesn't seem to work):
Account.where(:'field1 + field2'.gt => 10)
Maybe some embedded JavaScript is needed? Thanks!
I'd recommend using the Mongoid 3 syntax as suggested by Piotr, but if you want to make this much more performant, at the expense of some storage overhead, you could try something like this:
class Account
  # ...
  field :field1, :type => Integer
  field :field2, :type => Integer
  field :field3, :type => Integer, :default => -> { field1 + field2 }

  index({ field3: 1 }, { name: "field3" })

  before_save :update_field3

  private

  def update_field3
    self.field3 = field1 + field2
  end
end
Then your query would look more like:
Account.where(:field3.gte => 10)
Notice the before_save callback to update field3 when the document changes. I also added an index for it.
You can use MongoDB's JavaScript query syntax.
So you can do something like:
Account.collection.find("$where" => '(this.field1 + this.field2) > 10')
Or in Mongoid 3 the following will work as well:
Account.where('(this.field1 + this.field2) > 10')
As Sammaye mentioned in the comment, this introduces a performance penalty, since the JavaScript has to be executed for every document individually. If you don't run that query often, that's OK. But if you do, I would recommend adding another field that is the aggregation of field1 and field2, and basing the query on that field (as in the denormalized field3 example above).

Sunspot Rails to order search results by model id?

Assume that I have the following model, which I have made searchable with sunspot_rails:
class Case < ActiveRecord::Base
  searchable do
  end
end
The standard schema.xml of Sunspot in Rails declares id as an indexed field. When I use the web interface to access Solr and test queries, a query like:
http://localhost:8982/solr/select/?q=id%3A%22Case+15%22&version=2.2&start=0&rows=10&indent=on
which searches for Cases with id equal to "Case 15", works fine and returns results.
The problem is when I carry out the search with Sunspot Rails in the rails console:
s = Case.search do
  keywords('id:"Case 15"')
end
I get:
=> <Sunspot::Search:{:fl=>"* score", :rows=>10, :start=>0, :q=>"id:\"Case 15\"", :defType=>"dismax", :fq=>["type:Case"]}>
which shows that it correctly puts the query value in :q, but the hits are 0:
s.hits
returns
=> []
If we assume that keywords is not equivalent, and only does a full-text search on the text fields rather than on the field named before the colon :, then I can try the following:
s = Case.search do
  with(:id, "Case 15")
end
but this fails with a Sunspot exception:
Sunspot::UnrecognizedFieldError: No field configured for Case with name 'id'
How can I search using the indexed standard solr/sunspot id field of my model?
And to make the question more useful: how can I order by the id? The following:
s = Case.search do
  keywords("xxxx")
  order_by :id, :desc
end
does not work either, failing with Sunspot::UnrecognizedFieldError: No field configured for Case with name 'id'.
The id that you are talking about is a Sunspot internal field and should not be used directly.
Why not add your own id field (with a different name, to avoid the collision):
class Case < ActiveRecord::Base
  searchable do
    integer(:model_id) { |content| content.id }
  end
end
and then:
s = Case.search do
  keywords("xxxx")
  order_by :model_id, :desc
end
The other (messy) option would be to hack the Solr params directly:
s = Case.search do
  keywords("xxxx")
  adjust_solr_params { |params| params[:sort] = 'id desc' }
end