I have been using HBase with some sample DNS data. My idea is quite simple: use HBase's revision/VERSION scheme to model a one-to-many relationship. This is mainly to simplify loading and management of the data. Here is a sample of my design:
A-Records table
ROW: www.example.com
IP: 1.1.1.1
TIMESTAMP: 1388922331000
VERSION : 1
IP: 1.1.1.7
TIMESTAMP: 1388940991000
VERSION: 2
My HBase table looks like this:
hbase> create 'a', 'ip'
hbase> alter 'a', { NAME => 'ip', VERSIONS => 100 }
When I query for all versions of www.example.com in the HBase shell using
hbase> get 'a', 'www.example.com', { COLUMNS => 'ip', VERSIONS => 100 }
I get all the results back. I can iterate through them in a RESTful API and give the user the experience of a one-to-many table.
Do HBase experts see any issues with this design?
Your design is OK. I know of some systems designed just like yours.
BTW, if you use Java, I am writing a framework that can map one row with multiple versions to a list of DOs, which may help you.
https://github.com/zhang-xzhi/simplehbase
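To make the trade-offs of this design concrete, here is a toy in-memory model of a versioned cell, written in Python. It is not the HBase client API; the class name VersionedCell and its methods are hypothetical. It only mimics two behaviours worth keeping in mind: a cell keeps at most VERSIONS values (older ones are dropped), and two writes with the same timestamp collapse into a single version.

```python
class VersionedCell:
    """Toy model of one HBase cell that keeps up to max_versions values."""

    def __init__(self, max_versions=100):
        self.max_versions = max_versions
        self._values = {}  # timestamp (ms) -> value

    def put(self, timestamp, value):
        # A second write with the same timestamp replaces the first one.
        self._values[timestamp] = value
        # Keep only the newest max_versions entries.
        for ts in sorted(self._values, reverse=True)[self.max_versions:]:
            del self._values[ts]

    def get(self, versions=1):
        # Newest first, similar to the shell's get with VERSIONS => n.
        newest = sorted(self._values.items(), reverse=True)
        return newest[:versions]


cell = VersionedCell(max_versions=2)
cell.put(1388922331000, "1.1.1.1")
cell.put(1388940991000, "1.1.1.7")
cell.put(1388940991000, "1.1.1.8")  # same timestamp: overwrites 1.1.1.7
cell.put(1388950000000, "1.1.1.9")  # third version: the oldest one is dropped
records = cell.get(versions=100)
```

In practice this means that a name with more than 100 A records, or two records loaded with the same timestamp, would lose entries under this design without any error.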
Can anyone tell me how to model a relationship in TYPO3? For example, I have two tables, 'A' and 'B'. Currently I have a simple form that inserts data into table 'A', whose fields are "name", "id_types", and "address". The "id_types" field is a foreign key to table 'B', whose fields are "id_types" and "types_name". How can I set up this relation in TYPO3?
Does it have something to do with persistence_object_identifier?
This is my code for manually adding a row to the second table:
public function createartistAction(Artist $artist)
{
    // Create the related type record first.
    $artisttype = new Artisttype();
    $artisttype->setArtisttype_name($artist->getArtisttype_id());
    $this->ArtisttypeRepository->add($artisttype);

    // Use the current time directly; date_create() would read a 'd/m/Y' string as month/day/year.
    $date = new \DateTime();
    $artist->setCreated_at($date);
    $artist->setUpdated_at($date);

    $this->addFlashMessage('Artist Created.');
    $this->ArtistRepository->add($artist);
    $this->redirectToUri('/artist/viewArtist');
}
Any help would be much appreciated.
Thanks.
I guess this is a good place to get started:
http://docs.typo3.org/flow/TYPO3FlowDocumentation/TheDefinitiveGuide/PartII/Modeling.html
All relationships in TYPO3 Flow are defined in the model layer of the MVC architecture, and the framework uses various design patterns to make this easy for the programmer. So in your case the best approach is not to touch the database directly at all, but to use what the PHP framework offers.
It's not a quick solution, and you may have to read a lot before you can start. But once you understand it, it's a pretty handy way to structure your data.
I am using a PostgreSQL database in my CakePHP project.
I have a table with some data and a column called "status".
"status" is an enum and can be "waiting", "in_progress", or "completed".
My script has to get the first record found with status=waiting, change its status to "in_progress", and also get the id of this record, all in one atomic operation.
The id is needed after the computation to change the status to "completed".
There will be many such scripts working in parallel, which is why I need this simple "row locking".
I am using PostgreSQL for the first time. Is there an easy way to accomplish this?
Maybe Cake supports some convenient way of doing this?
With CakePHP it makes no difference what kind of DB you have. Simply use $this->Model->find(...), modify the status, and then call $this->Model->save(...):
$row = $this->Model->find('first', array('conditions' => array('Model.status' => 'waiting')));
$row['Model']['status'] = 'in_progress';
$this->Model->save($row);
// ... do something ...
$row['Model']['status'] = 'completed';
$this->Model->save($row);
You probably want to run it in a loop and define constants for the statuses.
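Note that a separate find followed by a save is not atomic by itself: two workers can read the same waiting row before either one saves. One common way to close that gap is a conditional update that only succeeds if the row is still waiting. The sketch below shows the idea in Python, with the standard-library sqlite3 module standing in for PostgreSQL; the table name jobs and the function claim_next are made up for illustration. On PostgreSQL 9.5 or later, SELECT ... FOR UPDATE SKIP LOCKED is another option.

```python
import sqlite3


def claim_next(conn):
    """Claim one waiting row and return its id, or None if nothing is left."""
    while True:
        row = conn.execute(
            "SELECT id FROM jobs WHERE status = 'waiting' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status check in WHERE makes the claim a compare-and-set:
        # if another worker got there first, no row is updated.
        cur = conn.execute(
            "UPDATE jobs SET status = 'in_progress' "
            "WHERE id = ? AND status = 'waiting'",
            (row[0],),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]
        # Lost the race: loop and try the next waiting row.


# Hypothetical demo data: two waiting jobs in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO jobs (status) VALUES (?)", [("waiting",)] * 2)
conn.commit()

first = claim_next(conn)
second = claim_next(conn)
third = claim_next(conn)  # nothing left to claim
```

The returned id is what the worker would later use to set the status to "completed".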
I need to store benchmark runs for each nightly build. To do this, I came up with the following data model.
BenchmarkColumnFamily = {
    build_1: {
        (Run1, TPS) : 1000K
        (Run1, Latency) : 0.5ms
        (Run2, TPS) : 1000K
        (Run2, Latency) : 0.5ms
        (Run3, TPS) : 1000K
        (Run3, Latency) : 0.5ms
    }
    build_2: {
        ...
    }
    ...
}
To create such a schema, I came up with the following command in cassandra-cli:
create column family BenchmarkColumnFamily with
comparator = 'CompositeType(UTF8Type,UTF8Type)' AND
key_validation_class=UTF8Type AND
default_validation_class=UTF8Type AND
column_metadata = [
{column_name: TPS, validation_class: UTF8Type}
{column_name: Latency, validation_class: UTF8Type}
];
Does the above command create the schema I intend to create? The reason for my confusion is that when I insert data into the above CF using
set BenchmarkColumnFamily['1545']['TPS']='100';
it gets inserted successfully even though the comparator type is composite. Furthermore, even the following command executes successfully:
set BenchmarkColumnFamily['1545']['Run1:TPS']='1000';
What am I missing?
I don't think you're doing anything wrong. The CLI parses the strings into values based on the type, probably using org.apache.cassandra.db.marshal.AbstractType<T>.fromString(). For composite types, it uses ':' as the field separator (not that I've seen this documented, but I've experimented with Java code to convince myself).
Without a ':', it seems to just set the first part of the Composite, and leave the second as null. To set the second only, you can use
set BenchmarkColumnFamily['1545'][':NOT_TPS']='999';
From the CLI, dump out the CF:
list BenchmarkColumnFamily;
and you should see all the names (for all the rows), e.g.
RowKey: 1545
=> (column=:NOT_TPS, value=999, timestamp=1342474086048000)
=> (column=Run1:TPS, value=1000, timestamp=1342474066695000)
=> (column=TPS, value=100, timestamp=1342474057824000)
There is no way (via the CLI) to constrain the composite elements to be non-null or to take specific values; that's something you'd have to do in code.
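As a rough illustration of the behaviour described above, the following Python sketch splits a CLI column name on ':' into composite components. It is a toy model based only on the observations in this answer, not Cassandra's actual parser; parse_composite is a hypothetical helper, and representing an unset component as None is an assumption.

```python
def parse_composite(name, size=2):
    """Split 'a:b' into a tuple of `size` components; unset parts become None."""
    parts = name.split(":")
    if len(parts) > size:
        raise ValueError("too many components: %r" % name)
    # Pad absent trailing parts, then treat empty strings (as in ':NOT_TPS') as unset.
    parts += [""] * (size - len(parts))
    return tuple(p if p else None for p in parts)


names = ["TPS", "Run1:TPS", ":NOT_TPS"]
parsed = [parse_composite(n) for n in names]
```

A check such as requiring every component to be set is the kind of validation that would have to live in application code.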
Also, the column_metadata option for the CF creation is unnecessary, since you've already listed the default validation as UTF8Type.
The cassandra-cli tool is very limited in dealing with composites. Also, some unexpected things can happen in Cassandra with respect to validation of explicit, user-supplied composites. I don't know the exact answer for your situation, but I can tell you that you'll find this sort of model vastly easier to work with using the CQL 3 engine.
For example, your model could be expressed as:
CREATE TABLE BenchmarkColumnFamily (
    build text,
    run int,
    tps text,
    latency text,
    PRIMARY KEY (build, run)
);
INSERT INTO BenchmarkColumnFamily (build, run, tps, latency) VALUES ('1545', 1, '1000', '0.5ms');
See this post for more information about how that translates to the storage-engine layer.
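As a simplified picture of that translation, the Python sketch below groups CQL-style rows by partition key and turns every remaining column into a cell named by the pair (clustering value, column name). This reflects my understanding of the pre-3.0 storage layout and ignores internal marker cells; to_storage_cells is a hypothetical helper, and the second row is sample data in the spirit of the question.

```python
def to_storage_cells(rows, key="build", clustering="run"):
    """Group CQL-style rows into {partition key: {(clustering, column): value}}."""
    storage = {}
    for row in rows:
        cells = storage.setdefault(row[key], {})
        for column, value in row.items():
            # Key and clustering columns become part of the cell name, not cells.
            if column not in (key, clustering):
                cells[(row[clustering], column)] = value
    return storage


rows = [
    {"build": "1545", "run": 1, "tps": "1000", "latency": "0.5ms"},
    {"build": "1545", "run": 2, "tps": "1000", "latency": "0.5ms"},
]
storage = to_storage_cells(rows)
```

The result has one wide row per build, with composite cell names such as (1, 'tps'), which is close to the layout the original cassandra-cli schema was aiming for.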
In my keyspace I have:
posts = {
    # key
    'post1': {
        # columns and values
        'url': 'foobar.com/post1',
        'body': 'Currently has client support FOOBAR for the following programming languages..',
    },
    'post2': {
        'url': 'foobar.com/post2',
        'body': 'The table with the following table FOOBAR structure...',
    },
    # ... ,
}
How can I write a LIKE-style query in Cassandra to get all posts that contain the word 'FOOBAR'?
In SQL it would be SELECT * FROM POST WHERE BODY LIKE '%FOOBAR%', but what about Cassandra?
The only way to do this efficiently is to use a full-text search engine like https://github.com/tjake/Solandra (Solr-on-cassandra). Of course you can roll your own using the same techniques manually, but usually this is not called for.
Note that this is true for SQL databases too: they will turn LIKE '%FOO%' into a table scan unless you use a full-text search extension like PostgreSQL's tsearch2.
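To illustrate what "rolling your own" could mean, here is a minimal inverted-index sketch in Python. It is an in-memory illustration only; in Cassandra the index would typically live in its own column family keyed by term. The function names and the simple lowercase word tokenizer are assumptions, and the sketch matches whole words rather than arbitrary substrings as '%FOOBAR%' would.

```python
import re
from collections import defaultdict

posts = {
    "post1": {"url": "foobar.com/post1",
              "body": "Currently has client support FOOBAR for the following programming languages.."},
    "post2": {"url": "foobar.com/post2",
              "body": "The table with the following table FOOBAR structure..."},
    # post3 is a made-up row so that the search has something to exclude.
    "post3": {"url": "foobar.com/post3", "body": "Nothing relevant here"},
}


def build_index(posts):
    """Map each lowercased word to the set of post keys whose body contains it."""
    index = defaultdict(set)
    for key, columns in posts.items():
        for word in re.findall(r"\w+", columns["body"].lower()):
            index[word].add(key)
    return index


def search(index, word):
    """Return the sorted keys of posts containing the word (case-insensitive)."""
    return sorted(index.get(word.lower(), set()))


index = build_index(posts)
hits = search(index, "FOOBAR")
```

The index has to be updated whenever a post is written or changed, which is the bookkeeping a search engine such as Solandra handles for you.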
You might create another column family where the keys are the domains, and the values are the keys in your original column family. That way you could refer to records within a specific domain directly.
Cassandra 3.4 added support for LIKE in CQL (through SASI indexes), so it is finally available natively.
I have a DBIx::Class object representing an eBay auction. The underlying table has a description column which contains a lot of data. The description column is almost never used, so it's not included in the DBIx::Class column list for that table. That way, most queries don't fetch the auction description data.
I do, however, have one script that needs this column. In this one case, I want to access the contents of the description column as I would any other column:
$auction->description
How can I accomplish this without forcing all other queries to fetch the description column?
In older versions of DBIx::Class (not sure of the version number), the following used to work:
my $rs = $schema->resultset('Auctions');
my $lots = $rs->search(
undef,
{ '+select' => 'description', '+as' => 'description' },
);
That doesn't seem to work for row updates under modern versions of DBIx::Class. Trying it with an update such as
$auction->update({ description => '...'})
under DBIx::Class 0.08123 gives the following error: "DBIx::Class::Relationship::CascadeActions::update(): No such column description at ..."
Assuming that the script needing the extra column runs in its own process, you can do something like this:
my $rs = $schema->resultset('Auctions');
$rs->result_source->add_columns('description');
YourApp::Schema::Lots->add_columns('description');
YourApp::Schema::Lots->register_column('description');
Of course, that's a global change. After adding the column, other code in the same process will start fetching the description column in queries. Not to mention, it's kind of ugly.