A script that deletes all tables in HBase - command-line

I can tell HBase to disable and delete particular tables using:
disable 'tablename'
drop 'tablename'
But I want to delete all the tables in the database without hardcoding the names of any of the tables. Is there a way to do this? I want to do this through the command-line utility ./hbase shell, not through Java or Thrift.

disable_all and drop_all have been added as commands in the HBase Ruby shell (added in HBASE-3506). These commands take a regex of tables to disable/drop, and they ask for confirmation before continuing. That should make dropping lots of tables pretty easy without requiring outside libraries or scripting.
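For example, from within the hbase shell, to disable and drop every table (the shell prompts for confirmation; the '.*' regex matches all table names):
disable_all '.*'
drop_all '.*'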

I have a handy script that does exactly this, using the Python Happybase library:
import happybase
c = happybase.Connection()
for table in c.tables():
    c.disable_table(table)
    c.delete_table(table)
    print "Deleted: " + table
You will need Happybase installed to use this script; you can install it with:
sudo easy_install happybase

You can pipe commands to the bin/hbase shell command. From there you can use some scripting to grab the table names and pipe the disable/delete commands back to hbase.
i.e.
echo "list" | bin/hbase shell | ./filter_table_names.pl > table_names.txt
./turn_table_names_into_disable_delete_commands.pl table_names.txt | bin/hbase shell
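A rough sketch of the same idea using sed/awk in place of the two Perl scripts (the sed filter is approximate and depends on your HBase version's shell output format):
# list tables and strip the shell banner and summary lines (approximate filter)
echo "list" | bin/hbase shell | sed -e '1,/^TABLE$/d' -e '/row(s) in/,$d' > table_names.txt
# turn each table name into disable/drop commands and feed them back to the shell
awk -v q="'" '{print "disable " q $0 q; print "drop " q $0 q}' table_names.txt | bin/hbase shell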

There is a hack.
Open the $HBASE_HOME/lib/ruby/shell/commands/list.rb file and add the line below at the bottom of the command method.
return list
After that, the list command returns an array of the names of all tables.
Then you can do this:
list.each {|t| disable t;drop t}

I'm not deleting tables through the hbase shell; I delete them from the command line by:
- deleting my Hadoop distributed filesystem directory, then
- creating a new, clean Hadoop distributed filesystem directory, then
- formatting my Hadoop distributed filesystem with 'hadoop namenode -format', then
- running start-all.sh and start-hbase.sh
Reference:
http://hadoop.apache.org/common/docs/r0.20.1/api/overview-summary.html#overview_description
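Roughly, those steps as shell commands (a sketch only; the dfs directory path is a placeholder and comes from your Hadoop configuration, and Hadoop/HBase should be stopped first):
rm -rf /path/to/hadoop/dfs        # placeholder: your configured dfs.name.dir / dfs.data.dir
mkdir -p /path/to/hadoop/dfs
hadoop namenode -format
start-all.sh
start-hbase.sh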

If you're looking for something that will do this in a 'one-liner' via a shell script you can use this method:
$ echo 'list.each {|t| disable t; drop t}; quit;' | hbase shell
NOTE: The above is run from a Bash shell prompt. It echoes the commands into the hbase shell, loops through all the tables returned by the list command, disabling and dropping each one as it iterates through the array, and then quits.

Related

Creating Batch Files with PostgreSQL \copy Command in Jetbrains Datagrip

I'm familiarizing myself with the standalone version of Datagrip and having a bit of trouble understanding the different approaches to composing SQL via console, external files, scratch files, etc.
I'm managing by referencing the documentation, and am happy to figure things out as I go.
However, I'm trying to ingest CSV data into tables via batch files using the Postgres \copy command. Datagrip will execute this command without error but no data is being populated.
This is my syntax, composed and ran in the console view:
\copy tablename from 'C:\Users\username\data_file.txt' WITH DELIMITER E'\t' csv;
Note that the data is tab-separated and stored in a .txt file.
I'm able to use the import functions of Datagrip (via context menu) just fine but I'd like to understand how to issue commands to do similarly.
\copy is a command of the command-line PostgreSQL client psql.
I doubt that Datagrip invokes psql, so it won't be able to use \copy or any other “backslash command”.
You probably have to use Datagrip's import facilities. Or you start using psql.
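For example, roughly the same import run through psql from the command line instead (the host, database, and user here are placeholders; the \copy line is the one from the question):
psql -h localhost -d mydb -U myuser -c "\copy tablename from 'C:\Users\username\data_file.txt' WITH DELIMITER E'\t' csv"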
OK, but what about the SQL COPY command (https://www.postgresql.org/docs/12/sql-copy.html)?
How can I run something like that with Datagrip?
BEGIN;
CREATE TEMPORARY TABLE temp_json(values text) ON COMMIT DROP;
COPY temp_json FROM 'MY_FILE.JSON';
SELECT values->>'aJsonField' as f
FROM (select values::json AS values FROM temp_json) AS a;
COMMIT;
I tried replacing 'MY_FILE.JSON' with the full path, with a parameter (?), putting it in the sql directory, etc.
The Datagrip answer is:
[2021-05-05 10:30:45] [58P01] ERROR: could not open file '...' for reading : No such file or directory
EDIT :
I know why. RTFM! -_-
COPY with a file name instructs the PostgreSQL server to directly read from or write to a file. The file must be accessible by the PostgreSQL user (the user ID the server runs as) and the name must be specified from the viewpoint of the server.
Sorry.....
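In other words, a sketch of what does work: copy the file to a location on the database server that the server process can read, and use that server-side path (the path below is just an example):
COPY temp_json FROM '/var/lib/postgresql/import/MY_FILE.JSON';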

How to source multiple functions in psql?

I have a directory with at least 6 function files.
I'm using psql and I need to be able to source (initialize?) all the function files at once.
I'm sure making a single function that calls all the others, like SELECT fn1, fn2, isn't going to work.
Doing \i functions_folder/fn1.sql 6 times isn't ideal either.
So is there a way I can (maybe) \i functions/*.sql? Doing that currently gives me
functions/*.sql: No such file or directory
I'm using psql on postgresql 9.6.2
If you are using *nix OS:
postgres=# \! cat ./functions/*.sql > all_functions.sql
postgres=# \i all_functions.sql
For Windows, try to find the analogue of cat (as I remember, it is the copy command with some flags).
PS I have the feeling that it could be done by using backquotes:
Within an argument, text that is enclosed in backquotes (`) is taken as a command line that is passed to the shell. The output of the command (with any trailing newline removed) replaces the backquoted text.
Documentation
But I still have no idea how. Probably somebody more experienced will provide a hint...
Create a wrapper that contains the \i commands, and \i the wrapper.
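For example, the wrapper is just a plain file of \i commands (the file names below are examples):
-- functions/all_functions.sql
\i functions/fn1.sql
\i functions/fn2.sql
\i functions/fn3.sql
Then source the wrapper:
postgres=# \i functions/all_functions.sql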

Export and Append Data to CSV

I am trying to export data to an existing CSV file.
I have been using these methods to export data.
Microsoft.Jet.OLEDB.4.0
SQLCMD
Data Export Wizard
However, I cannot find any parameter/option to append the exported data to an existing file. Is there any way? Thanks.
Note: this answer is biased towards *nix operating systems; I'm not too familiar with Windows.
If you can run your SQL query via the command line, either
- using a scripting language with a library that creates an MSSQL connection (an example of this is a node.js program I authored, https://github.com/skilbjo/aqtl, but any tool will do), or
- using a Windows binary that runs something like sqlcmd from the command line,
then you can just pipe the output to the CSV file (>> appends rather than overwrites). For example:
$ node runquery.js myquery.sql >> existing_csv_file.csv
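If you're on Windows and the query is for SQL Server, a rough sqlcmd equivalent might look like this (the server and database names are placeholders, and the separator/no-header flags are worth double-checking against your sqlcmd version):
sqlcmd -S myserver -d mydb -E -i myquery.sql -s "," -W -h -1 >> existing_csv_file.csv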

How to create batch file to execute multiple DB2 queries

I have to run some DB2 SQL queries fairly frequently, and it takes a lot of time to do that manually. For that reason, I am planning to create a batch file to execute those DB2 SQL commands.
So, please let me know whether it is possible to create a Windows batch file to run a set of DB2 SQL queries.
You can save a .sql file to your hard drive, and execute it using the DB2 command line using:
db2 -vtf C:\path\to\somefile.sql
-v echoes the command text back to the command line
-t sets the statement terminator to ;. If you want to use something else (creating stored procedures for example), you can use -td__ where __ represents up to two characters you can use as the terminator. Alternatively, you can use --#SET TERMINATOR __ inside your batch file
-f tells the command line to load the commands from the file.
See other command line options here.
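A minimal sketch of such a batch file (the path is a placeholder; DB2 CLP commands normally need to run inside a DB2 Command Window, which db2cmd provides):
@echo off
rem run_queries.bat -- example only; the .sql file should begin with a CONNECT TO statement
db2cmd -c -w -i db2 -vtf C:\path\to\somefile.sql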

How to include MySQL database schema on GitHub?

Stackoverflow and MySQL-via-command-line n00b here, please be gentle! I've been looking around for answers to my question but could only find topics dealing with GitHubbing MySQL dumps (as in: data dumps) for collaboration or MySQL "version control" via GitHub, neither of which tells me what I want to know:
How does one include MySQL database schemas/information on tables with PHP projects on GitHub?
I want to share a PHP project on GitHub which relies on the existence of a MySQL database with certain tables. If someone wanted to copy/make use of this project, they would need to have these particular tables in place to make the script work (all tables but one are empty in the beginning and only get filled by the user over time, via the script; the non-empty table holds three values from the start). How does one go about this, what is common practice?
1. Would I just get a (complete) dump file of my own db/tables, then delete all the data parts (except for that one non-empty table), set all autoincrements to zero and then upload that .sql file to GitHub along with the rest of the project?
OR
2. Is it best/better practice to write a (PHP) script with which the (maybe not-so-experienced) user can create these tables without having to use mysqldump/command line magic?
If solution #1 is the way to go, would I include further instructions on how to use such a .sql file?
Sorry if my questions sound silly, but as I said above, I myself am new to using the command line for MySQL-related things and had only ever used phpMyAdmin until yesterday (when I created my very first dump file with mysqldump - yay!).
Common practice is to include an install script that creates the necessary tables, so solution #2 would be the way to go.
[edit] That script could ofc just replay a dump. ;)
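For the dump itself, a schema-only export plus the one pre-filled table is usually enough; a rough sketch (the user, database, and table names are placeholders):
mysqldump -u myuser -p --no-data my_database > schema.sql
mysqldump -u myuser -p --no-create-info my_database my_prefilled_table >> schema.sql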
You might also be interested in migrations: How to automate migration (schema and data) for PHP/MySQL application
If you also want to track database schema changes, you can use git hooks.
In the directory [your_project_dir]/.git/hooks, add/edit the script pre-commit:
#!/bin/sh -e
set -o errexit
# -- you can omit the next line if not using a versions table
version=`git log --tags --no-walk --pretty="format:%d" | sed 1q | sed 's/[()]//g' | sed s/,[^,]*$// | sed 's ...... '`
BASEDIR=$(dirname "$0")
# -- set directory where schema dump is placed
dumpfile=`realpath "$BASEDIR/../../install/database.sql"`
echo "Dumping database to file: $dumpfile"
# -- dump database schema
mysqldump -u[user] -p[password] --port=[port] [database-name] --protocol=TCP --no-data=true --skip-opt --skip-comments --routines | \
sed -e 's/DEFINER[ ]*=[ ]*[^*]*\*/\*/' > "$dumpfile"
# -- dump versions table and update core version according to last git tag
mysqldump -u[user] -p[password] --port=[port] [database-name] [versions-table-name] --protocol=TCP --no-data=false --skip-opt --skip-comments --no-create-info | \
sed -e 's/DEFINER[ ]*=[ ]*[^*]*\*/\*/' | \
sed -e "/INSERT INTO \`versions\` VALUES ('core'/c\\INSERT INTO \`versions\` VALUES ('core','$version');" >> "$dumpfile"
git add "$dumpfile"
# --- Finished
exit 0
Change [user], [password], [port], [database-name], [versions-table-name].
This script is executed automatically by git on each commit. When committing a tag, the new version is saved to the table dump under the tag name. If there are no changes in the database, nothing is committed. Make sure the script is executable :)
Your install script can take the SQL queries from this dump, and developers can easily track database changes.
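To make the hook executable (a one-time step, using the same path as above):
chmod +x [your_project_dir]/.git/hooks/pre-commit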