Slow importing into Google Cloud SQL - google-cloud-sql

Google Cloud SQL is my first real evaluation of MySQL as a service. I created a D32 instance, set replication to async, and disabled binary logging. Importing 5.5 GB of dump files from a GCE n1-standard-1 instance in the same zone took 97 minutes.
Following the documentation, the connection was made using the public IP address, even though the instances are in the same region and zone. I'm fully open to the possibility that I did something incorrectly. Is there anything immediately obvious that I should be doing differently?

We have been importing ~30 GB via Cloud Storage from zip files containing SQL statements, and this has been taking over 24 hours.
A big factor is the number of indexes you have on the given table.
To keep it manageable, we split the file into chunks of 200K SQL statements each, with each chunk inserted in one transaction. This lets us retry individual chunks in case of errors.
We also tried doing it via Compute Engine (the mysql command line) and in our experience that was even slower.
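For reference, a minimal sketch of the splitting step, assuming a plain-SQL dump with roughly one statement per line; the file and bucket names are placeholders:
#!/bin/bash
# Rough sketch: cut the dump into ~200K-line chunks (an approximation of 200K statements),
# wrap each chunk in a single transaction so a failed chunk can be retried, and upload it.
split -l 200000 -d --additional-suffix=.sql mydump.sql chunk-
for f in chunk-*.sql; do
  { echo "START TRANSACTION;"; cat "$f"; echo "COMMIT;"; } > "tx-$f"
  gsutil cp "tx-$f" gs://mybucket/chunks/
done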
Here is how to import one chunk and wait for it to complete. You cannot do this in parallel, as Cloud SQL only allows one import operation at a time.
#!/bin/bash

# Fetch a fresh OAuth access token from the GCE metadata server.
function refreshAccessToken() {
  echo "getting access token..."
  ACCESSTOKEN=`curl -s "http://metadata/computeMetadata/v1/instance/service-accounts/default/token" -H "X-Google-Metadata-Request: True" | jq ".access_token" | sed 's/"//g'`
  echo "retrieved access token $ACCESSTOKEN"
}

START=`date +%s%N`
DB_INSTANCE=$1
GCS_FILE=$2
SLEEP_SECONDS=$3

refreshAccessToken

# Start the import of one chunk.
CURL_URL="https://www.googleapis.com/sql/v1beta1/projects/myproject/instances/$DB_INSTANCE/import"
CURL_OPTIONS="-s --header 'Content-Type: application/json' --header 'Authorization: OAuth $ACCESSTOKEN' --header 'x-goog-project-id:myprojectId' --header 'x-goog-api-version:1'"
CURL_PAYLOAD="--data '{ \"importContext\": { \"database\": \"mydbname\", \"kind\": \"sql#importContext\", \"uri\": [ \"$GCS_FILE\" ]}}'"
CURL_COMMAND="curl --request POST $CURL_URL $CURL_OPTIONS $CURL_PAYLOAD"

echo "executing $CURL_COMMAND"
CURL_RESPONSE=`eval $CURL_COMMAND`
echo "$CURL_RESPONSE"
OPERATION=`echo $CURL_RESPONSE | jq ".operation" | sed 's/"//g'`
echo "Import operation $OPERATION started..."

# Poll the operation until it is no longer RUNNING.
CURL_URL="https://www.googleapis.com/sql/v1beta1/projects/myproject/instances/$DB_INSTANCE/operations/$OPERATION"
STATE="RUNNING"
while [[ $STATE == "RUNNING" ]]
do
  echo "waiting for $SLEEP_SECONDS seconds for the import to finish..."
  sleep $SLEEP_SECONDS
  refreshAccessToken
  CURL_OPTIONS="-s --header 'Content-Type: application/json' --header 'Authorization: OAuth $ACCESSTOKEN' --header 'x-goog-project-id:myprojectId' --header 'x-goog-api-version:1'"
  CURL_COMMAND="curl --request GET $CURL_URL $CURL_OPTIONS"
  CURL_RESPONSE=`eval $CURL_COMMAND`
  STATE=`echo $CURL_RESPONSE | jq ".state" | sed 's/"//g'`
  END=`date +%s%N`
  ELAPSED=`echo "scale=8; ($END - $START) / 1000000000" | bc`
  echo "Import process $OPERATION for $GCS_FILE : $STATE, elapsed time $ELAPSED"
done
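Assuming the script above is saved as import-chunk.sh (a name used here just for illustration), each chunk is then imported in sequence, for example:
# Instance name, GCS path and poll interval (seconds) are placeholders.
./import-chunk.sh mydbinstance gs://mybucket/chunks/tx-chunk-00.sql 60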

Related

Getting most recent execution by job via RunDeck API

Is there a more efficient way to get the most recent execution of every job in a RunDeck project than by (1) querying for the job list and then (2) querying for the max: 1 execution list of each job in serial?
The most efficient way to get every job's last execution is to get the job IDs, put them in a list, and then call the execution endpoint for each one, based on this answer.
I made a working bash example using jq; take a look:
#!/bin/sh
# protocol
protocol="http"
# basic rundeck info
rdeck_host="localhost"
rdeck_port="4440"
rdeck_api="40"
rdeck_token="RQvvsGODsP8YhGUw6JARriXOAn6s6OQR"
# project name
project="ProjectEXAMPLE"
# first get all jobs (of ProjectEXAMPLE project)
jobs=$(curl -s --location "$protocol://$rdeck_host:$rdeck_port/api/$rdeck_api/project/$project/jobs" --header "Accept: application/json" --header "X-Rundeck-Auth-Token: $rdeck_token" --header "Content-Type: application/json" | jq -r .[].id)
# just for debugging, print all job IDs
echo $jobs
# then iterate over the job IDs, extracting the last succeeded execution of each
for z in ${jobs}; do
  curl -s --location "$protocol://$rdeck_host:$rdeck_port/api/$rdeck_api/job/$z/executions?status=succeeded&max=1" --header "Accept: application/json" --header "X-Rundeck-Auth-Token: $rdeck_token" --header "Content-Type: application/json" | jq
done
Sadly, there is no direct way to do that in a single call.
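If you only need a couple of fields rather than the full execution objects, the jq filter at the end of the loop can be tightened; a sketch, assuming the JSON response wraps the results in an executions array (worth verifying against your Rundeck version):
# Print only the id and status of the most recent succeeded execution per job.
curl -s --location "$protocol://$rdeck_host:$rdeck_port/api/$rdeck_api/job/$z/executions?status=succeeded&max=1" \
  --header "Accept: application/json" \
  --header "X-Rundeck-Auth-Token: $rdeck_token" | jq '.executions[0] | {id, status}'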

keycloak group members causes LDAP query for each individual user

I have a Keycloak instance with a read-only LDAP federation. I add members to groups by searching for both GROUP_ID and USER_ID (which queries the LDAP server) and then using a PUT to add the USER to the GROUP (I use Python to parse the JSON):
GROUP_ID=$(curl -s -H "Authorization: bearer $ACCESS_TOKEN" \
  https://${KEYCLOAK_SERVER}/auth/admin/realms/myrealm/groups | python -c '
import json, sys, os
keycloak_data = json.load(sys.stdin)
GROUP = os.environ["GROUP"]
for g in keycloak_data:
    if g["name"] == GROUP:
        print(g["id"])
        sys.exit()
')
USER_ID=$(curl -s -H "Authorization: bearer $ACCESS_TOKEN" \
  "https://${KEYCLOAK_SERVER}/auth/admin/realms/myrealm/users?username=$USER" | python -c '
import json, sys, os
keycloak_data = json.load(sys.stdin)
USERID = os.environ["USER"]
for c in keycloak_data:
    if c["attributes"]["LDAP_ID"][0] == USERID:
        print(c["id"])
        sys.exit()
')
echo "Adding $USER to $GROUP GROUP_ID=$GROUP_ID USER_ID=$USER_ID"
curl -s -X PUT -H "Authorization: bearer $ACCESS_TOKEN" \
  https://${KEYCLOAK_SERVER}/auth/admin/realms/myrealm/users/$USER_ID/groups/$GROUP_ID
The above works great. But when I want to get a list of the members in that group, it queries LDAP for every user, which can take a long time if there are over 100 users in a group:
curl -s -H "Authorization: bearer $ACCESS_TOKEN" \
https://${KEYCLOAK_SERVER}/auth/admin/realms/myrealm/groups/$GROUP_ID/members
Is there any way to query the group membership without triggering an LDAP query for every member of the group? I have already tried changing the LDAP cache settings under the LDAP federation, and it doesn't seem to help.
I just want to know (quickly) which users are in a given Keycloak group.
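To measure it and pull out just the usernames, I wrap the same call like this (the jq filter assumes each member representation carries a username field, and max is just a pagination parameter):
time curl -s -H "Authorization: bearer $ACCESS_TOKEN" \
  "https://${KEYCLOAK_SERVER}/auth/admin/realms/myrealm/groups/$GROUP_ID/members?max=1000" | jq -r '.[].username'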

Trying to create a usergroup in Jira Cloud with REST API

I have an upcoming tool migration where I can import assignees, but not inactive ones, and there is no user group by default that contains only the active users.
So I've exported all Jira users and filtered them on the active flag, so I have a nice list of all their usernames/emails. Now I want to use the REST API to create a user group from the list and add each user to it.
From the API documentation, it's pretty straightforward:
curl --request POST \
--url '/rest/api/3/group/user' \
--header 'Authorization: Bearer <access_token>' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--data '{
"accountId": "384093:32b4d9w0-f6a5-3535-11a3-9c8c88d10192"
}'
However, I'm not about to type in the accountIds one by one. How can I feed it a list exported from Excel, or how else can I achieve this?
Easier than I thought: I just made a bash script where accountId is the variable to cycle through the addresses (sketched below).
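A minimal sketch of that loop, assuming the accountIds sit one per line in a plain text file and the group already exists; the domain, group name, file name, and the groupname query parameter are illustrative rather than taken from the question, so double-check them against the current API reference:
#!/bin/bash
# Hypothetical sketch: add each accountId listed in accountids.txt to an existing group.
JIRA_BASE="https://your-domain.atlassian.net"
GROUP_NAME="active-users"
while read -r ACCOUNT_ID; do
  curl --request POST \
    --url "$JIRA_BASE/rest/api/3/group/user?groupname=$GROUP_NAME" \
    --header 'Authorization: Bearer <access_token>' \
    --header 'Accept: application/json' \
    --header 'Content-Type: application/json' \
    --data "{ \"accountId\": \"$ACCOUNT_ID\" }"
done < accountids.txt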

Google Cloud SQL import database programmatically (without prompt)

I want to export a database and import the output into another database programmatically. This is what I have so far:
gcloud sql export sql instance_name gs://bucketname/db.gz --database=db_name
gcloud sql databases create new_db --instance=instance_name
gcloud sql import sql instance_name gs://bucketname/db.gz --database=new_db
Created database [new_db].
instance: instance_name
Data from [gs://bucketname/db.gz]
will be imported to [instance_name].
Do you want to continue (Y/n)
As you can see the prompt is the issue.
How can I import it without being prompted? Is there another way to import an export?
You can use the --quiet, -q parameter when running your gcloud command, as shown below:
gcloud sql import sql instance_name gs://bucketname/db.gz --database=new_db -q
The official gcloud reference documentation contains the following explanation of this parameter, in case you want to take a look at it:
--quiet, -q
Disable all interactive prompts when running gcloud commands. If input
is required, defaults will be used, or an error will be raised.
Overrides the default core/disable_prompts property value for this
command invocation. Must be used at the beginning of commands. This is
equivalent to setting the environment variable
CLOUDSDK_CORE_DISABLE_PROMPTS to 1.
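Putting it together, the whole sequence from the question can then run without any prompts; a quick sketch reusing the same placeholder names:
# Non-interactive export / create / import, using -q to suppress the confirmation prompt.
gcloud sql export sql instance_name gs://bucketname/db.gz --database=db_name -q
gcloud sql databases create new_db --instance=instance_name -q
gcloud sql import sql instance_name gs://bucketname/db.gz --database=new_db -q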
Additionally, as an alternative, you can perform the import/export tasks by sending authorized cURL requests directly to the service.
Importing:
ACCESS_TOKEN="$(gcloud auth application-default print-access-token)"
curl --header "Authorization: Bearer ${ACCESS_TOKEN}" \
--header 'Content-Type: application/json' \
--data '{"importContext":
{"fileType": "SQL",
"uri": "gs://[BUCKET_NAME]/[PATH_TO_DUMP_FILE]",
"database": "[DATABASE_NAME]" }}' \
-X POST \
https://www.googleapis.com/sql/v1beta4/projects/[PROJECT-ID]/instances/[INSTANCE_NAME]/import
Exporting:
ACCESS_TOKEN="$(gcloud auth application-default print-access-token)"
curl --header "Authorization: Bearer ${ACCESS_TOKEN}" \
--header 'Content-Type: application/json' \
--data '{"exportContext":
{"fileType": "SQL",
"uri": "gs://<BUCKET_NAME>/<PATH_TO_DUMP_FILE>",
"databases": ["<DATABASE_NAME1>", "<DATABASE_NAME2>"] }}' \
-X POST \
https://www.googleapis.com/sql/v1beta4/projects/[PROJECT-ID]/instances/[INSTANCE_NAME]/export
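Both calls return an operation; if you need to wait for it to finish, you can poll the v1beta4 operations endpoint. A sketch, assuming the operation name is taken from the "name" field of the JSON response and that the status values are PENDING/RUNNING/DONE (worth verifying against the current API reference):
# Poll the operation returned by the import/export call until it reports DONE.
OPERATION_NAME="<name field from the response above>"
STATUS="RUNNING"
while [ "$STATUS" != "DONE" ]; do
  sleep 30
  STATUS=$(curl -s --header "Authorization: Bearer ${ACCESS_TOKEN}" \
    https://www.googleapis.com/sql/v1beta4/projects/[PROJECT-ID]/operations/$OPERATION_NAME | jq -r '.status')
  echo "operation $OPERATION_NAME: $STATUS"
done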

Algolia: Delete multiple records from dashboard

How can I delete multiple records at once? Is it possible to select all records of, say, the "products" post_type and delete them, or does it have to be done one by one? (I'm not trying to clear all the records.)
Algolia's dashboard is not designed to be a complete graphical interface on top of the API; it's mostly there for convenience, understanding, and testing purposes, not for complete management of the data.
As soon as you start being limited by the dashboard, you should probably write a small script to achieve what you're trying to do.
Fortunately, it's been designed to be as easy as possible.
Here's how it would look with PHP:
First, let's create a small folder to hold the script.
mkdir /tmp/clear-algolia && cd /tmp/clear-algolia
If you don't have composer yet, you can simply install it in the current folder by running the commands described here.
If you've just installed it and want to use it only for this session:
alias composer="php composer.phar"
Then install Algolia using composer:
composer require algolia/algoliasearch-client-php
Write a small script along those lines:
<?php
// removeSpecific.php
require __DIR__ . '/vendor/autoload.php';
$client = new \AlgoliaSearch\Client("YOUR_APP_ID", "YOUR_ADMIN_API_KEY");
$index = $client->initIndex('YOUR_INDEX');
$index->deleteByQuery('', [ 'filters' => 'post_type:products' ]);
?>
Then run it:
php removeSpecific.php
And you're good to go! Next time you want to do an operation on your index, you'll only have to change the last line of the script to achieve what you want.
You can use the REST API.
It can be easier or faster to do it with Postman.
Here you can check a simple request: https://www.algolia.com/doc/rest-api/search/#delete-by
To first check what you are deleting, you can use:
curl --location --request POST 'https://[ApplicationID]-dsn.algolia.net/1/indexes/[IndexName]/query' \
--header 'X-Algolia-Application-Id: XXXXXXXXXXXX' \
--header 'X-Algolia-API-Key: XXXXXXXXXXXXXXXXXXXXXXXX' \
--header 'Content-Type: application/json' \
--data-raw '{
"params":"numericFilters=id<=9000"
}'
And to delete the records you can use:
curl --location --request POST 'https://[ApplicationID].algolia.net/1/indexes/[IndexName]/deleteByQuery' \
--header 'X-Algolia-Application-Id: XXXXXXXXXXXX' \
--header 'X-Algolia-API-Key: XXXXXXXXXXXXXXXXXXXXX' \
--header 'Content-Type: application/json' \
--data-raw '{
"params":"numericFilters=id<=8000"
}'
The "params" should receive a Search Parameter, you can find a list here: https://www.algolia.com/doc/api-reference/search-api-parameters/