Cannot start, stop, or enter a VE using OpenVZ

I'm using Debian Unstable with kernel 2.6.32-5-openvz-amd64 (but I don't think that's the problem).
After installing and running our VEs for several months, our hard disk was nearly full, so we added 3 more hard drives to build a new RAID 5 array, formatted it as ext4, and mounted it at /openvz.
I have a VE with ID 112, and I wanted to change its configuration so that its private area moves from /var/lib/vz/private/112 (1) to /openvz/112 (2).
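For reference, the change was roughly along these lines; the config path is the usual Debian location and the rsync flags are from memory, so treat this as a sketch rather than the exact commands I ran:
# point the VE's private area at the new location (config path assumed for Debian's vzctl)
sed -i 's|^VE_PRIVATE=.*|VE_PRIVATE="/openvz/112"|' /etc/vz/conf/112.conf
# copy the data across, preserving permissions, ownership and hard links
rsync -aH /var/lib/vz/private/112/ /openvz/112/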
After syncing all data from (1) to (2), I could not start VE 112. I reverted the configuration back to the original, but when I run vzctl status 112 it shows:
# vzctl status 112
VEID 112 exist mounted running
and I cannot enter the VE:
# vzctl enter 112
enter into VE 112 failed
I cannot stop or restart it either; it fails with an "Operation timed out" error.
I've tried many things: unmounting and re-mounting the private area, using MAKEDEV to create tty/pty devices, and vzctl chkpnt 112 --kill, but none of it works.
I don't want to reboot this server; it hosts 2 other VEs that are running fine. If anyone has faced the same problem, please let me know your solution.
Thank you very much,
--hung

Are you able to exec commands within your CT using 'vzctl exec'?
If that is possible, try
vzctl exec 112 ps aux
to check what is running within your CT.

If you cannot log in to your CT because /dev/pts is missing, you can mount it with 'vzctl exec':
vzctl exec 112 mount devpts /dev/pts -t devpts
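If that mount succeeds, entering the CT may work again:
vzctl enter 112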

The answer to my own question: I reformatted the new partition as ext3 and re-synced the data. Everything worked normally again :)

Have you tried starting the VPS in verbose mode? You can do that with:
vzctl --verbose start 112

Unable to connect via SSH, either through Cloud Shell or SCP

Before this error happened:
I have a VM on which I tried to change the permissions of all folders to 777, in order to get past an error during data transfer to Cloud Run.
That led to "sudo: /etc/sudo.conf is world writable sudo: /usr/bin/sudo must be owned by uid 0 and have the setuid bit set" when I used SSH.
I fixed it by mounting the affected disk on a temporary instance and changing the permissions back with:
chmod 755 /etc/sudo.conf
chmod 4755 /usr/bin/sudo
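In case it helps anyone else, the repair from the temp instance was roughly like this; the device name and mount point here are hypothetical, check lsblk on the rescue instance for the real ones:
# attach the broken disk to the temp instance, then:
sudo mount /dev/sdb1 /mnt/rescue     # hypothetical device name
sudo chmod 755 /mnt/rescue/etc/sudo.conf
sudo chmod 4755 /mnt/rescue/usr/bin/sudo
sudo umount /mnt/rescue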
Now I have 2 problems:
1. I am still not able to connect via SSH. I ran the troubleshooting checks and all ticks are green, and I did not have an IAP problem before. FTP doesn't work either (I used PuTTYgen to create a private key and then updated the VM's metadata).
2. The 20 GB disk became 65 GB. Is this what caused the problem? Is there any way to revert to 20 GB without damaging the disk?
Right now I can still access the site and it runs fine: https://www.nasavape.com

Lost ZFS pool and looking for ways to recover

On a Proxmox machine I noticed that backups of some of the VMs were failing, so I wanted to investigate.
While testing, the whole host stopped responding and I forced a reboot.
After the reboot I seem to have lost the whole data store.
Almost every zfs command results in a freeze: zpool status, zpool list, you name it, it locks up and you can't even Ctrl+C out of it.
I can still open a new SSH session and try other things, though.
In an attempt to see what was causing the commands to hang, I thought about running
zpool set failmode=continue storage-vm
hoping it would show me an error, but as you can guess, that command also hangs.
It's a pool created on two nvme drives. The original command to create the pool was
zpool create -f -o ashift=12 storage-vm /dev/nvme0n1 /dev/nvme1n1
My first thought was that one of the NVMe drives had gone bad, so I checked the SMART status, but it shows both drives as perfectly healthy.
Then, before trying anything else, I decided to back up the drives to an NFS share with dd:
dd if=/dev/nvme0n1 of=/mnt/pve/recovery/nvme0n1
dd if=/dev/nvme1n1 of=/mnt/pve/recovery/nvme1n1
Both commands completed, and on the NFS share I now have 2 images of exactly the same size (2 TB each).
Then I did a non-destructive read/write test with dd on both NVMe drives and got no errors.
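For what it's worth, the read half of that test was essentially just reading each device end to end and throwing the data away, something like:
dd if=/dev/nvme0n1 of=/dev/null bs=1M status=progress
dd if=/dev/nvme1n1 of=/dev/null bs=1M status=progress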
To rule out as much as possible, I built another Proxmox machine from spare hardware (same brand and type, etc.) and placed the drives in it.
On the new machine all zpool commands also hang. If I run zpool status with the drives removed from the motherboard, it does not hang, but obviously it has nothing to show.
So I placed the NVMe drives back in the original machine.
zdb -l /dev/nvme0n1 gives
failed to unpack label 0
failed to unpack label 1
failed to unpack label 2
failed to unpack label 3
which kind of worries me. It does the same for the other NVMe drive.
And now I'm running out of ideas. I have little knowledge of ZFS and don't know what can be done to save the data.
Obviously, the drives are not really dead, since SMART reports them as healthy and I can dd an image from them.
Faulty RAM or a faulty motherboard is also pretty much ruled out by the hardware swap.
Is there a way to recover at least some VMs from that storage?
Help/pointers will be greatly appreciated.
The issue was eventually solved and this is what I did.
Since the pool was made up of 2 NVMe drives, I created 2 loop devices from the dd images:
losetup -fP /mnt/pve/recovery/nvme0n1
losetup -fP /mnt/pve/recovery/nvme1n1
You can check the attached loop devices with lsblk and detach them with losetup -d /dev/loop[X].
Finally, I imported the pool from those loop devices into ZFS in read-only mode and was able to access/recover all my data:
zpool import -f -d /dev/loop0p1 -f -d /dev/loop1p1 -o readonly=on storage-vm
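From there it was a matter of listing what the pool contains and copying it off. Roughly, with the dataset name below being an example rather than my real one:
zfs list -r storage-vm     # see which datasets/zvols survived
# Proxmox VM disks are zvols, so once the pool is imported they show up as block devices
dd if=/dev/zvol/storage-vm/vm-100-disk-0 of=/mnt/pve/recovery/vm-100-disk-0 bs=1M status=progress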

Raspberry Pi Losing Mounted Drive After Reboot

Brand new to the world of Pi - so new that I had never even touched one until three days ago, and I know very little about Linux... I have a Western Digital MyBook plugged directly into my router, and I've found I'm able to mount it as a drive with the following command:
sudo mount -t cifs -o user=yourusername,passwd=yourpasswd,rw,file_mode=0777,dir_mode=0777 //mybookIP/public /mnt/mybook
Unfortunately, it seems to drop this mount whenever I reboot. Anyone have a suggestion on how to make this permanent?
Based on the comments here, this is what I did:
First, in Terminal I ran:
sudo nano /etc/fstab
Once that was opened, I added the line:
//mybookIP/public /mnt/mybook cifs _netdev,username=yourusername,password=yourpasswd 0 0
Once I saved this, I was able to reboot and the mounted drive was visible when everything loaded back up.
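If you want to test the fstab entry without rebooting, you can ask mount to process fstab directly; any mistake in the line shows up immediately:
sudo mount -a          # mounts everything listed in /etc/fstab
df -h /mnt/mybook      # confirm the share is mounted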

How can I increase the memory of my BeagleBone Black?

Hi everyone. I have been trying to increase the memory of my BeagleBone Black Rev C without success.
I have followed these instructions in order to increase the memory of my BBB with a 16 GB microSD card. I have already tried burning 2 different images, Debian 9.1 2017-08-31 4GB SD LXQT and Debian 8.7 2017-03-19 4GB SD LXQT (without flashing the eMMC).
The steps that I have been using are listed below.
What I first did was burn the image onto the microSD card using Etcher.
Then I inserted the microSD into the BBB, pushed the boot button, and plugged the board into my computer to power it on.
After that, I logged into my BBB using ssh and checked the Debian version, and it was correct, indicating that booting from the microSD card worked. But when I checked the disk space, I couldn't find the partition for the microSD.
As you can see in the image below, it is supposed to show the rootfs where I have the new BBB image plus the 16 GB of extra space, but I'm not able to see the extra partition. Does anyone know what I could possibly be doing wrong?
I was facing the same issue and ended up doing the following:
Log in to your BBB via ssh.
Run this command:
nano grow_partition.sh
Copy the code from here, then paste it into the editor.
Save the file by pressing Ctrl+O, then Enter.
Exit the nano editor by pressing Ctrl+X.
Make the script executable and run it: chmod +x grow_partition.sh && sudo ./grow_partition.sh
Reboot the BBB.
Enjoy :)
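To confirm the script actually grew the root partition, something like this after the reboot should show the root filesystem taking up (roughly) the whole card:
df -h /     # root filesystem should now be close to the size of the microSD card
lsblk       # shows the resized partition layout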
I have a BeagleBone Black (the original with 512 MB of memory) and I was able to use a different method to add swap memory successfully (unfortunately user3680704's method didn't work for me).
I got the idea from this post which basically says the following:
You can check your current memory with
free -h
And you can create the swap memory by running the commands below. Again, a more detailed explanation is in the link above, but in case that link ever goes dead you can follow these:
sudo fallocate -l 1G /swapfile
ls -lh /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
Next open the fstab file by running
sudo vi /etc/fstab
and add the following line to the file
/swapfile swap swap defaults 0 0
You can then check your swap by running
swapon --show
This worked well for me and added 1 GB of swap. You can add more or less by changing the 1G value.
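If you later decide you want a different amount of swap, the same file can be recreated; a rough sequence (the 2G is just an example):
sudo swapoff /swapfile
sudo fallocate -l 2G /swapfile    # resize the file to whatever you need
sudo mkswap /swapfile
sudo swapon /swapfile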

Postgres with Docker: Postgres fails to load when persisting data

I'm new to Postgres.
I updated the Dockerfile I use and successfully installed Postgresql on it. (My image runs Ubuntu 16.04 and I'm using Postgres 9.6.)
Everything worked fine until I tried to move the database to a Volume with docker-compose (that was after making a copy of the container's folder with cp -R /var/lib/postgresql /somevolume/.)
The issue is that Postgres just keeps crashing, as witnessed by supervisord:
2017-07-26 18:55:38,346 INFO exited: postgresql (exit status 1; not expected)
2017-07-26 18:55:39,355 INFO spawned: 'postgresql' with pid 195
2017-07-26 18:55:40,430 INFO success: postgresql entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-07-26 18:55:40,763 INFO exited: postgresql (exit status 1; not expected)
2017-07-26 18:55:41,767 INFO spawned: 'postgresql' with pid 197
2017-07-26 18:55:42,841 INFO success: postgresql entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-07-26 18:55:43,179 INFO exited: postgresql (exit status 1; not expected)
(and so on…)
Logs
It's not clear to me what's happening as /var/log/postgresql remains empty.
chown?
I suspect it has to do with the user. If I compare the data folder inside the container and the copy I made of it to the volume, the only difference is that the original is owned by postgres while the copy is owned by root.
I tried running chown -R postgres:postgres on the copy. The operation completed successfully; however, postmaster.pid remained owned by root, and I think that might be the issue.
Questions
How can I get more information about the cause of the crash?
How can I make it so that postmaster.pid is owned by postgres?
Should I consider running postgres as root instead?
Any hint welcome.
EDIT: links to the Dockerfile and the docker-compose.yml.
I'll answer my own question:
Logs & errors
What made matters more complicated was that I was not getting any specific error message.
To change that, I disabled the [program:postgresql] section in supervisord and, instead, started postgres manually from the command line (thanks to Miguel Marques for setting me on the right track with his comment).
Then I finally got some useful error messages:
2017-08-02 08:27:09.134 UTC [37] LOG: could not open temporary statistics file "/var/run/postgresql/9.6-main.pg_stat_tmp/global.tmp": No such file or directory
Fixing the configuration
I fixed the error above with the following, eventually adding these commands to my Dockerfile:
mkdir -p /var/run/postgresql/9.6-main.pg_stat_tmp
chown -R postgres:postgres /var/run/postgresql/9.6-main.pg_stat_tmp
(Kudos to this guy for the fix.)
To make the data permanent, I also had to do this, for the volume to be accessible by postgres:
mkdir -p /var/lib/postgresql/9.6/main
chmod 700 /var/lib/postgresql/9.6/main
I also used initdb to initialize the data directory. BEWARE! This will erase any data found in that folder. Like so:
rm -R /var/lib/postgresql/9.6/main/*
ls /var/lib/postgresql/9.6/main/
/usr/lib/postgresql/9.6/bin/initdb -D /var/lib/postgresql/9.6/main
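One detail worth noting: initdb refuses to run as root, so if you are root inside the container it has to be run as the postgres user, for example:
su postgres -c "/usr/lib/postgresql/9.6/bin/initdb -D /var/lib/postgresql/9.6/main"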
Testing
After the above, I could finally run postgres properly. I used this command to run it and test from the command-line:
su postgres
/usr/lib/postgresql/9.6/bin/postgres -D /var/lib/postgresql/9.6/main -c config_file=/etc/postgresql/9.6/main/postgresql.conf # as per the Docker docs
To test, I kept it running and then, from another prompt, checked everything ran fine with this:
su postgres
psql
CREATE TABLE cities ( name varchar(80), location point );
INSERT INTO cities VALUES ('San Francisco', '(-194.0, 53.0)');
SELECT * FROM cities; -- repeat this command after restarting the container to check that the data does persist
…making sure to restart the container and test again to check the data did persist.
And then finally restored the [program:postgresql] section in supervisord, rebuilt the image and restarted the container, making sure everything ran fine (in particular supervisord: tail /var/log/supervisor/supervisord.log), which it did.
(The command I used inside of supervisord.conf is also /usr/lib/postgresql/9.6/bin/postgres -D /var/lib/postgresql/9.6/main -c config_file=/etc/postgresql/9.6/main/postgresql.conf, as per this Docker article and other postgres+supervisord examples. Other options would have been using pg_ctl or an init.d script, but it's not clear to me why/when one would use those.)
I spent a lot of time on this. Hopefully the detailed answer will help someone down the line.
P.S.: I did end up producing a minimal example of my issue. If that can help anyone, here they are: Dockerfile, supervisord.conf and docker-compose.yml.
I do not know if this would be another way to achieve the same result (I'm new to Docker and Postgres too), but have you tried the official repository image for Postgres (https://hub.docker.com/_/postgres/)?
I'm getting the data out of the container by setting the environment variable PGDATA to '/var/lib/postgresql/data/pgdata' and binding it to an external volume in the run command:
docker run --name bd_TEST --network=my_network --restart=always -e POSTGRES_USER="superuser" -e POSTGRES_PASSWORD="myawesomepass" -e PGDATA="/var/lib/postgresql/data/pgdata" -v /var/local/db_data:/var/lib/postgresql/data/pgdata -itd -p 5432:5432 postgres:9.6
When the volume is empty, all the files are created by the image's startup script, and if they already exist, the database starts using them.
From past experience I can see what may be a problem. I can't say if this will help but it is worth a try.
I would have added this as a comment, but I can't because my rep isn't high enough.
I've spied a couple of problems with how you have structured the statements in your Dockerfile. You have installed various things multiple times and also run updates sporadically throughout the file. In my own files I've noticed that this can lead to somewhat random behaviour of my services and installations because of the different layers.
This may not seem to solve your problem directly, but cleaning up your file as outlined in the best practices has solved many Dockerfile problems for me in the past.
One of the first places to start when running into such problems is the best practices for RUN. This has helped me solve tricky problems in the past, and I hope it will solve yours, or at least make it easier to track down.
Pay special attention to this part:
After building the image, all layers are in the Docker cache. Suppose you later modify apt-get install by adding an extra package:
FROM ubuntu:14.04
RUN apt-get update
RUN apt-get install -y curl nginx
Docker sees the initial and modified instructions as identical and reuses the cache from previous steps. As a result the apt-get update is NOT executed because the build uses the cached version. Because the apt-get update is not run, your build can potentially get an outdated version of the curl and nginx packages.
After reading this I would start by consolidating all your dependencies.
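For the example above, the consolidated form the best-practices guide recommends is a single RUN instruction, e.g.:
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y curl nginx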
In my case, having the same error, I debugged it until I found out the disk was full; I increased the disk space to solve it.
(Stupid error, easy fix - maybe reading this here saves someone some wasted time.)
Also linking this question for other options:
Supervisord "exit status 1 not expected" running php script
https://serverfault.com/questions/537773/supervisor-process-exits-with-exit-status-1-not-expected/1076115#1076115