Ceph-OSD errors #49
I have tried to install this stack six times, with minor modifications here and there, and the ceph-osd charm never installs.

[juju status output: only the Model/App/Unit/Machine column headers survived extraction]
Logs from the first OSD are attached.
Hey there, looks like you're hitting this bug: https://bugs.launchpad.net/charm-ceph-osd/+bug/1776713, which actually looks more like a bug in LXD itself. GitHub bug: https://github.com/lxc/lxd/issues/4673. I'm guessing that if you're on Bionic you have LXD 3.0; you may want to try LXD 3.1 from the snap store:

$ sudo snap install --stable lxd

This is a temporary workaround for now.
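For reference, a fuller sketch of that workaround on a Bionic host, assuming the deb-packaged LXD 3.0 is currently installed (lxd.migrate moves existing containers into the snap; skip it on a fresh host):

# Install LXD 3.x from the snap's stable channel
$ sudo snap install --stable lxd

# If the deb LXD 3.0 was already in use, migrate its containers
# and config into the snap, then remove the deb packages
$ sudo lxd.migrate
$ sudo apt remove --purge lxd lxd-client

# Confirm the running version
$ lxd --version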
So I am not sure how this upgrade to LXD works in Ubuntu. From what I read, 3.1 is a "feature" update, and I think I also read that feature updates will not be added to the 18.04 LTS repos. How and when will updates to LXD land back in the main 18.04 repos? Rather than going down the path of installing and updating LXD with snap, I guess I can roll the ceph-osd charm back to revision 261 instead? Scratch that: I just looked in the charm store and only the current revision is available?
I ran into this problem yesterday for the first time and filed the bugs to get triage started, but after spending more time on it and speaking with the LXD folks, they tend to believe it could be some sort of race condition, specifically with charm revision 262. So I'm still running through some tests and looking at it here also. From what I can gather, udevadm burps when trying to reload a single rule that the charm imports, related to juju, but I'm not able to reproduce it when running it manually, so it may be more charm-related after finding that piece.

As you have discovered, running the older revision of the charm appears to resolve the issue, just as upgrading to LXD 3.1 did for me; if it's a race, that may explain why it worked with the upgraded LXD. Then again, I have NOT been able to get this working on a straight-up fresh install of Bionic on any host.

You can tell your bundle to install revision 261 simply by changing charm: cs:ceph-osd to charm: cs:ceph-osd-261 (see the sketch below). Granted, by doing so you may introduce new problems: even though you have validated that it installs, Queens-related features introduced in 262 may not work. You can probably sort through the commit log to figure that out if it's a problem. I'll let you know and update this bug if I find anything useful. ta
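For reference, a minimal sketch of that pin in bundle YAML, assuming the layout of the bundles in this repo; the unit count and osd-devices value are illustrative placeholders:

applications:
  ceph-osd:
    # Pin to revision 261 instead of tracking the latest cs:ceph-osd
    charm: cs:ceph-osd-261
    num_units: 3
    options:
      osd-devices: /dev/sdb    # illustrative; use your real device list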
So, I am not getting anywhere. I went back and removed juju and LXD, destroyed the controller, etc. Once I had an updated base 18.04 image without LXD and juju, I installed both via snap; I couldn't install juju from the deb package because it wanted to reinstall LXD 3.0.1 on top of 3.1. I completely reconfigured LXD and bootstrapped juju. All appeared good, LXD at 3.1 and juju at 2.3.8-bionic-amd64. Now when I run the juju deploy, everything just hangs and stays that way. I have done this four times with the same results.

udeadmin@udedemos:~/openstack-on-lxd$ juju status
[juju status output: only the App/Unit/Machine column headers were captured]
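For reference, a minimal sketch of the snap-based setup path described above, assuming a clean Bionic host; the controller name is an illustrative placeholder:

# LXD 3.x from the snap, juju from the snap (classic confinement required)
$ sudo snap install lxd
$ sudo lxd init                  # answer the prompts; a ZFS pool on a spare disk works well
$ sudo snap install juju --classic

# Bootstrap a controller on the local LXD cloud and check it
$ juju bootstrap localhost lxd-controller
$ juju status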
@glzavert did you run through the sysctl settings described in https://docs.openstack.org/charm-guide/latest/openstack-on-lxd.html? I have been unable to reproduce this last night and today: ceph-osd now successfully installs, configures, and sits idle in the ready state on vanilla Bionic installs, using the stock versions of LXD (3.0) and juju 2.3.8-bionic-arm64. A bit frustrating, as now I can't find steps to reproduce this reliably. Will update if I get any more valuable info.

$ lsb_release -a
ubuntu@hotdog:~/openstack-on-lxd$ uname -a
$ lxc --version
[command output and juju status table not captured; only the column headers survived]
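For anyone landing here, the sysctl tuning that guide refers to is along these lines; this is a sketch of the commonly recommended settings for running many LXD containers on one host, with an illustrative file path, so check the linked page for the authoritative list:

# /etc/sysctl.d/60-openstack-on-lxd.conf (illustrative path)
fs.inotify.max_queued_events = 1048576
fs.inotify.max_user_instances = 1048576
fs.inotify.max_user_watches = 1048576
vm.max_map_count = 262144
vm.swappiness = 1

$ sudo sysctl --system    # apply without a reboot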
@sfeole has anyone figured this out yet? I just had to rip and replace my environment because Horizon began to error every time I tried to edit metadata, with "Unable to retrieve the namespaces" errors, leaving me with only the CLI for doing anything. I did a complete redeploy: with charm: cs:ceph-osd-262 it errors every time, while with charm: cs:ceph-osd-261 I can get it to install.
I have to say I am more than a little frustrated here. I continue to be unable to consistently get a functioning OpenStack deployment, even by following the basic openstack-on-lxd instructions. I will try to explain what I am doing and what I get, and upload the applicable files, to the best of my ability. I HOPE SOMEONE WILL ACTUALLY LOOK AT THIS AND HELP!

So, I have an Ubuntu 18.04 VM, up to date, with 32G of memory and 8 cores. I have set up and fully tested LXD with a storage pool that uses a 3.4TB partitioned, unmounted, and unformatted block device (i.e. /dev/sdb1) for the ZFS storage pool; the host OS is on a separate 500G device. I have used both LXD v3.0.1 and v3.2 with the exact same results.

If I use juju to deploy bundle-bionic-queens.yaml, or any variant of the yaml without specifying the versions of the charms, the entire deployment hangs somewhere in the middle and I see kernel messages about hung requests with 120-second timeouts. Again, this happens every time, regardless of whether I use a new VM with a clean 18.04 install or an existing VM that has worked before.

Since this is Bionic I used "default-series: bionic" in the config.yaml, although I have also used xenial with the same results. Also, since this is Bionic, I have to use "ppa:openstack-ubuntu-testing/queens" for the repository. I have modified the deployment files in order to create the model and relationships I want. I also changed ceph to use bluestore and changed the replication factor from 3 to 1 in all the charms that expose it as an option, except gnocchi, where it is not available in the charm (though it should be). The reason for this is that after I get ceph working, I plan to use erasure coding and a caching tier to maximize storage resources.

To date, the only way to get the deployment to complete is to pin the charm versions in at least the ceph charms and then manually have juju upgrade the charms afterwards (see the sketch after this comment). The results are the same, upgraded or not.

If I use my bundle-bionic-queens-kvm-lxd2.yaml deployment I get the following:

udeadmin@ude:~/openstack-on-lxd$ juju status
[juju status output: only the App/Unit/Machine column headers were captured]

Attachments: bundle-bionic-queens-kvm-lxd2.yaml.txt, bundle-bionic-queens-kvm-lxd.yaml.txt

I have run bundle-bionic-queens-kvm-lxd.yaml in the past and had everything run cleanly, but now it does the same as bundle-bionic-queens-kvm-lxd2.yaml. My hope is that someone is actually maintaining this and can help identify the issue and how to fix it.
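For reference, a minimal sketch of the pin-then-upgrade workflow described above, assuming juju 2.3 and a bundle with the ceph charms pinned to known-good revisions; the bundle file name is taken from this comment, and ceph-mon is shown on the assumption it was pinned too:

# Deploy with the ceph charms pinned to the known-good revisions
$ juju deploy ./bundle-bionic-queens-kvm-lxd2.yaml

# Wait for the deployment to settle, then move the pinned charms
# forward to the latest store revisions
$ juju upgrade-charm ceph-osd
$ juju upgrade-charm ceph-mon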
This install was working, but now it has stopped. Every time I install, the ceph-osd units error.
Results of "ceph status" on the OSDs:
2018-06-19 14:27:57.194686 7fd4c4883700 -1 Errors while parsing config file!
2018-06-19 14:27:57.194733 7fd4c4883700 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
2018-06-19 14:27:57.194735 7fd4c4883700 -1 parse_file: cannot open ~/.ceph/ceph.conf: (2) No such file or directory
2018-06-19 14:27:57.194736 7fd4c4883700 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
Error initializing cluster client: ObjectNotFound('error calling conf_read_file',)
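Those errors only say that the ceph CLI cannot find a ceph.conf anywhere, i.e. the charm never finished configuring the unit. A couple of commands for digging into why, with illustrative unit names:

# Confirm the config really is absent on the unit
$ juju ssh ceph-osd/0 -- ls -l /etc/ceph/

# Replay the charm's hook output to find the underlying failure
$ juju debug-log --include ceph-osd/0 --replay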