Partial etcd Recovery on OpenShift / Kubernetes

Hostnames have been removed to protect the innocent.
Thank God for me and my science oven

The Truth Please! What actually happened?

Ok, you got me. In this case I was the user. I deleted some pretty important stuff in a pretty important namespace/project, which caused some pretty nasty things to happen.

Much mem, such CPU
$ oc delete secrets --all -n openshift-infra
$ oc get secrets -n openshift-infra | wc -l
168

Get On With it!

The goal was to pull only the information I wanted out of etcd and into a usable YAML format, so that I could simply add it back to the cluster.

$ scp master01-from-important-client:/var/lib/etcd/snapshot.db /tmp/
snapshot.db 100% 166MB 3.6MB/s 00:45
$ dnf install etcd
dnf install etcd (I am doing screenshots because the code paste just looks ugly)
$ etcdctl snapshot restore /tmp/snapshot.db 
Proof I am not lying
$ ls -la /var/lib/etcd/
total 4
drwxr-xr-x. 3 etcd etcd 26 Jun 11 13:41 .
drwxr-xr-x. 70 root root 4096 Jun 11 11:08 ..
drwx------. 3 etcd etcd 20 Jun 11 13:41 default.etcd
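The listing above shows the restored data under /var/lib/etcd/default.etcd. When no data directory is given, etcdctl snapshot restore writes to default.etcd relative to the current working directory, so it can help to pass the target explicitly. The following is a minimal sketch of that idea, not the exact command used here: restore_snapshot is a hypothetical helper name, and setting ETCDCTL_API=3 is only needed on older etcdctl builds where the v3 API is not the default.

```shell
# Hypothetical helper: restore a snapshot into an explicit data directory.
# Refuses to run if the snapshot is missing or the target already exists.
restore_snapshot() {
    snapshot=$1
    data_dir=$2

    if [ ! -f "$snapshot" ]; then
        echo "snapshot not found: $snapshot" >&2
        return 1
    fi
    if [ -e "$data_dir" ]; then
        echo "refusing to overwrite existing data dir: $data_dir" >&2
        return 1
    fi

    # --data-dir tells etcdctl where to write the restored member data.
    ETCDCTL_API=3 etcdctl snapshot restore "$snapshot" --data-dir "$data_dir"
}
```

It could be called as restore_snapshot /tmp/snapshot.db /var/lib/etcd/default.etcd, assuming that directory does not exist yet.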
$ etcd --name default --listen-client-urls http://localhost:2379  --advertise-client-urls http://localhost:2379 --listen-peer-urls http://localhost:2380
etcd up!
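Before querying the restored instance, it can be worth confirming that it actually answers. This is a rough sketch using etcdctl endpoint health; wait_for_etcd and the WAIT_SECONDS variable are hypothetical names introduced for illustration, and the endpoint is assumed to be the http://localhost:2379 client URL from the etcd command above.

```shell
# Hypothetical helper: poll the local etcd until it reports healthy.
# $1 = client endpoint (default http://localhost:2379)
# $2 = number of attempts (default 10)
wait_for_etcd() {
    endpoint=${1:-http://localhost:2379}
    attempts=${2:-10}
    i=1
    while [ "$i" -le "$attempts" ]; do
        if ETCDCTL_API=3 etcdctl --endpoints="$endpoint" endpoint health >/dev/null 2>&1; then
            echo "etcd is healthy at $endpoint"
            return 0
        fi
        # WAIT_SECONDS is an assumed tuning knob, one second by default.
        sleep "${WAIT_SECONDS:-1}"
        i=$((i + 1))
    done
    echo "etcd did not become healthy at $endpoint" >&2
    return 1
}
```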
$ etcdctl get / --prefix --keys-only
$ etcdctl get / --prefix --keys-only | wc -l
12932
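A raw line count gives only a rough feel for what is in the snapshot, and some etcdctl versions print a blank line after every key with --keys-only, which inflates it. As a sketch of a more informative summary, the keys can be grouped by their first two path segments. count_keys_by_type is a hypothetical helper, and it assumes keys shaped like /kubernetes.io/<type>/..., as in this cluster.

```shell
# Hypothetical helper: read etcd keys on stdin and count them per
# "<prefix>/<resource type>", most frequent first. Blank lines are ignored.
count_keys_by_type() {
    awk -F/ 'NF >= 4 { n[$2 "/" $3]++ }
             END { for (t in n) print n[t], t }' | sort -rn
}

# Usage against the restored instance:
#   etcdctl get / --prefix --keys-only | count_keys_by_type
```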
$ etcdctl get / --prefix --keys-only | grep openshift-infra
/kubernetes.io/controllers/openshift-infra/hawkular-cassandra-1
/kubernetes.io/controllers/openshift-infra/hawkular-metrics
/kubernetes.io/controllers/openshift-infra/heapster
/kubernetes.io/jobs/openshift-infra/hawkular-metrics-schema
/kubernetes.io/namespaces/openshift-infra
/kubernetes.io/pods/openshift-infra/hawkular-cassandra-1-7ftn8
/kubernetes.io/pods/openshift-infra/hawkular-metrics-5sckh
/kubernetes.io/pods/openshift-infra/heapster-d9gwb
/kubernetes.io/rolebindings/openshift-infra/admin
/kubernetes.io/rolebindings/openshift-infra/edit
/kubernetes.io/rolebindings/openshift-infra/hawkular-view
/kubernetes.io/rolebindings/openshift-infra/system:deployer
/kubernetes.io/rolebindings/openshift-infra/system:deployers
/kubernetes.io/rolebindings/openshift-infra/system:image-builder
/kubernetes.io/rolebindings/openshift-infra/system:image-builders
/kubernetes.io/rolebindings/openshift-infra/system:image-puller
/kubernetes.io/rolebindings/openshift-infra/system:image-pullers
/kubernetes.io/secrets/openshift-infra/build-controller-token-hk9gz
/kubernetes.io/secrets/openshift-infra/build-controller-token-xv2gt
/kubernetes.io/secrets/openshift-infra/builder-dockercfg-qtt2c
/kubernetes.io/secrets/openshift-infra/builder-token-dx9rb
/kubernetes.io/secrets/openshift-infra/builder-token-pvss7
/kubernetes.io/secrets/openshift-infra/cassandra-dockercfg-hqcgd
/kubernetes.io/secrets/openshift-infra/cassandra-dockercfg-jcjm9
/kubernetes.io/secrets/openshift-infra/cassandra-dockercfg-wlhcf
/kubernetes.io/secrets/openshift-infra/cassandra-token-27m8v
/kubernetes.io/secrets/openshift-infra/cassandra-token-khzqx
/kubernetes.io/secrets/openshift-infra/cassandra-token-qq4lh
/kubernetes.io/secrets/openshift-infra/cassandra-token-v4zg7
/kubernetes.io/secrets/openshift-infra/cassandra-token-vt4kn
/kubernetes.io/secrets/openshift-infra/cassandra-token-wgczj
...
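Since the namespaced keys above all follow the pattern /kubernetes.io/<resource>/<namespace>/<name>, an alternative to grepping the full key dump is to let etcd do the filtering with a narrower prefix. This is only a sketch: list_keys is a hypothetical helper, and the /kubernetes.io root is taken from the listing above (other clusters may use a different root, such as /registry).

```shell
# Hypothetical helper: list the keys for one resource type in one namespace.
# $1 = resource type, e.g. secrets
# $2 = namespace, e.g. openshift-infra
list_keys() {
    resource=$1
    namespace=$2
    # The trailing slash keeps the prefix from matching similarly named
    # namespaces; sed drops any blank separator lines from the output.
    ETCDCTL_API=3 etcdctl get "/kubernetes.io/$resource/$namespace/" \
        --prefix --keys-only | sed '/^$/d'
}
```

For example, list_keys secrets openshift-infra should return only the secret keys for that namespace.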
$ etcdctl get / --prefix --keys-only | grep openshift-infra | grep secret | grep default
/kubernetes.io/secrets/openshift-infra/default-dockercfg-5w5dq
/kubernetes.io/secrets/openshift-infra/default-rolebindings-controller-dockercfg-xm6k8
/kubernetes.io/secrets/openshift-infra/default-rolebindings-controller-token-9lpq9
/kubernetes.io/secrets/openshift-infra/default-rolebindings-controller-token-pn2n4
/kubernetes.io/secrets/openshift-infra/default-token-5np58
/kubernetes.io/secrets/openshift-infra/default-token-7c7tz
$ etcdctl get /kubernetes.io/secrets/openshift-infra/default-token-7c7tz
$ etcdctl get /kubernetes.io/secrets/openshift-infra/default-token-7c7tz  --print-value-only | auger decode
Mmmmm, YAML….
$ mkdir -p /tmp/yaml ; cd /tmp/yaml
$ etcdctl get / --prefix --keys-only | grep openshift-infra | grep "/secrets/" > secrets.list
$ cat secrets.list
/kubernetes.io/secrets/openshift-infra/cassandra-token-27m8v
/kubernetes.io/secrets/openshift-infra/cassandra-token-khzqx
/kubernetes.io/secrets/openshift-infra/cassandra-token-qq4lh
/kubernetes.io/secrets/openshift-infra/cassandra-token-v4zg7
/kubernetes.io/secrets/openshift-infra/cassandra-token-vt4kn
/kubernetes.io/secrets/openshift-infra/cassandra-token-wgczj
/kubernetes.io/secrets/openshift-infra/default-dockercfg-5w5dq
/kubernetes.io/secrets/openshift-infra/default-token-5np58
/kubernetes.io/secrets/openshift-infra/default-token-7c7tz
/kubernetes.io/secrets/openshift-infra/deployer-dockercfg-tdjnz
/kubernetes.io/secrets/openshift-infra/deployer-token-6jc8c
/kubernetes.io/secrets/openshift-infra/deployer-token-8hp6c
/kubernetes.io/secrets/openshift-infra/deployment-trigger-
...
$ while read -r key ; do secret_name=${key##*/} ; echo "$secret_name" ; etcdctl get "$key" --print-value-only | auger decode > "$secret_name.secret.yaml" ; done < secrets.list
$ ls -1
build-config-change-controller-dockercfg-9khqd.secret.yaml
build-config-change-controller-token-2csmb.secret.yaml
build-config-change-controller-token-h2wxd.secret.yaml
build-controller-dockercfg-9chcf.secret.yaml
build-controller-token-hk9gz.secret.yaml
build-controller-token-xv2gt.secret.yaml
builder-dockercfg-qtt2c.secret.yaml
builder-token-dx9rb.secret.yaml
builder-token-pvss7.secret.yaml
cassandra-dockercfg-hqcgd.secret.yaml
cassandra-dockercfg-jcjm9.secret.yaml
cassandra-dockercfg-wlhcf.secret.yaml
...
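Before applying anything, a quick sanity check on the generated files can catch keys that failed to decode. The sketch below flags files that are empty or lack a top-level kind: Secret line. check_secret_files is a hypothetical helper, and it assumes the decoded YAML carries kind: Secret at the start of a line, which is worth confirming against one of the files first.

```shell
# Hypothetical helper: report *.secret.yaml files that look wrong.
# $1 = directory to scan (default: current directory)
# Returns non-zero if any suspect file is found.
check_secret_files() {
    dir=${1:-.}
    bad=0
    for f in "$dir"/*.secret.yaml; do
        [ -e "$f" ] || continue   # glob matched nothing
        # Flag empty files and files without a top-level "kind: Secret".
        if [ ! -s "$f" ] || ! grep -q '^kind: Secret' "$f"; then
            echo "suspect file: $f" >&2
            bad=$((bad + 1))
        fi
    done
    [ "$bad" -eq 0 ]
}
```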
$ for x in *.yaml ; do oc apply -f "$x" ; done
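With well over a hundred files, a failed apply is easy to miss in the scrollback. As a sketch, the apply step could be wrapped so that failures are counted and reported at the end. apply_all is a hypothetical helper; it uses only oc apply -f, as in the command above.

```shell
# Hypothetical helper: apply every YAML file in a directory and
# summarise failures instead of letting them scroll past.
# $1 = directory containing the files (default: current directory)
apply_all() {
    dir=${1:-.}
    failed=0
    for f in "$dir"/*.yaml; do
        [ -e "$f" ] || continue   # glob matched nothing
        if oc apply -f "$f"; then
            echo "applied: $f"
        else
            echo "FAILED:  $f" >&2
            failed=$((failed + 1))
        fi
    done
    echo "$failed file(s) failed"
    [ "$failed" -eq 0 ]
}
```

Running apply_all /tmp/yaml would then end with a single line stating how many files need a second look.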
$ oc rollout latest deploymentconfig/libretransate
$ oc start-build -n multisite-routing-prod multisite-routing
$ /usr/local/bin/master-restart api
$ /usr/local/bin/master-restart controllers

Closing Thoughts

I hope this helps someone out there who finds themselves in a similar situation; that is why I take the time to write these articles. I would rather be installing a better solution to these types of problems, but I don’t want people to panic and get grey hairs when there is always hope.

I am not a Velero employee, but I do think it’s neat.

What is LSD?

If you saw the word “LSD” and were curious... well...

Neil White

Mech Warrior Overlord @ LSD. I spend my days killing Kubernetes, operating Openshift, hollering at Helm, vanquishing Vaults and conquering Clouds!