<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Kubernetes on ege.dev</title>
    <link>https://ege.dev/tags/kubernetes/</link>
    <description>Hello. I&#39;m Ege.</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <copyright>&#169; 2026 ege.dev</copyright>
    <lastBuildDate>Tue, 19 May 2026 18:15:35 +0000</lastBuildDate>
    <atom:link href="https://ege.dev/tags/kubernetes/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Point-in-time recovery for MySQL on Kubernetes</title>
      <link>https://ege.dev/entries/2026/03/point-in-time-recovery-for-mysql-on-kubernetes/</link>
      <pubDate>Fri, 27 Mar 2026 00:00:00 +0000</pubDate>
      <guid isPermaLink="false">https://ege.dev/entries/2026/03/point-in-time-recovery-for-mysql-on-kubernetes/</guid>
      <description>&lt;p&gt;Since the v1.0.0 release of the new MySQL Operator (K8SPS), point-in-time recovery (PiTR) has been the most anticipated feature. Naturally, we decided to implement it in the upcoming v1.1.0 release.&lt;/p&gt;
&lt;p&gt;PiTR relies on two processes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Reliably collecting binary logs from MySQL servers and storing them somewhere safe&lt;/li&gt;
&lt;li&gt;When the recovery is triggered, applying those binary logs up to a specific point&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;collecting-binary-logs&#34;&gt;Collecting binary logs&lt;/h2&gt;
&lt;p&gt;Our Galera replication-based MySQL Operator (K8SPXC) has a binary log collector that was developed by the Cloud Team. It has worked reliably for years but has a certain limitation that is inherent in its design: it depends on flushing the binary logs to collect them. This leads to huge numbers of binary logs on the MySQL server, which becomes a headache after running the collector for a few weeks. We mitigated this for the users by maintaining a cache for binary logs, but it didn&amp;rsquo;t remove the pain—it just became ours.&lt;/p&gt;
&lt;p&gt;At Percona, #FindABetterWay is one of our core values. In the spirit of finding a better way, when we first started to think about PiTR in K8SPS, we decided to improve the process of collecting binary logs. These discussions eventually led to the birth of a new product: &lt;a href=&#34;https://github.com/Percona-Lab/percona-binlog-server&#34;&gt;Percona Binlog Server (PBS)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;PBS works by connecting to the MySQL server as a replica and streaming events. It either uploads these events to S3 or stores them on the filesystem. It supports replication source switchovers and is able to continue from where it left off. On top of these, it provides helper commands to search for a particular GTID or timestamp in the collected binary logs.&lt;/p&gt;
&lt;h2 id=&#34;applying-binary-logs&#34;&gt;Applying binary logs&lt;/h2&gt;
&lt;p&gt;The official &lt;a href=&#34;https://dev.mysql.com/doc/refman/8.4/en/point-in-time-recovery-binlog.html&#34;&gt;MySQL docs&lt;/a&gt; suggest converting each binary log to text using &lt;code&gt;mysqlbinlog&lt;/code&gt; and piping them into the &lt;code&gt;mysql&lt;/code&gt; client for PiTR. This is already what we do in K8SPXC.&lt;/p&gt;
&lt;p&gt;I decided to check if there&amp;rsquo;s a better way. First, I checked old posts on the Percona Blog to see if our experts had written anything about a different PiTR approach. It&amp;rsquo;s not surprising that they did. &lt;a href=&#34;https://www.percona.com/blog/mysql-point-in-time-recovery-right-way/&#34;&gt;Marcelo&lt;/a&gt; wrote about an approach leveraging replication appliers for recovery. It seemed much better than piping &lt;code&gt;mysqlbinlog&lt;/code&gt; output into the client, since with replication we can have multithreading and parallel appliers for recovery. My only problem with this approach was that it required two &lt;code&gt;mysqld&lt;/code&gt; instances. Of course it&amp;rsquo;s possible, but I would love to not have to care about the state of two MySQL servers.&lt;/p&gt;
&lt;p&gt;Luckily, &lt;a href=&#34;https://www.percona.com/blog/mysql-point-in-time-recovery-right-way/#comment-10968578&#34;&gt;lefred&lt;/a&gt; commented that there&amp;rsquo;s an &lt;a href=&#34;https://lefred.be/content/howto-make-mysql-point-in-time-recovery-faster/&#34;&gt;&amp;ldquo;even better&amp;rdquo; approach&lt;/a&gt; that requires only one MySQL server!&lt;/p&gt;
&lt;p&gt;At a high level, the process looks like this:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Restore the full backup into the MySQL datadir&lt;/li&gt;
&lt;li&gt;Start a temporary &lt;code&gt;mysqld&lt;/code&gt; instance&lt;/li&gt;
&lt;li&gt;Download binary logs and put them as relay logs in the datadir&lt;/li&gt;
&lt;li&gt;Start replication via &lt;code&gt;CHANGE REPLICATION SOURCE TO RELAY_LOG_FILE=..., SOURCE_HOST=&#39;dummy&#39;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Start applying binary logs via &lt;code&gt;START REPLICA SQL_THREAD UNTIL &amp;lt;GTID&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Wait for the SQL thread to stop&lt;/li&gt;
&lt;li&gt;&lt;code&gt;STOP REPLICA; RESET REPLICA ALL;&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
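&lt;p&gt;The replication-related steps (4 to 7) can be sketched in SQL roughly as follows. This is a minimal sketch under stated assumptions, not the operator&amp;rsquo;s actual implementation: the relay log file name and the GTID are placeholders, and &lt;code&gt;SQL_BEFORE_GTIDS&lt;/code&gt; is only one of the &lt;code&gt;UNTIL&lt;/code&gt; conditions MySQL offers.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Assumption: the downloaded binary logs were placed in the datadir as
-- relay-bin.000001, relay-bin.000002, ... and listed in the relay log index.
CHANGE REPLICATION SOURCE TO
  RELAY_LOG_FILE=&#39;relay-bin.000001&#39;,  -- placeholder file name
  RELAY_LOG_POS=4,                    -- first event after the 4-byte file header
  SOURCE_HOST=&#39;dummy&#39;;                -- never contacted, only the SQL thread runs

-- Apply events until just before the given GTID (placeholder value).
START REPLICA SQL_THREAD UNTIL SQL_BEFORE_GTIDS=&#39;3E11FA47-71CA-11E1-9E33-C80AA9429562:23&#39;;

-- Poll until Replica_SQL_Running reports No, i.e. the UNTIL condition was reached.
SHOW REPLICA STATUS;

STOP REPLICA;
RESET REPLICA ALL;
&lt;/code&gt;&lt;/pre&gt;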
&lt;p&gt;I have already created an unpolished but working &lt;a href=&#34;https://github.com/percona/percona-server-mysql-operator/pull/1252&#34;&gt;PoC&lt;/a&gt;. There is also an &lt;a href=&#34;https://github.com/percona/percona-server-mysql-operator/pull/1251&#34;&gt;RFC&lt;/a&gt; for explaining various decisions and tracking open questions that need answers. There&amp;rsquo;s still work to do, but I&amp;rsquo;m confident that this approach is good and we&amp;rsquo;ll release this as a &lt;em&gt;tech preview&lt;/em&gt; in the upcoming v1.1.0 release. If you have thoughts on the RFC or want to try the PoC, we&amp;rsquo;d love to hear your feedback.&lt;/p&gt;
</description>
      <category>mysql</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>My talk at the Cloud‑Native Databases 2026</title>
      <link>https://ege.dev/entries/2026/03/my-talk-at-the-cloudnative-databases-2026/</link>
      <pubDate>Mon, 16 Mar 2026 14:31:03 +0300</pubDate>
      <guid isPermaLink="false">https://ege.dev/entries/2026/03/my-talk-at-the-cloudnative-databases-2026/</guid>
      <description>&lt;p&gt;My talk at the &lt;a href=&#34;https://buildevcon.com/events/cloud-native-databases&#34;&gt;Cloud‑Native Databases Conference&lt;/a&gt; is finally live!&lt;/p&gt;
&lt;p&gt;You can watch it on &lt;a href=&#34;https://www.youtube.com/watch?v=98hNVZ7EK5s&#34;&gt;YouTube&lt;/a&gt; and find the slide deck &lt;a href=&#34;https://ege.dev/decks/cloud-native-databases-2026.pdf&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
</description>
      <category>kubernetes</category>
    </item>
    <item>
      <title>gremlins in HAProxy</title>
      <link>https://ege.dev/entries/2026/03/gremlins-in-haproxy/</link>
      <pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate>
      <guid isPermaLink="false">https://ege.dev/entries/2026/03/gremlins-in-haproxy/</guid>
      <description>&lt;p&gt;We are in the process of certifying our operators for &lt;code&gt;&amp;lt;redacted&amp;gt;&lt;/code&gt;. We started
with the PostgreSQL Operator and it worked just fine without any adjustments. Then
we moved on to our MySQL Operators, which surfaced a problem in HAProxy.&lt;/p&gt;
&lt;p&gt;HAProxy is used by default in our MySQL clusters. It sits in front of MySQL
instances as a proxy to provide read/write splitting. We have our own external
check scripts that determine, for each backend, whether a MySQL server is
good for that particular backend.&lt;/p&gt;
&lt;p&gt;After deploying the operator on &lt;code&gt;&amp;lt;redacted&amp;gt;&lt;/code&gt;, we realized that our HAProxy pods were
failing to become ready because all external checks were timing out.
But why?&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s gotta be the DNS. It&amp;rsquo;s always DNS, isn&amp;rsquo;t it? Turns out, no. I tested DNS
queries from the HAProxy container and they were fast.&lt;/p&gt;
&lt;p&gt;Could this be AppArmor? Maybe &lt;code&gt;&amp;lt;redacted&amp;gt;&lt;/code&gt; has stricter AppArmor profiles? I configured
HAProxy pods to be &lt;code&gt;Unconfined&lt;/code&gt;. It didn&amp;rsquo;t help either.&lt;/p&gt;
&lt;p&gt;Then I decided to increase the timeout from 10 seconds to 30 seconds, just to
see how much time the script needs to finish. To my surprise, the script was
taking 810 milliseconds to finish! How could a check time out after 10 seconds but
finish in less than a second with a 30-second timeout?&lt;/p&gt;
&lt;p&gt;In every deep debugging session, there is a moment when engineers start to
believe in spiritual beings or gremlins who mess with the software in the
system. One needs to resist this temptation of irrationality. In computers
there&amp;rsquo;s always a rational explanation for problems; it&amp;rsquo;s just buried very deep and
caused by unlucky combinations of design choices and/or bugs.&lt;/p&gt;
&lt;p&gt;At this point I decided to attach a debug container to the pod and check what
HAProxy was doing with strace. This resulted in the first breakthrough:
the child process of HAProxy that runs the external check command was
doing shitloads of poll syscalls until it was killed due to timeout. It wasn&amp;rsquo;t
even running the script, it was simply stuck.&lt;/p&gt;
&lt;p&gt;This realization made me shift my focus to HAProxy itself. I started to read
the source code to see what&amp;rsquo;s going on.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/haproxy/haproxy/blob/v2.8.0/src/extcheck.c#L414-L427&#34;&gt;src/extcheck.c&lt;/a&gt;&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;int fd;
sa_family_t family;

/* close all FDs. Keep stdin/stdout/stderr in verbose mode */
fd = (global.mode &amp;amp; (MODE_QUIET|MODE_VERBOSE)) == MODE_QUIET ? 0 : 3;

my_closefrom(fd);

/* restore the initial FD limits */
limit.rlim_cur = rlim_fd_cur_at_boot;
limit.rlim_max = rlim_fd_max_at_boot;
if (raise_rlim_nofile(NULL, &amp;amp;limit) != 0) {
	getrlimit(RLIMIT_NOFILE, &amp;amp;limit);
	ha_warning(&amp;#34;External check: failed to restore initial FD limits (cur=%u max=%u), using cur=%u max=%u\n&amp;#34;,
		   rlim_fd_cur_at_boot, rlim_fd_max_at_boot,
		   (unsigned int)limit.rlim_cur, (unsigned int)limit.rlim_max);
}
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;HAProxy attempts to close all FDs. What does &lt;em&gt;all&lt;/em&gt; mean? And then it restores
limits to some value? What&amp;rsquo;s that value?&lt;/p&gt;
&lt;p&gt;All FDs means every FD from the starting one (0 or 3) up to the soft limit. The &lt;code&gt;my_closefrom&lt;/code&gt; function
first polls the FD and then closes it if poll doesn&amp;rsquo;t report &lt;code&gt;POLLNVAL&lt;/code&gt;. OK,
this explains the excessive polling I saw with strace. But this has been the default
HAProxy behavior for a long time, so why didn&amp;rsquo;t we see the same problem on
other platforms, e.g. GKE?&lt;/p&gt;
&lt;p&gt;The answer is simple. Turns out, &lt;code&gt;&amp;lt;redacted&amp;gt;&lt;/code&gt; has a much higher soft limit than GKE. On
GKE &lt;code&gt;ulimit -n&lt;/code&gt; returns 1048576, on &lt;code&gt;&amp;lt;redacted&amp;gt;&lt;/code&gt; 1073741816! But still there was
something that troubled me at this point: I vaguely remembered a configuration
option in HAProxy that limits the number of FDs that the process will use. The
option is &lt;code&gt;fd-hard-limit&lt;/code&gt; and
&lt;a href=&#34;https://docs.haproxy.org/2.8/configuration.html#fd-hard-limit&#34;&gt;docs&lt;/a&gt; say its
default value is 1048576. If it had any effect, I wouldn&amp;rsquo;t have seen any
problems. Something was wrong in HAProxy.&lt;/p&gt;
&lt;p&gt;Remember that the external check process was doing something to restore limits
to some value assigned at boot? Those values are assigned in the &lt;code&gt;main&lt;/code&gt;
function here:&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/haproxy/haproxy/blob/v2.8.0/src/haproxy.c#L3263-L3269&#34;&gt;src/haproxy.c&lt;/a&gt;&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;/* take a copy of initial limits before we possibly change them */
getrlimit(RLIMIT_NOFILE, &amp;amp;limit);

if (limit.rlim_max == RLIM_INFINITY)
	limit.rlim_max = limit.rlim_cur;
rlim_fd_cur_at_boot = limit.rlim_cur;
rlim_fd_max_at_boot = limit.rlim_max;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;a href=&#34;https://github.com/haproxy/haproxy/blob/v2.8.0/src/haproxy.c#L3311-L3312&#34;&gt;The
code&lt;/a&gt;
that limits FDs with &lt;code&gt;fd-hard-limit&lt;/code&gt; is a few lines below:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;if (global.fd_hard_limit &amp;amp;&amp;amp; limit.rlim_cur &amp;gt; global.fd_hard_limit)
	limit.rlim_cur = global.fd_hard_limit;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This means external checks restore limits to the value unbounded by
&lt;code&gt;fd-hard-limit&lt;/code&gt;. Oh, this looks like &lt;a href=&#34;https://github.com/haproxy/haproxy/issues/3299&#34;&gt;a
bug&lt;/a&gt; in HAProxy and explains
why we had this issue!&lt;/p&gt;
</description>
      <category>mysql</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>watching it shatter</title>
      <link>https://ege.dev/entries/2026/03/watching-it-shatter/</link>
      <pubDate>Fri, 06 Mar 2026 00:00:00 +0000</pubDate>
      <guid isPermaLink="false">https://ege.dev/entries/2026/03/watching-it-shatter/</guid>
      <description>&lt;p&gt;This week, just like the previous, was filled with PostgreSQL (PG). I’m still working on the v2.9.0 release of the PG operator.&lt;/p&gt;
&lt;p&gt;Last week I merged the implementation of pg_tde to bring data-at-rest encryption to Kubernetes. Then we started doing QA: oh boy, did it ever surface problems!&lt;/p&gt;
&lt;p&gt;One of the goals of v2.9.0 is declaring major version upgrades production-ready. So we decided to test the upgrade with TDE enabled. Boom, &lt;a href=&#34;https://perconadev.atlassian.net/browse/PG-2240&#34;&gt;data corruption&lt;/a&gt;. &lt;a href=&#34;https://perconadev.atlassian.net/browse/PG-2239&#34;&gt;Point-in-time recovery&lt;/a&gt; is problematic, &lt;a href=&#34;https://perconadev.atlassian.net/browse/PG-2234&#34;&gt;pg_rewind&lt;/a&gt; too…&lt;/p&gt;
&lt;p&gt;This is one of my favorite things about Kubernetes. You put something fragile on it and watch it shatter. Of course I need to give credit to our QA engineers: they don’t stop until they watch the software crumbling in pain.&lt;/p&gt;
&lt;p&gt;Outside of PG stuff, I opened &lt;a href=&#34;https://github.com/bergerx/kubectl-status/pull/590&#34;&gt;a small PR&lt;/a&gt; for adding volume mount information to the kubectl-status plugin. I needed to quickly check mounts for a few containers while testing PG major upgrades and did a quick implementation with Claude. Before opening the PR, I was unsure whether I should mention that the code was written by Claude. I still feel embarrassed about using AI code generation. In the end I decided to openly admit it because it’s the right thing to do. I also want our contributors to be open about their usage of AI (I know they use AI).&lt;/p&gt;
&lt;p&gt;Speaking of AI code generation, I feel that the vibe has shifted significantly. I see the pessimists have accepted defeat and I am one of them: AI can write code and the code is not crap. It’s not 2023 anymore and it looks like it’s time to update my priors. I am trying to integrate Claude Code into my workflows and I had a bunch of positive interactions with it over the last few weeks. I am also coding a big feature using Claude just for the sake of the experience.&lt;/p&gt;
</description>
      <category>postgresql</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Meet me at KubeCon Europe</title>
      <link>https://ege.dev/entries/2026/02/meet-me-at-kubecon-europe/</link>
      <pubDate>Wed, 25 Feb 2026 15:10:57 +0300</pubDate>
      <guid isPermaLink="false">https://ege.dev/entries/2026/02/meet-me-at-kubecon-europe/</guid>
      <description>&lt;p&gt;The Percona team is heading to KubeCon + CloudNativeCon Europe in Amsterdam,
and I&amp;rsquo;d love to meet you in person!&lt;/p&gt;
&lt;p&gt;You can find me at &lt;strong&gt;Booth 790&lt;/strong&gt;. This is a great chance to talk with me and
other engineers working on Percona Operators.&lt;/p&gt;
&lt;p&gt;We will be there to discuss:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Running MySQL, PostgreSQL, and MongoDB on Kubernetes&lt;/li&gt;
&lt;li&gt;Production-ready HA setups&lt;/li&gt;
&lt;li&gt;Backup and PITR strategies&lt;/li&gt;
&lt;li&gt;Multi-cluster and multi-region deployments&lt;/li&gt;
&lt;li&gt;Operators roadmap and upcoming features&lt;/li&gt;
&lt;li&gt;Real-world troubleshooting stories&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you’re already running Percona Operators in production (or just getting started), I’d love to hear your feedback and learn about the challenges you&amp;rsquo;re facing. And if you’re just curious—or even a little suspicious—about running databases on Kubernetes, come by! I’d love to answer your questions and share our perspective.&lt;/p&gt;
&lt;h3 id=&#34;want-to-chat&#34;&gt;Want to chat?&lt;/h3&gt;
&lt;p&gt;We have a 20% discount code available for Percona community members. If you&amp;rsquo;re planning to attend and don&amp;rsquo;t have a ticket yet, let me know and I’ll send the code your way.&lt;/p&gt;
&lt;p&gt;If you’d like to carve out some dedicated time to talk, let’s schedule a meeting or grab a coffee together. You can reach out to me directly at &lt;code&gt;ege.gunes [at] percona [dot] com&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;See you in Amsterdam!&lt;/p&gt;
</description>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Encrypt PostgreSQL Data at Rest on Kubernetes</title>
      <link>https://ege.dev/entries/2025/02/encrypt-postgresql-data-at-rest-on-kubernetes/</link>
      <pubDate>Fri, 07 Feb 2025 15:10:57 +0300</pubDate>
      <guid isPermaLink="false">https://ege.dev/entries/2025/02/encrypt-postgresql-data-at-rest-on-kubernetes/</guid>
      <description>&lt;p&gt;The upcoming Percona Operator for PostgreSQL v2.6.0 release introduces support
for PostgreSQL 17, which opens exciting possibilities for data security. Since
&lt;a href=&#34;https://github.com/percona/pg_tde&#34;&gt;pg_tde&lt;/a&gt; comes pre-installed in Percona’s
official PostgreSQL 17 images, this release presents an excellent opportunity
to implement Transparent Data Encryption in your Kubernetes-deployed databases.
Let’s look at how to configure and use pg_tde with the Percona PostgreSQL
Operator.&lt;/p&gt;
&lt;h3 id=&#34;understanding-pg_tde&#34;&gt;Understanding pg_tde&lt;/h3&gt;
&lt;p&gt;Transparent Data Encryption (TDE) offers encryption at the table level and
solves the problem of protecting data at rest. The Percona Distribution for
PostgreSQL currently offers the only open source implementation of TDE for
PostgreSQL. The encryption is transparent for users, allowing them to access
and manipulate the data without requiring application modifications.&lt;/p&gt;
&lt;p&gt;While pg_tde has a build available for PostgreSQL Community Server, the
extension leverages extended APIs introduced in Percona Server for PostgreSQL
to provide more complete and performant encryption. You can read more about the
pg_tde extension in the &lt;a href=&#34;https://www.percona.com/blog/open-source-postgresql-pg-tde-beta/&#34;&gt;TDE beta announcement blog
post&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The new Percona Operator release will include pg_tde pre-installed with
PostgreSQL 17, simplifying the implementation of encryption at rest. Note:
Currently, pg_tde is in the beta phase. Please try it out on non-production
setups and &lt;a href=&#34;https://github.com/percona/pg_tde/discussions/151&#34;&gt;share your feedback&lt;/a&gt;!&lt;/p&gt;
&lt;h3 id=&#34;initial-configuration&#34;&gt;Initial configuration&lt;/h3&gt;
&lt;p&gt;To begin, we need to configure PostgreSQL to load pg_tde during startup. This
configuration is managed through the deploy/cr.yaml file:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;patroni:
  dynamicConfiguration:
    postgresql:
      parameters:
        shared_preload_libraries: pg_tde
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;After updating the configuration, apply it and restart your PostgreSQL pods:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;kubectl apply -f deploy/cr.yaml

for sts in $(kubectl get statefulset -o name | grep cluster1-instance1); do
    kubectl rollout restart $sts
done
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;strong&gt;Note: Attempting to enable pg_tde without properly configuring
shared_preload_libraries will result in an error indicating that pg_tde can
only be loaded at server startup.&lt;/strong&gt;&lt;/p&gt;
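&lt;p&gt;Once the pods are back, a quick check from &lt;code&gt;psql&lt;/code&gt; can confirm that the library is loaded before the extension is created. This is a sketch that assumes a superuser session in the database you want to encrypt; &lt;code&gt;CREATE EXTENSION&lt;/code&gt; has to be run once in each database where pg_tde functions will be called.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Should include pg_tde if the restart picked up the new configuration.
SHOW shared_preload_libraries;

-- Makes the pg_tde functions and access methods available in this database.
CREATE EXTENSION IF NOT EXISTS pg_tde;
&lt;/code&gt;&lt;/pre&gt;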
&lt;h3 id=&#34;secure-key-management-with-hashicorp-vault&#34;&gt;Secure key management with HashiCorp Vault&lt;/h3&gt;
&lt;p&gt;While pg_tde supports file-based key storage, it is recommended that production
environments use a Key Management Service such as HashiCorp Vault or OpenBao to
ensure secure key management.&lt;/p&gt;
&lt;p&gt;Deploy Vault in your Kubernetes cluster:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;helm repo add hashicorp https://helm.releases.hashicorp.com
helm install vault hashicorp/vault \
  --disable-openapi-validation \
  --version 0.16.1 \
  --namespace vault \
  --set dataStorage.enabled=false \
  --set global.logLevel=trace \
  --set global.platform=kubernetes
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;After deploying Vault, we need to initialize it and obtain the root token. This
process involves creating an initial set of encryption keys and unsealing the
Vault instance:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;kubectl exec -it vault-0 -- vault operator init \
  -tls-skip-verify \
  -key-shares=1 \
  -key-threshold=1 \
  -format=json &amp;gt;vault.json

unsealKey=$(jq -r &amp;#34;.unseal_keys_b64[]&amp;#34; &amp;lt;vault.json)
token=$(jq -r &amp;#34;.root_token&amp;#34; &amp;lt;vault.json)

kubectl exec -it vault-0 -- vault operator unseal -tls-skip-verify &amp;#34;$unsealKey&amp;#34;

kubectl exec -it vault-0 -- sh -c \
  &amp;#34;export VAULT_TOKEN=$token &amp;amp;&amp;amp; 
   export VAULT_LOG_LEVEL=trace &amp;amp;&amp;amp; 
   vault secrets enable --version=1 -tls-skip-verify -path=secret kv&amp;#34;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now, we can configure pg_tde to use Vault as the key provider:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;SELECT pg_tde_add_key_provider_vault_v2(
  &amp;#39;vault-provider&amp;#39;,
  &amp;#39;&amp;lt;rootToken&amp;gt;&amp;#39;,
  &amp;#39;http://vault.vault.svc.cluster.local:8200&amp;#39;,
  &amp;#39;secret&amp;#39;,
  NULL
);

SELECT pg_tde_set_principal_key(&amp;#39;tde-principal-key&amp;#39;,&amp;#39;vault-provider&amp;#39;);
&lt;/code&gt;&lt;/pre&gt;&lt;h3 id=&#34;creating-encrypted-tables&#34;&gt;Creating encrypted tables&lt;/h3&gt;
&lt;p&gt;Percona Server for PostgreSQL offers two encryption methods:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;tde_heap&lt;/strong&gt;: Available exclusively in Percona Server for PostgreSQL 17,
this method provides comprehensive encryption of tuples, Write Ahead Log
(WAL), and indexes with optimized performance.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;tde_heap_basic&lt;/strong&gt;: Compatible with Community PostgreSQL 16 and 17, this
method encrypts tuples and WAL. Unfortunately, it does not encrypt
indexes, which is an important limitation for production use cases.
It also carries a performance cost, because data is encrypted and decrypted
whenever it enters or exits shared buffers, so it is not recommended
for production.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;When using Percona Server for PostgreSQL images, we can utilize the more comprehensive tde_heap method:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;CREATE TABLE encrypted_data (
  id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  t text not null
) USING tde_heap;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Verify the encryption status of your table:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;SELECT pg_tde_is_encrypted(&amp;#39;encrypted_data&amp;#39;);
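
-- A sketch, not from the original post: the standard catalogs can also show
-- which access method the table uses (tde_heap is expected here).
SELECT c.relname, a.amname
FROM pg_class c
JOIN pg_am a ON a.oid = c.relam
WHERE c.relname = &#39;encrypted_data&#39;;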
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;For more detailed information about pg_tde capabilities and configurations,
please refer to the &lt;a href=&#34;https://percona.github.io/pg_tde/main/index.html&#34;&gt;official Percona documentation&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;current-implementation-status-and-future-plans&#34;&gt;Current implementation status and future plans&lt;/h3&gt;
&lt;p&gt;In the current release of Percona Operator for PostgreSQL, implementing pg_tde
requires manual configuration steps as outlined above. Users need to explicitly
configure shared libraries, set up key providers, and manage the encryption
infrastructure themselves. This approach, while functional, requires careful
attention to detail and a thorough understanding of both pg_tde and Kubernetes
operations.&lt;/p&gt;
&lt;p&gt;However, future releases of the Percona Operator will introduce seamless
integration with pg_tde, significantly simplifying the encryption
implementation process. The operator will handle the underlying configuration
automatically, allowing users to focus primarily on their database design. In
these upcoming versions, enabling encryption will be as straightforward as
creating tables with the tde_heap access method while the operator manages all
the necessary infrastructure and configuration behind the scenes.&lt;/p&gt;
</description>
      <category>postgresql</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Authenticating Your Clients to MongoDB on Kubernetes Using X509 Certificates</title>
      <link>https://ege.dev/entries/2022/02/authenticating-your-clients-to-mongodb-on-kubernetes-using-x509-certificates/</link>
      <pubDate>Sun, 13 Feb 2022 20:42:57 +0300</pubDate>
      <guid isPermaLink="false">https://ege.dev/entries/2022/02/authenticating-your-clients-to-mongodb-on-kubernetes-using-x509-certificates/</guid>
      <description>&lt;p&gt;Managing database users and their passwords can be a hassle. Sometimes the
passwords even end up hardcoded in various configuration files. Using certificates
can help you avoid the toil of managing, rotating, and securing user passwords,
so let’s see how to have x509 certificate authentication with the &lt;a href=&#34;https://www.percona.com/doc/kubernetes-operator-for-psmongodb/index.html&#34;&gt;Percona
Server for MongoDB
Operator&lt;/a&gt;
and cert-manager.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://cert-manager.io/&#34;&gt;cert-manager&lt;/a&gt; is our recommended way to manage TLS
certificates on Kubernetes clusters. The operator is already integrated with it
to generate certificates for TLS and cluster member authentication. We’re going
to leverage cert-manager APIs to generate valid certificates for MongoDB
clients.&lt;/p&gt;
&lt;p&gt;There are rules to follow to have a valid certificate for user authentication:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A single Certificate Authority (CA) MUST sign all certificates.&lt;/li&gt;
&lt;li&gt;The certificate’s subject MUST be unique.&lt;/li&gt;
&lt;li&gt;The certificate MUST NOT be expired.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For the complete requirements, check the &lt;a href=&#34;https://docs.mongodb.com/manual/core/security-x.509/#client-certificate-requirements&#34;&gt;MongoDB
docs&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;creating-valid-certificates-for-clients&#34;&gt;Creating Valid Certificates for Clients&lt;/h2&gt;
&lt;p&gt;Let’s check our current certificates:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;$ kubectl get cert
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;NAME                      READY   SECRET                    AGE
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;cluster1-ssl              True    cluster1-ssl              17h
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;cluster1-ssl-internal     True    cluster1-ssl-internal     17h
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The operator configures MongoDB nodes to use &amp;ldquo;cluster1-ssl-internal&amp;rdquo; as the
certificate authority. We’re going to use it to sign the client certificates to
conform to Rule 1.&lt;/p&gt;
&lt;p&gt;First, we need to create an Issuer:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;$ kubectl apply -f - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;lt;&amp;lt;EOF
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;apiVersion: cert-manager.io/v1
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;kind: Issuer
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;metadata:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt; name: cluster1-psmdb-x509-ca
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;spec:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt; ca:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;   secretName: cluster1-ssl-internal
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;EOF&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then, our certificate:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;$ kubectl apply -f - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;lt;&amp;lt;EOF
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;apiVersion: cert-manager.io/v1
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;kind: Certificate
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;metadata:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt; name: cluster1-psmdb-egegunes
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;spec:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt; secretName: cluster1-psmdb-egegunes
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt; isCA: false
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt; commonName: egegunes
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt; subject:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;   organizations:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;     - percona
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;   organizationalUnits:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;     - cloud
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt; usages:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;   - digital signature
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;   - client auth
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt; issuerRef:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;   name: cluster1-psmdb-x509-ca
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;   kind: Issuer
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;   group: cert-manager.io
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;EOF&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &amp;ldquo;usages&amp;rdquo; field is important, so leave its values as they are. You can change
the &amp;ldquo;subject&amp;rdquo; and &amp;ldquo;commonName&amp;rdquo; fields as you wish. Together they form
the Distinguished Name (DN), and the DN will be the username. You can check the subject of the issued certificate:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;$ kubectl get secret cluster1-psmdb-egegunes -o yaml &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;    | yq3 r - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;data.&amp;#34;tls.crt&amp;#34;&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;    | base64 -d &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;    | openssl x509 -subject -noout
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;subject&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;O &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; percona, OU &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; cloud, CN &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; egegunes
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Let&amp;rsquo;s create the user:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-javascript&#34; data-lang=&#34;javascript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;rs0&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;PRIMARY&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;db&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;getSiblingDB&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;$external&amp;#34;&lt;/span&gt;).&lt;span style=&#34;color:#a6e22e&#34;&gt;runCommand&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   &lt;span style=&#34;color:#a6e22e&#34;&gt;createUser&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;CN=egegunes,OU=cloud,O=percona&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   &lt;span style=&#34;color:#a6e22e&#34;&gt;roles&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{ &lt;span style=&#34;color:#a6e22e&#34;&gt;role&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;readWrite&amp;#39;&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;db&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;test&amp;#39;&lt;/span&gt; }]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;{
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;       &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;ok&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;       &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;$clusterTime&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;clusterTime&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;Timestamp&lt;/span&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;1643099623&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;signature&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                       &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;hash&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;BinData&lt;/span&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;,&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;EdPrmPJqfgRpMEZwGMeKNLdCe10=&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                       &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;keyId&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;NumberLong&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;7056790236952526853&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;       },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;       &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;operationTime&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;Timestamp&lt;/span&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;1643099623&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We’re creating the user in the &amp;ldquo;$external&amp;rdquo; database, so you need to use
&amp;ldquo;$external&amp;rdquo; as your authentication source. Note that the username lists the
subject fields in the reverse of the order openssl printed them: CN first, then OU, then O. This is important.&lt;/p&gt;
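&lt;p&gt;A minimal Go sketch of that ordering, assuming the standard library’s &lt;code&gt;pkix.Name&lt;/code&gt; type, whose &lt;code&gt;String&lt;/code&gt; method roughly follows RFC 2253 and so puts the common name first. The helper name &lt;code&gt;x509Username&lt;/code&gt; is hypothetical; it is not part of the Operator or of the tester application:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;package main

import (
    &amp;#34;crypto/x509/pkix&amp;#34;
    &amp;#34;fmt&amp;#34;
)

// x509Username is a hypothetical helper. It builds a subject from the same
// values used in the Certificate manifest and returns it in CN-first order,
// which is the form used for the user created in $external.
func x509Username(commonName, orgUnit, org string) string {
    subject := pkix.Name{
        CommonName:         commonName,
        OrganizationalUnit: []string{orgUnit},
        Organization:       []string{org},
    }
    return subject.String()
}

func main() {
    fmt.Println(x509Username(&amp;#34;egegunes&amp;#34;, &amp;#34;cloud&amp;#34;, &amp;#34;percona&amp;#34;))
}
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;For the certificate above, this should produce the same string that was passed to createUser. If your subject has more fields, compare the result against the openssl output before creating the user.&lt;/p&gt;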
&lt;h2 id=&#34;authenticating-with-the-certificate&#34;&gt;Authenticating With the Certificate&lt;/h2&gt;
&lt;p&gt;I have created a simple Go application to show how you can use x509
certificates to authenticate. It’s abridged here for brevity:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// ca.crt is mounted from secret/cluster1-ssl
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;caFilePath&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;/etc/mongodb-ssl/ca.crt&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// tls.pem consists of tls.key and tls.crt, they&amp;#39;re mounted from
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// secret/cluster1-psmdb-egegunes&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;certKeyFilePath&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;/tmp/tls.pem&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;endpoint&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;cluster1-rs0.psmdb.svc.cluster.local&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;uri&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;fmt&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;Sprintf&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;       &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;mongodb+srv://%s/?tlsCAFile=%s&amp;amp;tlsCertificateKeyFile=%s&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;       &lt;span style=&#34;color:#a6e22e&#34;&gt;endpoint&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;       &lt;span style=&#34;color:#a6e22e&#34;&gt;caFilePath&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;       &lt;span style=&#34;color:#a6e22e&#34;&gt;certKeyFilePath&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;credential&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;options&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;Credential&lt;/span&gt;{
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;       &lt;span style=&#34;color:#a6e22e&#34;&gt;AuthMechanism&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;MONGODB-X509&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;       &lt;span style=&#34;color:#a6e22e&#34;&gt;AuthSource&lt;/span&gt;:    &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;$external&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;opts&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;options&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;Client&lt;/span&gt;().&lt;span style=&#34;color:#a6e22e&#34;&gt;SetAuth&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;credential&lt;/span&gt;).&lt;span style=&#34;color:#a6e22e&#34;&gt;ApplyURI&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;uri&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;client&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;_&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;:=&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;mongo&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;Connect&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;ctx&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;opts&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The important part is using &amp;ldquo;MONGODB-X509&amp;rdquo; as the authentication mechanism. We
also need to pass the CA and client certificate in the MongoDB URI.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;$ kubectl logs psmdb-x509-tester-688c989567-rmgxv
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;2022/01/25 07:50:09 Connecting to database
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;2022/01/25 07:50:09 URI: mongodb+srv://cluster1-rs0.psmdb.svc.cluster.local/?tlsCAFile&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;/etc/mongodb-ssl/ca.crt&amp;amp;tlsCertificateKeyFile&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;/tmp/tls.pem
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;2022/01/25 07:50:09 Username: O&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;percona,OU&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;cloud,CN&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;egegunes
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;2022/01/25 07:50:09 Connected to database
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;2022/01/25 07:50:09 Successful ping
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;You can see the complete example in &lt;a href=&#34;https://github.com/egegunes/psmdb-x509-tester&#34;&gt;this
repository&lt;/a&gt;. If you have any
questions, please add a comment or create a topic in the &lt;a href=&#34;http://forums.percona.com/&#34;&gt;Percona
Forums&lt;/a&gt;.&lt;/p&gt;
</description>
      <category>mongodb</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Disaster Recovery for MongoDB on Kubernetes</title>
      <link>https://ege.dev/entries/2021/10/disaster-recovery-for-mongodb-on-kubernetes/</link>
      <pubDate>Fri, 08 Oct 2021 15:10:57 +0300</pubDate>
      <guid isPermaLink="false">https://ege.dev/entries/2021/10/disaster-recovery-for-mongodb-on-kubernetes/</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a joint post with Sergey Pronin.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;As per the glossary, Disaster Recovery (DR) protocols are an organization’s
method of regaining access and functionality to its IT infrastructure in events
like a natural disaster, cyber attack, or even business disruptions related to
the COVID-19 pandemic. When we talk about data, storing backups on remote
servers is enough to pass DR compliance checks for some companies. But for
others, Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) are
extremely tight and require more than just a backup/restore procedure.&lt;/p&gt;
&lt;p&gt;In this blog post, we are going to show you how to set up MongoDB on two
distant Kubernetes clusters with Percona Distribution for MongoDB Operator to
meet the toughest DR requirements.&lt;/p&gt;
&lt;h3 id=&#34;what-to-expect&#34;&gt;What to Expect&lt;/h3&gt;
&lt;p&gt;Here is what we are going to do:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Set up two Kubernetes clusters.&lt;/li&gt;
&lt;li&gt;Deploy Percona Distribution for MongoDB Operator on both of them. The
Disaster Recovery site will run a MongoDB cluster in unmanaged mode.&lt;/li&gt;
&lt;li&gt;Simulate a failure and perform a failover to the DR site.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In version 1.10.0 of the Operator, we added a Technology Preview of
a new feature that lets users deploy unmanaged MongoDB nodes and
connect them to existing Replica Sets.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://ege.dev/images/k8spsmdb-disaster-recovery/blog_mongodr_0.png&#34; alt=&#34;figure-0&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;set-it-all-up&#34;&gt;Set it All Up&lt;/h3&gt;
&lt;p&gt;We are not going to cover the configuration of the Kubernetes clusters, but in
our tests, we relied on two Google Kubernetes Engine (GKE) clusters deployed in
different regions.&lt;/p&gt;
&lt;h3 id=&#34;prepare-main-site&#34;&gt;Prepare Main Site&lt;/h3&gt;
&lt;p&gt;We have shared all the resources for this blog post in this GitHub repo. As a
first step we are going to deploy the operator on the Main site:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ kubectl apply -f bundle.yaml
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Deploy the MongoDB managed cluster with &lt;code&gt;cr-main.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ kubectl apply -f cr-main.yaml
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Each ReplicaSet node, including the Config Servers, must be exposed
through a dedicated service so that the ReplicaSet nodes on Main and DR can
reach each other. The result is effectively a full mesh:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://ege.dev/images/k8spsmdb-disaster-recovery/blog_mongodr_1.png&#34; alt=&#34;figure-1&#34;&gt;&lt;/p&gt;
&lt;p&gt;To get there, cr-main.yaml has the following changes:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;spec:
  replsets:
  - name: rs0
    expose:
      enabled: true
      exposeType: LoadBalancer
  sharding:
    configsvrReplSet:
      expose:
        enabled: true
        exposeType: LoadBalancer
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;We use the LoadBalancer Service type because it is the simplest option
for us, but there are other options, such as ClusterIP and NodePort. It is also possible
to use third-party tools like Submariner to set up a private connection.&lt;/p&gt;
&lt;p&gt;If you already have a MongoDB cluster running in Kubernetes, you can expose
the ReplicaSets without downtime by changing these fields.&lt;/p&gt;
&lt;h3 id=&#34;prepare-disaster-recovery-site&#34;&gt;Prepare Disaster Recovery Site&lt;/h3&gt;
&lt;p&gt;The configuration of the Disaster Recovery site could be broken down into the
following steps:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Copy the Secrets from the Main cluster.
&lt;ol&gt;
&lt;li&gt;system users secrets&lt;/li&gt;
&lt;li&gt;SSL keys – both used for external connections and internal replication traffic&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;Tune Custom Resource:
&lt;ol&gt;
&lt;li&gt;run nodes in unmanaged mode – Operator does not control replicaset configuration and secrets generation&lt;/li&gt;
&lt;li&gt;expose ReplicaSets (the same way we do it on the Main cluster)&lt;/li&gt;
&lt;li&gt;disable backups – backups can only be taken on the cluster managed by the Operator&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;copy-the-secrets&#34;&gt;Copy the Secrets&lt;/h3&gt;
&lt;p&gt;System users’ credentials are stored by default in the my-cluster-name-secrets
Secret object, whose name is defined in spec.secrets.users. Apply this Secret in the DR
cluster with &lt;code&gt;kubectl apply -f yaml-with-secrets&lt;/code&gt;. If you don’t have it in your
source code repository, or if you rely on the Operator to generate it, you can
get the Secret from Kubernetes itself, remove the unnecessary metadata, and
apply it.&lt;/p&gt;
&lt;p&gt;On the Main cluster, run:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ kubectl get secret my-cluster-name-secrets -o yaml &amp;gt; my-cluster-secrets.yaml
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now remove the following lines from metadata:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;annotations
creationTimestamp
resourceVersion
selfLink
uid
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Save the file and apply it to the DR cluster.&lt;/p&gt;
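&lt;p&gt;If you would rather script the cleanup than edit the file by hand, the same step can be sketched in Go with only the standard library. This is an illustration under assumptions: the helper name &lt;code&gt;stripSecretMetadata&lt;/code&gt; is hypothetical, and it expects the Secret as JSON (for example from &lt;code&gt;kubectl get secret -o json&lt;/code&gt;) rather than the YAML used above:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;package main

import (
    &amp;#34;encoding/json&amp;#34;
    &amp;#34;fmt&amp;#34;
)

// stripSecretMetadata is a hypothetical helper, not part of the Operator.
// It removes the cluster-specific metadata fields from a Secret manifest
// given as JSON and returns the cleaned manifest.
func stripSecretMetadata(manifest []byte) ([]byte, error) {
    var secret map[string]interface{}
    if err := json.Unmarshal(manifest, &amp;amp;secret); err != nil {
        return nil, err
    }
    if meta, ok := secret[&amp;#34;metadata&amp;#34;].(map[string]interface{}); ok {
        for _, field := range []string{
            &amp;#34;annotations&amp;#34;,
            &amp;#34;creationTimestamp&amp;#34;,
            &amp;#34;resourceVersion&amp;#34;,
            &amp;#34;selfLink&amp;#34;,
            &amp;#34;uid&amp;#34;,
        } {
            delete(meta, field)
        }
    }
    return json.MarshalIndent(secret, &amp;#34;&amp;#34;, &amp;#34;  &amp;#34;)
}

func main() {
    // Made-up sample input; a real Secret also carries data and type fields.
    in := []byte(`{&amp;#34;metadata&amp;#34;: {&amp;#34;name&amp;#34;: &amp;#34;my-cluster-name-secrets&amp;#34;, &amp;#34;uid&amp;#34;: &amp;#34;example&amp;#34;}}`)
    out, err := stripSecretMetadata(in)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(out))
}
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Everything outside the listed metadata fields is passed through untouched, so the credentials themselves are not modified.&lt;/p&gt;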
&lt;p&gt;The procedure to copy SSL keys is almost the same as for users. The difference
is the names of the Secret objects – they are usually called &amp;lt;CLUSTER_NAME&amp;gt;-ssl
and &amp;lt;CLUSTER_NAME&amp;gt;-ssl-internal. It is also possible to specify them in
secrets.ssl and secrets.sslInternal in the Custom Resource. Copy these two keys
from Main to DR and reference them in the CR.&lt;/p&gt;
&lt;h3 id=&#34;tune-custom-resource&#34;&gt;Tune Custom Resource&lt;/h3&gt;
&lt;p&gt;cr-replica.yaml will have the following changes:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;  secrets:
    users: my-cluster-name-secrets
    ssl: replica-cluster-ssl
    sslInternal: replica-cluster-ssl-internal
 
  replsets:
  - name: rs0
    size: 3
    expose:
      enabled: true
      exposeType: LoadBalancer
 
  sharding:
    enabled: true
    configsvrReplSet:
      size: 3
      expose:
        enabled: true
        exposeType: LoadBalancer
 
  backup:
    enabled: false
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Once the Custom Resource is applied, the services are going to be created. We
will need the external IP address of each ReplicaSet node in the next step.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ kubectl get services
NAME                     TYPE           CLUSTER-IP       EXTERNAL-IP       PORT(S)           AGE
replica-cluster-cfg-0    LoadBalancer   10.111.241.213   34.78.119.1       27017:31083/TCP   5m28s
replica-cluster-cfg-1    LoadBalancer   10.111.243.70    35.195.138.253    27017:31957/TCP   4m52s
replica-cluster-cfg-2    LoadBalancer   10.111.246.94    146.148.113.165   27017:30196/TCP   4m6s
...
replica-cluster-rs0-0    LoadBalancer   10.111.241.41    34.79.64.213      27017:31993/TCP   5m28s
replica-cluster-rs0-1    LoadBalancer   10.111.242.158   34.76.238.149     27017:32012/TCP   4m47s
replica-cluster-rs0-2    LoadBalancer   10.111.242.191   35.195.253.107    27017:31209/TCP   4m22s
&lt;/code&gt;&lt;/pre&gt;&lt;h3 id=&#34;add-external-nodes-to-main&#34;&gt;Add External Nodes to Main&lt;/h3&gt;
&lt;p&gt;At this step, we are going to add unmanaged nodes to the Replica Set on the
Main site. In cr-main.yaml we should add externalNodes under replsets.[] and
sharding.configsvrReplSet:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;  replsets:
  - name: rs0
    externalNodes:
    - host: 34.79.64.213
      priority: 1
      votes: 1
    - host: 34.76.238.149
      priority: 1
      votes: 1
    - host: 35.195.253.107
      priority: 0
      votes: 0
 
  sharding:
    configsvrReplSet:
      externalNodes:
      - host: 34.78.119.1
        priority: 1
        votes: 1
      - host: 35.195.138.253
        priority: 1
        votes: 1
      - host: 146.148.113.165
        priority: 0
        votes: 0
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Please note that we add three nodes, but only two are voters. We do this to
avoid split-brain situations and to avoid triggering a primary election if the DR
site goes down or the network between the Main and DR sites is disrupted.&lt;/p&gt;
&lt;h3 id=&#34;failover&#34;&gt;Failover&lt;/h3&gt;
&lt;p&gt;Once all the configuration above is applied, the situation will look like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://ege.dev/images/k8spsmdb-disaster-recovery/blog_mongodr_2.png&#34; alt=&#34;figure-2&#34;&gt;&lt;/p&gt;
&lt;p&gt;We have three voters in the main cluster and two voters in the replica cluster.
That means the replica nodes won’t have a majority if the main cluster
fails, and they won’t be able to elect a new primary. Therefore, we need to
step in and perform a manual failover.&lt;/p&gt;
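&lt;p&gt;The arithmetic behind this can be sketched in a few lines of Go. The function names are hypothetical, and the sketch only models the majority rule (more than half of the voting members), not any other election detail:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;package main

import &amp;#34;fmt&amp;#34;

// majority returns the number of votes needed to elect a primary
// in a replica set with the given number of voting members.
func majority(votingMembers int) int {
    return votingMembers/2 + 1
}

// canElectPrimary reports whether the surviving voters still form a majority.
func canElectPrimary(survivingVoters, votingMembers int) bool {
    return survivingVoters &amp;gt;= majority(votingMembers)
}

func main() {
    const mainVoters, drVoters = 3, 2
    total := mainVoters + drVoters

    // DR site alone after the Main site fails: 2 of 5 votes.
    fmt.Println(canElectPrimary(drVoters, total))
    // Main site alone after the DR site fails: 3 of 5 votes.
    fmt.Println(canElectPrimary(mainVoters, total))
}
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;With five voting members the majority is three, so the first check is false and the second is true: the Main site can lose the DR site and keep its primary, but not the other way around.&lt;/p&gt;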
&lt;p&gt;Let’s kill the main cluster:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;gcloud compute instances list \
    | grep my-main-gke-demo \
    | awk &amp;#39;{print $1}&amp;#39; \
    | xargs gcloud compute instances delete --zone europe-west3-b

gcloud container node-pools delete \
    --zone europe-west3-b \
    --cluster my-main-gke-demo \
    default-pool
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;I deleted the nodes and the node pool of the main Kubernetes cluster so now the cluster is in an unhealthy state. Let’s see what mongos on the DR site says when we try to read or write through it:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;% ./psmdb-tester
2021/09/03 18:19:19 Successfully connected and pinged 34.141.3.189:27017
2021/09/03 18:19:40 read failed: (FailedToSatisfyReadPreference) Encountered non-retryable error during query :: caused by :: Could not find host matching read preference { mode: &amp;#34;primary&amp;#34; } for set cfg
2021/09/03 18:19:49 write failed: (FailedToSatisfyReadPreference) Could not find host matching read preference { mode: &amp;#34;primary&amp;#34; } for set cfg
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;img src=&#34;https://ege.dev/images/k8spsmdb-disaster-recovery/blog_mongodr_3.png&#34; alt=&#34;figure-3&#34;&gt;&lt;/p&gt;
&lt;p&gt;Normally, we can only alter the replica set configuration from the primary node,
but in a situation like this, where there is no primary and only a
few members survive, MongoDB allows us to force the reconfiguration from any
surviving member.&lt;/p&gt;
&lt;p&gt;Let’s connect to one of the secondary nodes in the replica cluster and perform
the failover:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;kubectl exec -it psmdb-client-7b9f978649-pjb2k -- mongo &amp;#39;mongodb://clusterAdmin:&amp;lt;pass&amp;gt;@replica-cluster-rs0-0.replica.svc.cluster.local/admin?ssl=false&amp;#39;
...
rs0:SECONDARY&amp;gt; cfg = rs.config()
rs0:SECONDARY&amp;gt; cfg.members = [cfg.members[3], cfg.members[4], cfg.members[5]]
rs0:SECONDARY&amp;gt; rs.reconfig(cfg, {force: true})
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Note that the indexes of surviving members may differ in your environment. You
should check rs.status() and rs.config() outputs first. The main idea is to
repopulate config members with only surviving members.&lt;/p&gt;
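&lt;p&gt;Because the indexes can differ, it may be safer to pick the surviving members by host rather than by position. A small Go sketch of that filtering step follows. The &lt;code&gt;member&lt;/code&gt; type is a simplified, hypothetical stand-in for an entry of rs.config().members, and the main-N host names are placeholders, not values from our setup:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;package main

import &amp;#34;fmt&amp;#34;

// member is a simplified, hypothetical stand-in for one entry of
// rs.config().members; the real documents carry more fields.
type member struct {
    ID    int
    Host  string
    Votes int
}

// survivingMembers keeps only the members whose host is in the alive set,
// preserving their original order.
func survivingMembers(members []member, alive map[string]bool) []member {
    var kept []member
    for _, m := range members {
        if alive[m.Host] {
            kept = append(kept, m)
        }
    }
    return kept
}

func main() {
    members := []member{
        {ID: 0, Host: &amp;#34;main-0:27017&amp;#34;, Votes: 1}, // placeholder host
        {ID: 1, Host: &amp;#34;main-1:27017&amp;#34;, Votes: 1}, // placeholder host
        {ID: 2, Host: &amp;#34;main-2:27017&amp;#34;, Votes: 1}, // placeholder host
        {ID: 3, Host: &amp;#34;34.79.64.213:27017&amp;#34;, Votes: 1},
        {ID: 4, Host: &amp;#34;34.76.238.149:27017&amp;#34;, Votes: 1},
        {ID: 5, Host: &amp;#34;35.195.253.107:27017&amp;#34;, Votes: 0},
    }
    alive := map[string]bool{
        &amp;#34;34.79.64.213:27017&amp;#34;:   true,
        &amp;#34;34.76.238.149:27017&amp;#34;:  true,
        &amp;#34;35.195.253.107:27017&amp;#34;: true,
    }
    for _, m := range survivingMembers(members, alive) {
        fmt.Println(m.ID, m.Host, m.Votes)
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The filtered list corresponds to what gets assigned to cfg.members before the forced reconfiguration.&lt;/p&gt;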
&lt;p&gt;After the reconfiguration, the replica set will have just three members, and two
of them will have votes, which gives them a majority. So, they’ll be able to elect a new
primary. After performing the same process on the cfg replica set, we will be
able to read and write through mongos again:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;% ./psmdb-tester
2021/09/03 18:41:48 Successfully connected and pinged 34.141.3.189:27017
2021/09/03 18:41:49 read succeed
2021/09/03 18:41:50 read succeed
2021/09/03 18:41:51 read succeed
2021/09/03 18:41:52 read succeed
2021/09/03 18:41:53 read succeed
2021/09/03 18:41:54 read succeed
2021/09/03 18:41:55 read succeed
2021/09/03 18:41:56 read succeed
2021/09/03 18:41:57 read succeed
2021/09/03 18:41:58 read succeed
2021/09/03 18:41:58 write succeed
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Once the replica cluster has become the primary, you should reconfigure all
clients that connect to the old main cluster and point them to the DR site.&lt;/p&gt;
</description>
      <category>mongodb</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Cluster Statuses in Percona Kubernetes Operators</title>
      <link>https://ege.dev/entries/2021/07/cluster-statuses-in-percona-kubernetes-operators/</link>
      <pubDate>Wed, 21 Jul 2021 15:10:57 +0300</pubDate>
      <guid isPermaLink="false">https://ege.dev/entries/2021/07/cluster-statuses-in-percona-kubernetes-operators/</guid>
      <description>&lt;p&gt;In Kubernetes, all resources have a status field separated from their spec. The
status field is an interface for both humans and applications to read the
perceived state of the resource.&lt;/p&gt;
&lt;p&gt;When you deploy our Percona Kubernetes Operators – Percona Operator for
MongoDB or Percona Operator for MySQL – in your Kubernetes cluster, you’re
creating a custom resource (CR for short) and it has its own status, too. Since
Kubernetes operators mimic the human operator and aim to have the required
expertise to run software in a Kubernetes cluster, the status of the custom
resources should be smart.&lt;/p&gt;
&lt;p&gt;For Percona Operator for MySQL, you can get the cluster status with the
commands below or via the Kubernetes API:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;% kubectl get pxc
NAME            ENDPOINT                                   STATUS   PXC   PROXYSQL   HAPROXY   AGE
lisette-18537   lisette-18537-haproxy.subjectivism-22940   ready    3                3         87m

% kubectl get pxc &amp;lt;cluster-name&amp;gt; -o jsonpath=&amp;#39;{.status}&amp;#39;
{
  &amp;#34;backup&amp;#34;: {
    &amp;#34;version&amp;#34;: &amp;#34;8.0.23&amp;#34;
  },
  &amp;#34;conditions&amp;#34;: [
    {
      &amp;#34;lastTransitionTime&amp;#34;: &amp;#34;2021-07-12T13:13:46Z&amp;#34;,
      &amp;#34;status&amp;#34;: &amp;#34;True&amp;#34;,
      &amp;#34;type&amp;#34;: &amp;#34;initializing&amp;#34;
    }
  ],
  &amp;#34;haproxy&amp;#34;: {
    &amp;#34;labelSelectorPath&amp;#34;: &amp;#34;...&amp;#34;,
    &amp;#34;ready&amp;#34;: 3,
    &amp;#34;size&amp;#34;: 3,
    &amp;#34;status&amp;#34;: &amp;#34;ready&amp;#34;
  },
  &amp;#34;host&amp;#34;: &amp;#34;lisette-18537-haproxy.subjectivism-22940&amp;#34;,
  &amp;#34;logcollector&amp;#34;: {
    &amp;#34;version&amp;#34;: &amp;#34;1.8.0&amp;#34;
  },
  &amp;#34;observedGeneration&amp;#34;: 2,
  &amp;#34;pmm&amp;#34;: {
    &amp;#34;version&amp;#34;: &amp;#34;2.12.0&amp;#34;
  },
  &amp;#34;proxysql&amp;#34;: {},
  &amp;#34;pxc&amp;#34;: {
    &amp;#34;image&amp;#34;: &amp;#34;percona/percona-xtradb-cluster:8.0.22-13.1&amp;#34;,
    &amp;#34;labelSelectorPath&amp;#34;: &amp;#34;...&amp;#34;,
    &amp;#34;ready&amp;#34;: 2,
    &amp;#34;size&amp;#34;: 3,
    &amp;#34;status&amp;#34;: &amp;#34;initializing&amp;#34;,
    &amp;#34;version&amp;#34;: &amp;#34;8.0.22-13.1&amp;#34;
  },
  &amp;#34;ready&amp;#34;: 5,
  &amp;#34;size&amp;#34;: 6,
  &amp;#34;state&amp;#34;: &amp;#34;initializing&amp;#34;
}
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;And for Percona Operator for MongoDB:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;% kubectl get psmdb
NAME             ENDPOINT                                                     STATUS   AGE
cynodont-26997   cynodont-26997-mongos.subjectivism-22940.svc.cluster.local   ready    85m


% kubectl get psmdb &amp;lt;cluster-name&amp;gt; -o jsonpath=&amp;#39;{.status}&amp;#39;
{
  &amp;#34;conditions&amp;#34;: [
    {
      &amp;#34;lastTransitionTime&amp;#34;: &amp;#34;2021-07-12T13:13:39Z&amp;#34;,
      &amp;#34;status&amp;#34;: &amp;#34;True&amp;#34;,
      &amp;#34;type&amp;#34;: &amp;#34;initializing&amp;#34;
    }
  ],
  &amp;#34;host&amp;#34;: &amp;#34;cynodont-26997-mongos.subjectivism-22940.svc.cluster.local&amp;#34;,
  &amp;#34;mongoImage&amp;#34;: &amp;#34;percona/percona-server-mongodb:4.4.6-8&amp;#34;,
  &amp;#34;mongoVersion&amp;#34;: &amp;#34;4.4.6-8&amp;#34;,
  &amp;#34;mongos&amp;#34;: {
    &amp;#34;ready&amp;#34;: 1,
    &amp;#34;size&amp;#34;: 3,
    &amp;#34;status&amp;#34;: &amp;#34;initializing&amp;#34;
  },
  &amp;#34;observedGeneration&amp;#34;: 2,
  &amp;#34;ready&amp;#34;: 3,
  &amp;#34;replsets&amp;#34;: {
    &amp;#34;cfg&amp;#34;: {
      &amp;#34;ready&amp;#34;: 1,
      &amp;#34;size&amp;#34;: 3,
      &amp;#34;status&amp;#34;: &amp;#34;initializing&amp;#34;
    },
    &amp;#34;rs0&amp;#34;: {
      &amp;#34;initialized&amp;#34;: true,
      &amp;#34;ready&amp;#34;: 2,
      &amp;#34;size&amp;#34;: 3,
      &amp;#34;status&amp;#34;: &amp;#34;initializing&amp;#34;
    }
  },
  &amp;#34;size&amp;#34;: 6,
  &amp;#34;state&amp;#34;: &amp;#34;initializing&amp;#34;
}
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As you can see there are several fields in the output: conditions, cluster
size, number of ready cluster members, statuses and versions of different
components, and the “state”. In the following sections, we’ll take a look at
every possible value of the state field.&lt;/p&gt;
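&lt;p&gt;If only the state is needed, for example in a script, a narrower JSONPath expression should be enough. This is a minimal sketch that assumes the field layout shown in the output above; &lt;code&gt;&amp;lt;cluster-name&amp;gt;&lt;/code&gt; is a placeholder:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;% kubectl get pxc &amp;lt;cluster-name&amp;gt; -o jsonpath=&amp;#39;{.status.state}&amp;#39;
% kubectl get psmdb &amp;lt;cluster-name&amp;gt; -o jsonpath=&amp;#39;{.status.state}&amp;#39;
&lt;/code&gt;&lt;/pre&gt;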
&lt;h2 id=&#34;initializing&#34;&gt;Initializing&lt;/h2&gt;
&lt;p&gt;While the cluster is progressing to readiness, the CR status is “initializing”. This
includes creating the cluster, scaling it up or down, and updating the CR in a way that
triggers a rolling restart of pods (for instance, updating memory limits in Percona
Operator for MySQL).&lt;/p&gt;
&lt;p&gt;Percona Operator for MongoDB also reconfigures the replica set config if
necessary (for instance, it adds new pods as members of the replset or removes
terminated ones). A replica set in MongoDB is a set of servers that implements
replication and automatic failover. Although it shares its name with the
Kubernetes ReplicaSet, it is a different thing. While this reconfiguration is
in progress, or if an unknown or unexpected error occurs during it, the status is
also “initializing”.&lt;/p&gt;
&lt;p&gt;Since version 1.7.0, the Percona Operator for MySQL can handle full crash
recovery if necessary. If a pod waits for the recovery, the cluster status is
“initializing”.&lt;/p&gt;
&lt;h2 id=&#34;ready&#34;&gt;Ready&lt;/h2&gt;
&lt;p&gt;The operator keeps track of the status of each component in the cluster.
Percona Operator for MongoDB has the following components:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;mongod StatefulSet&lt;/li&gt;
&lt;li&gt;configsvr StatefulSet if sharding is enabled&lt;/li&gt;
&lt;li&gt;mongos Deployment if sharding is enabled&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Percona Operator for MySQL components:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;PXC StatefulSet&lt;/li&gt;
&lt;li&gt;HAProxy StatefulSet if enabled&lt;/li&gt;
&lt;li&gt;ProxySQL StatefulSet if enabled&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;All components need to be in “ready” status for the CR to be “ready”. When the number
of ready pods controlled by a component’s StatefulSet or Deployment reaches the desired number, the
operator marks the component as ready. The readiness of the pods is tracked by
Kubernetes using readiness probes for each container in the pod. For example,
for a Percona XtraDB Cluster container to be ready, “wsrep_cluster_status” needs
to be “Primary” and “wsrep_local_state” should be “Synced” or “Donor”. For a
Percona Server for MongoDB container to be ready, accepting TCP connections on
port 27017 is enough.&lt;/p&gt;
&lt;p&gt;But ready as the CR status means more than that. CR “ready” means the cluster
(Percona Server for MongoDB or Percona XtraDB Cluster) is up and running and
ready to receive traffic. So, even if all components are ready, the cluster
status can be “initializing”. In the Percona Operator for MongoDB, the replica
set needs to be initialized and its config up-to-date. Also, with the 1.9.0
release of both operators, the load balancer needs to be ready if the cluster
is exposed with &lt;code&gt;exposeType: LoadBalancer&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&#34;stopping&#34;&gt;Stopping&lt;/h2&gt;
&lt;p&gt;Version 1.9.0 introduced two new statuses:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Stopping&lt;/li&gt;
&lt;li&gt;Paused&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;“Stopping” means the cluster is being paused or deleted and its pods are terminating.&lt;/p&gt;
&lt;p&gt;If you run &lt;code&gt;kubectl delete psmdb &amp;lt;cluster-name&amp;gt;&lt;/code&gt; or &lt;code&gt;kubectl delete pxc
&amp;lt;cluster-name&amp;gt;&lt;/code&gt;, the resource can be deleted so quickly that you never see the
“stopping” status. If the CR has finalizers (for example,
“delete-pxc-pods-in-order” in Percona Operator for MySQL), deletion is
blocked until the finalizer list is exhausted, and you can observe the “stopping”
status.&lt;/p&gt;
&lt;h2 id=&#34;paused&#34;&gt;Paused&lt;/h2&gt;
&lt;p&gt;Once the cluster is paused and all pods are terminated, the CR status becomes “paused”.&lt;/p&gt;
&lt;p&gt;To pause the cluster: &lt;code&gt;kubectl patch &amp;lt;psmdb|pxc&amp;gt; &amp;lt;cluster-name&amp;gt; --type=merge -p &#39;{&amp;quot;spec&amp;quot;: {&amp;quot;pause&amp;quot;: true}}&#39;&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Keep in mind that when the cluster is paused and exposeType is LoadBalancer, the load
balancers are still there and you continue to pay for them.&lt;/p&gt;
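&lt;p&gt;Resuming should be the mirror image of pausing, assuming the same &lt;code&gt;spec.pause&lt;/code&gt; field is flipped back: &lt;code&gt;kubectl patch &amp;lt;psmdb|pxc&amp;gt; &amp;lt;cluster-name&amp;gt; --type=merge -p &#39;{&amp;quot;spec&amp;quot;: {&amp;quot;pause&amp;quot;: false}}&#39;&lt;/code&gt;&lt;/p&gt;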
&lt;h2 id=&#34;error&#34;&gt;Error&lt;/h2&gt;
&lt;p&gt;Before 1.9.0, “error” status could mean two different things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;An error occurred in the operator during the reconciliation of the CR&lt;/li&gt;
&lt;li&gt;One or more pods in a component are not schedulable&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;With 1.9.0, the “error” status means only operator errors. If there is an
unschedulable pod, the cluster’s status will be “initializing”. If the cluster is
stuck in “initializing” for too long, check the operator logs to
investigate.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;% kubectl logs &amp;lt;operator-pod-name&amp;gt;
...
{&amp;#34;level&amp;#34;:&amp;#34;info&amp;#34;,&amp;#34;ts&amp;#34;:1626095618.9982307,&amp;#34;logger&amp;#34;:&amp;#34;controller_psmdb&amp;#34;,&amp;#34;msg&amp;#34;:&amp;#34;Created a new mongo key&amp;#34;,&amp;#34;Request.Namespace&amp;#34;:&amp;#34;subjectivism-22940&amp;#34;,&amp;#34;Request.Name&amp;#34;:&amp;#34;cynodont-26997&amp;#34;,&amp;#34;KeyName&amp;#34;:&amp;#34;cynodont-26997-mongodb-keyfile&amp;#34;}
{&amp;#34;level&amp;#34;:&amp;#34;info&amp;#34;,&amp;#34;ts&amp;#34;:1626095619.0032709,&amp;#34;logger&amp;#34;:&amp;#34;controller_psmdb&amp;#34;,&amp;#34;msg&amp;#34;:&amp;#34;Created a new mongo key&amp;#34;,&amp;#34;Request.Namespace&amp;#34;:&amp;#34;subjectivism-22940&amp;#34;,&amp;#34;Request.Name&amp;#34;:&amp;#34;cynodont-26997&amp;#34;,&amp;#34;KeyName&amp;#34;:&amp;#34;cynodont-26997-mongodb-encryption-key&amp;#34;}
{&amp;#34;level&amp;#34;:&amp;#34;info&amp;#34;,&amp;#34;ts&amp;#34;:1626095687.3783236,&amp;#34;logger&amp;#34;:&amp;#34;controller_psmdb&amp;#34;,&amp;#34;msg&amp;#34;:&amp;#34;initiating replset&amp;#34;,&amp;#34;replset&amp;#34;:&amp;#34;rs0&amp;#34;,&amp;#34;pod&amp;#34;:&amp;#34;cynodont-26997-rs0-1&amp;#34;}
{&amp;#34;level&amp;#34;:&amp;#34;info&amp;#34;,&amp;#34;ts&amp;#34;:1626095694.020591,&amp;#34;logger&amp;#34;:&amp;#34;controller_psmdb&amp;#34;,&amp;#34;msg&amp;#34;:&amp;#34;replset was initialized&amp;#34;,&amp;#34;replset&amp;#34;:&amp;#34;rs0&amp;#34;,&amp;#34;pod&amp;#34;:&amp;#34;cynodont-26997-rs0-1&amp;#34;}
{&amp;#34;level&amp;#34;:&amp;#34;error&amp;#34;,&amp;#34;ts&amp;#34;:1626095694.622869,&amp;#34;logger&amp;#34;:&amp;#34;controller_psmdb&amp;#34;,&amp;#34;msg&amp;#34;:&amp;#34;failed to reconcile cluster&amp;#34;,&amp;#34;Request.Namespace&amp;#34;:&amp;#34;subjectivism-22940&amp;#34;,&amp;#34;Request.Name&amp;#34;:&amp;#34;cynodont-26997&amp;#34;,&amp;#34;replset&amp;#34;:&amp;#34;rs0&amp;#34;,&amp;#34;error&amp;#34;:&amp;#34;undefined state of the replset member cynodont-26997-rs0-0.cynodont-26997-rs0.subjectivism-22940.svc.cluster.local:27017: 6&amp;#34;,&amp;#34;errorVerbose&amp;#34;:&amp;#34;undefined state of the replset member cynodont-26997-rs0-0.cynodont-26997-rs0.subjectivism-22940.svc.cluster.local:27017: 6\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).reconcileCluster\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/mgo.go:210\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:449\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:256\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/percona/percon
a-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371&amp;#34;,&amp;#34;stacktrace&amp;#34;:&amp;#34;github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:451\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:256\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/percona/percona-serve
r-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88&amp;#34;}
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;You can try the new statuses in version 1.9.0 of both Percona Operator for MongoDB
and Percona Operator for MySQL. Version 1.9.0 of Percona Operator for MongoDB was released in
June, and the Percona Operator for MySQL release is on the way.&lt;/p&gt;
</description>
      <category>kubernetes</category>
      <category>mongodb</category>
      <category>mysql</category>
    </item>
    <item>
      <title>Kubernetes Resource Management</title>
      <link>https://ege.dev/entries/2020/11/kubernetes-resource-management/</link>
      <pubDate>Fri, 13 Nov 2020 00:00:00 +0000</pubDate>
      <guid isPermaLink="false">https://ege.dev/entries/2020/11/kubernetes-resource-management/</guid>
      <description>&lt;p&gt;I had the chance to listen to a &lt;a href=&#34;https://www.youtube.com/watch?v=ss6p7pjnd1U&#34;&gt;presentation
by Bekir Doğan&lt;/a&gt;, a former Kartaca
employee, at an event in 2017. I was very impressed when I heard that, back in
2005, they were already setting up and distributing all the services they managed in
&lt;a href=&#34;https://openvz.org/&#34;&gt;OpenVZ&lt;/a&gt; containers.
Was anyone really into this type of thing back then?&lt;/p&gt;
&lt;p&gt;Apparently, yes. Since the early 2000s, most of the industry and the Linux
community have been trying to make containers into what they are today. In
particular, Google has been a pioneer in making containers mainstream with its
contributions. You might have thought of Kubernetes right away, but this time
I’m talking about &lt;a href=&#34;https://en.wikipedia.org/wiki/Cgroups&#34;&gt;cgroups&lt;/a&gt; technology.&lt;/p&gt;
&lt;h2 id=&#34;a-brief-introduction-to-cgroups&#34;&gt;A brief introduction to cgroups&lt;/h2&gt;
&lt;p&gt;It is easy for a container to act as if it were the sole owner of the entire system, since it
only isolates a group of processes and shares the same kernel with other containers
and applications instead of virtualizing the whole system. An aggressively
resource-consuming container can starve its fellow containers and make
the system unstable.&lt;/p&gt;
&lt;p&gt;To prevent this, we can allocate the entire system to a container using
virtualization, but this will waste resources most of the time. For example, in
&lt;a href=&#34;https://research.google/pubs/pub43438/&#34;&gt;Borg’s design documents&lt;/a&gt;, maximum
utilization of resources is stated as one of the project’s main objectives.&lt;/p&gt;
&lt;p&gt;At this point, cgroups comes into play.&lt;/p&gt;
&lt;p&gt;Google engineers started developing cgroups in 2006, and it was included in
Linux 2.6.24 in 2008. It is a disruptive feature that shaped the ecosystem with
the domino effect it creates.&lt;/p&gt;
&lt;p&gt;With the inclusion of the code in Linux, system administrators can group the
system’s processes/tasks and subject them to common constraints. Process
priorities and resource limits can be configured per group, and resource usage
can be accounted for per group. Moreover, this kernel capability paved the way for software that
radically changed system management, such as LXC and later Docker.&lt;/p&gt;
&lt;p&gt;Here is a small cgroups demo for you:&lt;/p&gt;
&lt;script id=&#34;asciicast-372532&#34; src=&#34;https://asciinema.org/a/372532.js&#34; async&gt;&lt;/script&gt;
&lt;p&gt;To summarize the demo:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;We create a control group called &lt;code&gt;fibtest&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Within the group we create, we run a small C application called &lt;code&gt;fibtest&lt;/code&gt;. This
application generates Fibonacci sequences continuously.&lt;/li&gt;
&lt;li&gt;With systemd-cgtop, we monitor resource consumption in all groups in the
system.&lt;/li&gt;
&lt;li&gt;The real story starts here. Inside the group, we change two values:
&lt;code&gt;cpu.cfs_period_us&lt;/code&gt; and &lt;code&gt;cpu.cfs_quota_us&lt;/code&gt;. With &lt;code&gt;cfs_period_us&lt;/code&gt;, we set how many
microseconds each CPU period lasts (50000); with
&lt;code&gt;cfs_quota_us&lt;/code&gt;, we set the maximum number of microseconds the group can
use in each period (1000). That is 1000/50000, or 2% of one CPU. Long story short, we choke the program.&lt;/li&gt;
&lt;li&gt;We take back the values, and &lt;code&gt;fibtest&lt;/code&gt; breathes again.&lt;/li&gt;
&lt;/ol&gt;
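&lt;p&gt;For readers who prefer text to a recording, the steps above might look roughly like the sketch below. It assumes cgroup v1 with the cpu controller mounted at &lt;code&gt;/sys/fs/cgroup/cpu&lt;/code&gt; and a &lt;code&gt;fibtest&lt;/code&gt; binary in the current directory; the exact commands in the recording may differ.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;# 1. create a control group named fibtest under the cpu controller
$ sudo cgcreate -g cpu:fibtest
# 2. run the program inside the group (it keeps running in this terminal)
$ sudo cgexec -g cpu:fibtest ./fibtest
# 3. in a second terminal, watch per-group resource usage
$ systemd-cgtop
# 4. choke it: 1000 us of CPU time per 50000 us period, i.e. 2% of one CPU
$ echo 50000 | sudo tee /sys/fs/cgroup/cpu/fibtest/cpu.cfs_period_us
$ echo 1000 | sudo tee /sys/fs/cgroup/cpu/fibtest/cpu.cfs_quota_us
# 5. let it breathe again: a quota of -1 removes the limit,
#    and 100000 us is the default period
$ echo -1 | sudo tee /sys/fs/cgroup/cpu/fibtest/cpu.cfs_quota_us
$ echo 100000 | sudo tee /sys/fs/cgroup/cpu/fibtest/cpu.cfs_period_us
&lt;/code&gt;&lt;/pre&gt;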
&lt;p&gt;You can also get the cgcreate and cgexec tools we used in the demo by
installing &lt;code&gt;cgroup-tools&lt;/code&gt; in Ubuntu 18.04, and &lt;code&gt;libcgroup-tools&lt;/code&gt; in Fedora 32.&lt;/p&gt;
&lt;h2 id=&#34;kubernetes-and-cgroups&#34;&gt;Kubernetes and cgroups&lt;/h2&gt;
&lt;p&gt;By default, system programs and containers compete for the machine’s
resources. The containers’ resource consumption can make the machine unstable
if no resources are reserved for system operations.&lt;/p&gt;
&lt;p&gt;Kubernetes also uses cgroups to isolate resources between system and user processes.
A cgroup called kubepods is created on every machine (if it doesn’t
already exist). For system daemons and Kubernetes services, however, cgroups are not
created automatically; system administrators need to reconfigure kubelet.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;apiVersion&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;kubelet.config.k8s.io/v1beta1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;kind&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;KubeletConfiguration&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;systemReserved&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;cpu&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;500m&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;memory&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;500M&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;kubeReserved&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;cpu&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;500m&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;memory&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;500M&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;https://ege.dev/images/allocatable.png&#34; alt=&#34;Allocatable&#34;&gt;&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;$ kubectl describe node&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;...
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;Capacity&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;attachable-volumes-gce-pd&lt;/span&gt;:  &lt;span style=&#34;color:#ae81ff&#34;&gt;127&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;cpu&lt;/span&gt;:                        &lt;span style=&#34;color:#ae81ff&#34;&gt;8&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;ephemeral-storage&lt;/span&gt;:          &lt;span style=&#34;color:#ae81ff&#34;&gt;47259264Ki&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;hugepages-2Mi&lt;/span&gt;:              &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;memory&lt;/span&gt;:                     &lt;span style=&#34;color:#ae81ff&#34;&gt;30879764Ki&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;pods&lt;/span&gt;:                       &lt;span style=&#34;color:#ae81ff&#34;&gt;110&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;Allocatable&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;attachable-volumes-gce-pd&lt;/span&gt;:  &lt;span style=&#34;color:#ae81ff&#34;&gt;127&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;cpu&lt;/span&gt;:                        &lt;span style=&#34;color:#ae81ff&#34;&gt;7&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;ephemeral-storage&lt;/span&gt;:          &lt;span style=&#34;color:#ae81ff&#34;&gt;18858075679&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;hugepages-2Mi&lt;/span&gt;:              &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;memory&lt;/span&gt;:                     &lt;span style=&#34;color:#ae81ff&#34;&gt;29831188Ki&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;pods&lt;/span&gt;:                       &lt;span style=&#34;color:#ae81ff&#34;&gt;110&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;...
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;code&gt;Capacity&lt;/code&gt; shows the total resources seen by Kubernetes in the machine, and
&lt;code&gt;Allocatable&lt;/code&gt; shows the total resources allocated to user pods.&lt;/p&gt;
&lt;h2 id=&#34;resource-requests-and-limits&#34;&gt;Resource requests and limits&lt;/h2&gt;
&lt;p&gt;Two concepts immediately appear in Kubernetes resource management: resource
request and resource limit.&lt;/p&gt;
&lt;p&gt;The resource request is taken into account by the scheduler when placing pods on
machines. As long as a machine has enough room left in the resources allocated to
user pods, the pod can be assigned to it. Once the pod is assigned to a
machine, &lt;code&gt;kubelet&lt;/code&gt; guarantees that the container can always use the requested
resource. “Pending” status means that the pod is waiting to be assigned. Every
pod’s lifecycle includes a “Pending” status; however, if a pod spends a long
time in this state, it is worth reviewing its resource request. A pod’s
resource request is the sum of the requests of the containers in
it.&lt;/p&gt;
&lt;p&gt;It is crucial to define resource requests for each container to make efficient
use of the total resources available to user pods. While assigning a pod, the scheduler checks whether the
machine’s available capacity is enough for the pod’s entire resource request.
In practice, even if the pods on the machine consume
fewer resources than they request, no new pods are assigned to the machine once
the sum of their requests equals the available resources, because, as
I mentioned above, the resource requested by a container is always guaranteed
by &lt;code&gt;kubelet&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Resource limit is taken into account by &lt;code&gt;kubelet&lt;/code&gt; to prevent a pod from consuming
all the system resources.&lt;/p&gt;
&lt;p&gt;A container can consume more resources than it requests. However, if it
consumes more memory than its request and the machine runs low on memory, its
pod is likely to be evicted.&lt;/p&gt;
&lt;p&gt;What happens to a container that consumes more than its limit depends on the
respective resource:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If the memory limit is exceeded, the container can be terminated and restarted if possible.&lt;/li&gt;
&lt;li&gt;If the CPU limit is exceeded, the container is not terminated; only the CPU usage is throttled.&lt;/li&gt;
&lt;/ul&gt;
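&lt;p&gt;As an illustration, requests and limits are set per container in the pod spec. The manifest below is a minimal sketch; the pod name, container name, image, and values are made up for the example:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;apiVersion: v1
kind: Pod
metadata:
  name: demo-app            # hypothetical name
spec:
  containers:
  - name: app               # hypothetical container
    image: nginx:1.19       # any image works here
    resources:
      requests:             # used by the scheduler for placement
        cpu: 250m
        memory: 128Mi
      limits:               # enforced on the node at runtime
        cpu: 500m           # exceeding this throttles the container
        memory: 256Mi       # exceeding this can get the container terminated
&lt;/code&gt;&lt;/pre&gt;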
&lt;p&gt;In practice, the total resource request of all containers on the machine cannot
exceed the resources allocated to user pods; however, the sum of resource limits
may be well above the available resources.&lt;/p&gt;
&lt;p&gt;The total resource limit can exceed the machine’s capacity, just as airlines
overbook flights by selling more tickets than there are seats. In this case, it
becomes even more important to reserve resources for the system and Kubernetes
processes, as I explained above.&lt;/p&gt;
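&lt;p&gt;To make the overbooking concrete, here is a hedged sketch. It assumes a single
machine with 4 CPUs available to user pods and nothing else running on it; the
Deployment name and values are hypothetical. All four replicas fit by their
requests, while their limits add up to twice the capacity.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: apps/v1
kind: Deployment
metadata:
  name: overcommit-example    # hypothetical name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: overcommit-example
  template:
    metadata:
      labels:
        app: overcommit-example
    spec:
      containers:
      - name: app
        image: ...
        resources:
          requests:
            cpu: 1            # 4 replicas x 1 CPU  = 4 CPUs requested (fits)
          limits:
            cpu: 2            # 4 replicas x 2 CPUs = 8 CPUs in limits (overbooked)
&lt;/code&gt;&lt;/pre&gt;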
&lt;h2 id=&#34;namespace-level-resource-management&#34;&gt;Namespace level resource management&lt;/h2&gt;
&lt;p&gt;If you use Kubernetes namespaces to separate your services or environments
(such as test and qa), you can set predefined resource requests and limits for
each namespace. This way, each container gets default values without you
configuring its resources separately.&lt;/p&gt;
&lt;p&gt;For the default resource configuration, it is necessary to define a &lt;code&gt;LimitRange&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;apiVersion&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;v1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;kind&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;LimitRange&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;metadata&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;name&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;qa-limitrange&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;spec&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;limits&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#f92672&#34;&gt;default&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#f92672&#34;&gt;cpu&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#f92672&#34;&gt;memory&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;512Mi&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;defaultRequest&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#f92672&#34;&gt;cpu&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;500m&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#f92672&#34;&gt;memory&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;256Mi&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;type&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;Container&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;$ kubectl describe limits&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;Name&lt;/span&gt;:       &lt;span style=&#34;color:#ae81ff&#34;&gt;qa-limitrange&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;Namespace&lt;/span&gt;:  &lt;span style=&#34;color:#ae81ff&#34;&gt;qa&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;Type        Resource  Min  Max  Default Request  Default Limit  Max Limit/Request Ratio&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;----        --------  ---  ---  ---------------  ------------- -----------------------
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;Container   cpu       -    -    500m             1              -&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;Container   memory    -    -    256Mi            512Mi          -&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Since we deal with enough YAML files to experience minor crying
outbursts during the day, it is desirable to write four fewer lines in each
file. However, &lt;code&gt;LimitRange&lt;/code&gt; does more than that. Especially in multi-tenant
Kubernetes clusters, we can define a &lt;code&gt;LimitRange&lt;/code&gt; to validate resource requests and
limits.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: v1
kind: LimitRange
metadata:
  name: dev-limitrange
spec:
  limits:
  - max:
      cpu: 2
      memory: 1Gi
    min:
      cpu: 500m
      memory: 256Mi
    type: Container



$ kubectl describe limits
Name:       dev-limitrange
Namespace:  dev
Type        Resource  Min    Max  Default Request  Default Limit  Max Limit/Request Ratio
----        --------  ---    ---  ---------------  ------------- -----------------------
Container   cpu       500m   2    2                2              -
Container   memory    256Mi  1Gi  1Gi              1Gi            -
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;When we define a &lt;code&gt;LimitRange&lt;/code&gt; for a namespace, an admission controller
checks the resource configuration of every pod before it is accepted into the
cluster. Pods that do not conform to the
&lt;code&gt;LimitRange&lt;/code&gt; rules get rejected. This way, third parties, independent of the
system administrator, can deploy new pods without making other services
unstable.&lt;/p&gt;
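&lt;p&gt;As an illustration, the hypothetical pod below would not pass the
&lt;code&gt;dev-limitrange&lt;/code&gt; rules shown earlier, because its CPU limit is above the allowed
maximum. The pod name and values are made up for this sketch.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: too-greedy            # hypothetical name
  namespace: dev
spec:
  containers:
  - name: app
    image: ...
    resources:
      requests:
        cpu: 1
        memory: 512Mi
      limits:
        cpu: 3                # above the LimitRange max of 2, so the pod is rejected
        memory: 1Gi
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Lowering the CPU limit to 2 or less would bring the pod back within the
allowed range.&lt;/p&gt;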
&lt;p&gt;Our control over a namespace is not limited to this. We can also limit the total
resources of the namespace by defining a &lt;code&gt;ResourceQuota&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;apiVersion&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;v1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;kind&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;ResourceQuota&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;metadata&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;name&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;qa-resourcequota&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;namespace&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;qa&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;spec&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;hard&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;requests.cpu&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;2&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;requests.memory&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;8Gi&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;limits.cpu&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;2&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;limits.memory&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;8Gi&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;---
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;apiVersion&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;v1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;kind&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;ResourceQuota&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;metadata&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;name&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;test-resourcequota&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;namespace&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;test&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;spec&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;hard&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;requests.cpu&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;1&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;requests.memory&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;4Gi&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;limits.cpu&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;1&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;limits.memory&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;4Gi&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;With the above configuration, we guarantee the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The total memory request and limit of pods in the namespace cannot exceed 8 GiB for
qa and 4 GiB for test.&lt;/li&gt;
&lt;li&gt;The total CPU request and limit of pods in the namespace cannot exceed 2 CPUs for
qa and 1 CPU for test.&lt;/li&gt;
&lt;/ol&gt;
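&lt;p&gt;Here is a hedged sketch of how the qa quota plays out. It assumes the qa
namespace is otherwise empty; the Deployment name and values are hypothetical.
Four replicas exactly fill the CPU quota, so the fifth pod should be refused.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: apps/v1
kind: Deployment
metadata:
  name: quota-example         # hypothetical name
  namespace: qa
spec:
  replicas: 5
  selector:
    matchLabels:
      app: quota-example
  template:
    metadata:
      labels:
        app: quota-example
    spec:
      containers:
      - name: app
        image: ...
        resources:
          requests:
            cpu: 500m         # 4 replicas x 500m = 2 CPUs, the requests.cpu quota
            memory: 1Gi       # 4 replicas x 1Gi  = 4Gi, below the 8Gi quota
          limits:
            cpu: 500m         # a 5th replica would need 2.5 CPUs in total
            memory: 1Gi
&lt;/code&gt;&lt;/pre&gt;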
&lt;h1 id=&#34;qos-classes&#34;&gt;QoS classes&lt;/h1&gt;
&lt;p&gt;The QoS (Quality of Service) classes of pods are essential for both
scheduling and eviction. There are three QoS classes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Guaranteed&lt;/strong&gt;: Every container in the pod has both CPU and memory requests
and limits, and each request equals its limit.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Burstable&lt;/strong&gt;: The pod does not qualify as Guaranteed, and at least one of its
containers has a resource request or limit.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;BestEffort&lt;/strong&gt;: No container in the pod has any resource requests or limits.&lt;/li&gt;
&lt;/ul&gt;
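&lt;p&gt;As a minimal sketch of the Guaranteed class, the hypothetical pod below sets
CPU and memory requests equal to their limits for its only container. The pod
name and values are made up for illustration.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-example    # hypothetical name
spec:
  containers:
  - name: app
    image: ...
    resources:
      requests:
        cpu: 500m
        memory: 512Mi
      limits:
        cpu: 500m             # equal to the CPU request
        memory: 512Mi         # equal to the memory request
&lt;/code&gt;&lt;/pre&gt;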
&lt;p&gt;As you can see, these classes are assigned by Kubernetes based on the pods’
resource configuration.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ kubectl get pod &amp;lt;pod&amp;gt; -o yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod
spec:
  containers:
  - name: container-1
    image: ...
    resources:
      limits:
        memory: 1Gi
      requests:
        cpu: 100m
        memory: 512Mi
status:
  ...
  phase: Running
  qosClass: Burstable
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;eviction&#34;&gt;Eviction&lt;/h2&gt;
&lt;p&gt;Despite all the configurations I explained above, our machines’ resources may
run out, and the cluster may become unstable. In this case, &lt;code&gt;kubelet&lt;/code&gt; will try to
recover resources quickly. If its efforts are futile, the eviction process
begins.&lt;/p&gt;
&lt;p&gt;For eviction, &lt;code&gt;kubelet&lt;/code&gt; ranks the pods:&lt;/p&gt;
&lt;p&gt;BestEffort pods and Burstable pods that use more resources than they
requested are ranked according to their priorities and by how far their
consumption exceeds their requests, and are evicted first.
Guaranteed pods and Burstable pods that consume fewer resources than their
requests are evicted last. As their name suggests, Guaranteed pods are
assured that they will not be evicted due to other pods’ resource
consumption. However, if the resources allocated to the system or Kubernetes
start to run out, they can also be evicted, beginning with the lowest priority
pod.&lt;/p&gt;
&lt;p&gt;Priority is a value that we can set with &lt;code&gt;PriorityClass&lt;/code&gt;, again as Kubernetes
administrators. In order not to complicate this blog post further, I end it
here after sharing relevant documents.&lt;/p&gt;
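&lt;p&gt;For reference, a minimal &lt;code&gt;PriorityClass&lt;/code&gt; and a pod that uses it might look like
the sketch below. The names and the priority value are hypothetical and chosen
only for illustration.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-services     # hypothetical name
value: 100000                 # a higher value means a higher priority
globalDefault: false          # pods must opt in through priorityClassName
description: Priority class for critical services.
---
apiVersion: v1
kind: Pod
metadata:
  name: priority-example      # hypothetical name
spec:
  priorityClassName: critical-services
  containers:
  - name: app
    image: ...
&lt;/code&gt;&lt;/pre&gt;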
&lt;p&gt;This post was originally published on Kartaca&amp;rsquo;s blog. Thanks to the lovely
&lt;a href=&#34;https://tr.linkedin.com/in/gizemterzi&#34;&gt;Gizem&lt;/a&gt; for editing and translating the
text.&lt;/p&gt;
</description>
      <category>kubernetes</category>
    </item>
  </channel>
</rss>
