Problem - Kubeflow Pipelines' MySQL does not start after a Docker restart on the worker node

I removed the nvidia-device-plugin DaemonSet and installed gpu-operator, so I had to restart the Docker daemon to re-apply its daemon.json.

After I restarted Docker on my Kubeflow nodes, the gcr.io/ml-pipeline/mysql container would not start and logged these errors:


2024-04-12T04:52:11.367223Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2024-04-12T04:52:11.578169Z 0 [ERROR] InnoDB: Ignoring the redo log due to missing MLOG_CHECKPOINT between the checkpoint 2392029097 and the end 2392029025.
2024-04-12T04:52:11.578214Z 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2024-04-12T04:52:12.078809Z 0 [ERROR] Plugin 'InnoDB' init function returned error.
2024-04-12T04:52:12.078846Z 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2024-04-12T04:52:12.078856Z 0 [ERROR] Failed to initialize builtin plugins.
2024-04-12T04:52:12.078862Z 0 [ERROR] Aborting
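
While the pod is crash-looping, the same errors can be pulled straight from the container logs. A minimal sketch, assuming the Deployment and container are both named mysql in the kubeflow namespace (as in the manifest further down); the helper name is made up for illustration:

```shell
# Hypothetical helper: print only the [ERROR] lines from the mysql container.
# Extra arguments are passed to kubectl logs, e.g. --previous to read the
# logs of the last crashed container.
mysql_error_lines() {
  kubectl -n kubeflow logs deployment/mysql -c mysql --tail=200 "$@" |
    grep -F '[ERROR]'
}

# Usage:
#   mysql_error_lines --previous
```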

Find the PVC and PV volume path

I do this because the mysql pod mounts a PV as its datadir, and that PV is bound to NFS storage (storage class nfs-client, which points to a NAS volume).



apiVersion: v1
kind: PersistentVolumeClaim
...
    requests:
      storage: 20Gi
  storageClassName: nfs-client
  volumeMode: Filesystem
  volumeName: pvc-7f636ab1-0c3c-426e-81aa-884416119d36

---

PV

  nfs:
    path: /volume1/xxx-prod-storage/kubeflow-mysql-pv-claim-pvc-7f636ab1-0c3c-426e-81aa-884416119d36
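
The same lookup can be done without reading the full YAML. This is only a sketch: the helper names are hypothetical, and the claim name and namespace default to the ones used by the mysql Deployment in this note (mysql-pv-claim in kubeflow).

```shell
# Hypothetical helper: print the name of the PV bound to a PVC.
# $1 = claim name (default mysql-pv-claim), $2 = namespace (default kubeflow)
pvc_volume_name() {
  kubectl -n "${2:-kubeflow}" get pvc "${1:-mysql-pv-claim}" \
    -o jsonpath='{.spec.volumeName}'
}

# Hypothetical helper: print "server:path" for an NFS-backed PV.
pv_nfs_location() {
  kubectl get pv "$1" -o jsonpath='{.spec.nfs.server}:{.spec.nfs.path}'
}

# Usage:
#   pv_nfs_location "$(pvc_volume_name)"
```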

Check the mount path on the Synology NAS over SSH


bash-4.4# pwd
/volume1/xxx-prod-storage/kubeflow-mysql-pv-claim-pvc-7f636ab1-0c3c-426e-81aa-884416119d36
bash-4.4# ls
auto.cnf  ca-key.pem  client-cert.pem  ib_buffer_pool  ib_logfile0  ibtmp1  mlpipeline	performance_schema  public_key.pem   server-key.pem
cachedb   ca.pem      client-key.pem   ibdata1	       ib_logfile1  metadb  mysql	private_key.pem     server-cert.pem  sys

Move ib_logfile0 and ib_logfile1 aside as backups, to guard against data loss


bash-4.4# mv ib_logfile0 ib_log.backup0
bash-4.4# mv ib_logfile1 ib_log.backup1
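
A more cautious variant of this step archives the whole datadir before touching the redo logs, because the renamed log files alone are not enough to return to the original state. The sketch below is not what I ran: the function name and the backup directory are hypothetical, the backup directory must sit outside the datadir, and mysqld is assumed to be stopped first.

```shell
# Hypothetical helper: archive the datadir, then move the redo logs aside.
# Assumes mysqld is stopped, e.g. after
#   kubectl -n kubeflow scale deployment mysql --replicas=0
# $1 = MySQL datadir on the NAS, $2 = backup directory outside the datadir
move_redo_logs_aside() {
  datadir=$1
  backup_dir=$2
  mkdir -p "$backup_dir" || return 1
  # Full copy first, so every file can be restored if recovery goes wrong.
  tar -C "$datadir" -cf "$backup_dir/datadir.tar" . || return 1
  for f in "$datadir"/ib_logfile*; do
    [ -e "$f" ] || continue   # the glob matched nothing
    mv "$f" "$backup_dir/" || return 1
  done
}

# Usage (paths are placeholders):
#   move_redo_logs_aside /path/to/datadir /path/to/backup
```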

Then restart the mysql Deployment.
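
One way to do the restart, sketched under the assumption that the Deployment is named mysql in the kubeflow namespace (the helper name is hypothetical):

```shell
# Hypothetical helper: restart the mysql Deployment and wait for the rollout.
# rollout status returns non-zero if the pod is not ready within the timeout,
# which is what happens while mysqld keeps aborting on startup.
restart_mysql() {
  kubectl -n kubeflow rollout restart deployment/mysql &&
    kubectl -n kubeflow rollout status deployment/mysql --timeout=120s
}
```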

Now the logs are different, but a log sequence number error still occurs:


mysql 2024-04-12T05:00:25.976661Z 0 [ERROR] InnoDB: Page [page id: space=0, page number=465] log sequence number 2391995870 is in the future! Current system log sequence number 1981303345.
mysql 2024-04-12T05:00:25.976665Z 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to http://dev.mysql.com/doc/refman/5.7
mysql 2024-04-12T05:00:26.142008Z 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
mysql 2024-04-12T05:00:26.142038Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables

Set MySQL InnoDB options with a ConfigMap

I want to pass InnoDB options to MySQL at startup, so I'll create a ConfigMap.

my.cnf (check the innodb_force_recovery options below first: https://www.notion.so/kade93/MySQL-ERROR-InnoDB-Ignoring-the-redo-log-2024-6e0c23bd1ee04751b8e7d3a78cb322bd?pvs=4#d8049954c79a491e8f1607b82bad6a30)


[mysqld]
innodb_log_checksums = ON
innodb_force_recovery = 1 # Use with care: InnoDB blocks writes while this is non-zero


kubectl create configmap pipeline-mysql-config --from-file=my.cnf=./my.cnf --dry-run=client -oyaml
apiVersion: v1
data:
  my.cnf: |+
    [mysqld]
    innodb_log_checksums = ON
    innodb_force_recovery = 1

kind: ConfigMap
metadata:
  creationTimestamp: null
  name: pipeline-mysql-config


kubectl create configmap pipeline-mysql-config --from-file=my.cnf=./my.cnf --dry-run=client -oyaml > mysql-config.yaml

Edit mysql-config.yaml to add the kubeflow namespace and drop the trailing blank line from my.cnf:


apiVersion: v1
data:
  my.cnf: |
    [mysqld]
    innodb_log_checksums = ON
    innodb_force_recovery = 1
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: pipeline-mysql-config
  namespace: kubeflow
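
The edited file then needs to be applied. A small sketch of applying it and reading the stored my.cnf back as a check (the helper name is hypothetical; the file and ConfigMap names are the ones used above):

```shell
# Hypothetical helper: apply the ConfigMap, then print the my.cnf it holds.
# The dot in the key name has to be escaped inside the jsonpath expression.
apply_mysql_config() {
  kubectl apply -f "${1:-mysql-config.yaml}" &&
    kubectl -n kubeflow get configmap pipeline-mysql-config \
      -o jsonpath='{.data.my\.cnf}'
}
```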

Edit the mysql Deployment to mount the ConfigMap (the config-volume entries under volumeMounts and volumes are the additions):


apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
  labels:
    app: mysql
    application-crd-id: kubeflow-pipelines
  name: mysql
  namespace: kubeflow
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: mysql
      application-crd-id: kubeflow-pipelines
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql
        application-crd-id: kubeflow-pipelines
    spec:
      containers:
      - args:
        - --ignore-db-dir=lost+found
        - --datadir
        - /var/lib/mysql
        env:
        - name: MYSQL_ALLOW_EMPTY_PASSWORD
          value: "true"
        image: gcr.io/ml-pipeline/mysql:5.7.37
        imagePullPolicy: IfNotPresent
        name: mysql
        ports:
        - containerPort: 3306
          name: mysql
          protocol: TCP
        resources:
          requests:
            cpu: "1"
            memory: 1Gi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/mysql
          name: mysql-persistent-storage
        - mountPath: /etc/mysql/conf.d/my-custom.cnf
          subPath: my.cnf
          name: config-volume
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: mysql
      serviceAccountName: mysql
      terminationGracePeriodSeconds: 30
      volumes:
      - name: mysql-persistent-storage
        persistentVolumeClaim:
          claimName: mysql-pv-claim
      - name: config-volume
        configMap:
          name: pipeline-mysql-config
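
After the pod restarts, it is worth confirming that the file is mounted and that mysqld picked up the option. A sketch, assuming root can log in without a password (MYSQL_ALLOW_EMPTY_PASSWORD is set in the manifest) and using a hypothetical helper name:

```shell
# Hypothetical helper: show the mounted config file and the live setting.
check_force_recovery() {
  kubectl -n kubeflow exec deployment/mysql -c mysql -- \
    cat /etc/mysql/conf.d/my-custom.cnf &&
    kubectl -n kubeflow exec deployment/mysql -c mysql -- \
      mysql -uroot -N -e "SHOW VARIABLES LIKE 'innodb_force_recovery'"
}
```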

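Finally, MySQL starts successfully on Kubernetes, although the log still reports a corrupt page and writes are rejected while innodb_force_recovery is on: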


2024-04-12T05:21:58.556526Z 0 [Note] mysqld: ready for connections.
Version: '5.7.37'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
2024-04-12T05:21:59.464324Z 0 [ERROR] InnoDB: Database page corruption on disk or a failed file read of page [page id: space=40, page number=3]. You may have to recover from a backup.
2024-04-12T05:21:59.464365Z 0 [Note] InnoDB: Page dump in ascii and hex (16384 bytes):
 len 16384; hex 52263e4200000003ffffffffffffffff000000008e936ea245bf00000000000000000000002800021747800300000000008c00050000000100000000000000000000000000000000004a000000280000000200f200000028000000020032010
InnoDB: End of page dump
2024-04-12T05:21:59.505108Z 0 [Note] InnoDB: Uncompressed page, stored checksum in field1 1378238018, calculated checksums for field1: crc32 1378238018/3377886394, innodb 1936845683, none 3735928559, stored
InnoDB: Page may be an update undo log page
InnoDB: Page may be an index page where index id is 74
2024-04-12T05:21:59.505126Z 0 [Note] InnoDB: It is also possible that your operating system has corrupted its own file cache and rebooting your computer removes the error. If the corrupt page is an index pag
2024-04-12T05:21:59.505108Z 0 [Note] InnoDB: Buffer pool(s) load completed at 240412  5:22:00
2024-04-12T05:22:18.221555Z 3 [ERROR] InnoDB: innodb_force_recovery is on. We do not allow database modifications by the user. Shut down mysqld and edit my.cnf to set innodb_force_recovery=0
2024-04-12T05:22:18.239105Z 3 [Note] Aborted connection 3 to db: 'cachedb' user: 'root' host: '127.0.0.1' (Got an error reading communication packets)
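
The last error means clients cannot write until innodb_force_recovery is set back to 0. A common step before doing that is to dump the data while the server is up. The sketch below was not part of my original steps: the helper name and output file name are hypothetical, the database names come from the datadir listing above (mlpipeline, metadb, cachedb), and the dump may still fail on tables that sit on corrupt pages.

```shell
# Hypothetical helper: dump the Kubeflow databases to a local file while
# mysqld is running with innodb_force_recovery enabled.
# $1 = output file (default kubeflow-mysql-dump.sql)
dump_kubeflow_dbs() {
  out=${1:-kubeflow-mysql-dump.sql}
  kubectl -n kubeflow exec deployment/mysql -c mysql -- \
    mysqldump -uroot --databases mlpipeline metadb cachedb > "$out"
}
```

Once a usable dump exists, setting innodb_force_recovery to 0 in the ConfigMap (or removing the line) and restarting the Deployment lets clients write again, as the log message asks.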

References

InnoDB: Ignoring the redo log due to missing MLOG_CHECKPOINT

I am working with mysql version 5.7.14 (homebrew installation on OS X El Capitan). My system did not shutdown properly while the mysql was running and after rebooting when i try starting the mysql ...

https://dba.stackexchange.com/questions/163445/innodb-ignoring-the-redo-log-due-to-missing-mlog-checkpoint

MySQL :: MySQL 5.7 Reference Manual :: 14.22.2 Forcing InnoDB Recovery

To investigate database page corruption, you might dump your tables from the database with SELECT ... INTO OUTFILE. Usually, most of the data obtained in this way is intact. Serious corruption might cause SELECT * FROM tbl_name statements or InnoDB background operations to unexpectedly exit or assert, or even cause InnoDB roll-forward recovery to crash. In such cases, you can use the innodb_force_recovery option to force the InnoDB storage engine to start up while preventing background operations from running, so that you can dump your tables. For example, you can add the following line to the [mysqld] section of your option file before restarting the server: innodb_force_recovery = 1

https://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
