kade.im
kubeflow MySQL [ERROR] InnoDB Ignoring the redo log

Tags: k8s, Infra, Debug
Written: 2024.04

Problem - Kubeflow Pipelines' MySQL does not start after a Docker restart on the worker node

I removed the nvidia-device-plugin DaemonSet and installed gpu-operator, so I had to restart the Docker daemon to re-apply Docker's daemon.json. After the restart, the MySQL pod of Kubeflow Pipelines failed to start with these logs:
 
2024-04-12T04:52:11.367223Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2024-04-12T04:52:11.578169Z 0 [ERROR] InnoDB: Ignoring the redo log due to missing MLOG_CHECKPOINT between the checkpoint 2392029097 and the end 2392029025.
2024-04-12T04:52:11.578214Z 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2024-04-12T04:52:12.078809Z 0 [ERROR] Plugin 'InnoDB' init function returned error.
2024-04-12T04:52:12.078846Z 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2024-04-12T04:52:12.078856Z 0 [ERROR] Failed to initialize builtin plugins.
2024-04-12T04:52:12.078862Z 0 [ERROR] Aborting
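To read the first error more easily, the two log sequence numbers (LSNs) can be pulled out of the message and compared. This is a minimal sketch in shell; the function name `redo_lsn_gap` is made up for illustration, and the reading of the result in the note below is my assumption, not something the log states.

```shell
# Hypothetical helper: read InnoDB log lines on stdin and print
# "<checkpoint> <end> <checkpoint - end>" for every
# "missing MLOG_CHECKPOINT" message it finds.
redo_lsn_gap() {
  sed -n 's/.*between the checkpoint \([0-9][0-9]*\) and the end \([0-9][0-9]*\).*/\1 \2/p' |
  while read -r checkpoint end_lsn; do
    echo "$checkpoint $end_lsn $((checkpoint - end_lsn))"
  done
}

# Example with the error line from the log above.
echo '2024-04-12T04:52:11.578169Z 0 [ERROR] InnoDB: Ignoring the redo log due to missing MLOG_CHECKPOINT between the checkpoint 2392029097 and the end 2392029025.' | redo_lsn_gap
# prints: 2392029097 2392029025 72
```

The checkpoint is 72 ahead of the end of the log that InnoDB could scan. I assume this means the redo log files on the NFS volume were not completely written when Docker was restarted. On a live cluster the same helper could be fed with something like `kubectl -n kubeflow logs deployment/mysql --previous | redo_lsn_gap`.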
 
 

Find the PVC and the PV volume path

  • I do this because the mysql pod mounts the PV as its datadir, and the PV is backed by NFS storage (which points to a NAS volume)
apiVersion: v1
kind: PersistentVolumeClaim
...
    requests:
      storage: 20Gi
  storageClassName: nfs-client
  volumeMode: Filesystem
  volumeName: pvc-7f636ab1-0c3c-426e-81aa-884416119d36
--- pv
  nfs:
    path: /volume1/xxx-prod-storage/kubeflow-mysql-pv-claim-pvc-7f636ab1-0c3c-426e-81aa-884416119d36
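The same lookup can be done without reading the full YAML. A small sketch, assuming the claim is named `mysql-pv-claim` in the `kubeflow` namespace (the names used by the mysql deployment later in this post) and that the PV is NFS-backed; the function name is hypothetical.

```shell
# Hypothetical helper: print the NFS path behind a PVC.
# Defaults are assumptions taken from the mysql deployment in this post.
mysql_pv_path() {
  ns=${1:-kubeflow}
  claim=${2:-mysql-pv-claim}
  # The PVC records the name of the PV it is bound to.
  pv=$(kubectl -n "$ns" get pvc "$claim" -o jsonpath='{.spec.volumeName}') || return 1
  # PVs are cluster-scoped, so no namespace is needed here.
  kubectl get pv "$pv" -o jsonpath='{.spec.nfs.path}'
}
```

Calling `mysql_pv_path` with no arguments should print the directory to look for on the NAS.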
 

Check the mount path on the Synology NAS over SSH

bash-4.4# pwd
/volume1/xxx-prod-storage/kubeflow-mysql-pv-claim-pvc-7f636ab1-0c3c-426e-81aa-884416119d36
bash-4.4# ls
auto.cnf  ca-key.pem  client-cert.pem  ib_buffer_pool  ib_logfile0  ibtmp1  mlpipeline  performance_schema  public_key.pem  server-key.pem
cachedb  ca.pem  client-key.pem  ibdata1  ib_logfile1  metadb  mysql  private_key.pem  server-cert.pem  sys
 
  • Rename ib_logfile0 and ib_logfile1 so they are kept as backups in case of data loss
bash-4.4# mv ib_logfile0 ib_log.backup0
bash-4.4# mv ib_logfile1 ib_log.backup1
 
  • Then restart the mysql deployment
  • Now I see different logs, but a log sequence number error is still occurring
mysql 2024-04-12T05:00:25.976661Z 0 [ERROR] InnoDB: Page [page id: space=0, page number=465] log sequence number 2391995870 is in the future! Current system log sequence number 1981303345.
mysql 2024-04-12T05:00:25.976665Z 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to http://dev.mysql.com/doc/refman/5.7
mysql 2024-04-12T05:00:26.142008Z 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
mysql 2024-04-12T05:00:26.142038Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
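The two mv commands above can be wrapped in a small helper that is safe to run more than once. This is only a sketch: the function name and the `.bak-<suffix>` naming are made up for illustration (I used `ib_log.backup0` and `ib_log.backup1` by hand), and it is meant to run on the NAS, inside or pointed at the datadir.

```shell
# Hypothetical helper: rename every ib_logfile* in a datadir to
# <name>.bak-<suffix>, so mysqld creates fresh redo logs on its next start.
# Usage: backup_redo_logs <datadir> [suffix]
backup_redo_logs() {
  datadir=$1
  suffix=${2:-$(date +%Y%m%d%H%M%S)}
  for f in "$datadir"/ib_logfile*; do
    [ -e "$f" ] || continue                  # the glob matched nothing
    case "$f" in *.bak-*) continue ;; esac   # skip backups from earlier runs
    mv -- "$f" "$f.bak-$suffix"
  done
}
```

For example, `backup_redo_logs .` from inside the PV directory. For the restart itself, one option is `kubectl -n kubeflow rollout restart deployment mysql`; stopping mysqld before touching its files (for example by scaling the deployment to 0 replicas first) would be the more careful order, although that is my suggestion and not what I did here.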
 

Set MySQL InnoDB options using a ConfigMap

  • I want to pass InnoDB options to MySQL at startup, so I'll create a ConfigMap from this my.cnf
[mysqld]
innodb_log_checksums = ON
# Use this option with caution
innodb_force_recovery = 1
 
kubectl create configmap pipeline-mysql-config --from-file=my.cnf=./my.cnf --dry-run=client -oyaml
apiVersion: v1
data:
  my.cnf: |+
    [mysqld]
    innodb_log_checksums = ON
    innodb_force_recovery = 1
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: pipeline-mysql-config
kubectl create configmap pipeline-mysql-config --from-file=my.cnf=./my.cnf --dry-run=client -oyaml > mysql-config.yaml
 
  • Edit mysql-config.yaml to add the kubeflow namespace
apiVersion: v1
data:
  my.cnf: |
    [mysqld]
    innodb_log_checksums = ON
    innodb_force_recovery = 1
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: pipeline-mysql-config
  namespace: kubeflow
 
  • Edit the mysql deployment to mount the ConfigMap as a config volume
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
  labels:
    app: mysql
    application-crd-id: kubeflow-pipelines
  name: mysql
  namespace: kubeflow
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: mysql
      application-crd-id: kubeflow-pipelines
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql
        application-crd-id: kubeflow-pipelines
    spec:
      containers:
      - args:
        - --ignore-db-dir=lost+found
        - --datadir
        - /var/lib/mysql
        env:
        - name: MYSQL_ALLOW_EMPTY_PASSWORD
          value: "true"
        image: gcr.io/ml-pipeline/mysql:5.7.37
        imagePullPolicy: IfNotPresent
        name: mysql
        ports:
        - containerPort: 3306
          name: mysql
          protocol: TCP
        resources:
          requests:
            cpu: "1"
            memory: 1Gi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/mysql
          name: mysql-persistent-storage
        - mountPath: /etc/mysql/conf.d/my-custom.cnf
          subPath: my.cnf
          name: config-volume
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: mysql
      serviceAccountName: mysql
      terminationGracePeriodSeconds: 30
      volumes:
      - name: mysql-persistent-storage
        persistentVolumeClaim:
          claimName: mysql-pv-claim
      - name: config-volume
        configMap:
          name: pipeline-mysql-config
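Applying both objects and checking that MySQL really picked up the option could look like the sketch below. The function name and the default file name `mysql-deployment.yaml` are hypothetical (the deployment can just as well be changed with `kubectl -n kubeflow edit deployment mysql`), and logging in as root without a password relies on `MYSQL_ALLOW_EMPTY_PASSWORD` being set as in the deployment above.

```shell
# Sketch: apply the ConfigMap and the edited deployment, then verify.
# 1) read back the my.cnf key (the dot in the key name must be escaped),
# 2) ask the running server which recovery level it actually uses.
# mysql-deployment.yaml is a hypothetical file holding the deployment YAML.
apply_and_check_mysql_config() {
  deployment_file=${1:-mysql-deployment.yaml}
  kubectl apply -f mysql-config.yaml &&
  kubectl apply -f "$deployment_file" &&
  kubectl -n kubeflow rollout status deployment mysql --timeout=120s &&
  kubectl -n kubeflow get configmap pipeline-mysql-config -o jsonpath='{.data.my\.cnf}' &&
  kubectl -n kubeflow exec deployment/mysql -- \
    mysql -uroot -N -e "SHOW VARIABLES LIKE 'innodb_force_recovery'"
}
```

If everything is wired up, the last command should report `innodb_force_recovery` with the value from my.cnf. The deployment has no readiness probe, so the final query may need a retry while mysqld is still starting.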
 
 
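  • Finally, MySQL started successfully on k8s. The log still reports a corrupted page, and because innodb_force_recovery is on, InnoDB rejects modifications until it is set back to 0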
2024-04-12T05:21:58.556526Z 0 [Note] mysqld: ready for connections.
Version: '5.7.37' socket: '/var/run/mysqld/mysqld.sock' port: 3306 MySQL Community Server (GPL)
2024-04-12T05:21:59.464324Z 0 [ERROR] InnoDB: Database page corruption on disk or a failed file read of page [page id: space=40, page number=3]. You may have to recover from a backup.
2024-04-12T05:21:59.464365Z 0 [Note] InnoDB: Page dump in ascii and hex (16384 bytes):
len 16384; hex 52263e4200000003ffffffffffffffff000000008e936ea245bf00000000000000000000002800021747800300000000008c00050000000100000000000000000000000000000000004a000000280000000200f200000028000000020032010
InnoDB: End of page dump
2024-04-12T05:21:59.505108Z 0 [Note] InnoDB: Uncompressed page, stored checksum in field1 1378238018, calculated checksums for field1: crc32 1378238018/3377886394, innodb 1936845683, none 3735928559, stored
InnoDB: Page may be an update undo log page
InnoDB: Page may be an index page where index id is 74
2024-04-12T05:21:59.505126Z 0 [Note] InnoDB: It is also possible that your operating system has corrupted its own file cache and rebooting your computer removes the error. If the corrupt page is an index pag
2024-04-12T05:22:00.530733Z 0 [Note] InnoDB: Buffer pool(s) load completed at 240412 5:22:00
2024-04-12T05:22:18.221555Z 3 [ERROR] InnoDB: innodb_force_recovery is on. We do not allow database modifications by the user. Shut down mysqld and edit my.cnf to set innodb_force_recovery=0
2024-04-12T05:22:18.239105Z 3 [Note] Aborted connection 3 to db: 'cachedb' user: 'root' host: '127.0.0.1' (Got an error reading communication packets)
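The last two log lines show that Kubeflow cannot write while innodb_force_recovery is on, so this state is only good for reading data out. A possible next step, which I have not verified on this cluster, is to take a dump before switching the option back. The database names below are the directories seen in the datadir listing earlier; the function name and output file name are made up.

```shell
# Sketch: dump the Kubeflow databases while MySQL runs in forced recovery.
# Assumptions: root has an empty password (MYSQL_ALLOW_EMPTY_PASSWORD) and
# the databases are mlpipeline, metadb and cachedb, as seen in the datadir.
dump_kubeflow_dbs() {
  out=${1:-kubeflow-mysql-dump.sql}
  kubectl -n kubeflow exec deployment/mysql -- \
    mysqldump -uroot --databases mlpipeline metadb cachedb > "$out"
}
```

The dump may stop at a table that touches the corrupted page reported above. After that, set `innodb_force_recovery = 0` in the ConfigMap, apply it, and restart the pod (for example with `kubectl -n kubeflow rollout restart deployment mysql`), because a ConfigMap mounted through `subPath` is not refreshed inside a running container.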
 
 
