Problem Statement
The customer expected separate OS users for the Grid Infrastructure (grid) and Oracle Database (ora12c) installations. However, the OEDA configuration was generated with a single user (ora12c) for the entire Oracle stack.
Solution Implemented
Changing Grid Infrastructure ownership after deployment is not straightforward, so we decided to roll back the whole Exadata configuration using the OEDA deployment script (install.sh).
Run the OEDA script as the "root" user with the same configuration file (CONFIG-demo1.xml) that was created during the initial setup.
The setup files are in the /u01/onecommand/linux-x64 directory on the first database server.
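Before starting the rollback it can help to confirm these prerequisites. The following is a minimal sketch, not part of OEDA: the function name oeda_preflight and its messages are hypothetical, and it only checks that install.sh and the configuration file are where the text above says they should be.

```shell
# Hypothetical pre-flight helper; the name and messages are illustrative only.
# Usage: oeda_preflight <oeda_dir> <config_file>
oeda_preflight() {
    oeda_dir=$1
    config=$2
    # install.sh must be present and executable in the OEDA directory.
    if [ ! -x "$oeda_dir/install.sh" ]; then
        echo "install.sh not found or not executable in $oeda_dir" >&2
        return 1
    fi
    # The rollback must use the same configuration file as the initial setup.
    if [ ! -r "$oeda_dir/$config" ]; then
        echo "configuration file $config not readable in $oeda_dir" >&2
        return 1
    fi
    # OEDA is run as root; warn instead of failing so the check can be rehearsed.
    if [ "$(id -u)" -ne 0 ]; then
        echo "warning: not running as root" >&2
    fi
    return 0
}
```

For example: oeda_preflight /u01/onecommand/linux-x64 CONFIG-demo1.xml && cd /u01/onecommand/linux-x64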
Rollback steps performed by OEDA
- Validate Configuration File
- Update Nodes for Eighth Rack
- Setup Required Files
- Create Users
- Setup Cell Connectivity
- Verify Infiniband
- Calibrate Cells
- Create Cell Disks
- Create Grid Disks
- Configure Alerting
- Install Cluster Software
- Initialize Cluster Software
- Install Database Software
- Relink Database with RDS
- Create ASM Diskgroups
- Create Databases
- Apply Security Fixes
- Install Autonomous Health Framework
- Setup ASR Alerting
- Create Installation Summary
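The list above is in deployment order, so its position gives the step number that -u expects (1 for Validate Configuration File, 20 for Create Installation Summary). As a small sketch, a lookup function (oeda_step_name is a hypothetical name) can translate a number into its step name for this particular configuration. The numbering is specific to this deployment; OEDA's own step listing for the loaded configuration file should be treated as authoritative.

```shell
# Map a step number to its name for THIS deployment (20 steps, as listed above).
# Other configurations may have a different number or order of steps.
oeda_step_name() {
    case "$1" in
        1)  echo "Validate Configuration File" ;;
        2)  echo "Update Nodes for Eighth Rack" ;;
        3)  echo "Setup Required Files" ;;
        4)  echo "Create Users" ;;
        5)  echo "Setup Cell Connectivity" ;;
        6)  echo "Verify Infiniband" ;;
        7)  echo "Calibrate Cells" ;;
        8)  echo "Create Cell Disks" ;;
        9)  echo "Create Grid Disks" ;;
        10) echo "Configure Alerting" ;;
        11) echo "Install Cluster Software" ;;
        12) echo "Initialize Cluster Software" ;;
        13) echo "Install Database Software" ;;
        14) echo "Relink Database with RDS" ;;
        15) echo "Create ASM Diskgroups" ;;
        16) echo "Create Databases" ;;
        17) echo "Apply Security Fixes" ;;
        18) echo "Install Autonomous Health Framework" ;;
        19) echo "Setup ASR Alerting" ;;
        20) echo "Create Installation Summary" ;;
        *)  echo "unknown step: $1" >&2; return 1 ;;
    esac
}
```

For example, oeda_step_name 16 prints "Create Databases".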
The log file for each rollback step is written to the /u01/onecommand/linux-x64/log directory.
If OEDA fails, it generates diagnostics files in compressed format.
The /u01/onecommand/linux-x64/WorkDir directory contains the information relevant to the error.
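After a failure, the most recent diagnostics archive is usually the one of interest. The sketch below picks it out of WorkDir. The function name latest_diag is hypothetical, and it assumes the archives are named Diag-<timestamp>.zip with a timestamp that sorts chronologically, as in the Diag-200128_192129.zip example seen later in this post.

```shell
# Print the newest Diag-*.zip in the given WorkDir, or return 1 if none exists.
# Assumption: the file names embed a sortable timestamp (Diag-YYMMDD_HHMMSS.zip).
latest_diag() {
    workdir=$1
    newest=""
    for f in "$workdir"/Diag-*.zip; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        # Globs expand in sorted order, so the last match has the latest name.
        newest=$f
    done
    [ -n "$newest" ] || return 1
    echo "$newest"
}
```

For example: unzip -l "$(latest_diag /u01/onecommand/linux-x64/WorkDir)" lists the contents of the latest archive without extracting it.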
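The undo option is -u { <number-number> | <step num> }, which accepts either a range of steps or a single step number.
The steps are undone in descending order, from step 20 down to step 1. Change directory to /u01/onecommand/linux-x64 before running the commands.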
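The descending order can be written down up front as a plan. This sketch only prints the commands and does not execute anything, so each step can still be run by hand and its output checked before moving on. The function name print_undo_plan is hypothetical, and the default of 20 steps is taken from the step list in this post.

```shell
# Print one undo command per step, from the last step down to step 1.
# Nothing is executed; review the plan and run each line manually.
# Usage: print_undo_plan <config_file> [last_step]
print_undo_plan() {
    config=$1
    last=${2:-20}    # highest step number; 20 matches this deployment
    step=$last
    while [ "$step" -ge 1 ]; do
        echo "./install.sh -cf $config -u $step"
        step=$((step - 1))
    done
}
```

Calling print_undo_plan CONFIG-demo1.xml prints the 20 commands used in the rest of this post, starting with -u 20 and ending with -u 1.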
Create Installation Summary
./install.sh -cf CONFIG-demo1.xml -u 20
Initializing
Undoing Create Installation Summary
This is an invalid option...
Successfully completed execution of step Create Installation Summary [elapsed Time [Elapsed = 5 mS [0.0 minutes] Tue Oct 08 19:16:29 IST 2019]]
Setup ASR Alerting
./install.sh -cf CONFIG-demo1.xml -u 19
Initializing
Undoing Setup ASR Alerting
Successfully completed execution of step Setup ASR Alerting [elapsed Time [Elapsed = 5 mS [0.0 minutes] Tue Oct 08 19:16:46 IST 2019]]
Install Autonomous Health Framework
./install.sh -cf CONFIG-demo1.xml -u 18
Initializing
Undoing Install Autonomous Health Framework
Uninstalling AHF from all the nodes..
Successfully completed execution of step Install Autonomous Health Framework [elapsed Time [Elapsed = 22075 mS [0.0 minutes] Tue Oct 08 19:17:20 IST 2019]]
Apply Security Fixes
./install.sh -cf CONFIG-demo1.xml -u 17
Initializing
Undoing Apply Security Fixes
This is an invalid option...
Successfully completed execution of step Apply Security Fixes [elapsed Time [Elapsed = 5 mS [0.0 minutes] Tue Oct 08 19:20:44 IST 2019]]
Create Databases
./install.sh -cf CONFIG-demo1.xml -u 16
Initializing
Undoing Create Databases
Deleting databases...
Deleting database [db1db1]
====== ERROR from node demo1dbadm01.domain.com ======
Command = demo1dbadm01.domain.com | ora12c | ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1;export ORACLE_HOME; ORACLE_BASE=/u01/app/oracle;export ORACLE_BASE;cd /u01/app/oracle/product/12.1.0.2/dbhome_1; /u01/app/oracle/product/12.1.0.2/dbhome_1/bin/dbca -deleteDatabase -silent -sourceDB db1db1 -continueOnNonFatalErrors true
Ret code = <2> from node demo1dbadm01.domain.com
Auth fail
## ERROR End from node demo1dbadm01.domain.com
====== End ERROR Output from node demo1dbadm01.domain.com ======
Error: Command [ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1;export ORACLE_HOME; ORACLE_BASE=/u01/app/oracle;export ORACLE_BASE;cd /u01/app/oracle/product/12.1.0.2/dbhome_1; /u01/app/oracle/product/12.1.0.2/dbhome_1/bin/dbca -deleteDatabase -silent -sourceDB db1db1 -continueOnNonFatalErrors true] run on node demo1dbadm01.domain.com as user ora12c did not execute successfully...
Collecting diagnostics...
Errors occurred. Send /u01/onecommand/linux-x64/WorkDir/Diag-200128_192129.zip to Oracle to receive assistance.
ERROR: Command [ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1;export ORACLE_HOME;ORACLE_BASE=/u01/app/oracle; export ORACLE_BASE;cd /u01/app/oracle/product/12.1.0.2/dbhome_1; /u01/app/oracle/product/12.1.0.2/dbhome_1/bin/dbca -deleteDatabase -silent -sourceDB db1db1 -continueOnNonFatalErrors true] run on node demo1dbadm01.domain.com as user ora12c did not execute successfully...
Error: Errors occurred...
ERROR: Error running oracle.onecommand.deploy.software.SoftwareUtils method deleteDatabases
Error: Errors occurred...
Errors occured, exiting...
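The database deletion failed with the "Auth fail" error shown above because the ora12c user account was locked. The account was unlocked and its failed-login counter was reset using pam_tally2, and the step was then re-run:
passwd -u ora12c
pam_tally2 --user ora12c --reset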
./install.sh -cf CONFIG-demo1.xml -u 16
Initializing
Undoing Create Databases
Deleting databases...
Deleting database [db1db1]
Deconfiguring Huge Pages after dropping database [db1db1]
Successfully completed execution of step Create Databases [elapsed Time [Elapsed = 124229 mS [2.0 minutes] Tue Oct 08 19:34:19 IST 2019]]
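To avoid hitting the same "Auth fail" error again, the failed-login counter of the software owner can be checked before a step is re-run. The sketch below reads the output of pam_tally2 --user <name> on standard input and prints the failure count. The function name tally_failures is hypothetical, and the parsing assumes the usual two-line layout (a header line followed by a line of the form "<login> <failures> ..."), which should be verified on the target system.

```shell
# Read `pam_tally2 --user <name>` output on stdin and print the failure count.
# Assumption: line 1 is a header and line 2 is "<login> <failures> ...".
# Returns 1 if no second line is present.
tally_failures() {
    awk 'NR == 2 { print $2; found = 1 } END { if (!found) exit 1 }'
}
```

For example, as root: pam_tally2 --user ora12c | tally_failures. A count of 0 after the reset indicates that the counter was cleared.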
Create ASM Diskgroups
./install.sh -cf CONFIG-demo1.xml -u 15
Initializing
Undoing Create ASM Diskgroups
Successfully completed execution of step Create ASM Diskgroups [elapsed Time [Elapsed = 14557 mS [0.0 minutes] Tue Oct 08 19:35:30 IST 2019]]
Relink Database with RDS
./install.sh -cf CONFIG-demo1.xml -u 14
Initializing
Undoing Relink Database with RDS
This is an invalid option...
Successfully completed execution of step Relink Database with RDS [elapsed Time [Elapsed = 4 mS [0.0 minutes] Tue Oct 08 19:35:58 IST 2019]]
Install Database Software
./install.sh -cf CONFIG-demo1.xml -u 13
Initializing
Undoing Install Database Software
Deinstalling database home software...
Deinstalling database home DbHome_1
Successfully completed execution of step Install Database Software [elapsed Time [Elapsed = 67306 mS [1.0 minutes] Tue Oct 08 19:37:31 IST 2019]]
====> DB HOME not removed
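Because the step reports success even though the DB home was left behind, it is worth checking the directory on each database server afterwards. This is a minimal sketch with a hypothetical function name (home_state); it only reports whether the directory still exists and does not delete anything.

```shell
# Report whether an Oracle home directory is still present after the deinstall.
# Prints "present" or "removed"; the path is supplied by the caller.
home_state() {
    if [ -d "$1" ]; then
        echo "present"
    else
        echo "removed"
    fi
}
```

For example, on each database server: home_state /u01/app/oracle/product/12.1.0.2/dbhome_1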
Initialize Cluster Software
./install.sh -cf CONFIG-demo1.xml -u 12
Initializing
Undoing Initialize Cluster Software
Successfully completed execution of step Initialize Cluster Software [elapsed Time [Elapsed = 288219 mS [4.0 minutes] Tue Oct 08 19:43:38 IST 2019]]
Install Cluster Software
./install.sh -cf CONFIG-demo1.xml -u 11
Initializing
Undoing Install Cluster Software
Deinstalling cluster demo1-cluster
Successfully completed execution of step Install Cluster Software [elapsed Time [Elapsed = 114216 mS [1.0 minutes] Tue Oct 08 19:46:05 IST 2019]]
Configure Alerting
./install.sh -cf CONFIG-demo1.xml -u 10
Initializing
Undoing Configure Alerting
DeConfiguring Alerting
Deconfiguring cell alerting on node demo1celadm01.domain.com
Deconfiguring cell alerting on node demo1celadm02.domain.com
Deconfiguring cell alerting on node demo1celadm03.domain.com
Deconfiguring alerting on node demo1dbadm01.domain.com
Deconfiguring alerting on node demo1dbadm02.domain.com
Successfully completed execution of step Configure Alerting [elapsed Time [Elapsed = 26313 mS [0.0 minutes] Tue Oct 08 19:48:12 IST 2019]]
Create Grid Disks
./install.sh -cf CONFIG-demo1.xml -u 9
Initializing
Undoing Create Grid Disks
Dropping grid disks...
If you want to drop grid disks and are aware of potential data loss, re-run this step with -override flag
Successfully completed execution of step Create Grid Disks [elapsed Time [Elapsed = 10 mS [0.0 minutes] Tue Oct 08 19:48:41 IST 2019]]
./install.sh -cf CONFIG-demo1.xml -u 9 -override
Initializing
Undoing Create Grid Disks
Dropping grid disks...
Going to drop grid disks in cluster demo1-cluster
Deleting quorum devices...
Successfully completed execution of step Create Grid Disks [elapsed Time [Elapsed = 21559 mS [0.0 minutes] Tue Oct 08 19:49:46 IST 2019]]
Create Cell Disks
./install.sh -cf CONFIG-demo1.xml -u 8
Initializing
Undoing Create Cell Disks
If you want to drop cell disks and are aware of potential data loss, re-run this step with -override flag
Successfully completed execution of step Create Cell Disks [elapsed Time [Elapsed = 11 mS [0.0 minutes] Tue Oct 08 19:50:36 IST 2019]]
./install.sh -cf CONFIG-demo1.xml -u 8 -override
Initializing
Undoing Create Cell Disks
Reset FlashCache mode to WriteThrough in [demo1celadm01.domain.com, demo1celadm02.domain.com, demo1celadm03.domain.com]
Successfully completed execution of step Create Cell Disks [elapsed Time [Elapsed = 92577 mS [1.0 minutes] Tue Oct 08 19:52:30 IST 2019]]
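The grid disk and cell disk steps drop data only when -override is added, so the flag deserves a deliberate decision. The sketch below builds the command string and appends -override only when explicitly requested, and only for steps 8 and 9, the two steps that asked for it in this rollback. The function name undo_cmd and its third argument are hypothetical, and the command is printed rather than executed.

```shell
# Build the undo command for a step; add -override only on explicit request.
# Usage: undo_cmd <config_file> <step> [override]
undo_cmd() {
    config=$1
    step=$2
    cmd="./install.sh -cf $config -u $step"
    if [ "${3:-}" = "override" ]; then
        case "$step" in
            # 8 = Create Cell Disks, 9 = Create Grid Disks (destructive undo)
            8|9) cmd="$cmd -override" ;;
            *)
                echo "step $step did not ask for -override in this rollback" >&2
                return 1
                ;;
        esac
    fi
    echo "$cmd"
}
```

For example, undo_cmd CONFIG-demo1.xml 9 override prints "./install.sh -cf CONFIG-demo1.xml -u 9 -override", while omitting the third argument prints the non-destructive form.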
Calibrate Cells
./install.sh -cf CONFIG-demo1.xml -u 7
Initializing
Undoing Calibrate Cells
This is an invalid option...
Successfully completed execution of step Calibrate Cells [elapsed Time [Elapsed = 4 mS [0.0 minutes] Tue Oct 08 19:53:21 IST 2019]]
Verify Infiniband
./install.sh -cf CONFIG-demo1.xml -u 6
Initializing
Undoing Verify Infiniband
This is an invalid option...
Successfully completed execution of step Verify Infiniband [elapsed Time [Elapsed = 5 mS [0.0 minutes] Tue Oct 08 19:53:35 IST 2019]]
Setup Cell Connectivity
./install.sh -cf CONFIG-demo1.xml -u 5
Initializing
Undoing Setup Cell Connectivity
Deleting cellip.ora and cellinit.ora...
Deleting cellip.ora and cellinit.ora on cluster demo1-cluster
Done deleting cellip.ora and cellinit.ora...
Successfully completed execution of step Setup Cell Connectivity [elapsed Time [Elapsed = 25735 mS [0.0 minutes] Tue Oct 08 19:54:19 IST 2019]]
Create Users
./install.sh -cf CONFIG-demo1.xml -u 4
Initializing
Undoing Create Users
Deleting cluster users...
Deleting cluster users...
Deleting groups...
Done deleting users and groups on all clusters...
Successfully completed execution of step Create Users [elapsed Time [Elapsed = 17122 mS [0.0 minutes] Tue Oct 08 19:56:57 IST 2019]]
Setup Required Files
./install.sh -cf CONFIG-demo1.xml -u 3
Initializing
Undoing Setup Required Files
Zip files have been deleted.
If you want to delete patch and ship home directories, re-run this step with -delete flag
Successfully completed execution of step Setup Required Files [elapsed Time [Elapsed = 12103 mS [0.0 minutes] Tue Oct 08 19:58:31 IST 2019]]
Update Nodes for Eighth Rack
./install.sh -cf CONFIG-demo1.xml -u 2
Initializing
Undoing Update Nodes for Eighth Rack
Skip Eighth rack configuration in compute node demo1dbadm02.domain.com
running reset on: demo1celadm02
running reset on: demo1celadm03
running reset on: demo1celadm01
demo1celadm03 has required number of total cores enabled :64
demo1celadm02 has required number of total cores enabled :64
demo1celadm01 needs total CPU cores set from 32 to 64
Skip Eighth rack configuration in compute node demo1dbadm01.domain.com
Successfully completed execution of step Update Nodes for Eighth Rack [elapsed Time [Elapsed = 32636 mS [0.0 minutes] Tue Oct 08 20:00:09 IST 2019]]
Validate Configuration File
./install.sh -cf CONFIG-demo1.xml -u 1
Initializing
Undoing Validate Configuration File
This is an invalid option...
Successfully completed execution of step Validate Configuration File [elapsed Time [Elapsed = 4 mS [0.0 minutes] Tue Oct 08 20:00:34 IST 2019]]
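Every undo run in this rollback ended with "Successfully completed execution of step", even when nothing was actually undone, so the final line alone is not a reliable indicator. The sketch below classifies the captured output of one run using only strings that appear in the transcripts above. The function name classify_undo and its labels are hypothetical, and reading "This is an invalid option..." as "this step has nothing to undo" is an interpretation based on the near-zero elapsed times, not a documented meaning.

```shell
# Classify the captured output of one `install.sh ... -u <n>` run.
# Prints one of: failed | needs-flag | noop | done | unknown
classify_undo() {
    output=$1
    case "$output" in
        # Matches both "Errors occurred" and the "Errors occured" spelling.
        *"Errors occur"*) echo "failed" ;;
        # The step asked to be re-run with -override or -delete.
        *"re-run this step with -"*) echo "needs-flag" ;;
        # Assumed to mean the step has no undo action.
        *"This is an invalid option"*) echo "noop" ;;
        *"Successfully completed execution of step"*) echo "done" ;;
        *) echo "unknown" ;;
    esac
}
```

For example: out=$(./install.sh -cf CONFIG-demo1.xml -u 9 2>&1); classify_undo "$out" would have printed "needs-flag" for the first run of step 9 above.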