MySQL InnoDB Cluster setup

Hey folks, today I would like to show how easy is to do a HA cluster with MySQL InnoDB Cluster.

First a few things we must know: You’ll need at least 3 servers (for database) to do the job. We’ll need MySQL Shell on every server. Every node needs a two-way connection with each node (he need to be able to connect to the other node, and other nodes needs to be able to connect to the one). You will need a fourth server to work as a router. I’ll assume you already have a well configured MySQL Server on all three servers. Today I’ll be using version 8.

To be easier, every server will have an entry to the others on the hosts file. So let’s begin creating a common user on every server and give privileges enough to do the JOB. Be aware that I’m no focusing on security issues, like allowing the user to access the database from any IP, the best solution would be set the servers IPs.

CREATE USER 'syncron'@'%' IDENTIFIED BY 'Abacaxi22';
GRANT ALL PRIVILEGES ON *.* TO 'syncron'@'%' WITH GRANT OPTION;

1) Creating the cluster

Now let’s fire MySQL Shell on every node and prepare the server to join/create a InnoDB Cluster. If you never used this Shell, give it a try, it’s an awesome tool and it’s really easy to use and learn. We’ll basically use the dba commando for all of our needs today. Run dba.configureInstance() on every node and then, on the “main” server run the following.

You can run the Shell with mysqlch --uri=syncron@mysql1:3306 and voila. You’re on the server 1.

cluster = dba.createCluster('tasks') --Cluster name, take note
dba.getCluster('tasks').status()

After each command, you should see the following results. Look that I’m actually running the command on a server named mysql1.

Look at the status result, and see that after creating the cluster, we already have a node running and he’s a R/W node. And look at the ‘statusText’.. The cluster still not fault tolerant.

2) Adding a node to the cluster

Now let’s connect to the second server. On the second server, instead of connecting to itself, we’ll connect to the mysql1 server, so use the same command as before to spin the Shell. After that, let’s add the second instance to the cluster.

cluster = dba.getCluster('tasks') -- the cluster name, remember?
cluster.addInstance('syncron@mysql2:3306') -- now we use the second server uri

IF everything works as expected you’ll see the following as a result.

You can run the status on the cluster again to see how it’s now. But it still not fault tolerant. Do the same job on the third server (mysql3 on my environment) and THEN you can get the status and see the magic (or at least the status page of it).

Be aware that, the server now IS fault tolerant BUT, can tolerate just one. But take a closer look and see that we have just one node with R/W permission, the other two are R/O… and that’s how it works (in this case at least). How we handle the connections now that we have three different servers running? Easy… use the Mysql Router… Now to the fourth server.

3) The Router

That’s the easy step… just run .\mysqlrouter.exe --bootstrap syncron@mysql1:3306 and you’ll get the config file generated. That’s just the default configuration, take a look at the manual to achieve better results. The router will give you two port to handle the connections, one with R/W permission and the other one with R/O. It’ll handle the “reconnect” for your application if needed.

Easy huh? Go ahead and shut down one instance and look at the cluster status again. Now two extra tips..

  • If you get the following when trying to configure or join a cluster, you need to check the grant of the user, here we used the most of it, but you can manage the permissions to achieve a better (and more secure) login.

Access denied; you need SYSTEM_VARIABLES_ADMIN and PERSIST_RO_VARIABLES_ADMIN privileges for this operation (MySQL Error 3630)
  • If you plan to upgrade your server, do it with caution. You MUST do the R/W node at the END. Before shut down the server run set persist group_replication_start_on_boot=0; and set to true again after the upgrade and then restart it again.

Cheers!

Starting ASM: ORA-29701: unable to connect to Cluster Synchronization Service

Hey all,
So, I bet you have seen this error already, as this is quite common when messing up with Cluster configuration, which DBAs love to do…. no?

Well, here is what you may be facing:

SQL> startup
ORA-01078: failure in processing system parameters
ORA-29701: unable to connect to Cluster Synchronization Service
SQL>

The error is kind of clear: Cluster Synchronization Service (CSS) is not available. So, let’s start it from ASM Cluster (or HAS).

$GRID_HOME/bin/crsctl start resource -all

Or, for Standalone:

$GRID_HOME/bin/crsctl start has

To check on status:

$GRID_HOME/bin/crsctl status resource -t

Complete example (attention to CSSD):

[root@greporasrv1 ~]# crsctl start has
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.
[root@greporasrv1 ~]# crsctl start resource -all
CRS-5702: Resource ‘ora.evmd’ is already running on ‘greporasrv1’
CRS-2501: Resource ‘ora.ons’ is disabled
CRS-2672: Attempting to start ‘ora.cssd’ on ‘greporasrv1’
CRS-2672: Attempting to start ‘ora.diskmon’ on ‘greporasrv1’
CRS-2676: Start of ‘ora.diskmon’ on ‘greporasrv1’ succeeded
CRS-2676: Start of ‘ora.cssd’ on ‘greporasrv1’ succeeded
CRS-4000: Command Start failed, or completed with errors.
[root@greporasrv1 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
Name Target State Server State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ons
OFFLINE OFFLINE greporasrv1 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
1 ONLINE ONLINE greporasrv1 STABLE
ora.diskmon
1 OFFLINE OFFLINE STABLE
ora.evmd
1 ONLINE ONLINE greporasrv1 STABLE
--------------------------------------------------------------------------------
[root@greporasrv1 ~]#

Hope that worked! 😀

Oh, it didn’t? Did you changed hostname name or something? In this case, you may want to deconfig HAS and reconfigure using root.sh (part regular installation):

cd $ORACLE_HOME
./crs/install/roothas.pl -deconfig -force
./crs/install/roothas.pl -delete -force
./root.sh

 

Hey! Be careful with that, it might be unrecoverable. 😉


									

GRID Infrastructure life after rootcrs.pl -deconfig -force -verbose -lastnode

Hi all,

I had a client which asked us to reconfigure the GRID infrastructure on 11g after the did a:

$GI_HOME/crs/install/rootcrs.pl -deconfig -force -verbose -lastnode

The “deconfig” option is used when we need to remove the GI configuration cleanly and the “lastnode”  is executed on the last cluster node.

But what we need to do to recreate the the cluster? Well most would say “Run root.sh again” and that should solve it on most cases.

But when I tried to execute it I have several issues on crsconfig_params file. I tried to manually add the missing data and as there was much info to add but what to do next?

A colleague pointed to 2 MOS notes:

How to Deconfigure/Reconfigure(Rebuild OCR) or Deinstall Grid Infrastructure (Doc ID 1377349.1)
How to Configure or Re-configure Grid Infrastructure With config.sh/config.bat (Doc ID 1354258.1)

So if you follow those notes  you should prepare a response run the config.sh to create a proper crsconfig_params and then run root.sh

$GI_HOME/root.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /app/11.2.0.4/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /app/11.2.0.4/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'greporarac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'greporarac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'greporarac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'greporarac1'
CRS-2676: Start of 'ora.diskmon' on 'greporarac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'greporarac1' succeeded
PROT-1: Failed to initialize ocrconfig
PROC-26: Error while accessing the physical storage
ORA-15077: could not locate ASM instance serving a required diskgroup

Failed to create Oracle Cluster Registry configuration, rc 255
Oracle Grid Infrastructure Repository configuration failed at /app/11.2.0.4/grid/crs/install/crsconfig_lib.pm line 6911.
/app/11.2.0.4/grid/perl/bin/perl -I/app/11.2.0.4/grid/perl/lib -I/app/11.2.0.4/grid/crs/install /app/11.2.0.4/grid/crs/install/rootcrs.pl execution failed
$GI_HOME/crs/config/config.sh -silent -responseFile $GI_HOME/crs/config/grid_configwizard_1.rsp -ignorePreReq

As a root user, execute the following script(s):
1. /app/11.2.0.4/grid/root.sh

Execute /app/11.2.0.4/grid/root.sh on the following nodes:
[rac1, rac1]

Successfully Setup Software.
[WARNING] [INS-32091] Software installation was successful. But some configuration assistants failed, were cancelled or skipped.
ACTION: Refer to the logs or contact Oracle Support Services.
oracle@rac1:/app/11.2.0.4/grid/crs/config>


root@rac1 /app/11.2.0.4/grid/root.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME= /app/11.2.0.4/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Relinking oracle with rac_on option
Using configuration parameter file: /app/11.2.0.4/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to oracle-ohasd.conf

CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded

ASM created and started successfully.

Disk Group CRS created successfully.

clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Successful addition of voting disk 04713db813e14f5abf6b385896b1ca1d.
Successful addition of voting disk 8910f31e58db4f30bfdca40e34a0ffbd.
Successful addition of voting disk 7fefe24a8d4b4fcbbfd2d25a4307a1e0.
Successfully replaced voting disk group with +CRS.
CRS-4266: Voting file(s) successfully replaced
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 04713db813e14f5abf6b385896b1ca1d (/dev/mapper/ora-pure-ractestwt1-crs-1) [CRS]
2. ONLINE 8910f31e58db4f30bfdca40e34a0ffbd (/dev/mapper/ora-pure-ractestwt1-crs-2) [CRS]
3. ONLINE 7fefe24a8d4b4fcbbfd2d25a4307a1e0 (/dev/mapper/ora-pure-ractestwt1-crs-3) [CRS]
Located 3 voting disk(s).


CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.CRS.dg' on 'rac1'
CRS-2676: Start of 'ora.CRS.dg' on 'rac1' succeeded
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:

#1) Respect the privacy of others.
#2) Think before you type.
#3) With great power comes great responsibility.

p_raghu's password on rac1:
Performing root user operation for Oracle 11g

The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME= /app/11.2.0.4/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Relinking oracle with rac_on option
Using configuration parameter file: /app/11.2.0.4/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to oracle-ohasd.conf

CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node rac1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

 

VMWare: Adding Shared Disks for Clustered Oracle Database

Hi folks!
Today a friend asked about how to configure disks on VMWare to create a virtualized cluster database. I revisited my old notes and decided to share. Here it goes…

First, I really have some constraints about it:
– Fake “high availability”: To have HA with VM it’s not needed 2 vms, if a host fail VMWare should make a VMotion (if well configured), and no services will be affected. So, one VM is ok.
– Not real “horizontally scallated”: It probably would be better to use one server as physical than have two vms on it. Not make sense to do it…

So, why?
To prove concept, evaluate RAC configuration (caches on sequences, etc) and labs, to learn and practice RAC stuffs…

Ok, now how to make it happen?

1. Add new disk to one of the machines. Some way, one will be the “primary” and share disks with another.
add1.jpg

2. Set Mode Thick Eager Zeroed

add2.jpg

3. Create a specific controller to this “shared disks”

add3.jpg

4. Set controller to virtual sharing

add4.jpg

# Other Machine
5. Adding the existent disk to other VM (not primary, but from primary)

add5.jpg
6. Select disk from primary

add6.jpg

add7.jpg

7. Create a new controller, as you made on primary and select it:

add8.jpgadd9.jpg

8. Set controller to virtual sharing
add10

OBS:
If this error happen, one of your controller is not in sharing mode. Please check it.
add11

 

And here we are! 🙂
Good lab!
Matheus.

CRSCTL: AUTO_START of Cluster Services (ACFS)

As I sad long time ago (Manually Mounting ACFS)… Here is it:

To set autostart of a resource (in my case an ACFS) by CRSCTL, here the simple example:

# Check How it is currently configured:

[root@db1database1p bin]# ./crs_stat -p ora.dghome.dbhome.acfs |grep AUTO_START
AUTO_START=restore

# Set Autostart (and check):

[root@db1database1p bin]# ./crsctl modify resource ora.dghome.dbhome.acfs -attr AUTO_START=always
[root@db1database1p bin]# ./crs_stat -p ora.dghome.dbhome.acfs |grep AUTO_START
AUTO_START=always

It can be done also with “AUTO_START=1”. We have 3 possibilities (always, restore and never).

# KB
http://docs.oracle.com/cd/E11882_01/rac.112/e16794/resatt.htm#CWADD91444

Matheus.