
Best Practices from Oracle Development's A‑Team

Switches Inside Oracle's Engineered Systems

Continuing from my last blog about InfiniBand building blocks, let's now review in a little more detail the network switches used inside Oracle's Engineered Systems. This will help you understand the overall integration, network design, architecture and troubleshooting covered in later articles.

Two categories of network switches are used to build the computing environment inside the rack.

InfiniBand Switches - two models used depending on requirements

  1. Sun Oracle 36-port InfiniBand Switch
  2. Sun Oracle InfiniBand Gateway Switch

Ethernet Switch - primarily for management purposes

  1. Cisco Catalyst 4948

The following table will get you started quickly and save me a lot of writing.

 

Switch                                 External IB Ports   IB Signal Bitrate   IB Port Labels    Ethernet Ports
Sun Oracle 36-port InfiniBand Switch   36 (QSFP+)          40Gbps              0A-17A, 0B-17B    -
Sun Oracle InfiniBand Gateway Switch   36-4=32 (QSFP+)     40Gbps              0A-15A, 0B-15B    EoIB: two QSFP+, 10Gbps per port, labelled 0A-ETH-[1 to 4] and 1A-ETH-[1 to 4]
Cisco Catalyst 4948                    -                   -                   -                 48 [1-48], 10/100/1000Base-T

 

Let me first give you some more insight into the InfiniBand switches, and then we will talk about the Cisco Catalyst 4948. The following picture shows the 36-port IB switch. The Gateway switch looks similar, with a slight difference for the EoIB ports on the extreme right.

[Image: switches-inside-Oracle-ES-IB]

Common information that applies to both of these InfiniBand switches

  • Form Factor: One rack unit (1U) height
  • Power Supplies: Two
  • Cooling Fans: Five
  • IB Subnet Management: Yes
  • Firmware Upgradeable: Yes
  • Command Line Access: Yes, via SSH and USB serial access
  • Web Based Management: Yes
  • SNMP Access: Yes

As you might have figured out by now, the IB Gateway switch is almost a superset of the 36-port switch in terms of features and capabilities.

Differences between 36-port and Gateway InfiniBand switches

In comparison, the 36-port switch has four more external IB ports. On the Gateway switch these four ports are consumed internally to enable Ethernet over InfiniBand (EoIB) functionality. I am sure you are wondering how this is done. The simple explanation is that two additional hardware devices are installed inside the IB Gateway switch. These are called Bridge-X, and each of them connects internally to the InfiniBand fabric via two IB ports. Hence the math of 36-4=32 in the table above. Towards the external world, they expose EoIB ports as 0A-ETH and 1A-ETH in QSFP+ form factor. But not all devices in the Ethernet world understand QSFP+, and 40Gbps Ethernet is not commonly used either, so each of these is split into four (4) SFP+ ports at a 10Gbps signalling rate each. That's why the final port labels on the EoIB side are 0A-ETH-[N] and 1A-ETH-[N], where N ranges from 1 to 4.

Why do we have two Ethernet ports on the InfiniBand switches?
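The port arithmetic and labelling scheme described above can be sketched in a few lines of Python. This is only an illustration built from the numbers in the table; the constant and function names are my own and do not come from any Oracle tool.

```python
# Illustrative sketch only: names are hypothetical, numbers come from the table above.
TOTAL_IB_PORTS = 36        # IB ports available on the 36-port switch
BRIDGEX_DEVICES = 2        # Bridge-X devices inside the Gateway switch
IB_PORTS_PER_BRIDGEX = 2   # IB ports each Bridge-X consumes internally
EOIB_CONNECTORS = 2        # external QSFP+ EoIB connectors (0A-ETH, 1A-ETH)
SFP_PER_QSFP = 4           # each QSFP+ is split into four SFP+ ports


def external_ib_ports() -> int:
    """External IB ports left on the Gateway switch: 36 - 2*2 = 32."""
    return TOTAL_IB_PORTS - BRIDGEX_DEVICES * IB_PORTS_PER_BRIDGEX


def ib_port_labels(port_count: int) -> list[str]:
    """IB port labels in two rows, A and B, numbered from 0."""
    per_row = port_count // 2
    return [f"{n}{row}" for row in "AB" for n in range(per_row)]


def eoib_port_labels() -> list[str]:
    """EoIB port labels after the QSFP+ to 4x SFP+ split."""
    return [
        f"{connector}A-ETH-{n}"
        for connector in range(EOIB_CONNECTORS)
        for n in range(1, SFP_PER_QSFP + 1)
    ]


if __name__ == "__main__":
    print(external_ib_ports())
    print(ib_port_labels(external_ib_ports()))
    print(eoib_port_labels())
```

Calling `ib_port_labels(36)` gives the 0A-17A and 0B-17B range of the 36-port switch, while `ib_port_labels(32)` gives the 0A-15A and 0B-15B range of the Gateway switch.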

For those who have seen or will get their hands on these two InfiniBand switches, let me clarify something about the Ethernet management port. Visually, you will see two RJ45 ports on the switch, but there is only one target interface inside. A small bridge inside the switch connects to the management Ethernet interface and provides two connections to the outside world. No, this is not for redundancy or high availability. It is there to let you create a linear bus topology if you need one. In simple terms, you can daisy chain more than one such switch.

[Image: switches-inside-Oracle-Daisy-Chain]
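To see why the second RJ45 port gives you a chain rather than redundancy, here is a deliberately simple Python model. It assumes each switch passes management traffic through its internal bridge to the next switch in line; the function name and the switch names are made up for illustration.

```python
# Illustrative model of a daisy-chained management network (hypothetical names).
def reachable_in_chain(chain, broken_link=None):
    """Return the switches still reachable from the uplink.

    chain: switch names ordered from the uplink outward.
    broken_link: index i means the cable feeding chain[i] is down
                 (0 is the uplink cable itself); None means no fault.
    """
    if broken_link is None:
        return list(chain)
    # Everything at or beyond the broken cable is cut off: there is no second path.
    return list(chain[:broken_link])


if __name__ == "__main__":
    chain = ["ib-sw1", "ib-sw2", "ib-sw3"]
    print(reachable_in_chain(chain))
    print(reachable_in_chain(chain, broken_link=1))
```

With the cable between the first and second switch down, only the first switch remains reachable, which is the behaviour you would expect from a linear bus and the reason the two ports should not be mistaken for a high availability feature.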

What about these Leaf and Spine switches?

Okay, now that I have talked about these two InfiniBand switches, let me introduce two keywords that you will hear a lot and that set the ground for further discussions.

  • Spine Switch
  • Leaf Switch

These are roles that a switch plays in a topology or connectivity layout. I may write more about topologies later, but for now let's keep this blog short, concise and in the context of Oracle's Engineered Systems.

The switch where hosts are directly connected takes up the role of Leaf Switch.

A switch that has no hosts directly attached, but does have inter switch links (ISLs) that provide alternate paths or expand the fabric, takes up the role of Spine Switch.

In Exadata and SuperCluster racks, both roles are provided by 36-port InfiniBand switches.

In Exalogic racks, the Leaf role is provided by Gateway switches whereas the Spine role is provided by a 36-port switch.

How is the InfiniBand connectivity and topology built out?
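The two role definitions above translate directly into a small classification routine. The following Python sketch is my own illustration, not part of any product tooling; the node names and the "unconnected" fallback are assumptions.

```python
# Illustrative sketch: derive leaf/spine roles from a list of cables.
def switch_roles(links, switches):
    """Classify each switch by what is cabled to it.

    links: iterable of (a, b) pairs, one per cable.
    switches: set of names that are switches; anything else is a host.
    """
    roles = {}
    for sw in switches:
        neighbours = {b if a == sw else a for a, b in links if sw in (a, b)}
        has_host = any(n not in switches for n in neighbours)
        has_isl = any(n in switches for n in neighbours)
        if has_host:
            roles[sw] = "leaf"          # hosts are directly connected
        elif has_isl:
            roles[sw] = "spine"         # only inter switch links
        else:
            roles[sw] = "unconnected"   # assumption: nothing cabled yet
    return roles


if __name__ == "__main__":
    switches = {"sw1", "sw2", "sw3"}
    links = [
        ("host1", "sw1"), ("host1", "sw2"),
        ("host2", "sw1"), ("host2", "sw2"),
        ("sw1", "sw3"), ("sw2", "sw3"),
    ]
    print(switch_roles(links, switches))
```

Note that the role follows from the cabling, not from the switch model, which is why the same 36-port switch can be a leaf in one rack and a spine in another.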

Consider all hosts with one dual-port HCA installed in their PCI-E slots. Connect port-1 of every host to the designated leaf switch-1 with an IB cable. When you are done, this completes a star topology. Now repeat the same on port-2, but this time use the designated leaf switch-2. Each host is now connected to two leaf switches via independent ports. This sets up your dual star topology. But wait, we need some inter switch links as well. Why? To guarantee communication when the topology becomes asymmetric. For example, host A may be using port-1 while host B may have switched to port-2 for some reason, so the two active ports sit on different leaf switches.

Inter switch links may be as simple as cables between two leaf switches, or they may go through another switch, which is known as the Spine switch. I will not go into micro level details here, as you can read more about how ISLs are chosen in various rack configurations in the respective product guides.
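The need for inter switch links is easy to demonstrate with a toy model. The Python sketch below builds the dual star topology described above and checks whether host A on port-1 can reach host B on port-2, with and without a direct ISL between the two leaf switches. Host ports are modelled as separate nodes because a host does not forward fabric traffic between its own ports; all names are illustrative.

```python
# Illustrative reachability check for a dual star InfiniBand fabric.
from collections import deque


def build_fabric(hosts, with_isl):
    """Return an adjacency map: port-1 of each host to leaf-1, port-2 to leaf-2."""
    adj = {}

    def link(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    for host in hosts:
        link(f"{host}:port-1", "leaf-1")
        link(f"{host}:port-2", "leaf-2")
    if with_isl:
        link("leaf-1", "leaf-2")  # simplest ISL: a cable between the leaves
    return adj


def reachable(adj, src, dst):
    """Breadth-first search from src; True if dst can be reached."""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


if __name__ == "__main__":
    for with_isl in (False, True):
        fabric = build_fabric(["A", "B"], with_isl)
        print(with_isl, reachable(fabric, "A:port-1", "B:port-2"))
```

Without the ISL the two stars are separate islands, so a host active on port-1 cannot reach a host active on port-2. Adding the link joins the islands, and a spine switch between the leaves would have the same effect.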

Cisco Catalyst 4948

[Image: switches-inside-Oracle-ES-cisco]

Each host and end point has a management network port. This is always Ethernet based. The Cisco 4948 switch aggregates all such management ports inside the rack. Everything is pre-wired, and all you need to do is connect an uplink from this Cisco switch to your data center access switch. Be careful: do not connect two cables to your data center access switch without planning for Spanning Tree Protocol, because the second cable creates a redundant Layer 2 path. This switch is fully managed and also provides VLAN capabilities based on the 802.1Q specification. By default, all hosts inside the rack connected to this switch are on the same VLAN.
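The Spanning Tree warning comes down to a simple question: does the set of cables contain a loop? The Python sketch below answers that with a union-find pass over the links. It only flags the redundant path that needs planning; it does not model how Spanning Tree would resolve it, and the switch names are placeholders.

```python
# Illustrative loop check for Layer 2 cabling (placeholder switch names).
def has_layer2_loop(links):
    """Return True if the undirected cable list contains a cycle.

    Two parallel cables between the same pair of switches count as a loop.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True  # both ends already connected: this cable closes a loop
        parent[root_a] = root_b
    return False


if __name__ == "__main__":
    one_uplink = [("rack-4948", "dc-access")]
    two_uplinks = [("rack-4948", "dc-access"), ("rack-4948", "dc-access")]
    print(has_layer2_loop(one_uplink))
    print(has_layer2_loop(two_uplinks))
```

A single uplink is loop free. A second cable to the same access switch, or to a second access switch that is itself connected to the first, closes a loop.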

 

 

Overall Network Design

At a very high level, we have the following setup:

  • Ethernet based management network served through Cisco Catalyst 4948 switch
  • InfiniBand internal network served through InfiniBand switches in redundant configuration for high availability
    • This network facilitates all the internal communications within the Engineered Systems framework
  • Ethernet based external world connectivity
    • In Exadata and SuperCluster, this is achieved via physical 10Gbps Ethernet from individual hosts. There are dedicated 10Gbps NICs installed in hosts. Their switching environment is outside of the rack.
    • In Exalogic, this is achieved via virtual 10Gbps Ethernet from individual hosts. We have been referring to this as EoIB. From the hosts' point of view, there is no additional hardware or cable. The same IB media path carries this traffic as well.
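For quick reference, the per-product differences discussed in this article can be captured in a small lookup structure. This Python sketch simply restates the points above; the class and field names are my own choice.

```python
# Illustrative summary of the network design per product (hypothetical field names).
from dataclasses import dataclass

MANAGEMENT_SWITCH = "Cisco Catalyst 4948"  # same management switch in every rack


@dataclass(frozen=True)
class RackNetworking:
    leaf_switch: str
    spine_switch: str
    external_ethernet: str
    uses_eoib: bool


IB_36_PORT = "Sun Oracle 36-port InfiniBand Switch"
IB_GATEWAY = "Sun Oracle InfiniBand Gateway Switch"

RACKS = {
    "Exadata": RackNetworking(
        IB_36_PORT, IB_36_PORT, "physical 10Gbps NICs in hosts", False
    ),
    "SuperCluster": RackNetworking(
        IB_36_PORT, IB_36_PORT, "physical 10Gbps NICs in hosts", False
    ),
    "Exalogic": RackNetworking(
        IB_GATEWAY, IB_36_PORT, "virtual 10Gbps Ethernet over InfiniBand", True
    ),
}

if __name__ == "__main__":
    for product, net in RACKS.items():
        print(product, net.leaf_switch, net.spine_switch, net.external_ethernet)
```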

Next time, I will talk more about the virtual networks that are carried over this physical network. Thanks for reading and I welcome all your comments and questions.

 
