The Data Sync tool can extract data from both on-premise and cloud data sources, and load that data into BI Cloud Service (BICS) and other relational databases. In some use cases, both the source databases and the target may be in the cloud. Rather than run the Data Sync tool on-premise to extract data down from the cloud, only to load it back up again, this article outlines an approach where the Data Sync tool is installed and run in an Oracle Compute Instance in the cloud. In this way all data movement and processing happens in the cloud and no on-premise install is required.
In this example Data Sync will be installed into its own Instance in Oracle Compute.
In theory you could install into any existing compute instance (for example JCS or DBCS), although the Data Sync tool would then be sharing the same file system as other applications. This could, for example, be a problem in the case of a restore where files may be overwritten. Where possible, it is therefore recommended that a separate Compute Instance is created for Data Sync.
1. In Compute, choose a suitable Image, Shape and Storage for the planned workload. It is recommended to give Data Sync at least 8 GB of memory. It is suggested NOT to select the 'minimal' image as that will require additional packages to be loaded later.
2. In this example the OL-6.6-20GB-x11-RD image was used, along with a general purpose oc4 shape with 15 GB of memory and 20GB of storage:
3. Once created, obtain the Public IP from the instance.
We will set up an SSH connection and a VNC session on the Compute Instance for Data Sync to run in. When the user disconnects from the session, Data Sync will continue to operate. It will also allow multiple developers to connect to VNC and share the same session from anywhere in the world.
There are many SSH tools. In this case the free Windows tool, Putty, will be used, although other tools can be configured in a similar manner. Putty can be downloaded from here.
1. Open Putty and set up a connection using the Public IP of the Instance obtained in the previous section and port 22.
2. Expand the 'Connection' / 'SSH' / 'Auth' menu item. Browse in the 'Private key file for authentication' section to the Private Key companion to the Public Key used in the creation of the Compute Instance in the previous section.
3. Return to the 'Session' section, give the session a name and save it. Then hit 'Open' to start the connection to the Compute Instance.
4. For the 'Login as' user, enter 'opc' and when prompted for the 'Passphrase', use the passphrase for the SSH Key.
If the connection is successful, then a command prompt should appear after these have been entered:
5. As the opc user, edit sshd_config.
sudo vi /etc/ssh/sshd_config
Uncomment every X11Forwarding line and set the value that follows it to 'yes'.
6. Save the file, and then restart sshd by running the following command:
sudo /etc/init.d/sshd restart
7. Switch to the Oracle user
sudo -su oracle
8. Run the following command to prevent the Window Manager from displaying a lock screen:
gconftool-2 -s -t bool /apps/gnome-screensaver/lock_enabled false
9. Start VNC server with the following command:
vncserver :1 -depth 16 -alwaysshared -geometry 1200x750 -s off
NOTE - in the command line above, the 'x' in '1200x750' is a lowercase letter 'x'. When viewed in a browser it may be displayed as a '×' character. That will not work if copied and pasted, so be sure to use a lowercase 'x'.
10. Figure out which port VNC is using
We're going to use SSH port forwarding. To do this, we need to confirm the port that is being used by VNC.
Typically the port is 5900 + N, where N is the display number.
In the screenshot below when VNC was started, it shows the screen is number 1 (the value after the ':' in "d32f4d : 1" ) so in this case the port is 5901. This will typically be the port number, but if other VNC sessions are already running, then it may be different.
To test this, run this command:
netstat -anp | grep 5901
This should confirm the process listening on that port - in this case, VNC:
11. Type 'exit' and press return once to exit the oracle user, then type 'exit' and press return again to close the putty session.
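The manual edit of sshd_config in step 5 can also be scripted. The following is a minimal sketch, assuming a POSIX shell with sed and mktemp available. The helper name enable_x11_forwarding and the scratch file are illustrative and not part of the original steps. The demonstration runs against a scratch file, so nothing under /etc/ssh is touched.

```shell
#!/bin/sh
# Sketch only: enable_x11_forwarding is a hypothetical helper name.
# It uncomments every X11Forwarding line in the given file and sets it to "yes".
enable_x11_forwarding() {
    cfg="$1"
    tmp="$(mktemp)" || return 1
    # Match an optional leading "#" and whitespace, then X11Forwarding and its value.
    sed 's/^[[:space:]]*#*[[:space:]]*X11Forwarding[[:space:]].*/X11Forwarding yes/' \
        "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demonstrate on a scratch file rather than the live /etc/ssh/sshd_config.
scratch="$(mktemp)"
printf '%s\n' '#X11Forwarding no' 'X11Forwarding no' '#X11DisplayOffset 10' > "$scratch"
enable_x11_forwarding "$scratch"
cat "$scratch"
```

On the real instance the same sed expression would need to be applied to /etc/ssh/sshd_config with sudo, followed by the sshd restart from step 6. Taking a backup copy of the file first is a sensible precaution.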
1. Create the SSH Tunnel
Open putty again and load the saved session from earlier. Open the 'Connection' / 'SSH' / 'Tunnels' menu item.
We need to create an SSH tunnel to forward VNC traffic from the local host to port 5901 on the Compute Instance.
In this example we enter the Local Port also as 5901, and then in the Destination, the IP address of the Compute Instance, followed by a ':' and then 5901. Select 'Add' to set up the tunnel.
2. Return to the top 'Session' menu and 'Save' the session again to capture the changes, then open the session again, connect as 'opc' and enter the passphrase.
3. If a VNC client is not installed on the user's machine, download one. In this case the free viewer from RealVNC is being used, which can be downloaded from here.
4. Open VNC viewer and for the target, enter 'localhost:5901'. VNC will attempt to connect to the local port 5901, which will then be redirected by SSH to port 5901 on the target.
Anytime a VNC session is going to be used, the putty session must be open (although some VNC tools will also set up the SSH session for you, in which case you can use that if preferred).
5. Enter the VNC password and the session will be connected. If there is an error message within the VNC session stating 'Authentication is Required to set the network proxy used for downloading packages', then click 'Cancel' to ignore it.
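For users who have an OpenSSH client rather than Putty, the same tunnel can be expressed with the -L option of ssh. The following is a minimal sketch under stated assumptions: the helper name vnc_tunnel_cmd, the key path and the address 203.0.113.10 (a documentation-only IP) are placeholders, not values from this article. The helper only prints the command, so it can be reviewed before being run.

```shell
#!/bin/sh
# Sketch only: vnc_tunnel_cmd is a hypothetical helper name.
# Prints an OpenSSH command that forwards the local VNC port to the same
# port on the Compute Instance, mirroring the Putty tunnel described above.
vnc_tunnel_cmd() {
    key="$1"            # path to the private key (placeholder)
    host="$2"           # Public IP of the Compute Instance
    display="${3:-1}"   # VNC display number, defaults to :1
    port=$((5900 + display))   # VNC port is typically 5900 + display number
    printf 'ssh -i %s -N -L %s:%s:%s opc@%s\n' \
        "$key" "$port" "$host" "$port" "$host"
}

# Placeholder key path and address; substitute real values.
vnc_tunnel_cmd "$HOME/.ssh/compute_key" 203.0.113.10 1
```

Running the printed command keeps the tunnel open in the foreground (-N means no remote command is executed). The VNC viewer is then pointed at 'localhost:5901' exactly as in step 4.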
1. Within the connected VNC session, open a Terminal session
2. To turn on copy and paste between the client and the VNC session, enter:
vncconfig -nowin &
3. Download the Data Sync and JDK Software
Open Firefox within the VNC session and download the required software.
Be sure to download and install the latest version of the Data Sync Tool from OTN through this link.
Data Sync requires JDK8. You can download that through this link.
Note - customers still using BI Cloud Service (BICS) and loading into the schema service database cannot use this version of Data Sync. The version for BICS is available to download here, but does not have all the functionality of the OAC / OAAC version. That version will, however, work just fine in Compute for this scenario.
For the JDK, select one of the Linux x64 versions.
4. Plan where to install the software.
Take a look at the file system and see which location makes the most sense in your scenario. In this example we are using a sub-directory called 'datasync' that we created under the /home/oracle directory. Depending on the configuration of the Compute Instance and its storage, there may be better choices.
5. Extract both the JDK and Data Sync software to that directory.
6. Edit the 'config.sh' file to point to the location of the JDK
7. Start Data Sync by running
Then go through the standard steps for setting up and configuring the Data Sync tool.
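The 'config.sh' edit in step 6 can be sketched as a small shell helper. This is an illustration only: it assumes config.sh sets the JDK location on a line beginning with JAVA_HOME=, which this article does not show, so check the actual file before relying on it. The helper name set_java_home and the JDK directory name are hypothetical.

```shell
#!/bin/sh
# Sketch only: set_java_home is a hypothetical helper, and the JAVA_HOME=
# line format is an assumption about config.sh, not taken from the article.
set_java_home() {
    cfg="$1"   # path to config.sh
    jdk="$2"   # JDK directory (must not contain "|", "&" or backslashes)
    tmp="$(mktemp)" || return 1
    # Replace the whole JAVA_HOME= line with one pointing at the new JDK.
    sed "s|^[[:space:]]*JAVA_HOME=.*|JAVA_HOME=${jdk}|" "$cfg" > "$tmp" \
        && cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demonstrate on a scratch stand-in for config.sh.
scratch_dir="$(mktemp -d)"
printf '%s\n' '# stand-in for config.sh' 'JAVA_HOME=/usr/java' > "$scratch_dir/config.sh"
# /home/oracle/datasync/jdk1.8.0 is a placeholder for the extracted JDK directory.
set_java_home "$scratch_dir/config.sh" /home/oracle/datasync/jdk1.8.0
cat "$scratch_dir/config.sh"
```

If config.sh uses a different variable name or format, editing the file by hand as described in step 6 remains the safer route.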
For more information on setting up Data Sync, see this article.
For information on setting up Data Sync to source from Cloud OTBI environments, see this article.
Other Data Sync documentation can be found here.
Once the VNC session has been set up, then other users can also connect. They will just need to complete the following steps from above:
Create SSH Session and Install VNC, Steps 1, 2 & 3
Create SSH Tunnel and Start VNC Session, Steps 1 & 2
This article walked through the steps to create a Compute Instance, accessible through VNC over SSH, and then to install Data Sync into that for loading scenarios where an on-premise footprint is not required.
For other A-Team articles about BICS and Data Sync, click here.