Tuesday, September 13, 2016

[Big Data] First steps on Oracle Big Data SQL - helpful troubleshooting

Hi everyone,

In this post I will talk about usage of  Oracle Big Data SQL service on Exadata and some practices..

As we all know, using Oracle Big Data SQL, you can query data stored in a Hadoop cluster using from your Oracle database.

However, you need to take care of some problems related with Kerberos and some configuration issues with Exadata side.

Oracle BDS 3.0, BDA v4.5, Exadata 12c image




After enabling BDS on Oracle Big Data Appliance,  you run an installation script on Exadata machines. 



When you done installation, a service called bds_DBNAME_HADOOPCLUSTERNAME starts on your oracle database.

And some configurations created on under $ORACLE_HOME/bigdatasql directory .

$cd $ORACLE_HOME/bigdatasql/
$ls -lrt
total 837752
drwxr-xr-x  8 oracle oinstall      4096 Apr  1 07:20 jdk1.8.0_92
drwxr-xr-x 10 oracle oinstall      4096 Jun  2 02:34 hive-1.1.0-cdh5.7.1
drwxr-xr-x 14 oracle oinstall      4096 Jun  2 02:52 hadoop-2.6.0-cdh5.7.1
drwxr-xr-x  2 oracle oinstall      4096 Jun 14 19:52 default_dir
-rw-r--r--  1 oracle oinstall 311585484 Jun 14 19:52 hadoop-2.6.0-cdh5.7.0.tar.gz
-rw-r--r--  1 oracle oinstall 116082695 Jun 14 19:52 hive-1.1.0-cdh5.7.0.tar.gz
drwxr-xr-x  3 oracle oinstall      4096 Jun 14 19:52 bigdata_config
drwxr-xr-x  2 oracle oinstall      4096 Jun 22 17:16 jlib
-rw-r--r--  1 oracle oinstall 313514587 Jun 22 17:16 hadoop-2.6.0-cdh5.7.1.tar.gz
-rw-r--r--  1 oracle oinstall 115774390 Jun 22 17:16 hive-1.1.0-cdh5.7.1.tar.gz
-rw-r--r--  1 oracle oinstall       932 Jun 22 17:16 hadoop_HADOOPCLUSTERNAME.env
drwxr-xr-x  2 oracle oinstall      4096 Sep  3 12:42 log

Under bigdata_config directory, you can find related configuration parameters for your hadoop cluster.

$ls -lrt bigdata_config
total 48
-rw-r--r-- 1 oracle oinstall   362 Jun 14 19:52 bdscatcon-36886_catcon_40449.lst
-rw-r--r-- 1 oracle oinstall   362 Jun 14 19:52 bdscatcon-36886_catcon_40473.lst
-rw-r--r-- 1 oracle oinstall  1822 Jun 14 19:52 bdscatcon-368860.log
drwxr-xr-x 2 oracle oinstall  4096 Jun 22 17:16 HADOOPCLUSTERNAME
-rw-r--r-- 1 oracle oinstall 28154 Jun 22 17:16 bigdata.properties
-rw-r--r-- 1 oracle oinstall  1217 Jun 22 17:16 bigdata-log4j.properties

Here under  HADOOPCLUSTERNAME directory you can find hadoop configuration files.

$ls -lrt HADOOPCLUSTERNAME/
total 60
-rw-r--r-- 1 oracle oinstall 6078 Jun 22 17:16 yarn-site.xml
-rw-r--r-- 1 oracle oinstall 1510 Jun 22 17:16 topology.py
-rw-r--r-- 1 oracle oinstall  734 Jun 22 17:16 topology.map
-rw-r--r-- 1 oracle oinstall  315 Jun 22 17:16 ssl-client.xml
-rw-r--r-- 1 oracle oinstall    0 Jun 22 17:16 redaction-rules.json
-rw-r--r-- 1 oracle oinstall 5581 Jun 22 17:16 mapred-site.xml
-rw-r--r-- 1 oracle oinstall  310 Jun 22 17:16 log4j.properties
-rw-r--r-- 1 oracle oinstall 6274 Jun 22 17:16 hive-site.xml
-rw-r--r-- 1 oracle oinstall 1382 Jun 22 17:16 hive-env.sh
-rw-r--r-- 1 oracle oinstall 3570 Jun 22 17:16 hdfs-site.xml
-rw-r--r-- 1 oracle oinstall  557 Jun 22 17:16 hadoop-env.sh
-rw-r--r-- 1 oracle oinstall 4135 Jun 22 17:16 core-site.xml

All of this configurations are created automatically when you run bds-exa-install.sh..

Now lets play it..

Check first dba_hive_tables view to test your configuration with BDA is ok, you must see all your tables in hive databases..

SQL>select count(*) from dba_hive_tables;

and you must see an entry for  dblinks to bigdata..


SQL>select owner,db_link,host from dba_db_links

PUBLIC BDSQL$_HADOOPCLUSTERNAME ,(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER))(CONNECT_DATA=(PRESENTATION=RO)(SID=bds_DBNAME_HADOOPCLUSTERNAME)))

PUBLIC BDSQL$_DEFAULT_CLUSTER, (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER))(CONNECT_DATA=(PRESENTATION=RO)(SID=bds_DBNAME_HADOOPCLUSTERNAME)))
..

SQL> select owner,directory_path from dba_directories
ORACLE_BIGDATA_CONFIG,  $ORACLE_HOME/bigdatasql/bigdata_config

First step is create an external hadoop table on oracle database.

You can use    dbms_hadoop.create_extddl_for_hive procedure for this.

SQL> DECLARE
   DDLout VARCHAR2(4000);
BEGIN
   dbms_hadoop.create_extddl_for_hive(
      CLUSTER_ID=>'HADOOPCLUSTERNAME',
      DB_NAME=>'HIVEDBNAME',
      HIVE_TABLE_NAME=>'HIVETABLENAME',
      TABLE_NAME=>'EXADATATABLENAME',
      PERFORM_DDL=>FALSE,
      TEXT_OF_DDL=>DDLout
   );
   dbms_output.put_line(DDLout);
END;

/

When you run this script, it produces following

SQL > CREATE TABLE bds_test (  col1 number,col2 varchar2(4000),....)  ORGANIZATION EXTERNAL
     (TYPE ORACLE_HIVE
     DEFAULT DIRECTORY DEFAULT_DIR
     ACCESS PARAMETERS (
    com.oracle.bigdata.cluster=HADOOPCLUSTERNAME
    com.oracle.bigdata.tablename=HIVEDBNAME.HIVETABLENAME)    )  paralel 2  REJECT LIMIT UNLIMITED;


Once you create oracle table, you can query that table..

SQL> select /*bds_a*count(*) from bds_test;

You can see wait event on Oracle Enterprise Manager for your session 




And now your configuration for Oracle database is OK and you can play with Oracle BDS.


Troubleshooting for BDS service on Exadata side


1- When running dbms_hadoop.create_extddl_for_hive procedure, ensure that you talk about partition

DECLARE
   DDLout VARCHAR2(4000);
BEGIN
   dbms_hadoop.create_extddl_for_hive(
      CLUSTER_ID=>'HADOOPCLUSTERNAME',
      DB_NAME=>'HIVEDBNAME',
      HIVE_TABLE_NAME=>'HIVETABLENAME',
      TABLE_NAME=>'EXADATATABLENAME',
   HIVE_PARTITION=>false,           --<< add this line
      PERFORM_DDL=>FALSE,
      TEXT_OF_DDL=>DDLout
   );
   dbms_output.put_line(DDLout);
END;

/


2-  Be sure that all nodes of exadata has valid kerberos ticket and BDS service get access to ticket..

You may hit those error..

SQL> select /*bds_a*count(*) from bds_test;
select /*bds_a*count(*) from bds_test;
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-11504: error from external driver: MetaException(message:Could not connect
to meta store using any of the URIs provided. Most recent failure:
org.apache.thrift.transport.TTransportException: GSS initiate failed
at

org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.ja
va:232)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
at
org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:
37)
at
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTra
nsport.java:52)
at
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTra
nsport.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1
693)
at
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTran
sport.java:49)
at org.apache.hadoop.hive 

Get kerberos ticket for oracle database user, which is typically "oracle", and restart bds service

$ kinit oracle ( on every node)

$ crs_stop bds_DBNAME_HADOOPCLUSTERNAME;

Attempting to stop `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE1`
Attempting to stop `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE2`
Attempting to stop `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE3`
Attempting to stop `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE4`
Stop of `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE1` succeeded.
Stop of `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE2` succeeded.
Stop of `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE3` succeeded.
Stop of `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE4` succeeded.


$ crs_start bds_DBNAME_HADOOPCLUSTERNAME;

Attempting to start `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE4`
Attempting to start `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE2`
Attempting to start `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE3`
Attempting to start `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE1`
[bds_DBNAME_HADOOPCLUSTERNAME] ORACLE_HOME = "$ORACLE_HOME"AGTCTL_HOME = "" (default: NULL)MTA init file = "$ORACLE_HOME/hs/admin/initbds_DBNAME_HADOOPCLUSTERNAME.ora"MTA process "extprocbds_DBNAME_HADOOPCLUSTERNAME -mt" is not running!Checking MTA init parameters...[O]  INIT_LIBRARY=$ORACLE_HOME/lib/libkubsagt12.so[O]  INIT_FUNCTION=kubsagtMTAInit[O]  BDSQL_CLUSTER=HADOOPCLUSTERNAME[O]  BDSQL_CONFIGDIR=$ORACLE_HOME/bigdatasql/bigdata_configMTA process "extprocbds_DBNAME_HADOOPCLUSTERNAME -mt" started!oracle 49944 1 4 16:14 ? 00:00:00 extprocbds_DBNAME_HADOOPCLUSTERNAME -mt
[bds_DBNAME_HADOOPCLUSTERNAME] ORACLE_HOME = "$ORACLE_HOME"AGTCTL_HOME = "" (default: NULL)MTA init file = "$ORACLE_HOME/hs/admin/initbds_DBNAME_HADOOPCLUSTERNAME.ora"MTA process "extprocbds_DBNAME_HADOOPCLUSTERNAME -mt" is not running!Checking MTA init parameters...[O]  INIT_LIBRARY=$ORACLE_HOME/lib/libkubsagt12.so[O]  INIT_FUNCTION=kubsagtMTAInit[O]  BDSQL_CLUSTER=HADOOPCLUSTERNAME[O]  BDSQL_CONFIGDIR=$ORACLE_HOME/bigdatasql/bigdata_configMTA process "extprocbds_DBNAME_HADOOPCLUSTERNAME -mt" started!oracle 161863 1 7 16:14 ? 00:00:00 extprocbds_DBNAME_HADOOPCLUSTERNAME -mt
[bds_DBNAME_HADOOPCLUSTERNAME] ORACLE_HOME = "$ORACLE_HOME"AGTCTL_HOME = "" (default: NULL)MTA init file = "$ORACLE_HOME/hs/admin/initbds_DBNAME_HADOOPCLUSTERNAME.ora"MTA process "extprocbds_DBNAME_HADOOPCLUSTERNAME -mt" is not running!Checking MTA init parameters...[O]  INIT_LIBRARY=$ORACLE_HOME/lib/libkubsagt12.so[O]  INIT_FUNCTION=kubsagtMTAInit[O]  BDSQL_CLUSTER=HADOOPCLUSTERNAME[O]  BDSQL_CONFIGDIR=$ORACLE_HOME/bigdatasql/bigdata_configMTA process "extprocbds_DBNAME_HADOOPCLUSTERNAME -mt" started!oracle 60542 1 3 16:14 ? 00:00:00 extprocbds_DBNAME_HADOOPCLUSTERNAME -mt
[bds_DBNAME_HADOOPCLUSTERNAME] ORACLE_HOME = "$ORACLE_HOME"AGTCTL_HOME = "" (default: NULL)MTA init file = "$ORACLE_HOME/hs/admin/initbds_DBNAME_HADOOPCLUSTERNAME.ora"MTA process "extprocbds_DBNAME_HADOOPCLUSTERNAME -mt" is not running!Checking MTA init parameters...[O]  INIT_LIBRARY=$ORACLE_HOME/lib/libkubsagt12.so[O]  INIT_FUNCTION=kubsagtMTAInit[O]  BDSQL_CLUSTER=HADOOPCLUSTERNAME[O]  BDSQL_CONFIGDIR=$ORACLE_HOME/bigdatasql/bigdata_configMTA process "extprocbds_DBNAME_HADOOPCLUSTERNAME -mt" started!oracle 177648 1 0 16:14 ? 00:00:00 extprocbds_DBNAME_HADOOPCLUSTERNAME -mt
Start of `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE4` succeeded.
Start of `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE3` succeeded.
Start of `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE2` succeeded.
Start of `bds_DBNAME_HADOOPCLUSTERNAME` on member `NODE1` succeeded.

Now you can query your hive table.

3-  If you think BDS service on Exadata is up and oracle user has valid tickets and you hit TGT errors, make sure that oracle can access KRB cache 

$ ls -l /tmp/krb5cc_ORACLEUSERID
-rw------- 1 oracle oinstall 744 Sep 7 11:30 /tmp/krb5cc_ORACLEUSERID

$ chmod g+r /tmp/krb5cc_ORACLEUSERID

And ensure that you have correct JAVA Security Policy files wrt Oracle BDA .

$ ls -lrt $ORACLE_HOME/bigdatasql/jdk1.8.0_92/jre/lib/security
total 172
...
-rw-r--r-- 1 oracle oinstall 3035 Aug 24 13:42 local_policy.jar
-rw-r--r-- 1 oracle oinstall 3023 Aug 24 13:43 US_export_policy.jar


Those files must be same version with BDA side.

Ok, that is all, you can also visit Oracle BDS blog for detailed usage.

Thanks for reading.

Enjoy & share.

Source:



No comments :

Post a Comment