This repository has been archived by the owner on Dec 16, 2021. It is now read-only.

Add Kerberos Support -Thrift Based #69

Open
surajnayak opened this issue Jul 7, 2017 · 1 comment

Comments

@surajnayak

Does reair support replicating tables to a secured (Kerberized) cluster?

I am seeing the exception below:

2017-07-07 01:52:06,930 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
2017-07-07 01:52:06,951 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: database 0, table some_hive_db.some_table_in_hive got exception
	at com.airbnb.reair.batch.hive.MetastoreReplicationJob$Stage1ProcessTableMapperWithTextInput.map(MetastoreReplicationJob.java:622)
	at com.airbnb.reair.batch.hive.MetastoreReplicationJob$Stage1ProcessTableMapperWithTextInput.map(MetastoreReplicationJob.java:593)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: com.airbnb.reair.common.HiveMetastoreException: org.apache.thrift.transport.TTransportException
	at com.airbnb.reair.common.ThriftHiveMetastoreClient.getTable(ThriftHiveMetastoreClient.java:126)
	at com.airbnb.reair.incremental.primitives.TaskEstimator.analyzeTableSpec(TaskEstimator.java:84)
	at com.airbnb.reair.incremental.primitives.TaskEstimator.analyze(TaskEstimator.java:68)
	at com.airbnb.reair.batch.hive.TableCompareWorker.processTable(TableCompareWorker.java:136)
	at com.airbnb.reair.batch.hive.MetastoreReplicationJob$Stage1ProcessTableMapperWithTextInput.map(MetastoreReplicationJob.java:614)
	... 9 more
Caused by: org.apache.thrift.transport.TTransportException
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
	at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:340)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:202)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1263)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1249)
	at com.airbnb.reair.common.ThriftHiveMetastoreClient.getTable(ThriftHiveMetastoreClient.java:121)
	... 13 more

I saw code in TableCompareWorker.java#L90 calling getMetastoreClient() of HardCodedCluster.java#L62. I don't see any implementation that checks whether the connection needs to be in secure mode. Based on a security flag in the config, we could switch which type of TTransport to use.

Any thoughts on how to implement it?
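A minimal sketch of the flag-based switch suggested above. The flag name is borrowed from Hive's own `hive.metastore.sasl.enabled` (ReAir would likely add its own config key, so treat it as an assumption), and strings stand in for the actual Thrift transport objects so the sketch runs without the Hive/Thrift jars; in the real code the branch would go where getMetastoreClient() constructs its TTransport:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of ReAir: picks a transport type based on a
// security flag in the config. Strings are placeholders for the real
// TSocket / TSaslClientTransport objects.
public class TransportChooser {
    static String chooseTransport(Map<String, String> conf) {
        boolean saslEnabled = Boolean.parseBoolean(
                conf.getOrDefault("hive.metastore.sasl.enabled", "false"));
        // Real code would return saslEnabled
        //   ? a TUGIAssumingTransport wrapping a TSaslClientTransport("GSSAPI", ...)
        //   : the bare TSocket(host, port)
        return saslEnabled ? "TSaslClientTransport(GSSAPI) over TSocket" : "TSocket";
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(chooseTransport(conf));
        conf.put("hive.metastore.sasl.enabled", "true");
        System.out.println(chooseTransport(conf));
    }
}
```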

I was trying to connect to the Thrift metastore from a standalone program based on https://github.com/joshelser/krb-thrift, but no luck. Now I get:

Exception in thread "main" org.apache.thrift.transport.TTransportException: SASL authentication not complete

Standalone Java program snippet:

import java.util.HashMap;
import java.util.Map;

import javax.security.sasl.Sasl;

import org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSaslClientTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;
// TUGIAssumingTransport is the wrapper class bundled with the krb-thrift example

TTransport transport = new TSocket(host, port);
Map<String, String> saslProperties = new HashMap<String, String>();
// Request authentication with confidentiality (QOP "auth-conf")
saslProperties.put(Sasl.QOP, "auth-conf");
saslProperties.put(Sasl.SERVER_AUTH, "true");
System.out.println("Security is enabled: " + UserGroupInformation.isSecurityEnabled());
// Log in via UGI; ensures we have logged in with our KRB credentials
UserGroupInformation.loginUserFromKeytab("someuser", "/etc/security/keytabs/someuser.headless.keytab");
UserGroupInformation currentUser = UserGroupInformation.getCurrentUser();
System.out.println("Current user: " + currentUser);
// SASL client transport -- does the Kerberos lifting for us
TSaslClientTransport saslTransport = new TSaslClientTransport(
        "GSSAPI",       // tell SASL to use GSSAPI, which supports Kerberos
        null,           // authorizationid - null
        args[0],        // kerberos primary for server - "myprincipal" in myprincipal/my.server.com@EXAMPLE.COM
        args[1],        // kerberos instance for server - "my.server.com" in myprincipal/my.server.com@EXAMPLE.COM
        saslProperties, // SASL properties set above
        null,           // callback handler - null
        transport);     // underlying transport
// Make sure the transport is opened as the user we logged in as
TUGIAssumingTransport ugiTransport = new TUGIAssumingTransport(saslTransport, currentUser);
ThriftHiveMetastore.Client client = new ThriftHiveMetastore.Client(new TBinaryProtocol(ugiTransport));
// Open the outermost (UGI/SASL) transport so the GSSAPI handshake runs;
// opening the raw TSocket directly leaves SASL authentication incomplete.
ugiTransport.open();

Any guidance will help me contribute back a patch enabling Kerberos support.

@surajnayak
Author

I think I found a way to connect securely using Hive's HiveMetaStoreClient implementation (https://github.com/apache/hive/blob/release-2.0.1/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java#L397). I will contribute the feature/patch once testing is successful.
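For reference, one concrete step that secure code path performs is resolving the configured Kerberos service principal by substituting the `_HOST` placeholder with the actual metastore host. The real implementation delegates to Hadoop's SecurityUtil.getServerPrincipal; this is a simplified, stdlib-only sketch of the idea:

```java
// Simplified stand-in for Hadoop's SecurityUtil.getServerPrincipal(), which
// the secure metastore-client path uses to turn a configured principal like
// "hive/_HOST@REALM" into one naming the actual metastore host.
public class PrincipalResolver {
    static String resolvePrincipal(String configuredPrincipal, String metastoreHost) {
        return configuredPrincipal.replace("_HOST", metastoreHost);
    }

    public static void main(String[] args) {
        System.out.println(resolvePrincipal("hive/_HOST@EXAMPLE.COM", "metastore1.example.com"));
        // hive/metastore1.example.com@EXAMPLE.COM
    }
}
```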
