
Run it from a local Spark instance in Eclipse #49

Open
santooudnur opened this issue Jan 15, 2018 · 0 comments

Is it possible to access IBM Cloud Object Storage from outside an Apache Spark instance in Bluemix?

Basically, I am trying to use this library to access COS objects from a Scala program running on local Apache Spark. I am trying to connect to a Cloud Object Storage instance on my Bluemix account and access the temperatureUS.csv object in the bucket tests from Scala code.

Test code can be found here:
SparkCosS.txt
I always get the following error:

18/01/15 19:29:50 DEBUG request: Received error response: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: null; Status Code: 403; Error Code: 403 Forbidden; Request ID: 8cee1d0b-c4d8-4800-a75f-06ff49e76a5b), S3 Extended Request ID: null
18/01/15 19:29:50 DEBUG COSAPIClient: Not found cos://tests.myCos/temperatureUS.csv
18/01/15 19:29:50 WARN COSAPIClient: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: 8cee1d0b-c4d8-4800-a75f-06ff49e76a5b), S3 Extended Request ID: null
Exception in thread "main" org.apache.spark.sql.AnalysisException: Path does not exist: cos://tests.myCos/temperatureUS.csv;
at org.apache.spark.sql.execution.datasources.DataSource$.org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary(DataSource.scala:626)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:350)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:350)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:392)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.immutable.List.flatMap(List.scala:355)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:349)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:156)
at test.SparkCosFinalSL$.main(SparkCosSL.scala:86)
at test.SparkCosFinalSL.main(SparkCosSL.scala)

However, I am able to connect to the service through the Java API:
    SDKGlobalConfiguration.IAM_ENDPOINT = "https://iam.bluemix.net/oidc/token";

    String bucketName = "testb5e78bd1988d453f81ec11cbfced949a"; // "<bucketName>"
    String api_key = "L_-uMLV9AU-ZBWr0BE6JmiHMYFqsORXndMmfrpaqJIgG"; // "<apiKey>"
    String service_instance_id = "crn:v1:bluemix:public:cloud-object-storage:global:a/647b189897a37a7ac4dbf0a3ef43fc42:866ec777-5c98-4e1c-b2bf-e5d0b1d13694::"; // "<resourceInstanceId>"
    String endpoint_url = "https://s3-api.us-geo.objectstorage.softlayer.net";
    String location = "us-geo"; // "us"

    System.out.println("Current time: " + new Timestamp(System.currentTimeMillis()).toString());
    _s3Client = createClient(api_key, service_instance_id, endpoint_url, location);

    listObjects(bucketName, _s3Client);
    listBuckets(_s3Client);

Let me know if there is anything I have missed.
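For reference, here is a minimal sketch of the Spark-side Hadoop configuration I believe Stocator needs when Spark runs locally (key names are taken from the Stocator README, not verified against my setup; the service name `myCos` must match the one in the `cos://tests.myCos/` URI, and the credential values are placeholders):

```scala
import org.apache.spark.sql.SparkSession

object CosLocalSketch {
  def main(args: Array[String]): Unit = {
    // Local Spark session, as when launching from Eclipse.
    val spark = SparkSession.builder()
      .appName("cos-local-test")
      .master("local[*]")
      .getOrCreate()

    val hc = spark.sparkContext.hadoopConfiguration

    // Register the Stocator driver for the cos:// scheme.
    hc.set("fs.stocator.scheme.list", "cos")
    hc.set("fs.cos.impl", "com.ibm.stocator.fs.ObjectStoreFileSystem")
    hc.set("fs.stocator.cos.impl", "com.ibm.stocator.fs.cos.COSAPIClient")
    hc.set("fs.stocator.cos.scheme", "cos")

    // Per-service keys: the middle segment ("myCos") must match the URI
    // cos://<bucket>.<service>/<object>. Placeholder credentials below.
    hc.set("fs.cos.myCos.endpoint", "https://s3-api.us-geo.objectstorage.softlayer.net")
    hc.set("fs.cos.myCos.iam.api.key", "<apiKey>")
    hc.set("fs.cos.myCos.iam.service.id", "<resourceInstanceId>")

    // Read the object exactly as in my failing test.
    val df = spark.read
      .option("header", "true")
      .csv("cos://tests.myCos/temperatureUS.csv")
    df.show(5)
  }
}
```

If any of the `fs.cos.myCos.*` keys is missing or the service segment does not match the URI, Stocator would fall back to unauthenticated requests, which could explain the 403 even though the same credentials work through the Java SDK.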

However, the only other observation I have when running Spark from Eclipse is that the native Hadoop library is not loaded:

18/01/15 19:42:01 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

I would appreciate your quick response.
