add minimal example for how Google Cloud and AWS S3 access works different @tpietzsch
StephanPreibisch committed Oct 25, 2024
1 parent f594aaa commit f6a417e
Showing 1 changed file with 37 additions and 0 deletions.
37 changes: 37 additions & 0 deletions src/main/java/util/URITools.java
@@ -671,8 +671,45 @@ public static void copyFile( final File inputFile, final File outputFile ) throw
}
}

	public static void minimalExampleTobiS3GS() throws URISyntaxException, IOException
	{
		URI gcURI = URITools.toURI( "gs://janelia-spark-test/I2K-test/dataset.xml" );
		URI s3URI = URITools.toURI( "s3://janelia-bigstitcher-spark/Stitching/dataset.xml" );

		// assemble scheme + bucket and location for Google Cloud
		URI gcBucket = new URI( gcURI.getScheme(), gcURI.getHost(), null, null );
		URI gcLocation = new URI( null, null, gcURI.getPath(), null );

		System.out.println( "Google cloud: Instantiating N5Reader to grab a key-value-access for: '" + gcBucket + "'" );
		System.out.println( "Google cloud: Lock for reading on: '" + gcLocation + "'");

		final N5Reader gcN5 = instantiateN5Reader(StorageFormat.N5, gcBucket );
		final KeyValueAccess gcKVA = ((GsonKeyValueN5Reader)gcN5).getKeyValueAccess();
		gcKVA.lockForReading( gcLocation.toString()).newInputStream().skip( 1000 );

		// assemble scheme + bucket and location for AWS S3 (using full path for location, which should actually be relative, same as for google cloud above)
		URI s3Bucket = new URI( s3URI.getScheme(), s3URI.getHost(), null, null );
		URI s3Location = new URI( s3URI.getScheme(), s3URI.getHost(), s3URI.getPath(), null );

		System.out.println( "AWS: Instantiating N5Reader to grab a key-value-access for: '" + s3Bucket + "'" );
		System.out.println( "AWS: Lock for reading on: '" + s3Location + "'");

		final N5Reader s3N5 = instantiateN5Reader(StorageFormat.N5, s3Bucket );
		final KeyValueAccess s3KVA = ((GsonKeyValueN5Reader)s3N5).getKeyValueAccess();
		s3KVA.lockForReading( s3Location.toString() ).newInputStream().skip( 1000 );

		// assemble relative location for AWS S3 (which fails)
		s3Location = new URI( null, null, s3URI.getPath(), null );
		System.out.println( "AWS: Lock for reading on: '" + s3Location + "' (fails)");
		s3KVA.lockForReading( s3Location.toString() ).newInputStream().skip( 1000 );
	}

	public static void main( String[] args ) throws SpimDataException, IOException, URISyntaxException
	{
		minimalExampleTobiS3GS();

		System.exit( 0 );

		URI gcURI = URITools.toURI( "gs://janelia-spark-test/I2K-test/dataset.xml" );
		System.out.println( "isGC: " + isGC(gcURI) + " [" + gcURI + "]" );
		SpimData2 sdGC = loadSpimData(gcURI, new XmlIoSpimData2() );

1 comment on commit f6a417e

StephanPreibisch commented on f6a417e Oct 25, 2024

Google cloud: Instantiating N5Reader to grab a key-value-access for: 'gs://janelia-spark-test'
Google cloud: Lock for reading on: '/I2K-test/dataset.xml'
AWS: Instantiating N5Reader to grab a key-value-access for: 's3://janelia-bigstitcher-spark'
AWS: Lock for reading on: 's3://janelia-bigstitcher-spark/Stitching/dataset.xml'
AWS: Lock for reading on: '/Stitching/dataset.xml' (fails)
Exception in thread "main" org.janelia.saalfeldlab.n5.N5Exception$N5NoSuchKeyException: No such key
	at org.janelia.saalfeldlab.n5.s3.AmazonS3KeyValueAccess$S3ObjectChannel.newInputStream(AmazonS3KeyValueAccess.java:611)
	at util.URITools.minimalExampleTobiS3GS(URITools.java:704)
	at util.URITools.main(URITools.java:709)
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: MM3ET3F14MK8XXXT; S3 Extended Request ID: OHN373UB59gzJUmNnaTOzFwMWBlpG6Z4Unc1EkZPU5mKoAbeO9J/sUZpRPWZLn4ACFuAR3SV6aw=; Proxy: null), S3 Extended Request ID: OHN373UB59gzJUmNnaTOzFwMWBlpG6Z4Unc1EkZPU5mKoAbeO9J/sUZpRPWZLn4ACFuAR3SV6aw=
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1912)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1450)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1419)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1183)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:838)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:805)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:779)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:735)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:717)
	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:581)
	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5575)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5522)
	at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1569)
	at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1423)
	at org.janelia.saalfeldlab.n5.s3.AmazonS3KeyValueAccess$S3ObjectChannel.newInputStream(AmazonS3KeyValueAccess.java:608)
	... 2 more

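To test whether access works, I simply call .newInputStream().skip( 1000 ) on each location. Only the relative path on AWS S3 fails (with the N5NoSuchKeyException above); the Google Cloud relative path and the full S3 URI both work fine.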
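
For readers without access to the buckets, the URI handling in the example can be reproduced with the JDK alone. The sketch below is not part of the commit: the class `UriSplitSketch` and all of its helper names are hypothetical, and it does not touch N5, Google Cloud, or AWS. It only shows which strings the four-argument `java.net.URI` constructor produces for the bucket, the relative location, and the absolute location. The last helper, `keyWithoutLeadingSlash`, is an untested assumption about one more candidate worth trying on S3 (object keys there are conventionally written without a leading slash); whether it resolves the failure above is not verified here.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriSplitSketch
{
	// Hypothetical helper: wraps the four-argument URI constructor used in the commit
	// and converts the checked URISyntaxException into an unchecked exception.
	static URI build( final String scheme, final String host, final String path )
	{
		try
		{
			return new URI( scheme, host, path, null );
		}
		catch ( final URISyntaxException e )
		{
			throw new IllegalArgumentException( e );
		}
	}

	// scheme + bucket only, e.g. "s3://janelia-bigstitcher-spark"
	static URI bucketOf( final URI uri )
	{
		return build( uri.getScheme(), uri.getHost(), null );
	}

	// path only, e.g. "/Stitching/dataset.xml" (the form that fails on S3 in the log above)
	static URI relativeLocationOf( final URI uri )
	{
		return build( null, null, uri.getPath() );
	}

	// scheme + bucket + path, i.e. the full URI again (the form that works on S3 in the log above)
	static URI absoluteLocationOf( final URI uri )
	{
		return build( uri.getScheme(), uri.getHost(), uri.getPath() );
	}

	// Assumption, not tested against N5: the path with its leading slash removed
	static String keyWithoutLeadingSlash( final URI uri )
	{
		final String path = uri.getPath();
		return path.startsWith( "/" ) ? path.substring( 1 ) : path;
	}

	public static void main( final String[] args )
	{
		final String[] inputs = {
				"gs://janelia-spark-test/I2K-test/dataset.xml",
				"s3://janelia-bigstitcher-spark/Stitching/dataset.xml" };

		for ( final String input : inputs )
		{
			final URI uri = URI.create( input );
			System.out.println( "bucket:            '" + bucketOf( uri ) + "'" );
			System.out.println( "relative location: '" + relativeLocationOf( uri ) + "'" );
			System.out.println( "absolute location: '" + absoluteLocationOf( uri ) + "'" );
			System.out.println( "key, no slash:     '" + keyWithoutLeadingSlash( uri ) + "'" );
		}
	}
}
```

The first three values match the strings printed in the console output above, so the sketch can be used to check what is actually being passed to lockForReading before involving the remote store.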
