Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HADOOP-19354. S3AInputStream to be created by factory under S3AStore #7322

Conversation

ahmarsuhail
Copy link
Contributor

Description of PR

PR to merge #7214 into feature branch.

How was this patch tested?

For code changes:

  • Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

steveloughran and others added 7 commits January 23, 2025 11:25
First iteration
* Factory interface with a parameter object creation method
* Base class AbstractS3AInputStream for all streams to create
* S3AInputStream subclasses that and has a factory
* Production and test code to use it

Not done
* Input stream callbacks pushed down to S3Store
* S3Store to dynamically choose factory at startup, stop in close()
* S3Store to implement the factory interface, completing final binding
  operations (callbacks, stats)

Change-Id: I8d0f86ca1f3463d4987a43924f155ce0c0644180
Revision

API: Make clear this is part of the fundamental store Model:

* abstract stream class is now ObjectInputStream
* interface is ObjectInputStreamFactory
* move to package org.apache.hadoop.fs.s3a.impl.model

Implementation: Prefetching stream is created this way too;
adds one extra parameter.

Maybe we should pass conf down too

Change-Id: I5bbb5dfe585528b047a649b6c82a9d0318c7e91e
Change-Id: If42bdd0b227c4da07c62a410a998e6d8c35581f6
Moves all prefetching stream related options into the prefetching stream
factory; the standard ReadOpContext removes them, so
a new PrefetchingOptions is passed around.

Stream factories can now declare how many extra shared threads they
want and whether or not to create a future pool around the bounded pool.
This is used in S3AFileSystem when creating its thread pools -this class
no longer reads in any of the prefetching options.

All tests which enable/disable prefetching, or probe for its state,
now use S3ATestUtils methods for this.
This avoids them having to now explicitly unset two properties,
set the new input stream type, and any more complications in test
setup in future.

Everything under S3AStore is a service, so service lifecycle matches everywhere
-and store just adds to the list of managed services for start/stop/close
integration.

+ adjust assertions in ITestS3AInputStreamLeakage for prefetching
+ update the prefetching.md doc for factory changs
+ javadocs
+ add string values of type names to Constants

Once the analytics stream is in, a full doc on "stream performance"
will be needed.

package for this stuff is now impl.streams

Change-Id: Id6356d2ded2c477ba16cbb9027ac0cfbece2a542
Push factory construction into the enum itself

Store implements stream capabilities, which are then
relayed to the active factory. This avoids the FS having
to know what capabilities are available in the stream.

Abstract base class for stream factories.

Change-Id: Ib757e6696f29cc7e0e8edd1119e738c6adc6f98f
Change-Id: Id79f8aa019095c1601bb0b2a282c51bdb0b7b817
- Add callbacks from stream factories to creator.
- Initial operation is to ask for an async client.
- Callbacks and wiring up done in S3AStoreImpl.

Change-Id: I544f05da15e3b57e9a538d337b972e4e07dc8877
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants