[SPARK-50710][CONNECT] Add support for optional client reconnection to sessions after release #49342
+110
−36
What changes were proposed in this pull request?
Adds a new boolean `allow_reconnect` field to `ReleaseSessionRequest`. When set to `true` in the request, the server will not place the session in the `closedSessionsCache` of `SparkConnectSessionManager`. The session's clean-up process is unmodified.
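The intended behaviour can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToySessionManager`, `SessionClosedError`, and the method names are hypothetical stand-ins, not the actual `SparkConnectSessionManager` code, and a plain Python set stands in for `closedSessionsCache`.

```python
import uuid


class SessionClosedError(Exception):
    """Hypothetical stand-in for the INVALID_HANDLE.SESSION_CLOSED error."""


class ToySessionManager:
    """Illustrative model only; not the real SparkConnectSessionManager."""

    def __init__(self):
        # (user_id, session_id) -> server-side session id
        self.sessions = {}
        # Tombstones for released sessions (stand-in for closedSessionsCache).
        self.closed_sessions = set()

    def get_or_create(self, user_id, session_id):
        key = (user_id, session_id)
        if key in self.closed_sessions:
            raise SessionClosedError(
                f"The handle {session_id} is invalid. Session was closed."
            )
        if key not in self.sessions:
            # A fresh server-side session gets a new server-side id.
            self.sessions[key] = str(uuid.uuid4())
        return self.sessions[key]

    def release(self, user_id, session_id, allow_reconnect=False):
        key = (user_id, session_id)
        # Clean-up happens regardless of the flag.
        self.sessions.pop(key, None)
        # Tombstone the session only when the client did not opt in to reconnecting.
        if not allow_reconnect:
            self.closed_sessions.add(key)


manager = ToySessionManager()
first_id = manager.get_or_create("user", "session-1")
manager.release("user", "session-1", allow_reconnect=True)
# Reconnecting succeeds, but the server-side session is a new one.
second_id = manager.get_or_create("user", "session-1")
print(first_id != second_id)
```

In this model, releasing with the default `allow_reconnect=False` adds a tombstone, so a later `get_or_create` with the same identifiers raises `SessionClosedError`.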
Why are the changes needed?
Currently, the connect server will, by default, tombstone all sessions that have either been released explicitly (through a `ReleaseSession` request) or cleaned up due to inactivity/idleness in periodic checks.
Tombstoning prevents clients from reconnecting with the same `userId` and `sessionId`. This mechanism ensures that clients do not accidentally end up with a 'fresh' server-side session, which may be disastrous/fatal as all previously held state is lost (e.g., temporary views, temporary UDFs, modified configs, current catalog, etc.).
Consider a client that runs simple, non-state-dependent queries (e.g., `select count from ...`), perhaps sparsely during the lifetime of the application. Such a client may prefer to opt out of tombstoning for reasons such as the following: having to use a new `userId`/`sessionId` on each reconnect may be inconvenient for tracking/observability purposes.
Currently, the only way to allow clients to reconnect is to set `spark.connect.session.manager.closedSessionsTombstonesSize` to `0`. However, this is not ideal as it would allow all clients to reconnect, which, as previously pointed out, may be dangerous.
As an improvement, allowing specific clients to explicitly signal/request the reconnection possibility addresses the needs mentioned earlier.
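To see why a tombstone size of `0` opts every client out at once, consider a bounded tombstone cache. The sketch below is a hypothetical model: `TombstoneCache` and its oldest-first eviction policy are assumptions for illustration, not the server's actual cache implementation.

```python
from collections import OrderedDict


class TombstoneCache:
    """Bounded set of closed-session keys; the oldest entries are evicted first."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._entries = OrderedDict()

    def add(self, key):
        self._entries[key] = True
        self._entries.move_to_end(key)
        # Evict until the cache fits; with max_size == 0 nothing is retained.
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)

    def __contains__(self, key):
        return key in self._entries


bounded = TombstoneCache(max_size=2)
disabled = TombstoneCache(max_size=0)
for cache in (bounded, disabled):
    cache.add(("alice", "session-1"))

print(("alice", "session-1") in bounded)   # True: reconnect would be rejected
print(("alice", "session-1") in disabled)  # False: any client may reconnect
```

A size of zero therefore acts as a global switch, whereas a per-request flag lets each client decide for its own session.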
Does this PR introduce any user-facing change?
Yes.
When the client releases a session with `allow_reconnect` set to `true`, a reconnection will lead to the server generating a fresh session and will not result in an error like `[INVALID_HANDLE.SESSION_CLOSED] The handle 271dab46-a9a0-4458-ad3a-71442eaa9a21 is invalid. Session was closed. SQLSTATE: HY000`.
Full example (gRPC based):
Default/`allow_reconnect` set to `false`
Create a session via a `Config` request.
Release the session via a `ReleaseSession` request.
Retry the earlier config request. The error `[INVALID_HANDLE.SESSION_CLOSED] The handle 271dab46-a9a0-4458-ad3a-71442eaa9a21 is invalid. Session was closed. SQLSTATE: HY000` is hit.
`allow_reconnect` set to `true`
Create a session via a `Config` request.
Release the session via a `ReleaseSession` request.
Retry the earlier config request. The request goes through, and the `server_side_session_id` in the response of the last config request differs from the first one because a new server-side session was generated.
How was this patch tested?
New unit test + existing tests.
Was this patch authored or co-authored using generative AI tooling?
No.