
[BUG] Spark connector leaves multiple sessions open #314

Closed
jeremyprime opened this issue Jan 28, 2022 · 5 comments · Fixed by #323
Assignees: jeremyprime
Labels: bug, High Priority, size: 3

Comments

@jeremyprime (Collaborator)

Environment

  • Vertica Spark Connector version: 3.0.1

Problem Description

When the Spark connector performs a read or write operation, it creates one or more sessions in Vertica. These sessions are not cleaned up until the connection is closed and the client application exits.

For example, the pyspark example creates 3 sessions: 1 for the write and 2 for the read. These remain until the Python script completes.

We likely need to close these sessions when an operation completes since we do not appear to reuse a given session id. That way, even if multiple operations are performed in a single script/client, the sessions should be closed without having to close the client application.

To get the session information, or close all open sessions, run the following:

docker exec -it docker_vertica_1 vsql

-- Get all sessions
SELECT session_id, node_name, user_name, client_hostname, login_timestamp, statement_start, current_statement, last_statement, client_type, client_os FROM v_monitor.sessions ORDER BY user_name;

-- Close all sessions
SELECT close_all_sessions();
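
The same check can also be scripted over JDBC from Scala; a minimal sketch, assuming the default credentials of the local docker Vertica container (the connection details are placeholders, not a verified configuration):

import java.sql.DriverManager

// Assumed connection details for the local docker Vertica instance; adjust as needed.
val conn = DriverManager.getConnection("jdbc:vertica://localhost:5433/docker", "dbadmin", "")
try {
  val rs = conn.createStatement().executeQuery("SELECT count(*) FROM v_monitor.sessions")
  rs.next()
  println(s"Open sessions: ${rs.getInt(1)}")
} finally {
  conn.close() // release the monitoring session itself
}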

Spark Connector Logs

jeremyprime added the bug label Jan 28, 2022
jeremyprime self-assigned this Jan 28, 2022
@jeremyprime (Collaborator, Author) commented Feb 4, 2022

I have run a number of example applications in both Python and Scala. This was done using sbt run, submitting in Spark, and using Jupyter Notebook (also Spark).

For each operation (read or write) one or more sessions will be created in Vertica. These sessions are then closed when the application itself exits. The only time sessions persist is when I use a long-running application, such as Jupyter Notebook, where the kernel continues to run even after execution of the code has completed. When the Jupyter Notebook kernel is shut down, the Vertica sessions are closed.

However, since we close the connection after each operation, we would expect the session to also be closed before the application/client exits. According to this page, a session may persist for a few different reasons, such as if a transaction is not committed. We will need to investigate whether we are failing to commit transactions or close connections in some cases.
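
For reference while investigating, a minimal sketch of what a clean teardown looks like over plain JDBC (not the connector's actual code path): commit or roll back before closing, so that no open transaction keeps the session alive.

import java.sql.Connection

// Sketch of a clean teardown: every operation ends with a commit (or rollback)
// before the connection is closed, so no open transaction keeps the session alive.
def withCommit[T](conn: Connection)(op: Connection => T): T = {
  conn.setAutoCommit(false)
  try {
    val result = op(conn)
    conn.commit()     // commit so no transaction is left open
    result
  } catch {
    case e: Throwable =>
      conn.rollback() // discard pending work before the session ends
      throw e
  } finally {
    conn.close()      // closing the JDBC connection releases the Vertica session
  }
}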

@jeremyprime (Collaborator, Author)

May be related to #171 (planInputPartitions being called twice).

@jeremyprime (Collaborator, Author)

Note that there was an internal ticket, VER-76615, that tried to address this in the past. The test code on that ticket (which simply performed a write in a loop, 100 times) still exhausts the number of available sessions when run against the main branch:

java.sql.SQLNonTransientConnectionException: [Vertica][VJDBC](4060) FATAL: New session rejected due to limit, already 105 sessions active

The SQL queries in SchemaTools.getColumnInfo and TableUtils.tableExists leave open sessions when called. There may be other queries as well; these are just the ones seen with the current test code. These sessions keep building up and are not closed until the application exits.
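
For context, the reproduction is roughly the following Scala sketch: a small DataFrame written through the connector in a loop. It assumes an existing SparkSession named spark, and the option names/values shown are placeholders for a local test setup rather than a verified configuration.

// Rough sketch of the reproduction: repeatedly write a small DataFrame through the
// connector. Each iteration runs queries such as SchemaTools.getColumnInfo and
// TableUtils.tableExists, and without a fix each run can leave a session behind.
val opts = Map(
  "host" -> "localhost",
  "db" -> "docker",
  "user" -> "dbadmin",
  "password" -> "",
  "table" -> "session_loop_test",
  "staging_fs_url" -> "webhdfs://hdfs:50070/data/" // placeholder intermediate storage
)
val df = spark.range(10).toDF("value")
for (_ <- 1 to 100) {
  df.write
    .format("com.vertica.spark.datasource.VerticaSource")
    .options(opts)
    .mode("append")
    .save()
}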

@jeremyprime (Collaborator, Author)

The SQL connection to Vertica is the source of the dangling sessions. The connection is created in VerticaJdbcLayer.VerticaJdbcLayer(). This JDBC layer is currently created once for reads, VerticaPipeFactory.getReadPipe(), and once for writes, VerticaPipeFactory.getWritePipe(). However, there are multiple places where the read and write pipes are created, and thus multiple places where connections are created.

The creation of the connection and/or the creation of the JDBC layer needs to be a singleton. However, this also complicates closing of the connection as other parts of the code will try to reuse the closed connection.

Perhaps connection pooling would help manage a minimal number of connections across reads and writes, and all of the underlying queries they perform. It would be ideal to create and reuse a single connection, provided that it does not hang around after the operations are complete (such as when performing long-lived operations in a Jupyter Notebook).
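
One possible shape for that, sketched here with illustrative names rather than the connector's actual classes: a lazily created, shared connection that every read/write pipe call site reuses instead of opening its own.

import java.sql.{Connection, DriverManager}

// Illustrative sketch only (not the connector's actual classes): a single shared
// connection handed out to every read/write pipe call site, instead of each call
// site opening its own connection.
object SharedConnection {
  private var conn: Option[Connection] = None

  def get(url: String, user: String, password: String): Connection = synchronized {
    conn.getOrElse {
      val c = DriverManager.getConnection(url, user, password)
      conn = Some(c)
      c
    }
  }
}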

@jeremyprime (Collaborator, Author) commented Feb 9, 2022

Recreating the JDBC layer when the child connection is closed seems to address the issue of multiple and dangling sessions. However, the connector still performs some connection closing and reopening, which is not ideal.
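
The closed-connection check behind this is roughly the following sketch; isClosed can itself throw on a broken connection, so an exception is treated the same as a closed connection and the layer gets rebuilt.

import java.sql.{Connection, SQLException}

// Sketch of the closed-connection check: isClosed can throw on a broken connection,
// so an exception is treated the same as "closed" and triggers recreating the JDBC layer.
def connectionIsUsable(conn: Connection): Boolean =
  try conn != null && !conn.isClosed
  catch { case _: SQLException => false }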

#324 has been created to address this issue.

jeremyprime added a commit that referenced this issue Feb 9, 2022
Aryex pushed a commit that referenced this issue Feb 10, 2022
* Use singletons for the read and write pipes

* Recreate the JDBC layer if the connection was closed

* Remove commented code

* Remove semicolons (#314)

* Handle exception when checking for closed connection (#314)

* Refactored JDBC layer check (#314)
jeremyprime added a commit that referenced this issue Feb 11, 2022
jeremyprime added a commit that referenced this issue Feb 11, 2022
jeremyprime added a commit that referenced this issue Feb 14, 2022
* Use singletons for the read and write pipes

* Recreate the JDBC layer if the connection was closed

* Remove commented code

* Remove semicolons (#314)

* Handle exception when checking for closed connection (#314)

* Refactored JDBC layer check (#314)

* Added integration test for session handling (#314)

* Close read connection after write (#314)

* Try sleeping to ensure sessions are released (#314)

* Increase sleep (#314)

* Refactored JDBC layer close (#314)

* Poll session count instead of sleeping (#314)

* Refactor session polling (#314)

* Only look for Spark connector sessions (#314)

* Ignore test as it sometimes fails on GitHub (#314)