diff --git a/build.gradle b/build.gradle index 6a6c916f91..fa63b070f8 100644 --- a/build.gradle +++ b/build.gradle @@ -1167,9 +1167,11 @@ task product(type: Zip) { into "${snappyProductDir}/benchmark" } + /* (preferred one is the standard %jdbc interpreter in Zeppelin) if (rootProject.hasProperty('enablePublish')) { packageZeppelinInterpreter() } + */ if (rootProject.hasProperty('R.enable')) { def targetRDir = "${snappyProductDir}/R" @@ -1479,6 +1481,7 @@ task sparkPackage { dependsOn ":snappy-core_${scalaBinaryVersion}:sparkPackage" } +product.mustRunAfter clean, cleanAll distTar.mustRunAfter clean, cleanAll, product distZip.mustRunAfter clean, cleanAll, product distRpm.mustRunAfter clean, cleanAll, product diff --git a/docs/configuring_cluster/configure_launch_cluster_multinode.md b/docs/configuring_cluster/configure_launch_cluster_multinode.md index 2498b751b0..99d8728192 100644 --- a/docs/configuring_cluster/configure_launch_cluster_multinode.md +++ b/docs/configuring_cluster/configure_launch_cluster_multinode.md @@ -53,7 +53,7 @@ The following core properties must be set in the **conf/leads** file: | heap-size | Sets the maximum heap size for the Java VM, using SnappyData default resource manager settings.
For example, `-heap-size=8g`
It is recommended to allocate minimum **6-8 GB** of heap size per lead node. If you use the `-heap-size` option, by default SnappyData sets the critical-heap-percentage to 95% of the heap size, and the `eviction-heap-percentage` to 85.5% of the `critical-heap-percentage`.
SnappyData also sets resource management properties for eviction and garbage collection if the JVM supports them. | |
| dir | Working directory of the member that contains the SnappyData Server status file and the default location for the log file, persistent files, data dictionary, and so forth. | Current directory |
| classpath | Location of user classes required by the SnappyData Server. This path is appended to the current classpath | Appended to the current classpath |
-| -zeppelin.interpreter.enable=true |Enable the SnappyData Zeppelin interpreter. Refer [How to use Apache Zeppelin with SnappyData](/howto/use_apache_zeppelin_with_snappydata.md) | |
+| -zeppelin.interpreter.enable=true |Enable the SnappyData Zeppelin interpreter. This property is no longer useful since the standard `%jdbc` interpreter is preferred. Refer [How to use Apache Zeppelin with SnappyData](/howto/use_apache_zeppelin_with_snappydata.md) | |
| spark.executor.cores | The number of cores to use on each server. | |
| spark.jars | | |

@@ -69,7 +69,7 @@ You can add a line for each of the Lead members that you want to launch. Typical

-In the following configuration, you are specifying the Spark UI port and the number of cores to use on each server as well as enabling the SnappyData Zeppelin interpreter
+In the following configuration, you are specifying the Spark UI port and the number of cores to use on each server

```
-localhost -spark.ui.port=3333 -spark.executor.cores=16 -zeppelin.interpreter.enable=true
+localhost -spark.ui.port=3333 -spark.executor.cores=16
```

!!!Tip
diff --git a/docs/howto/concurrent_apache_zeppelin_access_to_secure_snappydata.md b/docs/howto/concurrent_apache_zeppelin_access_to_secure_snappydata.md
index ed97762044..536a44c52c 100644
--- a/docs/howto/concurrent_apache_zeppelin_access_to_secure_snappydata.md
+++ b/docs/howto/concurrent_apache_zeppelin_access_to_secure_snappydata.md
@@ -5,61 +5,43 @@ Multiple users can concurrently access a secure SnappyData cluster by configurin

!!! Note

	- * Currently, only the `%snappydata` and `%jdbc` interpreters are supported with a secure SnappyData cluster. 
+ * Currently, only the `%jdbc` interpreter is supported with a secure SnappyData cluster. - * Each user accessing the secure SnappyData cluster should configure the `%snappydata` and `%jdbc` interpreters in Apache Zeppelin as described in this section. + * Each user accessing the secure SnappyData cluster should configure the `%jdbc` interpreter in Apache Zeppelin as described here. ## Step 1: Download, Install and Configure SnappyData -1. [Download and install SnappyData Enterprise Edition](../install.md)
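For reference, the `%jdbc` interpreter settings that each user configures for a secure cluster (detailed in the steps that follow) reduce to a small property set like the following sketch; the locator host, port, and credentials are placeholders to substitute for your own cluster:

```pre
default.driver   = io.snappydata.jdbc.ClientDriver
default.url      = jdbc:snappydata://<locator-host>:1527/
default.user     = <user-name>
default.password = <user-password>
```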
+ +1. [Download and install SnappyData](../install.md). 2. [Configure the SnappyData cluster with security enabled](../security/security.md). 3. [Start the SnappyData cluster](start_snappy_cluster.md). - - Create a table and load data. + - Create a table and load data. - - Grant the required permissions for the users accessing the table. + - Grant the required permissions for the users accessing the table. For example: snappy> GRANT SELECT ON Table airline TO user2; - snappy> GRANT INSERT ON Table airline TO user3; - snappy> GRANT UPDATE ON Table airline TO user4; - - !!! Note - User requiring INSERT, UPDATE or DELETE permissions also require explicit SELECT permission on a table. - -5. Extract the contents of the Zeppelin binary package.
- -6. Start the Zeppelin daemon using the command:
`./bin/zeppelin-daemon.sh start`

-## Configure the JDBC Interpreter
-Log on to Zeppelin from your web browser and configure the [JDBC Interpreter](https://zeppelin.apache.org/docs/0.8.2/interpreter/jdbc.html).
+    snappy> GRANT INSERT ON Table airline TO user3;
+    snappy> GRANT UPDATE ON Table airline TO user4;

-    Zeppelin web server is started on port 8080
-    http://:8080/#/
+    To enable running `EXEC SCALA`, also grant the required privilege:

-
-## Configure the Interpreter
+    snappy> GRANT PRIVILEGE EXEC SCALA TO user2;

-1. Log on to Zeppelin from your web browser and select **Interpreter** from the **Settings** option.
-2. Edit the existing `%snappydata` and `%jdbc` interpreters and configure the interpreter properties.
-    The table lists the properties required for SnappyData:
-
-    | Property | Value |Description|
-    |--------|--------|--------|
-    |default.url|jdbc:snappydata://localhost:1527/|Specify the JDBC URL for SnappyData cluster in the format `jdbc:snappydata://:1527`|
-    |default.driver|io.snappydata.jdbc.ClientDriver|Specify the JDBC driver for SnappyData|
-    |default.password||The JDBC user password|
-    |default.user||The JDBC username|
+    !!! Note
+        Users requiring INSERT, UPDATE, or DELETE permissions also require explicit SELECT permission on a table.

-3. **Dependency settings**<br>
Since Zeppelin includes only PostgreSQL driver jar by default, you need to add the Client (JDBC) JAR file path for SnappyData with the `%jdbc` interpreter. The SnappyData Client (JDBC) JAR file (snappydata-jdbc-2.11\_1.3.0.jar) is available on [the release page](https://github.com/TIBCOSoftware/snappydata/releases/tag/v1.3.0).
The SnappyData Client (JDBC) JAR file (snappydata-jdbc\_2.11-1.3.0.jar)can also be placed under **/interpreter/jdbc** before starting Zeppelin instead of providing it in the dependency setting.

+    !!! IMPORTANT
+        Beware that granting the EXEC SCALA privilege is overarching by design and essentially makes the user
+        equivalent to the database administrator, since Scala code can be used to modify anything using internal APIs.

-4. If required, edit other properties, and then click **Save** to apply your changes.
+4. Follow the remaining steps as given in [How to Use Apache Zeppelin with SnappyData](use_apache_zeppelin_with_snappydata.md).

**See also**

* [How to Use Apache Zeppelin with SnappyData](use_apache_zeppelin_with_snappydata.md)

-* [How to connect using JDBC driver](/howto/connect_using_jdbc_driver.md)
+* [How to connect using JDBC driver](../howto/connect_using_jdbc_driver.md)
diff --git a/docs/howto/use_apache_zeppelin_with_snappydata.md b/docs/howto/use_apache_zeppelin_with_snappydata.md
index b99b6f2b91..dc7efec061 100644
--- a/docs/howto/use_apache_zeppelin_with_snappydata.md
+++ b/docs/howto/use_apache_zeppelin_with_snappydata.md
@@ -3,42 +3,32 @@

## Step 1: Download, Install and Configure SnappyData

-1. [Download and Install SnappyData](../install/install_on_premise.md)<br>
- The product jars directory already includes the snappydata-zeppelin jar used by SnappyData and Zeppelin installations. - The table below lists the version of the SnappyData Zeppelin Interpreter and Apache Zeppelin Installer for the supported SnappyData Releases. - | SnappyData Zeppelin Interpreter | Apache Zeppelin Binary Package | SnappyData Release| - |---------------------------------|--------------------------------|-------------------| - |[Version 0.8.2.1](https://github.com/TIBCOSoftware/snappy-zeppelin-interpreter/releases/tag/v0.8.2.1) |[Version 0.8.2](http://archive.apache.org/dist/zeppelin/zeppelin-0.8.2/zeppelin-0.8.2-bin-netinst.tgz) |[Release 1.3.0](https://github.com/TIBCOSoftware/snappydata/releases/tag/v1.3.0)| - |[Version 0.7.3.6](https://github.com/TIBCOSoftware/snappy-zeppelin-interpreter/releases/tag/v0.7.3.6) |[Version 0.7.3](http://archive.apache.org/dist/zeppelin/zeppelin-0.7.3/zeppelin-0.7.3-bin-netinst.tgz) |[Release 1.2.0](https://github.com/TIBCOSoftware/snappydata/releases/tag/v1.2.0)| +1. [Download and install SnappyData](../install.md). 2. [Configure the SnappyData Cluster](../configuring_cluster/configuring_cluster.md). -3. In [lead node configuration](../configuring_cluster/configuring_cluster.md#configuring-leads) set the following properties: +3. [Start the SnappyData cluster](start_snappy_cluster.md). - - Enable the SnappyData Zeppelin interpreter by adding `-zeppelin.interpreter.enable=true` +4. Extract the contents of the [Zeppelin 0.8.2 binary package](http://archive.apache.org/dist/zeppelin/zeppelin-0.8.2/zeppelin-0.8.2-bin-netinst.tgz). + Then `cd` into the extracted `zeppelin-0.8.2-bin-netinst` directory.
+ Note that while these instructions work with any version of Zeppelin, the demo notebooks installed later + have been created and tested only on Zeppelin 0.8.2 and may not work correctly on other versions. - - In the **conf/spark-env.sh** file, set the `SPARK_PUBLIC_DNS` property to the public DNS name of the lead node. This enables the Member Logs to be displayed correctly to users accessing the [SnappyData Monitoring Console](../monitoring/monitoring.md) from outside the network. - In an AWS environment, this property is set automatically to the public address of the lead node so can be skipped. - -4. [Start the SnappyData cluster](start_snappy_cluster.md). - -5. Extract the contents of the Zeppelin binary package.
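Taken together, the Zeppelin setup described in these steps amounts to a command sequence along the following lines; the download location and the path to the JDBC jar are illustrative:

```pre
> wget http://archive.apache.org/dist/zeppelin/zeppelin-0.8.2/zeppelin-0.8.2-bin-netinst.tgz
> tar xzf zeppelin-0.8.2-bin-netinst.tgz
> cd zeppelin-0.8.2-bin-netinst
> ZEPPELIN_INTERPRETER_DEP_MVNREPO=https://repo1.maven.org/maven2 ./bin/install-interpreter.sh --name angular,jdbc
> cp /path/to/snappydata-jdbc_2.11-1.3.0.jar interpreter/jdbc/
> ./bin/zeppelin-daemon.sh start
```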
- -6. The SnappyData Zeppelin interpreter is included in the product jars directory. Install it in Apache Zeppelin by executing the following command from Zeppelin's installation directory:
- - ./bin/install-interpreter.sh --name snappydata --artifact /jars/snappydata-zeppelin_2.11-.jar - - Zeppelin interpreter allows the SnappyData interpreter to be plugged into Zeppelin using which, you can run queries. - Install additional interpreters like below (angular is used by display panels of the sample notebooks installed later):
+5. Install a couple of additional interpreters (angular is used by display panels of the sample notebooks installed later):
        ZEPPELIN_INTERPRETER_DEP_MVNREPO=https://repo1.maven.org/maven2 ./bin/install-interpreter.sh --name angular,jdbc

-    These additional interpreters may need to be configured similar to the snappydata interpreter as described in the next section.
+    If you are using the `all` binary package from Zeppelin instead of the `netinst` package linked in the previous step,
+    then you can skip this step.
+
+6. Copy the [SnappyData JDBC client jar](https://github.com/TIBCOSoftware/snappydata/releases/download/v1.3.0/snappydata-jdbc_2.11-1.3.0.jar)
+    into the `interpreter/jdbc` directory.
Extract and copy the contents of the compressed tar file (tar xzf) to the **notebook** folder in the Zeppelin installation on your local machine. +7. Download the predefined SnappyData notebooks with configuration [notebooks\_embedded\_zeppelin.tar.gz](https://github.com/TIBCOSoftware/snappy-zeppelin-interpreter/blob/master/examples/notebook/notebooks_embedded_zeppelin.tar.gz).
+ Extract the contents of the compressed tar file (tar xzf) in the Zeppelin installation on your local machine. -8. Start the Zeppelin daemon using the command:
`bin/zeppelin-daemon.sh start` +8. Start the Zeppelin daemon using the command:
`./bin/zeppelin-daemon.sh start`

9. To ensure that the installation is successful, log into the Zeppelin UI (**http://localhost:8080** or :8080) from your web browser.

@@ -50,64 +40,26 @@ Refer [here](concurrent_apache_zeppelin_access_to_secure_snappydata.md) for inst

## Step 2: Configure Interpreter Settings

1. Log on to Zeppelin from your web browser and select **Interpreter** from the **Settings** option.
-    This will require administrator privileges, which has user name as `admin` by default.
+    This requires a user with administrator privileges, whose user name is `admin` by default.
    See **zeppelin-dir/conf/shiro.ini** file for the default admin password and other users and update
    the file to use your preferred authentication scheme as required.

-2. Click **Create** to add an interpreter. If the list of interpreters already has snappydata,
-    then skip this step and instead configure the existing interpreter as shown in the next step.<br>
![Create](../Images/create_interpreter.png) - -3. From the **Interpreter group** drop-down select **SnappyData**. - ![Configure Interpreter](../Images/snappydata_interpreter_properties.png) - - !!! Note - If **SnappyData** is not displayed in the **Interpreter group** drop-down list, try the following options, and then restart Zeppelin daemon: - - * Delete the **interpreter.json** file located in the **conf** directory (in the Zeppelin home directory). - - * Delete the **zeppelin-spark_<_version_number_>.jar** file located in the **interpreter/SnappyData** directory (in the Zeppelin home directory). - - -4. Click the **Connect to existing process** option. The fields **Host** and **Port** are displayed. - -5. Specify the host on which the SnappyData lead node is executing, and the SnappyData Zeppelin Port (Default is 3768). - - | Property | Default Values | Description | - |----------|----------------|-------------| - |Host |localhost |Specify host on which the SnappyData lead node is executing | - |Port |3768 |Specify the Zeppelin server port | - -6. Configure the interpreter properties.
The table lists the properties required for SnappyData. - - | Property | Value | Description | - |----------|-------|-------------| - |default.url|jdbc:snappydata://localhost:1527/ | Specify the JDBC URL for SnappyData cluster in the format `jdbc:snappydata://:1527` | - |default.driver|io.snappydata.jdbc.ClientDriver| Specify the JDBC driver for SnappyData| - |snappydata.connection|localhost:1527| Specify the `host:clientPort` combination of the locator for the JDBC connection (only required if running smart connector) | - |master|local[*]| Specify the URI of the spark master (only local/split mode) | - |zeppelin.jdbc.concurrent.use|true| Specify the Zeppelin scheduler to be used.
Select **True** for Fair and **False** for FIFO | - -7. If required, edit other properties, and then click **Save** to apply your changes.
- - -!!! Note -You can modify the default port number of the Zeppelin interpreter by setting the property:
-`-zeppelin.interpreter.port=` in [lead node configuration](../configuring_cluster/configuring_cluster.md#configuring-leads). - -## Additional Settings - -1. Create a note and bind the interpreter by setting SnappyData as the default interpreter.
SnappyData Zeppelin Interpreter group consist of two interpreters. Click and drag *<_Interpreter_Name_>* to the top of the list to set it as the default interpreter. +2. Click on **edit** in the `jdbc` interpreter section. - | Interpreter Name | Description | - |------------------|-------------| - |%snappydata.snappydata or
%snappydata.spark | This interpreter is used to write Scala code in the paragraph. SnappyContext is injected in this interpreter and can be accessed using variable **snc** | - |%snappydata.sql | This interpreter is used to execute SQL queries on the SnappyData cluster. It also has features of executing approximate queries on the SnappyData cluster.| +3. Configure the interpreter properties.
The table below lists the properties required for SnappyData.

-2. Click **Save** to apply your changes.
+    | Property | Value | Description |
+    |-------------|-------|-------------|
+    |default.driver |io.snappydata.jdbc.ClientDriver |Specify the JDBC driver for SnappyData |
+    |default.url |jdbc:snappydata://localhost:1527 |Specify the JDBC URL for SnappyData cluster in the format `jdbc:snappydata://:1527` |
+    |default.user |SQL user name or `app` |If security is enabled in the SnappyData cluster, then the configured user name else `app` |
+    |default.password |SQL user password or `app` |If security is enabled in the SnappyData cluster, then the password of the user else can be anything |
+    |zeppelin.splitQueries |true |Each query in a paragraph is executed separately and returns its own result |
+    |zeppelin.jdbc.concurrent.use |true |Specify the Zeppelin scheduler to be used.<br>
Select **True** for Fair and **False** for FIFO | + |zeppelin.jdbc.interpolation |true |If interpolation of `ZeppelinContext` objects into the paragraph text is allowed | -### Known Issue +4. If required, edit other properties, and then click **Save** to apply your changes.
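Once the interpreter is saved, a notebook paragraph can submit SQL, and `EXEC SCALA` statements if that privilege has been granted, through the `%jdbc` directive. A minimal illustrative paragraph, using the `airline` example table referenced elsewhere in these documents:

```pre
%jdbc
select count(*) from airline;
```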
-If you are using SnappyData Zeppelin Interpreter 0.7.1 and Zeppelin Installer 0.7 with SnappyData or future releases, the approximate result does not work on the sample table, when you execute a paragraph with the `%sql show-instant-results-first` directive. ## FAQs diff --git a/docs/install/building_from_source.md b/docs/install/building_from_source.md index d799747b0e..315977d57a 100644 --- a/docs/install/building_from_source.md +++ b/docs/install/building_from_source.md @@ -21,29 +21,27 @@ To build product artifacts in all supported formats (tarball, zip, rpm, deb): ```pre > git clone https://github.com/TIBCOSoftware/snappydata.git --recursive > cd snappydata -> ./gradlew cleanAll -> ./gradlew distProduct +> ./gradlew cleanAll distProduct ``` The artifacts are in **build-artifacts/scala-2.11/distributions** You can also add the flags `-PenablePublish -PR.enable` to get them in the form as in an official -SnappyData distributions but that also requires zeppelin-interpreter and R as noted below. +SnappyData distributions but that also requires an installation of R as noted below. To build all product artifacts that are in the official SnappyData distributions: ```pre > git clone https://github.com/TIBCOSoftware/snappydata.git --recursive -> git clone https://github.com/TIBCOSoftware/snappy-zeppelin-interpreter.git > cd snappydata -> ./gradlew cleanAll -> ./gradlew product copyShadowJars distTar -PenablePublish -PR.enable +> ./gradlew cleanAll product copyShadowJars distTar -PenablePublish -PR.enable ``` The artifacts are in **build-artifacts/scala-2.11/distributions** -Building SparkR (with the `R.enable` flag) requires R to be installed locally and at least the following -R packages along with their dependencies: knitr, markdown, rmarkdown, testthat +Building SparkR with the `-PR.enable` flag requires R 3.x or 4.x to be installed locally. 
+At least the following R packages along with their dependencies also need to be installed:
+`knitr`, `markdown`, `rmarkdown`, `testthat`

## Repository Layout

diff --git a/docs/isight/quick_start_steps.md b/docs/isight/quick_start_steps.md
index e5df80f00e..83efe75697 100644
--- a/docs/isight/quick_start_steps.md
+++ b/docs/isight/quick_start_steps.md
@@ -221,6 +221,18 @@ Connecting the SnappyData Interpreter to the SnappyData cluster is represented i

![Example](../Images/isightconnect.png)

+## Important Note
+
+The `%snappydata.*` interpreters described in the sections below are no longer preferred due to being
+unsupported on secure clusters. The standard `%jdbc` interpreter with support for `EXEC SCALA` provides
+equivalent functionality for both secure and insecure clusters.
+
+Refer to [How to Use Apache Zeppelin with SnappyData](../howto/use_apache_zeppelin_with_snappydata.md) for more details.
+
+The previous way can still be useful for AQP queries with the `show-instant-results-first` directive
+as described in the sections below, but it works only for insecure clusters; in all other cases,
+use of the `%jdbc` interpreter should be preferred.
+
## Using the Interpreter
SnappyData Interpreter group consists of the interpreters `%snappydata.spark` and `%snappydata.sql`. To use an interpreter, add the associated interpreter directive with the format, `%` at the beginning of a paragraph in your note. In a paragraph, use one of the interpreters, and then enter required commands.

diff --git a/docs/reference/command_line_utilities/scala-cli.md b/docs/reference/command_line_utilities/scala-cli.md
index 49e67800b2..5db8b7c607 100644
--- a/docs/reference/command_line_utilities/scala-cli.md
+++ b/docs/reference/command_line_utilities/scala-cli.md
@@ -1,6 +1,8 @@
# snappy-scala CLI

-The snappy-scala CLI is introduced as an experimental feature in the SnappyData 1.2.0 release and is considered a stable feature in the 1.3.0 release. 
This is similar to the Spark shell in its capabilities. The [Spark documentation](https://spark.apache.org/docs/2.1.1/quick-start.html) defines the Spark shell as follows:
+The snappy-scala CLI, which was introduced as an experimental feature in the SnappyData 1.2.0 release,
+is considered a stable feature in the 1.3.0 release. This is similar to the Spark shell in its capabilities.
+The [Spark documentation](https://spark.apache.org/docs/2.1.1/quick-start.html) defines the Spark shell as follows:

***Spark’s shell provides a simple way to learn the API, as well as a powerful tool to analyze data interactively. It is available in either Scala (which runs on the Java VM and is thus a good way to use existing Java libraries) or Python.***

diff --git a/docs/release_notes/release_notes.md b/docs/release_notes/release_notes.md
index f498688ccb..036f8a573c 100644
--- a/docs/release_notes/release_notes.md
+++ b/docs/release_notes/release_notes.md
@@ -298,4 +298,4 @@ The following table describes the download artifacts included in SnappyData 1.3.
|snappydata-odbc\_1.3.0_win.zip | 32-bit and 64-bit ODBC client drivers for Windows |
|snappydata-1.3.0.sha256 | The SHA256 checksums of the product artifacts. On Linux verify using `sha256sum --check snappydata-1.3.0.sha256`. |
|snappydata-1.3.0.sha256.gpg | GnuPG signature for snappydata-1.3.0.sha256. Get the public key using `gpg --keyserver hkps://keys.gnupg.net --recv-keys 573D42FDD455480DC33B7105F76D50B69DB1586C`. Then verify using `gpg --verify snappydata-1.3.0.sha256.gpg`. |
-|[snappydata-zeppelin\_2.11-0.8.2.1.jar](https://github.com/TIBCOSoftware/snappy-zeppelin-interpreter/releases/download/v0.8.2.1/snappydata-zeppelin_2.11-0.8.2.1.jar) | The Zeppelin interpreter jar for SnappyData, compatible with Apache Zeppelin 0.8.2. This is already present in the `jars` directory of product installation so does not need to be downloaded separately. 
| +|[snappydata-zeppelin\_2.11-0.8.2.1.jar](https://github.com/TIBCOSoftware/snappy-zeppelin-interpreter/releases/download/v0.8.2.1/snappydata-zeppelin_2.11-0.8.2.1.jar) | The Zeppelin interpreter jar for SnappyData compatible with Apache Zeppelin 0.8.2. The standard jdbc interpreter is preferred over this. See [How to Use Apache Zeppelin with SnappyData](../howto/use_apache_zeppelin_with_snappydata.md). |
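For convenience, the verification commands listed in the table above combine into the following sequence on Linux:

```pre
> gpg --keyserver hkps://keys.gnupg.net --recv-keys 573D42FDD455480DC33B7105F76D50B69DB1586C
> gpg --verify snappydata-1.3.0.sha256.gpg
> sha256sum --check snappydata-1.3.0.sha256
```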