revamped and enhanced the docs
- switched to mkdocs-material which is much better looking and functional
- added custom stylesheet, footer etc for the new theme
- updated the mkdocs.yml to use the new theme with customizations, and reorganized it
- integrated the SQL functions doc in the top-level reference docs instead of linking to it
- removed obsolete docs, resurrected VSD docs (with link to third-party tool)
- moved and updated the docs for the new theme
- updated and fixed references to Spark version, SnappyData releases and so on
- tons of other changes to fix and improve the docs
sumwale committed Oct 18, 2021
1 parent d7e9ad6 commit 64c6814
Showing 215 changed files with 1,312 additions and 993 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -10,6 +10,7 @@ vm_*
.lib/
dist/*
build-artifacts/
site/
lib_managed/
src_managed/
project/boot/
6 changes: 3 additions & 3 deletions README.md
@@ -46,15 +46,15 @@ When speed is essential, applications can selectively copy the external data int
Operational systems can feed data updates into SnappyData through Kafka. The incoming data can be CDC (change-data-capture) events (inserts, updates, or deletes) and can be ingested into in-memory tables with ease, consistency, and exactly-once semantics. The application can apply custom logic to do sophisticated transformations and get the data ready for analytics. This incremental and continuous process is far more efficient than batch refreshes. Refer to [Stream Processing with SnappyData](docs/howto/use_stream_processing_with_snappydata.md). <br/>
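The CDC flow described above boils down to applying an ordered stream of insert/update/delete events to a keyed table exactly once, even when a batch is redelivered. A minimal plain-Python sketch of that apply loop (not SnappyData's API — the event tuple shape and the `applied_ids` dedup set are assumptions for illustration):

```python
def apply_cdc_events(table, events, applied_ids):
    """Apply CDC events to an in-memory keyed table, skipping duplicates.

    table: dict mapping primary key -> row dict
    events: iterable of (event_id, op, key, row) tuples
    applied_ids: set of already-applied event ids; checking it gives
    exactly-once behavior when a batch is redelivered (e.g. Kafka retry)
    """
    for event_id, op, key, row in events:
        if event_id in applied_ids:   # duplicate delivery: ignore
            continue
        if op in ("insert", "update"):
            table[key] = row          # upsert into the keyed table
        elif op == "delete":
            table.pop(key, None)
        applied_ids.add(event_id)
    return table

events = [
    (1, "insert", "a", {"v": 1}),
    (2, "update", "a", {"v": 2}),
    (3, "insert", "b", {"v": 9}),
    (4, "delete", "b", None),
]
table, seen = {}, set()
apply_cdc_events(table, events, seen)
apply_cdc_events(table, events, seen)  # redelivery of the same batch is a no-op
print(table)  # {'a': {'v': 2}}
```

Replaying the same batch leaves the table unchanged, which is the property that makes incremental ingestion safe to retry.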

* **Approximate Query Processing(AQP)** </br>
When dealing with huge data sets, for example, IoT sensor streaming time-series data, it may not be possible to provision the data in-memory, and if left at the source (say Hadoop or S3) your analytic query processing can take too long. In SnappyData, you can create one or more stratified data samples on the full data set. The query engine automatically uses these samples for aggregation queries, and a nearly accurate answer returned to clients. This can be immensely valuable when visualizing a trend, plotting a graph or bar chart. Refer [AQP](docs/aqp.md)
When dealing with huge data sets, for example, IoT sensors streaming time-series data, it may not be possible to provision all of the data in memory, and if it is left at the source (say Hadoop or S3), your analytic query processing can take too long. In SnappyData, you can create one or more stratified data samples on the full data set. The query engine automatically uses these samples for aggregation queries, and a nearly accurate answer is returned to clients. This can be immensely valuable when visualizing a trend or plotting a graph or bar chart. Refer to [AQP](docs/sde/index.md).
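To make the stratified-sample idea concrete, here is a toy plain-Python sketch (not the SnappyData SDE API — the per-stratum sample size and the inverse-probability weighting are illustrative assumptions): draw a bounded number of rows from each stratum, then scale each stratum's sampled sum by its inverse sampling fraction to estimate the full-table SUM.

```python
import random

def stratified_sample_sum(rows, strata_key, value_key, per_stratum=100, seed=42):
    """Estimate sum(value) by sampling up to per_stratum rows from each
    stratum and weighting each stratum's sample by (stratum size / sample size)."""
    rng = random.Random(seed)
    strata = {}
    for row in rows:                   # group values by stratum
        strata.setdefault(row[strata_key], []).append(row[value_key])
    estimate = 0.0
    for values in strata.values():
        n = len(values)
        k = min(per_stratum, n)
        sample = rng.sample(values, k)
        estimate += sum(sample) * (n / k)  # inverse-probability weighting
    return estimate

# 5 sensors x 10,000 readings each; readings are constant per sensor,
# so the stratified estimate here is exact.
rows = [{"sensor": s, "reading": (s + 1) * 10} for s in range(5) for _ in range(10_000)]
exact = sum(r["reading"] for r in rows)  # 1,500,000
approx = stratified_sample_sum(rows, "sensor", "reading")
print(exact, approx)
```

The engine only touches 100 rows per stratum instead of 10,000, which is the source of the latency win; on real skewed data the answer is approximate rather than exact, with error bounds depending on within-stratum variance.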

* **Access from anywhere** </br>
You can use JDBC, ODBC, REST, or any of the Apache Spark APIs. The product is fully compatible with Apache Spark 2.1.1. SnappyData natively supports modern visualization tools such as [TIBCO Spotfire](docs/howto/connecttibcospotfire.md), [Tableau](docs/howto/tableauconnect.md), and [Qlikview](docs/setting_up_jdbc_driver_qlikview.md).


## Downloading and Installing SnappyData
You can download and install the latest version of SnappyData from [github](https://github.com/TIBCOSoftware/snappydata/releases).
Refer to the [documentation](docs//install.md) for installation steps.
Refer to the [documentation](docs/install/index.md) for installation steps.

## Getting Started
Multiple options are provided to get started with SnappyData. The easiest way to get going with SnappyData is on your laptop. You can also use any of the following options:
@@ -66,7 +66,7 @@ Multiple options are provided to get started with SnappyData. Easiest way to get
* Docker
* Kubernetes

You can find more information on options for running SnappyData [here](docs/quickstart.md).
You can find more information on options for running SnappyData [here](docs/quickstart/index.md).

## Quick Test to Measure Performance of SnappyData vs Apache Spark

8 changes: 1 addition & 7 deletions build.gradle
@@ -1540,14 +1540,8 @@ task docs(type: ScalaDoc) {
destinationDir = file("${rootProject.buildDir}/docs")
}

task buildSqlFuncDocs(type: Exec) {
dependsOn product
//on linux
commandLine "${rootProject.projectDir}/spark/sql/create-docs.sh"
}

task publishDocs(type: Exec) {
dependsOn docs, buildSqlFuncDocs
dependsOn product, docs
//on linux
commandLine './publish-site.sh'
}
10 changes: 5 additions & 5 deletions docs/GettingStarted.md
@@ -41,15 +41,15 @@ When speed is essential, applications can selectively copy the external data int
Operational systems can feed data updates into SnappyData through Kafka. The incoming data can be CDC (change-data-capture) events (inserts, updates, or deletes) and can be ingested into in-memory tables with ease, consistency, and exactly-once semantics. The application can apply custom logic to do sophisticated transformations and get the data ready for analytics. This incremental and continuous process is far more efficient than batch refreshes. Refer to [Stream Processing with SnappyData](howto/use_stream_processing_with_snappydata.md). <br/>

* **Approximate Query Processing(AQP)** </br>
When dealing with huge data sets, for example, IoT sensor streaming time-series data, it may not be possible to provision the data in-memory, and if left at the source (say Hadoop or S3) your analytic query processing can take too long. In SnappyData, you can create one or more stratified data samples on the full data set. The query engine automatically uses these samples for aggregation queries, and a nearly accurate answer returned to clients. This can be immensely valuable when visualizing a trend, plotting a graph or bar chart. Refer [AQP](aqp.md)
When dealing with huge data sets, for example, IoT sensors streaming time-series data, it may not be possible to provision all of the data in memory, and if it is left at the source (say Hadoop or S3), your analytic query processing can take too long. In SnappyData, you can create one or more stratified data samples on the full data set. The query engine automatically uses these samples for aggregation queries, and a nearly accurate answer is returned to clients. This can be immensely valuable when visualizing a trend or plotting a graph or bar chart. Refer to [AQP](sde/index.md).

* **Access from anywhere** </br>
You can use JDBC, ODBC, REST, or any of the Apache Spark APIs. The product is fully compatible with Apache Spark 2.1.1. SnappyData natively supports modern visualization tools such as [TIBCO Spotfire](howto/connecttibcospotfire.md), [Tableau](howto/tableauconnect.md), and [Qlikview](setting_up_jdbc_driver_qlikview.md). Refer
You can use JDBC, ODBC, REST, or any of the Apache Spark APIs. The product is fully compatible with Apache Spark 2.1.1 to 2.1.3. SnappyData natively supports modern visualization tools such as [TIBCO Spotfire](howto/connecttibcospotfire.md), [Tableau](howto/tableauconnect.md), and [Qlikview](setting_up_jdbc_driver_qlikview.md).


## Downloading and Installing SnappyData
You can download and install the latest version of SnappyData from [github](https://github.com/TIBCOSoftware/snappydata/releases) or you can download the enterprise version that is TIBCO ComputeDB from [here](https://edelivery.tibco.com/storefront/index.ep).
Refer to the [documentation](/install.md) for installation steps.
Refer to the [documentation](install/index.md) for installation steps.

## Getting Started
Multiple options are provided to get started with SnappyData. The easiest way to get going with SnappyData is on your laptop. You can also use any of the following options:
@@ -61,7 +61,7 @@ Multiple options are provided to get started with SnappyData. Easiest way to get
* Docker
* Kubernetes

You can find more information on options for running SnappyData [here](/quickstart.md).
You can find more information on options for running SnappyData [here](quickstart/index.md).

## Quick Test to Measure Performance of SnappyData vs Apache Spark

@@ -132,7 +132,7 @@ For more details, refer [https://github.com/sbt/sbt/issues/3618](https://github.


## Building from Source
If you would like to build SnappyData from source, refer to the [documentation on building from source](/install/building_from_source.md).
If you would like to build SnappyData from source, refer to the [documentation on building from source](install/building_from_source.md).


## How is SnappyData Different than Apache Spark?
Binary file modified docs/Images/logo.png
Empty file modified docs/Images/vsd/vsd-connection-stats.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_applications.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_applications_2.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_cpu.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_memory.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_memory_2.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_statements.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_tables.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_tables_2.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_tables_3.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_transactions.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_transactions_2.png 100755 → 100644
8 changes: 6 additions & 2 deletions docs/LICENSE.md
@@ -1,4 +1,8 @@
## LICENSE
# License

The source code is distributed under the Apache License 2.0. Users can download and deploy it in production.
Full text of the license is below.


Apache License
Version 2.0, January 2004
@@ -188,7 +192,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright 2018 SnappyData Inc.
Copyright © 2017-2021 TIBCO Software Inc. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
5 changes: 0 additions & 5 deletions docs/additional_files/open_source_components.md
@@ -44,9 +44,4 @@ The high level capabilities of the **Community Edition** are listed in the follo
|Use encrypted password instead of clear text password | X |
|Restrict Table, View, Function creation even in user’s own schema | X |
|LDAP security interface | X |
|Visual Statistics Display (VSD) tool for system statistics (gfs) files(*) | |
|GemFire connector | |

(*) NOTE: The graphical Visual Statistics Display (VSD) tool to see the system statistics (gfs) files is not OSS
and was never shipped with SnappyData. It is available from [GemTalk Systems](https://gemtalksystems.com/products/vsd/)
or [Pivotal GemFire](https://network.pivotal.io/products/pivotal-gemfire) under their own respective licenses.
2 changes: 1 addition & 1 deletion docs/affinity_modes/connector_mode.md
@@ -11,7 +11,7 @@ Specifically, to run on a cluster, the SparkContext can connect to several types

**Key Points:**

* Can work with SnappyData store from a compatible Spark distribution (2.1.1)
* Can work with SnappyData store from a compatible Spark distribution (2.1.1 to 2.1.3)

* Spark application executes in its own independent JVM processes

2 changes: 1 addition & 1 deletion docs/affinity_modes/embedded_mode.md
@@ -23,7 +23,7 @@ In this mode, one can write Spark programs using jobs. For more details, refer t

Also, you can use [SnappySQL](../howto/use_snappy_shell.md) to create and query tables.

You can either [start SnappyData members](../howto/start_snappy_cluster/) using the `snappy-start-all.sh` script or you can start them individually.
You can either [start SnappyData members](../howto/start_snappy_cluster.md) using the `snappy-start-all.sh` script or you can start them individually.

Having the Spark computation embedded in the same JVM allows us to do several optimizations at the query planning level. For example:

10 changes: 10 additions & 0 deletions docs/affinity_modes/index.md
@@ -0,0 +1,10 @@
# Affinity Modes
In this section, the various modes available for colocating related data and computation are discussed.

You can run the SnappyData store in the following modes:

* [Local Mode](local_mode.md): Used mainly for development, where the client application, the executors, and data store are all running in the same JVM

* [Embedded SnappyData Store Mode](embedded_mode.md): The Spark computations and in-memory data store run colocated in the same JVM

* [SnappyData Smart Connector Mode](connector_mode.md): Allows you to work with the SnappyData store cluster from any compatible Spark distribution
2 changes: 1 addition & 1 deletion docs/affinity_modes/local_mode.md
@@ -68,7 +68,7 @@ To start SnappyData store you need to create a SnappySession in your program:

**Example**: **Launch Apache Spark shell and provide SnappyData dependency as a Spark package**:

If you already have Spark 2.1.1 installed in your local machine you can directly use `--packages` option to download the SnappyData binaries.
If you already have Spark 2.1.1 to 2.1.3 installed on your local machine, you can directly use the `--packages` option to download the SnappyData binaries.

```pre
./bin/spark-shell --packages "TIBCOSoftware:snappydata:1.3.0-s_2.11"
2 changes: 1 addition & 1 deletion docs/apidocsintro.md
@@ -1,5 +1,5 @@
## API Documentation

* Details about **SnappyData Spark Extension APIs** can be found [here](/reference/API_Reference/apireference_guide.md).
* Details about **SnappyData Spark Extension APIs** can be found [here](reference/API_Reference/apireference_guide.md).

* Details of all the **other API references for SnappyData** can be found [here](http://tibcosoftware.github.io/snappydata/apidocs).
6 changes: 3 additions & 3 deletions docs/aqp_aws.md
@@ -1,9 +1,9 @@
# Using <!--iSight-Cloud-->SnappyData CloudBuilder
<!--iSight-Cloud-->CloudBuilder is a cloud-based service that allows for instant visualization of analytic query results on large datasets. Powered by the SnappyData Synopsis Data Engine ([SDE](aqp.md)), users interact with <!--iSight-Cloud-->CloudBuilder to populate the synopsis engine with the right data sets and accelerate SQL queries by using the engine to provide latency bounded responses to large complex aggregate queries.
<!--iSight-Cloud-->CloudBuilder is a cloud-based service that allows for instant visualization of analytic query results on large datasets. Powered by the SnappyData Synopsis Data Engine ([SDE](sde/index.md)), users interact with <!--iSight-Cloud-->CloudBuilder to populate the synopsis engine with the right data sets and accelerate SQL queries by using the engine to provide latency bounded responses to large complex aggregate queries.

<!--iSight-Cloud-->CloudBuilder uses Apache Zeppelin as the front end notebook to display results and allows users to build powerful notebooks representing key elements of their business in a matter of minutes.
<!--iSight-Cloud-->CloudBuilder uses Apache Zeppelin as the front end notebook to display results and allows users to build powerful notebooks representing key elements of their business in a matter of minutes.

The service provides a web URL that spins up a cluster instance on AWS or users can download the <!--iSight-Cloud-->CloudBuilder EC2 script to configure a custom sized cluster, to create and render powerful visualizations of their big data sets with the click of a button.
The service provides a web URL that spins up a cluster instance on AWS or users can download the <!--iSight-Cloud-->CloudBuilder EC2 script to configure a custom sized cluster, to create and render powerful visualizations of their big data sets with the click of a button.
With <!--iSight-Cloud-->CloudBuilder, you can speed up the process of understanding what your data is telling you, and move on to the task of organizing your business around those insights rapidly.

In this document, the features provided by SnappyData for analyzing your data are described. It also provides details for deploying a SnappyData Cloud cluster on AWS using either the CloudFormation service or the EC2 scripts.
2 changes: 1 addition & 1 deletion docs/architecture/cluster_architecture.md
@@ -9,7 +9,7 @@ A SnappyData cluster is a peer-to-peer (P2P) network comprised of three distinct

![ClusterArchitecture](../GettingStarted_Architecture.png)

SnappyData also has multiple deployment options. For more information refer to, [Deployment Options](../deployment.md).
SnappyData also has multiple deployment options. For more information refer to, [Deployment Options](../affinity_modes/index.md).

## Interacting with SnappyData

11 changes: 11 additions & 0 deletions docs/architecture/index.md
@@ -0,0 +1,11 @@
# SnappyData Concepts


This topic explains the following fundamental concepts of SnappyData:

* [Core Components](core_components.md)
* [SnappyData Cluster Architecture](cluster_architecture.md)
* [Hybrid Cluster Manager](hybrid_cluster_manager.md)
* [Distributed Transactions](../consistency/index.md)
* [Affinity Modes](../affinity_modes/index.md)

20 changes: 0 additions & 20 deletions docs/best_practices.md

This file was deleted.

File renamed without changes.
File renamed without changes.
