revamped and enhanced the docs
- switched to mkdocs-material which is much better looking and functional
- added custom stylesheet, footer etc for the new theme
- updated the mkdocs.yml to use the new theme with customizations, and reorganized it
- integrated the SQL functions doc in the top-level reference docs instead of linking to it
- removed obsolete docs, resurrected VSD docs (with link to third-party tool)
- moved and updated the docs for the new theme
- updated and fixed references to Spark version, SnappyData releases and so on
- tons of other changes to fix and improve the docs
sumwale committed Oct 18, 2021
1 parent d7e9ad6 commit 64c6814
Showing 215 changed files with 1,312 additions and 993 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -10,6 +10,7 @@ vm_*
.lib/
dist/*
build-artifacts/
site/
lib_managed/
src_managed/
project/boot/
6 changes: 3 additions & 3 deletions README.md
@@ -46,15 +46,15 @@ When speed is essential, applications can selectively copy the external data int
Operational systems can feed data updates into SnappyData through Kafka. The incoming data can be CDC (change-data-capture) events (inserts, updates, or deletes) and can be ingested into in-memory tables with ease, consistency, and exactly-once semantics. The application can apply custom logic to do sophisticated transformations and get the data ready for analytics. This incremental and continuous process is far more efficient than batch refreshes. Refer to [Stream Processing with SnappyData](docs/howto/use_stream_processing_with_snappydata.md). <br/>
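The CDC flow described above boils down to applying an ordered stream of insert/update/delete events to a keyed table exactly once, even when a batch is redelivered. A minimal plain-Python sketch of that apply loop (not SnappyData's API — the event tuple shape and the `applied_ids` dedup set are assumptions for illustration):

```python
def apply_cdc_events(table, events, applied_ids):
    """Apply CDC events to an in-memory keyed table, skipping duplicates.

    table: dict mapping primary key -> row dict
    events: iterable of (event_id, op, key, row) tuples
    applied_ids: set of already-applied event ids; checking it gives
    exactly-once behavior when a batch is redelivered (e.g. Kafka retry)
    """
    for event_id, op, key, row in events:
        if event_id in applied_ids:   # duplicate delivery: ignore
            continue
        if op in ("insert", "update"):
            table[key] = row          # upsert into the keyed table
        elif op == "delete":
            table.pop(key, None)
        applied_ids.add(event_id)
    return table

events = [
    (1, "insert", "a", {"v": 1}),
    (2, "update", "a", {"v": 2}),
    (3, "insert", "b", {"v": 9}),
    (4, "delete", "b", None),
]
table, seen = {}, set()
apply_cdc_events(table, events, seen)
apply_cdc_events(table, events, seen)  # redelivery of the same batch is a no-op
print(table)  # {'a': {'v': 2}}
```

Replaying the same batch leaves the table unchanged, which is the property that makes incremental ingestion safe to retry.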

* **Approximate Query Processing(AQP)** </br>
When dealing with huge data sets, for example, IoT sensor streaming time-series data, it may not be possible to provision the data in-memory, and if left at the source (say Hadoop or S3) your analytic query processing can take too long. In SnappyData, you can create one or more stratified data samples on the full data set. The query engine automatically uses these samples for aggregation queries, and a nearly accurate answer returned to clients. This can be immensely valuable when visualizing a trend, plotting a graph or bar chart. Refer [AQP](docs/aqp.md)
When dealing with huge data sets, for example, IoT sensors streaming time-series data, it may not be possible to provision all of the data in memory, and if it is left at the source (say Hadoop or S3), your analytic query processing can take too long. In SnappyData, you can create one or more stratified data samples on the full data set. The query engine automatically uses these samples for aggregation queries, and a nearly accurate answer is returned to clients. This can be immensely valuable when visualizing a trend or plotting a graph or bar chart. Refer to [AQP](docs/sde/index.md).
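To make the stratified-sample idea concrete, here is a toy plain-Python sketch (not the SnappyData SDE API — the per-stratum sample size and the inverse-probability weighting are illustrative assumptions): draw a bounded number of rows from each stratum, then scale each stratum's sampled sum by its inverse sampling fraction to estimate the full-table SUM.

```python
import random

def stratified_sample_sum(rows, strata_key, value_key, per_stratum=100, seed=42):
    """Estimate sum(value) by sampling up to per_stratum rows from each
    stratum and weighting each stratum's sample by (stratum size / sample size)."""
    rng = random.Random(seed)
    strata = {}
    for row in rows:                   # group values by stratum
        strata.setdefault(row[strata_key], []).append(row[value_key])
    estimate = 0.0
    for values in strata.values():
        n = len(values)
        k = min(per_stratum, n)
        sample = rng.sample(values, k)
        estimate += sum(sample) * (n / k)  # inverse-probability weighting
    return estimate

# 5 sensors x 10,000 readings each; readings are constant per sensor,
# so the stratified estimate here is exact.
rows = [{"sensor": s, "reading": (s + 1) * 10} for s in range(5) for _ in range(10_000)]
exact = sum(r["reading"] for r in rows)  # 1,500,000
approx = stratified_sample_sum(rows, "sensor", "reading")
print(exact, approx)
```

The engine only touches 100 rows per stratum instead of 10,000, which is the source of the latency win; on real skewed data the answer is approximate rather than exact, with error bounds depending on within-stratum variance.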

* **Access from anywhere** </br>
You can use JDBC, ODBC, REST, or any of the Apache Spark APIs. The product is fully compatible with Apache Spark 2.1.1. SnappyData natively supports modern visualization tools such as [TIBCO Spotfire](docs/howto/connecttibcospotfire.md), [Tableau](docs/howto/tableauconnect.md), and [Qlikview](docs/setting_up_jdbc_driver_qlikview.md).


## Downloading and Installing SnappyData
You can download and install the latest version of SnappyData from [github](https://github.com/TIBCOSoftware/snappydata/releases).
Refer to the [documentation](docs//install.md) for installation steps.
Refer to the [documentation](docs/install/index.md) for installation steps.

## Getting Started
Multiple options are provided to get started with SnappyData. The easiest way to get going with SnappyData is on your laptop. You can also use any of the following options:
@@ -66,7 +66,7 @@ Multiple options are provided to get started with SnappyData. Easiest way to get
* Docker
* Kubernetes

You can find more information on options for running SnappyData [here](docs/quickstart.md).
You can find more information on options for running SnappyData [here](docs/quickstart/index.md).

## Quick Test to Measure Performance of SnappyData vs Apache Spark

8 changes: 1 addition & 7 deletions build.gradle
@@ -1540,14 +1540,8 @@ task docs(type: ScalaDoc) {
destinationDir = file("${rootProject.buildDir}/docs")
}

task buildSqlFuncDocs(type: Exec) {
dependsOn product
//on linux
commandLine "${rootProject.projectDir}/spark/sql/create-docs.sh"
}

task publishDocs(type: Exec) {
dependsOn docs, buildSqlFuncDocs
dependsOn product, docs
//on linux
commandLine './publish-site.sh'
}
10 changes: 5 additions & 5 deletions docs/GettingStarted.md
@@ -41,15 +41,15 @@ When speed is essential, applications can selectively copy the external data int
Operational systems can feed data updates into SnappyData through Kafka. The incoming data can be CDC (change-data-capture) events (inserts, updates, or deletes) and can be ingested into in-memory tables with ease, consistency, and exactly-once semantics. The application can apply custom logic to do sophisticated transformations and get the data ready for analytics. This incremental and continuous process is far more efficient than batch refreshes. Refer to [Stream Processing with SnappyData](howto/use_stream_processing_with_snappydata.md). <br/>

* **Approximate Query Processing(AQP)** </br>
When dealing with huge data sets, for example, IoT sensor streaming time-series data, it may not be possible to provision the data in-memory, and if left at the source (say Hadoop or S3) your analytic query processing can take too long. In SnappyData, you can create one or more stratified data samples on the full data set. The query engine automatically uses these samples for aggregation queries, and a nearly accurate answer returned to clients. This can be immensely valuable when visualizing a trend, plotting a graph or bar chart. Refer [AQP](aqp.md)
When dealing with huge data sets, for example, IoT sensors streaming time-series data, it may not be possible to provision all of the data in memory, and if it is left at the source (say Hadoop or S3), your analytic query processing can take too long. In SnappyData, you can create one or more stratified data samples on the full data set. The query engine automatically uses these samples for aggregation queries, and a nearly accurate answer is returned to clients. This can be immensely valuable when visualizing a trend or plotting a graph or bar chart. Refer to [AQP](sde/index.md).

* **Access from anywhere** </br>
You can use JDBC, ODBC, REST, or any of the Apache Spark APIs. The product is fully compatible with Apache Spark 2.1.1. SnappyData natively supports modern visualization tools such as [TIBCO Spotfire](howto/connecttibcospotfire.md), [Tableau](howto/tableauconnect.md), and [Qlikview](setting_up_jdbc_driver_qlikview.md). Refer
You can use JDBC, ODBC, REST, or any of the Apache Spark APIs. The product is fully compatible with Apache Spark 2.1.1 to 2.1.3. SnappyData natively supports modern visualization tools such as [TIBCO Spotfire](howto/connecttibcospotfire.md), [Tableau](howto/tableauconnect.md), and [Qlikview](setting_up_jdbc_driver_qlikview.md).


## Downloading and Installing SnappyData
You can download and install the latest version of SnappyData from [github](https://github.com/TIBCOSoftware/snappydata/releases) or you can download the enterprise version that is TIBCO ComputeDB from [here](https://edelivery.tibco.com/storefront/index.ep).
Refer to the [documentation](/install.md) for installation steps.
Refer to the [documentation](install/index.md) for installation steps.

## Getting Started
Multiple options are provided to get started with SnappyData. The easiest way to get going with SnappyData is on your laptop. You can also use any of the following options:
@@ -61,7 +61,7 @@ Multiple options are provided to get started with SnappyData. Easiest way to get
* Docker
* Kubernetes

You can find more information on options for running SnappyData [here](/quickstart.md).
You can find more information on options for running SnappyData [here](quickstart/index.md).

## Quick Test to Measure Performance of SnappyData vs Apache Spark

@@ -132,7 +132,7 @@ For more details, refer [https://github.com/sbt/sbt/issues/3618](https://github.


## Building from Source
If you would like to build SnappyData from source, refer to the [documentation on building from source](/install/building_from_source.md).
If you would like to build SnappyData from source, refer to the [documentation on building from source](install/building_from_source.md).


## How is SnappyData Different than Apache Spark?
Binary file modified docs/Images/logo.png
Empty file modified docs/Images/vsd/vsd-connection-stats.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_applications.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_applications_2.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_cpu.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_memory.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_memory_2.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_statements.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_tables.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_tables_2.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_tables_3.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_transactions.png 100755 → 100644
Empty file modified docs/Images/vsd/vsd_transactions_2.png 100755 → 100644
8 changes: 6 additions & 2 deletions docs/LICENSE.md
@@ -1,4 +1,8 @@
## LICENSE
# License

The source code is distributed under the Apache License 2.0. Users can download and deploy it in production.
Full text of the license is below.


Apache License
Version 2.0, January 2004
@@ -188,7 +192,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright 2018 SnappyData Inc.
Copyright © 2017-2021 TIBCO Software Inc. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
5 changes: 0 additions & 5 deletions docs/additional_files/open_source_components.md
@@ -44,9 +44,4 @@ The high level capabilities of the **Community Edition** are listed in the follo
|Use encrypted password instead of clear text password | X |
|Restrict Table, View, Function creation even in user’s own schema | X |
|LDAP security interface | X |
|Visual Statistics Display (VSD) tool for system statistics (gfs) files(*) | |
|GemFire connector | |

(*) NOTE: The graphical Visual Statistics Display (VSD) tool to see the system statistics (gfs) files is not OSS
and was never shipped with SnappyData. It is available from [GemTalk Systems](https://gemtalksystems.com/products/vsd/)
or [Pivotal GemFire](https://network.pivotal.io/products/pivotal-gemfire) under their own respective licenses.
2 changes: 1 addition & 1 deletion docs/affinity_modes/connector_mode.md
@@ -11,7 +11,7 @@ Specifically, to run on a cluster, the SparkContext can connect to several types

**Key Points:**

* Can work with SnappyData store from a compatible Spark distribution (2.1.1)
* Can work with SnappyData store from a compatible Spark distribution (2.1.1 to 2.1.3)

* Spark application executes in its own independent JVM processes

2 changes: 1 addition & 1 deletion docs/affinity_modes/embedded_mode.md
@@ -23,7 +23,7 @@ In this mode, one can write Spark programs using jobs. For more details, refer t

Also, you can use [SnappySQL](../howto/use_snappy_shell.md) to create and query tables.

You can either [start SnappyData members](../howto/start_snappy_cluster/) using the `snappy-start-all.sh` script or you can start them individually.
You can either [start SnappyData members](../howto/start_snappy_cluster.md) using the `snappy-start-all.sh` script or you can start them individually.

Having the Spark computation embedded in the same JVM allows us to do several optimizations at the query planning level. For example:

10 changes: 10 additions & 0 deletions docs/affinity_modes/index.md
@@ -0,0 +1,10 @@
# Affinity Modes
In this section, the various modes available for colocating related data and computation are discussed.

You can run the SnappyData store in the following modes:

* [Local Mode](local_mode.md): Used mainly for development, where the client application, the executors, and data store are all running in the same JVM

* [Embedded SnappyData Store Mode](embedded_mode.md): The Spark computations and in-memory data store run colocated in the same JVM

* [SnappyData Smart Connector Mode](connector_mode.md): Allows you to work with the SnappyData store cluster from any compatible Spark distribution
2 changes: 1 addition & 1 deletion docs/affinity_modes/local_mode.md
@@ -68,7 +68,7 @@ To start SnappyData store you need to create a SnappySession in your program:

**Example**: **Launch Apache Spark shell and provide SnappyData dependency as a Spark package**:

If you already have Spark 2.1.1 installed in your local machine you can directly use `--packages` option to download the SnappyData binaries.
If you already have Spark 2.1.1 to 2.1.3 installed on your local machine, you can directly use the `--packages` option to download the SnappyData binaries.

```pre
./bin/spark-shell --packages "TIBCOSoftware:snappydata:1.3.0-s_2.11"
2 changes: 1 addition & 1 deletion docs/apidocsintro.md
@@ -1,5 +1,5 @@
## API Documentation

* Details about **SnappyData Spark Extension APIs** can be found [here](/reference/API_Reference/apireference_guide.md).
* Details about **SnappyData Spark Extension APIs** can be found [here](reference/API_Reference/apireference_guide.md).

* Details of all the **other API references for SnappyData** can be found [here](http://tibcosoftware.github.io/snappydata/apidocs).
6 changes: 3 additions & 3 deletions docs/aqp_aws.md
@@ -1,9 +1,9 @@
# Using <!--iSight-Cloud-->SnappyData CloudBuilder
<!--iSight-Cloud-->CloudBuilder is a cloud-based service that allows for instant visualization of analytic query results on large datasets. Powered by the SnappyData Synopsis Data Engine ([SDE](aqp.md)), users interact with <!--iSight-Cloud-->CloudBuilder to populate the synopsis engine with the right data sets and accelerate SQL queries by using the engine to provide latency bounded responses to large complex aggregate queries.
<!--iSight-Cloud-->CloudBuilder is a cloud-based service that allows for instant visualization of analytic query results on large datasets. Powered by the SnappyData Synopsis Data Engine ([SDE](sde/index.md)), users interact with <!--iSight-Cloud-->CloudBuilder to populate the synopsis engine with the right data sets and accelerate SQL queries by using the engine to provide latency bounded responses to large complex aggregate queries.

<!--iSight-Cloud-->CloudBuilder uses Apache Zeppelin as the front end notebook to display results and allows users to build powerful notebooks representing key elements of their business in a matter of minutes.
<!--iSight-Cloud-->CloudBuilder uses Apache Zeppelin as the front end notebook to display results and allows users to build powerful notebooks representing key elements of their business in a matter of minutes.

The service provides a web URL that spins up a cluster instance on AWS or users can download the <!--iSight-Cloud-->CloudBuilder EC2 script to configure a custom sized cluster, to create and render powerful visualizations of their big data sets with the click of a button.
The service provides a web URL that spins up a cluster instance on AWS or users can download the <!--iSight-Cloud-->CloudBuilder EC2 script to configure a custom sized cluster, to create and render powerful visualizations of their big data sets with the click of a button.
With <!--iSight-Cloud-->CloudBuilder, you can speed up the process of understanding what your data is telling you, and move on to the task of organizing your business around those insights rapidly.

In this document, the features provided by SnappyData for analyzing your data are described. It also provides details for deploying a SnappyData Cloud cluster on AWS using either the CloudFormation service or the EC2 scripts.
2 changes: 1 addition & 1 deletion docs/architecture/cluster_architecture.md
@@ -9,7 +9,7 @@ A SnappyData cluster is a peer-to-peer (P2P) network comprised of three distinct

![ClusterArchitecture](../GettingStarted_Architecture.png)

SnappyData also has multiple deployment options. For more information refer to, [Deployment Options](../deployment.md).
SnappyData also has multiple deployment options. For more information refer to, [Deployment Options](../affinity_modes/index.md).

## Interacting with SnappyData

11 changes: 11 additions & 0 deletions docs/architecture/index.md
@@ -0,0 +1,11 @@
# SnappyData Concepts


This topic explains the following fundamental concepts of SnappyData:

* [Core Components](core_components.md)
* [SnappyData Cluster Architecture](cluster_architecture.md)
* [Hybrid Cluster Manager](hybrid_cluster_manager.md)
* [Distributed Transactions](../consistency/index.md)
* [Affinity Modes](../affinity_modes/index.md)

20 changes: 0 additions & 20 deletions docs/best_practices.md

This file was deleted.

File renamed without changes.
File renamed without changes.
