Content:
- Welcome to Cygnus NGSI
- Basic operation
- Advanced topics and further reading
- Features summary
- Reporting issues and contact information
Cygnus NGSI is a connector in charge of persisting Orion context data in configured third-party storages, creating a historical view of such data. Orion only stores the last value of each entity attribute, so if older values are required they must be persisted elsewhere, value by value, using Cygnus NGSI.
Cygnus NGSI uses the subscription/notification feature of Orion. A subscription is made in Orion on behalf of Cygnus NGSI, detailing which entities Cygnus NGSI must be notified about when any of their attributes is updated.
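As a rough illustration of such a subscription, the following Python sketch builds an NGSIv2 subscription body that points at a Cygnus NGSI endpoint. The entity type `Room`, the attribute `temperature` and the helper name `build_subscription` are assumptions made for illustration. The notification URL assumes the port (5050) and target (/notify) used in the sample agent configuration in this document. Check the Orion documentation for the exact fields supported by your Orion version.

```python
import json


def build_subscription(notify_url, entity_type, attrs):
    """Build an NGSIv2 subscription body (hypothetical helper, not part of Cygnus)."""
    return {
        "description": "Notify Cygnus NGSI of attribute changes",
        "subject": {
            # Every entity of the given type; idPattern is a regular expression.
            "entities": [{"idPattern": ".*", "type": entity_type}],
            # Trigger a notification when any of these attributes changes.
            "condition": {"attrs": attrs},
        },
        "notification": {
            # Endpoint where the Cygnus NGSI HTTP source is assumed to listen.
            "http": {"url": notify_url},
            "attrs": attrs,
        },
    }


subscription = build_subscription(
    "http://localhost:5050/notify", "Room", ["temperature"]
)
print(json.dumps(subscription, indent=2))
```

The resulting body would typically be sent to Orion with a POST to its /v2/subscriptions resource, together with the Fiware-Service and Fiware-ServicePath headers of the tenant being subscribed to.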
Internally, Cygnus NGSI is based on Apache Flume, which it uses through its cygnus-common dependency. In fact, Cygnus NGSI is a Flume agent, which is basically composed of a source in charge of receiving the data, a channel where the source puts the data once it has been transformed into a Flume event, and a sink, which takes Flume events from the channel and persists the data in their bodies into a third-party storage.
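The source, channel and sink roles can be pictured with a toy model. The Python sketch below is not Flume code and does not reflect the Cygnus implementation; the class and function names are invented for illustration, and a plain list stands in for the third-party storage.

```python
import queue


class MemoryChannel:
    """Bounded in-memory buffer sitting between a source and a sink."""

    def __init__(self, capacity):
        self._events = queue.Queue(maxsize=capacity)

    def put(self, event):
        # Raises queue.Full when the channel has reached its capacity.
        self._events.put_nowait(event)

    def take(self):
        # Returns None when there is nothing left to take.
        try:
            return self._events.get_nowait()
        except queue.Empty:
            return None


def source_receive(channel, headers, body):
    """Wrap incoming data as an event (headers plus body) and put it in the channel."""
    channel.put({"headers": headers, "body": body})


def sink_drain(channel, storage):
    """Take events from the channel and persist their bodies into the storage."""
    persisted = 0
    while True:
        event = channel.take()
        if event is None:
            return persisted
        storage.append(event["body"])
        persisted += 1


channel = MemoryChannel(capacity=1000)
storage = []  # stands in for the third-party storage
source_receive(channel, {"fiware-service": "myservice"}, '{"temperature": 26.5}')
source_receive(channel, {"fiware-service": "myservice"}, '{"temperature": 27.0}')
persisted = sink_drain(channel, storage)
print(persisted, storage)
```

The channel decouples reception from persistence: the source can keep accepting notifications while the sink is slow, up to the channel capacity.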
The current stable release is able to persist Orion context data in:
- HDFS, the Hadoop distributed file system.
- MySQL, the well-known relational database manager.
- CKAN, an Open Data platform.
- MongoDB, the NoSQL document-oriented database.
- STH Comet, a Short-Term Historic database built on top of MongoDB.
- Kafka, the publish-subscribe messaging broker.
- DynamoDB, a cloud-based NoSQL database by Amazon Web Services.
- PostgreSQL, the well-known relational database manager.
- PostGIS, a spatial database extender for the PostgreSQL object-relational database.
- Carto, the database specialized in geolocated data.
- Orion, the FIWARE Context Broker.
- Elasticsearch, the distributed full-text search engine with JSON documents.
- ArcGIS, a geographic information system (GIS).
Consider reading the Cygnus NGSI Quick Start Guide before going deep into the details.
- RAM: 1 GB, especially if the batching mechanism is used heavily.
- HDD: A few GB may be enough unless the channels are configured as FileChannel type.
Simply configure the FIWARE release repository if not yet configured:
sudo wget -P /etc/yum.repos.d/ https://nexus.lab.fiware.org/repository/raw/public/repositories/el/7/x86_64/fiware-release.repo
Then use your package manager to install the latest version of Cygnus NGSI:
sudo yum install cygnus-ngsi
The above will install cygnus-ngsi in /usr/cygnus/.
Note that cygnus-common is installed too as part of the installation process.
Cygnus NGSI requires a fair amount of configuration to run properly, because the configuration describes the Flume-based agent to be run.
So, the starting point is choosing the internal architecture of the Cygnus NGSI agent. Let's assume the simplest one:
+-------+
|  NGSI |
|  Rest |
|Handler|
+-------------+    +----------------+    +---------------+
| http source |----| memory channel |----| NGSITestSink  |
+-------------+    +----------------+    +---------------+
Given the above architecture, the content of /usr/cygnus/conf/agent_1.conf will be:
cygnusagent.sources = http-source
cygnusagent.sinks = test-sink
cygnusagent.channels = test-channel
cygnusagent.sources.http-source.channels = test-channel
cygnusagent.sources.http-source.type = http
cygnusagent.sources.http-source.port = 5050
cygnusagent.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.NGSIRestHandler
cygnusagent.sources.http-source.handler.notification_target = /notify
cygnusagent.sources.http-source.handler.default_service = def_serv
cygnusagent.sources.http-source.handler.default_service_path = /def_servpath
cygnusagent.sources.http-source.handler.events_ttl = 10
cygnusagent.sources.http-source.interceptors = ts
cygnusagent.sources.http-source.interceptors.ts.type = timestamp
cygnusagent.channels.test-channel.type = memory
cygnusagent.channels.test-channel.capacity = 1000
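cygnusagent.channels.test-channel.transactionCapacity = 100
cygnusagent.sinks.test-sink.channel = test-channel
cygnusagent.sinks.test-sink.type = com.telefonica.iot.cygnus.sinks.NGSITestSink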
Check the Installation and Administration Guide for configurations involving real data storages such as HDFS, MySQL, etc.
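Because an agent file is a flat list of properties, wiring mistakes are easy to make. The Python sketch below is a hypothetical pre-flight check, not a Cygnus or Flume tool. It relies on the general Flume conventions that sources bind through a `channels` property, sinks through a `channel` property, and every component declares a `type`. The sample text deliberately leaves the sink unconfigured to show what the check reports.

```python
def parse_properties(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_agent(props, agent):
    """Return a list of human-readable problems found in an agent definition."""
    problems = []
    declared_channels = props.get(agent + ".channels", "").split()
    for kind in ("sources", "channels", "sinks"):
        for name in props.get(agent + "." + kind, "").split():
            prefix = agent + "." + kind + "." + name
            label = kind[:-1]  # "sources" -> "source", and so on
            if prefix + ".type" not in props:
                problems.append("%s '%s' has no type" % (label, name))
            if kind == "channels":
                continue
            # Sources list one or more channels; sinks read from exactly one.
            binding = ".channels" if kind == "sources" else ".channel"
            bound = props.get(prefix + binding, "").split()
            if not bound:
                problems.append("%s '%s' is not bound to a channel" % (label, name))
            for channel in bound:
                if channel not in declared_channels:
                    problems.append(
                        "%s '%s' uses undeclared channel '%s'" % (label, name, channel)
                    )
    return problems


SAMPLE = """
cygnusagent.sources = http-source
cygnusagent.sinks = test-sink
cygnusagent.channels = test-channel
cygnusagent.sources.http-source.channels = test-channel
cygnusagent.sources.http-source.type = http
cygnusagent.channels.test-channel.type = memory
"""

problems = check_agent(parse_properties(SAMPLE), "cygnusagent")
for problem in problems:
    print(problem)
```

To check a real file, read its contents and pass the agent name used as the property prefix (cygnusagent in the sample above).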
In addition, a /usr/cygnus/conf/cygnus_instance_1.conf file must be created if we want to run Cygnus NGSI as a service (see next section):
CYGNUS_USER=cygnus
CONFIG_FOLDER=/usr/cygnus/conf
CONFIG_FILE=/usr/cygnus/conf/agent_1.conf
AGENT_NAME=cygnusagent
LOGFILE_NAME=cygnus.log
ADMIN_PORT=8081
POLLING_INTERVAL=30
Cygnus NGSI can be run as a service by simply typing:
$ (sudo) service cygnus start
Logs are written to /var/log/cygnus/cygnus.log, and the PID of the process is stored in /var/run/cygnus/cygnus_1.pid.
Running the tests requires Apache Maven to be installed and the Cygnus NGSI sources to be downloaded.
$ git clone https://github.com/telefonicaid/fiware-cygnus.git
$ cd fiware-cygnus/cygnus-ngsi
$ mvn test
Cygnus NGSI works by receiving NGSI-like notifications, which are finally persisted. In order to test this, you can run any of the notification scripts located in the resources folder of this repo, which emulate certain notification types.
$ ./notification-json-simple.sh http://localhost:5050/notify myservice myservicepath
* Trying ::1...
* Connected to localhost (::1) port 5050 (#0)
> POST /notify HTTP/1.1
> Host: localhost:5050
> Content-Type: application/json
> Accept: application/json
> User-Agent: orion/0.10.0
> Fiware-Service: myservice
> Fiware-ServicePath: myservicepath
> ngsiv2-attrsformat: normalized
> Content-Length: 460
>
* upload completely sent off: 460 out of 460 bytes
< HTTP/1.1 200 OK
< Transfer-Encoding: chunked
< Server: Jetty(6.1.26)
<
* Connection #0 to host localhost left intact
Alternatively, you can connect a real NGSI source such as Orion Context Broker. Please check the User and Programmer Guide for further details.
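The request shown in the trace above can also be emulated from Python. The sketch below reuses the headers visible in that trace. The body follows the NGSIv2 normalized notification shape (`subscriptionId` plus a `data` array), which is an assumption and may not match the script's payload byte for byte. The entity `Room1`, its `temperature` attribute, the placeholder subscription ID and both helper names are invented for illustration.

```python
import json
import urllib.request


def build_notification(service, service_path, entity):
    """Return (headers, payload bytes) for an NGSIv2-style notification."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Fiware-Service": service,
        "Fiware-ServicePath": service_path,
        "ngsiv2-attrsformat": "normalized",
    }
    body = {
        "subscriptionId": "000000000000000000000000",  # placeholder value
        "data": [entity],
    }
    return headers, json.dumps(body).encode("utf-8")


def send_notification(url, headers, payload):
    """POST the notification and return the HTTP status code."""
    request = urllib.request.Request(url, data=payload, headers=headers, method="POST")
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


entity = {
    "id": "Room1",
    "type": "Room",
    "temperature": {"type": "Float", "value": 26.5, "metadata": {}},
}
headers, payload = build_notification("myservice", "myservicepath", entity)
print(headers["Fiware-Service"], len(payload))

# Requires a running agent listening on port 5050:
# print(send_notification("http://localhost:5050/notify", headers, payload))
```

Building the request separately from sending it makes it possible to inspect the headers and payload without a running agent.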
Run the following curl in order to get the version (assuming Cygnus NGSI runs on localhost):
$ curl -X GET "http://localhost:8081/v1/version"
{
    "success": "true",
    "version": "0.12.0_SNAPSHOT.52399574ea8503aa8038ad14850380d77529b550"
}
Run the following curl in order to get statistics for certain Flume components (assuming cygnus-ngsi runs on localhost):
$ curl -X GET "http://localhost:8081/v1/stats" | python -m json.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   489  100   489    0     0  81500      0 --:--:-- --:--:-- --:--:-- 97800
{
    "stats": {
        "channels": [
            {
                "name": "mysql-channel",
                "num_events": 0,
                "num_puts_failed": 0,
                "num_puts_ok": 11858,
                "num_takes_failed": 1,
                "num_takes_ok": 11858,
                "setup_time": "2016-02-05T10:34:25.80Z",
                "status": "START"
            }
        ],
        "sinks": [
            {
                "name": "mysql-sink",
                "num_persisted_events": 11800,
                "num_processed_events": 11858,
                "setup_time": "2016-02-05T10:34:24.978Z",
                "status": "START"
            }
        ],
        "sources": [
            {
                "name": "http-source",
                "num_processed_events": 11858,
                "num_received_events": 11858,
                "setup_time": "2016-02-05T10:34:24.921Z",
                "status": "START"
            }
        ]
    },
    "success": "true"
}
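These counters lend themselves to simple health checks. The Python sketch below is a hypothetical helper that reports, per sink, how many processed events were not persisted, and, per channel, how many events are still queued. Reading the difference between `num_processed_events` and `num_persisted_events` as events that were not persisted is an assumption about the counter semantics. The embedded sample is a trimmed copy of the response above.

```python
import json
import urllib.request


def fetch_stats(base_url):
    """Fetch and decode /v1/stats from a running agent (not called below)."""
    with urllib.request.urlopen(base_url + "/v1/stats", timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize(stats_doc):
    """Return (not-persisted events per sink, queued events per channel)."""
    stats = stats_doc["stats"]
    not_persisted = {
        sink["name"]: sink["num_processed_events"] - sink["num_persisted_events"]
        for sink in stats["sinks"]
    }
    queued = {channel["name"]: channel["num_events"] for channel in stats["channels"]}
    return not_persisted, queued


SAMPLE = json.loads("""
{
    "stats": {
        "channels": [{"name": "mysql-channel", "num_events": 0}],
        "sinks": [{"name": "mysql-sink",
                   "num_persisted_events": 11800,
                   "num_processed_events": 11858}]
    },
    "success": "true"
}
""")

not_persisted, queued = summarize(SAMPLE)
print(not_persisted, queued)

# Against a live agent on the admin port used in this document:
# print(summarize(fetch_stats("http://localhost:8081")))
```

For the sample response, the sink processed 11858 events and persisted 11800, so the helper reports a gap of 58 events for mysql-sink and an empty mysql-channel.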
Many other operations, like getting/setting the log level, can be found in the Management Interface documentation.
Detailed information regarding cygnus-ngsi can be found in the Installation and Administration Guide, the User and Programmer Guide and the Flume extensions catalogue. The following is just a list of shortcuts regarding the most popular topics:
- Installation with Docker. An alternative to RPM installation, Docker is one of the main options when installing FIWARE components.
- Installation from sources. Sometimes you will need to install from sources, particularly when some of the dependencies must be modified, e.g. the hadoop-core libraries.
- Running as a process. Running cygnus-ngsi as a process is very useful for testing and debugging purposes.
- Management Interface. REST-based management interface for administration purposes.
- Name Mappings. Designed as a Flume interceptor, this feature allows overwriting any notified service, service path, entity ID, entity type, attribute name or attribute type, when used for naming.
- Multi-instance. Several instances of cygnus-ngsi can be run as a service.
- Reliability. Learn about the mechanisms making Cygnus a very reliable tool.
- Performance tips. If you are experiencing performance issues or want to improve your statistics, take a look at how to get the best from cygnus-ngsi.
- New sink development. Addressed to those developers aiming to contribute to cygnus-ngsi with new sinks.
- Integration examples. Step-by-step how-tos regarding the integration of Cygnus NGSI with Spark and Kafka.
| Component | Feature | From version |
|---|---|---|
| NGSIHDFSSink | First implementation | 0.1.0 |
| | Multiple HDFS endpoint setup | 0.4.1 |
| | Kerberos support | 0.7.0 |
| | OAuth2 support | 0.8.2 |
| | CSV support | 0.9.0 |
| | HiveServer2 support | 0.9.0 |
| | Table type select | 0.9.0 |
| | enable/disable Hive | 0.10.0 |
| | HDFSBackendImplBinary | 0.10.0 |
| | Batching mechanism | 0.10.0 |
| | Per-user Hive databases | 0.12.0 |
| NGSICKANSink | First implementation | 0.2.0 |
| | Enable SSL | 0.4.2 |
| | Batching mechanism | 0.11.0 |
| | Capping and expiration | 1.7.0 |
| | Possibility to select datamodel | 2.2.0 |
| NGSIDynamoDBSink | First implementation | 0.11.0 |
| NGSIKafkaSink | First implementation | 0.9.0 |
| | Batching mechanism | 0.11.0 |
| NGSIMongoSink | First implementation | 0.8.0 |
| | Hash based collections | 0.8.1 |
| | Batching support | 0.12.0 |
| | Time and size-based data management policies | 0.13.0 |
| | Ignore white space-based attribute values | 1.0.0 |
| NGSIMySQLSink | First implementation | 0.2.0 |
| | Batching mechanism | 0.10.0 |
| | Capping and expiration | 1.7.0 |
| NGSISTHSink | First implementation | 0.8.0 |
| | Hash based collections | 0.8.1 |
| | TimeInstant metadata as reception time | 0.12.0 |
| | Batching mechanism | 0.13.0 |
| | Time and size-based data management policies | 0.13.0 |
| | String-based aggregation (occurrences) | 1.0.0 |
| | Ignore white space-based attribute values | 1.0.0 |
| NGSIPostgreSQLSink | First implementation | 0.12.0 |
| NGSIPostgisSink | First implementation | 1.12.0 |
| NGSICartoDBSink | First implementation (raw-historic analysis) | 1.0.0 |
| | Distance-historic analysis | 1.1.0 |
| | Multi tenancy support | 1.1.0 |
| | Orion's geo:json support | 1.6.0 |
| | Raw-snapshot analysis | 1.6.0 |
| NGSIOrionSink | First implementation | 1.10.0 |
| NGSIElasticsearchSink | First implementation | 1.15.0 |
| NGSIArcgisFeatureTableSink | First implementation (as NGSIArcGisSink) | 1.16.0 |
| NGSITestSink | First implementation | 0.7.0 |
| | Batching mechanism | 0.12.0 |
| All sinks | Events TTL | 0.4.1 |
| | Pattern-based grouping | 0.5.0 |
| | Infinite events TTL | 0.7.0 |
| | enable/disable Grouping Rules | 0.9.0 |
| | Data model configuration | 0.12.0 |
| | enable/disable forced lower case | 0.13.0 |
| | Per batch TTL | 0.13.0 |
| | New encoding | 1.3.0 |
| | Name mappings | 1.4.0 |
| API | Grouping Rules | 0.13.0 |
| | Subscriptions | 1.0.0 |
| | Agents and instances | 1.2.0 |
| | Logs | 1.4.0 |
| | Metrics | 1.7.0 |
If you have any doubts, please contact the Cygnus Core Team.