YATS Features
Use cases
Yats was designed with the following use cases in mind:
- IOT Data
  - Single sensor, single row
  - Multiple sensors, Json map
  - Multiple sensors, custom protobuf
- Embedded Client in C
- Application Logging
  - Simple text in Base64
  - Json attributes + Base64
  - Java Filter
  - Go Middleware
  - More
- Event reporting
  - SIEM applications
  - Legal compliance
Other Features
Analytics
So far Yats provides few built-in functions, but the mid-term plan is to rely on Golang's rich ecosystem to support simple statistics and sliding-window functions.
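To give an idea of what a sliding-window function could look like, here is a minimal Go sketch of a moving average over a series of readings. The name `slidingMean` and its signature are hypothetical and chosen for illustration only; they are not part of Yats.

```go
package main

import "fmt"

// slidingMean returns the arithmetic mean of every window of `size`
// consecutive samples. It keeps a running sum, so each step is O(1).
// Hypothetical helper for illustration; not a Yats function.
func slidingMean(values []float64, size int) []float64 {
	if size <= 0 || size > len(values) {
		return nil
	}
	means := make([]float64, 0, len(values)-size+1)
	sum := 0.0
	for i, v := range values {
		sum += v
		if i >= size {
			sum -= values[i-size] // drop the sample that left the window
		}
		if i >= size-1 {
			means = append(means, sum/float64(size))
		}
	}
	return means
}

func main() {
	// Made-up sensor readings, window of three samples.
	readings := []float64{20.0, 21.0, 23.0, 22.0, 24.0}
	fmt.Println(slidingMean(readings, 3))
}
```

The running sum avoids re-adding the whole window at every step, which matters once windows span thousands of samples.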
Cost concerns
By moving "cold" data to Parquet files on disk, and eventually to S3-compatible object storage, we can cut the cost of keeping all the data live on the database cluster.
Query Language
Yats does not support SQL directly; instead it provides a simple syntax usable from the command line. It is still possible to run CQL queries directly against the Apache Cassandra backend. The Parquet files in cold storage can also be queried in SQL with tools like DuckDB, which provides quite a number of useful functions.
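As a rough sketch of the cold-storage case, the Go snippet below only builds the text of a SQL statement that could be pasted into the DuckDB shell. DuckDB's `read_parquet` table function accepts a file path or glob. The file layout (`cold/metrics-*.parquet`) and the column names (`id_client`, `name`, `value`) are assumptions made for this example, not the actual Yats export schema.

```go
package main

import (
	"fmt"
	"strings"
)

// quote wraps s in single quotes for use as a SQL string literal,
// doubling any embedded single quote.
func quote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// coldStorageQuery builds a DuckDB statement that aggregates one metric
// per device from Parquet files. Column names are hypothetical.
func coldStorageQuery(parquetGlob, metric string) string {
	return fmt.Sprintf(
		"SELECT id_client, min(value) AS min_value, avg(value) AS avg_value, max(value) AS max_value\n"+
			"FROM read_parquet(%s)\n"+
			"WHERE name = %s\n"+
			"GROUP BY id_client\n"+
			"ORDER BY id_client;",
		quote(parquetGlob), quote(metric))
}

func main() {
	fmt.Println(coldStorageQuery("cold/metrics-*.parquet", "temperature"))
}
```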
Addressing Data with High Cardinality
High cardinality refers to columns with a large number of distinct values.
Yats uses Cassandra as its persistence engine, so it handles high-cardinality data through a suitable schema and through Cassandra's sharding, which spreads different partition keys across different nodes in the cluster.
Cassandra uses consistent hashing to distribute partition keys across the cluster.
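The idea behind consistent hashing can be sketched in a few lines of Go. This is a toy model, not Cassandra's implementation: Cassandra's default partitioner hashes keys with Murmur3 and each node usually owns many tokens, whereas this sketch uses FNV-1a from the standard library and one token per node. The type and function names (`ring`, `newRing`, `nodeFor`) are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// ring is a toy consistent-hash ring: every node owns one token, and a key
// belongs to the first node whose token is >= the key's hash, wrapping
// around to the smallest token at the end of the ring.
type ring struct {
	tokens []uint32          // node tokens, sorted ascending
	owner  map[uint32]string // token -> node name
}

// hashOf maps a string to a position on the ring (FNV-1a as a stand-in).
func hashOf(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

func newRing(nodes ...string) *ring {
	r := &ring{owner: make(map[uint32]string)}
	for _, n := range nodes {
		t := hashOf(n)
		r.tokens = append(r.tokens, t)
		r.owner[t] = n
	}
	sort.Slice(r.tokens, func(i, j int) bool { return r.tokens[i] < r.tokens[j] })
	return r
}

// nodeFor returns the node responsible for the given partition key.
func (r *ring) nodeFor(key string) string {
	if len(r.tokens) == 0 {
		return ""
	}
	h := hashOf(key)
	i := sort.Search(len(r.tokens), func(i int) bool { return r.tokens[i] >= h })
	if i == len(r.tokens) {
		i = 0 // wrap around the ring
	}
	return r.owner[r.tokens[i]]
}

func main() {
	r := newRing("node-a", "node-b", "node-c")
	for _, key := range []string{"device-1/temperature", "device-2/temperature", "device-3/humidity"} {
		fmt.Printf("%s -> %s\n", key, r.nodeFor(key))
	}
}
```

The useful property is that removing a node only reassigns the keys that node owned; every other key keeps its owner.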
In the case of IOT data, the metrics are partitioned by (IdClient, Name) and clustered by timestamp, where the client is the actual device.
This means that data from different devices can be spread across the cluster, while data for the same device and metric is stored in physically contiguous locations, ordered by timestamp, which reduces the number of disk seeks and gives good read performance.
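The layout described above can be modelled in memory with a short Go sketch: a map keyed by the partition key, where each partition holds its samples sorted by timestamp. The field types (string identifiers, timestamps as Unix milliseconds) and all names are assumptions for illustration; this is not the actual Yats schema or code.

```go
package main

import (
	"fmt"
	"sort"
)

// partitionKey mirrors the (IdClient, Name) partition key described above.
type partitionKey struct {
	IdClient string
	Name     string
}

// sample is one reading; Timestamp (Unix milliseconds, an assumption)
// plays the role of the clustering column.
type sample struct {
	Timestamp int64
	Value     float64
}

// store keeps every partition's samples sorted by timestamp.
type store map[partitionKey][]sample

// insert places the sample at its sorted position inside its partition.
func (s store) insert(k partitionKey, smp sample) {
	rows := s[k]
	i := sort.Search(len(rows), func(i int) bool { return rows[i].Timestamp > smp.Timestamp })
	rows = append(rows, sample{})
	copy(rows[i+1:], rows[i:]) // shift later samples one slot to the right
	rows[i] = smp
	s[k] = rows
}

// timeRange returns the samples of one partition with from <= Timestamp < to.
// Because rows are sorted, the result is one contiguous slice.
func (s store) timeRange(k partitionKey, from, to int64) []sample {
	rows := s[k]
	lo := sort.Search(len(rows), func(i int) bool { return rows[i].Timestamp >= from })
	hi := sort.Search(len(rows), func(i int) bool { return rows[i].Timestamp >= to })
	if hi < lo {
		return nil
	}
	return rows[lo:hi]
}

func main() {
	s := store{}
	k := partitionKey{IdClient: "device-1", Name: "temperature"}
	s.insert(k, sample{Timestamp: 3000, Value: 21.5})
	s.insert(k, sample{Timestamp: 1000, Value: 20.0})
	s.insert(k, sample{Timestamp: 2000, Value: 20.7})
	fmt.Println(s.timeRange(k, 1000, 3000))
}
```

A time-range read for one device and metric touches a single partition and a single contiguous run of rows, which is the access pattern the schema is meant to favour.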