BangDB Features - High level
Following is the gist of all the features supported by BangDB.
- Data Model: KV, Doc, binary data, large files, Graph, time-series
- Buffer pool: Page cache, manages every single byte of data
- Index: Primary, secondary, composite, nested, reversed, geo-index, Vector*
- WAL: Transaction, durability, crash recovery, availability
- IO-Layer: SSD as RAM+, high-performance IO, predictive IO
- Deployment: Embedded, Client/Server, p2p distributed, Hybrid, Cloud
- Enterprise grade: Data replication, disaster recovery, business continuity
- Security: End-to-end TLS/SSL based, user service and API key for auth
- Stream: Time-series, ETL, statistics, aggregates, CEP, anomaly, pattern
- AI: ML, IE, DL, train, prediction on stream, AutoML, version, deploy
- Graph: Graph data platform, Cypher query, Ontology
- Cloud platform: Ampere, an interactive front-end platform on cloud
- Performance: 200K+ IOPS, 20K+ events/sec per commodity machine
- Language: Database - C/C++; clients - C/C++, Java, C#, Python
- Connect: Custom clients, CLI, REST API
- License: Free - BSD 3, SaaS, Enterprise - Custom
Multi-model database
BangDB supports many kinds of data and works with them in an interlinked manner as required. The following table lists the kinds of data supported as of now, which allows us to store different data in their native and natural forms within the database.
- Key Value - Value as opaque
- Document - JSON doc
- Timeseries - Real-time events
- Graph - Linked data, triple model
- Video, audio - Large binary files
- ML Models - AutoML
Stream Processing
Most data flows in a streaming manner: it is time-series data that arrives in real time, and its value diminishes with time. Therefore, it should be ingested and processed in real time as well, to extract the intelligence needed to improve ongoing operations. This processing also ensures that we find anomalies, patterns, and interesting events in a continuous, real-time manner, which can then drive various automations.
Stream processing in BangDB is designed and implemented to be hyper real-time: insights are extracted while the data is not yet persisted, and BangDB operates on every single event to serve several business goals. BangDB treats the stream as a native construct, and a use-case implementation typically works with a set of streams. A stream can refer to another stream to enrich itself, join with another stream to create a third, or filter its events into another stream, and so on. These operations can be composed recursively and iteratively; BangDB therefore implements a powerful stream processing engine with many capabilities naturally available to it when processing data.
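The refer/filter/join composition described above can be sketched conceptually with plain Python generators. This is an illustration of the semantics only, not the BangDB client API; all names and data here are hypothetical:

```python
# Conceptual sketch of stream composition: filter and refer (enrich).
# Not BangDB code - streams are modeled as plain Python iterables.

def filter_stream(events, predicate):
    """Emit a derived stream containing only events that pass the predicate."""
    for e in events:
        if predicate(e):
            yield e

def enrich_stream(events, lookup):
    """'Refer': enrich each event with fields resolved from another source."""
    for e in events:
        yield {**e, **lookup.get(e["device"], {})}

# Hypothetical sensor events and a reference table keyed by device id.
sensor = [{"device": "d1", "temp": 71}, {"device": "d2", "temp": 42}]
meta = {"d1": {"site": "plant-A"}, "d2": {"site": "plant-B"}}

# Enrich, then filter into a derived "hot" stream.
hot = list(filter_stream(enrich_stream(sensor, meta), lambda e: e["temp"] > 60))
# hot -> [{"device": "d1", "temp": 71, "site": "plant-A"}]
```

Because each stage consumes one stream and yields another, stages can be chained recursively, which mirrors how one stream can feed, enrich, or filter into the next.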
- ETL - BangDB can transform and enrich the data during event ingestion.
- Refer - One stream can refer to another stream to enrich the data or take some action.
- Running statistics - BangDB computes statistics as events are ingested, in a continuous, real-time manner (min, max, count, average, std dev, excess kurtosis, etc.).
- Group by - We can compute statistics while grouping by attribute values.
- TopK - Streaming top-k for attributes; a memory- and time-bounded computation.
- Unique count - Approximate distinct counts using HyperLogLog.
- Join - A stream can join with another to produce a third stream.
- CEP - Complex event processing for finding complex patterns, anomalies, and events of interest, for automated actions.
- Graph processing - Automatic update of the underlying graph for various triples (subject, object, and predicate, each with properties).
- Filter - Use custom logic to filter data and send it to another stream or processing step.
- Entity - Long-term statistics on various attributes for decision making.
- Pred - Use a model to predict, then enrich the event or act on it.
- Table - Refer to another table, a document or otherwise, to enrich data.
- Train - Use streaming data to train models, use them, and retrain as required.
- Correlate - Predictive or statistical techniques for correlating attributes or events.
- Forward - Simply bundle and forward a set of data to another stream or table.
- Work - Send data to CRMs or work-management systems in an automated manner.
- Notifications - Alerts, automated actions, drill-down.
- Sliding window - A continuous sliding window for continuous analysis.
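The running-statistics item above describes aggregates (min, max, count, average, std dev) maintained incrementally as each event arrives, without storing history. A minimal sketch of how such single-pass statistics can be maintained, using Welford's algorithm (illustrative Python, not BangDB code):

```python
import math

class RunningStats:
    """Single-pass running statistics (Welford's algorithm):
    O(1) memory per attribute, updated once per ingested event."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0              # running sum of squared deviations
        self.min = float("inf")
        self.max = float("-inf")

    def update(self, x):
        self.n += 1
        self.min = min(self.min, x)
        self.max = max(self.max, x)
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std_dev(self):
        # population standard deviation over the events seen so far
        return math.sqrt(self.m2 / self.n) if self.n else 0.0

rs = RunningStats()
for x in [2, 4, 4, 4, 5, 5, 7, 9]:
    rs.update(x)
# rs.n == 8, rs.mean == 5.0, rs.std_dev == 2.0, rs.min == 2, rs.max == 9
```

The same update shape extends to higher moments (skewness, excess kurtosis) by carrying additional running sums.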
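The unique-count item relies on HyperLogLog, which estimates the number of distinct values in fixed memory rather than storing every value. A minimal sketch of the technique, assuming a simple 2^p-register variant with the standard linear-counting correction for small cardinalities (illustrative only, not BangDB's implementation):

```python
import hashlib
import math

class HyperLogLog:
    """Minimal HyperLogLog sketch: approximate distinct count in 2**p registers."""

    def __init__(self, p=10):
        self.p = p
        self.m = 1 << p                      # number of registers
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)   # bias correction, m >= 128

    def add(self, item):
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)             # first p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)
        # rank = position of the leftmost 1-bit in the remaining bits
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        est = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        if est <= 2.5 * self.m:              # small-range (linear counting) correction
            zeros = self.registers.count(0)
            if zeros:
                est = self.m * math.log(self.m / zeros)
        return int(est)

hll = HyperLogLog()
for i in range(1000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")                     # duplicates do not inflate the estimate
# hll.count() is close to 1000 (within a few percent for p=10)
```

Memory stays at 2^p small registers no matter how many events flow through, which is why this fits the streaming setting.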
Some of the highlights of BangDB stream processing are as follows.
| Capability | Details |
| --- | --- |
| Continuous ingestion of time-series data and events from anywhere: any device, file, or system | Agents, JavaScript, SNMP, Kafka, files, logs, applications, devices, machines, vehicles, sensors, cameras - any data from anywhere |
| Continuous processing of time-series data and events | Auto ETL, pattern and anomaly detection, running statistics, complex event processing |
| Automatic actions from the stream processing layer | Trigger actions in an automated manner, notifications, integrations with systems |
| Handling large amounts of data with far fewer resources | Sliding windows allow us to process enormous amounts of data using far fewer resources, in a continuous, high-performance manner |