18. MAADSTML Python Library API
Multi-Agent Accelerator for Data Science Using Transactional Machine Learning (MAADSTML)
Source: MAADSTML Python Library
Revolutionizing Data Stream Science with Transactional Machine Learning
Important
The MAADSTML Python library accelerates TML solution builds. Although the library includes many functions, the TSS automatically calls the core functions for your TML solution, considerably reducing the effort on your part.
Refer to Where Do I Start?
Overview
MAADSTML combines Artificial Intelligence, ChatGPT, PrivateGPT, Auto Machine Learning with Data Streams Integrated with Apache Kafka (or Redpanda) to create frictionless and elastic machine learning solutions.
This library allows users to harness the power of agent-based computing using hundreds of advanced linear and non-linear algorithms. Users can easily integrate Predictive Analytics, Prescriptive Analytics, Pre-Processing, and Optimization in any data stream solution by wrapping additional code around the functions below. It connects with Apache Kafka brokers for cloud-based computing using Kafka (or Redpanda) as the data backbone.
If you are analysing millions of IoT devices, you can deploy thousands of VIPER/HPDE instances in a Kubernetes cluster in AWS, GCP, or Azure.
It uses VIPER as a Kafka connector and combines Auto Machine Learning with Real-Time Machine Learning, Real-Time Optimization, and Real-Time Predictions, publishing these insights into a Kafka cluster in real time and at scale. Users can consume these insights from anywhere, at any time, and in any format.
It also uses HPDE as the AutoML technology for TML. Linux/Windows/Mac versions can be downloaded from [GitHub](https://github.com/smaurice101/transactionalmachinelearning)
It uses VIPERviz to visualize streaming insights over HTTP(S). Linux/Windows/Mac versions can be downloaded from [GitHub](https://github.com/smaurice101/transactionalmachinelearning)
MAADSTML details can be found in the book: [Transactional Machine Learning with Data Streams and AutoML](https://www.amazon.com/Transactional-Machine-Learning-Streams-AutoML/dp/1484270223)
To install this library, request a username and a MAADSTOKEN from support@otics.ca. Once you have these credentials, install the Python library.
- Compatibility
Python 3.8 or greater
Minimal Python skills needed
- Copyright
Author: Sebastian Maurice, PhD
- Installation
At the command prompt write: pip install maadstml - This assumes you have [Downloaded Python](https://www.python.org/downloads/) and installed it on your computer.
- MAADS-VIPER Connector to Manage Apache KAFKA:
The MAADS-VIPER Python library connects to VIPER instances on any server; VIPER manages Apache Kafka. VIPER is REST-based and cross-platform, and runs on Windows, Linux, macOS, and other platforms. It also fully supports SSL/TLS encryption in Kafka brokers for producing and consuming.
- TML is integrated with PrivateGPT (https://github.com/imartinez/privateGPT), a production-ready GPT that is 100% local, 100% secure, and 100% free.
Users need to PULL and RUN one of the privateGPT Docker containers:
Docker Hub: maadsdocker/tml-privategpt-no-gpu-amd64 (without NVIDIA GPU for AMD64 Chip)
Docker Hub: maadsdocker/tml-privategpt-with-gpu-amd64 (with NVIDIA GPU for AMD64 Chip)
Docker Hub: maadsdocker/tml-privategpt-no-gpu-arm64 (without NVIDIA GPU for ARM64 Chip)
Docker Hub: maadsdocker/tml-privategpt-with-gpu-arm64 (with NVIDIA GPU for ARM64 Chip)
Additional details are here: https://github.com/smaurice101/raspberrypi/tree/main/privategpt
TML accesses the privateGPT container through its REST API.
For PrivateGPT production deployments, it is recommended that machines have an NVIDIA GPU, as this leads to significant performance improvements.
pgptingestdocs - Set Context for PrivateGPT by ingesting PDFs or text documents. All responses will then use these documents for context.
pgptgetingestedembeddings - After documents are ingested, you can retrieve the embeddings for the ingested documents. These embeddings allow you to filter the documents for specific context.
pgptchat - Send any prompt to privateGPT (with or without context) and get back a response.
pgptdeleteembeddings - Delete embeddings.
pgpthealth - Check the health of the privateGPT HTTP server.
vipermirrorbrokers - Migrate data streams from (multiple) brokers to (multiple) brokers quickly. With one function you can migrate from hundreds of brokers with hundreds of topics and partitions to any other brokers. VIPER ensures no duplication of messages and translates offsets from the last committed offset. Every transaction is logged, which simplifies data validation and auditing. You can also increase or decrease partitions and apply a filter to select the topics to copy over.
viperstreamquery - Query multiple streams with conditional statements. For example, if you preprocessed multiple streams you can query them in real-time and extract powerful insights. You can use >, <, =, AND, OR.
viperstreamquerybatch - Query multiple streams with conditional statements. For example, if you preprocessed multiple streams you can query them in real-time and extract powerful insights. You can use >, <, =, AND, OR. Batch allows you to query multiple IDs at once.
viperlisttopics - List all topics in Kafka brokers
viperdeactivatetopic - Deactivate topics in Kafka brokers and prevent unused algorithms from consuming storage and computing resources that cost money
viperactivatetopic - Activate topics in Kafka brokers
vipercreatetopic - Create topics in Kafka brokers
viperstats - List all stats from Kafka brokers, giving VIPER and Kafka admins an end-to-end view of who is producing data to algorithms and who is consuming the insights from the algorithms, including the date/time stamp of the last reads/writes to topics, how many bytes were read from and written to topics, and more
vipersubscribeconsumer - Admins can subscribe consumers to topics and consumers will immediately receive insights from topics. This also gives admins more control over who is consuming the insights and allows them to ensure any issues are resolved quickly if something happens to the algorithms.
viperunsubscribeconsumer - Admins can unsubscribe consumers from receiving insights. This is important to ensure storage and compute resources are always used for active users. For example, if a business user leaves your company or no longer needs the insights, unsubscribing the consumer will STOP the algorithm from producing the insights.
viperhpdetraining - Users can do real-time machine learning (RTML) on the data in Kafka topics. This is very powerful and useful for “transactional learnings” on the fly using our HPDE technology. HPDE will find the optimal algorithm for the data in less than 60 seconds.
viperpreprocessrtms - Users can use this function to mesh TEXT data with TML machine learning output to cross-reference entities with TEXT files like log files. This is a very powerful function that incorporates “past memory” of real-time events using sliding time windows. For details see: How TML incorporates real-time memory using sliding time windows
viperhpdetrainingbatch - Users can do real-time machine learning (RTML) on the data in Kafka topics. This is very powerful and useful for “transactional learnings” on the fly using our HPDE technology. HPDE will find the optimal algorithm for the data in less than 60 seconds. Batch allows you to perform ML on multiple IDs at once.
viperhpdepredict - Using the optimal algorithm - users can do real-time predictions from streaming data into Kafka Topics.
viperhpdepredictprocess - Using the optimal algorithm you can determine object ranking based on input data. For example, if you want to know which human or machine is the best or worst given input data then this function will return the best or worst human or machine.
viperhpdepredictbatch - Using the optimal algorithm - users can do real-time predictions from streaming data into Kafka Topics. Batch allows you to perform predictions on multiple IDs at once.
viperhpdeoptimize - Users can even do optimization to MINIMIZE or MAXIMIZE the optimal algorithm to find the BEST values for the independent variables that will minimize or maximize the dependent variable.
viperhpdeoptimizebatch - Users can even do optimization to MINIMIZE or MAXIMIZE the optimal algorithm to find the BEST values for the independent variables that will minimize or maximize the dependent variable. Batch allows you to optimize multiple IDs at once.
viperproducetotopic - Users can produce to any topics by ingesting from any data sources.
viperproducetotopicbulk - Users can produce to any topics by ingesting from any data sources. Use this function to write bulk transactions at high speeds. With the right architecture and network you can stream 1 million transactions per second (or more).
viperconsumefromtopic - Users can consume from any topic and graph the data.
viperconsumefromtopicbatch - Users can consume from any topic and graph the data. Batch allows you to consume from multiple IDs at once.
viperconsumefromstreamtopic - Users can consume from multiple topic streams at once
vipercreateconsumergroup - Admins can create a consumer group made up of any number of consumers. You can add as many partitions for the group in the Kafka broker as you need, and specify the replication factor to ensure high availability and no disruption to users who consume insights from the topics.
viperconsumergroupconsumefromtopic - Users who are part of the consumer group can consume from the group topic.
viperproducetotopicstream - Users can join multiple topic streams and produce the combined results to another topic.
viperpreprocessproducetotopicstream - Users can pre-process data streams using the following process types: MIN, MAX, AVG, COUNT, COUNTSTR, DIFF, DIFFMARGIN, SUM, MEDIAN, VARIANCE, OUTLIERS, OUTLIERSX-Y, VARIED, ANOMPROB, ANOMPROBX-Y, ENTROPY, AUTOCORR, TREND, CONSISTENCY, IQR (InterQuartileRange), Midhinge, GM (Geometric mean), HM (Harmonic mean), Trimean, CV (Coefficient of Variation), Mad (Mean absolute deviation), Skewness, Kurtosis, Spikedetect, Unique, Uniquestr, Timediff, Avgtimediff, Geodiff, and Dataage_[UTC offset]_[timetype].
Timediff returns the difference in seconds between the first and last date/time. Time should be in this layout: 2006-01-02T15:04:05
Avgtimediff returns the average time in seconds between consecutive dates.
Spikedetect uses a Z-score method to detect spikes in the data, using a lag of 5, a threshold of 3.5 standard deviations from the mean, and an influence of 0.5.
Geodiff returns the distance in kilometers between two lat/long points.
Dataage_[UTC offset]_[timetype] checks the last update time of the data in the data stream against the current local time. You can specify the UTC offset to adjust the current time to match the timezone of the data stream, and timetype as millisecond, second, minute, hour, or day. For example, Dataage_1_minute compares the last timestamp in the data stream with the current time at UTC offset +1 and returns the difference in minutes. This is a very powerful processtype for data quality and data assurance programs for any number of data streams.
Unique Checks numeric data for duplication. Returns 1 if no data duplication (unique), 0 otherwise.
Uniquestr Checks string data for duplication. Returns 1 if no data duplication (unique), 0 otherwise.
Uniquecount Checks numeric data for duplication. Returns count of unique numbers.
Uniquestrcount Checks string data for duplication. Returns count of unique strings.
CONSISTENCY checks if the data all have consistent data types. Returns 1 for consistent data types, 0 otherwise.
Meanci95 or Meanci99 - returns a 95% or 99% confidence interval: mean, low, high
RAW for no processing.
ANOMPROB (Anomaly Probability) runs several algorithms on the data stream window to determine a probability percentage of anomalous behaviour. This can be cross-referenced with other process types, and is very useful if you want to extract aggregate values that you can then use to build TML models and/or make decisions to prevent issues.
ENTROPY computes the amount of information in the data stream.
AUTOCORR runs an autocorrelation regression, Y = Y(t-1), to indicate how the previous value correlates with the future value.
TREND runs a linear regression, Y = f(Time), to determine whether the data in the stream are increasing or decreasing.
ANOMPROBX-Y (similar to OUTLIERSX-Y), where X and Y are numbers or “n”; “n” means examine all anomalies for recurring patterns. These process types let you check whether the anomalies in the streams are truly anomalies and not part of some pattern. For example, if an IoT device shuts off and turns on again routinely, this may be picked up as an anomaly when in fact it is normal behaviour. To ignore these cases, ANOMPROB2-5 tells VIPER to check anomalies for patterns of 2-5 peaks. If the stream has two classes, such as 0 and 1000, that show a pattern, they should not be considered an anomaly: class=0 is the device shutting down, and class=1000 is the device turning back on. With ANOMPROB3-10, VIPER checks for patterns of classes 3 to 10 to see if they recur routinely. This is very helpful for reducing false positives and false negatives.
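The Spikedetect process type above is described only by its three parameters (lag of 5, 3.5 standard deviations, influence of 0.5). Those parameters match the widely used smoothed z-score peak-detection algorithm, so the following is a minimal sketch under that assumption. It is not VIPER's actual implementation, and the function name `detect_spikes` is hypothetical.

```python
from statistics import fmean, pstdev


def detect_spikes(values, lag=5, threshold=3.5, influence=0.5):
    """Return one flag per value: +1 (upward spike), -1 (downward spike), 0 (none).

    ASSUMPTION: this mirrors only the documented Spikedetect parameters;
    VIPER's real algorithm may differ.
    """
    signals = [0] * len(values)
    if len(values) <= lag:
        return signals  # not enough history to judge any point

    filtered = list(values)  # spike-damped copy used for the rolling statistics
    mean = fmean(filtered[:lag])
    std = pstdev(filtered[:lag])

    for i in range(lag, len(values)):
        if abs(values[i] - mean) > threshold * std:
            signals[i] = 1 if values[i] > mean else -1
            # A spike only partially influences the rolling statistics.
            filtered[i] = influence * values[i] + (1 - influence) * filtered[i - 1]
        else:
            filtered[i] = values[i]
        window = filtered[i - lag + 1 : i + 1]
        mean = fmean(window)
        std = pstdev(window)
    return signals


readings = [1.0, 1.2, 0.9, 1.1, 1.0, 9.0, 1.0, 1.1]
print(detect_spikes(readings))
```

The influence parameter is what keeps a single spike from inflating the rolling mean and standard deviation so much that the next few readings are also misjudged.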
viperpreprocessbatch - This function is similar to viperpreprocessproducetotopicstream; the only difference is that you can specify multiple tmlids in the Topicid field. This allows you to batch process multiple tmlids at once, which is very useful with a Kubernetes architecture.
vipercreatejointopicstreams - Users can join multiple topic streams
vipercreatetrainingdata - Users can create a training data set from the topic streams for Real-Time Machine Learning (RTML) on the fly.
vipermodifyconsumerdetails - Users can modify consumer details on the topic. When topics are created an admin must indicate name, email, location and description of the topic. This helps to better manage the topic and if there are issues, the admin can contact the individual consuming from the topic.
vipermodifytopicdetails - Users can modify details on the topic. When topics are created an admin must indicate name, email, location and description of the topic. This helps to better manage the topic and if there are issues, the admin can contact the developer of the algorithm and resolve the issue quickly to ensure disruption to consumers is minimal.
vipergroupdeactivate - Admins can deactivate a consumer group, which will stop all insights being delivered to consumers in the group.
vipergroupactivate - Admins can activate a group to re-start the insights.
viperdeletetopics - Admins can delete topics in VIPER database and Kafka clusters.
viperanomalytrain - Perform anomaly/peer group analysis on text or numeric data streams using advanced unsupervised learning. VIPER automatically joins streams and determines the peer group of “usual” behaviours using proprietary algorithms, which are then used to predict anomalies with viperanomalypredict in real-time. Users can use several parameters to fine tune the peer groups.
VIPER is one of the very few technologies, if not the only one, to do anomaly/peer group analysis using unsupervised learning on data streams with Apache Kafka.
viperanomalytrainbatch - Batch allows you to perform anomaly training on multiple IDs at once.
viperanomalypredict - Predicts anomalies for text or numeric data using the peer groups found with viperanomalytrain. VIPER automatically joins streams and compares each value with the peer groups and determines if a value is anomalous in real-time. Users can use several parameters to fine tune the analysis.
VIPER is one of the very few technologies, if not the only one, to do anomaly detection/predictions using unsupervised learning on data streams with Apache Kafka.
viperanomalypredictbatch - Batch allows you to perform anomaly prediction on multiple IDs at once.
viperstreamcorr - Performs streaming correlations by joining multiple data streams with 2 variables. Also performs cross-correlations with 4 variables.
This is a powerful function and can offer important correlation signals between variables. Will also correlate TEXT using natural language processing (NLP).
viperpreprocesscustomjson - Immediately start processing ANY RAW JSON data in minutes. This is useful if you want to start processing data quickly.
viperstreamcluster - Perform cluster analysis on streaming data. This uses K-Means clustering with Euclidean or EuclideanSquared algorithms to compute distance. It is a very useful function if you want to determine common behaviours between devices, patients, or other entities. Users can also set up email alerts if specific clusters are found.
vipersearchanomaly - Perform advanced analysis for user search. This function is useful if you want to monitor what people are searching for, and determine if the searches are anomalous and differ from the peer group of “normal” search behaviour.
vipernlp - Perform advanced natural language summary of PDFs.
viperchatgpt - Start a conversation with ChatGPT in real-time and stream responses.
viperexractpdffields - Extracts fields from PDF file
viperexractpdffieldbylabel - Extracts fields from PDF file by label name.
videochatloadresponse - Analyse videos with video ChatGPT. This is a powerful GPT LLM that understands and reasons about videos frame by frame. It also understands the spatio-temporal frames in the video. Video GPT runs in a container.
areyoubusy - If you deploy thousands of VIPER/HPDE binaries in a Kubernetes cluster, you can broadcast an ‘areyoubusy’ message to all VIPER and HPDE binaries, and they will return their HOST/PORT if they are NOT busy with other tasks. This is convenient for dynamically managing enormous load among VIPER/HPDE instances, and allows you to dynamically assign HOST/PORT to non-busy VIPER/HPDE microservices.
First import the Python library.
import maadstml
1. maadstml.viperstats(vipertoken,host,port=-999,brokerhost='',brokerport=-999,microserviceid='')
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
brokerhost : string, optional
Address where Kafka broker is running - if none is specified, the Kafka broker address in the VIPER.ENV file will be used.
brokerport : int, optional
Port on which Kafka is listening.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: A JSON formatted object of all the Kafka broker information.
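A call to viperstats might be wrapped as in the following minimal sketch. The wrapper name `get_broker_stats` and its `client` parameter are hypothetical (the latter exists only so the wrapper can be exercised without a running VIPER instance), and the sketch assumes the function returns JSON text; the argument names follow the signature above.

```python
import json


def get_broker_stats(vipertoken, host, port, client=None, **optional):
    """Call viperstats and return the broker information as a Python object.

    `client` is a hypothetical injection point for testing; when omitted,
    the real maadstml module is imported.
    """
    if client is None:
        import maadstml as client  # requires: pip install maadstml

    raw = client.viperstats(vipertoken, host, port, **optional)
    # ASSUMPTION: the result is JSON text; anything already parsed is passed through.
    return json.loads(raw) if isinstance(raw, (str, bytes)) else raw


# Example call (placeholder values, adjust to your own VIPER instance):
# stats = get_broker_stats("<VIPERTOKEN>", "http://127.0.0.1", 8000,
#                          brokerhost="127.0.0.1", brokerport=9092)
```

Leaving brokerhost and brokerport unset lets VIPER fall back to the broker address in its VIPER.ENV file, as described in the parameter list.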
- **2. maadstml.vipersubscribeconsumer(vipertoken,host,port,topic,companyname,contactname,contactemail,
location,description,brokerhost='',brokerport=-999,groupid='',microserviceid='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic to subscribe to in Kafka broker
companyname : string, required
Company name of consumer
contactname : string, required
Contact name of consumer
contactemail : string, required
Contact email of consumer
location : string, required
Location of consumer
description : string, required
Description of why consumer wants to subscribe to topic
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
groupid : string, optional
Subscribe consumer to group
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Consumer ID that the user must use to receive insights from topic.
- **3. maadstml.viperunsubscribeconsumer(vipertoken,host,port,consumerid,brokerhost='',brokerport=-999,
microserviceid='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
consumerid : string, required
Consumer id to unsubscribe
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
RETURNS: Success/failure
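For short-lived consumers, such as test jobs, subscribing and unsubscribing can be paired so that an unused subscription is never left behind. The following is a minimal sketch of that idea as a context manager. The name `subscription` and the `client` parameter are hypothetical, and the sketch assumes the value returned by vipersubscribeconsumer can be passed directly as consumerid; if it is a JSON document, extract the id first.

```python
from contextlib import contextmanager


@contextmanager
def subscription(vipertoken, host, port, topic, companyname, contactname,
                 contactemail, location, description, client=None):
    """Subscribe on entry, unsubscribe on exit, and yield the consumer id."""
    if client is None:
        import maadstml as client  # requires: pip install maadstml

    consumerid = client.vipersubscribeconsumer(
        vipertoken, host, port, topic, companyname, contactname,
        contactemail, location, description)
    try:
        yield consumerid
    finally:
        # Always release the subscription, even if the body raised.
        client.viperunsubscribeconsumer(vipertoken, host, port, consumerid)


# Example use (placeholder values):
# with subscription("<VIPERTOKEN>", "http://127.0.0.1", 8000, "mytopic",
#                   "My Company", "Jane Doe", "jane@example.com",
#                   "Toronto", "Testing insights") as consumerid:
#     ...  # consume from the topic with consumerid
```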
- **4. maadstml.viperproducetotopic(vipertoken,host,port,topic,producerid,enabletls=0,delay=100,inputdata='',maadsalgokey='',
maadstoken='',getoptimal=0,externalprediction='',subtopics='',topicid=-999,identifier='',array=0,brokerhost='',brokerport=-999,microserviceid='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic or Topics to produce to. You can separate multiple topics by a comma. If using multiple topics, you must have the same number of producer ids (separated by commas), and same number of externalprediction (separated by commas). Producing to multiple topics at once is convenient for synchronizing the timing of streams for machine learning.
subtopics : string, optional
Enter sub-topic streams. This is useful if you want to reduce the number of topics/partitions in Kafka by adding sub-topics in the main topic.
topicid : int, optional
Topicid represents an id for some entity. For example, if you have 1000 IoT devices with 10 subtopic streams each, you can assign a Topicid to each IoT device, and each of the 10 subtopics will be associated with that device. This way, you do not create 10,000 streams, but just 1 main topic stream, and VIPER will add the 10,000 streams to that one topic. This also drastically reduces partition costs. You can also quickly create custom machine learning models, predictions, and optimizations for each of the 1000 IoT devices.
array : int, optional
You can stream multiple variables at once, and use array=1 to specify that the streams are an array. This is similar to streaming one row in a database, and is useful if you want to synchronize variables for machine learning. For example, if a device produces 3 streams (stream A, stream B, and stream C), then rather than streaming A, B, and C separately you can set subtopics="A,B,C" and externalprediction="value_FOR_A,value_FOR_B,value_FOR_C", and specify array=1. When you do machine learning on this data, the variables A, B, and C are date/time synchronized, and you can choose which variable is the dependent variable in the viperhpdetraining function.
identifier : string, optional
You can add any string identifier for the device. For example, DSN ID, IoT device id, etc.
producerid : string, required
Producer ID of topic to produce to in the Kafka broker
enabletls : int, optional
Set to 1 if Kafka broker is enabled with SSL/TLS encryption, otherwise 0 for plaintext.
delay: int, optional
Time in milliseconds before VIPER backs out from writing messages
inputdata : string, optional
This is the inputdata for the optimal algorithm found by MAADS or HPDE
maadsalgokey : string, optional
This should be the optimal algorithm key returned by maadstml.dotraining function.
maadstoken : string, optional
If the topic is the name of the algorithm from MAADS, then a MAADSTOKEN must be specified to access the algorithm in the MAADS server
getoptimal : int, optional
If you used the maadstml.OPTIMIZE function to optimize a MAADS algorithm, then if this is 1 it will only retrieve the optimal results in JSON format.
externalprediction : string, optional
If you are using your own custom algorithms, then the output of your algorithm can still be used and fed into the Kafka topic.
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns the value produced or results retrieved from the optimization.
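The array=1 convention described above (stream names in subtopics, their values in externalprediction, in the same order) can be prepared with a small helper. This is a minimal sketch; the helper name `pack_array_stream` is hypothetical and only builds the keyword arguments, it does not call VIPER.

```python
def pack_array_stream(values_by_stream):
    """Turn {"A": 21.5, "B": 0.93, ...} into array-style keyword arguments."""
    if not values_by_stream:
        raise ValueError("at least one stream value is required")
    names = [str(name) for name in values_by_stream]
    values = [str(value) for value in values_by_stream.values()]
    # Commas are the separator, so they cannot appear inside a name or value.
    if any("," in part for part in names + values):
        raise ValueError("names and values must not contain commas")
    return {
        "subtopics": ",".join(names),
        "externalprediction": ",".join(values),
        "array": 1,
    }


kwargs = pack_array_stream({"A": 21.5, "B": 0.93, "C": 1013})
print(kwargs)
```

The resulting dictionary can then be passed along with the required arguments, for example `maadstml.viperproducetotopic(vipertoken, host, port, topic, producerid, **kwargs)`.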
- **4.1. maadstml.viperproducetotopicbulk(vipertoken,host,port,topic,producerid,inputdata,partitionsize=100,enabletls=1,delay=100,
brokerhost='',brokerport=-999,microserviceid='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic or Topics to produce to. You can separate multiple topics by a comma. If using multiple topics, you must have the same number of producer ids (separated by commas), and the same number of inputdata groups (separated by tildes). Producing to multiple topics at once is convenient for synchronizing the timing of streams for machine learning.
producerid : string, required
Producer ID of topic to produce to in the Kafka broker. Separate multiple producer ids with comma.
inputdata : string, required
You can write multiple transactions to each topic. Each group of transactions must be separated by a tilde. Each transaction in the group must be separated by a comma. The number of groups must match the producerids and topics. For example, if you are writing to two topics: topic1,topic2, then the inputdata should be: trans1,transn2,…,transnN~trans1,transn2,…,transnN. The number of transactions and topics can be any number. This function can be very powerful if you need to analyse millions or billions of transactions very quickly.
partitionsize : int, optional
This is the partition size used to split the inputdata. For example, if your transactions=10000, then VIPER will create partitions of size 100 (if partitionsize=100), resulting in 100 threads for concurrency. The higher the partitionsize, the lower the number of threads. If you want to stream lots of data fast, then a partitionsize of 1 is the fastest but comes with overhead because more RAM and CPU will be consumed.
enabletls : int, optional
Set to 1 if Kafka broker is enabled with SSL/TLS encryption, otherwise 0 for plaintext.
delay: int, optional
Time in milliseconds before VIPER backs out from writing messages
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: None
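The inputdata layout described above (transactions joined by commas, one group per topic, groups joined by tildes) and the effect of partitionsize can be sketched as follows. The helper names are hypothetical, and rounding up when the transaction count is not an exact multiple of partitionsize is an assumption; the source only gives the 10000/100 = 100 threads example.

```python
import math


def build_bulk_inputdata(topics, producerids, transaction_groups):
    """Return (topic, producerid, inputdata) strings for a bulk produce call."""
    if not (len(topics) == len(producerids) == len(transaction_groups)):
        raise ValueError("topics, producerids and transaction groups must match in number")
    groups = []
    for group in transaction_groups:
        items = [str(t) for t in group]
        # ',' and '~' are separators, so they cannot appear inside a transaction.
        if any("," in t or "~" in t for t in items):
            raise ValueError("transactions must not contain ',' or '~'")
        groups.append(",".join(items))
    return ",".join(topics), ",".join(producerids), "~".join(groups)


def estimated_threads(transaction_count, partitionsize=100):
    """Number of partitions (and so threads) for a given partitionsize."""
    # ASSUMPTION: a final, smaller partition is created for any remainder.
    return math.ceil(transaction_count / partitionsize)


topic, producerid, inputdata = build_bulk_inputdata(
    ["topic1", "topic2"], ["pid1", "pid2"], [[1, 2, 3], [4, 5]])
print(topic, producerid, inputdata)
print(estimated_threads(10000, 100))
```

The three strings map onto the topic, producerid, and inputdata arguments of viperproducetotopicbulk.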
- **5. maadstml.viperconsumefromtopic(vipertoken,host,port,topic,consumerid,companyname,partition=-1,enabletls=0,delay=100,offset=0,
brokerhost='',brokerport=-999,microserviceid='',topicid='-999',rollbackoffsets=0,preprocesstype='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic to consume from in the Kafka broker
preprocesstype : string, optional
If you only want to search for records that have a particular processtype, you can enter: MIN, MAX, AVG, COUNT, COUNTSTR, DIFF, DIFFMARGIN, SUM, MEDIAN, VARIANCE, OUTLIERS, OUTLIERSX-Y, VARIED, ANOMPROB, ANOMPROBX-Y, ENTROPY, AUTOCORR, TREND, CONSISTENCY, Unique, Uniquestr, Geodiff (returns the distance in kilometers between two lat/long points), IQR (InterQuartileRange), Midhinge, GM (Geometric mean), HM (Harmonic mean), Trimean, CV (Coefficient of Variation), Mad (Mean absolute deviation), Skewness, Kurtosis, Spikedetect, Timediff, Avgtimediff. Timediff returns the difference in seconds between the first and last date/time; time should be in this layout: 2006-01-02T15:04:05. Avgtimediff returns the average time in seconds between consecutive dates. Spikedetect uses a Z-score method to detect spikes in the data, using a lag of 5, a threshold of 3.5 standard deviations from the mean, and an influence of 0.5.
Dataage_[UTC offset]_[timetype], dataage can be used to check the last update time of the data in the data stream from current local time. You can specify the UTC offset to adjust the current time to match the timezone of the data stream. You can specify timetype as millisecond, second, minute, hour, day. For example, if Dataage_1_minute, then this processtype will compare the last timestamp in the data stream, to the local UTC time offset +1 and compute the time difference between the data stream timestamp and current local time and return the difference in minutes. This is a very powerful processtype for data quality and data assurance programs for any number of data streams.
Unique Checks numeric data for duplication. Returns 1 if no data duplication (unique), 0 otherwise.
Uniquestr Checks string data for duplication. Returns 1 if no data duplication (unique), 0 otherwise.
Uniquecount Checks numeric data for duplication. Returns count of unique numbers.
Uniquestrcount Checks string data for duplication. Returns count of unique strings.
CONSISTENCY checks if the data all have consistent data types. Returns 1 for consistent data types, 0 otherwise.
Meanci95 or Meanci99 - returns a 95% or 99% confidence interval: mean, low, high
RAW for no processing.
ANOMPROB (Anomaly Probability) runs several algorithms on the data stream window to determine a probability of anomalous behaviour. This can be cross-referenced with OUTLIERS. It can be a very powerful way to detect issues with devices.
ANOMPROBX-Y (similar to OUTLIERSX-Y), where X and Y are numbers or “n”; “n” means examine all anomalies for patterns. These process types let you check whether the anomalies in the streams are truly anomalies and not part of some pattern. For example, if an IoT device shuts off and turns on again routinely, this may be picked up as an anomaly when in fact it is normal behaviour. To ignore these cases, ANOMPROB2-5 tells VIPER to check anomalies for patterns of 2-5 peaks. If the stream has two classes, such as 0 and 1000, that show a pattern, they should not be considered an anomaly: class=0 is the device shutting down, and class=1000 is the device turning back on. With ANOMPROB3-10, VIPER checks for patterns of classes 3 to 10 to see if they recur routinely. This is very helpful for reducing false positives and false negatives.
topicid : string, optional
Topicid represents an id for some entity. For example, if you have 1000 IoT devices, you can consume per device by entering the topicid that you gave when you produced the topic stream. Or, you can read from multiple topicids at the same time. For example, if you have 10 ids, you can specify each one separated by a comma: 1,2,3,4,5,6,7,8,9,10. VIPER will read the topicids in parallel. This can drastically speed up consumption of messages but will require more CPU.
rollbackoffsets : int, optional, enter value between 0 and 100
This will rollback the streams by this percentage. For example, if using topicid, the main stream is rolled back by this percentage amount.
consumerid : string, required
Consumer id associated with the topic
companyname : string, required
Your company name
partition : int, optional
Set to the Kafka partition number, or -1 to autodetect
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise set to 0 for plaintext.
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
offset: int, optional
Offset to start reading from. If 0, reading will start from the beginning of the topic. If -1, VIPER will automatically go to the last offset. Or, you can extract the LastOffet from the returned JSON and use this offset for your next call.
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the contents read from the topic.
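The `topicid` and `rollbackoffsets` arguments described above are plain values that are easy to get subtly wrong. A minimal sketch of preparing them before a consume call is shown here; the helper names `make_topicid` and `check_rollbackoffsets` are hypothetical and are not part of MAADSTML.

```python
def make_topicid(ids):
    """Join entity ids into the comma-separated topicid format, e.g. "1,2,3"."""
    return ",".join(str(i) for i in ids)


def check_rollbackoffsets(percent):
    """Return percent unchanged if it is an integer between 0 and 100."""
    if not isinstance(percent, int) or not 0 <= percent <= 100:
        raise ValueError("rollbackoffsets must be an int between 0 and 100")
    return percent


# Ten device ids, read in parallel by VIPER according to the text above.
topicid = make_topicid(range(1, 11))
rollbackoffsets = check_rollbackoffsets(50)
print(topicid, rollbackoffsets)
```

Validating on the client side like this only catches formatting mistakes early; VIPER remains the authority on what it accepts.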
- **5.1 maadstml.viperconsumefromtopicbatch(vipertoken,host,port,topic,consumerid,companyname,partition=-1,enabletls=0,delay=100,offset=0,
brokerhost='',brokerport=-999,microserviceid='',topicid='-999',rollbackoffsets=0,preprocesstype='',timedelay=0,asynctimeout=120)**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
asynctimeout : int, optional
This is the timeout in seconds for the Python library async function.
timedelay : int, optional
Timedelay is in SECONDS. Because batch runs continuously in the background, this will cause VIPER to pause timedelay seconds when reading and writing to Kafka. For example, if the raw data is being generated every 3600 seconds, it may make sense to set timedelay=3600.
topic : string, required
Topic to consume from in the Kafka broker
preprocesstype : string, optional
If you only want to search for records that have a particular processtype, you can enter: MIN, MAX, AVG, COUNT, COUNTSTR, DIFF, DIFFMARGIN, SUM, MEDIAN, VARIANCE, OUTLIERS, OUTLIERSX-Y, VARIED, ANOMPROB, ANOMPROBX-Y, ENTROPY, AUTOCORR, TREND, IQR (InterQuartileRange), Midhinge, CONSISTENCY, GM (Geometric mean), HM (Harmonic mean), Trimean, CV (Coefficient of variation), Mad (Mean absolute deviation), Skewness, Kurtosis, Spikedetect, Unique, Uniquestr, Timediff, Avgtimediff, Geodiff.
Timediff returns the difference in seconds between the first date/time and the last date/time. Time should be in this layout: 2006-01-02T15:04:05.
Avgtimediff returns the average time in seconds between consecutive dates.
Spikedetect uses a Z-score method to detect spikes in the data using a lag of 5, a StD of 3.5 from the mean and an influence of 0.5.
Geodiff returns the distance in kilometers between two lat/long points.
Unique checks numeric data for duplication. Returns 1 if no data duplication (unique), 0 otherwise.
Dataage_[UTC offset]_[timetype], dataage can be used to check the last update time of the data in the data stream from current local time. You can specify the UTC offset to adjust the current time to match the timezone of the data stream. You can specify timetype as millisecond, second, minute, hour, day. For example, if Dataage_1_minute, then this processtype will compare the last timestamp in the data stream, to the local UTC time offset +1 and compute the time difference between the data stream timestamp and current local time and return the difference in minutes. This is a very powerful processtype for data quality and data assurance programs for any number of data streams.
Uniquestr Checks string data for duplication. Returns 1 if no data duplication (unique), 0 otherwise.
Uniquecount Checks numeric data for duplication. Returns count of unique numbers.
Uniquestrcount Checks string data for duplication. Returns count of unique strings.
CONSISTENCY checks if the data all have consistent data types. Returns 1 for consistent data types, 0 otherwise.
Meanci95 or Meanci99 - returns a 95% or 99% confidence interval: mean, low, high
RAW for no processing.
ANOMPROB=Anomaly probability. It runs several algorithms on the data stream window to determine a probability of anomalous behaviour. This can be cross-referenced with OUTLIERS. It can be a very powerful way to detect issues with devices.
ANOMPROBX-Y (similar to OUTLIERSX-Y), where X and Y are numbers, or “n”. If “n”, all anomalies are examined for patterns. These let you check whether the anomalies in the streams are truly anomalies and not some pattern. For example, if an IoT device shuts off and turns on again routinely, this may be picked up as an anomaly when in fact it is normal behaviour. To ignore these cases, ANOMPROB2-5 tells VIPER to check anomalies for patterns of 2-5 peaks. If the stream has two classes, such as 0 and 1000, and they show a pattern, then they should not be considered an anomaly: class=0 is the device shutting down and class=1000 is the device turning back on. With ANOMPROB3-10, VIPER will check for patterns of classes 3 to 10 to see if they recur routinely. This is very helpful to reduce false positives and false negatives.
topicid : string, required
Topicid represents an id for some entity. For example, if you have 1000 IoT devices, you can consume per device by entering the topicid that you gave when you produced the topic stream. Or, you can read from multiple topicids at the same time. For example, if you have 10 ids, then you can specify each one separated by a comma: 1,2,3,4,5,6,7,8,9,10. VIPER will read topicids in parallel. This can drastically speed up consumption of messages but will require more CPU. VIPER will consume continuously from topic ids.
rollbackoffsets : int, optional, enter value between 0 and 100
This will rollback the streams by this percentage. For example, if using topicid, the main stream is rolled back by this percentage amount.
consumerid : string, required
Consumer id associated with the topic
companyname : string, required
Your company name
partition : int, optional
Set to the Kafka partition number, or -1 to autodetect.
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise set to 0 for plaintext.
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
offset: int, optional
Offset to start reading from. If 0, reading will start from the beginning of the topic. If -1, VIPER will automatically go to the last offset. Or, you can extract the LastOffet from the returned JSON and use this offset for your next call.
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the contents read from the topic.
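As a rough illustration of how the batch arguments fit together, the sketch below collects keyword arguments named after the 5.1 signature, including a `Dataage_[UTC offset]_[timetype]` preprocesstype. The helper names, the topic name, the consumer id and the company name are all hypothetical, and the actual library call is left as a comment because it needs a running VIPER instance.

```python
def dataage_processtype(utc_offset, timetype):
    """Build a Dataage_[UTC offset]_[timetype] preprocesstype string."""
    allowed = ("millisecond", "second", "minute", "hour", "day")
    if timetype not in allowed:
        raise ValueError(f"timetype must be one of {allowed}")
    return f"Dataage_{utc_offset}_{timetype}"


def consume_batch_kwargs(topic, consumerid, companyname, device_ids, timedelay=0):
    """Collect keyword arguments named after the 5.1 signature above."""
    return {
        "topic": topic,
        "consumerid": consumerid,
        "companyname": companyname,
        # Batch functions take topicid as a comma-separated string.
        "topicid": ",".join(str(i) for i in device_ids),
        "preprocesstype": dataage_processtype(1, "minute"),
        "timedelay": timedelay,
    }


# Hypothetical values; raw data is assumed to arrive once per hour.
kwargs = consume_batch_kwargs(
    "iot-preprocessed", "consumer-1", "Acme", [1, 2, 3], timedelay=3600
)
# With a running VIPER instance (vipertoken, host and port from your administrator):
# result = maadstml.viperconsumefromtopicbatch(vipertoken, host, port, **kwargs)
print(kwargs)
```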
- **6. maadstml.viperhpdepredict(vipertoken,host,port,consumefrom,produceto,companyname,consumerid,producerid,
hpdehost,inputdata,maxrows=0,algokey='',partition=-1,offset=-1,enabletls=1,delay=1000,hpdeport=-999,brokerhost='', brokerport=-999,timeout=120,usedeploy=0,microserviceid='',topicid=-999, maintopic='', streamstojoin='', array=0,pathtoalgos='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topicid : int, optional
Topicid represents an id for some entity. For example, if you have 1000 IoT devices with 10 subtopic streams, you can assign a Topicid to each IoT device and each of the 10 subtopics will be associated to each IoT device. This way, you can do predictions for each IoT device using its own custom ML model.
pathtoalgos : string, required
Enter the full path to the root folder where the algorithms are stored.
maintopic : string, optional
This is the name of the topic that contains the sub-topic streams.
array : int, optional
Set array=1 if you produced data (from viperproducetotopic) as an array.
streamstojoin : string, optional
These are the sub-topics you are streaming into maintopic. To do predictions, VIPER will automatically join these streams to create the input data for predictions for each Topicid.
consumefrom : string, required
Topic to consume from in the Kafka broker
produceto : string, required
Topic to produce results of the prediction to
companyname : string, required
Your company name
consumerid: string, required
Consumerid associated with the topic to consume from
producerid: string, required
Producerid associated with the topic to produce to
inputdata: string, required
This is a comma separated list of values that represent the independent variables in your algorithm. The order must match the order of the independent variables in your algorithm. OR, you can enter a data stream that contains the joined topics from vipercreatejointopicstreams.
maxrows: int, optional
Use this to rollback the stream by maxrows offsets. For example, if you want to make 1000 predictions then set maxrows=1000, and make 1000 predictions from the current offset of the independent variables.
algokey: string, optional
If you know the algorithm key that was returned by VIPERHPDETRAINING then you can specify it here. Specifying the algokey can drastically speed up the predictions.
partition : int, optional
If you know the Kafka partition used to store data then specify it here. In most cases Kafka will dynamically store data in partitions, so you should use the default of -1 to let VIPER find it.
offset : int, optional
Offset to start consuming data. Usually you can use -1, and VIPER will get the last offset.
hpdehost: string, required
Address of HPDE
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise 0 for plaintext.
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
hpdeport: int, required
Port number HPDE is listening on
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
timeout : int, optional
Number of seconds that VIPER waits when trying to make a connection to HPDE.
usedeploy : int, optional
If 0, the algorithm in test is used; if 1, the production algorithm is used.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the prediction.
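Because `inputdata` must list the independent-variable values in the same order the algorithm expects, it can help to build the string from named values. A minimal sketch, assuming a hypothetical helper `format_inputdata` and made-up variable names:

```python
def format_inputdata(values, variable_order):
    """Return the values as a comma-separated string in the model's variable order."""
    missing = [name for name in variable_order if name not in values]
    if missing:
        raise KeyError(f"missing independent variables: {missing}")
    return ",".join(str(values[name]) for name in variable_order)


# The dict order does not matter; variable_order decides the output order.
reading = {"humidity": 52, "temperature": 21.5}
inputdata = format_inputdata(reading, ["temperature", "humidity"])
print(inputdata)
```

The resulting string would then be passed as the `inputdata` argument of `viperhpdepredict`.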
- **6.1 maadstml.viperhpdepredictbatch(vipertoken,host,port,consumefrom,produceto,companyname,consumerid,producerid,
hpdehost,inputdata,maxrows=0,algokey='',partition=-1,offset=-1,enabletls=1,delay=1000,hpdeport=-999,brokerhost='', brokerport=-999,timeout=120,usedeploy=0,microserviceid='',topicid="-999", maintopic='', streamstojoin='', array=0,timedelay=0,asynctimeout=120,pathtoalgos='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
asynctimeout : int, optional
This is the timeout in seconds for the Python library async function.
timedelay : int, optional
Timedelay is in SECONDS. Because batch runs continuously in the background, this will cause VIPER to pause timedelay seconds when reading and writing to Kafka. For example, if the raw data is being generated every 3600 seconds, it may make sense to set timedelay=3600.
topicid : string, required
Topicid represents an id for some entity. For example, if you have 1000 IoT devices with 10 subtopic streams, you can assign a Topicid to each IoT device and each of the 10 subtopics will be associated to each IoT device. This way, you can do predictions for each IoT device using its own custom ML model. Separate multiple topicids by a comma. For example, with topicid="1,2,3,4,5" VIPER will process them all at once.
pathtoalgos : string, required
Enter the full path to the root folder where the algorithms are stored.
maintopic : string, optional
This is the name of the topic that contains the sub-topic streams.
array : int, optional
Set array=1 if you produced data (from viperproducetotopic) as an array.
streamstojoin : string, optional
These are the sub-topics you are streaming into maintopic. To do predictions, VIPER will automatically join these streams to create the input data for predictions for each Topicid.
consumefrom : string, required
Topic to consume from in the Kafka broker
produceto : string, required
Topic to produce results of the prediction to
companyname : string, required
Your company name
consumerid: string, required
Consumerid associated with the topic to consume from
producerid: string, required
Producerid associated with the topic to produce to
inputdata: string, required
This is a comma separated list of values that represent the independent variables in your algorithm. The order must match the order of the independent variables in your algorithm. OR, you can enter a data stream that contains the joined topics from vipercreatejointopicstreams.
maxrows: int, optional
Use this to rollback the stream by maxrows offsets. For example, if you want to make 1000 predictions then set maxrows=1000, and make 1000 predictions from the current offset of the independent variables.
algokey: string, optional
If you know the algorithm key that was returned by VIPERHPDETRAINING then you can specify it here. Specifying the algokey can drastically speed up the predictions.
partition : int, optional
If you know the Kafka partition used to store data then specify it here. In most cases Kafka will dynamically store data in partitions, so you should use the default of -1 to let VIPER find it.
offset : int, optional
Offset to start consuming data. Usually you can use -1, and VIPER will get the last offset.
hpdehost: string, required
Address of HPDE
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise 0 for plaintext.
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
hpdeport: int, required
Port number HPDE is listening on
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
timeout : int, optional
Number of seconds that VIPER waits when trying to make a connection to HPDE.
usedeploy : int, optional
If 0, the algorithm in test is used; if 1, the production algorithm is used.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the prediction.
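With this many optional arguments, a misspelled keyword is an easy mistake. One possible way to guard against it is sketched below: the defaults are copied from the 6.1 signature above, and the helper `predict_batch_options` (a hypothetical name, not part of MAADSTML) rejects any argument name that is not in that list.

```python
# Optional arguments and defaults copied from the 6.1 signature above.
PREDICT_BATCH_DEFAULTS = {
    "maxrows": 0, "algokey": "", "partition": -1, "offset": -1,
    "enabletls": 1, "delay": 1000, "hpdeport": -999, "brokerhost": "",
    "brokerport": -999, "timeout": 120, "usedeploy": 0, "microserviceid": "",
    "topicid": "-999", "maintopic": "", "streamstojoin": "", "array": 0,
    "timedelay": 0, "asynctimeout": 120, "pathtoalgos": "",
}


def predict_batch_options(**overrides):
    """Merge overrides into the defaults, rejecting unknown argument names."""
    unknown = set(overrides) - set(PREDICT_BATCH_DEFAULTS)
    if unknown:
        raise TypeError(f"unknown arguments: {sorted(unknown)}")
    return {**PREDICT_BATCH_DEFAULTS, **overrides}


# Predict for five devices with the production algorithm, pausing an hour between runs.
options = predict_batch_options(topicid="1,2,3,4,5", usedeploy=1, timedelay=3600)
print(options["topicid"], options["usedeploy"], options["delay"])
```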
- **6.2. maadstml.viperhpdepredictprocess(vipertoken,host,port,consumefrom,produceto,companyname,consumerid,producerid,hpdehost,inputdata,processtype,maxrows=0,
algokey='',partition=-1,offset=-1,enabletls=1,delay=1000,hpdeport=-999,brokerhost='',brokerport=9092, timeout=120,usedeploy=0,microserviceid='',topicid=-999, maintopic='', streamstojoin='',array=0,pathtoalgos='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topicid : int, optional
Topicid represents an id for some entity. For example, if you have 1000 IoT devices with 10 subtopic streams, you can assign a Topicid to each IoT device and each of the 10 subtopics will be associated to each IoT device. This way, you can do predictions for each IoT device using its own custom ML model.
pathtoalgos : string, required
Enter the full path to the root folder where the algorithms are stored.
maintopic : string, optional
This is the name of the topic that contains the sub-topic streams.
array : int, optional
Set array=1 if you produced data (from viperproducetotopic) as an array.
streamstojoin : string, optional
These are the sub-topics you are streaming into maintopic. To do predictions, VIPER will automatically join these streams to create the input data for predictions for each Topicid.
consumefrom : string, required
Topic to consume from in the Kafka broker
produceto : string, required
Topic to produce results of the prediction to
companyname : string, required
Your company name
consumerid: string, required
Consumerid associated with the topic to consume from
producerid: string, required
Producerid associated with the topic to produce to
inputdata: string, required
This is a comma separated list of values that represent the independent variables in your algorithm. The order must match the order of the independent variables in your algorithm. OR, you can enter a data stream that contains the joined topics from vipercreatejointopicstreams.
processtype: string, required
This must be one of: max, min, avg, median, trend, all. For example, max finds the maximum prediction, such as the best human or machine. Trend computes how the predictions are trending. Avg is the average of all predictions. Median is the median of predictions. All produces all predictions.
maxrows: int, optional
Use this to rollback the stream by maxrows offsets. For example, if you want to make 1000 predictions then set maxrows=1000, and make 1000 predictions from the current offset of the independent variables.
algokey: string, optional
If you know the algorithm key that was returned by VIPERHPDETRAINING then you can specify it here. Specifying the algokey can drastically speed up the predictions.
partition : int, optional
If you know the Kafka partition used to store data then specify it here. In most cases Kafka will dynamically store data in partitions, so you should use the default of -1 to let VIPER find it.
offset : int, optional
Offset to start consuming data. Usually you can use -1, and VIPER will get the last offset.
hpdehost: string, required
Address of HPDE
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise 0 for plaintext.
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
hpdeport: int, required
Port number HPDE is listening on
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
timeout : int, optional
Number of seconds that VIPER waits when trying to make a connection to HPDE.
usedeploy : int, optional
If 0, the algorithm in test is used; if 1, the production algorithm is used.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the prediction.
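To make the `processtype` options concrete, the sketch below applies max, min, avg, median and all to a small list of made-up predictions locally. This is only an illustration of what each option means; VIPER and HPDE perform the real aggregation, and trend is left out because its exact definition is not given here. The function name `summarize_predictions` is hypothetical.

```python
import statistics


def summarize_predictions(predictions, processtype):
    """Local illustration of the max, min, avg, median and all options."""
    reducers = {
        "max": max,
        "min": min,
        "avg": statistics.fmean,
        "median": statistics.median,
        "all": list,
    }
    if processtype not in reducers:
        raise ValueError(f"unsupported processtype in this sketch: {processtype}")
    return reducers[processtype](predictions)


# Made-up prediction values for illustration only.
predictions = [1.0, 2.0, 3.0, 10.0]
for name in ("max", "min", "avg", "median", "all"):
    print(name, summarize_predictions(predictions, name))
```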
- **7. maadstml.viperhpdeoptimize(vipertoken,host,port,consumefrom,produceto,companyname,consumerid,producerid,
hpdehost,partition=-1,offset=-1,enabletls=0,delay=100,hpdeport=-999,usedeploy=0,ismin=1,constraints='best', stretchbounds=20,constrainttype=1,epsilon=10,brokerhost='',brokerport=-999,timeout=120,microserviceid='',topicid=-999)**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
consumefrom : string, required
Topic to consume from in the Kafka broker
topicid : int, optional
Topicid represents an id for some entity. For example, if you have 1000 IoT devices, you can perform mathematical optimization for each of the 1000 IoT devices using their specific algorithm.
produceto : string, required
Topic to produce results of the optimization to
companyname : string, required
Your company name
consumerid: string, required
Consumerid associated with the topic to consume from
producerid: string, required
Producerid associated with the topic to produce to
hpdehost: string, required
Address of HPDE
partition : int, optional
If you know the Kafka partition used to store data then specify it here. In most cases Kafka will dynamically store data in partitions, so you should use the default of -1 to let VIPER find it.
offset : int, optional
Offset to start consuming data. Usually you can use -1, and VIPER will get the last offset.
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise set to 0 for plaintext.
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
hpdeport: int, required
Port number HPDE is listening on
usedeploy : int, optional
If 0, the algorithm in test is used; if 1, the production algorithm is used.
ismin : int, optional
If 1 the function is minimized, else if 0 the function is maximized.
constraints: string, optional
If “best” then HPDE will choose the best values of the independent variables to minimize or maximize the dependent variable. Users can also specify their own constraints for each variable, which must be in the following format: varname1:min:max,varname2:min:max,…
stretchbounds: int, optional
A number between 0 and 100, this is the percentage to stretch the bounds on the constraints.
constrainttype: int, optional
If 1 then HPDE uses the min/max of each variable for the bounds, if 2 HPDE will adjust the min/max by their standard deviation, if 3 then HPDE uses stretchbounds to adjust the min/max for each variable.
epsilon: int, optional
Once HPDE finds a good local minima/maxima, it then uses this epsilon value to find the Global minima/maxima to ensure you have the best values of the independent variables that minimize or maximize the dependent variable.
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
timeout : int, optional
Number of seconds that VIPER waits when trying to make a connection to HPDE.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the optimization details and optimal values.
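The user-defined `constraints` format described above (`varname1:min:max,varname2:min:max,…`) can be assembled from a dictionary of bounds. A minimal sketch, assuming a hypothetical helper `build_constraints` and made-up variable names:

```python
def build_constraints(bounds):
    """Format {name: (min, max)} as varname1:min:max,varname2:min:max."""
    parts = []
    for name, (low, high) in bounds.items():
        if low > high:
            raise ValueError(f"{name}: min {low} is greater than max {high}")
        parts.append(f"{name}:{low}:{high}")
    return ",".join(parts)


# Made-up variables and bounds for illustration only.
constraints = build_constraints({"temperature": (20.5, 30), "humidity": (50, 55)})
print(constraints)
```

The resulting string would be passed as the `constraints` argument in place of the default `'best'`.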
- **7.1 maadstml.viperhpdeoptimizebatch(vipertoken,host,port,consumefrom,produceto,companyname,consumerid,producerid,
hpdehost,partition=-1,offset=-1,enabletls=0,delay=100,hpdeport=-999,usedeploy=0,ismin=1,constraints='best', stretchbounds=20,constrainttype=1,epsilon=10,brokerhost='',brokerport=-999,timeout=120,microserviceid='',topicid="-999", timedelay=0,asynctimeout=120)**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
asynctimeout : int, optional
This is the timeout in seconds for the Python library async function.
timedelay : int, optional
Timedelay is in SECONDS. Because batch runs continuously in the background, this will cause VIPER to pause timedelay seconds when reading and writing to Kafka. For example, if the raw data is being generated every 3600 seconds, it may make sense to set timedelay=3600.
consumefrom : string, required
Topic to consume from in the Kafka broker
topicid : string, required
Topicid represents an id for some entity. For example, if you have 1000 IoT devices, you can perform mathematical optimization for each of the 1000 IoT devices using their specific algorithm. Separate multiple topicids by a comma.
produceto : string, required
Topic to produce results of the optimization to
companyname : string, required
Your company name
consumerid: string, required
Consumerid associated with the topic to consume from
producerid: string, required
Producerid associated with the topic to produce to
hpdehost: string, required
Address of HPDE
partition : int, optional
If you know the Kafka partition used to store data then specify it here. In most cases Kafka will dynamically store data in partitions, so you should use the default of -1 to let VIPER find it.
offset : int, optional
Offset to start consuming data. Usually you can use -1, and VIPER will get the last offset.
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise set to 0 for plaintext.
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
hpdeport: int, required
Port number HPDE is listening on
usedeploy : int, optional
If 0, the algorithm in test is used; if 1, the production algorithm is used.
ismin : int, optional
If 1 the function is minimized, else if 0 the function is maximized.
constraints: string, optional
If “best” then HPDE will choose the best values of the independent variables to minimize or maximize the dependent variable. Users can also specify their own constraints for each variable, which must be in the following format: varname1:min:max,varname2:min:max,…
stretchbounds: int, optional
A number between 0 and 100, this is the percentage to stretch the bounds on the constraints.
constrainttype: int, optional
If 1 then HPDE uses the min/max of each variable for the bounds, if 2 HPDE will adjust the min/max by their standard deviation, if 3 then HPDE uses stretchbounds to adjust the min/max for each variable.
epsilon: int, optional
Once HPDE finds a good local minima/maxima, it then uses this epsilon value to find the Global minima/maxima to ensure you have the best values of the independent variables that minimize or maximize the dependent variable.
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
timeout : int, optional
Number of seconds that VIPER waits when trying to make a connection to HPDE.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the optimization details and optimal values.
- **8. maadstml.viperhpdetraining(vipertoken,host,port,consumefrom,produceto,companyname,consumerid,producerid,
hpdehost,viperconfigfile,enabletls=1,partition=-1,deploy=0,modelruns=50,modelsearchtuner=80,hpdeport=-999,
offset=-1,islogistic=0,brokerhost='', brokerport=-999,timeout=120,microserviceid='',topicid=-999,maintopic='',
independentvariables='',dependentvariable='',rollbackoffsets=0,fullpathtotrainingdata='',processlogic='',
identifier='',array=0,transformtype='',sendcoefto='',coeftoprocess='',coefsubtopicnames='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
transformtype : string, optional
You can transform the dependent and independent variables using: log-log, log-lin, lin-log, where lin=linear and log=natural log. This may be useful if you want to compute price or demand elasticities.
sendcoefto : string, optional
This is the name of the kafka topic that you want to stream the estimated parameters to.
coeftoprocess : string, optional
These are the indexes of the estimated parameters. For example, if the ML model has a constant and two estimated parameters, then coeftoprocess="0,1,2" means stream the constant term (at index 0) and the two estimated parameters at indexes 1 and 2.
coefsubtopicnames : string, optional
These are the names for the estimated parameters. For example, "constant,elasticity,elasticity2" would be streamed as Kafka topics for coeftoprocess.
topicid : int, optional
Topicid represents an id for some entity. For example, if you have 1000 IoT devices, you can create individual Machine Learning models for each IoT device in real-time. This is a core functionality of TML solutions.
array : int, optional
Set array=1 if the data you are consuming from is an array of multiple streams that you produced from viperproducetotopic in an effort to synchronize data for training.
maintopic : string, optional
This is the maintopic that contains the sub-topic streams.
independentvariables : string, optional
These are the independent variables that are the subtopics.
dependentvariable : string, optional
This is the dependent variable in the subtopic streams.
rollbackoffsets: int, optional
This is the rollback percentage to create the training dataset. VIPER will automatically create a training dataset using the independent and dependent variable streams.
fullpathtotrainingdata: string, optional
This is the FULL path where you want to store the training dataset. VIPER will write file to disk. Make sure proper permissions are granted to VIPER. For example, c:/myfolder/mypath
processlogic : string, optional
You can dynamically build a classification model by specifying how you want to classify the dependent variable, by indicating your conditions in the processlogic variable (this takes effect if islogistic=1). For example:
processlogic='classification_name=my_prob:temperature=20.5,30:humidity=50,55', means the following:
The name of the dependent variable is specified by classification_name.
Then you can specify the conditions on the streams. If your streams are temperature and humidity: if temperature is between 20.5 and 30, then my_prob=1, otherwise my_prob=0, and
if humidity is between 50 and 55, then my_prob=1, otherwise my_prob=0.
If you want to specify no upper bound you can use n, or -n for no lower bound. For example, temperature=20.5,n means that if temperature >= 20.5 then my_prob=1.
Similarly, humidity=-n,55 means that if humidity <= 55 then my_prob=1.
This allows you to classify the dependent variable with any number of variables, all in real-time!
consumefrom : string, required
Topic to consume from in the Kafka broker
produceto : string, required
Topic to produce the training results to
companyname : string, required
Your company name
consumerid: string, required
Consumerid associated with the topic to consume from
identifier: string, optional
You can add any name or identifier like DSN ID
producerid: string, required
Producerid associated with the topic to produce to
hpdehost: string, required
Address of HPDE
viperconfigfile : string, required
Full path to VIPER.ENV configuration file on server.
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise set to 0 for plaintext.
partition: int, optional
Partition used by kafka to store data. NOTE: Kafka will dynamically store data in partitions. Unless you know for sure the partition, you should use the default of -1 to let VIPER determine where your data is.
deploy: int, optional
If deploy=1, this will deploy the algorithm to the Deploy folder for use in production. If you do not want to use this algorithm in production and are just testing it, then set deploy=0 (default).
modelruns: int, optional
Number of iterations for model training
modelsearchtuner: int, optional
An integer between 0-100, this variable will attempt to fine tune the model search space. A number close to 0 means you will have lots of models but their quality may be low, a number close to 100 (default=80) means you will have fewer models but their quality will be higher
hpdeport: int, required
Port number HPDE is listening on
offset : int, optional
If 0 will use the training data from the beginning of the topic
islogistic: int, optional
If 1, HPDE will switch to logistic modeling, otherwise it uses continuous modeling.
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
timeout : int, optional
Number of seconds that VIPER waits when trying to make a connection to HPDE.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the optimal algorithm that best fits your data.
- **8.1 maadstml.viperhpdetrainingbatch(vipertoken,host,port,consumefrom,produceto,companyname,consumerid,producerid,hpdehost,viperconfigfile,enabletls=1,partition=-1,deploy=0,modelruns=50,modelsearchtuner=80,hpdeport=-999,offset=-1,islogistic=0,brokerhost='',brokerport=-999,timeout=120,microserviceid='',topicid="-999",maintopic='',independentvariables='',dependentvariable='',rollbackoffsets=0,fullpathtotrainingdata='',processlogic='',identifier='',array=0,timedelay=0,asynctimeout=120)**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
asynctimeout : int, optional
This is the timeout in seconds for the Python library async function.
timedelay : int, optional
Timedelay is in SECONDS. Because batch runs continuously in the background, this will cause VIPER to pause timedelay seconds when reading and writing to Kafka. For example, if the raw data is being generated every 3600 seconds, it may make sense to set timedelay=3600.
topicid : string, required
Topicid represents an id for some entity. For example, if you have 1000 IoT devices, you can create individual Machine Learning models for each IoT device in real-time. This is a core functionality of TML solutions. Separate multiple topic ids by comma.
array : int, optional
Set array=1 if the data you are consuming from is an array of multiple streams that you produced from viperproducetotopic in an effort to synchronize data for training.
maintopic : string, optional
This is the maintopic that contains the subtopic streams.
independentvariables : string, optional
These are the independent variables that are the subtopics.
dependentvariable : string, optional
This is the dependent variable in the subtopic streams.
rollbackoffsets: int, optional
This is the rollback percentage to create the training dataset. VIPER will automatically create a training dataset using the independent and dependent variable streams.
fullpathtotrainingdata: string, optional
This is the FULL path where you want to store the training dataset. VIPER will write file to disk. Make sure proper permissions are granted to VIPER. For example, c:/myfolder/mypath
processlogic : string, optional
You can dynamically build a classification model by specifying, in the processlogic variable, the conditions used to classify the dependent variable (this takes effect if islogistic=1). For example:
processlogic='classification_name=my_prob:temperature=20.5,30:humidity=50,55' means the following:
The name of the dependent variable is specified by classification_name.
You then specify the conditions on the streams. If your streams are temperature and humidity: if temperature is between 20.5 and 30, then my_prob=1, otherwise my_prob=0, and
if humidity is between 50 and 55, then my_prob=1, otherwise my_prob=0.
To specify no upper bound use n, and use -n for no lower bound. For example, temperature=20.5,n means that if temperature>=20.5 then my_prob=1.
Similarly, humidity=-n,55 means that if humidity<=55 then my_prob=1.
This allows you to classify the dependent variable with any number of variables, all in real time.
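The processlogic format described above can be illustrated with a small parser. This is only a sketch of how such a string might be interpreted on the client side, not VIPER's own implementation. The helper names `parse_processlogic` and `classify` are hypothetical, and combining the per-stream conditions with a logical AND is an assumption, since the text does not state how multiple conditions are combined.

```python
import math


def parse_processlogic(processlogic):
    """Split a processlogic string into (label_name, {stream: (low, high)})."""
    parts = processlogic.split(":")
    key, _, label = parts[0].partition("=")
    if key != "classification_name":
        raise ValueError("processlogic must start with classification_name=<name>")
    bounds = {}
    for condition in parts[1:]:
        stream, _, limits = condition.partition("=")
        low_text, high_text = limits.split(",")
        # "-n" means no lower bound, "n" means no upper bound.
        low = -math.inf if low_text == "-n" else float(low_text)
        high = math.inf if high_text == "n" else float(high_text)
        bounds[stream.strip().lower()] = (low, high)
    return label, bounds


def classify(record, bounds):
    """Return 1 if every stream value lies inside its range, else 0.

    ASSUMPTION: conditions are combined with AND; the documentation
    does not say how several conditions interact.
    """
    return int(all(low <= record[stream] <= high
                   for stream, (low, high) in bounds.items()))


label, bounds = parse_processlogic(
    "classification_name=my_prob:temperature=20.5,30:humidity=50,55")
print(label, classify({"temperature": 25.0, "humidity": 52.0}, bounds))
```

Under these assumptions, a reading of temperature=25 and humidity=52 is labelled 1, while a reading with temperature=35 is labelled 0.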
consumefrom : string, required
Topic to consume from in the Kafka broker
produceto : string, required
Topic to produce results of the prediction to
companyname : string, required
Your company name
consumerid: string, required
Consumerid associated with the topic to consume from
identifier: string, optional
You can add any name or identifier like DSN ID
producerid: string, required
Producerid associated with the topic to produce to
hpdehost: string, required
Address of HPDE
viperconfigfile : string, required
Full path to VIPER.ENV configuration file on server.
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise set to 0 for plaintext.
partition: int, optional
Partition used by Kafka to store data. NOTE: Kafka will dynamically store data in partitions. Unless you know the partition for sure, you should use the default of -1 to let VIPER determine where your data is.
deploy: int, optional
If deploy=1, this will deploy the algorithm to the Deploy folder for production use. If you are just testing the algorithm and do not want to use it in production, set deploy=0 (default).
modelruns: int, optional
Number of iterations for model training
modelsearchtuner: int, optional
An integer from 0 to 100 that attempts to fine tune the model search space. A number close to 0 means you will have lots of models but their quality may be low; a number close to 100 (default=80) means you will have fewer models but their quality will be higher.
hpdeport: int, required
Port number HPDE is listening on
offset : int, optional
If 0, the training data is used from the beginning of the topic.
islogistic: int, optional
If 1, HPDE will switch to logistic modeling; otherwise it uses continuous modeling.
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
timeout : int, optional
Number of seconds that VIPER waits when trying to make a connection to HPDE.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the optimal algorithm that best fits your data.
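Because the batch function takes topicid as a string of comma-separated ids, the argument is usually assembled from a list of entities. The following is a minimal sketch of that preparation step only: the device ids and all values are placeholders, the keyword names are taken from the signature above, and the required positional arguments (vipertoken, host, port and so on) are deliberately left out, so nothing here calls VIPER.

```python
# Hypothetical device ids; in a real solution these come from your own inventory.
device_ids = range(1000)

# The batch function expects topicid as one string of comma-separated ids.
topicid = ",".join(str(d) for d in device_ids)

# Keyword arguments named after the documented signature; values are placeholders.
batch_kwargs = {
    "topicid": topicid,
    "islogistic": 0,         # continuous modeling
    "modelruns": 50,         # number of training iterations
    "modelsearchtuner": 80,  # fewer, higher-quality models
    "timedelay": 3600,       # pause in seconds between reads/writes
    "asynctimeout": 120,     # timeout in seconds for the async call
}

print(batch_kwargs["topicid"][:20])
```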
- **9. maadstml.viperproducetotopicstream(vipertoken,host,port,topic,producerid,offset,maxrows=0,enabletls=0,delay=100,brokerhost='',brokerport=-999,microserviceid='',topicid=-999,mainstreamtopic='',streamstojoin='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic to produce to in the Kafka broker. This is a topic that contains multiple topics; VIPER will consume from each topic and write results to the produceto topic.
topicid : int, optional
Topicid represents an id for some entity. For example, if you have 1000 IoT devices, you can join these streams and produce them to one stream.
mainstreamtopic: string, optional
This is the main stream topic that contains the subtopic streams.
streamstojoin: string, optional
These are the streams you want to join and produce to mainstreamtopic.
producerid : string, required
Producerid of the topic producing to
offset : int, required
If 0, the stream data is used from the beginning of the topics; -1 will automatically go to the last offset.
maxrows : int, optional
If offset=-1, this number will roll back the streams by maxrows, i.e. rollback=lastoffset-maxrows
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise 0 for plaintext
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the optimal algorithm that best fits your data.
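The interaction between offset and maxrows described above (rollback=lastoffset-maxrows when offset=-1) is simple arithmetic, sketched below. The function name `resolve_start_offset` is hypothetical, and clamping the result at 0 when maxrows exceeds the last offset is an assumption, not documented behaviour.

```python
def resolve_start_offset(offset, last_offset, maxrows=0):
    """Return the offset where reading would begin under the documented rule."""
    if offset == -1:
        # Documented rule: rollback = lastoffset - maxrows.
        # ASSUMPTION: the result is never allowed to go below offset 0.
        return max(last_offset - maxrows, 0)
    # offset=0 reads from the beginning; other values are used as given.
    return offset


# A stream whose last offset is 1000, rolled back by 200 rows.
print(resolve_start_offset(-1, last_offset=1000, maxrows=200))
```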
- **10. maadstml.vipercreatetrainingdata(vipertoken,host,port,consumefrom,produceto,dependentvariable,independentvariables,consumerid,producerid,companyname,partition=-1,enabletls=0,delay=100,brokerhost='',brokerport=-999,microserviceid='',topicid=-999)**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
consumefrom : string, required
Topic to consume from
topicid : int, optional
Topicid represents an id for some entity. For example, if you have 1000 IoT devices with 10 subtopic streams each, you can assign a Topicid to each IoT device, and each of the 10 subtopics will be associated with that device. You can then create a training dataset for each device.
produceto : string, required
Topic to produce to
dependentvariable : string, required
Topic name of the dependentvariable
independentvariables : string, required
Topic names of the independentvariables - VIPER will automatically read the data streams. Separate multiple variables by comma.
consumerid : string, required
Consumerid of the topic to consume from
producerid : string, required
Producerid of the topic producing to
partition : int, optional
This is the partition in which Kafka stored the stream data. Specifically, the streams you joined with the function viperproducetotopicstream will be stored in a partition by Kafka. If you want to create a training dataset from this data, you should use this partition. This ensures you are using the right data to create a training dataset.
companyname : string, required
Your company name
enabletls: int, optional
Set to 1 if Kafka broker is enabled for SSL/TLS encrypted traffic, otherwise set to 0 for plaintext.
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the training data set.
11. maadstml.vipercreatetopic(vipertoken,host,port,topic,companyname,contactname,contactemail,location,description,enabletls=0,brokerhost='',brokerport=-999,numpartitions=1,replication=1,microserviceid='')
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic to create
companyname : string, required
Company name of consumer
contactname : string, required
Contact name of consumer
contactemail : string, required
Contact email of consumer
location : string, required
Location of consumer
description : string, required
Description of why consumer wants to subscribe to topic
enabletls : int, optional
Set to 1 if Kafka is SSL/TLS enabled for encrypted traffic, otherwise 0 for no encryption (plain text)
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
numpartitions: int, optional
Number of partitions to create in the Kafka broker - the more partitions, the faster Kafka will produce results.
replication: int, optional
Specifies the number of brokers to replicate to - this is important for failover
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the producer id for the topic.
- **12. maadstml.viperconsumefromstreamtopic(vipertoken,host,port,topic,consumerid,companyname,partition=-1,enabletls=0,delay=100,offset=0,brokerhost='',brokerport=-999,microserviceid='',topicid=-999)**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic to consume from
topicid : int, optional
Topicid represents an id for some entity. For example, if you have 1000 IoT devices, you can consume for each device.
consumerid : string, required
Consumerid associated with topic
companyname : string, required
Your company name
partition: int, optional
Set to a Kafka partition number, or -1 to autodetect the partition.
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise set to 0 for plaintext.
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
offset : int, optional
Offset to start reading from. If 0, VIPER will read from the beginning.
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the contents of all the topics read
- **13. maadstml.vipercreatejointopicstreams(vipertoken,host,port,topic,topicstojoin,companyname,contactname,contactemail,description,location,enabletls=0,brokerhost='',brokerport=-999,replication=1,numpartitions=1,microserviceid='',topicid=-999)**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic to consume from
topicid : int, optional
Topicid represents an id for some entity. Create a joined topic stream per topicid.
topicstojoin : string, required
Enter two or more topics separated by a comma and VIPER will join them into one topic
companyname : string, required
Company name of consumer
contactname : string, required
Contact name of consumer
contactemail : string, required
Contact email of consumer
location : string, required
Location of consumer
description : string, required
Description of why consumer wants to subscribe to topic
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled, otherwise set to 0 for plaintext.
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
numpartitions : int, optional
Number of partitions
replication : int, optional
Replication factor
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the producerid of the joined streams
- **14. maadstml.vipercreateconsumergroup(vipertoken,host,port,topic,groupname,companyname,contactname,contactemail,description,location,enabletls=1,brokerhost='',brokerport=-999,microserviceid='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic to add to the group; multiple (active) topics can be separated by a comma
groupname : string, required
Enter the name of the group
companyname : string, required
Company name of consumer
contactname : string, required
Contact name of consumer
contactemail : string, required
Contact email of consumer
location : string, required
Location of consumer
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled, otherwise set to 0 for plaintext.
description : string, required
Description of why consumer wants to subscribe to topic
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the groupid of the group.
- **15. maadstml.viperconsumergroupconsumefromtopic(vipertoken,host,port,topic,consumerid,groupid,companyname,partition=-1,enabletls=0,delay=100,offset=0,rollbackoffset=0,brokerhost='',brokerport=-999,microserviceid='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic to add to the group; multiple (active) topics can be separated by a comma
consumerid : string, required
Enter the consumerid associated with the topic
groupid : string, required
Enter the group's id
companyname : string, required
Enter the company name
partition: int, optional
Set to a Kafka partition number, or -1 to autodetect the partition.
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled, otherwise set to 0 for plaintext.
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
offset : int, optional
Offset to start reading from. If 0, will read from the beginning of topic, or -1 to automatically go to end of topic.
rollbackoffset : int, optional
The number of offsets to rollback the data stream.
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the contents of the group.
16. maadstml.vipermodifyconsumerdetails(vipertoken,host,port,topic,companyname,consumerid,contactname='',contactemail='',location='',brokerhost='',brokerport=9092,microserviceid='')
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic to add to the group; multiple (active) topics can be separated by a comma
consumerid : string, required
Enter the consumerid associated with the topic
companyname : string, required
Enter the company name
contactname : string, optional
Enter the contact name
contactemail : string, optional
Enter the contact email
location : string, optional
Enter the location
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns success/failure
- **17. maadstml.vipermodifytopicdetails(vipertoken,host,port,topic,companyname,partition=0,enabletls=1,isgroup=0,contactname='',contactemail='',location='',brokerhost='',brokerport=9092,microserviceid='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic to add to the group; multiple (active) topics can be separated by a comma
companyname : string, required
Enter the company name
partition : int, optional
You can change the partition in the Kafka topic.
enabletls : int, optional
If enabletls=1, then SSL/TLS is enabled in Kafka; if enabletls=0, it is not.
isgroup : int, optional
This tells VIPER whether this is a group topic if isgroup=1, or a normal topic if isgroup=0
contactname : string, optional
Enter the contact name
contactemail : string, optional
Enter the contact email
location : string, optional
Enter the location
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns success/failure
18. maadstml.viperactivatetopic(vipertoken,host,port,topic,microserviceid='')
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic to activate
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns success/failure
19. maadstml.viperdeactivatetopic(vipertoken,host,port,topic,microserviceid='')
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic to deactivate
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns success/failure
20. maadstml.vipergroupactivate(vipertoken,host,port,groupname,groupid,microserviceid='')
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
groupname : string, required
Name of the group
groupid : string, required
ID of the group
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns success/failure
21. maadstml.vipergroupdeactivate(vipertoken,host,port,groupname,groupid,microserviceid='')
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
groupname : string, required
Name of the group
groupid : string, required
ID of the group
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns success/failure
22. maadstml.viperdeletetopics(vipertoken,host,port,topic,enabletls=1,brokerhost='',brokerport=9092,microserviceid='')
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topic to delete. Separate multiple topics by a comma.
enabletls : int, optional
If enabletls=1, then SSL/TLS is enabled on Kafka; if enabletls=0, it is not.
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
23. maadstml.balancebigdata(localcsvfile,numberofbins,maxrows,outputfile,bincutoff,distcutoff,startcolumn=0)
Parameters:
localcsvfile : string, required
Local file, must be CSV formatted.
numberofbins : int, required
The number of bins for the histogram. You can set to any value but 10 is usually fine.
maxrows : int, required
The number of rows to return, which will be a subset of your original data.
outputfile : string, required
Your new data will be written as CSV to this file.
bincutoff : float, required
This is the threshold percentage for the bins. Specifically, the data in each variable is allocated to bins, but many times it will not fall in ALL of the bins. By setting this percentage between 0 and 1, MAADS will choose variables that exceed this threshold to determine which variables have data that are well distributed across bins. The variables with the most distributed values in the bins will drive the selection of the rows in your dataset that give the best distribution - this will be very important for MAADS training. Usually 0.7 is good.
distcutoff : float, required
This is the threshold percentage for the distribution. Specifically, MAADS uses a Lilliefors statistic to determine whether the data are well distributed. The lower the number the better. Usually 0.45 is good.
startcolumn : int, optional
This tells MAADS which column to start from. If you have DATE in the first column, you can tell MAADS to start from 1 (columns are zero-based)
RETURNS: Returns a detailed JSON object and a new balanced dataset written to outputfile.
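To make the role of numberofbins and bincutoff more concrete, the sketch below measures, for each variable, the fraction of histogram bins that contain at least one value and keeps the variables whose fraction exceeds bincutoff. This is an illustration of the idea only: the helper names are hypothetical, equal-width bins are an assumption, and reading the threshold as "fraction of occupied bins" is one interpretation of the description above, not the library's confirmed algorithm.

```python
def bin_coverage(values, numberofbins):
    """Fraction of equal-width histogram bins holding at least one value."""
    low, high = min(values), max(values)
    if high == low:
        return 1.0 / numberofbins  # every value lands in a single bin
    width = (high - low) / numberofbins
    occupied = set()
    for v in values:
        # The maximum value is placed in the last bin.
        index = min(int((v - low) / width), numberofbins - 1)
        occupied.add(index)
    return len(occupied) / numberofbins


def well_distributed_columns(columns, numberofbins=10, bincutoff=0.7):
    """Names of columns whose bin coverage exceeds bincutoff (ASSUMED rule)."""
    return [name for name, values in columns.items()
            if bin_coverage(values, numberofbins) > bincutoff]


columns = {
    "even": list(range(100)),     # spread across all bins
    "skewed": [0] * 99 + [100],   # only the first and last bins are used
}
print(well_distributed_columns(columns))
```

Here the evenly spread column passes a cutoff of 0.7, while the skewed column, which occupies only two of ten bins, does not.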
- **24. maadstml.viperanomalytrain(vipertoken,host,port,consumefrom,produceto,producepeergroupto,produceridpeergroup,consumeridproduceto,streamstoanalyse,companyname,consumerid,producerid,flags,hpdehost,viperconfigfile,enabletls=1,partition=-1,hpdeport=-999,topicid=-999,maintopic='',rollbackoffsets=0,fullpathtotrainingdata='',brokerhost='',brokerport=9092,delay=1000,timeout=120,microserviceid='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
consumefrom : string, required
Topic to consume from in the Kafka broker
produceto : string, required
Topic to produce results of the prediction to
topicid : int, optional
Topicid represents an id for some entity. For example, if you have 1000 IoT devices, you can perform anomaly detection/predictions for each device.
maintopic : string, optional
This is the maintopic that contains the subtopic streams.
rollbackoffsets: int, optional
This is the percentage to rollback the streams that you are analysing: streamstoanalyse
fullpathtotrainingdata: string, optional
This is the full path to the training dataset to use to find peer groups.
producepeergroupto : string, required
Topic to produce the peer group for anomaly comparisons
produceridpeergroup : string, required
Producerid for the peer group topic
consumeridproduceto : string, required
Consumer id for the Produceto topic
streamstoanalyse : string, required
Comma separated list of streams to analyse for anomalies
flags : string, required
These flags are used to select the peer group for each stream. The flags must have the following format: topic=[topic name],topictype=[numeric or string],threshnumber=[a number between 0 and 10000, e.g. 200],lag=[a number between 1 and 20, e.g. 5],zthresh=[a number between 1 and 5, e.g. 2.5],influence=[a number between 0 and 1, e.g. 0.5]
threshnumber: decimal number used to determine usual behaviour, for numeric streams only. Numbers are compared to the centroid, a standardized distance is taken, and all numbers below threshnumber are deemed usual, e.g. with threshnumber=200 any value below it is close to the centroid. You need to experiment with this number.
lag: number of lags for the moving mean window, which works to smooth the function, e.g. lag=5
zthresh: number of standard deviations from the moving mean, e.g. 3.5
influence: strength in identifying outliers for both stationary and non-stationary data. influence=0 ignores outliers when recalculating the new threshold; influence=1 is least robust. Influence should be between 0 and 1, e.g. influence=0.5
Flags must be provided for each topic. Separate multiple flags by ~
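Since the flags argument is a single string with one comma-separated entry per topic, joined by ~, it can be convenient to assemble it programmatically. The sketch below does this and checks the documented ranges. The helper names `build_flag` and `build_flags` and the stream names are hypothetical, and only numeric streams are shown because the text does not say which fields a string stream needs.

```python
def build_flag(topic, topictype, threshnumber, lag, zthresh, influence):
    """Format the flag entry for one topic, checking the documented ranges."""
    if topictype not in ("numeric", "string"):
        raise ValueError("topictype must be 'numeric' or 'string'")
    if not 0 <= threshnumber <= 10000:
        raise ValueError("threshnumber must be between 0 and 10000")
    if not 1 <= lag <= 20:
        raise ValueError("lag must be between 1 and 20")
    if not 1 <= zthresh <= 5:
        raise ValueError("zthresh must be between 1 and 5")
    if not 0 <= influence <= 1:
        raise ValueError("influence must be between 0 and 1")
    return (f"topic={topic},topictype={topictype},threshnumber={threshnumber},"
            f"lag={lag},zthresh={zthresh},influence={influence}")


def build_flags(specs):
    """Join one flag entry per topic with '~', as the flags parameter requires."""
    return "~".join(build_flag(**spec) for spec in specs)


# Hypothetical streams; the names must match those in streamstoanalyse.
flags = build_flags([
    {"topic": "temperature", "topictype": "numeric", "threshnumber": 200,
     "lag": 5, "zthresh": 2.5, "influence": 0.5},
    {"topic": "humidity", "topictype": "numeric", "threshnumber": 150,
     "lag": 5, "zthresh": 3.5, "influence": 0.5},
])
print(flags)
```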
companyname : string, required
Your company name
consumerid: string, required
Consumerid associated with the topic to consume from
producerid: string, required
Producerid associated with the topic to produce to
hpdehost: string, required
Address of HPDE
viperconfigfile : string, required
Full path to VIPER.ENV configuration file on server.
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise set to 0 for plaintext.
partition: int, optional
Partition used by Kafka to store data. NOTE: Kafka will dynamically store data in partitions. Unless you know the partition for sure, you should use the default of -1 to let VIPER determine where your data is.
hpdeport: int, required
Port number HPDE is listening on
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
delay : int, optional
Delay, in milliseconds, to wait for Kafka to respond.
timeout : int, optional
Number of seconds that VIPER waits when trying to make a connection to HPDE.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the peer groups for all the streams.
- **24.1 maadstml.viperanomalytrainbatch(vipertoken,host,port,consumefrom,produceto,producepeergroupto,produceridpeergroup,consumeridproduceto,streamstoanalyse,companyname,consumerid,producerid,flags,hpdehost,viperconfigfile,enabletls=1,partition=-1,hpdeport=-999,topicid="-999",maintopic='',rollbackoffsets=0,fullpathtotrainingdata='',brokerhost='',brokerport=9092,delay=1000,timeout=120,microserviceid='',timedelay=0,asynctimeout=120)**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
asynctimeout : int, optional
This is the timeout in seconds for the Python library async function.
timedelay : int, optional
Timedelay is in SECONDS. Because batch runs continuously in the background, this will cause VIPER to pause timedelay seconds when reading and writing to Kafka. For example, if the raw data is being generated every 3600 seconds, it may make sense to set timedelay=3600.
consumefrom : string, required
Topic to consume from in the Kafka broker
produceto : string, required
Topic to produce results of the prediction to
topicid : string, required
Topicid represents an id for some entity. For example, if you have 1000 IoT devices, you can perform anomaly detection/predictions for each device. Separate multiple topicids by a comma.
maintopic : string, optional
This is the maintopic that contains the subtopic streams.
rollbackoffsets: int, optional
This is the percentage to rollback the streams that you are analysing: streamstoanalyse
fullpathtotrainingdata: string, optional
This is the full path to the training dataset to use to find peer groups.
producepeergroupto : string, required
Topic to produce the peer group for anomaly comparisons
produceridpeergroup : string, required
Producerid for the peer group topic
consumeridproduceto : string, required
Consumer id for the Produceto topic
streamstoanalyse : string, required
Comma separated list of streams to analyse for anomalies
flags : string, required
These flags are used to select the peer group for each stream. The flags must have the following format: topic=[topic name],topictype=[numeric or string],threshnumber=[a number between 0 and 10000, e.g. 200],lag=[a number between 1 and 20, e.g. 5],zthresh=[a number between 1 and 5, e.g. 2.5],influence=[a number between 0 and 1, e.g. 0.5]
threshnumber: decimal number used to determine usual behaviour, for numeric streams only. Numbers are compared to the centroid, a standardized distance is taken, and all numbers below threshnumber are deemed usual, e.g. with threshnumber=200 any value below it is close to the centroid. You need to experiment with this number.
lag: number of lags for the moving mean window, which works to smooth the function, e.g. lag=5
zthresh: number of standard deviations from the moving mean, e.g. 3.5
influence: strength in identifying outliers for both stationary and non-stationary data. influence=0 ignores outliers when recalculating the new threshold; influence=1 is least robust. Influence should be between 0 and 1, e.g. influence=0.5
Flags must be provided for each topic. Separate multiple flags by ~
companyname : string, required
Your company name
consumerid: string, required
Consumerid associated with the topic to consume from
producerid: string, required
Producerid associated with the topic to produce to
hpdehost: string, required
Address of HPDE
viperconfigfile : string, required
Full path to VIPER.ENV configuration file on server.
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise set to 0 for plaintext.
partition: int, optional
Partition used by kafka to store data. NOTE: Kafka will dynamically store data in partitions. Unless you know for sure the partition, you should use the default of -1 to let VIPER determine where your data is.
hpdeport: int, required
Port number HPDE is listening on
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
delay : int, optional
delay parameter to wait for Kafka to respond - in milliseconds.
timeout : int, optional
Number of seconds that VIPER waits when trying to make a connection to HPDE.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the peer groups for all the streams.
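The flag string and the roles of lag, zthresh and influence can be illustrated locally. The sketch below parses a flag string in the documented format and runs a smoothed z-score detector over a list of numbers. The source does not state which algorithm VIPER applies; the smoothed z-score method is assumed here only because its parameters carry the same names. `parse_flags` and `smoothed_zscore` are hypothetical helpers and are not part of maadstml.

```python
from statistics import fmean, pstdev


def parse_flags(flags):
    """Split 'a=1,b=2~a=3,b=4' into one dict per topic (hypothetical helper)."""
    parsed = []
    for group in flags.split("~"):
        pairs = (item.split("=", 1) for item in group.split(",") if item.strip())
        parsed.append({key.strip(): value.strip() for key, value in pairs})
    return parsed


def smoothed_zscore(values, lag, zthresh, influence):
    """Return +1, -1 or 0 per point using the smoothed z-score method.

    Assumption: a common algorithm whose parameters match the flag names.
    It is not taken from VIPER's source.
    """
    signals = [0] * len(values)
    filtered = list(values)
    for i in range(lag, len(values)):
        window = filtered[i - lag:i]
        mean, std = fmean(window), pstdev(window)
        if abs(values[i] - mean) > zthresh * std:
            signals[i] = 1 if values[i] > mean else -1
            # influence damps how much an outlier shifts the moving window
            filtered[i] = influence * values[i] + (1 - influence) * filtered[i - 1]
    return signals


flags = "topic=sensor1,topictype=numeric,threshnumber=200,lag=5,zthresh=2.5,influence=0.5"
cfg = parse_flags(flags)[0]
data = [1, 1, 1.1, 1, 0.9, 1, 1, 1.1, 1, 0.9, 1, 5, 1, 1]
signals = smoothed_zscore(
    data, int(cfg["lag"]), float(cfg["zthresh"]), float(cfg["influence"])
)
print(signals)
```

With these inputs only the value 5 is flagged. Setting influence closer to 0 keeps a flagged value from widening the moving window, which matches the description of influence above.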
- **25. maadstml.viperanomalypredict(vipertoken,host,port,consumefrom,produceto,consumeinputstream,produceinputstreamtest,produceridinputstreamtest,
streamstoanalyse,consumeridinputstream,companyname,consumerid,producerid,flags,hpdehost,viperconfigfile, enabletls=1,partition=-1,hpdeport=-999,topicid=-999,maintopic='',rollbackoffsets=0,fullpathtopeergroupdata='',
brokerhost='',brokerport=9092,delay=1000,timeout=120,microserviceid='')**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
consumefrom : string, required
Topic to consume from in the Kafka broker
produceto : string, required
Topic to produce results of the prediction to
consumeinputstream : string, required
Topic of the input stream to test for anomalies
produceinputstreamtest : string, required
Topic to store the input stream data for analysis
produceridinputstreamtest : string, required
Producer id for the produceinputstreamtest topic
streamstoanalyse : string, required
Comma separated list of streams to analyse for anomalies
flags : string, required
These are flags that will be used to select the peer group for each stream. The flags must have the following format: riskscore=[a number between 0 and 1]~complete=[and, or, pvalue, e.g. p50 means over 50% of the streams have an anomaly]~type=[and, or: this determines what logic to apply to v and sc],topic=[topic name],topictype=[numeric or string],v=[v>some value, v<some value, or valueany],sc=[sc>some number, sc<some number: this is the score for the anomaly test]
If using strings, then specify the flags: type=[and,or],topic=[topic name],topictype=string,stringcontains=[0 or 1: 1 will do a substring test, 0 will equate the strings],v2=[any text you want to test, use | for OR or ^ for AND],sc=[score value, sc<some value, sc>some value]
riskscore: the riskscore threshold. A decimal number between 0 and 1; use this as a threshold to flag anomalies.
complete: if using multiple streams, this tests each stream's computed risk score against riskscore and combines the results with AND or OR. With AND, the risk scores are averaged. Otherwise, it returns if at least one stream exceeds the riskscore.
type: AND or OR. If using v or sc, this applies the appropriate logic between v and sc. For example, if type=or, then VIPER will check whether a test value is less than or greater than v, OR the standardized value is less than or greater than sc.
sc: a standardized variance between the peer group value and the test value.
v: a user-chosen value that can be used to test for a particular value. For example, if you want to flag values less than 0, then choose v<0 and VIPER will flag them as anomalous.
v2: if analysing string streams, v2 can be the strings you want to check for. For example, to check for two strings, Failed and Attempt Failed, set v2=Failed^Attempt Failed, where ^ tells VIPER to perform an AND operation. If either may exist, set v2=Failed|Attempt Failed, where | tells VIPER to perform an OR operation.
stringcontains: if using string streams and you want to flag a particular text value, then with stringcontains=1 VIPER will test for substrings; otherwise it will equate the strings.
Flags must be provided for each topic. Separate multiple flags by ~
consumeridinputstream : string, required
Consumer id of the input stream topic: consumeinputstream
companyname : string, required
Your company name
consumerid: string, required
Consumerid associated with the topic to consume from
producerid: string, required
Producerid associated with the topic to produce to
hpdehost: string, required
Address of HPDE
viperconfigfile : string, required
Full path to VIPER.ENV configuration file on server.
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise set to 0 for plaintext.
partition: int, optional
Partition used by kafka to store data. NOTE: Kafka will dynamically store data in partitions. Unless you know for sure the partition, you should use the default of -1 to let VIPER determine where your data is.
hpdeport: int, required
Port number HPDE is listening on
topicid : int, optional
Topicid represents an id for some entity. For example, if you have 1000 IoT devices, you can perform anomaly prediction for each device.
maintopic : string, optional
This is the maintopic that contains the subtopic streams.
rollbackoffsets: int, optional
This is the percentage to rollback the streams that you are analysing: streamstoanalyse
fullpathtopeergroupdata: string, optional
This is the full path to the peer group you found in viperanomalytrain; this will be used for anomaly detection.
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
delay : int, optional
delay parameter to wait for Kafka to respond - in milliseconds.
timeout : int, optional
Number of seconds that VIPER waits when trying to make a connection to HPDE.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the peer groups for all the streams.
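Assembling the flags string by hand is error-prone, so a small builder can help. The sketch below follows the numeric format documented above: riskscore and complete come first, then one comma-separated group per topic, all joined by ~. `topic_flag` and `predict_flags` are hypothetical helpers, not maadstml functions, and writing the tests as `v=v<0` and `sc=sc>2` is one reading of the format template that should be checked against a working deployment.

```python
def topic_flag(topic, topictype="numeric", logic="and", v=None, sc=None):
    """Build one per-topic group: type=...,topic=...,topictype=...,v=...,sc=..."""
    if logic not in ("and", "or"):
        raise ValueError("type must be 'and' or 'or'")
    parts = [f"type={logic}", f"topic={topic}", f"topictype={topictype}"]
    if v is not None:
        parts.append(f"v={v}")
    if sc is not None:
        parts.append(f"sc={sc}")
    return ",".join(parts)


def predict_flags(riskscore, complete, topic_flags):
    """Join the global settings and the per-topic groups with ~."""
    if not 0 <= riskscore <= 1:
        raise ValueError("riskscore must be between 0 and 1")
    return "~".join([f"riskscore={riskscore}", f"complete={complete}", *topic_flags])


flags = predict_flags(0.7, "and", [
    topic_flag("temperature", logic="or", v="v<0", sc="sc>2"),
    topic_flag("pressure", v="v>100"),
])
print(flags)
```

The resulting string can then be passed as the flags argument. The topic names temperature and pressure are placeholders.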
- **25.1 maadstml.viperanomalypredictbatch(vipertoken,host,port,consumefrom,produceto,consumeinputstream,produceinputstreamtest,produceridinputstreamtest,
streamstoanalyse,consumeridinputstream,companyname,consumerid,producerid,flags,hpdehost,viperconfigfile, enabletls=1,partition=-1,hpdeport=-999,topicid="-999",maintopic='',rollbackoffsets=0,fullpathtopeergroupdata='',
brokerhost='',brokerport=9092,delay=1000,timeout=120,microserviceid='',timedelay=0,asynctimeout=120)**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
asynctimeout : int, optional
This is the timeout in seconds for the Python library async function.
timedelay : int, optional
Timedelay is in SECONDS. Because batch runs continuously in the background, this will cause Viper to pause timedelay seconds when reading and writing to Kafka. For example, if the raw data is being generated every 3600 seconds, it may make sense to set timedelay=3600
consumefrom : string, required
Topic to consume from in the Kafka broker
produceto : string, required
Topic to produce results of the prediction to
consumeinputstream : string, required
Topic of the input stream to test for anomalies
produceinputstreamtest : string, required
Topic to store the input stream data for analysis
produceridinputstreamtest : string, required
Producer id for the produceinputstreamtest topic
streamstoanalyse : string, required
Comma separated list of streams to analyse for anomalies
flags : string, required
These are flags that will be used to select the peer group for each stream. The flags must have the following format: riskscore=[a number between 0 and 1]~complete=[and, or, pvalue, e.g. p50 means over 50% of the streams have an anomaly]~type=[and, or: this determines what logic to apply to v and sc],topic=[topic name],topictype=[numeric or string],v=[v>some value, v<some value, or valueany],sc=[sc>some number, sc<some number: this is the score for the anomaly test]
If using strings, then specify the flags: type=[and,or],topic=[topic name],topictype=string,stringcontains=[0 or 1: 1 will do a substring test, 0 will equate the strings],v2=[any text you want to test, use | for OR or ^ for AND],sc=[score value, sc<some value, sc>some value]
riskscore: the riskscore threshold. A decimal number between 0 and 1; use this as a threshold to flag anomalies.
complete: if using multiple streams, this tests each stream's computed risk score against riskscore and combines the results with AND or OR. With AND, the risk scores are averaged. Otherwise, it returns if at least one stream exceeds the riskscore.
type: AND or OR. If using v or sc, this applies the appropriate logic between v and sc. For example, if type=or, then VIPER will check whether a test value is less than or greater than v, OR the standardized value is less than or greater than sc.
sc: a standardized variance between the peer group value and the test value.
v: a user-chosen value that can be used to test for a particular value. For example, if you want to flag values less than 0, then choose v<0 and VIPER will flag them as anomalous.
v2: if analysing string streams, v2 can be the strings you want to check for. For example, to check for two strings, Failed and Attempt Failed, set v2=Failed^Attempt Failed, where ^ tells VIPER to perform an AND operation. If either may exist, set v2=Failed|Attempt Failed, where | tells VIPER to perform an OR operation.
stringcontains: if using string streams and you want to flag a particular text value, then with stringcontains=1 VIPER will test for substrings; otherwise it will equate the strings.
Flags must be provided for each topic. Separate multiple flags by ~
consumeridinputstream : string, required
Consumer id of the input stream topic: consumeinputstream
companyname : string, required
Your company name
consumerid: string, required
Consumerid associated with the topic to consume from
producerid: string, required
Producerid associated with the topic to produce to
hpdehost: string, required
Address of HPDE
viperconfigfile : string, required
Full path to VIPER.ENV configuration file on server.
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise set to 0 for plaintext.
partition: int, optional
Partition used by kafka to store data. NOTE: Kafka will dynamically store data in partitions. Unless you know for sure the partition, you should use the default of -1 to let VIPER determine where your data is.
hpdeport: int, required
Port number HPDE is listening on
topicid : string, required
Topicid represents an id for some entity. For example, if you have 1000 IoT devices, you can perform anomaly prediction for each device. Separate multiple topic ids by a comma.
maintopic : string, optional
This is the maintopic that contains the subtopic streams.
rollbackoffsets: int, optional
This is the percentage to rollback the streams that you are analysing: streamstoanalyse
fullpathtopeergroupdata: string, optional
This is the full path to the peer group you found in viperanomalytrain; this will be used for anomaly detection.
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
delay : int, optional
delay parameter to wait for Kafka to respond - in milliseconds.
timeout : int, optional
Number of seconds that VIPER waits when trying to make a connection to HPDE.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: Returns a JSON object of the peer groups for all the streams.
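The string-stream options v2 and stringcontains can be illustrated with a local check. The sketch below mirrors the documented meaning: ^ requires every term to match, | requires at least one, and stringcontains selects substring or exact comparison. `v2_matches` is a hypothetical helper written for illustration; VIPER's own evaluation may differ, and expressions mixing ^ and | are not covered by the source, so the sketch treats any expression containing ^ as AND only.

```python
def v2_matches(text, v2, stringcontains=1):
    """Evaluate a v2 expression against one string value (illustrative only)."""
    def hit(term):
        # stringcontains=1 -> substring test, otherwise exact equality
        return term in text if stringcontains == 1 else term == text

    if "^" in v2:
        # ^ means every term must match
        return all(hit(term) for term in v2.split("^"))
    # | means at least one term must match
    return any(hit(term) for term in v2.split("|"))


print(v2_matches("Login Attempt Failed", "Failed^Attempt Failed"))
print(v2_matches("Login Failed", "Failed^Attempt Failed"))
print(v2_matches("Login Failed", "Failed|Attempt Failed"))
print(v2_matches("Login Failed", "Failed|Attempt Failed", stringcontains=0))
```

The first and third calls match, the second fails because "Attempt Failed" is absent, and the fourth fails because exact comparison is requested.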
- **26. maadstml.viperpreprocessproducetotopicstream(VIPERTOKEN,host,port,topic,producerid,offset,maxrows=0,enabletls=0,delay=100,
brokerhost='',brokerport=-999,microserviceid='',topicid=-999,streamstojoin='',preprocesslogic='',
preprocessconditions='',identifier='',preprocesstopic='',array=0,saveasarray=0,rawdataoutput=0)**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topics to produce to in the Kafka broker. This is a topic that contains multiple topics; VIPER will consume from each topic and write the aggregated results back to this stream.
array : int, optional
Set array=1 if you produced data (from viperproducetotopic) as an array.
rawdataoutput : int, optional
Set rawdataoutput=1 and the raw data used for preprocessing will be added to the output json.
preprocessconditions : string, optional
You can set conditions on aggregate functions: MIN, MAX, AVG, COUNT, COUNTSTR, DIFF, DIFFMARGIN, SUM, MEDIAN, VARIANCE, OUTLIERS, OUTLIERSX-Y, VARIED, ANOMPROB, ANOMPROBX-Y, CONSISTENCY, ENTROPY, AUTOCORR, TREND, IQR (InterQuartileRange), Midhinge, GM (Geometric mean), HM (Harmonic mean), Trimean, CV (Coefficient of Variation), Mad (Mean absolute deviation), Skewness, Kurtosis, Spikedetect, Unique, Uniquestr, Timediff, Avgtimediff, Geodiff.
Timediff returns the difference in seconds between the first and last date/time. Time should be in this layout: 2006-01-02T15:04:05
Avgtimediff returns the average time in seconds between consecutive dates.
Spikedetect uses a Z-score method to detect spikes in the data using a lag of 5, a standard deviation of 3.5 from the mean, and an influence of 0.5.
Geodiff returns the distance in kilometers between two lat/long points.
Unique checks numeric data for duplication. Returns 1 if no data duplication (unique), 0 otherwise.
Dataage_[UTC offset]_[timetype], dataage can be used to check the last update time of the data in the data stream from current local time. You can specify the UTC offset to adjust the current time to match the timezone of the data stream. You can specify timetype as millisecond, second, minute, hour, day. For example, if Dataage_1_minute, then this processtype will compare the last timestamp in the data stream, to the local UTC time offset +1 and compute the time difference between the data stream timestamp and current local time and return the difference in minutes. This is a very powerful processtype for data quality and data assurance programs for any number of data streams.
Uniquestr Checks string data for duplication. Returns 1 if no data duplication (unique), 0 otherwise.
Uniquecount Checks numeric data for duplication. Returns count of unique numbers.
Uniquestrcount Checks string data for duplication. Returns count of unique strings.
CONSISTENCY checks if the data all have consistent data types. Returns 1 for consistent data types, 0 otherwise.
Meanci95 or Meanci99 - returns a 95% or 99% confidence interval: mean, low, high
RAW for no processing.
ANOMPROB (Anomaly Probability) runs several algorithms on the data stream window to determine a probability of anomalous behaviour. This can be cross-referenced with OUTLIERS. It can be a very powerful way to detect issues with devices. VARIED determines whether the values in the window are all the same or varied: it returns 1 for varied, 0 if the values are all the same. This is useful if you want to know if something changed in the stream.
ANOMPROBX-Y (similar to OUTLIERSX-Y), where X and Y are numbers or “n”; “n” means examine all anomalies for patterns. These allow you to check whether the anomalies in the streams are truly anomalies and not some pattern. For example, if an IoT device shuts off and turns on again routinely, this may be picked up as an anomaly when in fact it is normal behaviour. To ignore these cases, ANOMPROB2-5 tells VIPER to check anomalies with patterns of 2-5 peaks. If the stream has two classes, such as 0 and 1000, that show a pattern, then they should not be considered an anomaly: class=0 is the device shutting down, and class=1000 is the device turning back on. With ANOMPROB3-10, VIPER will check for patterns of classes 3 to 10 to see if they recur routinely. This is very helpful to reduce false positives and false negatives.
For example, preprocessconditions='humidity=55,60:temperature=34,n' and preprocesslogic='max,count' means: get the MAX of the values in humidity if humidity is between [55,60], and COUNT the values in temperature if temperature >= 34.
preprocesstopic : string, optional
You can specify a topic for the preprocessed message. VIPER will automatically dump the preprocessed results to this topic.
identifier : string, optional
Add any identifier like DSN ID.
producerid : string, required
Producerid of the topic producing to
offset : int, optional
If 0 will use the stream data from the beginning of the topics, -1 will automatically go to last offset
saveasarray : int, optional
Set to 1, to save the preprocessed jsons as a json array. This is very helpful if you want to do machine learning or further query the preprocessed json because each processed json are time synchronized. For example, if you want to compare different preprocessed streams the date/time of the data is synchronized to give you impacts of one stream on another.
maxrows : int, optional
If offset=-1, this number will rollback the streams by maxrows amount i.e. rollback=lastoffset-maxrows
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise 0 for plaintext
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
topicid : int, optional
This represents the IoT device number or any entity
streamstojoin : string, optional
If you entered topicid, you need to enter the streams you want to pre-process
preprocesslogic : string, optional
Here you need to specify how you want to pre-process the streams. You can perform the following operations: MAX, MIN, AVG, COUNT, COUNTSTR, SUM, DIFF, DIFFMARGIN, VARIANCE, MEDIAN, OUTLIERS, OUTLIERSX-Y, VARIED, ANOMPROB, ANOMPROBX-Y, ENTROPY, AUTOCORR, TREND, CONSISTENCY, Unique, Uniquestr, Geodiff (returns distance in kilometers between two lat/long points), IQR (InterQuartileRange), Midhinge, GM (Geometric mean), HM (Harmonic mean), Trimean, CV (Coefficient of Variation), Mad (Mean absolute deviation), Skewness, Kurtosis, Spikedetect, Timediff, Avgtimediff, Uniquecount.
Timediff returns the difference in seconds between the first and last date/time. Time should be in this layout: 2006-01-02T15:04:05
Avgtimediff returns the average time in seconds between consecutive dates.
Uniquecount checks numeric data for duplication. Returns the count of unique numbers.
Dataage_[UTC offset]_[timetype], dataage can be used to check the last update time of the data in the data stream from current local time. You can specify the UTC offset to adjust the current time to match the timezone of the data stream. You can specify timetype as millisecond, second, minute, hour, day. For example, if Dataage_1_minute, then this processtype will compare the last timestamp in the data stream, to the local UTC time offset +1 and compute the time difference between the data stream timestamp and current local time and return the difference in minutes. This is a very powerful processtype for data quality and data assurance programs for any number of data streams.
Uniquestrcount Checks string data for duplication. Returns count of unique strings.
Meanci95 or Meanci99 - returns a 95% or 99% confidence interval: mean, low, high
RAW for no processing.
Spikedetect uses a Zscore method to detect spikes in the data using lag of 5, StD of 3.5 from mean and influence of 0.5.
The order of the operations must match the order of the streams. If you specified topicid, you can perform TML on the new preprocessed stream by appending: _preprocessed_processlogic. For example, if streamstojoin="stream1,stream2,stream3" and preprocesslogic="min,max,diff", the new streams will be: stream1_preprocessed_Min, stream2_preprocessed_Max, stream3_preprocessed_Diff.
RETURNS: Returns preprocessed JSON.
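How streamstojoin, preprocesslogic and preprocessconditions pair up can be sketched locally without a running VIPER instance. The example below applies the documented rules to in-memory lists: each stream is matched with the operation in the same position, an optional condition keeps only values inside its bounds (with n meaning no bound), and the result is stored under the documented `_preprocessed_` name. The helper names and the small operation table are assumptions for illustration; inclusive bounds are inferred from the humidity and temperature example above, and only five of the documented operations are covered.

```python
from statistics import fmean

# Only a few of the documented operations are sketched here.
OPERATIONS = {"max": max, "min": min, "avg": fmean, "sum": sum, "count": len}


def parse_conditions(preprocessconditions):
    """'humidity=55,60:temperature=34,n' -> {'humidity': (55.0, 60.0), ...}."""
    bounds = {}
    for cond in filter(None, preprocessconditions.split(":")):
        name, limits = cond.split("=", 1)
        low, high = (None if x.strip().lower() == "n" else float(x)
                     for x in limits.split(","))
        bounds[name.strip()] = (low, high)
    return bounds


def preprocess(streams, streamstojoin, preprocesslogic, preprocessconditions=""):
    """Apply one operation per stream, in order (hypothetical local helper)."""
    bounds = parse_conditions(preprocessconditions)
    results = {}
    # The order of the operations follows the order of the streams.
    for name, op in zip(streamstojoin.split(","), preprocesslogic.split(",")):
        low, high = bounds.get(name, (None, None))
        kept = [x for x in streams[name]
                if (low is None or x >= low) and (high is None or x <= high)]
        value = OPERATIONS[op.lower()](kept) if kept or op.lower() == "count" else None
        results[f"{name}_preprocessed_{op.capitalize()}"] = value
    return results


streams = {"humidity": [50, 56, 59, 61], "temperature": [30, 34, 36]}
result = preprocess(
    streams,
    streamstojoin="humidity,temperature",
    preprocesslogic="max,count",
    preprocessconditions="humidity=55,60:temperature=34,n",
)
print(result)
```

Here humidity keeps 56 and 59, so its maximum is 59, and temperature keeps 34 and 36, so its count is 2.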
- **27. maadstml.areyoubusy(host,port)**
Parameters:
host : string, required
You can get the host by determining all the hosts that are listening on your machine. Use this code: https://github.com/smaurice101/transactionalmachinelearning/blob/main/checkopenports
port : int, required
You can get the port by determining all the ports that are listening on your machine. Use this code: https://github.com/smaurice101/transactionalmachinelearning/blob/main/checkopenports
RETURNS: Returns a list of available VIPER and HPDE with their HOST and PORT.
- **28. maadstml.viperstreamquery(VIPERTOKEN,host,port,topic,producerid,offset=-1,maxrows=0,enabletls=1,delay=100,brokerhost='',
brokerport=-999,microserviceid='',topicid=-999,streamstojoin='',preprocessconditions='', identifier='',preprocesstopic='',description='',array=0)**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, required
Topics to produce to in the Kafka broker. This is a topic that contains multiple topics; VIPER will consume from each topic and write the aggregated results back to this stream.
producerid : string, required
Producer id of topic
offset : int, optional
If 0 will use the stream data from the beginning of the topics, -1 will automatically go to last offset
maxrows : int, optional
If offset=-1, this number will rollback the streams by maxrows amount i.e. rollback=lastoffset-maxrows
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise 0 for plaintext
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
topicid : int, optional
This represents the IoT device number or any entity
streamstojoin : string, required
Identify multiple streams to join, separate by comma. For example, if you preprocessed Power, Current, Voltage:
streamstojoin="Power_preprocessed_Avg,Current_preprocessed_Min,Voltage_preprocessed_Avg,Current_preprocessed_Trend"
preprocessconditions : string, required
You apply strict conditions to a MAX of 3 streams. You can use >, <, =, AND, OR. You can add as many conditions as you like. Separate multiple conditions by semi-colon. You cannot mix AND and OR. For example,
preprocessconditions='Power_preprocessed_Avg > 139000:Power_preprocessed_Avg < 1000 or Voltage_preprocessed_Avg > 120000 or Current_preprocessed_Min=0:Voltage_preprocessed_Avg > 120000 and Current_preprocessed_Trend>0'
identifier: string, optional
Add an identifier text to the result. This is a label, and useful if you want to identify the result for some IOT device.
preprocesstopic : string, optional
The topic to produce the query results to.
description : string, optional
You can give each query condition a description. Separate multiple descriptions by semi-colon.
array : int, optional
Set to 1 if you are reading a JSON ARRAY, otherwise 0.
RETURNS: 1 if the condition is TRUE (condition met), 0 if false (condition not met)
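The constraints on preprocessconditions can be checked before a query is sent. The sketch below validates each condition against two documented rules (no mixing of AND and OR, and at most 3 streams, read here as a per-condition limit) and then joins the pieces. Note that the prose above says conditions are separated by semi-colon while the example uses a colon, so both separators are parameters here and should be confirmed against a working deployment. `check_condition` and `build_query` are hypothetical helpers, not maadstml functions.

```python
import re


def check_condition(condition, streams):
    """Validate one query condition against the documented constraints."""
    logic = {w.lower() for w in re.findall(r"\b(and|or)\b", condition, re.IGNORECASE)}
    if logic == {"and", "or"}:
        raise ValueError(f"cannot mix AND and OR: {condition!r}")
    used = {s for s in streams if re.search(rf"\b{re.escape(s)}\b", condition)}
    if not used:
        raise ValueError(f"no known stream in: {condition!r}")
    if len(used) > 3:
        raise ValueError(f"more than 3 streams in: {condition!r}")
    return condition.strip()


def build_query(streams, conditions, descriptions,
                condition_sep=":", description_sep=";"):
    """Return the three string arguments as a dict (illustrative only)."""
    if len(conditions) != len(descriptions):
        raise ValueError("one description per condition is expected")
    return {
        "streamstojoin": ",".join(streams),
        "preprocessconditions": condition_sep.join(
            check_condition(c, streams) for c in conditions),
        "description": description_sep.join(descriptions),
    }


streams = ["Power_preprocessed_Avg", "Current_preprocessed_Min",
           "Voltage_preprocessed_Avg", "Current_preprocessed_Trend"]
conditions = [
    "Power_preprocessed_Avg > 139000",
    "Power_preprocessed_Avg < 1000 or Voltage_preprocessed_Avg > 120000 "
    "or Current_preprocessed_Min=0",
    "Voltage_preprocessed_Avg > 120000 and Current_preprocessed_Trend>0",
]
query = build_query(streams, conditions,
                    ["high power", "low power or fault", "overvoltage rising"])
print(query["preprocessconditions"])
```

With the default separators the conditions string reproduces the example given above; the descriptions are placeholders.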
- **28.1 maadstml.viperstreamquerybatch(VIPERTOKEN,host,port,topic,producerid,offset=-1,maxrows=0,enabletls=1,delay=100,brokerhost='',
brokerport=-999,microserviceid='',topicid="-999",streamstojoin='',preprocessconditions='', identifier='',preprocesstopic='',description='',array=0,timedelay=0,asynctimeout=120)**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
asynctimeout : int, optional
This is the timeout in seconds for the Python library async function.
timedelay : int, optional
Timedelay is in SECONDS. Because batch runs continuously in the background, this will cause Viper to pause timedelay seconds when reading and writing to Kafka. For example, if the raw data is being generated every 3600 seconds, it may make sense to set timedelay=3600
topic : string, required
Topics to produce to in the Kafka broker. This is a topic that contains multiple topics; VIPER will consume from each topic and write the aggregated results back to this stream.
producerid : string, required
Producer id of topic
offset : int, optional
If 0 will use the stream data from the beginning of the topics, -1 will automatically go to last offset
maxrows : int, optional
If offset=-1, this number will rollback the streams by maxrows amount i.e. rollback=lastoffset-maxrows
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise 0 for plaintext
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
topicid : string, required
This represents the IoT device number or any entity. Separate multiple topic ids by a comma.
streamstojoin : string, required
Identify multiple streams to join, separate by comma. For example, if you preprocessed Power, Current, Voltage:
streamstojoin="Power_preprocessed_Avg,Current_preprocessed_Min,Voltage_preprocessed_Avg,Current_preprocessed_Trend"
preprocessconditions : string, required
You apply strict conditions to a MAX of 3 streams. You can use >, <, =, AND, OR. You can add as many conditions as you like. Separate multiple conditions by semi-colon. You cannot mix AND and OR. For example,
preprocessconditions='Power_preprocessed_Avg > 139000:Power_preprocessed_Avg < 1000 or Voltage_preprocessed_Avg > 120000 or Current_preprocessed_Min=0:Voltage_preprocessed_Avg > 120000 and Current_preprocessed_Trend>0'
identifier: string, optional
Add an identifier text to the result. This is a label, and useful if you want to identify the result for some IOT device.
preprocesstopic : string, optional
The topic to produce the query results to.
description : string, optional
You can give each query condition a description. Separate multiple descriptions by semi-colon.
array : int, optional
Set to 1 if you are reading a JSON ARRAY, otherwise 0.
RETURNS: 1 if the condition is TRUE (condition met), 0 if false (condition not met)
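The 1/0 result can be reproduced locally to test a condition before deploying it. The sketch below evaluates a single condition (comparisons joined by AND or by OR, never both) against a dictionary of stream values and returns 1 when the condition is met, 0 otherwise. `evaluate` is a hypothetical helper for illustration, the readings are made-up values, and how VIPER itself parses conditions or reports results per topic id is not stated in the source.

```python
import operator
import re

OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}
COMPARISON = re.compile(r"^\s*(\w+)\s*([<>=])\s*(-?\d+(?:\.\d+)?)\s*$")


def evaluate(condition, values):
    """Return 1 if the condition holds for the given stream values, else 0."""
    # A condition uses either OR or AND throughout, never a mix.
    joiner = any if re.search(r"\bor\b", condition, re.IGNORECASE) else all
    terms = re.split(r"\s+(?:and|or)\s+", condition, flags=re.IGNORECASE)
    results = []
    for term in terms:
        match = COMPARISON.match(term)
        if match is None:
            raise ValueError(f"unsupported term: {term!r}")
        name, op, number = match.groups()
        results.append(OPS[op](values[name], float(number)))
    return int(joiner(results))


# Hypothetical readings for one topic id.
readings = {
    "Power_preprocessed_Avg": 500,
    "Voltage_preprocessed_Avg": 110000,
    "Current_preprocessed_Min": 3,
    "Current_preprocessed_Trend": 1,
}
conditions = (
    "Power_preprocessed_Avg > 139000:"
    "Power_preprocessed_Avg < 1000 or Voltage_preprocessed_Avg > 120000 "
    "or Current_preprocessed_Min=0:"
    "Voltage_preprocessed_Avg > 120000 and Current_preprocessed_Trend>0"
)
outcomes = [evaluate(c, readings) for c in conditions.split(":")]
print(outcomes)
```

For these readings only the second condition is met, because the average power is below 1000. A batch call covers several topic ids, so the same check would be repeated with each id's own readings.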
- **29. maadstml.viperpreprocessbatch(VIPERTOKEN,host,port,topic,producerid,offset,maxrows=0,enabletls=0,delay=100,
brokerhost='',brokerport=-999,microserviceid='',topicid="-999",streamstojoin='',preprocesslogic='',
preprocessconditions='',identifier='',preprocesstopic='',array=0,saveasarray=0,timedelay=0,asynctimeout=120,rawdataoutput=0)**
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
asynctimeout : int, optional
This is the timeout in seconds for the Python library async function.
rawdataoutput : int, optional
Set rawdataoutput=1 to output the raw preprocessing data to the JSON.
timedelay : int, optional
Timedelay is in SECONDS. Because batch runs continuously in the background, this will cause Viper to pause timedelay seconds when reading and writing to Kafka. For example, if the raw data is being generated every 3600 seconds, it may make sense to set timedelay=3600
topic : string, required
Topics to produce to in the Kafka broker. This is a topic that contains multiple topics; VIPER will consume from each topic and write the aggregated results back to this stream.
array : int, optional
Set array=1 if you produced data (from viperproducetotopic) as an array.
preprocessconditions : string, optional
You can set conditions on aggregate functions: MIN, MAX, AVG, COUNT, COUNTSTR, DIFF, SUM, MEDIAN, VARIANCE, OUTLIERS, OUTLIERSX-Y, VARIED, ANOMPROB, ANOMPROBX-Y, ENTROPY, AUTOCORR, TREND, IQR (InterQuartileRange), Midhinge, GM (Geometric mean), HM (Harmonic mean), Trimean, CV (Coefficient of Variation), Mad (Mean absolute deviation), Skewness, Kurtosis, Spikedetect, Unique, Uniquestr, Timediff, Avgtimediff, Geodiff.
Timediff returns the difference in seconds between the first and last date/time. Time should be in this layout: 2006-01-02T15:04:05
Avgtimediff returns the average time in seconds between consecutive dates.
Spikedetect uses a Z-score method to detect spikes in the data using a lag of 5, a standard deviation of 3.5 from the mean, and an influence of 0.5.
Geodiff returns the distance in kilometers between two lat/long points.
Dataage_[UTC offset]_[timetype], dataage can be used to check the last update time of the data in the data stream from current local time. You can specify the UTC offset to adjust the current time to match the timezone of the data stream. You can specify timetype as millisecond, second, minute, hour, day. For example, if Dataage_1_minute, then this processtype will compare the last timestamp in the data stream, to the local UTC time offset +1 and compute the time difference between the data stream timestamp and current local time and return the difference in minutes. This is a very powerful processtype for data quality and data assurance programs for any number of data streams.
Unique Checks numeric data for duplication. Returns 1 if no data duplication (unique), 0 otherwise.
Uniquestr Checks string data for duplication. Returns 1 if no data duplication (unique), 0 otherwise. Uniquecount Checks numeric data for duplication. Returns count of unique numbers. Uniquestrcount Checks string data for duplication. Returns count of unique strings.
Meanci95 or Meanci99 - returns a 95% or 99% confidence interval: mean, low, high
ANOMPROB (Anomaly Probability) runs several algorithms on the data stream window to determine a probability of anomalous behaviour. This can be cross-referenced with OUTLIERS. It can be a very powerful way to detect issues with devices. VARIED determines whether the values in the window are all the same or varied: it returns 1 for varied, 0 if the values are all the same. This is useful if you want to know if something changed in the stream.
ANOMPROBX-Y (similar to OUTLIERSX-Y), where X and Y are numbers or “n”. “n” means examine all anomalies for patterns. These settings let you check whether the anomalies in the streams are truly anomalies and not part of some pattern. For example, if an IoT device shuts off and turns on again routinely, this may be picked up as an anomaly when in fact it is normal behaviour. To ignore these cases, ANOMPROB2-5 tells Viper to check anomalies for patterns of 2-5 peaks. If the stream has two classes, such as 0 and 1000, and they show a pattern, then they should not be considered an anomaly: class=0 is the device shutting down, class=1000 is the device turning back on. With ANOMPROB3-10, Viper will check for patterns of classes 3 to 10 to see if they recur routinely. This is very helpful to reduce false positives and false negatives.
For example, preprocessconditions='humidity=55,60:temperature=34,n' with preprocesslogic='max,count' means: get the MAX of the values in humidity where humidity is in [55,60], and COUNT the values in temperature where temperature >= 34.
preprocesstopic : string, optional
You can specify a topic for the preprocessed message. VIPER will automatically dump the preprocessed results to this topic.
identifier : string, optional
Add any identifier like DSN ID. Note, for multiple identifiers per topicid, you can separate by pipe “|”.
producerid : string, required
Producer ID of the topic you are producing to.
offset : int, optional
If 0, VIPER uses the stream data from the beginning of the topics; -1 automatically goes to the last offset.
saveasarray : int, optional
Set to 1 to save the preprocessed jsons as a json array. This is very helpful if you want to do machine learning or further query the preprocessed json, because the processed jsons are time synchronized. For example, if you want to compare different preprocessed streams, the date/time of the data is synchronized so you can see the impact of one stream on another.
maxrows : int, optional
If offset=-1, this number will rollback the streams by maxrows amount i.e. rollback=lastoffset-maxrows
enabletls: int, optional
Set to 1 if Kafka broker is SSL/TLS enabled for encrypted traffic, otherwise 0 for plaintext
delay: int, optional
Time in milliseconds before VIPER backs out from reading messages
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
topicid : string, required
This represents the IoT device number or any entity. You can specify multiple ids separated by a comma: topicid=”1,2,4,5”.
streamstojoin : string, optional
If you entered topicid, you need to enter the streams you want to pre-process
preprocesslogic : string, optional
Here you need to specify how you want to pre-process the streams. You can perform the following operations: MAX, MIN, AVG, COUNT, COUNTSTR, SUM, DIFF, VARIANCE, MEDIAN, OUTLIERS, OUTLIERSX-Y, VARIED, ANOMPROB, ANOMPROBX-Y, ENTROPY, AUTOCORR, TREND, IQR (InterQuartileRange), Midhinge, CONSISTENCY, GM (Geometric mean), HM (Harmonic mean), Trimean, CV (coefficient of Variation), Mad (Mean absolute deviation), Skewness, Kurtosis, Spikedetect, Unique, Uniquestr, Timediff: time should be in this layout:2006-01-02T15:04:05, Timediff returns the difference in seconds between the first date/time and last datetime. Avgtimediff returns the average time in seconds between consecutive dates. Geodiff (returns distance in Kilometers between two lat/long points). Spikedetect uses a Zscore method to detect spikes in the data using lag of 5, StD of 3.5 from mean and influence of 0.5. Uniquecount Checks numeric data for duplication. Returns count of unique numbers. Uniquestrcount Checks string data for duplication. Returns count of unique strings.
Dataage_[UTC offset]_[timetype], dataage can be used to check the last update time of the data in the data stream from current local time. You can specify the UTC offset to adjust the current time to match the timezone of the data stream. You can specify timetype as millisecond, second, minute, hour, day. For example, if Dataage_1_minute, then this processtype will compare the last timestamp in the data stream, to the local UTC time offset +1 and compute the time difference between the data stream timestamp and current local time and return the difference in minutes. This is a very powerful processtype for data quality and data assurance programs for any number of data streams.
Meanci95 or Meanci99 - returns a 95% or 99% confidence interval: mean, low, high
The order of the operations must match the order of the streams. If you specified topicid, you can perform TML on the new preprocessed streams by appending: _preprocessed_processlogic. For example, if streamstojoin=”stream1,stream2,stream3”, and preprocesslogic=”min,max,diff”, the new streams will be: stream1_preprocessed_Min, stream2_preprocessed_Max, stream3_preprocessed_Diff.
RETURNS: None.
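The string formats described above for streamstojoin, preprocesslogic, preprocessconditions and the resulting stream names can be sketched as follows. This is a minimal illustration, not part of the library: the helper names build_preprocess_args and preprocessed_stream_names are hypothetical, and the capitalisation rule for the suffix is inferred only from the Min/Max/Diff example above, so it may not hold for every operation.

```python
def build_preprocess_args(rules):
    """Build streamstojoin, preprocesslogic and preprocessconditions strings.

    `rules` is a list of (stream, operation, low, high) tuples. Use "n" for an
    open upper bound, as in the documented example 'temperature=34,n'.
    (Hypothetical helper, for illustration only.)
    """
    streams = ",".join(stream for stream, _, _, _ in rules)
    logic = ",".join(op for _, op, _, _ in rules)
    conditions = ":".join(
        f"{stream}={low},{high}" for stream, _, low, high in rules
    )
    return streams, logic, conditions


def preprocessed_stream_names(streamstojoin, preprocesslogic):
    """Derive output names of the form <stream>_preprocessed_<Operation>.

    Assumption: the suffix is the operation with its first letter capitalised,
    as in stream1_preprocessed_Min.
    """
    streams = [s.strip() for s in streamstojoin.split(",")]
    operations = [op.strip() for op in preprocesslogic.split(",")]
    if len(streams) != len(operations):
        raise ValueError("operations must match the streams in count and order")
    return [
        f"{stream}_preprocessed_{op.capitalize()}"
        for stream, op in zip(streams, operations)
    ]


streams, logic, conditions = build_preprocess_args(
    [("humidity", "max", 55, 60), ("temperature", "count", 34, "n")]
)
names = preprocessed_stream_names("stream1,stream2,stream3", "min,max,diff")
```

Building the three strings from one list of rules keeps the streams and operations in the same order, which is what the function requires.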
30. maadstml.viperlisttopics(vipertoken,host,port=-999,brokerhost=’’, brokerport=-999,microserviceid=’’)
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
brokerhost : string, optional
Address where Kafka broker is running - if none is specified, the Kafka broker address in the VIPER.ENV file will be used.
brokerport : int, optional
Port on which Kafka is listening.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: A JSON formatted object of all the topics in the Kafka broker.
31. maadstml.viperpreprocesscustomjson(VIPERTOKEN,host,port,topic,producerid,offset,jsoncriteria='',rawdataoutput=0,maxrows=0,enabletls=0,delay=100,brokerhost='',brokerport=-999,microserviceid='',topicid=-999,streamstojoin='',preprocesslogic='',preprocessconditions='',identifier='',preprocesstopic='',array=0,saveasarray=0,timedelay=0,asynctimeout=120,usemysql=0,tmlfilepath='',pathtotmlattrs='')
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
topic : string, required
Topic containing the raw data to consume.
producerid : string, required
Id of the Topic.
offset : int, required
Offset to consume from. Set to -1 if consuming the last offset of topic.
jsoncriteria : string, required
This is the JSON path to the data you want to consume. It must be in the following format:
UID is path to the main id. For example, Patient ID
filter is the path to a field used to filter the jsons
subtopic is the path to the subtopics in the json (several paths can be specified)
values is the path to the Values of the subtopics - Subtopic and Value must have 1-1 match
identifiers is the path to any special identifiers for the subtopics
datetime is the path to the datetime of the message
msgid is the path to any msg id
For example:
jsoncriteria='uid=subject.reference,filter:resourceType=Observation~subtopics=code.coding.0.code,component.0.code.coding.0.code,component.1.code.coding.0.code~values=valueQuantity.value,component.0.valueQuantity.value,component.1.valueQuantity.value~identifiers=code.coding.0.display,component.0.code.coding.0.display,component.1.code.coding.0.display~datetime=effectiveDateTime~msgid=id'
rawdataoutput : int, optional
set to 1 if you want to output the raw data. Note: This could involve a lot of data and Kafka may refuse to write to the topic.
maxrows : int, optional
Number of offsets or percentage to roll back the data stream
enabletls : int, optional
Set to 1 for TLS encrypted traffic
delay : int, optional
Delay to wait for Kafka to finish writing to topic
topicid : int, optional
Since you are consuming raw data, this is not needed. Topicid will be set for you.
streamstojoin : string, optional
This is ignored for raw data.
preprocesslogic : string, optional
Specify your preprocess algorithms. Available functions: MIN, MAX, AVG, COUNT, COUNTSTR, DIFF, DIFFMARGIN, SUM, MEDIAN, VARIANCE, OUTLIERS, OUTLIERSX-Y, VARIED, ANOMPROB, ANOMPROBX-Y, CONSISTENCY, ENTROPY, AUTOCORR, TREND, IQR (InterQuartileRange), Midhinge, GM (Geometric mean), HM (Harmonic mean), Trimean, CV (coefficient of Variation), Mad (Mean absolute deviation), Skewness, Kurtosis, Spikedetect, Unique, Uniquestr, Timediff, Avgtimediff, Geodiff. For Timediff, time should be in this layout: 2006-01-02T15:04:05; Timediff returns the difference in seconds between the first date/time and last date/time. Avgtimediff returns the average time in seconds between consecutive dates. Spikedetect uses a Zscore method to detect spikes in the data using a lag of 5, StD of 3.5 from the mean and influence of 0.5. Geodiff returns the distance in kilometers between two lat/long points. Unique checks numeric data for duplication and returns 1 if there is no data duplication (unique), 0 otherwise.
Dataage_[UTC offset]_[timetype], dataage can be used to check the last update time of the data in the data stream from current local time. You can specify the UTC offset to adjust the current time to match the timezone of the data stream. You can specify timetype as millisecond, second, minute, hour, day. For example, if Dataage_1_minute, then this processtype will compare the last timestamp in the data stream, to the local UTC time offset +1 and compute the time difference between the data stream timestamp and current local time and return the difference in minutes. This is a very powerful processtype for data quality and data assurance programs for any number of data streams.
Uniquestr Checks string data for duplication. Returns 1 if no data duplication (unique), 0 otherwise.
Uniquecount Checks numeric data for duplication. Returns count of unique numbers.
Uniquestrcount Checks string data for duplication. Returns count of unique strings.
CONSISTENCY checks if the data all have consistent data types. Returns 1 for consistent data types, 0 otherwise.
Meanci95 or Meanci99 - returns a 95% or 99% confidence interval: mean, low, high
RAW for no processing.
preprocessconditions : string, optional
Specify any preprocess conditions
identifier : string, optional
Specify any text identifier
preprocesstopic : string, optional
Specify the name of the topic to write preprocessed results.
array : int, optional
Ignored for raw data - as jsoncriteria specifies json path
saveasarray : int, optional
Set to 1 to save as json array
timedelay : int, optional
Delay to wait for response from Kafka.
asynctimeout : int, optional
Maximum delay for asyncio in Python library
usemysql : int, optional
Set to 1 to specify whether MySQL is used to store TMLIDs. This will be needed to track individual objects.
tmlfilepath : string, optional
Ignored.
pathtotmlattrs : string, optional
Specify any attributes for the TMLID. Here you can specify OEM, Latitude, Longitude, and Location JSON paths:
pathtotmlattrs=’oem=id,lat=subject.reference,long=component.0.code.coding.0.display,location=component.1.valueQuantity.value’
port : int, required
Port on which VIPER is listening.
brokerhost : string, optional
Address where Kafka broker is running - if none is specified, the Kafka broker address in the VIPER.ENV file will be used.
brokerport : int, optional
Port on which Kafka is listening.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: null
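The jsoncriteria layout described above (sections separated by "~", several paths per section separated by commas) can be assembled programmatically. The following is a minimal sketch: build_jsoncriteria is a hypothetical helper, not a library function, and the section order simply mirrors the documented example.

```python
def build_jsoncriteria(uid, filter_expr, subtopics, values, identifiers,
                       datetime_path, msgid):
    """Assemble a jsoncriteria string in the documented layout.

    Hypothetical helper: sections are joined with '~' and multiple JSON paths
    inside a section are joined with ','. Subtopics and values must pair 1-1.
    """
    if len(subtopics) != len(values):
        raise ValueError("subtopics and values must have a 1-1 match")
    return "~".join([
        f"uid={uid},filter:{filter_expr}",
        "subtopics=" + ",".join(subtopics),
        "values=" + ",".join(values),
        "identifiers=" + ",".join(identifiers),
        f"datetime={datetime_path}",
        f"msgid={msgid}",
    ])


# Rebuild the example given in the jsoncriteria description.
criteria = build_jsoncriteria(
    uid="subject.reference",
    filter_expr="resourceType=Observation",
    subtopics=["code.coding.0.code",
               "component.0.code.coding.0.code",
               "component.1.code.coding.0.code"],
    values=["valueQuantity.value",
            "component.0.valueQuantity.value",
            "component.1.valueQuantity.value"],
    identifiers=["code.coding.0.display",
                 "component.0.code.coding.0.display",
                 "component.1.code.coding.0.display"],
    datetime_path="effectiveDateTime",
    msgid="id",
)
```

The resulting string can then be passed as the jsoncriteria argument. Checking the subtopic/value pairing up front catches the most likely mistake before anything is sent to VIPER.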
32. maadstml.viperstreamcorr(vipertoken,host,port,topic,producerid,offset=-1,maxrows=0,enabletls=1,delay=100,brokerhost='',brokerport=-999,microserviceid='',topicid=-999,streamstojoin='',identifier='',preprocesstopic='',description='',array=0,wherecondition='',wheresearchkey='PreprocessIdentifier',rawdataoutput=1,threshhold=0,pvalue=0,identifierextractpos="",topcorrnum=5,jsoncriteria='',tmlfilepath='',usemysql=0,pathtotmlattrs='',mincorrvectorlen=5,writecorrstotopic='',outputtopicnames=0,nlp=0,correlationtype='',docrosscorr=0)
Parameters: Perform Stream correlations
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
topic : string, required
Topic containing the raw data to consume.
producerid : string, required
Id of the Topic.
wherecondition : string, optional
Specify the where condition. For example, if you want to filter the data on “males”, enter males. You can specify exact match by using [males], or substring by using (males), or “not” includes by using {males}
correlationtype : string, optional
Specify the type of correlation you want to do. Valid values are: kendall, spearman, pearson, ks. You can specify some, separated by commas, or all (leave blank and ALL will be done). ks=Kolmogorov-Smirnov test.
docrosscorr : int, optional
Set to 1 if you want to do cross-correlations with 4 variables, not the normal 2-variable.
wheresearchkey : string, optional
Specify the where search key. This key will be searched for “males”.
description : string, optional
Specify a text description for this correlation.
identifierextractpos : string, optional
If doing correlation on data you have already preprocessed, you can extract the identifier from the identifier field in the preprocessed json.
offset : int, required
Offset to consume from. Set to -1 if consuming the last offset of topic.
mincorrvectorlen : int, optional
Minimum length of the data variables you are correlating.
topcorrnum : int, optional
Top number of sorted correlations to output
threshhold : int, optional
Threshold for the correlation coefficient. Must range from 0-100. All correlations will be greater than this number.
pvalue : int, optional
Pvalue threshold for the p-values. Must range from 0-100. All p-values will be below this number.
writecorrstotopic : string, optional
This is the name of the topic that Viper will write “individual” correlation results to.
outputtopicnames : int, optional
Set to 1 if you want to write out topic names.
nlp : int, optional
Set to 1 if you want to correlate TEXT data by using natural language processing (NLP).
jsoncriteria : string, required
This is the JSON path to the data you want to consume. It must be in the following format:
UID is path to the main id. For example, Patient ID
filter is the path to a field used to filter the jsons
subtopic is the path to the subtopics in the json (several paths can be specified)
values is the path to the Values of the subtopics - Subtopic and Value must have 1-1 match
identifiers is the path to any special identifiers for the subtopics
datetime is the path to the datetime of the message
msgid is the path to any msg id
For example:
jsoncriteria='uid=subject.reference,filter:resourceType=Observation~subtopics=code.coding.0.code,component.0.code.coding.0.code,component.1.code.coding.0.code~values=valueQuantity.value,component.0.valueQuantity.value,component.1.valueQuantity.value~identifiers=code.coding.0.display,component.0.code.coding.0.display,component.1.code.coding.0.display~datetime=effectiveDateTime~msgid=id'
rawdataoutput : int, optional
set to 1 if you want to output the raw data. Note: This could involve a lot of data and Kafka may refuse to write to the topic.
maxrows : int, optional
Number of offsets or percentage to roll back the data stream
enabletls : int, optional
Set to 1 for TLS encrypted traffic
delay : int, optional
Delay to wait for Kafka to finish writing to topic
topicid : int, optional
Since you are consuming raw data, this is not needed. Topicid will be set for you.
streamstojoin : string, optional
This is ignored for raw data.
preprocesslogic : string, optional
Specify your preprocess algorithms. For example, min, max, variance, trend, anomprob, outliers, etc.
preprocessconditions : string, optional
Specify any preprocess conditions
identifier : string, optional
Specify any text identifier
preprocesstopic : string, optional
Specify the name of the topic to write preprocessed results.
array : int, optional
Ignored for raw data - as jsoncriteria specifies json path
saveasarray : int, optional
Set to 1 to save as json array
timedelay : int, optional
Delay to wait for response from Kafka.
asynctimeout : int, optional
Maximum delay for asyncio in Python library
usemysql : int, optional
Set to 1 to specify whether MySQL is used to store TMLIDs. This will be needed to track individual objects.
tmlfilepath : string, optional
Ignored.
pathtotmlattrs : string, optional
Specify any attributes for the TMLID. Here you can specify OEM, Latitude, Longitude, and Location JSON paths:
pathtotmlattrs=’oem=id,lat=subject.reference,long=component.0.code.coding.0.display,location=component.1.valueQuantity.value’
port : int, required
Port on which VIPER is listening.
brokerhost : string, optional
Address where Kafka broker is running - if none is specified, the Kafka broker address in the VIPER.ENV file will be used.
brokerport : int, optional
Port on which Kafka is listening.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
RETURNS: null
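The wherecondition bracket conventions and the correlationtype list described above can be prepared with two small helpers. This is a sketch under stated assumptions: where_condition and correlation_types are hypothetical names, and the only facts used are the documented bracket styles and the four valid correlation names.

```python
VALID_CORRELATIONS = ("kendall", "spearman", "pearson", "ks")

# Documented wherecondition styles: plain term, [exact], (substring), {not}.
_WRAPPERS = {
    "plain": ("", ""),
    "exact": ("[", "]"),
    "substring": ("(", ")"),
    "not": ("{", "}"),
}


def where_condition(term, mode="plain"):
    """Wrap a filter term in the brackets for the requested match mode."""
    if mode not in _WRAPPERS:
        raise ValueError(f"unknown mode: {mode}")
    opening, closing = _WRAPPERS[mode]
    return f"{opening}{term}{closing}"


def correlation_types(*names):
    """Return a comma-separated correlationtype string.

    Calling with no names returns '', which the documentation says runs ALL
    correlation types.
    """
    lowered = [name.lower() for name in names]
    unknown = [name for name in lowered if name not in VALID_CORRELATIONS]
    if unknown:
        raise ValueError(f"unsupported correlation types: {unknown}")
    return ",".join(lowered)


exact_males = where_condition("males", "exact")
not_males = where_condition("males", "not")
selected = correlation_types("pearson", "ks")
```

The returned strings are intended for the wherecondition and correlationtype arguments respectively.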
33. maadstml.viperstreamcluster(vipertoken,host,port,topic,producerid,offset=-1,maxrows=0,enabletls=1,delay=100,brokerhost='',brokerport=-999,microserviceid='',topicid=-999,iterations=1000,numclusters=8,distancealgo=1,description='',rawdataoutput=0,valuekey='',filterkey='',groupkey='',identifier='',datetimekey='',valueidentifier='',msgid='',valuecondition='',identifierextractpos='',preprocesstopic='',alertonclustersize=0,alertonsubjectpercentage=50,sendalertemailsto='',emailfrequencyinseconds=0,companyname='',analysisdescription='',identifierextractposlatitude=-1,identifierextractposlongitude=-1,identifierextractposlocation=-1,identifierextractjoinedidentifiers=-1,pdfformat='',minimumsubjects=2)
Parameters: Perform Stream clustering
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
topic : string, required
Topic containing the raw data to consume.
port : int, required
Port on which VIPER is listening.
brokerhost : string, optional
Address where Kafka broker is running - if none is specified, the Kafka broker address in the VIPER.ENV file will be used.
brokerport : int, optional
Port on which Kafka is listening.
alertonsubjectpercentage : int, optional
Set a value between 0-100 that specifies the percentage of subjects that exceed a threshold.
identifierextractjoinedidentifiers : int, optional
Position of additional text in the identifier field.
pdfformat : string, optional
Specify the format text of the PDF that is generated and emailed to users. You can set title, signature, showpdfemaillist, and charttitle.
pdfformat=”title=This is a Transactional Machine Learning Auto-Generated PDF for Cluster Analysis For OTICS|signature=Created by: OTICS, Toronto|showpdfemaillist=1|charttitle=Chart Shows Clusters of Patients with Similar Symptoms”
minimumsubjects : int, optional
Specify the minimum number of subjects in the cluster analysis.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
maxrows : int, optional
Number of offsets or percentage to roll back the data stream
enabletls : int, optional
Set to 1 for TLS encrypted traffic
delay : int, optional
Delay to wait for Kafka to finish writing to topic
producerid : string, required
Id of the Topic.
topicid : int, optional
Ignored
iterations : int, optional
Number of iterations to compute clusters
numclusters : int, optional
Number of clusters you want. Maximum is 20.
distancealgo : int, optional
Set to 1 for Euclidean, or 2 for EuclideanSquared.
valuekey : string, required
JSON path to the value to cluster on
filterkey : string, optional
JSON path to filter on. Ex. Preprocesstype=Pearson, gets value from Key=Preprocesstype, and checks for value=Pearson
groupkey : string, optional
JSON path to group on a key. Ex. Topicid, to group on TMLIDs
valueidentifier : string, optional
JSON path to text value IDs you correlated.
msgid : string, optional
JSON path for a unique message id
valuecondition : string, optional
A condition to filter numeric values on. Ex. valuecondition=”> .5”: if valuekey is correlations, then all correlations > 0.5 are taken.
identifierextractpos : string, optional
The location of data to extract from the Identifier field. Ex. identifierextractpos=”1,2”, will extract data from position 1 and 2.
preprocesstopic : string, required
Topic to produce results to
alertonclustersize : int, optional
Size of the cluster to alert on. Ex. if this is 100, then when any cluster has more than 100 elements an email is sent.
sendalertemailsto: string, optional
List of email addresses to send alert to
emailfrequencyinseconds : int, optional
Seconds between emails. Ex. set to 3600, so emails will be sent every 1 hour if alert condition met.
companyname : string, optional
Your company name
analysisdescription : string, optional
A detailed description of the analysis. This will be added to the PDF.
identifierextractposlatitude : int, optional
Position for latitude in the Identifier field
identifierextractposlongitude : int, optional
Position for longitude in the Identifier field
identifierextractposlocation : int, optional
Position for location in the Identifier field
RETURNS: null
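The pdfformat value is a pipe-separated list of key=value pairs. A minimal sketch of assembling it is shown below. build_pdfformat is a hypothetical helper, and rejecting "|" inside a value is an assumption made here because the pipe is the separator; the library's own handling of such values is not documented in this section.

```python
def build_pdfformat(title, signature, charttitle, showpdfemaillist=True):
    """Join the four documented pdfformat fields with '|'.

    Hypothetical helper. Values containing '|' are rejected because that
    character separates the fields (an assumption, not documented behaviour).
    """
    fields = {
        "title": title,
        "signature": signature,
        "showpdfemaillist": 1 if showpdfemaillist else 0,
        "charttitle": charttitle,
    }
    for key, value in fields.items():
        if "|" in str(value):
            raise ValueError(f"{key} must not contain '|'")
    return "|".join(f"{key}={value}" for key, value in fields.items())


pdfformat = build_pdfformat(
    title=("This is a Transactional Machine Learning Auto-Generated PDF "
           "for Cluster Analysis For OTICS"),
    signature="Created by: OTICS, Toronto",
    charttitle="Chart Shows Clusters of Patients with Similar Symptoms",
)
```

This reproduces the pdfformat example given in the parameter description and can be passed directly as the pdfformat argument.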
34. maadstml.vipersearchanomaly(vipertoken,host,port,topic,producerid,offset,jsoncriteria='',rawdataoutput=0,maxrows=0,enabletls=0,delay=100,brokerhost='',brokerport=-999,microserviceid='',topicid=-999,identifier='',preprocesstopic='',timedelay=0,asynctimeout=120,searchterms='',entitysearch='',tagsearch='',checkanomaly=1,testtopic='',includeexclude=1,anomalythreshold=0,sendanomalyalertemail='',emailfrequency=3600)
Parameters: Perform search anomaly detection
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
topic : string, required
Topic containing the raw data to consume.
port : int, required
Port on which VIPER is listening.
brokerhost : string, optional
Address where Kafka broker is running - if none is specified, the Kafka broker address in the VIPER.ENV file will be used.
brokerport : int, optional
Port on which Kafka is listening.
jsoncriteria : string, optional
Enter the JSON path to the search fields
anomalythreshold : int, optional
Threshold to meet to determine if search differs from the peer group. This is a number between 0-100. The lower the number the “more” this search differs from the peer group and likely anomalous.
includeexclude : int, optional
Set to 1 if you want the search terms included in the user searches, 0 otherwise.
sendanomalyalertemail : string, optional
List of email addresses to send alerts to: separate list by comma.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
maxrows : int, optional
Number of offsets or percentage to roll back the data stream
enabletls : int, optional
Set to 1 for TLS encrypted traffic
delay : int, optional
Delay to wait for Kafka to finish writing to topic
producerid : string, required
Id of the Topic.
emailfrequency : int, optional
Frequency in seconds, between alert emails.
testtopic : string, optional
ignored
preprocesstopic : string, required
Topic to produce results to
sendalertemailsto: string, optional
List of email addresses to send alert to
tagsearch : string, optional
Search for tags in the search. You can enter: ‘superlative,noun,interjection,verb,pronoun’
entitysearch : string, optional
Search for entities in the search. You can enter: ‘person,gpe’, where gpe=Geo-political entity
searchterms : string, optional
You can specify your own search terms. Separate list by comma.
emailfrequencyinseconds : int, optional
Seconds between emails. Ex. set to 3600, so emails will be sent every 1 hour if alert condition met.
companyname : string, optional
Your company name
topicid : int, optional
ignored
identifier : string, optional
identifier text
checkanomaly : int, optional
Set to 1 to check for search anomaly.
rawdataoutput : int, optional
ignored
RETURNS: null
35. maadstml.vipermirrorbrokers(VIPERTOKEN,host,port,brokercloudusernamepassfrom,brokercloudusernamepassto,enabletlsfrom,enabletlsto,replicationfactorfrom,replicationfactorto,compressionfrom,compressionto,saslfrom,saslto,partitions,brokerlistfrom,brokerlistto,topiclist,asynctimeout=300,microserviceid="",servicenamefrom="broker",servicenameto="broker",partitionchangeperc=0,replicationchange=0,filter="",rollbackoffset=0)
Parameters: Perform Data Stream migration across brokers - fast and simple.
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
brokercloudusernamepassfrom : string, required
This is a comma separated list of source broker username:password. For multiple brokers separate with comma, for example for 3 brokers: username:password,username:password,username:password
brokercloudusernamepassto : string, required
This is a comma separated list of destination broker username:password. For multiple brokers separate with comma, for example for 3 brokers: username:password,username:password,username:password. The number of source and destination brokers must match.
enabletlsfrom : string, required
This is a colon separated list of whether source brokers require TLS: 1=TLS, 0=NoTLS. For multiple brokers separate with colon, for example for 3 brokers: 1:0:1. Some brokers may be On-Prem and do not need TLS.
enabletlsto : string, required
This is a colon separated list of whether destination brokers require TLS: 1=TLS, 0=NoTLS. For multiple brokers separate with colon, for example for 3 brokers: 1:0:1. Some brokers may be On-Prem and do not need TLS.
replicationfactorfrom : string, optional
This is a colon separated list of the replication factor of source brokers. For multiple brokers separate with colon, for example for 3 brokers: 3:4:3, or leave blank to let VIPER decide.
replicationfactorto : string, optional
This is a colon separated list of the replication factor of destination brokers. For multiple brokers separate with colon, for example for 3 brokers: 3:4:3, or leave blank to let VIPER decide.
compressionfrom : string, required
This is a colon separated list of the compression type of source brokers: snappy, gzip, lz4. For multiple brokers separate with colon, for example for 3 brokers: snappy:snappy:gzip.
compressionto : string, required
This is a colon separated list of the compression type of destination brokers: snappy, gzip, lz4. For multiple brokers separate with colon, for example for 3 brokers: snappy:snappy:gzip.
saslfrom : string, required
This is a colon separated list of the SASL type: None, Plain, SCRAM256, SCRAM512 of source brokers. For multiple brokers separate with colon, for example for 3 brokers: PLAIN:SCRAM256:SCRAM512.
saslto : string, required
This is a colon separated list of the SASL type: None, Plain, SCRAM256, SCRAM512 of destination brokers. For multiple brokers separate with colon, for example for 3 brokers: PLAIN:SCRAM256:SCRAM512.
partitions : string, optional
If you are manually migrating topics you will need to specify the partitions of the topics in topiclist. Otherwise, VIPER will automatically find topics and their partitions on the broker for you - this is recommended.
brokerlistfrom : string, required
This is a list of source brokers: host:port. For multiple brokers separate with comma, for example for 3 brokers: host:port,host:port,host:port.
brokerlistto : string, required
This is a list of destination brokers: host:port. For multiple brokers separate with comma, for example for 3 brokers: host:port,host:port,host:port.
topiclist : string, optional
You can manually specify topics to migrate, separate multiple topics with a comma. Otherwise, Viper will automatically find topics on the broker for you - this is recommended.
partitionchangeperc : number, optional
You can increase or decrease partitions on destination broker by specifying a percentage between 0-100, or -100-0. Minimum partition will always be 1.
replicationchange : ignored for now
You can increase or decrease the replication factor on the destination broker by specifying a positive or negative number. Minimum replication factor will always be 2.
filter : string, optional
You can specify a filter to choose only those topics that satisfy the filter. Filters must have the following format: “searchstring1,searchstring2,searchstring3,..:Logic=0 or 1:search position: 0,1,2”, where Logic 0=AND, 1=OR, and search position 0=BeginsWith, 1=Any, 2=EndsWith.
asynctimeout : number, optional
This specifies the timeout in seconds for the python connection.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
servicenamefrom : string, optional
You can specify the name of the source brokers.
servicenameto : string, optional
You can specify the name of the destination brokers.
rollbackoffset: ignored
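Several vipermirrorbrokers arguments are per-broker lists, some comma separated (brokerlist*, brokercloudusernamepass*) and some colon separated (enabletls*, compression*, sasl*). The sketch below derives them from one list of broker descriptions so they stay aligned. mirror_broker_args and the dictionary keys used to describe a broker are hypothetical; only the output key names and separators come from the parameter descriptions above.

```python
def mirror_broker_args(source, destination):
    """Build the per-broker list arguments for source and destination brokers.

    Each broker is a dict with (hypothetical) keys: host, port, username,
    password, tls, compression, sasl. Returns a dict keyed by the documented
    parameter names, e.g. brokerlistfrom or enabletlsto.
    """
    if len(source) != len(destination):
        raise ValueError("number of source and destination brokers must match")
    args = {}
    for suffix, brokers in (("from", source), ("to", destination)):
        # Comma-separated lists.
        args[f"brokerlist{suffix}"] = ",".join(
            f"{b['host']}:{b['port']}" for b in brokers)
        args[f"brokercloudusernamepass{suffix}"] = ",".join(
            f"{b['username']}:{b['password']}" for b in brokers)
        # Colon-separated lists.
        args[f"enabletls{suffix}"] = ":".join(
            "1" if b["tls"] else "0" for b in brokers)
        args[f"compression{suffix}"] = ":".join(
            b["compression"] for b in brokers)
        args[f"sasl{suffix}"] = ":".join(b["sasl"] for b in brokers)
    return args


source = [
    {"host": "onprem1", "port": 9092, "username": "u1", "password": "p1",
     "tls": False, "compression": "snappy", "sasl": "PLAIN"},
    {"host": "onprem2", "port": 9092, "username": "u2", "password": "p2",
     "tls": True, "compression": "gzip", "sasl": "SCRAM256"},
]
destination = [
    {"host": "cloud1", "port": 9093, "username": "c1", "password": "s1",
     "tls": True, "compression": "snappy", "sasl": "SCRAM512"},
    {"host": "cloud2", "port": 9093, "username": "c2", "password": "s2",
     "tls": True, "compression": "lz4", "sasl": "SCRAM512"},
]
broker_args = mirror_broker_args(source, destination)
```

Because the returned keys match the documented parameter names, the dictionary can be unpacked as keyword arguments alongside the remaining required parameters.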
36. maadstml.vipernlp(filename,maxsummarywords,maxkeywords)
Parameters: Perform NLP summarization of PDFs
filename : string, required
Filename of PDF to summarize.
maxsummarywords : int, required
Maximum number of words in the summary.
maxkeywords : int, required
Maximum number of keywords to extract.
RETURNS: JSON string of summary.
37. maadstml.viperchatgpt(openaikey,texttoanalyse,query, temperature,modelname)
Parameters: Start a conversation with ChatGPT
openaikey : string, required
OpenAI API key
texttoanalyse : string, required
Text you want ChatGPT to analyse
query : string, required
Prompts for chatGPT. For example, “What are key points in this text? What are the concerns or issues?”
temperature : float, required
Temperature for ChatGPT; must be between 0 and 1, e.g. 0.7
modelname : string, required
ChatGPT model to use. For example, text-davinci-002, text-curie-001, text-babbage-001.
RETURNS: ChatGPT response.
38. maadstml.viperexractpdffields(pdffilename)
Parameters: Extract data from PDF
pdffilename : string, required
PDF filename
RETURNS: JSON of PDF and writes JSON and XML files of PDF to disk.
39. maadstml.viperexractpdffieldbylabel(pdffilename,labelname,arcotype)
Parameters: Extract data from PDF by PDF labels
pdffilename : string, required
PDF filename
labelname : string, required
Label name in the PDF to search for.
arcotype : string, required
Acrobyte tag in PDF, e.g. LTTextLineHorizontal
RETURNS: Value of the labelname - if any.
40. maadstml.pgptingestdocs(docname,doctype, pgptip,pgptport,pgptendpoint)
Parameters:
docname : string, required
A full-path to a PDF, or text file.
doctype : string, required
This can be: binary, or text.
pgptip : string, required
Your container IP - this is usually: http://127.0.0.1
pgptport : string, required
Your container Port - this is usually: 8001. This will be dependent on the docker run port forwarding command. See: https://github.com/smaurice101/raspberrypi/tree/main/privategpt
pgptendpoint : string, required
This must be: /v1/ingest
RETURNS: JSON containing Document details, or ERROR.
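A small sketch of assembling the pgptingestdocs arguments from the usual values listed above. Mapping PDFs to `binary` and everything else to `text` is an assumption about how doctype relates to the file, and both helper names are hypothetical.

```python
import os


def guess_doctype(docname):
    """Assumption: PDFs are ingested as 'binary', other files as 'text'."""
    ext = os.path.splitext(docname)[1].lower()
    return "binary" if ext == ".pdf" else "text"


def ingest_args(docname, pgptip="http://127.0.0.1", pgptport="8001",
                pgptendpoint="/v1/ingest"):
    """Return the positional arguments for maadstml.pgptingestdocs, in order."""
    return (docname, guess_doctype(docname), pgptip, str(pgptport), pgptendpoint)


# Usage (requires MAADSTML and a running privateGPT container):
#   import maadstml
#   result = maadstml.pgptingestdocs(*ingest_args("/data/manual.pdf"))
```

The port is converted to a string because the parameter list above declares pgptport as a string.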
41. maadstml.pgptgetingestedembeddings(docname,ip,port,endpoint)
Parameters:
docname : string, required
A full-path to a PDF, or text file.
ip : string, required
Your container IP - this is usually: http://127.0.0.1
port : string, required
Your container Port - this is usually: 8001. This will be dependent on the docker run port forwarding command. See: https://github.com/smaurice101/raspberrypi/tree/main/privategpt
endpoint : string, required
This must be: /v1/ingest/list
RETURNS: Three variables: docids,docstr,docidsstr; these are the embeddings related to docname. Or, ERROR.
42. maadstml.pgptchat(prompt,context,docfilter,port,includesources,ip,endpoint)
Parameters:
prompt : string, required
A prompt for privateGPT.
context : bool, required
This can be True or False. If True, privateGPT will use context; if False, it will not.
docfilter : string array, required
This is docidsstr, and can be retrieved from pgptgetingestedembeddings. If context=True and docfilter is empty, then ALL documents are used for context.
port : string, required
Your container Port - this is usually: 8001. This will be dependent on the docker run port forwarding command. See: https://github.com/smaurice101/raspberrypi/tree/main/privategpt
includesources : bool, required
This can be True or False. If True, with context, privateGPT will return the sources in the response.
ip : string, required
Your container IP - this is usually: http://127.0.0.1
endpoint : string, required
This must be: /v1/completions
RETURNS: The response from privateGPT, or ERROR.
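The three privateGPT calls above are typically chained: ingest a document, look up its embeddings, then chat with docidsstr as the docfilter. The sketch below shows that sequence with the argument orders listed in this section. `ask_about_document` and its `client` parameter are hypothetical, and the sketch does no error handling: if any call returns ERROR, the tuple unpacking may fail.

```python
def ask_about_document(docname, prompt, doctype="binary", client=None,
                       ip="http://127.0.0.1", port="8001"):
    """Ingest one document, then ask privateGPT a question restricted to it."""
    if client is None:
        import maadstml as client  # requires the maadstml package

    # 1. Ingest the document.
    client.pgptingestdocs(docname, doctype, ip, port, "/v1/ingest")

    # 2. Retrieve the embeddings that belong to this document.
    docids, docstr, docidsstr = client.pgptgetingestedembeddings(
        docname, ip, port, "/v1/ingest/list")

    # 3. Chat with context enabled, filtered to the document, with sources.
    return client.pgptchat(prompt, True, docidsstr, port, True, ip,
                           "/v1/completions")
```

Passing a stand-in object as `client` makes the call order easy to check without a running container.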
43. maadstml.pgptdeleteembeddings(docids, ip,port,endpoint)
Parameters:
docids : string array, required
An array of doc ids. This can be retrieved from pgptgetingestedembeddings.
port : string, required
Your container Port - this is usually: 8001. This will be dependent on the docker run port forwarding command. See: https://github.com/smaurice101/raspberrypi/tree/main/privategpt
ip : string, required
Your container IP - this is usually: http://127.0.0.1
endpoint : string, required
This must be: /v1/ingest/
RETURNS: Null if successful, or ERROR.
44. maadstml.pgpthealth(ip,port,endpoint)
Parameters:
port : string, required
Your container Port - this is usually: 8001. This will be dependent on the docker run port forwarding command. See: https://github.com/smaurice101/raspberrypi/tree/main/privategpt
ip : string, required
Your container IP - this is usually: http://127.0.0.1
endpoint : string, required
This must be: /health
RETURNS: This will return a JSON of OK if the privateGPT server is running, or ERROR.
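A container can take a while to start, so a caller may want to poll pgpthealth before ingesting documents. The sketch below retries a health check a fixed number of times. It assumes a failed check returns a value containing the text ERROR, as the RETURNS line suggests; the exact format is not specified here. `wait_until_healthy` is a hypothetical helper.

```python
import time


def wait_until_healthy(check, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call `check()` until its result no longer contains 'ERROR'.

    Returns True as soon as a check succeeds, False after `attempts` failures.
    """
    for attempt in range(attempts):
        response = check()
        # Assumption: any response containing "ERROR" means not healthy.
        if "ERROR" not in str(response).upper():
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before the next attempt
    return False


# Usage (requires MAADSTML and a privateGPT container):
#   import maadstml
#   ok = wait_until_healthy(
#       lambda: maadstml.pgpthealth("http://127.0.0.1", "8001", "/health"))
```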
45. maadstml.videochatloadresponse(url,port,filename,prompt,responsefolder='videogpt_response',temperature=0.2,max_output_tokens=512)
Parameters:
url : string, required
IP video chatgpt is listening on in the container - this is usually: http://127.0.0.1
port : string, required
Port video chat gpt is listening on in the container, e.g. 7800
filename : string, required
This is the video filename to analyse i.e. with mp4 extension
prompt : string, required
This is the prompt for video chat gpt, e.g. “What is the video about? Is there anything strange in the video?”
responsefolder : string, optional
This is the folder you want video chatgpt to write responses to
temperature : float, optional
Temperature determines how conservative video chat gpt is i.e. closer to 0 very conservative in responses
max_output_tokens : int, optional
Maximum number of tokens to return in the response
RETURNS: The file name the response was written to by video chatgpt.
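Since videochatloadresponse returns the name of the file it wrote rather than the response text, a caller usually reads that file next. The sketch below does so with the documented default values. It assumes the returned name is a path that can be opened as-is, which is not confirmed here; `analyse_video` and its `loader` parameter are hypothetical.

```python
def analyse_video(filename, prompt, url="http://127.0.0.1", port="7800",
                  responsefolder="videogpt_response", loader=None):
    """Run video chat gpt on a video and return the text of its response file."""
    if loader is None:
        import maadstml  # imported lazily; requires the maadstml package
        loader = maadstml.videochatloadresponse
    response_file = loader(url, port, filename, prompt,
                           responsefolder=responsefolder,
                           temperature=0.2, max_output_tokens=512)
    # Assumption: the returned file name is directly openable from here.
    with open(response_file, encoding="utf-8") as fh:
        return fh.read()
```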
46. maadstml.viperpreprocessrtms(vipertoken,host,port,topic,producerid,offset,maxrows=0,enabletls=0,delay=100,brokerhost='',brokerport=-999,microserviceid='',topicid=-999,rtmsstream='',searchterms='',rememberpastwindows='',identifier='',preprocesstopic='',patternwindowthreshold='',array=0,saveasarray=0,rawdataoutput=0,rtmsscorethreshold='',rtmsscorethresholdtopic='',attackscorethreshold='',attackscorethresholdtopic='',patternscorethreshold='',patternscorethresholdtopic='')
Parameters:
VIPERTOKEN : string, required
A token given to you by VIPER administrator.
host : string, required
Indicates the url where the VIPER instance is located and listening.
port : int, required
Port on which VIPER is listening.
topic : string, optional
This is the topic containing preprocessed data for entities
producerid : string, required
Producerid for the topic.
offset : int, required
This is the offset to start reading from, usually -1
maxrows : int, required
The number of offsets to roll back the datastream
enabletls : int, required
if 0, no encryption, otherwise if 1 all data are encrypted
brokerhost : string, optional
Address of Kafka broker - if none is specified it will use broker address in VIPER.ENV file
brokerport : int, optional
Port Kafka is listening on - if none is specified it will use port in the VIPER.ENV file
delay : int, optional
delay parameter to wait for Kafka to respond - in milliseconds.
microserviceid : string, optional
If you are routing connections to VIPER through a microservice then indicate it here.
topicid : int, required
Specifies how entities are processed.
rtmsstream : string, required
Specifies Kafka topic to stream the TEXT data into. Separate multiple topics by comma.
searchterms : string, required
Search terms to use to search the text data. Separate by semi-colon for different searches for different rtmsstream topics.
rememberpastwindows : int, required
How many sliding time windows for TML to remember.
identifier : string, optional
Identifies this analysis.
preprocesstopic : string, required
Kafka topic to store the output.
patternwindowthreshold : int, required
Threshold number of windows for the occurrence of patterns of the search terms.
rtmsscorethreshold: string, optional
Threshold number for RTMS score between 0-1
rtmsscorethresholdtopic: string, optional
Name of a Kafka topic that will contain messages with an RTMS score greater than rtmsscorethreshold
attackscorethreshold: string, optional
Threshold number between 0-1
attackscorethresholdtopic: string, optional
Name of a Kafka topic that will contain messages with an attack score greater than attackscorethreshold
patternscorethreshold: string, optional
Threshold number between 0-1
patternscorethresholdtopic: string, optional
Name of a Kafka topic that will contain messages with a pattern score greater than patternscorethreshold
array : int, optional
Process data as arrays
saveasarray : int, optional
Save output as arrays.
rawdataoutput : int, optional
Output raw data used in the TML processing
RETURNS: Null
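The rtmsstream and searchterms parameters are delimited strings: topics are separated by commas and the per-topic searches by semi-colons. The hypothetical helper below builds both strings from one mapping so they stay aligned, and checks that the score threshold parses to a number between 0 and 1. The topic names and search text in the usage note are invented for illustration, and the syntax of an individual search is not covered by this sketch.

```python
def rtms_stream_args(searches_by_topic, rtmsscorethreshold=""):
    """Build rtmsstream/searchterms keyword arguments from {topic: search}."""
    if not searches_by_topic:
        raise ValueError("at least one topic is required")
    for topic, search in searches_by_topic.items():
        if "," in topic:
            raise ValueError(f"topic name contains a comma: {topic!r}")
        if ";" in search:
            raise ValueError(f"search contains a semi-colon: {search!r}")
    # The threshold is documented as a string holding a number between 0-1.
    if rtmsscorethreshold != "" and not 0.0 <= float(rtmsscorethreshold) <= 1.0:
        raise ValueError("rtmsscorethreshold must be between 0 and 1")
    return {
        "rtmsstream": ",".join(searches_by_topic),
        "searchterms": ";".join(searches_by_topic.values()),
        "rtmsscorethreshold": str(rtmsscorethreshold),
    }


# Usage (requires MAADSTML and a running VIPER instance):
#   import maadstml
#   extra = rtms_stream_args({"auth-logs": "failed password"}, "0.6")
#   maadstml.viperpreprocessrtms(vipertoken, host, port, topic, producerid,
#                                offset, preprocesstopic="rtms-output", **extra)
```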