Method : POST

URI : /ml/pred/<pred_type>

Body : JSON doc for prediction

<pred_type> has two possible values.

1 = Async. Use this when predicting for many rows of data stored in a file.

2 = Sync. Use this when predicting for a single data line.
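
As a small sketch of how the request path is formed, the helper below maps a mode name to the <pred_type> value and prints the resulting URI. The function name pred_uri and the mode names async and sync are illustrative only and are not part of the API.

```shell
# Hypothetical helper: map a mode name to the /ml/pred/<pred_type> path.
pred_uri() {
  case "$1" in
    async) printf '/ml/pred/1\n' ;;  # many rows of data stored in a file
    sync)  printf '/ml/pred/2\n' ;;  # a single data line
    *)     printf 'unknown mode: %s\n' "$1" >&2; return 1 ;;
  esac
}

pred_uri sync
```

The helper only builds the path; the host and port of the server still have to be prepended by the caller.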

Structure of the body

   "data_type":"<1 for file, 2 for single data>",
   "attr_type":"<1 for number only, 2 for string only, 3 for hybrid>", // use hybrid when in doubt
   "algo_type":"<SVM | KMEANS etc ..>",
   "input_format":"<CSV | SVM | JSON | TEXT | KV>", // format of input data (data we wish to predict on)
   "expected_format":"<CSV | SVM | JSON >", // format of data expected by algo/model

Now let's run a prediction with the model trained in the previous stage.


curl -H "Content-Type: application/json" -d'{"schema-name":"website","model_name":"sales_model", "data_type":2,"attr_type":3,"algo_type":"SVM","input_format":"CSV","expected_format":"SVM","data":"v1,p2,c2,pg2,2"}' -X POST



The predicted output is 49.23, which is the predicted sales for this row.
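
The request body used in the curl call above can also be assembled by a small helper, which may be handy when predicting for several rows one at a time. This is only a sketch: the function name single_pred_body and its argument order are hypothetical, the fixed values (data_type 2, attr_type 3, CSV input, SVM expected format) are copied from the example above, and no JSON escaping is attempted.

```shell
# Hypothetical helper: build the body for a single-row (sync) prediction.
# Arguments: schema name, model name, algo type, one CSV data line.
# No JSON escaping is done, so arguments must not contain quotes or backslashes.
single_pred_body() {
  printf '{"schema-name":"%s","model_name":"%s","data_type":2,"attr_type":3,"algo_type":"%s","input_format":"CSV","expected_format":"SVM","data":"%s"}\n' \
    "$1" "$2" "$3" "$4"
}

single_pred_body website sales_model SVM "v1,p2,c2,pg2,2"
```

The printed string can be passed to curl with -d in place of the literal body.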

Now let's add an event to the "visitor" stream. Since the model is available, the stream processing engine should use it to predict the value and add it to the stream as defined.

curl -H "Content-Type: application/json" -d'{"vid":"v1","prod":"p2","catid":"c2","pgid":"pg2","price":54.50,"items":1}' -X POST

Now fetch the row from the stream and check it.

curl -H "Content-Type: application/json" -d'{"sql":"select * from website.visitor limit 1"}' -X POST

As you can see, "pred_sales":49.275 has been added automatically by the DB.
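
To check for the predicted field in a script rather than by eye, the value can be pulled out of the returned text. The sketch below assumes the response contains a "pred_sales":<number> pair somewhere in its JSON; the helper name pred_sales_of and the sample row are made up for illustration, and the real response may carry more fields or a different layout.

```shell
# Hypothetical helper: print the numeric value of "pred_sales" found in JSON text on stdin.
pred_sales_of() {
  sed -n 's/.*"pred_sales"[[:space:]]*:[[:space:]]*\(-\{0,1\}[0-9][0-9.]*\).*/\1/p'
}

# Assumed sample row; only the pred_sales pair is taken from the text above.
printf '%s\n' '{"vid":"v1","prod":"p2","pred_sales":49.275}' | pred_sales_of
```

An empty result means the field was not found, which would suggest the model was not applied to the event.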