Interfaces and APIs
The following basic interface is available from the user's perspective:
client_ml_helper
//creates bucket
//i/p - {"bucket_name":"myname", "access_key":"akey", "secret_key":"skey"}
int create_bucket(char *bucket_info);

//sets bucket - which bucket to use for any operation within this class
//i/p - {"bucket_name":"myname", "access_key":"akey", "secret_key":"skey"}
void set_bucket(char *bucket_info);

//key is the key of the file using which the brs will store it
//fpath is full path of the file on local fs, iop is flag for operation put
int upload_file(char *key, char *fpath, insert_options iop);

//train req : It depends on what kind of algo is requested. Format changes for different types
int train_model(char *req);

//req : {"account_id":"AACCEEGGIILLNN", "model_name":"my_model1"}
char *get_model_status(char *req);

//{"account_id":, "model_name":}
int del_model(char *req);

//{"account_id":, "model_name":}
int del_train_request(char *req);

//req : depends on algo etc, user ought to provide the right one
char *predict(char *req);

//get training requests for a given account - all training requests
//req : {account_id:"aacid"}
resultset *get_training_requests(char *req);

//count models for a given account, all the models
//req : {account_id:"aacid"}
long count_models(char *req);

//This is to re-init the model data manager in case we would like to change the
//IP:PORT info for BRS, useful because BRS mostly will be separate and mostly static
//but may change due to load etc, as BRS can scale linearly
//req : {"bucket_info", "brs_ip", "brs_port"}
int reinit_mdm(char *req);

//how many objects are using this reference
int get_ref_count();

//get the handle of BRS - useful only for embd as client should never bother about this
bangdb_resource_manager *get_brs();

//this is to test if brs is local to the BE server DB
bool is_brs_local();

void clean_ml_helper();
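Most of the calls above take a JSON string as their request. As a minimal sketch of how such requests could be assembled, the helpers below build the bucket and model request strings shown in the comments. The names json_escape, make_bucket_info and make_model_request are hypothetical and not part of client_ml_helper, and the escaping covers only double quotes and backslashes.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of client_ml_helper): escapes only the two
// characters that would break a JSON string literal, '"' and '\'.
std::string json_escape(const std::string &s) {
    std::string out;
    for (char c : s) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Builds the request documented for create_bucket / set_bucket:
// {"bucket_name":"...", "access_key":"...", "secret_key":"..."}
std::string make_bucket_info(const std::string &bucket_name,
                             const std::string &access_key,
                             const std::string &secret_key) {
    assert(!bucket_name.empty());  // assumption: an empty bucket name is a caller bug
    return "{\"bucket_name\":\"" + json_escape(bucket_name) +
           "\", \"access_key\":\"" + json_escape(access_key) +
           "\", \"secret_key\":\"" + json_escape(secret_key) + "\"}";
}

// Builds the request documented for get_model_status / del_model:
// {"account_id":"...", "model_name":"..."}
std::string make_model_request(const std::string &account_id,
                               const std::string &model_name) {
    return "{\"account_id\":\"" + json_escape(account_id) +
           "\", \"model_name\":\"" + json_escape(model_name) + "\"}";
}
```

Since the interface takes char * rather than const char *, a caller would presumably pass a mutable buffer (for example &req[0] on a non-empty std::string) instead of a string literal.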
For developers, we have the following interfaces:
Development Interfaces, classes
- iq_train_predict
- model_data_manager
- pred_housekeep
- iqconvert
- ml_bangdb
Of these, iq_train_predict is the interface that must be implemented for every new algorithm we add. For example, we have svm_train_predict for SVM and ie_train_predict for IE.
iq_convert converts a file from one format (f1) to another (f2).
pred_housekeep keeps the state of every request, training info, etc. It also provides locking APIs for safely handling parallel training or prediction runs.
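The locking APIs of pred_housekeep are not listed here, so the following is only a rough sketch of the idea: a table that lets one training or prediction run at a time claim a given model. The class model_lock_table and its methods are hypothetical and are not the pred_housekeep API.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>

// Hypothetical per-model lock table (illustration only, not pred_housekeep).
class model_lock_table {
    std::mutex mu_;                // protects busy_
    std::set<std::string> busy_;   // keys of models currently in use

public:
    // Returns true if the caller now owns the model, false if another
    // training or prediction run already holds it.
    bool try_acquire(const std::string &model_key) {
        std::lock_guard<std::mutex> guard(mu_);
        return busy_.insert(model_key).second;
    }

    // Marks the model as free again; releasing a key that is not held is a no-op.
    void release(const std::string &model_key) {
        std::lock_guard<std::mutex> guard(mu_);
        busy_.erase(model_key);
    }
};
```

A key such as "account_id:model_name" is one possible choice here; the actual keying used by pred_housekeep is not stated in this section.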
model_data_manager manages the models. It interfaces with BRS to get or put data (any data).
Finally, ml_bangdb and ie_bangdb are collections of helper functions.
Details of these are given below.
iq_train_predict
void set_housekeep(void *hkeep);
char *train_model(char *param_list);
char *predict(char *str, void *arg = NULL);
char *get_status(char *model_detail);
void close_trainer();
We only need to implement the five APIs above to add any new algorithm.
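To illustrate what adding an algorithm might look like, here is a minimal sketch. It assumes iq_train_predict is an abstract class with pure virtual methods, whereas the source lists only the five signatures. The class echo_train_predict, the JSON replies, and the malloc/free ownership of returned strings are all hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Assumed shape of the interface; only the five signatures come from the docs.
class iq_train_predict {
public:
    virtual void set_housekeep(void *hkeep) = 0;
    virtual char *train_model(char *param_list) = 0;
    virtual char *predict(char *str, void *arg = NULL) = 0;
    virtual char *get_status(char *model_detail) = 0;
    virtual void close_trainer() = 0;
    virtual ~iq_train_predict() {}
};

// Copies a string into a malloc'd buffer that the caller releases with free().
// This ownership convention is an assumption made for the sketch.
static char *dup_cstr(const std::string &s) {
    char *p = static_cast<char *>(std::malloc(s.size() + 1));
    if (p != NULL) std::memcpy(p, s.c_str(), s.size() + 1);
    return p;
}

// Hypothetical toy "algorithm": training only records that it happened,
// and prediction echoes the input back.
class echo_train_predict : public iq_train_predict {
    void *hkeep_ = NULL;     // housekeeping handle, stored but unused here
    bool trained_ = false;

public:
    void set_housekeep(void *hkeep) override { hkeep_ = hkeep; }

    char *train_model(char *param_list) override {
        trained_ = (param_list != NULL);
        return dup_cstr(trained_ ? "{\"status\":\"trained\"}"
                                 : "{\"status\":\"error\"}");
    }

    char *predict(char *str, void *arg = NULL) override {
        (void)arg;  // extra argument is not needed by this toy model
        if (!trained_ || str == NULL) return NULL;
        return dup_cstr(std::string("{\"echo\":\"") + str + "\"}");
    }

    char *get_status(char *model_detail) override {
        (void)model_detail;
        return dup_cstr(trained_ ? "trained" : "not_trained");
    }

    void close_trainer() override {
        trained_ = false;
        hkeep_ = NULL;
    }
};
```

A real implementation such as svm_train_predict would replace the echo logic with actual training and prediction and would use the housekeeping handle to record request state.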