Interfaces & APIs

The following basic interface is available from the user's perspective:

client_ml_helper

  • //creates bucket
    //i/p - {"bucket_name":"myname", "access_key":"akey", "secret_key":"skey"}
    int create_bucket(char *bucket_info);

  • //sets bucket - which bucket to use for any operation within this class
    //i/p - {"bucket_name":"myname", "access_key":"akey", "secret_key":"skey"}
    void set_bucket(char *bucket_info);
  • //key is the key under which BRS will store the file
    //fpath is the full path of the file on the local fs, iop is the flag for the put operation
    int upload_file(char *key, char *fpath, insert_options iop);
  • //train req : It depends on what kind of algo is requested. Format changes for different types
    int train_model(char *req);
  • //req : {"account_id":"AACCEEGGIILLNN", "model_name":"my_model1"}
    char *get_model_status(char *req);
  • //{"account_id":, "model_name":}
    int del_model(char *req);
  • //{"account_id":, "model_name":}
    int del_train_request(char *req);
  • //req : depends on algo etc, user ought to provide the right one
    char *predict(char *req);
  • //get training requests for a given account - all training requests
    //req : {account_id:"aacid"}
    resultset *get_training_requests(char *req);
  • //count models for a given account, all the models
    //req : {account_id:"aacid"}
    long count_models(char *req);
  • //re-initializes the model data manager when the IP:PORT info for BRS changes;
    //BRS is usually a separate, mostly static service, but its address may change
    //under load etc., since BRS can scale linearly
    //req : {"bucket_info", "brs_ip", "brs_port"}
    int reinit_mdm(char *req);
  • //how many objects are using this reference
    int get_ref_count();
  • //get the handle of BRS - useful only in embedded mode, as a client should never need this
    bangdb_resource_manager *get_brs();
  • //this is to test if brs is local to the BE server DB
    bool is_brs_local();
  • void clean_ml_helper();
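
Most of the calls above take their input as a JSON string. As a minimal sketch, the request strings shown in the comments could be assembled as follows. The helper names make_bucket_info and make_model_req are hypothetical (they are not part of client_ml_helper), and the sketch assumes the values contain no characters that need JSON escaping.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of the documented API: builds the
// bucket_info string that create_bucket() and set_bucket() take as input.
// Assumption: the values need no JSON escaping.
std::string make_bucket_info(const std::string &bucket_name,
                             const std::string &access_key,
                             const std::string &secret_key)
{
    return "{\"bucket_name\":\"" + bucket_name +
           "\", \"access_key\":\"" + access_key +
           "\", \"secret_key\":\"" + secret_key + "\"}";
}

// Hypothetical helper: builds the {"account_id":..., "model_name":...}
// request used by get_model_status(), del_model() and del_train_request().
std::string make_model_req(const std::string &account_id,
                           const std::string &model_name)
{
    return "{\"account_id\":\"" + account_id +
           "\", \"model_name\":\"" + model_name + "\"}";
}
```

Because the documented signatures take a non-const char *, the caller needs a mutable buffer. Assuming helper points to an already constructed client_ml_helper (construction is not covered in this section), a call could look like helper->get_model_status(&req[0]) where req is the std::string returned by make_model_req.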

For developers, we have the following interfaces:

Development Interfaces, classes

  • iq_train_predict
  • model_data_manager
  • pred_housekeep
  • iqconvert
  • ml_bangdb

Of these, iq_train_predict is the interface that must be implemented for every new algo we add. For example, we have svm_train_predict for SVM and, similarly, ie_train_predict for IE.

iq_convert is for converting the format of a file from f1 to f2.

pred_housekeep keeps the state of every request, training info etc. It also provides locking APIs for safely handling parallel trainings or predictions.
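
The actual locking APIs of pred_housekeep are not listed in this section. To illustrate the idea only, here is a minimal sketch of a per-model lock table that lets one training run per model at a time. The class model_lock_table and its methods are hypothetical and are not the pred_housekeep API.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>

// Hypothetical illustration, NOT the pred_housekeep API: tracks which
// models are currently being trained so that a second training request
// for the same model can be rejected or retried later.
class model_lock_table {
public:
    // Returns true if the caller now owns the model, false if it is busy.
    bool try_acquire(const std::string &model_key)
    {
        std::lock_guard<std::mutex> guard(mtx_);
        return busy_.insert(model_key).second;
    }

    // Marks the model as free again; a no-op if it was not held.
    void release(const std::string &model_key)
    {
        std::lock_guard<std::mutex> guard(mtx_);
        busy_.erase(model_key);
    }

private:
    std::mutex mtx_;              // protects busy_
    std::set<std::string> busy_;  // keys of models with work in progress
};
```

A key such as account_id plus model_name would keep models of different accounts independent; that key choice is also an assumption.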

model_data_manager manages the models. It interfaces with BRS to get or put data (any data).

Finally, ml_bangdb and ie_bangdb are collections of helper functions.

These are described in detail below.

iq_train_predict

    void set_housekeep(void *hkeep);
    char *train_model(char *param_list);
    char *predict(char *str, void *arg = NULL);
    char *get_status(char *model_detail);
    void close_trainer();


We only need to implement the five APIs above to add a new algo.
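
As a rough sketch of what adding an algo could look like, the block below declares an abstract class with the five signatures listed for iq_train_predict and implements it with a toy "echo" algo. The class layout, the echo_train_predict name, the returned strings and the malloc/free ownership of returned buffers are all assumptions made for illustration; the real header and response formats may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Assumed shape of the interface, reconstructed from the five
// signatures listed above; the real declaration may differ.
class iq_train_predict {
public:
    virtual void set_housekeep(void *hkeep) = 0;
    virtual char *train_model(char *param_list) = 0;
    virtual char *predict(char *str, void *arg = NULL) = 0;
    virtual char *get_status(char *model_detail) = 0;
    virtual void close_trainer() = 0;
    virtual ~iq_train_predict() {}
};

// Returns a heap copy of s. Assumption: callers release results with free().
static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = static_cast<char *>(malloc(n));
    if (p) memcpy(p, s, n);
    return p;
}

// Hypothetical toy algo: "training" only flips a flag and "prediction"
// echoes the input. The returned strings are placeholders.
class echo_train_predict : public iq_train_predict {
public:
    echo_train_predict() : hkeep_(NULL), trained_(false) {}

    void set_housekeep(void *hkeep) override { hkeep_ = hkeep; }

    char *train_model(char *param_list) override
    {
        (void)param_list;  // a real algo would parse the training request
        trained_ = true;
        return dup_str("trained");
    }

    char *predict(char *str, void *arg = NULL) override
    {
        (void)arg;
        if (!trained_) return NULL;  // nothing to predict with yet
        return dup_str(str);
    }

    char *get_status(char *model_detail) override
    {
        (void)model_detail;
        return dup_str(trained_ ? "trained" : "not_trained");
    }

    void close_trainer() override
    {
        trained_ = false;
        hkeep_ = NULL;
    }

private:
    void *hkeep_;   // opaque housekeeping handle, unused in this toy
    bool trained_;  // set once train_model() has run
};
```

Code that holds only an iq_train_predict pointer can then drive any algo the same way, which is the point of keeping the interface down to these five calls.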