Memory Budget
Training in particular is memory intensive, and the server may crash if memory is not handled properly. Similarly, when we run the prediction server, models must be loaded in memory to serve predictions quickly. But how many models can be loaded? What happens if we need more models than the given amount of memory can hold? To address these concerns, the ML infra implements the concept of a memory budget.
1. Training happens within the given memory budget and never exceeds that amount. In the future we may add speed as a training parameter and adjust the memory accordingly; as of now, the user needs to define the memory budget explicitly
2. The model manager also works within the memory budget, managing all required models with the given amount of memory (see the sketch after this list)
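To make the idea concrete, here is a minimal sketch of how a memory budget could be tracked. The names (`MemoryBudget`, `try_reserve`, `release`) are assumptions made for illustration, not BangDB's actual API.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical sketch: tracks bytes in use against a fixed budget.
// Names and semantics are assumptions, not BangDB's actual API.
class MemoryBudget {
public:
    explicit MemoryBudget(size_t budget_bytes) : budget_(budget_bytes), used_(0) {}

    // Reserve nbytes if it fits within the budget; return false otherwise.
    // The caller (trainer or model manager) must then evict or wait.
    bool try_reserve(size_t nbytes) {
        size_t cur = used_.load();
        while (cur + nbytes <= budget_) {
            if (used_.compare_exchange_weak(cur, cur + nbytes))
                return true;  // reservation succeeded
        }
        return false;  // would exceed the budget
    }

    // Return nbytes to the budget once the object is freed.
    void release(size_t nbytes) { used_.fetch_sub(nbytes); }

private:
    const size_t budget_;
    std::atomic<size_t> used_;
};
```

Under this scheme, both the trainer and the model manager would reserve memory before allocating and release it on de-allocation, which is how the "never exceeds the given amount" guarantee could be enforced.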
LRU for enforcing memory budget constraints
When there are more objects to deal with than the given memory can hold, we must have a way to de-allocate some objects in favor of newer ones. Therefore, BangDB ML Infra implements its own LRU cache, which keeps objects in memory as long as possible; when it needs more room, it de-allocates or invalidates the objects that were accessed earliest, so recently accessed objects tend to stay in memory. The implementation is lightweight, thread safe, and performs well.
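The following is a minimal sketch of the technique described above: a thread-safe LRU that evicts the least recently accessed entries when an insert would exceed the budget. It is not BangDB's actual implementation; the class and method names are assumptions for illustration.

```cpp
#include <cstddef>
#include <list>
#include <mutex>
#include <string>
#include <unordered_map>

// Sketch of a thread-safe LRU keyed by object name, bounded by a
// byte budget. Illustrative only, not BangDB's real implementation.
template <typename V>
class LruCache {
public:
    explicit LruCache(size_t budget_bytes) : budget_(budget_bytes), used_(0) {}

    // Insert or overwrite; evict least recently used entries until
    // the new object fits within the budget.
    void put(const std::string& key, V value, size_t nbytes) {
        std::lock_guard<std::mutex> lock(mu_);
        auto it = index_.find(key);
        if (it != index_.end()) {          // replace existing entry
            used_ -= it->second->bytes;
            order_.erase(it->second);
            index_.erase(it);
        }
        while (used_ + nbytes > budget_ && !order_.empty()) {
            Entry& oldest = order_.back(); // accessed earliest
            used_ -= oldest.bytes;
            index_.erase(oldest.key);
            order_.pop_back();             // invalidate it
        }
        order_.push_front(Entry{key, std::move(value), nbytes});
        index_[key] = order_.begin();
        used_ += nbytes;
    }

    // Look up and mark the entry as most recently used.
    bool get(const std::string& key, V& out) {
        std::lock_guard<std::mutex> lock(mu_);
        auto it = index_.find(key);
        if (it == index_.end()) return false;
        order_.splice(order_.begin(), order_, it->second); // move to front
        out = it->second->value;
        return true;
    }

private:
    struct Entry { std::string key; V value; size_t bytes; };
    size_t budget_, used_;
    std::list<Entry> order_;  // MRU at front, LRU at back
    std::unordered_map<std::string, typename std::list<Entry>::iterator> index_;
    std::mutex mu_;
};
```

The map-plus-list structure gives O(1) lookup, insert, and eviction, which is consistent with the lightweight, high-performance design the section describes; a single mutex keeps it thread safe at the cost of serializing access.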
BRS Integration
The dependency on S3 has been removed completely; BRS now serves the same purpose, and the ML Infra is seamlessly integrated with it.
BSM Integration
For BSM, not much has changed from the outside: we still use svm_predictor, and most of the APIs are the same. There are subtle internal changes that align with the new implementation, but they have no impact from the caller's or usage perspective. Existing code for Get, Put, scan, etc. should work as it used to.
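To illustrate what "no impact from the caller's perspective" means, here is a hypothetical caller-side sketch. The signatures and the in-memory stub are assumptions made so the example is self-contained; they are not the actual BangDB/BSM API, only a stand-in showing that the calling pattern stays the same.

```cpp
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Hypothetical stub standing in for svm_predictor. All signatures
// here are assumptions for illustration, not the real BSM API.
struct svm_predictor {
    bool put(const std::string& key, const std::string& val) {
        store_[key] = val;  // the real engine persists via BRS internally
        return true;
    }
    bool get(const std::string& key, std::string& val) {
        auto it = store_.find(key);
        if (it == store_.end()) return false;
        val = it->second;
        return true;
    }
    std::vector<std::string> scan(const std::string& prefix) {
        std::vector<std::string> keys;
        for (const auto& kv : store_)
            if (kv.first.rfind(prefix, 0) == 0) keys.push_back(kv.first);
        return keys;
    }
private:
    std::map<std::string, std::string> store_;
};

int main() {
    svm_predictor p;
    // The same Put/Get/scan sequence used before the internal
    // changes should continue to work unmodified.
    p.put("model:churn", "serialized-model-bytes");
    std::string val;
    if (p.get("model:churn", val))
        std::cout << "got " << val.size() << " bytes\n";
    for (const auto& k : p.scan("model:"))
        std::cout << k << "\n";
    return 0;
}
```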