Scaling and Deploying Production Grade Convergence Docker Container

Hi, it's me again! I have a few questions regarding using Convergence server in production.

  1. How many connections can one instance of the omnibus Docker container support?
  2. Can I deploy multiple instances of this container using Kubernetes?
  3. If the answer to question 2 is yes, can the OrientDB instance packaged with the container work as a cluster? As a separate question, why package OrientDB inside the container? I am not sure that is a good choice.
  4. Does the omnibus Docker container mount an external path for persisting OrientDB data?
  5. Is there a guideline for how much storage space I should provision? I know this depends on how many models I am looking at, but is there a formula for calculating it? For example, Elasticsearch has a formula for calculating the amount of RAM needed for a node.
  6. Typically how much RAM and CPU do I need for a single instance of the container?

Thanks!

Greetings. It is probably worth having a specific conversation about this, perhaps in our public Slack (https://slack.convergence.io). In short, we ship the omnibus container to get people going quickly. If you check out our Docker Hub account (https://hub.docker.com/u/convergencelabs) you will see that we also ship individual containers. These are generally the ones we would recommend scaling up. We internally deploy these versions in Kubernetes to host our demo site.

A very rough approximation of how this works can be seen here: https://github.com/convergencelabs/convergence-deployment

This is a Docker Compose setup, but it could be translated to Kubernetes. We are more than happy to help, but for full transparency, one of the services we offer as paid support is clustered / high-availability deployments; see https://convergence.io/support/. The software will support it, but understanding performance requirements, SLAs, fault-tolerance use cases (e.g. availability vs. consistency), and how to appropriately scale, health-check, and monitor is pretty dependent on your cloud vendor and PaaS (Kubernetes, etc.). We are happy to help here, but really helping people get a bulletproof deployment that meets all of the above tends to require several hours of conversation at this point.
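To give a feel for what that translation might look like, here is a minimal Kubernetes Deployment sketch for the standalone server container. This is a hypothetical illustration, not an official manifest: the image name and tag, container port, and resource numbers are all assumptions, so check the deployment repo and the Docker Hub pages for the real values before using anything like this.

```yaml
# Hypothetical sketch only: image name/tag, port, and resource numbers are
# assumptions -- verify against the convergencelabs Docker Hub images.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: convergence-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: convergence-server
  template:
    metadata:
      labels:
        app: convergence-server
    spec:
      containers:
        - name: convergence-server
          image: convergencelabs/convergence-server:latest  # assumed image name
          ports:
            - containerPort: 8080  # assumed server port
          resources:
            requests:
              cpu: "500m"      # placeholder; size from your own load testing
              memory: "1Gi"    # placeholder
```

In practice you would also add a Service, health checks, and configuration (environment variables or a ConfigMap) derived from the docker-compose file in the deployment repo.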

Happy to discuss here or on slack. It might be worth a quick chat either way.

To initially answer your questions:

  1. It really depends on the application: how big are the models, how big are the individual mutations, how many operations per second does the average user generate, etc.
  2. The omnibus container (at this time) is not intended for clustering or for production. It was designed to get people up and running in development quickly so that they don’t have to deploy a bunch of different containers first.
  3. OrientDB is only packaged in the omnibus container so that folks don’t have to set up OrientDB themselves to try out Convergence. This is not the recommended production scenario.
  4. Yes, it can be. Often people will host-mount the OrientDB volume. This is a good question; we should add this to the Docker Hub page for the omnibus container.
  5. Generally we recommend folks just create a few models and see what the storage requirement is. We don’t have this feature yet, but one of the things on the roadmap is to allow for pruning of history over time. Also many of our users have a separate persistent store for their data and only use ephemeral models in convergence, so the data is transient. In that case you only need to provision for peak concurrent models.
  6. It also depends, but generally for folks running a single AWS instance we recommend something like a t2/t3.large if you are running all of the containers on one box.
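On point 4, a host-mounted volume might look like the following docker-compose fragment. This is a hypothetical sketch: the container-side path below is the standard layout of the stock OrientDB image, so verify the actual data path used by the image you deploy before relying on it.

```yaml
# Hypothetical fragment: persist OrientDB data on the host via a bind mount.
# The image tag and container path are assumptions -- check the image you use.
services:
  orientdb:
    image: orientdb:3.0  # assumed tag
    volumes:
      # host path : container path (standard OrientDB image databases dir)
      - ./orientdb/databases:/orientdb/databases
```

With this in place, the databases survive container restarts and upgrades, since the data lives on the host rather than in the container's writable layer.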

Hi Michael thanks for your reply!

Noted on the support part. I am looking to deploy it as a proof of concept on premises. As a start, I will just look at the docker-compose script and deploy that. Once I have firmer requirements I will definitely consider the support option to scale it correctly.

For context, the POC is similar to the mxGraph demo that you have, just using a different library.

So is the recommended approach still to at least deploy it on a VM?

As mentioned earlier, the scenario I am trying out is similar to the mxGraph demo, so do you have any minimum resource recommendation for this? In my case, yes, I also have a separate persistent store for the data that is being collaborated on. But Convergence Server provides the history of the models, which my persistent storage does not. So my point is: if history persistence becomes a requirement, then I would need to do some storage sizing. For now, I will go with the transient storage.
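Since there is no official formula, one back-of-envelope way to get a rough upper bound if history persistence becomes a requirement is to measure a few representative models (as suggested above) and extrapolate. Every number in this sketch is a placeholder assumption to be replaced with your own measurements; the overhead factor is a guess to cover indexes, metadata, and fragmentation.

```python
def estimate_model_storage_bytes(
    num_models: int,
    avg_snapshot_bytes: int,
    avg_ops_per_model: int,
    avg_op_bytes: int,
    overhead_factor: float = 1.5,  # guess: indexes, metadata, fragmentation
) -> int:
    """Rough upper bound: current snapshots plus retained operation history."""
    per_model = avg_snapshot_bytes + avg_ops_per_model * avg_op_bytes
    return int(num_models * per_model * overhead_factor)

# Example with placeholder numbers: 10,000 models, 50 KB snapshots,
# 5,000 retained ops of ~200 bytes each.
total = estimate_model_storage_bytes(10_000, 50_000, 5_000, 200)
print(f"{total / 1e9:.1f} GB")  # about 15.8 GB with these placeholders
```

If you only use ephemeral models (as in the transient-storage case above), `num_models` becomes the peak number of concurrent models rather than the total ever created, which shrinks the estimate dramatically.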

Noted on the t2/t3.large; is that the minimum?

Thanks for the replies once again!

I didn’t mean to imply that you have to deploy OrientDB in a VM. We do deploy it in a container; we just don’t deploy it in a container that has other stuff in it (like the omnibus container). When running OrientDB in a container there are some tricks to getting it to cluster properly, and obviously you want to mount a volume for the databases so that the data is persistent. Running it in a VM would be a bit easier, but since we deploy pretty much our whole infrastructure in k8s, we just wanted to have everything managed in the same way.
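For the Kubernetes case, persisting the databases usually means a PersistentVolumeClaim mounted at the data directory. The sketch below is hypothetical, not our actual manifest: the image tag, mount path, and storage size are assumptions you would replace with values from your own setup.

```yaml
# Hypothetical sketch: a single-replica OrientDB with persistent storage.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: orientdb
spec:
  serviceName: orientdb
  replicas: 1
  selector:
    matchLabels:
      app: orientdb
  template:
    metadata:
      labels:
        app: orientdb
    spec:
      containers:
        - name: orientdb
          image: orientdb:3.0  # assumed tag
          volumeMounts:
            - name: databases
              mountPath: /orientdb/databases  # standard OrientDB image path
  volumeClaimTemplates:
    - metadata:
        name: databases
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi  # placeholder; size from your own measurements
```

Clustering (distributed mode) adds more on top of this: peer discovery, a headless Service, and Hazelcast configuration, which is part of what the paid support conversation above covers.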

Generally a t2/t3.large is a good starting point if you are running OrientDB, the backend server, and the admin console all on one server.

We can go look at what the mxGraph demo is using and report back here. Might take a day or two since we are right in the middle of a release.

I see. Thanks Michael! Looking forward to your reply and the new release!