Puzzled by the cpushare setting on Docker.

I wrote a test program ‘cputest.py’ in Python like this:

    import time

    while True:
        for _ in range(10120*40):
            pass
        time.sleep(0.008)

This uses about 80% CPU when running in a container (without interference from other running containers).

Then I ran this program in two containers with the following two commands:

    docker run -d -c 256 --cpuset=1 IMAGENAME python /cputest.py
    
    docker run -d -c 1024 --cpuset=1 IMAGENAME python /cputest.py
    

and used ‘top’ to view their CPU usage. It turned out that they used roughly 30% and 67% CPU, respectively. I’m pretty puzzled by this result. Would anyone kindly explain it to me? Many thanks!

One solution for “Puzzled by the cpushare setting on Docker”:

    I sat down last night and tried to figure this out on my own, but ended up not being able to explain the 70 / 30 split either. So, I sent an email to some other devs and got this response, which I think makes sense:

I think you are slightly misunderstanding how task scheduling works – which is why the maths doesn’t work. I’ll try and dig out a good article, but at a basic level the kernel assigns slices of time to each task that needs to execute, in proportion to each task’s priority.

So with those priorities and the tight-looped code (no sleep), the kernel assigns 4/5 of the slots to a and 1/5 to b. Hence the 80/20 split.
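The 4/5 and 1/5 figures follow directly from the `-c` (CPU-share) values in the question; a quick sketch of the arithmetic:

```python
# CPU-share weights from the two `docker run -c` flags in the question.
shares_a = 1024
shares_b = 256

total = shares_a + shares_b
print(f"expected share for a: {shares_a / total:.0%}")  # 80%
print(f"expected share for b: {shares_b / total:.0%}")  # 20%
```

Note that these shares are only relative weights under contention; when the other container is idle, either one can use as much CPU as it likes.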

    However when you add in sleep it gets more complex. Sleep basically tells the kernel to yield the current task and then execution will return to that task after the sleep time has elapsed. It could be longer than the time given – especially if there are higher priority tasks running. When nothing else is running the kernel then just sits idle for the sleep time.
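You can observe the “could be longer” effect directly by timing a short sleep; a minimal sketch, using the same 8 ms sleep as cputest.py:

```python
import time

requested = 0.008  # same sleep interval as cputest.py
start = time.perf_counter()
time.sleep(requested)
elapsed = time.perf_counter() - start

# The kernel only guarantees sleeping *at least* the requested time;
# the wakeup can come late, especially on a loaded machine.
print(f"requested {requested * 1000:.1f} ms, slept {elapsed * 1000:.1f} ms")
```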

    But when you have two tasks the sleeps allow the two tasks to interweave. So when one sleeps the other can execute. This likely leads to a complex execution which you can’t model with simple maths. Feel free to prove me wrong there!

I think another reason for the 70/30 split is the way you are generating the “80% load”. The numbers you have chosen for the loop and the sleep just happen to work out on your PC with a single task executing. You could try basing the loop on elapsed time instead – so loop for 0.8 s, then sleep for 0.2 s. That might give you something closer to 80/20, but I don’t know.
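An elapsed-time version of the loop might look like the sketch below. This is an illustration of the suggestion above (an 0.8 s busy / 0.2 s sleep duty cycle), not something tested against the original containers; `one_cycle` is a name chosen here for illustration:

```python
import time

BUSY = 0.8   # seconds of busy-looping per duty cycle
IDLE = 0.2   # seconds of sleeping per duty cycle

def one_cycle():
    """Burn CPU until the deadline, then yield it for the rest of the cycle."""
    deadline = time.perf_counter() + BUSY
    while time.perf_counter() < deadline:
        pass             # busy-loop: consumes CPU
    time.sleep(IDLE)     # idle: yields the CPU

# The original cputest.py would call this inside `while True:`.
start = time.perf_counter()
one_cycle()
elapsed = time.perf_counter() - start
```

Basing the busy phase on wall-clock time makes the duty cycle independent of how fast the machine executes the loop body, unlike the fixed iteration count in the original script.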

So in essence, your time.sleep() call is skewing your expected numbers; removing the time.sleep() brings the CPU load far closer to what you’d expect.
