Distributed wide-and-deep example using the tf.contrib.learn API gets stuck on k8s

I am new to distributed TensorFlow. I tried to run the distributed wide-and-deep example on a single-node k8s cluster, but all of the worker tasks get stuck at `INFO:tensorflow:Create CheckpointSaverHook.`

The same code runs fine on localhost and in plain Docker.

Here is my code: https://github.com/zhoudongyan/wide-and-deep

  • Docker version: 17.03.1-ce
  • k8s version: v1.6.3
  • TensorFlow version: 1.1.0 (Python 3)
  • OS: Ubuntu 14.04 64-bit

Does anyone know how to run it correctly? Thanks a lot!
