gsub encoding error when running on Linux Docker Container Invalid byte sequence in US-ASCII

I have a string whose line endings I’m converting from "\r\n" to "\n" using:

input.gsub(/\r\n?/, "\n")

When I run it on my Windows host, it works fine. When I run on my Linux host, in a docker container, I get this error:

    in `gsub': invalid byte sequence in US-ASCII (ArgumentError)
    

I am running Ruby 2.2.

2 Solutions collected from the web for “gsub encoding error when running on Linux Docker Container Invalid byte sequence in US-ASCII”

    I fixed it by following the approach described in "Invalid byte sequence in UTF-8 (ArgumentError)".

    Answering this question for documentation purposes. It looks like these encoding issues are not isolated to a single string for this user, but are a known issue in Docker, as documented here:

    https://oncletom.io/2015/docker-encoding/

    It seems that some images depend on the original Debian image, and:

    Debian removed their dependency on the locales package in 2011. It
    explains the unavailability of en_US.UTF-8.

    That would explain why some Docker images, which inherit that system configuration, fail to use UTF-8.

    $ docker run -ti --rm ruby:2.1.5 ruby -e 'puts STDIN.external_encoding'
    > US-ASCII
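
To see how that US-ASCII default leads to the error in the question, here is a minimal sketch. It assumes the input holds UTF-8 bytes that end up tagged as US-ASCII, which is roughly what reading input under a POSIX locale would do. The sample string and the normalize_newlines helper are made up for illustration:

```ruby
# Sample input (an assumption): UTF-8 bytes for "café", tagged as US-ASCII.
input = "caf\xC3\xA9\r\nline two\r\n".dup.force_encoding(Encoding::US_ASCII)

# Hypothetical helper wrapping the gsub call from the question.
def normalize_newlines(text)
  text.gsub(/\r\n?/, "\n")
end

error_message = nil
begin
  normalize_newlines(input)
rescue ArgumentError => e
  # Bytes above 0x7F are not valid US-ASCII, so the regex match is refused.
  error_message = e.message
end

puts input.valid_encoding?
puts error_message
```

The string itself is untouched by Docker; only the encoding label Ruby attaches to it changes with the locale.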
    

    The locale command inside the container shows empty or POSIX values where the system's locale settings should be:

    $ docker run -ti --rm ruby:2.1.5 locale
    LANG=
    LANGUAGE=
    LC_CTYPE="POSIX"
    LC_NUMERIC="POSIX"
    LC_TIME="POSIX"
    LC_COLLATE="POSIX"
    LC_MONETARY="POSIX"
    LC_MESSAGES="POSIX"
    LC_PAPER="POSIX"
    LC_NAME="POSIX"
    LC_ADDRESS="POSIX"
    LC_TELEPHONE="POSIX"
    LC_MEASUREMENT="POSIX"
    LC_IDENTIFICATION="POSIX"
    LC_ALL=
    

    Setting the LANG environment variable when starting the container should work:

    $ docker run -ti --rm -e LANG=C.UTF-8 ruby:2.1.5 ruby -e 'puts STDIN.external_encoding'
    > UTF-8
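
If changing the environment is not an option, the string can probably be re-tagged inside Ruby instead. The following is only a sketch under the assumption that the input really is UTF-8; normalize_newlines_utf8 is a hypothetical helper name, and String#scrub requires Ruby 2.1 or later:

```ruby
# Hypothetical helper: label the bytes as UTF-8, then normalize line endings.
def normalize_newlines_utf8(text)
  # dup so the caller's string is not modified; force_encoding only changes
  # the encoding label, it does not convert any bytes.
  utf8 = text.dup.force_encoding(Encoding::UTF_8)
  # Replace any bytes that are still invalid so gsub cannot raise.
  utf8 = utf8.scrub("?") unless utf8.valid_encoding?
  utf8.gsub(/\r\n?/, "\n")
end

# Sample input (an assumption): UTF-8 bytes tagged as US-ASCII.
raw = "caf\xC3\xA9\r\nline two\r".dup.force_encoding(Encoding::US_ASCII)
result = normalize_newlines_utf8(raw)

puts result.encoding
puts result.inspect
```

Fixing the locale is still the more general solution, since it affects every string the program reads rather than one call site.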
    

    Even better, you can define it in the Dockerfile:

    #/Dockerfile
    FROM ruby:2.1.5
    ENV LANG C.UTF-8
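
If you want the application to notice a missing locale setting early, a small startup check along these lines might help. This is only a sketch; utf8_default? is a hypothetical helper, not part of Ruby or Docker:

```ruby
# Hypothetical helper: true when the given encoding is UTF-8.
# Defaults to Ruby's default external encoding, which on Linux is derived
# from the locale environment (for example LANG or LC_ALL).
def utf8_default?(encoding = Encoding.default_external)
  encoding == Encoding::UTF_8
end

unless utf8_default?
  warn "Default external encoding is #{Encoding.default_external}; " \
       "consider setting LANG=C.UTF-8 in the Dockerfile."
end
```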
    