gsub encoding error when running in a Linux Docker container: invalid byte sequence in US-ASCII

I have a string whose line endings I’m converting from "\r\n" to "\n" using:

input.gsub(/\r\n?/, "\n")

When I run it on my Windows host, it works fine. When I run it on my Linux host, in a Docker container, I get this error:

    in `gsub': invalid byte sequence in US-ASCII (ArgumentError)

I am running Ruby 2.2.

2 solutions collected from the web

    I fixed it by doing this: "Invalid byte sequence in UTF-8 (ArgumentError)"

    Answering this question for documentation purposes. It looks like this encoding issue is not isolated to a single string for the user, but is a known issue in Docker, as documented here:

    https://oncletom.io/2015/docker-encoding/

    It seems that some images are based on the original Debian image, and:

    Debian removed their dependency on the locales package in 2011. It
    explains the unavailability of en_US.UTF-8.

    That would explain why some Docker images, on inheriting that system configuration, fail to use UTF-8.

    $ docker run -ti --rm ruby:2.1.5 ruby -e 'puts STDIN.external_encoding'
    > US-ASCII
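
The link between that US-ASCII external encoding and the gsub error can be reproduced without Docker. The following is a minimal sketch, not taken from the original question: the sample bytes and variable names are made up for illustration, and `force_encoding` is used to simulate how Ruby tags input when the external encoding is US-ASCII.

```ruby
# Illustrative sample: the UTF-8 bytes for "café" followed by a Windows line ending.
raw = "caf\xC3\xA9\r\n".b

# Simulate input read under a US-ASCII external encoding:
# the bytes are unchanged, only the encoding tag differs.
ascii_tagged = raw.dup.force_encoding(Encoding::US_ASCII)
puts ascii_tagged.valid_encoding? # false, because 0xC3 and 0xA9 are not ASCII

error_message = nil
begin
  # A regexp match on a string with a broken encoding raises ArgumentError.
  ascii_tagged.gsub(/\r\n?/, "\n")
rescue ArgumentError => e
  error_message = e.message
  puts error_message
end

# The same bytes tagged as UTF-8 are valid, so gsub succeeds.
utf8_tagged = raw.dup.force_encoding(Encoding::UTF_8)
normalized = utf8_tagged.gsub(/\r\n?/, "\n")
p normalized
```

The string only fails when it contains non-ASCII bytes, which may be why the problem appears with some inputs and not others.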
    

    The locale command in the container shows empty values where the system's values should be:

    $ docker run -ti --rm ruby:2.1.5 locale
    LANG=
    LANGUAGE=
    LC_CTYPE="POSIX"
    LC_NUMERIC="POSIX"
    LC_TIME="POSIX"
    LC_COLLATE="POSIX"
    LC_MONETARY="POSIX"
    LC_MESSAGES="POSIX"
    LC_PAPER="POSIX"
    LC_NAME="POSIX"
    LC_ADDRESS="POSIX"
    LC_TELEPHONE="POSIX"
    LC_MEASUREMENT="POSIX"
    LC_IDENTIFICATION="POSIX"
    LC_ALL=
    

    Defining the LANG variable when starting the container should work:

    $ docker run -ti --rm -e LANG=C.UTF-8 ruby:2.1.5 ruby -e 'puts STDIN.external_encoding'
    > UTF-8
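
If the container environment cannot be changed, one possible workaround is to tag the string explicitly in Ruby before calling gsub. This is a sketch under the assumption that the input bytes really are UTF-8. It is not part of the original answer, and `normalize_line_endings` is a hypothetical helper name.

```ruby
# Hypothetical helper: assumes `input` holds UTF-8 bytes, whatever its encoding tag says.
def normalize_line_endings(input)
  # Work on a copy so the caller's string keeps its original encoding tag.
  text = input.dup.force_encoding(Encoding::UTF_8)
  # Replace any bytes that are still invalid so the regexp match cannot raise.
  text = text.scrub("?") unless text.valid_encoding?
  text.gsub(/\r\n?/, "\n")
end

puts normalize_line_endings("first\r\nsecond\rthird".b)
```

`String#scrub` is available from Ruby 2.1, so this should also apply to the Ruby 2.2 setup in the question. Setting LANG remains the simpler fix, since it covers every string the process reads.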
    

    Even better, you can define it in the Dockerfile:

    # Dockerfile
    FROM ruby:2.1.5
    ENV LANG C.UTF-8
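
To confirm that the setting took effect inside the container, the application could check Ruby's default external encoding at startup. This is a hedged sketch: `encoding_warning` is a hypothetical helper, not something the original answer proposes.

```ruby
# Hypothetical startup check: returns a warning string, or nil when the
# default external encoding is already UTF-8.
def encoding_warning(encoding = Encoding.default_external)
  return nil if encoding == Encoding::UTF_8

  "Default external encoding is #{encoding.name}; consider setting LANG=C.UTF-8"
end

message = encoding_warning
warn message if message
```

Run inside an image without the ENV line, this would be expected to print a warning that names US-ASCII, matching the `STDIN.external_encoding` output shown earlier.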
    