gsub encoding error when running on Linux Docker Container Invalid byte sequence in US-ASCII

I have a string whose line endings I’m converting from "\r\n" to "\n" using:

input.gsub(/\r\n?/, "\n")

When I run it on my Windows host, it works fine. When I run it on my Linux host, in a Docker container, I get this error:

    in `gsub': invalid byte sequence in US-ASCII (ArgumentError)
    

I am running Ruby 2.2.

2 Solutions collected from the web for “gsub encoding error when running on Linux Docker Container Invalid byte sequence in US-ASCII”

    I fixed it by following this: Invalid byte sequence in UTF-8 (ArgumentError)

    Answering this question for documentation purposes. It looks like these encoding issues are not specific to a single string; they are a known issue in Docker, as documented here:

    https://oncletom.io/2015/docker-encoding/

    It seems that some images are based on the original Debian image, and:

    Debian removed their dependency on the locales package in 2011. It
    explains the unavailability of en_US.UTF-8.

    That would explain why some Docker images, which inherit this system configuration, fail to use UTF-8.

    $ docker run -ti --rm ruby:2.1.5 ruby -e 'puts STDIN.external_encoding'
    > US-ASCII
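
To connect that US-ASCII default to the error in the question, here is a minimal sketch. It assumes the input contains at least one non-ASCII byte; the sample string is made up for illustration and is not the asker's data. A string tagged as US-ASCII that holds such bytes is invalid, so matching a regex against it raises:

```ruby
# Made-up sample: "café" in UTF-8 bytes, tagged as US-ASCII the way input
# read under a US-ASCII external encoding would be (an assumption).
input = "caf\xC3\xA9\r\nline two\r\n".dup.force_encoding(Encoding::US_ASCII)

error_message = nil
begin
  input.gsub(/\r\n?/, "\n")
rescue ArgumentError => e
  # The bytes 0xC3 0xA9 are not valid US-ASCII, so the regex match fails.
  error_message = e.message
end

# Re-tagging the same bytes as UTF-8 (no transcoding) makes the string
# valid, and the original gsub call then works.
fixed = input.dup.force_encoding(Encoding::UTF_8).gsub(/\r\n?/, "\n")
```

Note that force_encoding only changes the label on the bytes; it is a workaround in code, whereas fixing the locale addresses the cause.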
    

    The locale command in the container shows empty or POSIX values where you would expect the system's settings:

    $ docker run -ti --rm ruby:2.1.5 locale
    LANG=
    LANGUAGE=
    LC_CTYPE="POSIX"
    LC_NUMERIC="POSIX"
    LC_TIME="POSIX"
    LC_COLLATE="POSIX"
    LC_MONETARY="POSIX"
    LC_MESSAGES="POSIX"
    LC_PAPER="POSIX"
    LC_NAME="POSIX"
    LC_ADDRESS="POSIX"
    LC_TELEPHONE="POSIX"
    LC_MEASUREMENT="POSIX"
    LC_IDENTIFICATION="POSIX"
    LC_ALL=
    

    Setting the LANG variable when starting the container should work:

    $ docker run -ti --rm -e LANG=C.UTF-8 ruby:2.1.5 ruby -e 'puts STDIN.external_encoding'
    > UTF-8
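
If you want to check from inside Ruby what it picked up from the environment, something like the sketch below may help. The hash keys are just labels chosen here; the output depends entirely on the container, so none is shown:

```ruby
# Collect the locale-related settings this Ruby process sees.
report = {
  "LANG"             => ENV["LANG"],
  "LC_ALL"           => ENV["LC_ALL"],
  # Encoding derived from the process locale.
  "locale encoding"  => Encoding.find("locale").name,
  # Encoding Ruby assumes for data read from files and STDIN by default.
  "default_external" => Encoding.default_external.name,
}

report.each { |key, value| puts "#{key}: #{value.inspect}" }
```

Running it once without and once with -e LANG=C.UTF-8 should show whether the variable reaches the Ruby process.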
    

    Even better, you can define it in the Dockerfile:

    #/Dockerfile
    FROM ruby:2.1.5
    ENV LANG C.UTF-8
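
If changing the image is not an option, an alternative (not part of the fix above, and only a sketch) is to state the encoding explicitly when reading the data, so the locale no longer matters. The helper name read_normalized and the sample content are hypothetical, and this assumes the file really is UTF-8:

```ruby
require "tempfile"

# Hypothetical helper: read a file as UTF-8 regardless of the locale,
# then normalize line endings as in the question.
def read_normalized(path)
  File.read(path, encoding: "UTF-8").gsub(/\r\n?/, "\n")
end

# Write made-up sample bytes to a temporary file and read them back.
normalized = Tempfile.create("sample") do |file|
  file.binmode
  file.write("caf\xC3\xA9\r\nline two\r".b)
  file.flush
  read_normalized(file.path)
end
```

This keeps the encoding decision next to the code that depends on it, at the cost of repeating it wherever input is read.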
    