Error Decoding msgpack data: invalid byte sequence in UTF-8
My new app uses msgpack to encode data before sending it to the server. The server
side is a Sinatra app that decodes the data and stores it in a database.
The app worked fine until I pushed real data. With real data, the app crashed with the error
"invalid byte sequence in UTF-8".
After a lengthy investigation, I found that the data I sent to the server was being decoded
incorrectly. The offending code looks like this:
unpacked = MessagePack.unpack(data)
What could possibly have gone wrong?
It turns out, as discussed here, that
msgpack is a binary serialization format, so it expects to unpack from a raw binary string. You
need to force the data string (from the HTTP POST request) to binary encoding:
MessagePack.unpack(data.force_encoding(Encoding::BINARY))
Now msgpack unpacks the data properly.
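To see what force_encoding actually changes, here is a minimal sketch that uses only core Ruby (no msgpack gem required). The three-byte payload is a hypothetical stand-in for a real request body, and tagging it as UTF-8 is my assumption about how the string arrived in the app. The point is that the bytes stay the same and only the encoding label on the string changes.

```ruby
# Hypothetical payload: three raw bytes standing in for a msgpack-encoded body.
# 0x92 is not a valid leading byte in UTF-8, so the string is invalid as UTF-8.
raw = [0x92, 0xC3, 0x28].pack("C*")

# Assumption: simulate a request body that has been tagged as UTF-8.
data = raw.dup.force_encoding(Encoding::UTF_8)
puts data.valid_encoding?    # the bytes are not valid UTF-8

# force_encoding only relabels the string; it does not transcode any bytes.
# It also mutates the receiver, so work on a copy to leave `data` untouched.
binary = data.dup.force_encoding(Encoding::BINARY)
puts binary.valid_encoding?  # any byte sequence is valid as binary
puts binary.bytes == data.bytes
```

Because force_encoding modifies the string in place, String#b is a handy alternative when you want a binary-tagged copy without touching the original, e.g. data.b.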
P.S. If you use JRuby and msgpack-jruby, beware of another issue: msgpack-jruby
behaves differently from the MRI version. It does not use the default_external
encoding, so you need to specify the encoding explicitly when unpacking (as
discussed here).