Always encode your Requests payloads in Python
At Bixoto, we use a lot of different APIs to interface with suppliers and other services. Today, I was working with an XML API using requests (via api_session) and xmltodict.
TL;DR: use requests.post(url, data=my_string.encode("utf-8")) and not requests.post(url, data=my_string).
Long version below:
The simplified code looked like this:
from api_session import APISession
import xmltodict


class TheClient(APISession):
    def post_xml_api(self, path: str, payload: dict) -> dict:
        # Transform a dict into an XML string
        xml = xmltodict.unparse(payload)
        # POST it to the API
        response = self.post_api(
            path,
            data=xml,
            headers={"Content-Type": "application/xml; charset=utf-8"},
        )
        # Parse the response XML as a dict again
        response.encoding = response.apparent_encoding
        return xmltodict.parse(response.text)

    def hello(self, name: str) -> str:
        res = self.post_xml_api("/hello", {"name": name})
        return res["message"]


# ...
client = TheClient(base_url="...")
print(client.hello("John"))  # => "Hello John!"
This worked great until I called client.hello() with a name that contained accents, such as “Élise”. The API provider complained that it wasn’t receiving UTF-8 data.
To debug the API client, I set up a simple server using nc in another terminal:
nc -l 1234
Then I used it as my base URL:
# note: this is a feature of api_session, not requests
client = TheClient(base_url="http://localhost:1234")
client.hello("Élise")
This is the resulting request:
POST /hello HTTP/1.1
Host: localhost:1234
User-Agent: python-requests/2.31.0
...
Content-Type: application/xml; charset=utf-8
Content-Length: 57

<?xml version="1.0" encoding="utf-8"?>
<name>�lise</name>
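There was indeed an issue with the encoding: the É was sent as a single byte that the terminal could not display as UTF-8, and the Content-Length of 57 matches one byte per character. I thought that Python used UTF-8 everywhere by default, but that’s not the case. The historical default charset for HTTP text bodies is ISO-8859-1, aka Latin-1 (see RFC 2616, section 3.7.1).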
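To make the mismatch concrete, here is a small sketch that encodes the same payload both ways. The XML string is written out by hand to mirror the dump above; that it matches what xmltodict produced is an assumption, made to keep the example free of dependencies.

```python
# Hand-written copy of the payload from the dump above (assumed to match
# what xmltodict.unparse produced for {"name": "Élise"}).
xml = '<?xml version="1.0" encoding="utf-8"?>\n<name>Élise</name>'

latin1_body = xml.encode("iso-8859-1")
utf8_body = xml.encode("utf-8")

# É is one byte in Latin-1 but two bytes in UTF-8.
print("Élise".encode("iso-8859-1"))  # b'\xc9lise'
print("Élise".encode("utf-8"))       # b'\xc3\x89lise'

# Only the Latin-1 length matches the Content-Length: 57 seen in the dump.
print(len(latin1_body), len(utf8_body))  # 57 58

# Reading the Latin-1 bytes as UTF-8 yields the replacement character.
garbled = latin1_body.decode("utf-8", errors="replace")
print(garbled.splitlines()[-1])  # <name>�lise</name>
```

Both the byte count and the � are what you would expect if the body had been encoded with the ISO-8859-1 default.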
Requests sends its requests through urllib3 and, underneath it, Python’s http.client, which follows that default. The http.client documentation says:
If body is a string, it is encoded as ISO-8859-1, the default for HTTP.
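The quoted behaviour can be checked with the standard library alone. The sketch below is not the setup I used: a tiny socket listener stands in for nc, and it calls http.client directly rather than going through Requests. The helper name `capture_one_request` and the hand-written payload are made up for the illustration.

```python
import http.client
import socket
import threading


def capture_one_request(server: socket.socket, sink: list) -> None:
    """Accept one connection and store every byte received until it closes."""
    conn, _ = server.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    sink.append(b"".join(chunks))


# Listen on an ephemeral local port, playing the role of `nc -l`.
server = socket.create_server(("127.0.0.1", 0))
port = server.getsockname()[1]
captured = []
listener = threading.Thread(target=capture_one_request, args=(server, captured))
listener.start()

# Send a str body, as in the original client code.
xml = '<?xml version="1.0" encoding="utf-8"?>\n<name>Élise</name>'
connection = http.client.HTTPConnection("127.0.0.1", port)
connection.request(
    "POST",
    "/hello",
    body=xml,
    headers={"Content-Type": "application/xml; charset=utf-8"},
)
connection.close()  # closing lets the listener see end-of-stream

listener.join(timeout=5)
server.close()

raw = captured[0]
print(raw.splitlines()[-1])  # b'<name>\xc9lise</name>'
```

The last line of the captured request contains the single byte `\xc9`, which is É in ISO-8859-1, even though the Content-Type header announces UTF-8.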
The solution is to explicitly encode the request body:
# Before
response = requests.post(url, data=xml_string)
# After
response = requests.post(url, data=xml_string.encode("utf-8"))
That way, the body is already a bytes object, so http.client sends it as-is instead of encoding it itself.
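One way to keep the body and the declared charset from drifting apart is to produce both in the same place. The helper below is a hypothetical sketch, not part of requests or api_session; it assumes the XML declaration in the string already names the same charset, as it does with the UTF-8 payload shown earlier.

```python
def encode_xml_body(xml: str, charset: str = "utf-8") -> tuple[bytes, dict[str, str]]:
    """Encode an XML string and build a Content-Type header naming the same charset.

    Hypothetical helper for illustration only.
    """
    body = xml.encode(charset)
    headers = {"Content-Type": f"application/xml; charset={charset}"}
    return body, headers


body, headers = encode_xml_body("<name>Élise</name>")
print(body)     # b'<name>\xc3\x89lise</name>'
print(headers)  # {'Content-Type': 'application/xml; charset=utf-8'}

# The result can then be passed on as bytes, for example:
#   requests.post(url, data=body, headers=headers)
```

Because the charset appears only once, in the function argument, the header cannot claim one encoding while the bytes use another.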

