One debugging method I can suggest is to use a request printer (for example, netcat run as nc -l 8080) and send requests from both libraries to it, so you can examine how the requests differ. Also check #534.
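If netcat is not available, a request printer can be sketched in a few lines of Python. This is only an illustration: the helper names `open_listener` and `capture_one_request` are made up for this example, and it only handles plain HTTP, so both libraries would need to be pointed at an http:// URL on the listening port rather than the real https:// site.

```python
import socket


def open_listener(port=8080, host="127.0.0.1"):
    """Create a listening TCP socket (hypothetical helper for this sketch)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))
    s.listen(1)
    return s


def capture_one_request(server_sock, max_bytes=65536, timeout=2.0):
    """Accept one connection and return the raw bytes the client sent.

    Reading stops at the end of the header block, when the client closes
    the connection, on timeout, or once max_bytes have been received.
    """
    conn, _addr = server_sock.accept()
    conn.settimeout(timeout)
    received = b""
    try:
        while len(received) < max_bytes:
            try:
                data = conn.recv(4096)
            except socket.timeout:
                break
            if not data:
                break
            received += data
            if b"\r\n\r\n" in received:
                break  # blank line marks the end of the request headers
    finally:
        conn.close()
    return received
```

Running `print(capture_one_request(open_listener(8080)).decode("latin-1"))` once per library and comparing the two printouts should show any difference in the request line and headers.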
urequests still using HTTP/1.0
The urequests module is still using HTTP/1.0 for HTTP requests:
line 39 in micropython-lib/micropython/urllib.urequest/urllib/urequest.py on the master branch:
s.write(b" HTTP/1.0\r\nHost: ")
Since a host header is included, can this be updated to use HTTP/1.1 or even HTTP/2.0?
Some sites (e.g. mongodb) are now responding with '426 Upgrade Required' for HTTP/1.0 requests.
Thanks
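To make the requested change concrete, here is a rough sketch of how a request head could be built with a selectable HTTP version. `build_request_head` is a hypothetical helper written for this illustration, not code from urequest.py. It assumes that a client moving to HTTP/1.1 would send `Connection: close` so the server closes the socket after one response.

```python
def build_request_head(method, host, path="/", version="1.1", headers=None):
    """Return the request line and headers as bytes (illustrative helper)."""
    lines = [
        "%s %s HTTP/%s" % (method, path, version),
        "Host: %s" % host,
    ]
    if version == "1.1":
        # Assumption for this sketch: ask the server to close the connection
        # after the response, since keep-alive is not handled here.
        lines.append("Connection: close")
    for name, value in (headers or {}).items():
        lines.append("%s: %s" % (name, value))
    # A blank line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

Switching the version string alone is the small part; the client also has to cope with whatever an HTTP/1.1 server is then allowed to send back.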
Micropython's urequests behaves differently than python's requests
Hi,
I am using MicroPython's urequests module to access a website, but it returns a 403 Access Denied error. However, when I use Python's requests module to access the same website, it returns the actual content of the website.
When using urequests:
import urequests
headers={'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'}
url="https://www.zara.com"
response = urequests.get(url, headers=headers)
print(response.content)
Returns:
b'<HTML><HEAD>\n<TITLE>Access Denied</TITLE>\n</HEAD><BODY>\n<H1>Access Denied</H1>\n \nYou don\'t have permission to access "http://www.zara.com/es/es/blazer-traje-estructura-p00706334.html?" on this server.<P>\nReference #18.c4d31102.1677369242.52175e1f\n</BODY>\n</HTML>\n'
When using Python's requests:
import requests
headers={'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'}
url="https://www.zara.com"
response = requests.get(url, headers=headers)
print(response.content)
It returns the actual content of the website.
I've tried looking into what could cause this problem but haven't been able to make urequests behave like requests.
I ran this code on MicroPython 1.23 on the Unix port. It returns 200 OK with the webpage data. So it seems that this issue has been fixed.
import requests
headers={'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'}
url="https://www.zara.com"
response = requests.get(url, headers=headers)
assert response.status_code == 200, response.content
print(response.content)
Hey! I also think it is better to update the header in line 39 of
micropython-lib/micropython/urllib.urequest/urllib/urequest.py. As far as I know, most major web servers and CDNs provide backward compatibility, so updating it should do no harm.
HTTP/2 is a binary protocol and uses HPACK for header compression. Changing to it would require additional code to handle the binary framing and the header compression.
I also think the change to HTTP/1.1 may require some additional code to handle the different connection types (close, keep-alive). In addition, there are different Transfer-Encodings.
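As an illustration of the extra work on the response side: an HTTP/1.1 server may reply with `Transfer-Encoding: chunked`, which an HTTP/1.0 client never has to deal with. Below is a minimal sketch of a chunked-body decoder. `read_chunked_body` is a hypothetical name, and the sketch assumes a binary file-like object positioned just after the response headers; it ignores chunk extensions and discards trailer fields.

```python
import io


def read_chunked_body(stream):
    """Decode a chunked HTTP body from a binary file-like object."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line:
            raise ValueError("stream ended before the terminating chunk")
        # The chunk size is hexadecimal; anything after ';' is an extension.
        size_text = size_line.split(b";", 1)[0].strip().decode("ascii")
        size = int(size_text, 16)
        if size == 0:
            # Skip optional trailer fields up to the final blank line.
            while stream.readline() not in (b"\r\n", b"\n", b""):
                pass
            return bytes(body)
        chunk = stream.read(size)
        if len(chunk) != size:
            # A raw socket may return short reads; a real client would loop.
            raise ValueError("chunk shorter than its declared size")
        body += chunk
        stream.readline()  # consume the CRLF that follows the chunk data


# Example with an in-memory stream standing in for a socket file.
example = io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
decoded = read_chunked_body(example)
```

On a constrained device it may be preferable to yield each chunk as it arrives instead of accumulating the whole body in memory.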
It would be good to merge back some of the changes from the CircuitPython requests library (they've updated it since originally starting with the MicroPython implementation). In it, they respond with v1.1, though I'm not sure it 100% conforms to spec...
https://github.com/adafruit/Adafruit_CircuitPython_Requests/blob/main/adafruit_requests.py