I was handling some text scraped with Scrapy, and the text contained escaped Unicode sequences such as \u003e (the escape for ">"). My first attempt didn't work:
html_text = response.text.encode('ascii', errors='ignore').decode()
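Here response.text is the string that contains the scraped text (Scrapy returns it as a Unicode str). The html_text still contained sequences like \u003e, because each one is made of plain ASCII characters (a backslash, a u, and four hex digits) and so passes through the ASCII encode step unchanged. This worked: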
html_text = response.text.encode('ascii', errors='ignore').decode('unicode-escape')
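To see the difference without running a spider, here is a minimal sketch that compares the two decode calls. The sample_text value is a made-up stand-in for response.text, not real Scrapy output:

```python
# Hypothetical sample standing in for response.text: it contains the six
# literal characters \u003e (an escaped ">") plus a real non-ASCII "é".
sample_text = "a \\u003e b caf\u00e9"

# Plain decode: the escape is already ASCII, so it passes through untouched.
# Only the real non-ASCII character is dropped by errors='ignore'.
plain = sample_text.encode("ascii", errors="ignore").decode()

# unicode-escape decode: the escape is interpreted as the character ">".
unescaped = sample_text.encode("ascii", errors="ignore").decode("unicode-escape")

print(repr(plain))      # 'a \\u003e b caf'
print(repr(unescaped))  # 'a > b caf'
```

Keep in mind that the unicode-escape codec also interprets other backslash sequences (for example a literal \n becomes a newline), and that errors='ignore' silently drops any genuine non-ASCII characters, so this approach probably only suits text where both effects are acceptable.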
Note that […]