I’m part of a pilot program testing LLMs in our company’s software development workflows. We're taking a deliberately reserved approach so we can carefully measure how effective and safe these tools are for us and our clients. I'm sure estimating cost is part of the intent as well.
We use Azure’s AI Foundry for privacy‑compliant model hosting. LiteLLM provides cost tracking and token (API key) distribution for the developers in the pilot program. We also host an Open WebUI instance in AWS to provide a web frontend, and this frontend has its own API key as well.
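From a developer's seat, the whole stack presents itself as a single OpenAI-compatible endpoint. As a rough sketch (the base URL, key, and model alias below are placeholders, not our real values), a pilot client looks something like this:

```python
# Minimal sketch of a pilot developer's client setup; all values are placeholders.
# LiteLLM exposes an OpenAI-compatible API, so the standard OpenAI SDK works against it.
from openai import OpenAI

client = OpenAI(
    base_url="https://llm-gateway.example.com/v1",  # LiteLLM proxy in front of Azure AI Foundry
    api_key="sk-pilot-dev-key",                     # per-developer key issued through LiteLLM
)

response = client.chat.completions.create(
    model="gpt-4o",  # whichever deployment alias the proxy maps to an Azure model
    messages=[{"role": "user", "content": "Explain what this stack trace means."}],
)
print(response.choices[0].message.content)
```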
Unfortunately, the users of the pilot program have been plagued by 401 Unauthorized and 403 Forbidden responses. Our colleague who set up our LLM infrastructure indicated that it was a frustrating LiteLLM bug.
Over several months, I noticed that depending on the language I was working
with, these 401 and 403 responses would occur more or less frequently:
| Language | Frequency of 401/403 |
|---|---|
| HTML | Always |
| JSX/TSX | Always |
| JavaScript | Frequently |
| YAML | Frequently |
| C# | Sometimes |
While I was noting the frequency of these error responses, other colleagues were sharing their workarounds with me:
> Try replacing all the angle brackets!

> I just screenshot my code and upload that.
While the second one was clever, I did not want to waste tokens on vision decoding.
I decided to bother my colleague in charge of setting up our LLM infrastructure one more time:
> If LiteLLM is causing our problems, have you given any thought to using OpenRouter instead of LiteLLM for our backend?
My colleague mentioned that he would report the LiteLLM bug again, since the first report was closed as unreproducible. Something immediately clicked in my head: why would LiteLLM throw seemingly random 401 and 403 responses whose frequency depended on the request content?
> Hey, we don't proxy any of those requests, do we?
And my colleague replied:
> No, it's a direct connection from clients to LiteLLM. I've considered making a proxy specifically for catching and clearing this issue but haven't gotten the time.
>
> ...
>
> Wait, could this be the AWS XSS firewall rules?
Could our AWS WAF (web application firewall) be blocking what it thought were attempted XSS (cross-site scripting) attacks? While waiting for confirmation, I theorized:
> Based on what I'm seeing I think that could very well be the issue, if we're proxying requests through it.
It made so much sense. Of course a firewall that inspects HTTP requests for code injection would block requests containing code. But when you're talking to an LLM as a coding assistant, sending code in your requests is exactly what you want to do.
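If you suspect the same thing in your own setup, a minimal way to check (sketched below with a placeholder endpoint, key, and model) is to send the same chat request twice, once as plain prose and once with a bit of HTML/JavaScript in the prompt, and compare the status codes:

```python
# Hypothetical check: identical request shape, different payloads.
# Endpoint, key, and model are placeholders for whatever sits behind your WAF.
import requests

URL = "https://llm-gateway.example.com/v1/chat/completions"
HEADERS = {"Authorization": "Bearer sk-pilot-dev-key"}

prompts = {
    "plain prose": "Summarize the benefits of unit testing.",
    "code-bearing": "Why doesn't this render? <script>alert('hi')</script>",
}

for label, prompt in prompts.items():
    resp = requests.post(
        URL,
        headers=HEADERS,
        json={"model": "gpt-4o", "messages": [{"role": "user", "content": prompt}]},
        timeout=60,
    )
    # A WAF XSS rule inspects the request body and blocks before the request ever
    # reaches the backend, so only the code-bearing prompt should be rejected.
    print(f"{label}: HTTP {resp.status_code}")
```

If only the code-bearing prompt comes back blocked, the firewall is the culprit rather than the LLM gateway.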
While I waited for a response, I imagined Jeff Bezos swatting my hand away from my keyboard as I "attempted" to commit XSS attacks against our own web server.
Just a few minutes later from my colleague:
> Son of a #$@%*!
Learn from our mistake: WAFs can misinterpret LLM requests as XSS attempts, especially when the payload contains angle brackets or JavaScript-like syntax. Review and adjust your WAF rule set before rolling out LLM integrations.
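As a starting point for that review, here's a rough boto3 sketch (region and scope are placeholders) that lists which managed rule groups are attached to your web ACLs. The body-inspecting XSS rules typically live in AWSManagedRulesCommonRuleSet; once you've found them, the usual fix is a rule action override or a scope-down statement that exempts the LLM API path, rather than turning the rule off globally.

```python
# Sketch: inventory the managed rule groups on each web ACL so you can find the
# XSS rules that may be inspecting LLM request bodies. Region/scope are placeholders.
import boto3

wafv2 = boto3.client("wafv2", region_name="us-east-1")

# Use Scope="REGIONAL" for ALB/API Gateway web ACLs, "CLOUDFRONT" for CloudFront ones.
for acl in wafv2.list_web_acls(Scope="REGIONAL")["WebACLs"]:
    detail = wafv2.get_web_acl(Name=acl["Name"], Id=acl["Id"], Scope="REGIONAL")
    print(f"Web ACL: {acl['Name']}")
    for rule in detail["WebACL"]["Rules"]:
        managed = rule["Statement"].get("ManagedRuleGroupStatement")
        if managed:
            # e.g. AWS / AWSManagedRulesCommonRuleSet, which includes the
            # CrossSiteScripting_BODY rule that trips on code in request bodies.
            print(f"  managed rule group: {managed['VendorName']}/{managed['Name']}")
```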
