Resolve Article Scraping Issues via WAF
Introduction​
When our system attempts to retrieve articles from your website, your Audioboost Publisher Manager may flag scraping failures. These are typically caused by security measures blocking our requests.
Available solutions:
- Whitelist our User-Agent (speakup-article) on your server/WAF
- Whitelist a unique GET parameter appended to our article URLs
Understanding WAF Interference​
What is a Web Application Firewall (WAF)?​
A WAF is a security layer between your web server and external traffic. It filters malicious requests (e.g., SQL injection, DDoS attacks) using predefined rules. While essential for security, WAFs can inadvertently block legitimate scrapers like ours.
How WAFs Interfere with Scraping​
| Issue | Description |
|---|---|
| User-Agent Blocking | WAFs may ban unknown/unrecognized user agents |
| Rate Limiting | Frequent requests from a single IP address or agent trigger blocks |
| Signature Detection | GET parameters or headers may resemble attack patterns |
Common WAF Providers​
| WAF Provider | Deployment Model |
|---|---|
| Cloudflare | Cloud-based (SaaS) |
| Akamai Kona Site Defender | Cloud-based (SaaS) |
| AWS WAF | Cloud (integrated with AWS services) |
| Imperva | Cloud / hybrid |
| F5 Advanced WAF | On-premises/hardware |
Solution 1: Whitelist Speakup User-Agent​
Add our user-agent speakup-article to your WAF's allowlist.
Generic Steps​
- Access your WAF dashboard (e.g., Cloudflare, AWS WAF)
- Navigate to "Security Rules" > "Allowlists" (or equivalent)
- Create a new rule:
  - Match type: User-Agent
  - Value: speakup-article
- Set the rule action to ALLOW (bypass other checks)
- Save and deploy changes
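Once the rule is deployed, one way to check it yourself is to request an article with our User-Agent and look at the status code that comes back. The sketch below is a minimal, unofficial example that uses only the Python standard library. The function names and the example URL are placeholders introduced for illustration, and our actual scraper may send additional headers.

```python
import urllib.error
import urllib.request

SPEAKUP_USER_AGENT = "speakup-article"  # User-Agent value given in this guide


def build_request(url: str, user_agent: str = SPEAKUP_USER_AGENT) -> urllib.request.Request:
    """Build a GET request that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent}, method="GET")


def fetch_status(url: str, user_agent: str = SPEAKUP_USER_AGENT, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server (or WAF) answers with."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers raise HTTPError; the status code is still the result we want.
        return err.code


# Example call (placeholder URL, replace with one of your article URLs):
# fetch_status("https://example.com/article")
```

Comparing the result with the same call made under a different `user_agent` value can show whether the allowlist rule is what makes the difference.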
Cloudflare Configuration​
- Select the website that you want to manage
- In the right menu, select Security > WAF and click "Create rule"

- Create a new rule with these settings:
  - Field: "User Agent"
  - Operator: "contains"
  - Value: speakup-article
  - Action: Skip
- Mark the WAF components to skip as shown below:

- Click "Deploy"
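If you prefer Cloudflare's expression editor over the form fields, the settings above should correspond to a single expression on the `http.user_agent` field with the `contains` operator. The helper below is only an illustrative sketch (its name is made up) that prints the expression text so you can paste it into the editor.

```python
def cloudflare_user_agent_expression(user_agent: str = "speakup-article") -> str:
    """Return a Cloudflare rule expression matching requests whose User-Agent contains the value."""
    # Escape backslashes and double quotes so the value stays a valid quoted string.
    escaped = user_agent.replace("\\", "\\\\").replace('"', '\\"')
    return f'(http.user_agent contains "{escaped}")'


print(cloudflare_user_agent_expression())
# prints: (http.user_agent contains "speakup-article")
```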
Provider-Specific Documentation​
- AWS WAF: User-Agent Allowlisting
- Akamai: Modify Kona Rule Sets
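For AWS WAF, a User-Agent allowlist can be expressed as a byte-match rule on the `user-agent` header. The sketch below builds such a rule as a Python dictionary and prints it as JSON for the rule JSON editor. It assumes the current AWS WAF (WAFv2) rule format; the rule name, metric name, and priority are hypothetical, so check the result against the AWS documentation before deploying it.

```python
import json


def aws_waf_user_agent_rule(user_agent: str = "speakup-article", priority: int = 0) -> dict:
    """Build a WAFv2-style rule that allows requests whose User-Agent contains the value."""
    return {
        "Name": "AllowSpeakupUserAgent",  # hypothetical rule name
        "Priority": priority,  # pick a priority that runs before your blocking rules
        "Statement": {
            "ByteMatchStatement": {
                "SearchString": user_agent,
                "FieldToMatch": {"SingleHeader": {"Name": "user-agent"}},
                "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
                "PositionalConstraint": "CONTAINS",
            }
        },
        "Action": {"Allow": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "AllowSpeakupUserAgent",  # hypothetical metric name
        },
    }


print(json.dumps(aws_waf_user_agent_rule(), indent=2))
```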
Solution 2: Whitelist GET Parameter​
Use a unique parameter we'll append to article URLs:
https://example.com/article?scraper_token=speakup_article
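To see what the resulting URLs look like for articles that already carry query parameters, here is a small sketch using Python's standard `urllib.parse`. The function name is hypothetical, and how our scraper builds its URLs internally is not described here; the example only illustrates the URL shape your WAF rule needs to match.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_scraper_token(url: str, key: str = "scraper_token", value: str = "speakup_article") -> str:
    """Return the URL with the token parameter added, keeping existing query parameters."""
    parts = urlsplit(url)
    # Drop any previous copy of the key so the parameter appears exactly once.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != key]
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_scraper_token("https://example.com/article"))
# prints: https://example.com/article?scraper_token=speakup_article
print(with_scraper_token("https://example.com/article?page=2"))
# prints: https://example.com/article?page=2&scraper_token=speakup_article
```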
Generic Steps​
- In your WAF dashboard, locate "Allowlisted Parameters" (or "Ignore Rules")
- Add scraper_token to the allowlist
- Configure the rule:
  - Applies to path: /* (all articles)
  - Action: BYPASS or IGNORE
AWS WAF Example​
{
  "Name": "AllowSpeakupScraperToken",
  "Priority": 1,
  "Statement": {
    "ByteMatchStatement": {
      "SearchString": "speakup_article",
      "FieldToMatch": {
        "SingleQueryArgument": { "Name": "scraper_token" }
      },
      "TextTransformations": [
        { "Priority": 0, "Type": "NONE" }
      ],
      "PositionalConstraint": "EXACTLY"
    }
  },
  "Action": { "Allow": {} },
  "VisibilityConfig": {
    "SampledRequestsEnabled": true,
    "CloudWatchMetricsEnabled": true,
    "MetricName": "AllowSpeakupScraperToken"
  }
}
Verification​
Once you confirm that the whitelisting is in place, our team will run a new scraping attempt within 24 business hours.
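If you run a test request of your own before confirming, the status code gives a first hint about the outcome. The mapping below is a rough heuristic and an assumption on our part, not a rule every WAF follows: many WAFs answer blocked requests with 403 and rate-limited ones with 429, but some serve a challenge page with status 200 instead.

```python
def interpret_status(status: int) -> str:
    """Give a rough reading of an HTTP status code seen during a test request."""
    if 200 <= status < 300:
        return "allowed (check that the body is the article, not a challenge page)"
    if status == 403:
        return "likely blocked by a WAF or server rule"
    if status == 429:
        return "likely rate limited"
    return "inconclusive, check your WAF logs"


for code in (200, 403, 429, 503):
    print(code, "->", interpret_status(code))
```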
Need Help?​
For issues, contact us at support@audioboost.com with:
- Your WAF provider name
- Example article URLs
- Blocked request logs (if available)
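To collect blocked request logs, a short script can filter your access log for our requests. The sketch below assumes the common "combined" access-log layout and treats 403 and 429 as blocking statuses; both are assumptions, and the function name and sample lines are made up for illustration. WAF-side logs may use a different format altogether.

```python
import re

# Assumes the "combined" access-log layout; adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def blocked_speakup_requests(lines, blocked_statuses=(403, 429)):
    """Return log lines from our User-Agent or token URLs that got a blocking status."""
    hits = []
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not follow the assumed format
        ours = "speakup-article" in match["agent"] or "scraper_token=" in match["path"]
        if ours and int(match["status"]) in blocked_statuses:
            hits.append(line.rstrip("\n"))
    return hits


# Made-up sample lines (documentation IP range), for illustration only.
sample = [
    '203.0.113.7 - - [01/Jan/2025:10:00:00 +0000] "GET /article HTTP/1.1" 403 162 "-" "speakup-article"',
    '203.0.113.8 - - [01/Jan/2025:10:00:01 +0000] "GET /article HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]
for hit in blocked_speakup_requests(sample):
    print(hit)
```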
Summary​
Whitelisting either our User-Agent or the GET parameter ensures uninterrupted article scraping. Most WAFs support these adjustments via their dashboards.