Workers Free Tier Not Enough? 7 Optimization Tips to Make 100K Requests Last a Month

Introduction
Last month, I built an image hosting service using Workers with R2 storage, and it felt perfect. But less than 3 days later, I got an email from Cloudflare: your free tier is almost exhausted.
I was confused: isn’t it 100K requests per day? How much traffic could my tiny image hosting service have? I opened Analytics and saw 120K daily requests. How is that possible? I only uploaded a few dozen images.
After spending two days digging through documentation and community discussions, I finally understood how many pitfalls there are in Workers billing. Subrequests, KV reads, cache hits – each one was “quietly” consuming my quota.
But here’s the good news: after understanding the rules, I used several optimization techniques to cut daily requests from 120K to 30K. Now the free tier is not just enough; I have about 30% of it to spare. Today I’ll share these practical lessons, hoping to save you the $5/month paid plan fee.
Don’t Be Fooled! Workers’ “100K Times” Isn’t What You Think
To be honest, Cloudflare’s documentation says “100K requests/day,” which is easily misleading. I initially thought: my Worker can be accessed 100K times before hitting the limit. Actually, that’s not how it’s calculated.
Truth 1: Subrequests Aren’t Billed Separately, But Have Quantity Limits
Using fetch() in Workers to call other APIs, read R2, or query KV – these are all called subrequests. Good news: subrequests aren’t charged separately. Bad news: the free plan only allows 50 subrequests per request, paid plans get 1000.
For example: a user visits your image hosting URL (1 billable request), the Worker needs to query KV to verify permissions (1 subrequest), then fetch the image from R2 (1 subrequest). Although there are 2 subrequests, only 1 request is billed.
But if you build an aggregation service where one request calls 10 APIs, that’s 10 subrequests. The free plan’s limit of 50 sounds like a lot, but in real projects, you easily hit the ceiling.
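Since a fan-out bug can silently eat the whole budget, one defensive pattern is to route every upstream call through a counting wrapper that fails loudly at the cap. Here is a minimal sketch; `makeBudgetedFetch` and the stub fetch are my own illustrative names, not Workers APIs:

```javascript
// Defensive wrapper: count subrequests and throw a clear error instead
// of silently hitting the platform's 50-subrequest cap.
// `makeBudgetedFetch` is an illustrative helper, not a Workers API.
function makeBudgetedFetch(realFetch, limit = 50) {
  let used = 0;
  return (...args) => {
    if (used >= limit) {
      throw new Error(`subrequest budget of ${limit} exhausted`);
    }
    used += 1;
    return realFetch(...args);
  };
}

// Stub fetch so the sketch is self-contained; in a Worker you would
// pass globalThis.fetch here instead.
const stubFetch = async (url) => ({ ok: true, url });
const fetchWithBudget = makeBudgetedFetch(stubFetch, 3);
```

In a real aggregation Worker you would call `fetchWithBudget` everywhere you currently call `fetch`, so an accidental fan-out surfaces as a clear error in your logs instead of mysteriously failing subrequests.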
Truth 2: 100K Is an Account-Level Limit, Not Per Worker
This is quite a trap. I initially thought I could create multiple Workers to spread the traffic, but then discovered that the 100K/day limit applies to the entire account. Register 10 Workers and they still share a single 100K pool.
Want to bypass the limit with multiple Workers? It won’t work. People in the community have tried, and Cloudflare enforces the limit per account. To get more, you can only pay or cut your request count.
Truth 3: KV Reads/Writes and Cache API Operations Have Their Own Limits
This is the easiest one to overlook. A KV.get() doesn’t count toward the subrequest limit, but KV operations draw from their own free-tier quota (100K reads and 1K writes per day). If your Worker reads KV to verify permissions on every visit, each user visit burns a KV read on top of the billable request.
The Cache API is similar. Caching reduces requests to the origin, but match() and put() still cost CPU time on every invocation, so gratuitous cache churn isn’t free either.
The 3 Most Common Pitfalls
I stepped into these pitfalls, which caused my request count to explode:
Pitfall 1: Reverse proxy sends a subrequest every time, with no caching. I built an API relay service that `fetch()`ed the upstream API on every request. I didn’t think about caching, so every user request triggered a subrequest. After adding the Cache API and reaching an 80% hit rate, the request count was cut in half.
Pitfall 2: Frequent KV reads, without knowing about the `cacheTtl` parameter. In the image hosting project, every image access did a `KV.get()` to check permissions. I didn’t know you could set the `cacheTtl` parameter to let KV cache the value at edge nodes. After switching to `cacheTtl: 600` (10 minutes), KV reads dropped by 70%.
Pitfall 3: Every hop in a redirect chain counts. I once built a URL shortener where the Worker returned `302` redirects. I later discovered that if a chain has 3 hops (A→B→C→destination) that each pass back through the Worker, every hop is billed as another request (and redirects followed inside a `fetch()` count against the subrequest limit). Changing it to return the final address directly saved 2 hops per visit.
These pitfalls combined are how my usage blew past 100K to 120K. If you’re also running out of quota, first check whether you’ve stepped into any of them.
7 Optimization Tips to Make Free Tier Last a Month
After understanding the billing rules, I tried many optimization approaches. These 7 tips are battle-tested, with the most obvious effects and not too complex to implement.
Tip 1: Use Caching Well, Reduce 80% of Repeated Requests
This is the fastest method I’ve found. Many Worker projects don’t actually need real-time computation every time; results can be cached completely.
Before optimization, my image hosting project had to go through the complete flow every time an image was accessed: verify → read R2 → return. After optimization, I added Cache API:
```javascript
const cache = caches.default;
const cacheKey = new Request(request.url, request);

// Check the cache first
let response = await cache.match(cacheKey);
if (response) {
  return response; // Cache hit, return directly
}

// Cache miss: process the request
response = await handleRequest(request);

// Make the response cacheable (static images cached for 1 day).
// Note: {...response} does NOT copy a Response's status or headers;
// pass the response itself as the init object instead.
response = new Response(response.body, response);
response.headers.set('Cache-Control', 'public, max-age=86400');

await cache.put(cacheKey, response.clone());
return response;
```

After the change, the cache hit rate was 85%, and only about 18K of the original 120K daily requests still ran the full handler logic. (Strictly speaking, a Cache API hit is still a billed Worker invocation; what it saves directly is CPU time and the R2/KV subrequests. The headline request-count drop also depends on the `Cache-Control` header letting browsers and the CDN reuse responses.)
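The arithmetic behind that saving is worth making explicit. A throwaway helper for the math only (`originRequests` is my own name, not an API):

```javascript
// How many requests still run the full handler at a given cache hit rate?
// `originRequests` is an illustrative helper for the arithmetic only.
function originRequests(dailyRequests, hitRate) {
  return Math.round(dailyRequests * (1 - hitRate));
}

// 120K daily requests at an 85% hit rate leave 18K for the handler
console.log(originRequests(120000, 0.85)); // 18000
```

The same function also shows why pushing the hit rate from 80% to 90% halves the remaining load, which is why cache tuning pays off disproportionately.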
Tip 2: Three KV Optimization Moves
KV is the most commonly used storage in Workers, but also a major source of request counts. I’ve summarized 3 optimization points:
Move 1: Increase the cacheTtl parameter. KV caches values at edge nodes for 60 seconds by default. If your data doesn’t update often, raise it:

```javascript
// Before: default 60-second edge cache
const value = await KV.get('key');

// After: cache at the edge for 10 minutes
const cached = await KV.get('key', { cacheTtl: 600 });
```

My permission data only updates about every half hour, so I set `cacheTtl: 1800` and KV reads dropped by 70%.

Move 2: Cache KV results with the Cache API. If the data updates even more slowly (config files, blocklists), add another cache layer at the Worker level. Note that the Cache API keys on URLs, so the key must be a URL or Request, not a bare string:

```javascript
// Build a synthetic URL to use as the cache key
const cacheKey = new Request(`https://kv-cache.internal/${key}`);
let cached = await caches.default.match(cacheKey);
if (!cached) {
  const value = await KV.get(key);
  cached = new Response(value, {
    headers: { 'Cache-Control': 'max-age=3600' },
  });
  await caches.default.put(cacheKey, cached.clone());
}
return cached.text();
```

Move 3: Use waitUntil for non-blocking writes. If you need to write to KV but don’t need the result before responding, `waitUntil` keeps the write from blocking the response:

```javascript
// Don't wait for the KV write to complete before returning
event.waitUntil(KV.put('key', 'value'));
return new Response('OK');
```
These three moves combined reduced my KV operations from 30K daily to 8K.
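The edge-side behavior that `cacheTtl` buys you can also be approximated in your own code with a small in-memory TTL cache. A sketch under two assumptions worth flagging: `TtlCache` is my own class, not a Workers API, and each Worker isolate has its own memory (which can be evicted at any time), so this only helps warm isolates:

```javascript
// Minimal in-memory TTL cache, mimicking what KV's cacheTtl does at the
// edge. The clock is injectable so the expiry logic is easy to test.
class TtlCache {
  constructor(ttlSeconds, now = () => Date.now()) {
    this.ttlMs = ttlSeconds * 1000;
    this.now = now;
    this.entries = new Map();
  }
  get(key) {
    const entry = this.entries.get(key);
    if (!entry || this.now() - entry.at > this.ttlMs) return undefined;
    return entry.value;
  }
  set(key, value) {
    this.entries.set(key, { value, at: this.now() });
  }
}
```

Wrapping `KV.get` with this (check the cache, fall back to KV, store the result) means repeated reads inside a warm isolate never touch KV at all.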
Tip 3: Reduce Unnecessary Subrequests
Often, subrequests can be optimized away.
I used to build an aggregation service that called 5 external APIs, then merged and returned the results. Each request = 5 subrequests. Later I changed to:
- Cache high-frequency API results for 5 minutes
- Store low-frequency API results in KV for 24 hours
- If possible, pre-process data directly and store in R2
After optimization, 80% of requests don’t need to send any subrequests and return directly from cache.
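One more pattern that helps here: when several code paths in the same invocation need the same upstream resource, coalesce the in-flight calls so they share a single subrequest. A sketch; `coalesce` is my own helper name:

```javascript
// Coalesce concurrent identical calls: while a fetch for a key is in
// flight, later callers get the same promise instead of issuing a new
// subrequest. `coalesce` is an illustrative helper.
function coalesce(fetcher) {
  const inFlight = new Map();
  return (key) => {
    if (inFlight.has(key)) return inFlight.get(key);
    const p = Promise.resolve(fetcher(key)).finally(() => inFlight.delete(key));
    inFlight.set(key, p);
    return p;
  };
}
```

This doesn’t replace caching (the result is dropped once the promise settles); it only deduplicates concurrent calls, which is exactly the case a response cache misses.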
Tip 4: Use Request.cache to Control Caching Behavior
Cloudflare added the Request.cache property in November 2024, allowing more granular cache control:
```javascript
// Skip the cache (suitable for sensitive data)
const privateRes = await fetch(url, { cache: 'no-store' });

// Use the default caching strategy
const publicRes = await fetch(url, { cache: 'default' });
```

When handling users’ private images, I use `cache: 'no-store'` to ensure they are never cached by the CDN. For public images, I use `default` and let Cloudflare optimize automatically.
Tip 5: Optimize Redirect Chains
If your Worker returns redirects (301/302), note that every hop in the chain that passes back through your Worker is billed as another request, and redirects followed inside a fetch() count against the subrequest limit.
My URL shortener before optimization:
```javascript
// Before: return a 302 redirect on every hit
return Response.redirect(targetUrl, 302);
```

After optimization:

```javascript
// After: cache the resolved final address, reduce redirect hops
const cached = await cache.match(shortUrl);
if (cached) {
  return cached; // Return the cached final address directly
}
```

For links that hop several times (like bit.ly → t.co → final URL), store the final address directly so the redirect chain isn’t walked on every visit.
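Resolving the chain down to its final address is just a table walk you can run once, when the short link is created or first accessed. A sketch; `resolveChain` is my own helper, with a hop cap to guard against cycles:

```javascript
// Walk a redirect table to its final destination so the Worker can
// store and serve the final URL directly. `resolveChain` is an
// illustrative helper; maxHops guards against cyclic chains.
function resolveChain(redirects, start, maxHops = 10) {
  let url = start;
  for (let i = 0; i < maxHops; i += 1) {
    const next = redirects.get(url);
    if (next === undefined) return url; // no further hop: final address
    url = next;
  }
  throw new Error('redirect chain too long or cyclic');
}
```

Store the resolved result in KV next to the short code, and the Worker answers every later visit with a single hop.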
Tip 6: Monitor Request Distribution, Find the “Money Eaters”
Cloudflare’s Analytics is free, so definitely use it.
I discovered a pattern: 80% of request volume often comes from 20% of paths. In Workers Analytics, you can see:
- Which paths have the most requests
- Which paths have low cache hit rates
- Which paths take the longest
Find the paths with the largest request volume and optimize them specifically. I discovered that my /api/status health check endpoint was being polled by monitoring services roughly every 6 seconds, accounting for about 15% of daily requests. After adding 60-second caching to that endpoint, I immediately saved around 15K requests/day.
Tip 7: Process Non-Urgent Tasks Off-Peak
Workers support Cron Triggers for scheduled task execution. If some tasks aren’t real-time (like statistics, cleanup, cache warming), they can be moved to off-peak periods.
My image hosting has a feature: count visits every hour. Previously, every visit would write to a KV counter. Later I changed to:
- Count in memory during visits (don’t write to KV)
- Use Cron Trigger to aggregate once per hour
This turned frequent KV writes into one batch operation per hour. Daily KV writes dropped from 20K to 24.
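The count-in-memory-then-flush pattern above can be sketched as a tiny class. Two assumptions to flag: `HitCounter` is my own name, and in a real Worker the flush would be driven by the Cron Trigger, with the caveat that in-memory counts are per-isolate and can be lost on eviction (a deliberate trade of exactness for far fewer KV writes):

```javascript
// Accumulate counts in memory and flush them as one batched write
// instead of one KV.put per visit. Illustrative sketch.
class HitCounter {
  constructor(writeBatch) {
    this.counts = new Map();
    this.writeBatch = writeBatch; // e.g. a single KV.put of the batch
  }
  hit(key) {
    this.counts.set(key, (this.counts.get(key) || 0) + 1);
  }
  async flush() {
    const batch = Object.fromEntries(this.counts);
    this.counts.clear();
    await this.writeBatch(batch); // one write instead of N
    return batch;
  }
}
```

If losing an hour of counts on isolate eviction matters to you, this trade-off is wrong for your use case; for rough visit statistics it is usually fine.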
Real Case: Image Hosting Project from 120K → 30K Daily Requests
After talking about so many tips, you might wonder: what’s the combined effect? Let me use my own image hosting project as an example to completely review the optimization process.
Project Background
This is a simple image hosting service with the tech stack:
- Cloudflare Workers processing requests
- R2 storing image files
- KV storing image metadata and permissions
- About 2000 daily image visits (real user traffic)
On day 3 after launch, I received a Cloudflare alert: 120K daily requests, nearing the free tier limit.
Problem Diagnosis
I spent half a day using Analytics and code review to find 3 main problems:
Problem 1: Every image request queries KV. On every image visit, the Worker did a `KV.get()` for the metadata (filename, size, uploader). 2000 image visits = 2000 KV reads. But this metadata essentially never changes and doesn’t need to be fetched each time.
Problem 2: No browser caching. My responses didn’t set `Cache-Control`, so the browser re-requested every image whenever a user refreshed the page. The same image requested 10 times by the same user = the Worker invoked 10 times.
Problem 3: Real-time thumbnail generation. The image list page shows thumbnails, which I was cropping and generating in the Worker on the fly. One list page with 30 images = 30 image-processing requests, and when users page through the list, the request count explodes.
These three problems caused 2000 actual valid visits to become 120K Worker calls.
Optimization Approach
For these 3 problems, I optimized in priority order:
Step 1: KV Cache Optimization (Fastest Results)
Add cacheTtl parameter to KV reads:
```javascript
// Before
const metadata = await IMAGE_KV.get(imageId);

// After
const metadata = await IMAGE_KV.get(imageId, {
  cacheTtl: 600 // Cache at the edge for 10 minutes
});
```

Effect: KV reads dropped from 2,000/day to about 300/day (-85%)
Step 2: Add Browser Caching
Add cache headers to image responses:
```javascript
return new Response(imageData, {
  headers: {
    'Content-Type': 'image/jpeg',
    'Cache-Control': 'public, max-age=86400',       // Browser cache: 1 day
    'CDN-Cache-Control': 'public, max-age=2592000'  // CDN cache: 30 days
  }
});
```

Effect: Repeat visits dropped by 60%; daily requests fell from 120K to 48K
Step 3: Pre-generate Thumbnails
Instead of generating thumbnails in real time, I pre-generate them during upload and store them in R2’s thumbnails/ directory:

```javascript
// Generate the thumbnail once, during upload
const thumbnail = await generateThumbnail(image);
await R2.put(`thumbnails/${imageId}`, thumbnail);

// On access, read it straight from R2
const stored = await R2.get(`thumbnails/${imageId}`);
```

Effect: Thumbnail requests no longer trigger the Worker’s image-processing logic; daily requests fell from 48K to 32K
Step 4: Worker Cache Layer
Finally, wrap the whole response in the Cache API:

```javascript
const cache = caches.default;
let response = await cache.match(request);
if (response) return response;

// Cache miss: process the request
response = await handleRequest(request);

await cache.put(request, response.clone());
return response;
```

Effect: 78% cache hit rate; daily requests finally stabilized at around 30K
Optimization Results
| Metric | Before | After | Change |
|---|---|---|---|
| Daily requests | 120K | 32K | -73% |
| KV reads | 2000 | 300 | -85% |
| Cache hit rate | 0% | 78% | +78% |
| Cost | 20% over | 70% remaining | Save $60/year |
Now this image hosting has been running stably for 2 months, with 30-40K daily requests, and the free tier is completely sufficient. Based on the paid plan at $5/month, I saved $60 per year.
Lessons Learned
Looking back at this optimization, I’ve summarized several key points:
- Find bottlenecks first, then optimize: don’t optimize blindly; use Analytics to find the real problems
- Caching is king: 80% of the gains came from caching (browser cache, CDN cache, Worker cache)
- Use KV carefully: if a value can be cached, set `cacheTtl`; if it can be pre-computed, don’t query it in real time
- Optimize incrementally: I worked in 4 steps, each with a measurable effect, instead of changing everything at once
Pay or Optimize? Let’s Do the Math
At this point you might wonder: should I spend time optimizing or just pay for the upgrade? I struggled with this question too, but after doing the math it became clear.
Free Plan vs Paid Plan Comparison
| Item | Free Plan | Paid Plan ($5/month) |
|---|---|---|
| Daily requests | 100K | 10M/month (~330K/day) |
| Per-minute limit | 1000 | No explicit limit |
| Subrequests | 50/request | 1000/request |
| KV reads | 100K/day | 10M/month |
| CPU time | 10ms | 50ms |
| Annual cost | $0 | $60 |
From the data, the paid plan is indeed attractive: daily requests triple, subrequest limit expands 20x. But the key is, do you really need it?
When Should You Upgrade to Paid?
I’ve summarized 3 scenarios where I recommend paying directly:
Scenario 1: Daily requests consistently exceed 100K. Note the word “consistently”. If it’s just an occasional traffic spike, optimization can absorb it. But if you exceed the limit for a week straight, business volume has genuinely picked up, and paying is less hassle.
Scenario 2: You need lots of subrequests. If you’re building a scraper, an aggregation service, or an API gateway where one request calls 10+ external APIs, the free plan’s 50-subrequest limit definitely isn’t enough. There’s little room to optimize here, and the paid plan’s 1000-subrequest limit is the sane choice.
Scenario 3: Commercial project, where stability > cost. If it’s a project for clients or your own commercial product, don’t pinch pennies on the free tier. The paid plan comes with better support (you can open tickets when problems arise), and the stability $5 buys is worth far more than the saving.
Where Are the Limits of Optimization?
Conversely, some situations make optimization a waste of time:
Over-optimization leads to complex code To save requests, you write a bunch of caching logic, pre-computation scripts, scheduled tasks, and the code becomes very hard to maintain. At that point, your time cost might far exceed $5.
Optimization has reached its limit If you’ve added all the caching you can, optimized everything you can, and it’s still not enough, then it’s a business volume issue. Don’t tough it out, pay when you should.
Time cost vs $5 Calculate your hourly rate. If optimization takes 3 hours and your hourly rate exceeds $20, paying directly is more cost-effective.
My recommendation:
- Personal projects, learning projects: Prioritize optimization, save money and learn
- Small teams, early products: Optimize to the limit first, then consider paying
- Commercial projects, client projects: Pay directly, don’t save money on this
Still Need to Optimize After Paying?
Yes. Although the paid plan has higher quotas, the billing rules are the same. If you don’t optimize, you can use up 10 million requests too.
Moreover, the paid plan charges $0.50 per million requests over quota. If you average 1 million daily requests, that’s 30M a month: 20M over the included 10M, which is $10 in overage fees, plus the base $5 = $15/month. That’s when the value of optimization really shows.
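The overage arithmetic is easy to get wrong, so here it is as a throwaway function. The $5 base, 10M included requests, and $0.50-per-million overage rate are taken from the figures in this section; `monthlyCostUSD` is my own name:

```javascript
// Monthly Workers cost: $5 base plus $0.50 per million requests beyond
// the 10M included. Rates taken from this article, arithmetic only.
function monthlyCostUSD(dailyRequests, daysInMonth = 30) {
  const BASE = 5;
  const INCLUDED = 10_000_000;
  const OVERAGE_PER_MILLION = 0.5;
  const total = dailyRequests * daysInMonth;
  const over = Math.max(0, total - INCLUDED);
  return BASE + (over / 1_000_000) * OVERAGE_PER_MILLION;
}

// 1M requests/day = 30M/month = 20M over quota = $10 overage + $5 base
console.log(monthlyCostUSD(1_000_000)); // 15
```

Running your own daily average through this before upgrading tells you whether you are buying headroom or signing up for overage fees.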
Conclusion
After all this, the core message is one sentence: Understand how Workers billing works, then use the right optimization methods.
Workers’ 100K free tier looks like a lot, but in actual use it’s easy to exceed. But in most cases, it’s not that business volume is too high, it’s that optimization isn’t done well.
My image hosting project is an example: real traffic was only 2000 visits, yet it generated 120K requests. Through caching, KV optimization, and pre-computation, it finally dropped to 30K, which is not only within the free tier but leaves plenty of headroom.
If you’re also running out of quota, I suggest following these steps:
- Open Workers Analytics to see which paths are “eating” the quota
- Start with caching and KV, these two optimizations have the fastest results
- Check if you’ve stepped into the 3 pitfalls mentioned in this article (subrequests, KV cacheTtl, redirect chains)
- Try all 7 tips above, combined effects are better
- If optimization reaches its limit and still isn’t enough, then consider paying
One last reminder: Cloudflare is a commercial company after all, so use the free tier reasonably. Abusing it too aggressively might get you rate limited or even banned. The point of optimization is to use resources more efficiently, not to exploit loopholes.
If this article helped you save $5, feel free to like or share with friends who are also using Workers. Let’s all make the most of the free tier together.
Published on: Dec 1, 2025 · Modified on: Dec 4, 2025