Building a Local-First Desktop App with Python and PyQt6
I want to walk you through something I built recently: a desktop app that runs entirely on the user's machine. No cloud, no auth server, no database to manage.
If you're tired of the complexity that comes with modern web stacks, this might resonate.
Why local-first?
I was building a Reddit scraping tool. The first version was a web app: Next.js, Python backend, Supabase.
The problem: Reddit blocks datacenter IPs aggressively. My VPS got flagged in under 5 minutes. Proxies helped temporarily, but Reddit kept updating its detection, and my support inbox filled with "why am I blocked?" emails.
The fix was obvious once I stopped fighting it: run the app on the user's machine. Their home IP. Their connection. Reddit sees normal browsing because it IS normal browsing.
The stack
| Component | Choice | Why |
| --- | --- | --- |
| UI Framework | PyQt6 | Native look, 50MB vs 150MB+ Electron |
| Database | SQLite | One file, zero config, portable |
| HTTP | requests + feedparser | JSON API primary, RSS fallback |
| Packaging | PyInstaller | Single .exe output |
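To make the "one file, zero config" point concrete, here is a minimal sketch of storing scraped posts with Python's built-in sqlite3 module. The table layout and the `open_db` / `save_posts` helpers are assumptions for illustration, not the app's actual schema.

```python
import sqlite3
from typing import Dict, Iterable


def open_db(path: str = "posts.db") -> sqlite3.Connection:
    """Open (or create) the database file and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS posts (
            url    TEXT PRIMARY KEY,  -- hypothetical schema: one row per post URL
            title  TEXT NOT NULL,
            author TEXT
        )
        """
    )
    return conn


def save_posts(conn: sqlite3.Connection, posts: Iterable[Dict]) -> int:
    """Insert posts, skipping URLs already stored. Returns the number of new rows."""
    before = conn.total_changes
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO posts (url, title, author) "
            "VALUES (:url, :title, :author)",
            posts,
        )
    return conn.total_changes - before
```

Each post dict needs `url`, `title`, and `author` keys; extra keys are ignored. Because the whole database is a single file, backing it up or moving it to another machine is just a file copy.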
Core scraping logic
The actual scraping is almost embarrassingly simple:
import requests
from typing import List, Dict


def scrape_subreddit(name: str, limit: int = 100) -> List[Dict]:
    """Fetch posts from a subreddit using Reddit's JSON API."""
    url = f"https://reddit.com/r/{name}.json"
    params = {"limit": limit}
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
    }
    try:
        # A timeout keeps the UI from hanging on a dead connection
        response = requests.get(url, params=params, headers=headers, timeout=10)
    except requests.RequestException:
        # Network error or timeout: try the RSS feed instead
        return scrape_via_rss(name)
    if response.status_code == 200:
        data = response.json()
        return [post["data"] for post in data["data"]["children"]]
    # Fall back to RSS if the JSON endpoint is blocked
    return scrape_via_rss(name)
That's the core. A GET request from the user's IP. No proxy magic. No authentication headers. Just a plain request with a browser-style User-Agent.
The RSS fallback
Reddit sometimes blocks JSON API access but leaves RSS open. Having both gives you resilience:
import feedparser
from typing import List, Dict


def scrape_via_rss(subreddit: str) -> List[Dict]:
    """Fallback parser using the subreddit's RSS feed."""
    feed = feedparser.parse(f"https://reddit.com/r/{subreddit}.rss")
    posts = []
    for entry in feed.entries:
        posts.append({
            "title": entry.get("title", ""),
            "url": entry.get("link", ""),
            # Some entries may lack an author field, so avoid attribute access
            "author": entry.get("author"),
        })
    return posts
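One thing to watch: the JSON path returns Reddit's full post objects, while the RSS path returns only three fields. A small normalizer lets the rest of the app treat both the same way. This is a sketch, assuming the JSON post objects carry `title`, `author`, and `permalink` keys; `normalize_json_post` is a hypothetical helper name.

```python
from typing import Dict


def normalize_json_post(post: Dict) -> Dict:
    """Reduce a JSON API post object to the same keys the RSS fallback returns."""
    permalink = post.get("permalink", "")
    return {
        "title": post.get("title", ""),
        # Assumption: permalink is a site-relative path like "/r/python/comments/..."
        "url": f"https://www.reddit.com{permalink}" if permalink else post.get("url", ""),
        "author": post.get("author"),
    }
```

Applied to the JSON results, for example `[normalize_json_post(p) for p in posts]`, this gives both sources the same `title` / `url` / `author` shape.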
Handling subscriptions without a server
This was the tricky part. How do you enforce usage limits when the user has the binary?
Short answer: you can't. Not really.
My approach: Trust + lightweight verification
The app makes one API call per session to check subscription status. If the user is offline or the check fails, the app defaults to the free tier (15 scrapes/day). Could it be bypassed? Sure. But users who need this for business are happy to pay.
I decided early on that I wasn't going to punish paying customers with aggressive DRM just to stop a few pirates who were never going to pay anyway.
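The "check once, fail open to the free tier" idea can be sketched roughly like this. The status check is passed in as a callable so the sketch makes no claims about the real endpoint; the `"paid"` / `"free"` strings, `resolve_tier`, and `DailyQuota` are all hypothetical names, and only the 15-per-day limit comes from the description above.

```python
from datetime import date
from typing import Callable, Optional

FREE_DAILY_LIMIT = 15  # free-tier limit mentioned above


def resolve_tier(fetch_status: Callable[[], str]) -> str:
    """Run the once-per-session check; any failure falls back to the free tier."""
    try:
        status = fetch_status()
    except Exception:
        # Offline, blocked, or server error: degrade instead of locking the user out
        return "free"
    return "paid" if status == "paid" else "free"


class DailyQuota:
    """Counts scrapes per calendar day, resetting when the date changes."""

    def __init__(self, limit: Optional[int]) -> None:
        self.limit = limit  # None means unlimited
        self._day = date.today()
        self._used = 0

    def try_consume(self, today: Optional[date] = None) -> bool:
        """Record one scrape if the quota allows it; return False when exhausted."""
        today = today or date.today()
        if today != self._day:
            self._day, self._used = today, 0
        if self.limit is not None and self._used >= self.limit:
            return False
        self._used += 1
        return True
```

This counter lives in memory, so it resets when the app restarts. A real app would probably persist the count, and anyone determined can still edit it, which is consistent with the trust-based approach described here.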
The PyQt6 experience
I've used Electron before. It works, but shipping a whole Chromium instance for a utility app always felt wrong.
PyQt6 pros:
Native look and feel on Windows/Mac/Linux
Much smaller bundle (50MB vs 150MB+)
Better memory usage
Python ecosystem for everything else
PyQt6 cons:
Steeper learning curve than React
Documentation can be sparse
Signal/slot pattern takes getting used to
For a productivity tool like this, the trade-offs worked out. Not sure I'd use it for something more UI-heavy.
Results
After a few weeks:
Zero blocking issues. Users browse from home IPs.
No server costs. My hosting bill is literally $0.
Simpler support. Most issues are user-side, not infrastructure.
When NOT to go local-first
This approach has clear limits:
Need real-time collaboration? You need a server.
Mobile apps? Desktop-only is limiting.
Social features? Local data doesn't help.
But for single-user tools that talk to external APIs, especially APIs that fight scrapers, local-first is worth considering.
The tool is called Reddit Toolbox. Check it out at wappkit.com or search "Reddit Toolbox wappkit".