Cause
- v1.4.1's config.yaml.example had an em-dash (U+2014, a 3-byte UTF-8 sequence) inside the skip_nsfw comment. When DSM File Station or classic Notepad re-saves the file, it converts the em-dash to a single 0x97 byte (the Windows-1252 em-dash). That byte is invalid UTF-8, so yaml.safe_load crashes with UnicodeDecodeError and every GET / after PIN unlock returns 500.

Fix
- src/web/config_manager.load_config(): try utf-8 first; on UnicodeDecodeError, retry with cp1252 (which maps 0x97 → U+2014 correctly, so PyYAML accepts the text). Log a warning so the next save_config() normalizes the file back to UTF-8.
- config.yaml.example: replace the em-dash with a comma in the skip_nsfw comment so the canonical example is pure ASCII and cannot trigger this again.

Bump 1.4.1 -> 1.4.2 (patch: bugfix to a regression introduced by 1.4.1's own example file).

Verified
- 151 tests still green.
- Manual smoke: a YAML file containing a raw 0x97 byte loads correctly via the cp1252 fallback, with the warning logged. UTF-8 files are unchanged.

For existing users hit by this on the NAS, the immediate workaround is to remove the em-dash from the skip_nsfw comment line in their config.yaml (or just delete the comment entirely). After 1.4.2 the loader handles it transparently.
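
The decode-with-fallback step described under Fix could look roughly like the sketch below. It is not the actual body of load_config(): the helper name read_config_text and the log wording are assumptions made for illustration, and only the utf-8-then-cp1252 order comes from the commit message. The returned text would then be handed to yaml.safe_load.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def read_config_text(path):
    """Return the file's text, trying UTF-8 first and cp1252 second.

    Hypothetical helper: the real load_config() may be structured differently.
    """
    raw = Path(path).read_bytes()
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # A lone 0x97 byte is invalid UTF-8 but is the em-dash in cp1252.
        logger.warning("%s is not valid UTF-8; retrying as cp1252", path)
        # Python's cp1252 codec still rejects the few bytes it leaves
        # undefined (for example 0x81), so this can raise as well.
        return raw.decode("cp1252")
```

Reading bytes and decoding explicitly keeps the two attempts on the same data, so the file is opened only once and the fallback cannot see a different version of it.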
# Reddit Media Collector - Configuration
# Copy this file to config.yaml and customize as needed
# No Reddit API credentials required - uses public JSON endpoints

targets:
  # Subreddits to collect from
  subreddits:
    - name: "pics"
      limit: 25
      sort: "hot"  # hot, new, top, rising

    - name: "earthporn"
      limit: 50
      sort: "top"
      time_filter: "week"  # hour, day, week, month, year, all (only for "top" sort)

  # Users to collect from (their submitted posts)
  users: []
  # - name: "example_user"
  #   limit: 30

download:
  output_dir: "./downloads"
  media_types:
    - "image"
    - "video"
    - "gif"
  min_score: 10  # Minimum upvotes to download
  skip_nsfw: false  # Set to true to skip NSFW posts (default: false, collects everything)
  max_file_size_mb: 100  # Skip files larger than this

rate_limit:
  # Be gentle! Public API has stricter limits than authenticated
  requests_per_minute: 10  # Keep low to avoid 429 errors
  download_delay_seconds: 2  # Delay between file downloads

logging:
  level: "INFO"  # DEBUG, INFO, WARNING, ERROR
  file: "collector.log"
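
The commit message says the example file is now pure ASCII. A small check along these lines could be used to confirm that. It is only a sketch, and find_non_ascii is a hypothetical helper, not something the repository is known to contain.

```python
def find_non_ascii(text):
    """Return (line_number, column, character) for every non-ASCII character.

    Line and column numbers are 1-based. Hypothetical helper for illustration.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch))
    return hits
```

An empty result for the contents of config.yaml.example would mean an editor that re-saves the file as Windows-1252 cannot change any of its bytes.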