# yt-sync.sh Improvements
## Problem
- Current scan takes 1-2 hours
- Scans 200 videos × 15 channels = 3000 metadata fetches
- Most videos are skipped (already downloaded or too old)
- The hourly cron job starts a new run before the previous one has finished, so runs overlap
## Speed Improvements for yt-dlp
### 1. Break on existing (RECOMMENDED)
```bash
--break-on-existing
```
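Stops scanning as soon as it reaches a video that is already in the download archive (this only has an effect together with `--download-archive`). Channel playlists are listed newest first, so once we hit an archived video, everything after it is older.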
### 2. Break on date reject
```bash
--break-on-reject
```
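Stops at the first video that falls outside the `--dateafter` range. Because the playlist is ordered newest first, that is the first video that is too old. Recent yt-dlp releases list this flag as not recommended in favor of `--break-match-filters`, so check `yt-dlp --help` for the installed version.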
### 3. Reduce playlist scan depth
```bash
--playlist-end 50 # Instead of 200
```
Most channels don't post 50 videos in 30 days.
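The three flags above could be combined as in the following sketch. It assumes `YT_OPTS` is a bash array and uses a hypothetical archive path (`.archive.txt`); neither detail is confirmed by the current script. The final line works out the worst-case number of metadata fetches before and after lowering the scan depth.

```bash
#!/usr/bin/env bash
# Hypothetical archive location; --break-on-existing only has an effect
# when --download-archive is also given.
ARCHIVE_FILE="${YOUTUBE_DIR:-$HOME/youtube}/.archive.txt"

# Assumption: YT_OPTS is a bash array that is later expanded as "${YT_OPTS[@]}".
YT_OPTS=(
    --download-archive "$ARCHIVE_FILE"
    --break-on-existing      # stop at the first already-archived video
    --break-on-reject        # stop at the first video rejected by a filter
    --dateafter "$(date -d '30 days ago' '+%Y%m%d')"   # GNU date syntax
    --playlist-end 50        # scan at most 50 entries per channel
)

# Worst-case metadata fetches per run, before and after the depth change.
CHANNEL_COUNT=15
echo "before: $((200 * CHANNEL_COUNT))  after: $((50 * CHANNEL_COUNT))"
```

With the break flags in place the real number of fetches should usually be far below the worst case, since each channel stops at the first known or too-old video.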
### 4. Track last sync timestamp
Store last successful sync time and use tighter --dateafter:
```bash
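LAST_SYNC_FILE="$YOUTUBE_DIR/.last_sync"
# Capture the date before the run so a sync that crosses midnight misses nothing
SYNC_START=$(date '+%Y%m%d')
if [[ -f "$LAST_SYNC_FILE" ]]; then
    LAST_SYNC=$(cat "$LAST_SYNC_FILE")
    DATE_AFTER="--dateafter $LAST_SYNC"
else
    # First run: fall back to a 30-day window (GNU date syntax)
    DATE_AFTER="--dateafter $(date -d '30 days ago' '+%Y%m%d')"
fi
# After successful sync:
echo "$SYNC_START" > "$LAST_SYNC_FILE"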
```
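A slightly more defensive variant is sketched below. It wraps the lookup in a hypothetical helper, `date_after_value`, that falls back to the 30-day window when the state file is missing or does not contain exactly eight digits. The helper name and the validation rule are assumptions for illustration, not part of the current script.

```bash
#!/usr/bin/env bash
# Print the YYYYMMDD value to pass to --dateafter.
# Falls back to a 30-day window when the state file is missing or malformed.
date_after_value() {
    local state_file=$1 last=""
    [ -f "$state_file" ] && last=$(cat "$state_file")
    case $last in
        # Accept only eight digits; this does not check that the date is real
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) printf '%s\n' "$last" ;;
        *) date -d '30 days ago' '+%Y%m%d' ;;   # GNU date syntax
    esac
}
```

It could then be used as `yt-dlp --dateafter "$(date_after_value "$YOUTUBE_DIR/.last_sync")" ...`, which keeps a truncated or corrupted state file from producing an invalid `--dateafter` argument.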
### 5. Parallel channel downloads (aggressive)
Use GNU parallel to download multiple channels simultaneously:
```bash
parallel -j 3 yt-dlp [opts] ::: "${CHANNELS[@]}"
```
Risk: More likely to trigger rate limiting.
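Where GNU parallel is not installed, `xargs -P` is a possible substitute. The sketch below swaps in that technique and is not what the command above does; the function name `sync_channels_parallel` and the `DRY_RUN` switch are hypothetical and exist only to make the example safe to try.

```bash
#!/usr/bin/env bash
# Run one yt-dlp process per channel URL, at most $1 at a time.
# Set DRY_RUN=1 to print the commands instead of executing them.
sync_channels_parallel() {
    local jobs=$1; shift
    # NUL-delimited input keeps URLs with unusual characters intact;
    # xargs appends each URL as the last argument of the command.
    printf '%s\0' "$@" | xargs -0 -n 1 -P "$jobs" ${DRY_RUN:+echo} yt-dlp
}
```

A dry run such as `DRY_RUN=1 sync_channels_parallel 3 "${CHANNELS[@]}"` prints one `yt-dlp URL` line per channel, in no guaranteed order. Any yt-dlp options would go directly after `yt-dlp` inside the function.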
## Scheduling Options
### Option A: Systemd timer (prevents overlap)
```ini
# ~/.config/systemd/user/yt-sync.timer
[Unit]
Description=YouTube Sync Timer
[Timer]
OnCalendar=*-*-* 00,06,12,18:00:00
Persistent=true
[Install]
WantedBy=timers.target
```
```ini
# ~/.config/systemd/user/yt-sync.service
[Unit]
Description=YouTube Sync
[Service]
Type=oneshot
ExecStart=/home/cjennings/.local/bin/yt-sync.sh all
ExecStartPost=/home/cjennings/.local/bin/yt-sync.sh sync
```
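Systemd will not start the service again while the previous run is still active, so runs cannot overlap. Enable the timer with `systemctl --user enable --now yt-sync.timer`.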
### Option B: Lock file wrapper
```bash
#!/bin/bash
LOCKFILE="/tmp/yt-sync.lock"
exec 200>"$LOCKFILE"
flock -n 200 || { echo "Already running"; exit 1; }
# ... run sync ...
```
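The same locking idea can be packaged as a function that reports failure instead of exiting, which makes it easier to reuse and to test. This is a sketch under assumptions: the name `acquire_lock` is hypothetical, and descriptor 9 is used here in place of 200 (any free descriptor works).

```bash
#!/usr/bin/env bash
# Take an exclusive, non-blocking lock on the given file.
# Returns non-zero if another process already holds the lock.
# The lock is released when the locking process (and its children) exit.
acquire_lock() {
    local lockfile=$1
    exec 9>"$lockfile"   # open (or create) the lock file on fd 9
    flock -n 9           # fail immediately instead of waiting
}
```

Near the top of the sync script this might be called as `acquire_lock /tmp/yt-sync.lock || { echo "Already running" >&2; exit 1; }`. Because the lock lives on an open file descriptor, it disappears automatically if the script crashes, so no stale lock file cleanup is needed.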
### Option C: Longer cron interval
```cron
# Every 4 hours during off-peak
0 0,4,20 * * * /home/cjennings/.local/bin/yt-sync.sh all && /home/cjennings/.local/bin/yt-sync.sh sync
```
## Recommended Changes
1. Add `--break-on-existing` to YT_OPTS (biggest win)
2. Add `--break-on-reject` to YT_OPTS
3. Reduce `--playlist-end` to 50
4. Use systemd timer instead of cron
5. Optionally track last sync date for tighter filtering