Why Monitoring is Non-Negotiable
Queue failures are silent by default. A job fails, goes to failed_jobs, and sits there.
No email, no alert, no dashboard notification — unless you set it up yourself.
Users experience broken features: emails never arrive, payments don't process, reports never generate.
Your team finds out from support tickets, not from monitoring.
Production queue monitoring answers three critical questions at all times:
Is the queue healthy? — Are jobs being processed at the expected rate?
What failed? — Which jobs failed, why, and how many times?
Is there a backlog? — Are jobs piling up faster than workers can process them?
Laravel Horizon – Deep Dive
Horizon is the official Laravel queue dashboard for Redis queues. It replaces queue:work
as your worker runner and adds a real-time web dashboard.
Installation
composer require laravel/horizon
php artisan horizon:install
# No migration step: Horizon keeps its job and metric data in Redis, not in a database table
Secure the Dashboard
// app/Providers/HorizonServiceProvider.php
protected function gate(): void
{
Gate::define('viewHorizon', function ($user) {
return in_array($user->email, config('horizon.allowed_emails', []));
});
}
// config/horizon.php
'allowed_emails' => [
'admin@yoursite.com',
'devops@yoursite.com',
],
Dashboard Sections Explained
Dashboard — real-time job throughput, wait times, and failure rate charts. Check this first when something seems slow.
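Monitoring — watch specific job tags (for example user:42) and list every job that carries them. Long-wait thresholds are configured separately, under waits in config/horizon.php.
Recent Jobs — recently processed jobs with status, duration, and queue name, kept for the period set under trim in config/horizon.php (one hour in the stock config). Invaluable for debugging.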
Failed Jobs — all failed jobs with their exception message, stack trace, and payload. One-click retry from the UI.
Metrics — throughput and average runtime, per job class and per queue. This is where you find your slowest job types.
Batches — batch progress tracking. See completion percentage, success/failure counts per batch.
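The Metrics screen stays empty until Horizon records periodic snapshots. A minimal sketch, assuming the scheduler lives in app/Console/Kernel.php as it does elsewhere in this article:

```php
// app/Console/Kernel.php (inside the schedule() method)
// Horizon's documentation suggests a five-minute interval for snapshots.
$schedule->command('horizon:snapshot')->everyFiveMinutes();
```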
Horizon Supervisor Configuration
// config/horizon.php
'environments' => [
'production' => [
// High-priority queue — more workers
'supervisor-high' => [
'connection' => 'redis',
'queue' => ['payments', 'emails'],
'balance' => 'auto',
'autoScalingStrategy' => 'time', // scale based on wait time
'minProcesses' => 2,
'maxProcesses' => 15,
'tries' => 3,
'timeout' => 60,
'nice' => 0, // OS process priority (0 = normal, negative = higher)
],
// Default queue — fewer workers
'supervisor-default' => [
'connection' => 'redis',
'queue' => ['default', 'notifications'],
'balance' => 'auto',
'minProcesses' => 1,
'maxProcesses' => 5,
'tries' => 3,
'timeout' => 90,
],
],
]
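With 'balance' => 'auto', two further supervisor options control how quickly Horizon moves processes between queues. The fragment below is a sketch to adapt, not a recommendation; the values shown are only a starting point and your workload may need different ones:

```php
// config/horizon.php, inside a supervisor definition that uses 'balance' => 'auto'
'balanceMaxShift' => 1, // add or remove at most one process per adjustment
'balanceCooldown' => 3, // wait three seconds between adjustments
```

A larger balanceMaxShift reacts faster to bursts but makes the process count swing more.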
Run Horizon (replaces queue:work)
# Development
php artisan horizon
# Production — use Supervisor to keep Horizon alive
[program:horizon]
command=/usr/local/bin/php /var/www/html/artisan horizon
autostart=true
autorestart=true
user=www-data
redirect_stderr=true
stdout_logfile=/var/www/html/storage/logs/horizon.log
stopwaitsecs=3600
Horizon Queue Monitoring – Wait-Time Alerts
// config/horizon.php
'waits' => [
'redis:default' => 60, // alert if wait time > 60s on default queue
'redis:emails' => 30, // alert if wait time > 30s on emails queue
'redis:payments' => 10, // alert if wait time > 10s on payments queue
],
// Listen for the event in a service provider
use Laravel\Horizon\Events\LongWaitDetected;
Event::listen(LongWaitDetected::class, function ($event) {
\Notification::route('slack', config('services.slack.ops_channel'))
->notify(new QueueWaitAlert($event->connection, $event->queue, $event->seconds));
});
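If a custom listener is more than you need, Horizon can also send its own long-wait notification. The sketch below assumes the Slack notification channel package is installed; the config key services.slack.ops_webhook and the channel name are placeholders, not values from this article:

```php
<?php

namespace App\Providers;

use Laravel\Horizon\Horizon;
use Laravel\Horizon\HorizonApplicationServiceProvider;

class HorizonServiceProvider extends HorizonApplicationServiceProvider
{
    public function boot(): void
    {
        parent::boot();

        // Horizon sends its built-in long-wait notification to this webhook
        // whenever a threshold from the 'waits' array is exceeded.
        Horizon::routeSlackNotificationsTo(
            config('services.slack.ops_webhook'), // hypothetical config key
            '#ops-alerts'                         // hypothetical channel name
        );
    }

    // The gate() method from "Secure the Dashboard" stays as it is.
}
```

The built-in notification gives you less control over wording than a custom listener, but it needs no notification class of your own.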
Laravel Telescope – Queue Insights
Telescope is Laravel's debugging assistant. For queues, it records every dispatched job —
its payload, status, number of attempts, exceptions thrown, and execution time.
It's essential for development and staging environments.
Installation
composer require laravel/telescope --dev # dev only, not production
php artisan telescope:install
php artisan migrate
What Telescope Shows for Queues
Jobs tab — every dispatched job with its full serialized payload, class name, queue, connection, delay, status (pending/processed/failed), and execution duration.
Exceptions tab — all exceptions thrown inside jobs, with full stack traces. Filter by job class to see why specific jobs keep failing.
Queries tab — filter by request tag to see every database query a specific job executed. Essential for catching N+1 problems inside jobs.
Tag Jobs for Easier Filtering
class SendWelcomeEmail implements ShouldQueue
{
public function __construct(protected User $user) {}
// Telescope will tag this job with "user:{id}" for easy filtering
public function tags(): array
{
return ['user:' . $this->user->id, 'email', 'onboarding'];
}
public function handle(): void
{
// ...
}
}
⚠️ Never run Telescope in production with all watchers enabled. It stores every request, query, and job to the database — which creates massive storage usage and slows down your app. Disable it in production or use it only temporarily for debugging.
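Two settings keep Telescope's storage under control. The stock config/telescope.php reads TELESCOPE_ENABLED from the environment, so setting TELESCOPE_ENABLED=false in the production .env switches recording off. Where Telescope does run, pruning old entries on a schedule stops the table from growing without limit. A minimal sketch, assuming the scheduler lives in app/Console/Kernel.php and that 48 hours of history is enough:

```php
// app/Console/Kernel.php (inside the schedule() method)
// Delete Telescope entries older than 48 hours, once a day.
$schedule->command('telescope:prune --hours=48')->daily();
```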
Sentry – Error Tracking for Jobs
Sentry captures exceptions from failed jobs and provides context: the full stack trace,
the user who triggered the action, breadcrumbs of what happened before the failure,
and grouping of similar errors so you see "this job failed 200 times for the same reason."
Installation
composer require sentry/sentry-laravel
# .env
SENTRY_LARAVEL_DSN=https://your-dsn@sentry.io/your-project
Sentry Automatically Captures Job Failures
Once installed, Sentry's Laravel SDK automatically captures any unhandled exception
from a queued job — no extra code needed in most cases.
Add Job Context to Sentry Reports
class ProcessPayment implements ShouldQueue
{
public function __construct(
protected Order $order,
protected User $user
) {}
public function handle(): void
{
// Add context for this job — shows in Sentry alongside the error
\Sentry\configureScope(function (\Sentry\State\Scope $scope) {
$scope->setUser(['id' => $this->user->id, 'email' => $this->user->email]);
$scope->setTag('order_id', (string) $this->order->id); // setTag() expects string values
$scope->setTag('job', 'ProcessPayment');
$scope->setContext('order', [
'id' => $this->order->id,
'total' => $this->order->total,
'status' => $this->order->status,
]);
});
// Your job logic...
PaymentGateway::charge($this->order);
}
}
Sentry Performance Monitoring for Jobs
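public function handle(): void
{
    // startTransaction() expects a TransactionContext object, not an array
    $context = new \Sentry\Tracing\TransactionContext();
    $context->setName('ProcessPayment');
    $context->setOp('queue.process');

    $transaction = \Sentry\startTransaction($context);
    \Sentry\SentrySdk::getCurrentHub()->setSpan($transaction);

    // Child span that times only the gateway call
    $spanContext = new \Sentry\Tracing\SpanContext();
    $spanContext->setOp('payment.charge');
    $span = $transaction->startChild($spanContext);

    try {
        PaymentGateway::charge($this->order);
        $transaction->setStatus(\Sentry\Tracing\SpanStatus::ok());
    } catch (\Throwable $e) {
        $transaction->setStatus(\Sentry\Tracing\SpanStatus::internalServerError());
        throw $e;
    } finally {
        // Finish the span and the transaction whether or not the charge succeeded
        $span->finish();
        $transaction->finish();
    }
}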
Custom Alerts & Notifications
Don't wait for users to report broken features. Build proactive alerting directly into your queue system.
Alert on Failed Job Threshold
// app/Console/Kernel.php: check every 5 minutes
$schedule->call(function () {
    $recentFailures = DB::table('failed_jobs')
        ->where('failed_at', '>=', now()->subMinutes(5))
        ->count();

    if ($recentFailures >= 10) {
        \Notification::route('slack', config('services.slack.alerts'))
            ->notify(new QueueFailureAlert(
                count: $recentFailures,
                window: '5 minutes'
            ));
    }
// Closure-based tasks need a name before withoutOverlapping() can be used
})->everyFiveMinutes()->name('queue-failure-alert')->withoutOverlapping();

// Alert on queue backlog
$schedule->call(function () {
    // Queue::size() works with any driver; the jobs table exists only for the database driver
    $pending = \Queue::size('payments');

    if ($pending > 500) {
        \Notification::route('slack', config('services.slack.alerts'))
            ->notify(new QueueBacklogAlert('payments', $pending));
    }
})->everyFiveMinutes();
Slack Alert Format
// app/Notifications/QueueFailureAlert.php
class QueueFailureAlert extends Notification
{
public function __construct(
protected int $count,
protected string $window
) {}
public function via($notifiable): array
{
return ['slack'];
}
public function toSlack($notifiable): SlackMessage
{
return (new SlackMessage)
->error()
->content("🚨 *Queue Alert* — {$this->count} jobs failed in the last {$this->window}")
->attachment(function ($attachment) {
$attachment
->title('View Failed Jobs', url('/horizon/failed'))
->fields([
'Failed Count' => $this->count,
'Time Window' => $this->window,
'Environment' => app()->environment(),
]);
});
}
}
Auto-Healing Failed Jobs
Some failures are transient: a network blip, a momentary API timeout.
You can schedule a command that periodically retries recent failures of this kind.
// app/Console/Commands/AutoRetryFailedJobs.php
class AutoRetryFailedJobs extends Command
{
    protected $signature = 'queue:auto-retry {--hours=1}';
    protected $description = 'Retry failed jobs from the last N hours';

    public function handle(): void
    {
        $hours = (int) $this->option('hours');

        $jobs = DB::table('failed_jobs')
            ->where('failed_at', '>=', now()->subHours($hours))
            ->get()
            // Only auto-retry jobs that aren't due to code bugs. The exception
            // column stores the full exception text (class, message, stack
            // trace), so compare the leading class name, not the whole value.
            ->filter(fn ($job) => \Illuminate\Support\Str::startsWith(
                $job->exception,
                $this->transientExceptions()
            ));

        if ($jobs->isEmpty()) {
            $this->info('No transient failures to retry.');
            return;
        }

        foreach ($jobs as $job) {
            $this->call('queue:retry', ['id' => [$job->uuid]]);
        }

        $this->info("Retried {$jobs->count()} failed jobs.");
    }

    private function transientExceptions(): array
    {
        return [
            'Illuminate\Http\Client\ConnectionException',
            'GuzzleHttp\Exception\ConnectException',
        ];
    }
}
// Schedule it
$schedule->command('queue:auto-retry --hours=1')->everyThirtyMinutes();
⚠️ Be careful with auto-retry. Only retry jobs that failed due to clearly transient errors (network timeouts, connection refused). Never auto-retry jobs that failed due to validation errors, missing models, or code bugs — they will keep failing and create noise.
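Deciding what counts as transient comes down to inspecting the text stored in the exception column of failed_jobs. Laravel stores the exception cast to a string, which in the common case begins with the exception class name, followed by the message and the stack trace. The helper below is a framework-free sketch of that prefix check; the function name isTransientFailure and the sample text are illustrative assumptions, not part of Laravel:

```php
<?php

declare(strict_types=1);

/**
 * Return true when the stored exception text begins with one of the
 * given class names.
 */
function isTransientFailure(string $exceptionText, array $transientClasses): bool
{
    foreach ($transientClasses as $class) {
        if (str_starts_with($exceptionText, $class)) {
            return true;
        }
    }

    return false;
}

$transient = [
    'Illuminate\Http\Client\ConnectionException',
    'GuzzleHttp\Exception\ConnectException',
];

// Sample text shaped like a failed_jobs.exception value (illustrative only).
$sample = "GuzzleHttp\\Exception\\ConnectException: cURL error 28: Operation timed out\nStack trace:\n#0 ...";

var_dump(isTransientFailure($sample, $transient));                 // bool(true)
var_dump(isTransientFailure('ErrorException: boom', $transient));  // bool(false)
```

A prefix check is deliberately strict: an exception that merely mentions a connection error somewhere in its trace does not qualify, which keeps code bugs out of the retry loop.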
Artisan Management Commands
A quick reference of the Artisan commands you will use most often to manage queues in production:
# Worker management
php artisan queue:work # start a worker
php artisan queue:work --once # process one job and stop
php artisan queue:work --stop-when-empty # stop when queue drains
php artisan queue:restart # gracefully restart all workers
# Failed job management
php artisan queue:failed # list all failed jobs
php artisan queue:retry all # retry all failed jobs
php artisan queue:retry {uuid} # retry one specific job
php artisan queue:forget {uuid} # delete one failed job
php artisan queue:flush # delete ALL failed jobs
php artisan queue:prune-failed --hours=48 # delete failures older than 48h
# Queue inspection
php artisan queue:monitor redis:default --max=50 # fire a QueueBusy event if redis:default has > 50 pending jobs
php artisan queue:clear redis --queue=emails # delete all pending jobs from a queue
# Horizon
php artisan horizon # start Horizon (replaces queue:work)
php artisan horizon:pause # pause all workers (jobs stay in queue)
php artisan horizon:continue # resume workers
php artisan horizon:pause-supervisor supervisor-1 # pause specific supervisor
php artisan horizon:terminate # gracefully stop Horizon
php artisan horizon:clear # delete all pending jobs from the default queue (use --queue=name for another)
php artisan horizon:snapshot # capture a metrics snapshot
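queue:monitor does not notify anyone by itself: it dispatches an Illuminate\Queue\Events\QueueBusy event when a queue exceeds the threshold, and it only checks when it runs, so it is normally put on the scheduler. A minimal listener sketch, assuming the QueueBacklogAlert notification used earlier in this article takes a queue name and a size:

```php
<?php

namespace App\Providers;

use App\Notifications\QueueBacklogAlert; // assumed location of the article's notification class
use Illuminate\Queue\Events\QueueBusy;
use Illuminate\Support\Facades\Event;
use Illuminate\Support\Facades\Notification;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    public function boot(): void
    {
        // QueueBusy exposes the connection name, queue name, and current size.
        Event::listen(function (QueueBusy $event) {
            Notification::route('slack', config('services.slack.alerts'))
                ->notify(new QueueBacklogAlert($event->queue, $event->size));
        });
    }
}
```

To run the check every minute, schedule it alongside the other tasks, for example `$schedule->command('queue:monitor redis:default --max=50')->everyMinute();`.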
Conclusion
Monitoring is the difference between reacting to failures and preventing them.
Here's the monitoring stack recommendation for each environment:
Development — Telescope. See every job, its payload, queries, and exceptions in real time.
Staging — Telescope + Sentry. Catch bugs before they reach production.
Production (Redis) — Horizon + Sentry + custom Slack alerts. Full visibility, real-time metrics, error tracking.
Production (Database) — Custom scheduler alerts + Sentry + queue:monitor. No Horizon (Redis only), but still viable monitoring.