Prompts matching the #monitoring tag
Set up comprehensive monitoring with Prometheus and Grafana. Components: 1. Prometheus server with service discovery. 2. Node Exporter for system metrics. 3. Application instrumentation with custom metrics. 4. Alertmanager for notifications (PagerDuty, Slack). 5. Grafana dashboards for visualization (RED metrics, resource usage). 6. Recording rules for aggregations. 7. Alert rules for SLO violations. Use Docker Compose for local setup. Include retention policies and high-availability configuration.
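As a starting point for item 3 (application instrumentation with custom metrics), here is a minimal Python sketch using the prometheus_client library; the metric names, endpoint label, and port 8000 are illustrative assumptions, not part of the original prompt.

```python
# Minimal sketch: expose custom application metrics for Prometheus to scrape.
# Assumes the prometheus_client package; metric names and port 8000 are illustrative.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUEST_COUNT = Counter(
    "app_requests_total", "Total HTTP requests", ["endpoint", "status"]
)
REQUEST_LATENCY = Histogram(
    "app_request_duration_seconds", "Request latency in seconds", ["endpoint"]
)

def handle_request(endpoint: str) -> None:
    """Simulate a request and record RED-style metrics for it."""
    with REQUEST_LATENCY.labels(endpoint=endpoint).time():
        time.sleep(random.uniform(0.01, 0.2))  # stand-in for real work
    REQUEST_COUNT.labels(endpoint=endpoint, status="200").inc()

if __name__ == "__main__":
    start_http_server(8000)  # Prometheus scrapes http://localhost:8000/metrics
    while True:
        handle_request("/checkout")
```

A scrape job pointed at port 8000 (via the Docker Compose service discovery the prompt asks for) would then feed these series into the RED-metrics Grafana dashboards.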
Systematically improve product performance and user experience.
Performance metrics:
1. Core Web Vitals: Largest Contentful Paint (LCP <2.5s), First Input Delay (FID <100ms), Cumulative Layout Shift (CLS <0.1).
2. Time to First Byte (TTFB <600ms).
3. Time to Interactive (TTI <5s).
4. Application response times: API calls, database queries.
Performance monitoring:
1. Real User Monitoring (RUM): actual user experience data.
2. Synthetic monitoring: automated performance tests.
3. Server monitoring: CPU, memory, disk usage.
4. CDN analytics: cache hit rates, edge performance.
Optimization strategies:
1. Frontend: code splitting, lazy loading, image optimization, caching.
2. Backend: database query optimization, caching layers, microservices.
3. Infrastructure: CDN, load balancing, auto-scaling.
Tools: Google PageSpeed Insights, New Relic, Datadog for monitoring.
Performance budget: set thresholds, alert when exceeded, gate deployments on performance regression (see the sketch below).
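A sketch of the performance-budget gate this prompt ends on, assuming measured values arrive as a plain dict (for example exported from Lighthouse or a RUM tool); the thresholds mirror the ones listed above, and the function and key names are illustrative.

```python
# Minimal sketch of a performance-budget gate: exit non-zero when a measured
# metric exceeds its budget, so CI can block the deployment. Thresholds mirror
# the prompt; the metric dict and key names are illustrative assumptions.
import sys

BUDGET = {
    "lcp_s": 2.5,    # Largest Contentful Paint
    "fid_ms": 100,   # First Input Delay
    "cls": 0.1,      # Cumulative Layout Shift
    "ttfb_ms": 600,  # Time to First Byte
    "tti_s": 5,      # Time to Interactive
}

def check_budget(measured: dict) -> list:
    """Return human-readable violations; empty list means within budget."""
    return [
        f"{name}: {measured[name]} > {limit}"
        for name, limit in BUDGET.items()
        if name in measured and measured[name] > limit
    ]

if __name__ == "__main__":
    # Stand-in numbers; in CI these would come from Lighthouse or a RUM export.
    violations = check_budget({"lcp_s": 3.1, "fid_ms": 80, "cls": 0.05})
    if violations:
        print("Performance budget exceeded:", *violations, sep="\n  ")
        sys.exit(1)  # non-zero exit gates the deployment
```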
Build an automated data quality monitoring system. Checks to implement: 1. Completeness (null percentage per column). 2. Uniqueness (duplicate detection). 3. Validity (regex patterns, range checks). 4. Timeliness (data freshness alerts). 5. Consistency (cross-table referential integrity). Create a dashboard showing quality scores over time with alerting for threshold breaches. Use Great Expectations or custom Python validators.
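The prompt allows either Great Expectations or custom Python validators; below is a sketch of the custom route using pandas. The column names ("id", "email", "amount", "updated_at") and thresholds are illustrative assumptions; Great Expectations would express the same checks declaratively.

```python
# Sketch of custom data-quality checks over a pandas DataFrame, covering
# completeness, uniqueness, validity, and timeliness. Column names and
# thresholds are illustrative assumptions.
import pandas as pd

def quality_report(df: pd.DataFrame) -> dict:
    now = pd.Timestamp.now(tz="UTC")
    return {
        # 1. Completeness: worst null percentage across columns
        "max_null_pct": float(df.isna().mean().max() * 100),
        # 2. Uniqueness: duplicate primary keys
        "duplicate_ids": int(df["id"].duplicated().sum()),
        # 3. Validity: rows failing a simple email regex / amount range check
        "bad_emails": int((~df["email"].str.match(r"[^@\s]+@[^@\s]+\.[^@\s]+", na=False)).sum()),
        "out_of_range_amounts": int((~df["amount"].between(0, 10_000)).sum()),
        # 4. Timeliness: hours since the newest record (data freshness)
        "staleness_hours": float((now - df["updated_at"].max()).total_seconds() / 3600),
    }

if __name__ == "__main__":
    df = pd.DataFrame({
        "id": [1, 2, 2],
        "email": ["a@example.com", "bad-email", None],
        "amount": [10.0, -5.0, 99.0],
        "updated_at": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-02"], utc=True),
    })
    print(quality_report(df))  # these scores feed the dashboard and threshold alerts
```

Cross-table consistency checks (item 5) would follow the same pattern but join against the referenced table before counting orphaned keys.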
Handle errors and log effectively. Practices: 1. Catch errors at boundaries. 2. Specific error types vs generic. 3. User-friendly error messages. 4. Detailed logs for debugging. 5. Structured logging (JSON). 6. Log levels (ERROR, WARN, INFO, DEBUG). 7. Correlation IDs for tracing. 8. Never log sensitive data. Use Winston, Pino, or similar. Centralize logs with ELK or Datadog. Monitor error rates.
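The prompt names Winston and Pino (Node.js); as an illustration of practices 5-7 above (structured JSON logs, levels, correlation IDs), here is an analogous sketch using Python's standard logging module. The field names and the contextvar are assumptions for the example.

```python
# Sketch of structured (JSON) logging with a per-request correlation ID,
# analogous to a Winston/Pino setup; field names are illustrative assumptions.
import json
import logging
import uuid
from contextvars import ContextVar

correlation_id: ContextVar[str] = ContextVar("correlation_id", default="-")

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            "logger": record.name,
            "correlation_id": correlation_id.get(),
        }
        if record.exc_info:  # attach the stack trace when logging an exception
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("app")
log.addHandler(handler)
log.setLevel(logging.INFO)

def handle_request() -> None:
    correlation_id.set(str(uuid.uuid4()))  # one ID per request, attached to every line
    log.info("request received")           # note: no user data or secrets in the payload
    try:
        raise ValueError("boom")
    except ValueError:
        log.exception("request failed")     # ERROR level, with exc_info for debugging
```

Because every line carries the same correlation_id, the centralized store (ELK or Datadog, as the prompt suggests) can reassemble a single request's trail across services.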
Build comprehensive monitoring and observability infrastructure for production systems.
Monitoring stack architecture:
1. Metrics: Prometheus for collection, Grafana for visualization, 15-second scrape intervals.
2. Logging: ELK Stack (Elasticsearch, Logstash, Kibana) or EFK (Fluentd instead of Logstash).
3. Tracing: Jaeger for distributed tracing, OpenTelemetry for instrumentation.
4. Alerting: Alertmanager for routing, PagerDuty for escalation.
Key metrics to monitor:
1. Infrastructure: CPU (>80% alert), memory (>85%), disk space (>90%), network I/O.
2. Application: response time (<200ms target), error rate (<0.1%), throughput (requests/second).
3. Business: user signups, conversion rates, revenue metrics, feature usage.
Alerting best practices:
1. Alert fatigue prevention: meaningful alerts only, proper severity levels (critical/warning/info).
2. Runbook automation: automated remediation for common issues, escalation procedures.
3. On-call rotation: 7-day rotations, primary/secondary coverage, fair distribution.
Dashboard design:
1. Golden signals: latency, traffic, errors, saturation for each service.
2. SLA monitoring: 99.9% uptime target, error budget tracking (sketched below), service level indicators.
Log management: structured logging (JSON), log retention policies (90 days), centralized aggregation with filtering.
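To make the "99.9% uptime target, error budget tracking" item concrete, here is a small sketch of the arithmetic; the 30-day window and the request counts are illustrative assumptions standing in for values that would normally come from Prometheus recording rules.

```python
# Sketch of error-budget arithmetic for a 99.9% availability SLO over a
# 30-day rolling window; the request counts are illustrative stand-ins.
SLO_TARGET = 0.999  # 99.9% of requests must succeed

def error_budget_remaining(total_requests: int, failed_requests: int) -> float:
    """Fraction of the error budget still unspent (1.0 = untouched, <0 = blown)."""
    allowed_failures = total_requests * (1 - SLO_TARGET)
    if allowed_failures == 0:
        return 1.0
    return 1 - failed_requests / allowed_failures

if __name__ == "__main__":
    total, failed = 12_000_000, 7_800  # e.g. 30-day sums from recording rules
    print(f"SLO target: {SLO_TARGET:.3%}")
    print(f"Observed availability: {1 - failed / total:.4%}")
    print(f"Error budget remaining: {error_budget_remaining(total, failed):.1%}")
```

With these stand-in numbers about a third of the budget is left; an alert rule on that fraction (or a deploy freeze when it nears zero) is the usual way to act on it.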