Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redis_exporter do not rebase after redis restarts #257

Closed
Alexvianet opened this issue May 7, 2019 · 16 comments
Closed

Redis_exporter do not rebase after redis restarts #257

Alexvianet opened this issue May 7, 2019 · 16 comments

Comments

@Alexvianet
Copy link

After redis restarts: we got RedisExporterScrapeError:
max by(instance) (redis_exporter_last_scrape_error) != 0
bulk of redis instances with the same problem:

process_virtual_memory_bytes 1.4508032e+07
# HELP redis_exporter_build_info redis exporter build_info
# TYPE redis_exporter_build_info gauge
redis_exporter_build_info{build_date="2018-08-21-13:30:01",commit_sha="e0b388fbb784007cc887474f1be886d04f2dc973",golang_version="go1.10.3",version="v0.21.1"} 1                             
# HELP redis_exporter_last_scrape_duration_seconds The last scrape duration.
# TYPE redis_exporter_last_scrape_duration_seconds gauge
redis_exporter_last_scrape_duration_seconds 0.000395824
# HELP redis_exporter_last_scrape_error The last scrape error status.
# TYPE redis_exporter_last_scrape_error gauge
redis_exporter_last_scrape_error 1
# HELP redis_exporter_scrapes_total Current total redis scrapes.
# TYPE redis_exporter_scrapes_total counter
redis_exporter_scrapes_total 17941
# HELP redis_up upmetric
# TYPE redis_up gauge
redis_up{addr="MYIP:6379",alias=""} 0

It's going to normal after redis_exporter restart:

redis_process_id{addr="MYIP:6379",alias=""} 6119
# HELP redis_pubsub_channels pubsub_channelsmetric
# TYPE redis_pubsub_channels gauge
redis_pubsub_channels{addr="MYIP:6379",alias=""} 0
# HELP redis_pubsub_patterns pubsub_patternsmetric
# TYPE redis_pubsub_patterns gauge
redis_pubsub_patterns{addr="MYIP:6379",alias=""} 0
# HELP redis_rdb_changes_since_last_save rdb_changes_since_last_savemetric
# TYPE redis_rdb_changes_since_last_save gauge
redis_rdb_changes_since_last_save{addr="MYIP:6379",alias=""} 0
# HELP redis_rdb_current_bgsave_duration_sec rdb_current_bgsave_duration_secmetric
# TYPE redis_rdb_current_bgsave_duration_sec gauge
redis_rdb_current_bgsave_duration_sec{addr="MYIP:6379",alias=""} -1
# HELP redis_rdb_last_bgsave_duration_sec rdb_last_bgsave_duration_secmetric
# TYPE redis_rdb_last_bgsave_duration_sec gauge
redis_rdb_last_bgsave_duration_sec{addr="MYIP:6379",alias=""} -1
# HELP redis_rejected_connections_total rejected_connections_totalmetric
# TYPE redis_rejected_connections_total gauge
redis_rejected_connections_total{addr="MYIP:6379",alias=""} 0
# HELP redis_replication_backlog_bytes replication_backlog_bytesmetric
# TYPE redis_replication_backlog_bytes gauge
redis_replication_backlog_bytes{addr="MYIP:6379",alias=""} 1.048576e+06
# HELP redis_slowlog_length Total slowlog
# TYPE redis_slowlog_length gauge
redis_slowlog_length{addr="MYIP:6379",alias=""} 0
# HELP redis_start_time_seconds Start time of the Redis instance since unix epoch in seconds.
# TYPE redis_start_time_seconds gauge
redis_start_time_seconds{addr="MYIP:6379",alias=""} 1.557035176e+09
# HELP redis_total_system_memory_bytes total_system_memory_bytesmetric
# TYPE redis_total_system_memory_bytes gauge
redis_total_system_memory_bytes{addr="MYIP:6379",alias=""} 2.097545216e+09
# HELP redis_up upmetric
# TYPE redis_up gauge
redis_up{addr="MYIP:6379",alias=""} 1

redis_exporters logs :

time="2019-03-18T09:36:33Z" level=info msg="Redis Metrics Exporter v0.21.1    build date: 2018-08-21-13:30:01    sha1: e0b388fbb784007cc887474f1be886d04f2dc973    Go: go1.10.3"
time="2019-03-18T09:36:33Z" level=info msg="Providing metrics at :9121/metrics"
time="2019-03-18T09:36:33Z" level=info msg="Connecting to redis hosts: []string{\"10.233.171.140:6379\"}"
time="2019-03-18T09:36:33Z" level=info msg="Using alias: []string{\"\"}"

Do we have any solution except redis_exporter restarts ???

@oliver006
Copy link
Owner

Nothing else in the logs? Can you try running the exporter with the --debug flag to see if more verbose log output yields something useful?

You could also try updating to a more recent version of the exporter, 0.34.0 is the last one.

@Alexvianet
Copy link
Author

debug log:

time="2019-05-08T09:04:49Z" level=debug msg="info: # Keyspace"
time="2019-05-08T09:04:49Z" level=debug msg="info: "
time="2019-05-08T09:04:49Z" level=debug msg="c.Do() - running command: LATENCY [LATEST]"
time="2019-05-08T09:04:49Z" level=debug msg="c.Do() - done"
time="2019-05-08T09:04:49Z" level=debug msg="e.singleKeys: []exporter.dbKeyPair(nil)"
time="2019-05-08T09:04:49Z" level=debug msg="e.keys: []exporter.dbKeyPair(nil)"
time="2019-05-08T09:04:49Z" level=debug msg="allKeys: []exporter.dbKeyPair{}"
time="2019-05-08T09:04:49Z" level=debug msg="scrapeRedisHost() done"
time="2019-05-08T09:08:49Z" level=debug msg="Trying DialURL(): 10.233.171.148:6379"
time="2019-05-08T09:08:49Z" level=debug msg="DialURL() failed, err: parse 10.233.171.148:6379: first path segment in URL cannot contain colon"
time="2019-05-08T09:08:49Z" level=debug msg="Trying: Dial(): tcp 10.233.171.148:6379"
time="2019-05-08T09:08:49Z" level=debug msg="aborting for addr: 10.233.171.148:6379 - redis err: ERR invalid password"
time="2019-05-08T09:12:49Z" level=debug msg="Trying DialURL(): 10.233.171.148:6379"
time="2019-05-08T09:12:49Z" level=debug msg="DialURL() failed, err: parse 10.233.171.148:6379: first path segment in URL cannot contain colon"
time="2019-05-08T09:12:49Z" level=debug msg="Trying: Dial(): tcp 10.233.171.148:6379"
time="2019-05-08T09:12:49Z" level=debug msg="aborting for addr: 10.233.171.148:6379 - redis err: ERR invalid password"
time="2019-05-08T09:16:49Z" level=debug msg="Trying DialURL(): 10.233.171.148:6379"
time="2019-05-08T09:16:49Z" level=debug msg="DialURL() failed, err: parse 10.233.171.148:6379: first path segment in URL cannot contain colon"
time="2019-05-08T09:16:49Z" level=debug msg="Trying: Dial(): tcp 10.233.171.148:6379"
time="2019-05-08T09:16:49Z" level=debug msg="aborting for addr: 10.233.171.148:6379 - redis err: ERR invalid password"
time="2019-05-08T09:20:49Z" level=debug msg="Trying DialURL(): 10.233.171.148:6379"
time="2019-05-08T09:20:49Z" level=debug msg="DialURL() failed, err: parse 10.233.171.148:6379: first path segment in URL cannot contain colon"
time="2019-05-08T09:20:49Z" level=debug msg="Trying: Dial(): tcp 10.233.171.148:6379"
time="2019-05-08T09:20:49Z" level=debug msg="aborting for addr: 10.233.171.148:6379 - redis err: ERR invalid password"

@oliver006
Copy link
Owner

redis err: ERR invalid password" - that's odd, any idea how that could happen?

@Alexvianet
Copy link
Author

have no idea, i have found some issues according to "first path segment in URL cannot contain colon"
golang/go#19297

also i have change public ip to 127.0.0.1 in "-redis.addr=" fllag

#!/usr/bin/env bash

set -eu

RUN_DIR=/var/vcap/sys/run/redis_exporter
LOG_DIR=/var/vcap/sys/log/redis_exporter
TMP_DIR=/var/vcap/sys/tmp/redis_exporter
STORE_DIR=/var/vcap/store/redis_exporter
mkdir -p ${RUN_DIR} ${LOG_DIR} ${TMP_DIR} ${STORE_DIR}

PIDFILE=${RUN_DIR}/redis_exporter.pid

source /var/vcap/packages/redis_exporter/common/utils.sh
exec 1>> ${LOG_DIR}/$(basename "$0").stdout.log
exec 2>> ${LOG_DIR}/$(basename "$0").stderr.log

export PATH=/var/vcap/packages/redis_exporter/bin:${PATH}

case $1 in
  start)
    pid_guard ${PIDFILE} "redis_exporter"
    echo $$ > ${PIDFILE}

    exec redis_exporter \
       \
       \
       \
      -debug \
       \
       \
       \
       \
       \
      -redis.addr="127.0.0.1:6379" \
       \
       \
       \
      -redis.password="$(cat /var/vcap/store/redis/redis.conf | grep requirepass | awk '{print $2}')" \
       \
       \
       \
      -web.listen-address=":9121" \
       \
      >>  ${LOG_DIR}/redis_exporter.stdout.log \
      2>> ${LOG_DIR}/redis_exporter.stderr.log
    ;;

  stop)
    kill_and_wait ${PIDFILE}
    ;;

  *)
    echo "Usage: $0 {start|stop}"
    exit 1
    ;;

esac
exit 0

debug logs starts:

time="2019-05-08T21:07:16Z" level=info msg="Providing metrics at :9121/metrics"
time="2019-05-08T21:07:16Z" level=info msg="Connecting to redis hosts: []string{\"127.0.0.1:6379\"}"
time="2019-05-08T21:07:16Z" level=info msg="Using alias: []string{\"\"}"
time="2019-05-08T21:08:49Z" level=debug msg="Trying DialURL(): 127.0.0.1:6379"
time="2019-05-08T21:08:49Z" level=debug msg="DialURL() failed, err: parse 127.0.0.1:6379: first path segment in URL cannot contain colon"
time="2019-05-08T21:08:49Z" level=debug msg="Trying: Dial(): tcp 127.0.0.1:6379"
time="2019-05-08T21:08:49Z" level=debug msg="connected to: 127.0.0.1:6379"
time="2019-05-08T21:08:49Z" level=debug msg="Redis CONFIG err: ERR unknown command 'CONFIG'"
time="2019-05-08T21:08:49Z" level=debug msg="c.Do() - running command: INFO [ALL]"
time="2019-05-08T21:08:49Z" level=debug msg="c.Do() - done"
time="2019-05-08T21:08:49Z" level=debug msg="info: # Server"
time="2019-05-08T21:08:49Z" level=debug msg="info: redis_version:3.2.8"

@oliver006
Copy link
Owner

Could you try redis://127.0.0.1:6379 as the address?

@Alexvianet
Copy link
Author

time="2019-05-08T22:04:49Z" level=debug msg="c.Do() - running command: LATENCY [LATEST]"
time="2019-05-08T22:04:49Z" level=debug msg="c.Do() - done"
time="2019-05-08T22:04:49Z" level=debug msg="e.singleKeys: []exporter.dbKeyPair(nil)"
time="2019-05-08T22:04:49Z" level=debug msg="e.keys: []exporter.dbKeyPair(nil)"
time="2019-05-08T22:04:49Z" level=debug msg="allKeys: []exporter.dbKeyPair{}"
time="2019-05-08T22:04:49Z" level=debug msg="scrapeRedisHost() done"
time="2019-05-08T22:08:49Z" level=debug msg="Trying DialURL(): 127.0.0.1:6379"
time="2019-05-08T22:08:49Z" level=debug msg="DialURL() failed, err: parse 127.0.0.1:6379: first path segment in URL cannot contain colon"
time="2019-05-08T22:08:49Z" level=debug msg="Trying: Dial(): tcp 127.0.0.1:6379"
time="2019-05-08T22:08:49Z" level=debug msg="aborting for addr: 127.0.0.1:6379 - redis err: ERR invalid password"
time="2019-05-08T22:12:49Z" level=debug msg="Trying DialURL(): 127.0.0.1:6379"
time="2019-05-08T22:12:49Z" level=debug msg="DialURL() failed, err: parse 127.0.0.1:6379: first path segment in URL cannot contain colon"
time="2019-05-08T22:12:49Z" level=debug msg="Trying: Dial(): tcp 127.0.0.1:6379"
time="2019-05-08T22:12:49Z" level=debug msg="aborting for addr: 127.0.0.1:6379 - redis err: ERR invalid password"
time="2019-05-08T22:16:49Z" level=debug msg="Trying DialURL(): 127.0.0.1:6379"

or u mean -redis.addr="redis://127.0.0.1:6379" ???

@oliver006
Copy link
Owner

Yes, -redis.addr="redis://127.0.0.1:6379" ???

@Alexvianet
Copy link
Author

time="2019-05-09T19:05:48Z" level=debug msg="keys: []exporter.dbKeyPair(nil)"
time="2019-05-09T19:05:48Z" level=debug msg="singleKeys: []exporter.dbKeyPair(nil)"
time="2019-05-09T19:05:48Z" level=info msg="Providing metrics at :9121/metrics"
time="2019-05-09T19:05:48Z" level=info msg="Connecting to redis hosts: []string{\"redis://127.0.0.1:6379\"}"
time="2019-05-09T19:05:48Z" level=info msg="Using alias: []string{\"\"}"
time="2019-05-09T19:08:49Z" level=debug msg="Trying DialURL(): redis://127.0.0.1:6379"
time="2019-05-09T19:08:49Z" level=debug msg="connected to: redis://127.0.0.1:6379"
time="2019-05-09T19:08:49Z" level=debug msg="Redis CONFIG err: ERR unknown command 'CONFIG'"
time="2019-05-09T19:08:49Z" level=debug msg="c.Do() - running command: INFO [ALL]"
time="2019-05-09T19:08:49Z" level=debug msg="c.Do() - done"
time="2019-05-09T19:08:49Z" level=debug msg="info: # Server"
time="2019-05-09T19:08:49Z" level=debug msg="info: redis_version:3.2.8"
time="2019-05-09T19:08:49Z" level=debug msg="info: redis_git_sha1:00000000"
time="2019-05-09T19:08:49Z" level=debug msg="info: redis_git_dirty:0"
time="2019-05-09T19:08:49Z" level=debug msg="info: redis_build_id:35fe56cfc9186c17"
time="2019-05-09T19:08:49Z" level=debug msg="info: redis_mode:standalone"
time="2019-05-09T19:08:49Z" level=debug msg="info: os:Linux 4.4.0-133-generic x86_64"
time="2019-05-09T19:08:49Z" level=debug msg="info: arch_bits:64"
time="2019-05-09T19:08:49Z" level=debug msg="info: multiplexing_api:epoll"
time="2019-05-09T19:08:49Z" level=debug msg="info: gcc_version:4.8.4"
time="2019-05-09T19:08:49Z" level=debug msg="info: process_id:5550"
time="2019-05-09T19:08:49Z" level=debug msg="info: run_id:d636b100e59921cf50c5e20e724aa0dee52da5ac"
time="2019-05-09T19:08:49Z" level=debug msg="info: tcp_port:6379"
time="2019-05-09T19:08:49Z" level=debug msg="info: uptime_in_seconds:3391"
time="2019-05-09T19:08:49Z" level=debug msg="info: uptime_in_days:0"
...
time="2019-05-09T19:08:49Z" level=debug msg="info: # CPU"
time="2019-05-09T19:08:49Z" level=debug msg="info: used_cpu_sys:1.26"
time="2019-05-09T19:08:49Z" level=debug msg="info: used_cpu_user:0.60"
time="2019-05-09T19:08:49Z" level=debug msg="info: used_cpu_sys_children:0.00"
time="2019-05-09T19:08:49Z" level=debug msg="info: used_cpu_user_children:0.00"
time="2019-05-09T19:08:49Z" level=debug msg="info: "
time="2019-05-09T19:08:49Z" level=debug msg="info: # Commandstats"
time="2019-05-09T19:08:49Z" level=debug msg="info: cmdstat_auth:calls=15,usec=92,usec_per_call=6.13"
time="2019-05-09T19:08:49Z" level=debug msg="info: "
time="2019-05-09T19:08:49Z" level=debug msg="info: # Cluster"
time="2019-05-09T19:08:49Z" level=debug msg="info: cluster_enabled:0"
time="2019-05-09T19:08:49Z" level=debug msg="info: "
time="2019-05-09T19:08:49Z" level=debug msg="info: # Keyspace"
time="2019-05-09T19:08:49Z" level=debug msg="info: "
time="2019-05-09T19:08:49Z" level=debug msg="c.Do() - running command: LATENCY [LATEST]"
time="2019-05-09T19:08:49Z" level=debug msg="c.Do() - done"
time="2019-05-09T19:08:49Z" level=debug msg="e.singleKeys: []exporter.dbKeyPair(nil)"
time="2019-05-09T19:08:49Z" level=debug msg="e.keys: []exporter.dbKeyPair(nil)"
time="2019-05-09T19:08:49Z" level=debug msg="allKeys: []exporter.dbKeyPair{}"
time="2019-05-09T19:08:49Z" level=debug msg="scrapeRedisHost() done"
time="2019-05-09T19:12:49Z" level=debug msg="Trying DialURL(): redis://127.0.0.1:6379"
time="2019-05-09T19:12:49Z" level=debug msg="DialURL() failed, err: ERR invalid password"
time="2019-05-09T19:12:49Z" level=debug msg="Trying: Dial(): redis 127.0.0.1:6379"
time="2019-05-09T19:12:49Z" level=debug msg="aborting for addr: redis://127.0.0.1:6379 - redis err: dial redis: unknown network redis"
time="2019-05-09T19:16:49Z" level=debug msg="Trying DialURL(): redis://127.0.0.1:6379"
time="2019-05-09T19:16:49Z" level=debug msg="DialURL() failed, err: ERR invalid password"
time="2019-05-09T19:16:49Z" level=debug msg="Trying: Dial(): redis 127.0.0.1:6379"
time="2019-05-09T19:16:49Z" level=debug msg="aborting for addr: redis://127.0.0.1:6379 - redis err: dial redis: unknown network redis"
time="2019-05-09T19:20:49Z" level=debug msg="Trying DialURL(): redis://127.0.0.1:6379"
time="2019-05-09T19:20:49Z" level=debug msg="DialURL() failed, err: ERR invalid password"
time="2019-05-09T19:20:49Z" level=debug msg="Trying: Dial(): redis 127.0.0.1:6379"

@Alexvianet
Copy link
Author

Alexvianet commented May 10, 2019

I have found, that after each redis restarts, requirepass in /var/vcap/store/redis/redis.conf is chenging each time

@oliver006
Copy link
Owner

Ok, weird. So we can close this?

@Alexvianet
Copy link
Author

can redis_exporter reread redis.file if it changed?
We use pivotal-cf/cf-redis-release https://github.com/pivotal-cf/cf-redis-release
with properties:
redis.security.require_pass:
description: Require clients to issue AUTH PASSWORD before processing any other commands. This will be overwritten everytime the broker allocates it as in use.

Maybe we can propose some reload config functional for this ?

@oliver006
Copy link
Owner

I see what the problem is.
How does your application deal with the issue of changing passwords? Constantly refresh the password from the file?

I think password file refresh would be a possibility but with the advent of PR #256 this logic will all be removed from the exporter and should be pushed up into your SD / Prometheus server.

@Alexvianet
Copy link
Author

yea it's refresh the password from the file

@Alexvianet
Copy link
Author

Alexvianet commented May 22, 2019

@oliver006 so in v1.0.0 it's possible to refresh password file ?

@oliver006
Copy link
Owner

Unfortunately, no. You'll have to move that reload functionality into your orchestration system and pick up the new password there and propagate it to the exporter.

@oliver006
Copy link
Owner

Not an exporter issue, closing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants