Zabbix monitoring


ehorn

Active Member
Jun 21, 2012
Thanks for mentioning this Patrick...

While these types of tools are not in my professional domain, I definitely want to play with this in the lab...

Bumped for interest...
 

Deslok

Well-Known Member
Jul 15, 2015
deslok.dyndns.org
ehorn said:
    Thanks for mentioning this Patrick...
    While these types of tools are not in my professional domain, I definitely want to play with this in the lab...
    Bumped for interest...
Try the prebuilt appliance for lab use; it's super quick to deploy and test that way.
 

briandm81

Active Member
Aug 31, 2014
So how has Zabbix 3.0 been for people? I had a VM in my lab run out of space, so I'm looking at deploying the appliance and trying it out. Thoughts? Guidance?
 

wildchild

Active Member
Feb 4, 2014
Well, we use it as our main monitoring system.
If I were you, I would use auto-discovery as much as possible, both for network discovery and for templates.
Make sure your system has enough storage and that you use Varnish or another cache as a front end.
Make sure you spawn enough pingers, discoverers, etc.
Try to separate MariaDB/MySQL from the frontend and the Zabbix backend.
MySQL/MariaDB will grow fast, so set up your housekeeping with enough thought.
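Not a recipe, just a rough sketch of the zabbix_server.conf knobs that advice points at (values are illustrative only, size them to your own host count):

    # zabbix_server.conf -- illustrative values, tune for your environment
    StartPollers=25             # general pollers for agent/SNMP items
    StartPingers=5              # ICMP ping workers
    StartDiscoverers=5          # network discovery workers
    CacheSize=256M              # configuration cache
    HousekeepingFrequency=1     # run the housekeeper every hour
    MaxHousekeeperDelete=5000   # cap on rows deleted per task in one housekeeping cycle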
 
  • Like
Reactions: Patrick

dba

Moderator
Feb 20, 2012
San Francisco Bay Area, California, USA
Patrick said:
    Just wondering if we have anyone here who is a Zabbix guru or knows someone who is. I want to see if I can hire someone to set up Zabbix in the lab and do a few tutorials. I have the pre-configured Zabbix VM up and running, but there's a bit of a learning curve.
Zabbix graphing is really primitive. Check out Grafana when the built-in stuff gets too annoying for you. They have a pretty nice Zabbix plug-in.
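If the built-in graphs do get too limiting, the Grafana route is quick to try; as far as I recall the Zabbix data source ships as the alexanderzobnin-zabbix-app plugin:

    grafana-cli plugins install alexanderzobnin-zabbix-app   # then enable the app under Plugins in the Grafana UI and add a Zabbix data source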
 
  • Like
Reactions: Patrick

Jb boin

Member
May 26, 2016
Grenoble, France
www.phpnet.org
For a much better use of disk space and memory, I use the TokuDB storage engine (available on MariaDB and Percona MySQL), which compresses the data on disk and in memory, and its use of fractal indexes instead of classical B-tree ones gives very good results on huge tables (with millions of rows).

I use it for the "history", "history_uint", "trends" and "trends_uint" tables, which in my case have more than 316 million, 430 million, 14 million and 51 million rows respectively on my main Zabbix server.

I don't remember the exact figures, but using TokuDB's default zlib compression made these tables between 3 and 10 times smaller than InnoDB, and at least 1.5 times smaller than InnoDB with "row_format=compressed" (which was also much slower).

Other available compression types: see the TokuDB System Variables documentation.


Please note that the "tokudb_cache_size" variable, which is the equivalent of "innodb_buffer_pool_size" for InnoDB, has a default value of 50% of system memory, which is rarely optimal.


I also use these specific settings:
  • tokudb_directio=1 # bypasses the kernel page cache, equivalent to "innodb_flush_method=O_DIRECT" for InnoDB

  • tokudb_commit_sync=0 # instead of syncing each transaction to disk at commit, does it asynchronously; limits the I/O but adds the risk of losing, in case of a crash, the transaction(s) committed since the last flush, equivalent to "innodb_flush_log_at_trx_commit=2" for InnoDB

  • tokudb_fsync_log_period=1000 # when "tokudb_commit_sync" is set to 0, forces fsync() to be called at least every X milliseconds, equivalent to "innodb_flush_log_at_timeout=1" for InnoDB
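Pulled together, a my.cnf fragment along those lines might look like the sketch below (the cache size is a placeholder, not a recommendation, and the history/trends tables still need to be ALTERed to ENGINE=TokuDB to move over):

    [mysqld]
    tokudb_cache_size       = 12G    # set explicitly rather than relying on the 50%-of-RAM default
    tokudb_directio         = 1      # bypass the kernel page cache (like innodb_flush_method=O_DIRECT)
    tokudb_commit_sync      = 0      # asynchronous commit flushing (like innodb_flush_log_at_trx_commit=2)
    tokudb_fsync_log_period = 1000   # with commit_sync=0, fsync the log at least every 1000 ms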


--
As for the database that keeps growing, don't forget to:
  • Check the "Housekeeping" parameters; if you don't manually clean the database (using table partitions and/or scripts/events), the internal housekeeping should be active

  • Most templates check values too often; for example, most users don't need to check the bandwidth or the load of a server every 30 seconds, once per minute (or even less often) can suffice. There is also no need to retrieve the total memory capacity every minute or the disk capacity of a volume every hour, as these are unlikely to change in that window (for such values, prefer a short history length)

  • Don't collect values that are not useful or that are redundant; for example, there is no need to retrieve all 3 load-average values, only retrieve the 1-minute load average and, if needed, put a trigger on the last X minutes of that value; don't retrieve the number of processes on the system if it's of no use; and getting a volume's free space both in % and in bytes might be redundant

  • Don't set the "history" setting too high on items, as every value retrieved will be kept for that duration (use trends if you don't need the precision after a certain time). Keeping one month of history on an item checked every minute is about 45,000 values; if you check that value on 25 systems, you end up with ~1.1 million values stored in the database after that month. If, for example, you monitor bandwidth usage, the usual way is to collect incoming and outgoing traffic separately, thus doubling the number of values per interface, and there can be more than one network interface per system (and don't forget to disable monitoring of the loopback interface (lo), as it usually doesn't give any interesting information)
 
Last edited:
  • Like
Reactions: dba