For a much better use of disk space and memory, I use the TokuDB storage engine (available in MariaDB and Percona Server for MySQL), which compresses the data on disk and in memory. Its use of fractal tree indexes instead of classical B-tree ones gives very good results on huge tables (with millions of rows).
I use it for the "history", "history_uint", "trends" and "trends_uint" tables, which on my main Zabbix server have more than 316 million, 430 million, 14 million and 51 million rows respectively.
I don't remember the exact figures, but using TokuDB's default zlib compression made these tables between 3 and 10 times smaller than InnoDB and at least 1.5 times smaller than InnoDB with "row_format=compressed" (which was also way slower).
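As a sketch of the conversion, assuming a standard Zabbix schema named "zabbix" (run it during a maintenance window, the table copy can take hours on tables this size):

    ALTER TABLE history ENGINE=TokuDB ROW_FORMAT=TOKUDB_ZLIB;

    -- check the resulting on-disk sizes from information_schema
    -- ('zabbix' is the assumed schema name, adapt it to yours)
    SELECT table_name, table_rows,
           ROUND((data_length + index_length) / 1024 / 1024) AS size_mb
    FROM information_schema.tables
    WHERE table_schema = 'zabbix'
      AND table_name IN ('history', 'history_uint', 'trends', 'trends_uint');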
The other available compression types and the full list of TokuDB system variables are described in the TokuDB documentation.
Please note that the "tokudb_cache_size" variable, the equivalent of "innodb_buffer_pool_size" for InnoDB, has a default value of 50% of the system memory, which is rarely optimal.
I also use these specific settings (put together in the my.cnf sketch after this list):
- tokudb_directio=1 # bypasses the kernel file cache, equivalent to "innodb_flush_method=O_DIRECT" for InnoDB
- tokudb_commit_sync=0 # commits transactions to disk asynchronously instead of after each transaction; limits the IO but adds the risk of losing, in case of a crash, the transaction(s) committed since the last flush, equivalent to "innodb_flush_log_at_trx_commit=2" for InnoDB
- tokudb_fsync_log_period=1000 # when "tokudb_commit_sync" is set to 0, forces fsync() to be called at least every X milliseconds, equivalent to "innodb_flush_log_at_timeout=1" for InnoDB
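Put together, the relevant part of a my.cnf could look like this (a sketch: the cache size is a placeholder to adapt to your own RAM, the rest matches the settings above):

    [mysqld]
    # TokuDB tuning for the Zabbix history/trends tables
    tokudb_cache_size       = 8G     # placeholder value; defaults to 50% of system memory
    tokudb_directio         = 1      # bypass the kernel file cache (like innodb_flush_method=O_DIRECT)
    tokudb_commit_sync      = 0      # asynchronous commits: fewer IOs, may lose the last transactions on a crash
    tokudb_fsync_log_period = 1000   # with tokudb_commit_sync=0, fsync() the log at least every 1000 ms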
--
As for the database that keeps growing, don't forget to:
- Check the "Housekeeping" parameters, if you dont manually clean the database (by using table partitions and/or scripts/events), the internal housekeeping should be active
- Most of the templates check values too often; for example, most users don't need to check the bandwidth or the load of a server every 30 seconds, once per minute (or even less often) could suffice. There is also no need to retrieve the total memory capacity every minute or the disk capacity of a volume every hour, as these are unlikely to change in that window (for these values, prefer to use a short history length)
- Don't collect values that are not useful or that are redundant; for example, there is no need to retrieve all 3 load average values: only retrieve the 1 minute load average and, if needed, put a trigger on the last X minutes of this value. Likewise, don't retrieve the number of processes on the system if it's of no use, and getting the free space of a volume both in % and in bytes might be redundant
- Don't put the "history" setting too high on items as every values retrieved will be kept for that duration (use trends if you don't need the precision after a certain time) : Keeping an history for 1 month on an item checked every minutes is about 45000 values, if you check this value on 25 systems, you end up with ~1,1 million values stored on the database after that month ; if for example you check the bandwidth usage, the usual way is to get separately the incoming and outgoing traffic, thus doubling the number of values to retrieve the bandwidth usage for an interface on a system and there could be more than a network interface (and don't forget to disable the loopback interface (lo) monitoring as its usually not giving any interesting informations)
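For the server-side housekeeper, the relevant zabbix_server.conf parameters look like this (a sketch showing the default values; the retention periods themselves are set on the items or in the frontend housekeeping settings):

    # How often the housekeeper runs to delete outdated history/trends values, in hours
    HousekeepingFrequency=1
    # Maximum number of rows deleted per table in one housekeeper cycle (0 = no limit)
    MaxHousekeeperDelete=5000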