Tuesday 27 August 2019

Traefik + Authelia on Kubernetes

I have recently been really getting into Kubernetes and have found it to be an amazing Container Orchestration product after dabbling with Docker for a while.  For those of you who aren't familiar with it, visit the Kubernetes official page.

Traefik is an edge router for Kubernetes that is fairly easy to set up and use within a Kubernetes cluster.  If you are not familiar with it, visit the Traefik official page.

Authelia is a cloud-ready multi-factor authentication product.  It lets you put authentication in front of things such as Prometheus or Alertmanager and bind access to LDAP groups/users.  Visit the Authelia official page for more information.

I wrote this article because I could not find one on using Traefik + Authelia on Kubernetes.  The examples mainly point to using NGINX as the ingress traffic manager.  The main thing to note is that you need the following annotations on the Ingress config for whatever service you are publishing.  The following example is the Ingress config that I use to publish Prometheus:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus
  namespace: monitoring
  annotations:
    kubernetes.io/ingress.class: traefik
    traefik.ingress.kubernetes.io/redirect-entry-point: ssl
    traefik.ingress.kubernetes.io/redirect-permanent: "true"
    ingress.kubernetes.io/auth-type: forward
    ingress.kubernetes.io/auth-url: http://authelia.test.com/api/verify?rd=https://authelia.test.com/%23/
spec:
  rules:
  - host: prometheus.test.com
    http:
      paths:
      - backend:
          serviceName: prometheus
          servicePort: http

The redirect-entry-point and redirect-permanent annotations mean that standard users cannot hit the Prometheus page over HTTP.  This matters because Authelia fails the authentication for a non-SSL site, saying the site is not secure.  The auth-url setting needs %23 in place of the hash character (#), otherwise it doesn't work.  In your Authelia config, ensure that you have a bypass rule for the Authelia page itself.
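The %23 in the auth-url is just the percent-encoded form of #.  A raw # would start the URL fragment, so everything after it would be dropped from the rd parameter.  The following is a minimal Python sketch of building that value; the function name build_auth_url is my own for illustration, and the hostnames are the example ones used above.

```python
from urllib.parse import quote

# Example values taken from the Ingress annotations above.
AUTHELIA_VERIFY = "http://authelia.test.com/api/verify"
AUTHELIA_PORTAL = "https://authelia.test.com/#/"


def build_auth_url(verify_url, portal_url):
    """Return the auth-url value with the portal URL passed as the rd parameter.

    ':' and '/' are left alone to match the annotation; '#' is not in the
    safe set, so it is percent-encoded as %23.
    """
    encoded_portal = quote(portal_url, safe=":/")
    return verify_url + "?rd=" + encoded_portal


if __name__ == "__main__":
    print(build_auth_url(AUTHELIA_VERIFY, AUTHELIA_PORTAL))
```

The result can be pasted straight into the ingress.kubernetes.io/auth-url annotation.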

Wednesday 10 June 2015

Get-VMStartPolicy causes a 'systemDefault' was not found error

I was recently trying to set start orders for the Trend Deep Security AV appliances on my hosts, as I found some were not configured correctly.  When I ran the Get-VMStartPolicy command across the VMs to check their status, some of them returned the following error:

Get-VMStartPolicy : 4/06/2015 9:53:22 a.m.    Get-VMStartPolicy        Requested value 'systemDefault' was not found.
At line:1 char:40
+ foreach ($vm in $vms){get-vmstartpolicy <<<<  -vm VM}
    + CategoryInfo          : NotSpecified: (:) [Get-VMStartPolicy], VimException
    + FullyQualifiedErrorId : Core_BaseCmdlet_UnknownError,VMware.VimAutomation.ViCore.Cmdlets.Commands.Host.GetVMStartPolicy

After logging this with VMware, it turns out that it is a known bug with the Bug ID 1336800.  It is set to be fixed in the next release of PowerCLI.

Saturday 24 January 2015

vSphere Replication Port Requirements

If you have tried to figure out the port requirements for just vSphere Replication, then you have most likely seen VMware's knowledge base article KB1009562.  I found it a little confusing to follow, specifically because I was only using vSphere Replication to replicate VMs between separate environments, with firewalls between every component.  One thing to note is that the vSphere Replication appliance versions should match at each site to get the best results, as a 5.8 version cannot talk to a 5.5 version.  The following table lists the required ports and the direction of each.  *sorry about no visio, that may come*.  Note that VRS stands for vSphere Replication Server.

Source                        Destination            Ports (TCP)
Site A vCenter                Site A VRS             5480 / 8043
Site A VRS                    Site A vCenter         80 / 443
Site A vCenter                Site B vCenter         80 / 10443
Site B vCenter                Site A vCenter         80 / 10443
Site A VRS                    Site B vCenter         80
Site B VRS                    Site A vCenter         80
Site B VRS                    Site B vCenter         80 / 443
Site B vCenter                Site B VRS             5480 / 8043
ESXi Hosts Site A / Site B    Site A / Site B VRS    31031 / 44046

It also pays to note that the environment I was working in had no DNS resolution between the two sites.  There are two solutions: create a DNS zone in each environment with all the relevant records for both sites, OR edit the hosts files on all the components (which takes a while).  With the above setup you should then be able to replicate a VM from one site to another.
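If you want to confirm the firewall rules before attempting a replication, a quick TCP connect test from each source machine helps.  The following is a rough Python sketch, not a VMware tool: the rule list is transcribed from the table above, and the function names and any addresses you pass in are my own placeholders.

```python
import socket

# (source, destination, TCP ports) transcribed from the port table above.
REQUIRED_PORTS = [
    ("Site A vCenter", "Site A VRS", [5480, 8043]),
    ("Site A VRS", "Site A vCenter", [80, 443]),
    ("Site A vCenter", "Site B vCenter", [80, 10443]),
    ("Site B vCenter", "Site A vCenter", [80, 10443]),
    ("Site A VRS", "Site B vCenter", [80]),
    ("Site B VRS", "Site A vCenter", [80]),
    ("Site B VRS", "Site B vCenter", [80, 443]),
    ("Site B vCenter", "Site B VRS", [5480, 8043]),
    ("ESXi Hosts Site A / Site B", "Site A / Site B VRS", [31031, 44046]),
]


def ports_from(source):
    """Return {destination: ports} for every rule with the given source."""
    return {dst: ports for src, dst, ports in REQUIRED_PORTS if src == source}


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_from(source, addresses, timeout=3.0):
    """Run on the source machine; addresses maps destination name to IP/hostname.

    Returns a list of (destination, port, reachable) tuples.
    """
    results = []
    for dst, ports in ports_from(source).items():
        for port in ports:
            results.append((dst, port, tcp_port_open(addresses[dst], port, timeout)))
    return results
```

For example, on the Site A VRS appliance you could call check_from("Site A VRS", {...}) with a dictionary mapping "Site A vCenter" and "Site B vCenter" to their IP addresses.  Using IP addresses also sidesteps the missing DNS resolution between sites.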

Wednesday 22 October 2014

HP Agentless Management Service (AMS) causes "Can't Fork" and MKS errors

If you suddenly find that you can't open a remote console to ANY VMs on a host and are given a "Cannot connect to MKS" error, and trying to do anything on the ESXi console generates a "Can't Fork" error, then you need to upgrade the HP management agents on your HP ESXi servers.  This error impacts version 5.0 and upwards and is outlined in KB2085618.

I can confirm that upgrading to version 10.0.1 as per the article does remove the error.  Please note this only impacts servers that run the HP customised ISO / HP installed management agents with the impacted versions as listed in the KB.

Add ZFS storage space usage to SSH MOTD in Ubuntu 14.04

If you want to change the SSH Message of the Day so that it shows the size of your ZFS pool instead of the default / drive space, then you have to do the following.  Please note that this was done on Ubuntu 14.04.

First, edit the disk.py file under /usr/lib/python2.7/dist-packages/landscape/lib with your favourite text editor and add ', "zfs"' at the end of the STABLE_FILESYSTEMS entry, as shown in the screenshot below.  Then save and quit the editor.

Next, edit the disk.py file under /usr/lib/python2.7/dist-packages/landscape/sysinfo with your favourite text editor and find the line main_info = get_filesystem_for_path("/", self._mounts_file, and change the "/" path to the ZFS file system you want; in my case it's "/zfs/storage".  Also edit root_main_info = get_filesystem_for_path("/"..... and replace the "/" with the same entry as above.  These changes can be seen in the screenshot below.  Save and then quit the editor.  *Note: a recent update removes the root_main_info line, so that second edit is no longer required*

Done, the next time you ssh in to your server, you'll see your ZFS file system space!!!
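If you want to sanity-check the figure the MOTD reports, you can work out the usage of the same path yourself.  The following is a small standalone Python 3 sketch; it is not the landscape code, the function names are my own, and the output format is only an approximation of the MOTD line.

```python
import shutil


def format_usage(path, used_bytes, total_bytes):
    """Format a usage line such as 'Usage of /zfs/storage: 25.0% of 4.00GB'."""
    percent = 100.0 * used_bytes / total_bytes if total_bytes else 0.0
    total_gb = total_bytes / 1024 ** 3
    return "Usage of %s: %.1f%% of %.2fGB" % (path, percent, total_gb)


def usage_line(path):
    """Read the current usage of the file system mounted at path."""
    usage = shutil.disk_usage(path)
    return format_usage(path, usage.used, usage.total)


if __name__ == "__main__":
    # "/" always exists; swap in your ZFS mount point, e.g. "/zfs/storage".
    print(usage_line("/"))
```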

Monday 29 September 2014

HP NMI Error with ESXi - Gen8

In our environment, many of our ESXi hosts (HP DL380 Gen8s) have generated PSODs with NMI events.  If you are experiencing this issue, please UPGRADE your iLO 4 firmware to version 1.51.  The issue is described in HP article c04332584.  You can download the latest firmware from HP.

Tuesday 2 September 2014

Deploying OVA fails with "Failed to Deploy OVF/OVA package: The operation is not supported on the object"

We recently had an issue deploying an OVA/OVF file that was created from one of the virtual machines in our environment.  Deploying this OVA failed instantly (after entering the cluster location / datastore) with the following error: "The Operation is not supported on the object".  After much investigation and examining the /var/log/hostd.log file on the ESXi host we were trying to deploy to, we found the line:  "Video Ram size edit is not supported when auto-detect is True."  We edited the Virtual Machine hardware settings and changed the Video Card setting from Auto-detect to Specify custom settings (this can be anything), then exported the virtual machine as an OVA again, and the deployment worked!  Very interesting that this setting caused a problem!  Previously we have also encountered this error when the Virtual Machine had an ISO mounted when converting to OVA.