Author: GS

Technology Blog

Deploying and Configuring Kubernetes (K8S) on CentOS 8

This document explains how to set up Kubernetes (K8S) on CentOS 8 with the following components providing network capabilities: Tigera Calico for the network stack, MetalLB for the load balancer, and the Nginx Ingress Controller for inbound traffic (HTTP, HTTPS). Note that these nodes need at least 2 CPUs and at least 4GB of…

Another SSSD Gotcha! ldap_group_nesting_level!

I ran into another SSSD gotcha, specifically with nested groups in Active Directory LDAP. The issue manifested as my user ID, along with others, being listed as a member of a group that we should not have been members of. So how can this be corrected with SSSD? SSSD has a parameter called:…

RHEL 7.x and SSSD and /etc/resolv.conf

I ran into an interesting situation with SSSD when /etc/resolv.conf has “options rotate timeout:1” set, involving DNS lookups, nameservers not being up, and SSSD marking an entire domain down. In this specific situation, the last server in /etc/resolv.conf had been left down by accident following a reconfiguration of VMware. When the servers were…

RHEL 7 and NFSv4 with Kerberos

Over the past year I have been tasked with building out a large secure NFSv4 environment using DRBD, Corosync and Pacemaker, and ran into a plethora of issues. These ranged from gotchas with NFSv4 server and client security settings related to gssproxy/rpc-gssd, to enforcing quotas remotely with rpc-rquotad, to setting up idmapd or…

Removing DRBD Devices and Volumes from Highly Available NFS

While attempting to add a new volume to a highly available NFS setup without causing an outage, I had to come up with a methodology for removing the DRBD device so that we could effectively rinse and repeat until we arrived at the right steps. The steps below outline how to remove a DRBD block device/volume.…

Adding a block device to DRBD with Corosync and Pacemaker for use with Highly Available NFS

Over the past few weeks I have been working with DRBD, Corosync and Pacemaker, adding additional block devices to DRBD to create new volumes. I noticed that there was no solid methodology or set of steps for this task, but after some trial and error I came up with steps to avoid taking an outage…

Iterating a JSON using Jackson-Databind Library like JDOM for XML

I recently came across a situation that required me to iterate over a JSON message payload, much as JDOM allows for XML and similar to what I do within my StAX XML MapReduce InputFormat. So basically in this case you need to treat JSONArrays similar to XML…

Xml Processing with MapReduce/Spark using an Xml StaX Parser

XmlStaxInputFormat / XmlStaxFileRecordReader GitHub project: https://github.com/gss2002/xml-stax-mr After some time, it became clear that a gap existed in Hadoop MapReduce and Spark: the existing XmlInputFormat classes from Mahout were using fseek and searching for strings as the file is read in from HDFS. The ability to break up a large XML file becomes extremely important…

Building an RPM for Spark 2.x for Vendor Hadoop Distributions

It may be necessary to produce an alternately packaged version of Spark for use in a vendor-provided Hadoop distribution. This became apparent to me many times when loading Hortonworks HDP into an enterprise environment where update/upgrade cycles do not allow for upgrade of HDFS…

How to use the Native IBM MQ Client Receiver with Spark Streaming

After using Apache NiFi and IBM MQ, I noticed that NiFi could not easily guarantee the order of incoming messages, as failover can occur at any time. This becomes a problem specifically with database and table replication when the replicating software puts messages to a…