7 tips for Linux cluster admins to help keep auditors happy
Try these strategies to satisfy auditors' requirements without wasting your time or testing your patience.
The beauty of building extra-large Linux clusters is that it's easy. Hadoop, OpenStack, hypervisor, and high-performance computing (HPC) installers let you build on commodity hardware and deal with node failure reasonably simply. Administering Linux on a small scale involves basic day-to-day tasks; however, once you plan and scale production to clusters of several thousand nodes, administration can take over your life, including your weekends and holidays.
Specific requirements for encrypting people-related data in transit and at rest have been heavily discussed elsewhere, so I won't be covering them here. Rather, we'll focus on preparations to keep an audit off the backs of your Linux admin team.
1. Fundamentals: Connecting your cluster to the world
It's tempting to build a cluster on a standalone network with admin access on a second corporate LAN interface. Like Oracle databases in the past, Hadoop and HPC clusters tend to run every task in the cluster under a single user identification (UID) account (e.g., "hadoop").
Auditors need to prove not only how personal data is stored, but also how it is manipulated, aggregated, or anonymized, and that includes who can create, change, or log in to these application-specific accounts. That puts you and your admin team in the spotlight.
2. Don't let software installers create accounts or Linux groups
Use your favorite configuration manager or identity manager to create the needed accounts on each cluster node (or in your directory) first. If the Hadoop account and group already exist, the cluster software installer will use those instead. There are several reasons we want this behavior, outlined in the next three steps.
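To make the idea concrete, here is a minimal Python sketch of the check a configuration manager might perform before the installer runs. It is an illustration only, not the method of any particular tool: the function name `plan_account` and the ID 1500 used in the note below are hypothetical, and the sketch only builds the `groupadd` and `useradd` command lines rather than executing them.

```python
import grp
import pwd


def plan_account(name, uid, gid,
                 lookup_user=pwd.getpwnam, lookup_group=grp.getgrnam):
    """Return the commands needed so that `name` exists with the given IDs.

    Returns an empty list when the account and group already match.
    Raises ValueError when an existing entry has different IDs, because
    silently renumbering an account would hide a change from the audit trail.
    The lookup functions are injectable so the logic can be tested
    without touching the real account database.
    """
    commands = []

    # Group first, so the user can be created with it as primary group.
    try:
        group = lookup_group(name)
    except KeyError:
        commands.append(["groupadd", "--gid", str(gid), name])
    else:
        if group.gr_gid != gid:
            raise ValueError(
                f"group {name!r} exists with GID {group.gr_gid}, expected {gid}")

    try:
        user = lookup_user(name)
    except KeyError:
        commands.append(
            ["useradd", "--uid", str(uid), "--gid", str(gid), name])
    else:
        if user.pw_uid != uid or user.pw_gid != gid:
            raise ValueError(
                f"user {name!r} exists with UID/GID "
                f"{user.pw_uid}/{user.pw_gid}, expected {uid}/{gid}")

    return commands
```

Calling `plan_account("hadoop", 1500, 1500)` on a node returns the list of commands still to be run, or an empty list if the account is already in place. Running the same check on every node with the same numbers keeps the UID and GID identical across the cluster.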
Continue reading the article on Opensource.com