DevOps Engineer
- Responsible for helping developers on an application team build their source code, test it, and deploy it to production servers (or any other environment) managed by the Operations team - performing releases.
- The process of building, testing, deploying, operating, and monitoring is called the CI/CD pipeline, which DevOps can automate.
- They need to know how to integrate components such as SQL databases, libraries, APIs, and SDKs on the production OS (ie: through Docker containers and Kubernetes)
CI/CD Pipeline
o New feature or bug fix committed to Git repo by developers (version/Git Tags: env.major.minor.patch => ie: P.3.2.1)
=> Triggers the CI Jenkins server, which performs a build* = combining the source code and its dependencies into a runnable instance of our product (a Dockerfile can be used to build a Docker image either before or after testing - *usually testing is done after the Docker image is built and a container instance of it is running)
* Programs written in languages like C++, Java, C, or Go must be compiled, while JavaScript, Python, and Ruby programs can run without a build stage.
=> Run automated tests to check the correctness of the code changes (ie: smoke, integration, regression tests) - if they fail, they give quick feedback to the developers (contributing towards continuous integration/continuous delivery)
-- The test code can live in a separate container which is later linked to the app container to run the tests, OR testing can be done in a separate testing environment where the test code lives.
=> If all tests pass, Jenkins pushes the image to a Docker repo (ie: a private repo on DockerHub) or an AWS registry (ie: ECR)
=> Jenkins updates the image tag/version in the YAML file (K8s manifest files)
=> Jenkins uses the kubectl tool to apply it to the Kubernetes cluster (in either the staging or prod env) - see the sketch after this list
-- OR the image gets deployed similarly on AWS EKS
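A rough sketch of those last build-push-deploy steps, assuming a DockerHub repo named myorg/myapp and a manifest at k8s/deployment.yaml (all placeholder names, not from a real project):
docker build -t myorg/myapp:1.2.3 .                      # build the image from the Dockerfile
docker run --rm myorg/myapp:1.2.3 npm test               # placeholder for the project's real test command
docker push myorg/myapp:1.2.3                            # push the image to the Docker repo
sed -i 's|image: myorg/myapp:.*|image: myorg/myapp:1.2.3|' k8s/deployment.yaml   # update the tag in the K8s manifest
kubectl apply -f k8s/deployment.yaml                     # apply to the cluster (staging or prod)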
('Deploy') GitOps relies on Git as the single source of truth, meaning that only changes in git can trigger builds and deployments of infrastructure and applications. GitOps' ArgoCD works well with Kubernetes. Installing ArgoCD provides a complete GitOps solution (others ex: FluxCD, JenkinsX)
- How does it work?
- So the overall CI/CD pipeline now works like this . . .
- Benefits
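For reference, a minimal sketch of an Argo CD Application manifest (the repo URL, path, name, and namespace are placeholder values) - Argo CD watches the Git repo declared here and syncs the manifests under that path into the cluster:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp                     # placeholder app name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/myapp-config.git   # placeholder config repo = single source of truth
    targetRevision: HEAD
    path: k8s                     # folder holding the K8s manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp
  syncPolicy:
    automated: {}                 # keep cluster state in sync with Git automatically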
('build', 'test', 'deploy') GitHub Actions is a platform that allows you to automate workflows (ie: CI/CD, greet new contributors) like build, test, and deployment. You can create workflows that build and test every pull request to your repository, or deploy merged pull requests to production.
Terms:
- issue = bug in the code noted in GitHub | workflow = chain of actions |
- GitHub event = (ie: PullRequest created, Issue created, Contributor added, PullRequest merged, OR scheduled cron job - run every 30 min) |
- GitHub Marketplace = starter workflows (template) preconfigured for specific languages and frameworks
Some tasks in the workflow may include: prepare release notes, update the version/tag#
A typical workflow has the following sections (a minimal example follows this list)
- (on: ) Listen to an event
- => (jobs: ) Trigger jobs
- => (runs-on: [env]) State the environment ; (steps: ) Execute action sets
- => (- uses: [pre-coded template]) ; (- name: [name_ur_action_set])
- => (run: [commands]) Execute cmds on the Runner server
- (ie: if an issue is created, sort it based on priority, label it, and assign it to a developer)
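A minimal sketch of such a workflow file (ie: .github/workflows/ci.yml - the Node.js commands are placeholder build/test steps):
name: CI
on: [push, pull_request]              # listen to events
jobs:
  build:                              # trigger a job
    runs-on: ubuntu-latest            # state the environment (GitHub-hosted runner)
    steps:
      - uses: actions/checkout@v4     # pre-coded action from the Marketplace
      - name: Install and test        # name your action set
        # commands below are executed on the runner server
        run: |
          npm ci
          npm test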
Benefits: Setting up a CI/CD pipeline w/ GitHub Actions is easy since it is already in GitHub. It works well with any environment. We can also set up required reviewers so that a workflow with a deployment job will only execute once the required reviewers have approved.
Where is the code executed?
It is executed on GitHub Actions Runner servers managed by GitHub, but we can also host it on our own (Runner) server. Each job in a workflow runs in a fresh virtual environment. By default, jobs run in parallel, but this can be changed so they run one after the other.
For Self-Hosted Runners: 1) Download and Extract the Scripts 2) Configure and authenticate the runner with tokens 3) Start listening for jobs
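A rough sketch of those three steps (the release URL/version and the token are placeholders - the exact commands are shown on the repo's Settings > Actions > Runners page):
# 1) download and extract the runner scripts (version/URL are placeholders)
curl -o actions-runner.tar.gz -L https://github.com/actions/runner/releases/download/v2.316.0/actions-runner-linux-x64-2.316.0.tar.gz
tar xzf actions-runner.tar.gz
# 2) configure and authenticate the runner against a repo with a registration token
./config.sh --url https://github.com/myorg/myrepo --token <REGISTRATION_TOKEN>
# 3) start listening for jobs
./run.sh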
Runners can also be grouped to listen for jobs from a specific GitHub repo/org => multiple runner pools for specific tools (ie: Java, Node.js, etc)
When you want the workflow to deploy the code on a very specific server - Environments are created in the settings of the repo
Secrets
- Create: Repo-Settings > Secrets and Variables > Actions > New Repository Secret
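A small sketch of referencing such a secret inside a workflow step (MY_API_KEY and deploy.sh are placeholder names):
steps:
  - name: Deploy
    run: ./deploy.sh                        # placeholder deploy script
    env:
      API_KEY: ${{ secrets.MY_API_KEY }}    # injected from the repo's Actions secrets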
('build') Gradle is a build automation tool known for its flexibility to build software.
Terms:
- Project - A Gradle project requires a set of tasks to execute (ie: deploying app to staging environments)
- Task - A piece of work performed by the build (ie: compiling classes, creating JAR files)
- Build Scripts - aka. build.gradle file, is located in the root directory of the project. Every Gradle build comprises one or more projects
Pros: Gradle is faster than Maven, and it can be used for ANT build projects and for Maven repos.
How to build a Gradle project
1) git clone the project to your local laptop: https://github.com/marcobehlerjetbrains/gradle-tutorial
2) cd into the directory and type:$ gradlew build
- gradlew stands for gradle wrapper and it is an embedded version of gradle so we don't need to install gradle separately on our machine
- gradlew.bat is used to execute on Windows, while ./gradlew is used on Linux and macOS
3) The location of the executable
- The artifact/*.jar file lives in front-end/build/libs
4) Commands
- :$ gradlew clean deletes build folders
- :$ gradlew test goes through test files, execute them against the src code
- If you didn't make any changes to your code, gradle will have tests results/reports cached and won't run the tests again against the code
5) What is in the build.gradle file? (a small example build.gradle follows after this numbered list)
- build.gradle is written in Groovy, or it can be written in Kotlin => build.gradle.kts
- Every language used in the build needs a corresponding entry in the plugins section.
- It contains a dependencies section which can import 3rd-party software. The dependencies are downloaded from the repo listed in the repositories section.
- Under the test section, we can tell Gradle to use JUnit to run the tests
- group, version, sourceCompatibility
6) Creating your own Gradle project
- If you want to build a gradle project, do :$ brew install gradle and go to ur newly created empty prj repo, :$ gradle init
7) How to write a Gradle task? => In build.gradle
tasks.register("hello"){
// You can add some groovy code in here
dependsOn <another_taskname> // Call another task to get executed here before continuing
println "Hello KanKan"
}
and from terminal, we can now execute:$ gradle hello
> Hello KanKan
. . .
BUILD SUCCESSFUL
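To tie section 5) together, a small sketch of what such a build.gradle might look like (the group/version values and the dependencies are placeholders, not from the tutorial project):
plugins {
    id 'java'                                                    // language plugin
}
group = 'com.example'                                            // placeholder
version = '1.0.0'
sourceCompatibility = '17'
repositories {
    mavenCentral()                                               // where dependencies get downloaded from
}
dependencies {
    implementation 'com.google.guava:guava:32.1.2-jre'          // example 3rd-party library
    testImplementation 'org.junit.jupiter:junit-jupiter:5.10.0' // test framework
}
test {
    useJUnitPlatform()                                           // tell Gradle to run tests with JUnit 5
}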
('build') Maven is used for Java-based projects, managing their builds, dependencies, and versions (ie: libraries, JAR files). A project is a Maven prj if it has a pom.xml file.
Installing Maven:$ brew install maven
Unless the project already comes with maven wrapper (mvnw interchangeable w/ mvn)
- :$ mvnw validate // validates that the project is correct and all necessary information is available
- :$ mvn clean // deletes the build output (the target folder with .class files)
- :$ mvn compile // compiles the source code
- :$ mvn test // convention is to have main and test folders in src of the prj; this compiles and runs the test files
- :$ mvn package // compiles code, tests code, and puts it in a jar file
Contents of pom.xml (minimal example below)
project (root tag)
modelVersion, groupId, artifactId, version, packaging (ie: jar, pom), name, description, parent, properties, dependencies, build (contains plugins)
=> The final artifact will be named: artifactId-version.packaging (unless packaging = pom)
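A minimal pom.xml sketch (the groupId/artifactId/version values and the dependency are placeholders):
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>                    <!-- placeholder -->
  <artifactId>my-app</artifactId>                   <!-- placeholder -->
  <version>1.0.0</version>
  <packaging>jar</packaging>                        <!-- final artifact: my-app-1.0.0.jar -->
  <name>my-app</name>
  <properties>
    <maven.compiler.source>17</maven.compiler.source>
    <maven.compiler.target>17</maven.compiler.target>
  </properties>
  <dependencies>
    <dependency>                                    <!-- example test dependency -->
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.13.2</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>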
('build', 'test', 'deploy', 'release') Jenkins helps trigger and automate the CI/CD pipeline or any other task.
Jenkins Infrastructure
- - Master server: controls pipelines and schedules builds
- - Agents/Minions (aka Jenkins slave): Performs the builds
Example workflow
- A developer commits code to the git repo; the Jenkins master notices this, triggers the appropriate pipeline, and distributes the build to one of the agents. Agents are selected based on configured labels. The agent runs the build, which is usually just a series of Linux commands to build, test, and distribute the code.
2 main types of jenkins agents
- - permanent node agents => dedicated servers for running jobs
- - cloud agents => dynamic agents spun up on demand (ie: docker, k8s, aws)
2 main build types
- - freestyle build: simplest method to create a build which feels like shell scripting
- - pipelines: use jenkins file with groovy syntax to specify what happens during the build. Pipelines are broken down to stages for each component of the build (Common stages include: Clone, Build, Test, Package, Deploy)
Blue Ocean is an add-on for Jenkins. It makes the CI/CD pipelines look nicer and easier to manage and troubleshoot.
Getting Started
- Build the Jenkins-BlueOcean image
- docker build -t myjenkins-blueocean:2.332.3-1 .
- Create the 'jenkins' network
- docker network create jenkins
- Run the Jenkins-BlueOcean container
- docker run --name jenkins-blueocean --restart=on-failure --detach \
--network jenkins --env DOCKER_HOST=tcp://docker:2376 \
--env DOCKER_CERT_PATH=/certs/client --env DOCKER_TLS_VERIFY=1 \
--publish 8080:8080 --publish 50000:50000 \
--volume jenkins-data:/var/jenkins_home \
--volume jenkins-docker-certs:/certs/client:ro \
myjenkins-blueocean:2.332.3-1
- Get the Jenkins initial admin password
- docker exec jenkins-blueocean cat /var/jenkins_home/secrets/initialAdminPassword
- goto: http://localhost:8080/ and type your usrnm & pswd on the jenkins web UI
Some Jenkins Info
- Go to Manage Jenkins on the left (following are some important info to know but no actions required)
- Under System Configuration
- - Configure System > Gives info about the Jenkins server and set global parameters
- - Manage Plugins
- - Manage Nodes and Clouds > this is where we'll setup agents and clouds
- Under Security
- - Manage credentials > this is where ssh keys are stored, API tokens
- Under Tools & Action
- - Prepare for shutdown > this is where you want to take jenkins server offline to perform some maintenance
First Build Job (freestyle build)
- On Dashboard, click on New item to create either of the 2 types of build
- - Type name and select 'Free Style' or 'pipeline' (for this example, we'll select freestyle)
- - Next you can configure details about your build (ie: execute some shell commands under Build > Execute Shell > Textbox = echo "Hello Kannika") > Save
- - Click 'Build Now' on the left side of the home menu page and it would run it
- - Once done, click on the build # under build History and click on console output
- - If you go back to the project and Configure to edit the build so it creates a file via shell commands (echo "Kankan" > testkk.txt), then build it => we can view the newly created file under Workspace from Dashboard > my_job_name
- - And if we build again, the file will still exist, which means the workspace is not cleared out between builds - we can see where it lives by:
Getting inside the Jenkins server
- Go inside the Docker container that is running jenkins from our local laptop's terminal / cmd line and execute:
- :$ docker exec -it jenkins-blueocean bash
- Go inside the volume jenkins-data from the docker run command at the start
- :$ cd /var/jenkins_home
- :$ cd workspace/my_job_name
- And we can see the test file that we just created
Adding Docker to Jenkins
- > Dashboard > Manage Jenkins > Manage Nodes and Clouds > Configure Clouds > Manage Plugin > Checkbox Docker and Download now and install after restart
- And now if you go to Configure Clouds, we can add Docker as our node agent
Create and run a new container from your local laptop's terminal which we will use as our node-agent/jenkins-slave
- :$ docker run -d --restart=always -p 127.0.0.1:2376:2375 --network jenkins -v /var/run/docker.sock:/var/run/docker.sock alpine/socat tcp-listen:2375,fork,reuseaddr unix-connect:/var/run/docker.sock
- Execute below command and get the name of the container that we just created
- :$ docker ps
- :$ docker inspect <name_of_container>
- and get the <ip_address_of_container> from the output
Now back to Jenkins UI
- Configure Clouds > Add a new cloud > Docker > Docker Cloud Details > Docker Host URI = tcp://<ip_address_of_container> (Also try clicking on Test connection) > Save
- Add Docker Agent templates - Give it a label and a name, check enabled
- Docker image will be our base image for our node agent (ie: devopsjourney1/myjenkinsagents:python)
- Instance Capacity is how many agents you can spawn (ie: 2)
- Remote File System conventional default value: /home/jenkins
- And now if we go to our my_job_name and configure => checkbox Restrict where this project can be run ; textbox = label_of_node_agent > save (this will now run the build in our node agent)
Automatic Trigger from GitHub Repo Change
- Also in Configure, under Build Triggers > checkbox Poll SCM > Textbox = */5 * * * * (poll the repo and perform the build every 5 min, like a cron job)
Build Job (pipeline)
- Dashboard > New Item > select pipeline > scroll down to Pipeline section
- Unlike freestyle build, pipeline requires a pipeline script - 2 options:
- 1) Code directly under jenkins UI (Pipeline Definition = Pipeline script and write your script in the textbox)
- 2) Point it to a Jenkinsfile in src repo which will have the steps to perform the build (Pipeline Definition = Pipeline script from SCM ; SCM = Git > Point it to Git repo url ; Script Path = path/file name = Jenkinsfile)
- Once done, click save and Build Now
Jenkinsfile
- stages > stage > steps
- examples of stages are: build, test, deploy
- steps would be the respective terminal commands (a minimal Jenkinsfile sketch follows this list)
- When looking at console output, click on Open Blue Ocean, which shows it in a nicer interface
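A minimal declarative Jenkinsfile sketch (the shell commands are placeholders for the project's real build/test/deploy steps):
pipeline {
    agent any                          // run on any available agent
    stages {
        stage('Build') {
            steps {
                sh 'make build'        // placeholder build command
            }
        }
        stage('Test') {
            steps {
                sh 'make test'         // placeholder test command
            }
        }
        stage('Deploy') {
            steps {
                sh './deploy.sh'       // placeholder deploy command
            }
        }
    }
}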
*Note: The docker containers that are running: 1 for jenkins (master server), and 1 for node-agent (agents/minions aka jenkins slave)
('operate') Ansible is a tool that automates IT tasks. It can take over repetitive system admin tasks that are otherwise performed manually (ie: deploying an updated application, backups, system reboots). Done manually, this would look like: ssh into each server and execute Linux commands. It also supports cloud provisioning.
Benefits
- Allows you to execute tasks from your own machine instead of ssh'ing into remote servers
- Configuration/installation/deployment steps can be set up in a single YAML file instead of shell scripts and terminal commands
- Re-use the same files for different environments
- More reliable and less prone to human error
Ansible is agentless
- You just need ssh access to all the remote servers and have Ansible installed in your local machine, which will act as the control machine.
Modules = small programs that do the actual work
- They get sent (pushed) from the control machine to the target servers
- When they've completed their job, they get removed
- Module is a small specific task (ie: start nginx server, install nginx server, copy a file)
- List of modules in Ansible Official Documentation
- Since modules are very granular and very specific, handling a complex application requires multiple modules arranged in a certain sequence and grouped together to build the complete configuration = Ansible Playbooks
Ansible Playbooks
- Ansible uses YAML
- Sequential modules are grouped in tasks => 1 configuration
- Each module has a description of the task, name of the module, and arguments
- Before the tasks section, we must indicate the host where the tasks will be executed and the remote username that will be used to login
- Describing which tasks should be executed on which host with which user = Play
- Playbook = 1 or more Plays (multiple Plays in a single YAML file)
- Sample Playbook looks like:
- name_of_Play, host, remote_user, any variable declaration (optional), tasks, modules (name, base_module, arguments)
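A minimal playbook sketch along those lines (the host group, user, variable, and package names are placeholders):
- name: Configure webservers            # name of the Play
  hosts: webservers                     # host group from the inventory
  remote_user: root                     # remote user used to log in
  vars:
    http_port: 8080                     # optional variable declaration
  tasks:
    - name: Install nginx               # description of the task
      apt:                              # module name
        name: nginx                     # module arguments
        state: present
    - name: Start nginx
      service:
        name: nginx
        state: started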
Ansible Inventory List
- When we reference a host group like this in our Ansible Playbook (hosts: databases), our hosts/inventory file gives it a value like this:
- . . .
- [databases]
- 10.24.0.7
- 10.24.0.8
- [webservers]
- 10.24.0.1
- web2.myserver.com
- . . .
- It lists and groups all the machines' ip_address or their hostname that are involved in the task executions
Ansible for Docker
- With a Dockerfile, we can use it to prepare an application environment and container
- With Ansible Playbook, we can create an alternate to Dockerfile - It can be used to create a Docker container, Vagrant container, cloud instance, and bare metal server (personally owned server) - It allows the app to be deployed across many env
- Ansible Playbook can also manage the host and not just the docker container
Ansible Tower
- A UI dashboard from RedHat
- A place to centrally store automation tasks across teams
- Configure permissions and manage inventory
Comparing w/ Puppet and Chef
- They use Ruby instead of YAML
- They must be installed on the target servers and follow a master-agent/slave model, whereas Ansible is agentless
Chef
- Automation tool to manage servers
- Functions very similar to Ansible and uses cookbooks instead of playbooks. It is implemented using the Ruby language.
- Since Chef is an older product, there may be more documentation and a more well-established set of resources
- Treats infrastructure as code (IaC)
Puppet
- ('monitor') A configuration management tool ensuring that all systems (servers) are configured to a desired and predictable state.
- A simple script in Ruby can be written and deployed onto the managed servers, and it can easily be used to bring crashed servers back to the desired running state.
- It can be used as: a Configuration Management Tool, a Deployment Tool, and to implement IaC
- Manifests are written by SysAdmin => Manifests are compiled into Catalogs => Catalogs are deployed onto the clients => Catalogs are executed on the clients by the agent => Clients are configured to the desired state
- Conclusion: learn Ruby and understand how Ansible works => you'll be able to pick up Puppet quickly
CircleCI works pretty much just like Jenkins. It helps with building, running security scans, testing, and deploying.
- Example Walkthrough:
- Jenkins vs CircleCI
some basic info
('operate') Terraform is an IaC tool, mostly used for managing public cloud infrastructure such as AWS, GCP, and Azure.
Used to automate and manage your infrastructure, platform and the services that run on that platform
Terms
- Declarative = define WHAT end result you want
- Imperative = define exact steps - HOW to get to the end result you want
*Infrastructure Provisioning = to run your application => spin up 5 servers and deploy 5 microservices as Docker containers on each one; the AWS platform could be used to build the whole infrastructure
A Tool for Infrastructure Provisioning. The preparation for infrastructure setup involves:
- 1) private network space ; 2) EC2 server instances ; 3) Install Docker and other tools ; 4) Security
Comparing Ansible and Terraform
- Both follow IaC => both automate provisioning, configuring, and managing the infrastructure
- Terraform => mainly used for provisioning the infrastructure (initial setup of the server environment)
- Ansible => mainly a configuration tool (deploy apps; install/update software)
So Terraform can be used to:
- Create an infrastructure ; Make changes to an existing infrastructure ;
- Replicate an infrastructure (ie: an infrastructure worked well in DEV env, we can replicate it to Staging and PROD env)
- Terraform core designs a plan to get the current state of the infrastructure to the desired state, and it then uses the resources from providers to execute that plan. Here is a simple example of a config file where we describe the desired state (declarative):
# Configure the AWS Provider
provider "aws" {
version = "~> 2.0"
region = "us-east-1"
}
# Create a VPC (resource)
resource "aws_vpc" "example" {
cidr_block = "10.0.0.0/16"
}
Life w/o Terraform
Example Scenario w/o Terraform
- A business wants to rollout an application => so they come up with requirements for the application
- Business Analyst converts the reqs into a set of high level technical requirements
- Solution Architect/TechLead receives this and designs an architecture for the application, including the infrastructure = type, spec, # of servers, DBs, etc
- These resources must be deployed in the on-premise environment of the company
- - any other required hardware must be ordered through the procurement team and it will be delivered to the company's data center
- Next onto the infrastructure team
- Field Engineers - would be in charge of the racks of equipment
- System / Network Admin - performs initial configuration and makes it available on the company's network
- Storage & Backup Admin - setup storage and backups for the servers
- Now the system can be handed over to the application team to deploy their application
Con: slow deployment, expensive, limited automation
Some companies automated infrastructure provisioning with: Shell, Python, Ruby, Perl, Powershell
Using these languages to build provisioning scripts resulted in long code files; instead use => Infrastructure as Code with Terraform
Configuration Management (ie: Ansible) - Designed to install and manage software
Infrastructure Provisioning (ie: Terraform) - Deploy/Acquire immutable infrastructure resources: servers, databases, network components
* While Cloud Formation is specifically designed to deploy services in AWS, Terraform can be used across different cloud providers or on-premise providers.
* Provider = a provider (ie: databases, servers) helps Terraform manage 3rd party platforms through their api
Getting started with Terraform
Terraform uses HashiCorp Configuration Language = a simple declarative language
All infrastructure resources can be defined within the configuration files *.tf
- Example: main.tf
resource "aws_instance" "webserver" {
ami = "ami-80dfuu98j48uf9844"
instance_type = "t2.micro"
}
# creates an aws s3 bucket
resource "aws_s3_bucket" "finance" {
bucket = "ami-80874jrf857h9844"
tags = {
Description = "Finance-and-Payroll"
}
}
resource "aws_iam_user" "admin-user" {
name = "lucy"
tags = {
Description = "Team-Leader"
}
}
In the above file, we tell Terraform what the end result should look like, and Terraform will take the necessary steps to make the current state match it. It works in 3 phases: Init ; Plan ; Apply
Terraform manages the life cycle of its resources from its provisioning, to configuration, and to decommissioning. Terraform keeps track of the current state through *.tfstate file
HCL format
<block> <parameters> {
key1 = value1
key2 = value2
}
A block in Terraform contains info about the infrastructure platform and a set of resources within it that we want to create. An example: we want to create a local file on the machine where Terraform is installed
$ mkdir terraform-local-file
$ cd terraform-local-file
Inside terraform-local-file directory, create a file
- local.tf
/* "local_file" = provider_resourceType *Note: local is a basic provider, other providers include: AWS, Azure, etc and each provider has a set of resources and for each of these resources it would require its own type of arguments - more details on these can be found in the terraform documentation */
# "plant"= resource name
# Argument: stuff written inside the block in key = value format
resource "local_file" "plant" {
filename = "plants.txt"
content = "We love plants!"
}
Run the 3 step commands
$ terraform init
To see the execution plan carried out by Terraform
$ terraform plan
$ terraform apply
To get the details of the resource created
$ terraform show
Lab Notes
A .tf file is used for writing configuration files in Terraform using HCL
The content of a file will not be displayed (ie: in plan output) when using the local_sensitive_file resource instead of local_file.
Resources are destroyed using:$ terraform destroy
Terraform uses a plugin based architecture to work with 100s of platforms. Terraform providers are distributed by Hashicorp and are publicly available in the Terraform registry: registry.terraform.io
There are 3 tiers of providers:
- 1) Official providers (owned by Hashicorp, ie: AWS, Azure, local) ;
- 2) Verified providers (owned by 3rd party that has gone through a partner provider process with Hashicorp, ie: heoku, digitalocean) ;
- 3) Community providers (owned by individual contributing users)
terraform init is a safe command that can be run many times without affecting resources; it installs the required plugins inside a hidden directory called .terraform/plugins
When Terraform apply command is executed, it will consider all files with *.tf in the configuration directory.
Types of variables
Input Variables
resource "local_file" "pet" { filename = var.filestuff["filename"] content = var.filestuff["content"] } resource "random_pet" "my-pet" { prefix = var.prefix[0] separator = var.separator length = var.length }
variable "bella" { type = object({ name = string color = string age = number food = list(string) favorite_pet = bool }) default = { name = "bella" color = "orange" age = 7 food = ["fish", "milk", "grapes"] favorite_pet = true } } |
variable "filestuff" { type = map default = { "filename" = "/root/pets.txt" "content" = "We love pets!" } } variable "prefix" { default = ["Ms", "Mr", "Mrs"] type = list // This also works: type = list(string) } variable "separator" { default = "." } # type forces the variable to be a number and descr is optional variable "length" { default = "1" type = number description = "Number of random names generated by terraform" } // tuple example variable kitty { type = tuple([string, number, bool]) default = ["cat", 7, true] } |
If a variable's default value is left blank (no default included in the variable block) in the variable.tf file, then when we run the terraform apply command it will prompt us to enter a value for each variable referenced in main.tf - unless we set environment variables or pass variables as command-line args with the -var flag (ie:$ terraform apply -var filename="/root/pets.txt" -var content="..." ...)
Setting Environment Variables for Terraform
:$ export TF_VAR_filename="/root/pets.txt"
. . .
:$ terraform apply
Variable values can also be set with Variable Definition files
- terraform.tfvars
content = "We love pets!"
. . .
Variable Definition Precedence - values get rewritten as Terraform checks from left to right
Environment Variables >> terraform.tfvars >> *.auto.tfvars (alphabetical order) >> -var var_name=var_value OR -var-file *.tfvars
Linking 2 Resources together using Resource Attribute Reference
resource "local_file" "pet" {
filename = var.filestuff["filename"]
content = "My favorite pet is ${random_pet.my-pet.id}"
}
// Note: the random provider is documented in the Terraform registry
// When the terraform apply command is run, this resource generates a random name (ie: Bull) and exposes an output attribute: id=Mr.Bull
// This id attribute can be referenced in another resource, as done in the local_file resource above
resource "random_pet" "my-pet" {
prefix = var.prefix
separator = var.separator
length = var.length
}
Types of Resource Dependencies
In the above example, Terraform figures out that the local_file depends on the random_pet resource (because it references it), so it executes them in the right order - an implicit dependency. If we don't reference the other resource but still want execution in a specific order => use the depends_on sub-block (explicit dependency)
explicit dependency example
// depends_on sub-block will make sure to execute random_pet first and then local_file
resource "local_file" "pet" {
filename = var.filename
content = var.content
depends_on = [
random_pet.my-pet
]
}
resource "random_pet" "my-pet" {
prefix = var.prefix
separator = var.separator
length = var.length
}
resource "local_file" "pet" {
filename = var.filename
content = var.content
depends_on = [
random_pet.my-pet
]
}
resource "random_pet" "my-pet" {
prefix = var.prefix
separator = var.separator
length = var.length
}
Output Variables in Terraform
Output variables get printed on the screen when terraform apply command is run.
Output variables example
- main.tf
resource "local_file" "pet" {
filename = var.filename
content = "My favorite pet is ${random_pet.my-pet.id}"
}
resource "random_pet" "my-pet" {
prefix = var.prefix
separator = var.separator
length = var.length
}
output pet-name {
value = random_pet.my-pet.id
description = "Record the value of pet ID generated by the random-pet resource"
}
:$ terraform init
:$ terraform apply
:$ terraform output
> pet-name = Mr.Bull
Output variables are useful when you want to display provision details of a resource or feed the output variables to other tools like Ansible for configuration management.
Purpose of using State in Terraform
A state file is like a blueprint of all the resources that Terraform manages. When Terraform creates a resource, it records its identity in the state. The state file also tracks metadata such as resource dependencies. Terraform stores a cache of attribute values for all resources in the terraform.tfstate file. The cached attributes can be used with this command:$ terraform plan --refresh=false
In bigger organizations, when working with a team, it is better to have the terraform.tfstate file located on a remote system like AWS S3 so that multiple members don't change the state at the same time.
Terraform State Consideration
The state file contains sensitive information - it has every detail about our infrastructure. Don't ever manually try to edit the state file; it is meant for Terraform's internal use
useful Terraform commands
Check for syntax errors
:$ terraform validate
Format the configuration files into the canonical style (prints the names of the files it changed)
:$ terraform fmt
Show the current state of the infrastructure
:$ terraform show
To see a list of all providers used in the configuration directory
:$ terraform providers
To update the state file / cached attribute values to match the real-world infrastructure
:$ terraform refresh
To see a dependency graph
:$ apt update
:$ apt install graphviz -y
:$ terraform graph | dot -Tsvg > graph.svg
Now the graph.svg file will contain the visual graph
Mutable vs Immutable Infrastructure
Mutable => upgrading an nginx server (or any other resource) in place from v1.17 to v1.18 and then to v1.19
Immutable => adding a new nginx server (or any other resource) with v1.18 and deleting the existing one with v1.17, and then doing the same for v1.19
Lifecycle rules, and data sources
Lifecycle Rules
We don't want terraform to delete a resource and then create a new one, so ...
resource "local_file" "pet" {
filename = "/root/pets.txt"
content = "We love pets"
file_permission ="0700"
lifecycle = {
create_before_destroy = true
}
}
If you don't want any resource to be destroyed or prevent resources from accidentally getting deleted
lifecycle {
prevent_destroy = true
}
Data Sources allow Terraform to read attributes from resources which are provisioned outside its control. (for ex: we create pets.txt in root dir with Terraform but what if we manually create plants.txt in root dir?) It uses data block to do this but infrastructure resources with data blocks can only be read, no deleting or updating it.
filename = "/root/pets.txt" content = "We love pets" } data "local_file" "plant" { filename = "/root/plants.txt" } // Now the tfstate will have info about plants.txt file as well |
filename = "/root/pets.txt" content = data.local_file.plant.content } // read the content from plants.txt and set it in pets.txt data "local_file" "plant" { filename = "/root/plants.txt" } |
Meta Arguments
Create multiple instances of the same resource
Example1
- main.tf
filename = "/root/user-data"
content = "password: S3cr3tP@ssw0rd"
count = 3
}
More examples of creating multiple instances and For_each example
Example 2
- main.tf
resource "local_file" "users" { // assumed resource label
filename = var.users[count.index]
content = var.content
count = length(var.users)
}
- variable.tf
variable "users" {
type = list(string)
default = ["/root/user10", "/root/user11", "/root/user12", "/root/user10"]
}
variable "content" {
default = "password: S3cr3tP@ssw0rd"
}
For_each example
- variable.tf (option 1: declare it as a set, because 'for_each' only works with sets and maps)
variable "filename" {
type = set(string)
default = [
"/root/pets.txt",
"/root/cats.txt",
"/root/dogs.txt"
]
}
- main.tf
resource "local_file" "pet" { // assumed resource label
filename = each.value
for_each = var.filename
}
- variable.tf (option 2: keep it as a list and convert it with toset())
variable "filename" {
type = list(string)
default = [
"/root/pets.txt",
"/root/cats.txt",
"/root/dogs.txt"
]
}
- main.tf
resource "local_file" "pet" {
filename = each.value
for_each = toset(var.filename)
}
Note: resources created with count are tracked as a list, while resources created with for_each are tracked as a map
Version Constraints
Examples of specifying a version for a provider
terraform {
required_providers {
local = {
source = "hashicorp/local"
version = "1.2.2"
}
}
}
resource "local_file" "innovation" {
filename = var.path
content = var.message
}
- main2.tf
terraform {
required_providers {
k8s = {
source = "hashicorp/kubernetes"
version = "> 1.12.0, != 1.13.1, < 1.13.3"
}
helm = {
source = "hashicorp/helm"
// use any v from 1.2.0 to 1.2.9
version = "~> 1.2.0"
}
}
}
- main3.tf
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "> 3.45.0, !=3.46.0, < 3.48.0"
}
}
}
resource "google_compute_instance" "special" {
name = "aone"
machine_type = "e2-micro"
zone = "us-west1-c"
}
Other tools Devops engineers may use
GitLab CI/CD
GitHub and GitLab are separate and competing platforms. GitLab CI/CD can be used to run tests, build a docker image, and deploy. It allows keeping CI/CD & code management in the same place and there's almost no setup process.
Architecture
It has a main instance/server that hosts our application code and manages the pipelines, and GitLab runners (agents) that run our CI/CD tasks. That's why almost no setup is needed, but we can also register our own machines as runners and have the work performed there.
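A minimal .gitlab-ci.yml sketch (the stage names, images, and shell commands are placeholders - GitLab picks this file up from the repo root automatically):
stages:
  - test
  - build

run_tests:
  stage: test
  image: node:18                  # placeholder build image
  script:
    - npm ci
    - npm test                    # placeholder test command

build_image:
  stage: build
  image: docker:latest
  services:
    - docker:dind                 # Docker-in-Docker service so the job can build images
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_JOB_TOKEN $CI_REGISTRY
    - docker build -t $CI_REGISTRY_IMAGE:latest .
    - docker push $CI_REGISTRY_IMAGE:latest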
OpenStack
OpenStack is an open-source cloud platform like AWS. It allows organizations to deploy and manage their own private or public clouds, giving them more control over the infrastructure compared to AWS. While OpenStack allows scaling, it requires manual configuration and management. It also requires more technical expertise for deployment and management. It involves setting up and configuring the necessary components and infrastructure.
JFrog
(somewhat like DockerHub - collection of executables, artifacts)
('delivery') JFrog Artifactory is a universal binary repository manager that supports multiple package formats like Maven, npm, Docker, NuGet, and more. It provides a central hub for managing and distributing software packages across various technologies and ecosystems. It enables automated continuous integration and delivery. Add Artifactory to your toolchain and store build artifacts in your Artifactory repository.
Key features: artifact mgmt, build integration, security & access control, metadata & search capabilities, and replication & distribution (delivery)
BitBucket
(similar to GitHub and GitLab)
Bitbucket is a web-based platform that provides Git and Mercurial repository hosting. It offers version control capabilities along with collaboration features like pull requests, code review, and issue tracking. Bitbucket is often used in conjunction with other Atlassian tools like Jira and Confluence.
Key features: version control, collaboration & code review, issue tracking & prj mgmt, CI/CD integration, and offers extensions
Prometheus
(monitoring)
Prometheus is a free software application used for event monitoring and alerting. It records metrics in a time series database built using an HTTP pull model, with flexible queries and real-time alerting.
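A minimal prometheus.yml sketch showing the pull model (the job name and target host:port are placeholders for an app exposing a /metrics endpoint):
global:
  scrape_interval: 15s                 # how often Prometheus pulls metrics

scrape_configs:
  - job_name: "my-app"                 # placeholder job name
    static_configs:
      - targets: ["myapp-host:8080"]   # placeholder target exposing /metrics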
Nagios
(monitoring)
Nagios Core is a free and open-source application that monitors systems, networks, and infrastructure. Nagios offers monitoring and alerting services for servers, switches, applications, and services. It is designed to run on the Linux OS and can monitor devices running Linux, Windows, and Unix OSes. Nagios runs periodic checks on critical parameters of application, network, and server resources.