Create and switch to a new branch

git checkout -b <branch-name>

Find the guilty commit with binary search

git bisect start             # Start the search
git bisect bad               # Mark the currently checked-out commit as bad
git bisect good v2.6.13-rc2  # Mark a known good commit or tag
git bisect bad               # Say the current state is bad
git bisect good              # Say the current state is good
git bisect reset             # Finish the search

Find lines matching the pattern (regex or string) in tracked files

git grep --heading --line-number 'foo bar'

Force push while ensuring you don’t overwrite others’ work

git push --force-with-lease <remote-name> <branch-name>

List all branches and their upstreams, as well as the last commit on each branch

git branch -vv

List all git aliases

git config -l | grep alias | sed 's/^alias\.//g'

List ignored files

git check-ignore *

Prevent auto replacing LF with CRLF

git config --global core.autocrlf false

Prune references to remote-tracking branches that have been deleted on the remote

git fetch -p

Remove branches that have already been merged with master

git branch --merged master | grep -v '^\*\|  master' | xargs -n 1 git branch -d
# will not delete master if master is not checked out

See all commits made since forking from master

git log --no-merges --stat --reverse master..

Show changes over time for a specific file

git log -p <file_name>

Trace git remote commands

Add the following environment variables in your terminal.

export GIT_TRACE_PACKET=1
export GIT_TRACE=1
export GIT_CURL_VERBOSE=1

Visualize the version tree

git log --pretty=oneline --graph --decorate --all
  • You can save an alias in your .gitconfig file to make this easier:
    [alias]
    lg = log --color --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit
    lg2 = log --graph --abbrev-commit --decorate --format=format:'%C(bold blue)%h%C(reset) - %C(bold cyan)%aD%C(reset) %C(bold green)(%ar)%C(reset)%C(bold yellow)%d%C(reset)%n'' %C(white)%s%C(reset) %C(dim white)- %an%C(reset)' --all

What changed in the last two weeks?

  • Using git log

    git log --no-merges --raw --since='2 weeks ago'
  • Using git whatchanged (legacy command)

    git whatchanged --since='2 weeks ago'

Bash common mistakes and useful tricks

Never use backticks

  • ❌ incorrect

    # they are not always portable to other OSes
    # they cannot be nested without escaping
    `call_command_in_subshell`
  • ✔️ correct

    $(call_command_in_subshell)

Multiline pipe

  • ❌ incorrect

    # hard to read
    ls ${long_list_of_parameters} | grep ${foo} | grep -v grep | pgrep | wc -l | sort | uniq
  • ✔️ correct

    ls ${long_list_of_parameters} \
    | grep ${foo} \
    | grep -v grep \
    | pgrep \
    | wc -l \
    | sort \
    | uniq

Avoid overusing grep and grep -v

  • ❌ incorrect

    ps ax | grep ${processname} | grep -v grep
  • ✔️ correct (with userland utilities):

    pgrep ${processname}

Avoid calling awk(1) just to print an element

  • ❌ incorrect

    echo ${listofthings} | awk '{ print $3 }' # get the third item
  • ✔️ correct

    listofthings=(${listofthings}) # convert to an array
    echo ${listofthings[2]} # get the third item (counting starts at 0)

Use built-in variable expansion instead of sed/awk

  • ❌ incorrect

    VAR=FOO
    printf ${VAR} | awk '{print tolower($0)}' # foo
  • ✔️ correct

    # ${VAR^} # upper single
    # ${VAR^^} # upper all
    # ${VAR,} # lower single
    # ${VAR,,} # lower all
    # ${VAR~} # swap case single
    # ${VAR~~} # swap case all

    VAR=BAR
    printf ${VAR,,} # bar
  • The same applies to string replacement.

    # ${VAR/PATTERN/STRING} # single replacement
    # ${VAR//PATTERN/STRING} # all match replacement
    # Use ${VAR#PATTERN} ${VAR%PATTERN} ${VAR/PATTERN} for string removal

    VAR=foofoobar
    printf '%s\n' "${VAR/foo/bar}"  # barfoobar
    printf '%s\n' "${VAR//foo/bar}" # barbarbar
    printf '%s\n' "${VAR//foo}"     # bar

Avoid seq for ranges

Use the built-in {x..y} expression instead.

for k in {1..100}; do
    do_awesome_stuff_with_input "${k}"
done

Timeouts
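
A minimal sketch, assuming GNU coreutils timeout(1) is available; some_slow_command is a hypothetical placeholder.

# kill the command if it does not finish within 5 seconds
timeout 5s some_slow_command || echo "timed out or failed" >&2

# the read built-in also supports a timeout for user input
read -r -t 10 -p "continue? [y/N] " answer || answer="n"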

Bash arithmetic instead of expr
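
A minimal sketch: expr(1) forks an external process for every calculation, while the $(( )) built-in does the arithmetic inside the shell.

# avoid: forks expr(1) for a trivial addition
result=$(expr 2 + 3)

# prefer: bash built-in arithmetic, no fork
result=$(( 2 + 3 ))
(( result > 4 )) && echo "result is ${result}"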

Never use bc(1) for modulo operations
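
A minimal sketch: the modulo operator is available directly in bash arithmetic, so piping through bc(1) is unnecessary.

# avoid: remainder=$(echo "7 % 3" | bc)
remainder=$(( 7 % 3 ))
echo "${remainder}" # 1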

disown

disown is a bash built-in that removes a job from the job table of a bash script. You can remove one or more background processes with disown and the script will no longer care about them.
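
A minimal sketch, assuming a hypothetical long_running_task command:

long_running_task &   # send the job to the background
disown                # remove the most recent job from the job table
# the script can now exit without waiting for or signalling the job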

Basic parallelism

  • Usually people use & to send a process to the background and wait to block until the process finishes.
  • People then often use named pipes, files and global variables to communicate between the parent and sub programs.
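
A minimal sketch, assuming a hypothetical do_work function:

do_work "part-1" &    # start two workers in the background
do_work "part-2" &
wait                  # block until all background jobs have finished
echo "all workers done"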

xargs

For file-based in-node parallelization, xargs is the easiest way to parallelize the processing of list elements.

# simple example: replace all occurrences of "foo" with "bar" in ".txt" files
# each file is processed individually, with up to 16 processes in parallel
find . -name "*.txt" | xargs -n1 -P16 -I{} sed -i 's/foo/bar/g' {}

# complex example: HDF5 repack for transparent compression of files
# find all ".h5" files in "${dirName}" and use up to 64 processes in parallel to independently compress them
find ${dirName} -name "*.h5" | xargs -n1 -P64 -I{} \
    sh -c 'echo "compress $1 ..." && \
           h5repack -i $1 -o $1.gz -f GZIP=1 && mv $1.gz $1' _ {}

coproc and GNU parallel
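
A minimal sketch of both: coproc is a bash built-in that starts a command as a co-process and exposes its stdin/stdout as file descriptors, while GNU parallel (if installed) is a userland alternative to xargs -P.

# co-process: talk to a long-running bc(1) instance
coproc CALC { bc -l; }
echo "4 * atan(1)" >&"${CALC[1]}"   # write to the co-process stdin
read -r pi <&"${CALC[0]}"           # read its answer
printf 'pi is roughly %s\n' "${pi}"

# GNU parallel: gzip all .log files, running one job per CPU core
parallel gzip ::: *.log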

Trapping, exception handling and failing gracefully

  • trap is used for signal handling in bash; a generic error handling function may be used like this:

    readonly banner="my first bash project >>"
    function fail() {
        # generic fail function for bash scripts
        # arg: 1 - custom error message
        # arg: 2 - file
        # arg: 3 - line number
        # arg: 4 - exit status
        echo "${banner} ERROR: ${1}." >&2
        [[ ${2+defined} && ${3+defined} && ${4+defined} ]] && \
            echo "${banner} file: ${2}, line number: ${3}, exit code: ${4}. exiting!"

        # generic clean up code goes here (tempfiles, forked processes, ..)

        exit 1
    } ; trap 'fail "caught signal"' HUP INT QUIT TERM # note: KILL cannot be trapped

do_stuff ${withinput} || fail "did not do stuff correctly" ${FILENAME} ${LINENO} $?

Trapping on EXIT instead of a specific signal is particularly useful for cleanup handlers since this executes the handler regardless of the reason for the script’s termination. This also includes reaching the end of your script and aborts due to set -e.
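
A minimal sketch of an EXIT cleanup handler; do_stuff is a hypothetical command.

tmpfile=$(mktemp)
cleanup() {
    rm -f "${tmpfile}"   # always remove the temp file
}
trap cleanup EXIT        # runs regardless of why the script terminates
do_stuff > "${tmpfile}"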

You don’t need cat

  • Sometimes cat is not available, but with bash you can read files anyhow.

    batterystatus=$(< /sys/class/power_supply/BAT0/status)
    printf "%s\n" "${batterystatus}"
  • Avoid cat where reading a file can be achieved through passing the file name as a parameter.

    • ❌ incorrect cat ${FILENAME} | grep -v ...
    • ✔️ correct grep -v ... ${FILENAME}.

locking (file based)

  • flock(1) is a userland utility for managing file based locking from within shell scripts. It supports exclusive and shared locks.
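
A minimal sketch; the lock file path is only an example.

# ensure only one instance of this script runs at a time
exec 200>/tmp/myscript.lock
flock -n 200 || { echo "another instance is already running" >&2; exit 1; }
# ... critical section ...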

Use the getopts builtin for command line parameters

printf "This script is: %s\n" "${0##*/}"

[[ "${#}" == 0 ]] && {
    # no arguments
    printf "No options given: %s\n" ${OPTIND}
    exit 1
}

log=""       # numeric, log level
table=""     # single value
stores=( )   # array

# a ':' after a letter means that option expects an argument
while getopts ":dhls:t:" opt; do
    case "${opt}" in
        d) set -x ;;
        h) printf "Help page\n" ; exit ;;
        s) stores[${#stores[*]}]="${OPTARG}" ;;
        t)
            if [ -z "${table}" ]; then
                table="${OPTARG}"
            fi
            ;;
        l) (( log++ )) ;;
        *)
            printf "Option does not exist: %s\n" "${OPTARG}"
            exit 1
            ;;
    esac
done

# enable debug output if -l was given two or more times
(( log >= 2 )) && {
    set -x ; log=""
}
[[ "${log}" == "" ]] && unset log

Style Guide

Based on Community Bash Style Guide

When to use bash

  • ✔️ To glue userland utilities together.
  • ❌ For complex tasks (e.g. database queries).

Style conventions

  • Use the #!/usr/bin/env bash shebang wherever possible.

  • Memorize and utilize set -eu -o pipefail at the very beginning of your code.

    • Never write a script without set -e at the very very beginning.
      • This instructs bash to terminate in case a command or chain of commands finishes with a non-zero exit status, avoiding unhandled error conditions.
      • Use constructs like if myprogramm --parameter ; then ... for calls that might fail and require specific error handling. Use a cleanup trap for everything else.
    • Use set -u in your scripts.
      • This will terminate your scripts in case an uninitialized variable is accessed.
      • Accessing an uninitialized variable will also fail if your script is used in another script that sets the -u flag; failing early is better for security.
    • Use set -o pipefail to get an exit status from a pipeline (last non-zero will be returned).
  • Never use TAB for indentation.

    • Consistently use two (2) or four (4) character indentation.
  • Always put parameters in double-quotes: util "--argument" "${variable}".

  • Put if .. then, while .. do, for .. do, and case .. in on one line rather than breaking then / do onto a new line.

    if ${event}; then
        ...
    fi

    while ${event}; do
        ...
    done

    for v in ${list[@]}; do
        ...
    done
  • Never forget that you cannot put a space/blank between a variable name and its value during an assignment.

    RET1=false   # assign
    RET2 = false # will not assign
  • Always set local function variables local.

  • Write clear code.

    • Never obfuscate what the script is trying to do.
    • Never shorten code unnecessarily by chaining many commands per line with semicolons.
  • Bash does not have a concept of public and private functions.

    • Public functions get generic names.
    • Private functions are prefixed with two underscores (Red Hat convention).
  • Try to stick to the pushd, popd, and dirs builtins for directory stack manipulation where sensible.

  • Every line must be at most eighty (80) terminal columns long.

  • Like in other dynamic languages, switch/case blocks should be aligned:

    case ${contenders} in
        teller)  x=4 ;;
        ulam)    c=1 ;;
        neumann) v=7 ;;
    esac
  • Only trap / handle signals you actually do care about.

  • Use the builtin readonly when declaring constants and immutable variables.

  • Assign integer variables, arrays, etc. with typeset/declare.

  • Always work with return values instead of strings passed from a function or userland utility (where applicable).

  • Write generic small check functions instead of large init and clean-up code.

    # both functions return non-zero on error
    function is_valid_string?() {
        [[ $@ =~ ^[A-Za-z0-9]*$ ]]
    }
    function is_integer?() {
        [[ $@ =~ ^-?[0-9]+$ ]]
    }
  • Be as modular and pluggable as possible. If a project gets bigger, split it up into smaller files with a clear and obvious naming scheme.

  • Clearly document code parts that are not easily understood (long chains of piped commands for example).

  • Try to stick to restricted mode where it is sensible and possible to use.

    • Use set -r with caution: while this flag is very useful for security sensitive environments, scripts have to be written with the flag in mind.
    • Adding restricted mode to an existing script will most likely break it.
  • Scripts should somewhat reflect the following general layout:

    #!/usr/bin/env bash
    #
    # AUTHORS, LICENSE and DOCUMENTATION
    #
    set -eu -o pipefail

    Readonly Variables
    Global Variables

    Import ("source scriptname") of external source code

    Functions
    `-. function local variables
    `-. clearly describe interfaces: return either a code or string

    Main
    `-. option parsing
    `-. log file and syslog handling
    `-. temp. file and named pipe handling
    `-. signal traps

    ------------------------------------------------------------------------
    To keep in mind:
    - quoting of all variables passed when executing sub-shells or cli tools
    - testing of functions, conditionals and flow (see style guide)
    - does restricted mode ("set -r") make sense here for security?
  • Silence is golden - like in any UNIX program, avoid cluttering the terminal with useless output.

Resources

Linting and static analysis

Portability

Test driven development and Unit testing

Profiling

Flag

Script

  • A switch or simple Boolean option.

    #!/bin/bash

    while [ True ]; do
        if [ "$1" = "--alpha" -o "$1" = "-a" ]; then
            ALPHA=1
            shift 1
        else
            break
        fi
    done

    echo $ALPHA
  • Steps

    • Infinite loop until the break instruction in the else branch is reached.
    • The if statement attempts to match whatever argument is found in the first position ($1) to either --alpha or -a.
    • Prints the value of ALPHA when it finishes.

Test the script

  • It detects the --alpha argument.

    $ bash ./test.sh --alpha
    1
  • It detects the -a argument.

    $ bash ./test.sh -a
    1
  • No --alpha or -a argument, no output.

    $ bash ./test.sh

  • Extra arguments are ignored.

    $ bash ./test.sh --alpha foo
    1
    $

Detecting arguments

Script

  • Catch arguments that aren’t intended as options: dump remaining arguments into a Bash array.

    #!/bin/bash

    while [ True ]; do
        if [ "$1" = "--alpha" -o "$1" = "-a" ]; then
            ALPHA=1
            shift 1
        else
            break
        fi
    done

    echo $ALPHA

    ARG=( "${@}" )
    for i in ${ARG[@]}; do
        echo $i
    done

Test the script

  • It detects the --alpha argument, and also prints foo.

    $ bash ./test.sh --alpha foo
    1
    foo
  • No --alpha argument, so the first output line is empty; foo is printed after that empty line.

    $ bash ./test.sh foo

    foo
  • It detects the --alpha argument, and also prints foo and bar.

    $ bash ./test.sh --alpha foo bar
    1
    foo
    bar

Options with arguments

Script

  • Some options require an argument all their own.

    #!/bin/bash

    while [ True ]; do
        if [ "$1" = "--alpha" -o "$1" = "-a" ]; then
            ALPHA=1
            shift 1
        elif [ "$1" = "--config" -o "$1" = "-c" ]; then
            CONFIG=$2
            shift 2
        else
            break
        fi
    done

    echo $ALPHA
    echo $CONFIG

    ARG=( "${@}" )

    for i in ${ARG[@]}; do
        echo $i
    done
  • To implement this, you can use the shift keyword as you did on the switch, but shift the arguments by 2 instead of 1.

  • elif compares each argument to both --config and -c.

    • If match: the value of a variable called CONFIG is set to the value of whatever the second argument is.
    • This means that the --config option requires an argument.
    • All arguments shift place by 2: 1 to shift --config or -c, and 1 to move its argument.

Test the script

$ bash ./test.sh --config my.conf foo bar
my.conf
foo
bar
$ bash ./test.sh -a --config my.conf baz
1
my.conf
baz

CloudFront

  • CDN (Content Delivery Network)
  • It retrieves data from an Amazon S3 bucket and distributes it to multiple datacenter locations.
  • It delivers the data through a network of data centers called edge locations. When a user requests data, the request is routed to the nearest edge location, resulting in low latency, reduced network traffic, and fast access to the data.

Set up

AWS Console - public bucket

  1. Sign in to the AWS management console.
  2. Upload your content to Amazon S3 and make every permission public.
  3. Go to the CloudFront console: Select a delivery method for your content -> Get Started.
  4. Origin Domain Name -> the Amazon S3 bucket you created.
  5. Accept the defaults on the following screens and click the Create Distribution button.
  6. When the Status column changes from “In Progress” to “Deployed”, select the Enable option.
  7. Wait around 15 minutes for the domain name to become available in the Distributions list.
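
If you prefer the command line over the console, a roughly equivalent distribution for a public bucket can be created with the AWS CLI; the bucket name below is only an example.

# create a CloudFront distribution in front of a public bucket
aws cloudfront create-distribution \
    --origin-domain-name my-public-bucket.s3.amazonaws.com \
    --default-root-object index.html

# poll until Status changes from "InProgress" to "Deployed"
aws cloudfront list-distributions \
    --query 'DistributionList.Items[].{Id:Id,Status:Status}'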

CloudFormation - private bucket

graph LR;

A[Bucket]
B[Cloudfront]
C[User]

A -- bucket data --> B;
B -- bucket data --> C;
C --> B;
B -- request with OAI --> A;

Bucket

Bucket:
  Type: AWS::S3::Bucket
  Properties:
    AccessControl: Private
    BucketName: private-bucket
    Tags:
      - Key: description
        Value: "Private files"

OAI (Origin Access Identity)

CloudFrontOriginIdentity:
  Type: AWS::CloudFront::CloudFrontOriginAccessIdentity
  Properties:
    CloudFrontOriginAccessIdentityConfig:
      Comment: 'origin identity'

Update Bucket Policy

BucketPolicy:
  Type: AWS::S3::BucketPolicy
  Properties:
    Bucket: private-bucket
    PolicyDocument:
      Version: '2012-10-17'
      Statement:
        - Effect: Allow
          Principal:
            AWS: !Sub 'arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity'
            # you may get the recently created one with '${CloudFrontOriginIdentity}'
          Action: 's3:GetObject'
          Resource: arn:aws:s3:::private-bucket/*

CloudFront Distribution

publicDistribution:
  Type: AWS::CloudFront::Distribution
  Properties:
    DistributionConfig:
      Origins:
        - DomainName: private-bucket.s3.us-east-2.amazonaws.com
          # careful with '${bucket name}.s3.${region}.amazonaws.com'
          Id: S3-private-bucket
          S3OriginConfig:
            OriginAccessIdentity: !Sub 'origin-access-identity/cloudfront/${CloudFrontOriginIdentity}'
      Enabled: 'true'
      Comment: Some comment
      DefaultCacheBehavior:
        AllowedMethods:
          - GET
          - HEAD
        TargetOriginId: S3-private-bucket
        ForwardedValues:
          QueryString: 'false'
          Cookies:
            Forward: none
        ViewerProtocolPolicy: redirect-to-https
      ViewerCertificate:
        CloudFrontDefaultCertificate: 'true'