Thursday, 25 June 2015

Grouping bash history by time


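Inspiration:
Different patches go to different projects. When I suddenly needed to reuse the commands from an earlier patch to one particular project, I found my history was a hopeless mess. The command paths all look similar, so the commands are hard to tell apart unless there is a way to see at a glance which time period each one falls in.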

First, here is part of my ~/.bashrc:

shopt -s histverify
shopt -s histappend
HISTTIMEFORMAT="%Y/%m/%d %T "
alias histime='history'
alias hisdefault='(HISTTIMEFORMAT=""; history;)'
alias h=hisdefault
alias htime=histime
HISTFILESIZE= 
HISTSIZE= 
HISTCONTROL=ignoreboth



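The settings above work OK for now (except for HISTSIZE=, see my answer), but once the history grows to tens of thousands of entries the timestamps become fairly meaningless; at a glance I never even think to look at the time.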

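Usually, when I need a command similar to one I have typed before, the only way to find it is grep (Tab + Page Up/Down also works). But the entries come from multiple tabs, so the times are not neatly ordered and I can't tell which command was typed where. Here is a screenshot of my histime output: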


So I set out to write hisblock.sh.

The utilities I use are all stock ones (nothing extra to install). But my history has more than 40,000 lines, and parsing them one by one in a shell script is very inefficient, painfully slow.

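I also tried a rather complicated binary search, halving again and again in a loop to find the time range, but it kept getting more complicated and would very likely have had bugs.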

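Then I thought about it again: just forge two records as buoys and append them to the history output. After that everything can simply be sorted, grep finds the index of each buoy, and sed pulls out that range of lines. No need to bother with any binary search over the timestamps at all.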
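
The buoy trick described above can be sketched in isolation as follows. This is only a minimal sketch, not the script itself: `slice_by_markers` is a hypothetical helper name, and it assumes an input file whose lines look like `INDEX EPOCH COMMAND...`, with no timestamp exactly equal to either boundary and no ordinary line starting with `START ` or `END `.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the lines of FILE whose epoch (field 2) lies
# between START_T and END_T, without comparing timestamps line by line.
slice_by_markers() {
    local file="$1" start_t="$2" end_t="$3"
    local sorted start_index end_index
    sorted="$(mktemp)" || return 1
    # Append two fake records (the buoys), then sort everything by epoch.
    { cat -- "$file"; printf 'START %s\nEND %s\n' "$start_t" "$end_t"; } |
        sort -n -k2,2 >"$sorted"
    # The line numbers of the buoys delimit the requested time range.
    start_index="$(grep -n -m1 '^START ' "$sorted" | cut -d: -f1)"
    end_index="$(grep -n -m1 '^END ' "$sorted" | cut -d: -f1)"
    # Quit at the END buoy; before that, print what follows the START buoy.
    sed -n "${end_index}q;$((start_index + 1)),$((end_index - 1))p" "$sorted"
    rm -f -- "$sorted"
}
```

For example, `slice_by_markers /tmp/sample.log 150 350` (the file name is made up) would print only the lines whose second field falls between 150 and 350. Unlike hisblock.sh, this sketch leaves the two buoys out of the output.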


There are two modes to choose from: either the default B, or pass D.

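Block means a run of the same highlight color is confined to a fixed time span. For example, if you pass 3600 seconds, the command lines within one hour all get the same color, say yellow, the next hour switches to red, and so on.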

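Distance means the highlight color checks the distance between each command and the next one, and only switches to the next color once that distance exceeds the given time span. For example, if you pass 30 seconds and every following command happens within 30 seconds of the one before it (not 30 seconds in total; the 30 seconds are checked afresh from each command to the next), they all get the same color.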
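
To make the difference between the two modes concrete, here is a minimal sketch with the colors replaced by group numbers. The helper names `group_block` and `group_distance` are hypothetical; both assume one ascending epoch timestamp per line on stdin, and a new group number stands for "switch to the next color".

```bash
#!/usr/bin/env bash
# Block: a group opens at the first command seen after the previous block
# has expired, and covers the next SPAN seconds from that command.
group_block() {
    local span="$1" next_t=0 group=0 t
    while read -r t; do
        if (( t > next_t )); then
            (( group += 1 ))
            (( next_t = t + span ))
        fi
        printf '%s %s\n' "$group" "$t"
    done
}

# Distance: a group continues as long as each command follows the previous
# one within GAP seconds, no matter how long the whole chain gets.
group_distance() {
    local gap="$1" prev_t=0 group=0 t
    while read -r t; do
        if (( t - prev_t > gap )); then
            (( group += 1 ))
        fi
        prev_t="$t"
        printf '%s %s\n' "$group" "$t"
    done
}
```

Feeding the timestamps 1000, 1020, 1040, 1060, 1100 to `group_block 50` starts a second group at 1060, because the first block ran out at 1050. With `group_distance 30` the first four stay together, since each gap is only 20 seconds, and the second group starts at 1100 after a 40-second gap.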

Here is the code:

#!/usr/bin/env bash
#Author: <limkokhole@facebook.com>
fname="hisblock.sh"
h_tmp_f="/tmp/hisblock.log"
h_tmp_f2="/tmp/hissorted.log"
p_usage () {
    echo -e "
        BASIC SYNOPSIS:
                source ${fname} SINGLE_QUOTE [from_date] [from_time] [to_date] [to_time] SINGLE_QUOTE interval_in_seconds [B|D]
        Example Usage:
                . ${fname} '2015/05/21' 120 #entire day
                . ${fname} '01:30:00 07:30:00' 120 #default today
                . ${fname} '2015/05/21 01:30:00 07:30:00' 120 #same day
                . ${fname} '2015/05/21 01:30:00 2015/05/22 12:30:00' 120 B #'B' stands for fixed time Block, default
                . ${fname} '2015/05/21 01:30:00 2015/05/22 12:30:00' 120 #120 seconds
                . ${fname} '2015/05/21 01:30:00 2015/05/22 12:30:00' 15 D #'D' for distance between each history line instead of fixed block
"
}

h_swap () {
    if (( "$start_t" > "$end_t" )); then #swap to ignore "to timestamp" and "from timestamp" arg order
        read start_t end_t <<<"$end_t $start_t"
    fi
}

#This script must be run with source or . (dot, the way .bashrc is loaded): run directly, the history
#comes out corrupted even with `HISTFILE=~/.bash_history` and `set -o history`
if [[ "$(basename -- "$0")" == "$fname" ]]; then
    echo "Don't run $0 directly; use source or, better, . (dot) instead" >&2
    p_usage
    exit 1 #executed directly, so exit (`return' only works in a function or sourced script)
fi

do_distance=false
if [ "$#" -eq 3 ]; then
    if [[ "$3" == 'D' ]]; then
        do_distance=true
    fi
    t_block="$2"
elif [ "$#" -eq 2 ]; then
    t_block="$2"
else
    p_usage
    return #sourced script: use return, not exit
fi

read -r -a d_atom <<<"$1" #split the quoted date/time argument into words without glob expansion
d_len="${#d_atom[@]}"
if (( "$d_len" == 4 )); then
    start_t="$(date -d "${d_atom[0]} ${d_atom[1]}" +%s)" #start timestamp
    end_t="$(date -d "${d_atom[2]} ${d_atom[3]}" +%s)" #end timestamp
elif (( "$d_len" == 3 )); then
    start_t="$(date -d "${d_atom[0]} ${d_atom[1]}" +%s)" #start timestamp
    end_t="$(date -d "${d_atom[0]} ${d_atom[2]}" +%s)" #end timestamp
elif (( "$d_len" == 2 )); then
    today_d="$(date '+%Y/%m/%d')"
    start_t="$(date -d "${today_d} ${d_atom[0]}" +%s)" #start timestamp
    end_t="$(date -d "${today_d} ${d_atom[1]}" +%s)" #end timestamp
elif (( "$d_len" == 1 )); then
    start_t="$(date -d "${d_atom[0]} 00:00:00" +%s)" #start timestamp
    end_t="$(date -d "${d_atom[0]} 23:59:59" +%s)" #end timestamp
else
    p_usage
    return
fi
if ! [[ "$start_t" =~ ^[0-9]+$ && "$end_t" =~ ^[0-9]+$ && "$t_block" =~ ^[0-9]+$ ]]; then
    p_usage
    return
fi
h_swap

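next_t="0"

p_red=$(tput setaf 1)
p_green=$(tput setaf 10)
p_yellow=$(tput setaf 11)
p_blue=$(tput setaf 21)
p_orig=$(tput sgr0)
c_arr=($p_red $p_green $p_yellow)
c_arr_len="${#c_arr[@]}"
color_index=0

#prefix assignment: the epoch format applies to this one call only, so the caller's HISTTIMEFORMAT is left untouched after sourcing
HISTTIMEFORMAT="%s %Y/%m/%d %T " history >"$h_tmp_f"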
printf "%s\n" "START $start_t `date -d @${start_t}`" >>"$h_tmp_f"
printf "%s\n" "END $end_t `date -d @${end_t}`" >>"$h_tmp_f"
sort -k2 -n "$h_tmp_f" > "$h_tmp_f2"
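#match the full markers: a bare ^S or ^E would also hit continuation lines of multi-line entries (e.g. a heredoc's EOF)
start_index="$(grep -n -m1 "^START ${start_t} " "$h_tmp_f2" | cut -f1 -d: )"
end_index="$(grep -n -m1 "^END ${end_t} " "$h_tmp_f2" | cut -f1 -d: )"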
set -f #noglob
if [ "$do_distance" = false ] ; then #block
    sed -n $(($end_index + 1))'q;'"$start_index","$end_index"p "$h_tmp_f2" |  while read -r line; do
        h_atom=(`echo "${line}"`)
        curr_t="${h_atom[1]}"
        if (( "$curr_t" > "$next_t" )); then
            printf "%s" "${c_arr[ $(($color_index % $c_arr_len)) ]}"
            ((color_index+=1))
            ((next_t="$curr_t"+"$t_block"))
        fi
        h_tail=( "${h_atom[@]:2}" )
        echo "${h_atom[0]} ${h_tail[@]}"
    done
else #distance
    prev_t=0
    sed -n $(($end_index + 1))'q;'"$start_index","$end_index"p "$h_tmp_f2" |  while read -r line; do
        h_atom=(`echo "${line}"`)
        curr_t="${h_atom[1]}"
        if (( $(($curr_t - $prev_t)) > "$t_block" )); then
            printf "%s" "${c_arr[ $(($color_index % $c_arr_len)) ]}"
            ((color_index+=1))
        fi
        prev_t="$curr_t"
        h_tail=( "${h_atom[@]:2}" )
        echo "${h_atom[0]} ${h_tail[@]}"
    done
fi
set +f #reset glob
printf "%s" "${p_orig}"


And this is a screenshot after running hisblock.sh:


Compared with the single-color wall from before, isn't this a bit clearer? :)