Blog | The Old Hand's Document Collection (老司机的文档集)

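Bulk-installing CentOS with Cobbler

April 20, 2022 · 9 min read

Overview:

PXE (Preboot eXecution Environment) is a technology invented by Intel for booting an operating system quickly over the network. When a machine boots, the server side assigns IP information to its NIC via DHCP and tells the client the TFTP address through next_server; the client then downloads the boot image over TFTP, loads it, and completes startup. We also use a second technology, kickstart, developed by Red Hat. It was originally used in Red Hat's installer to automate installation and is now supported by many distributions. At boot time the installer follows the installation flow specified in the kickstart configuration file and completes the remaining steps automatically, reducing manual intervention. Configuring DHCP, TFTP, and kickstart by hand is usually tedious, so we use Cobbler, another tool developed by Red Hat, to quickly build and manage the whole server-side environment made up of DHCP, TFTP, kickstart, and so on.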

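Installing and configuring Cobbler:

We use CentOS 7 as the server-side system. To save on-site deployment time, we prepare the environment in advance and bring it on site ready to use; all of the following operations are done on a ThinkPad.

Because the private environment does not need external network access, we can disable SELinux and the firewall to simplify deployment. If the firewall must stay enabled, open the ports for the http/dhcp/tftp services: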

# disable selinux
sed -i 's/^SELINUX=.*$/SELINUX=disabled/' /etc/selinux/config

# disable firewalld
systemctl disable firewalld
systemctl stop firewalld

reboot

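Install Cobbler and its dependencies. Cobbler provides a command-line management tool and a web management tool, shipped in the cobbler and cobbler-web packages respectively: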

yum install epel-release
yum install cobbler cobbler-web httpd dhcp tftp xinetd rsync bind

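Configure Cobbler. Cobbler's configuration files live in /etc/cobbler. Before starting it, you need to set the server IP, DHCP, and related information. First edit the main configuration file /etc/cobbler/settings; the parameters to change are: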

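# Generate the default root password for installed systems with the following command
openssl passwd -1
# Then put the generated hash into the configuration
default_password_crypted: "$1$RUNYOYnz$QgzdhCD2T7qXWI1IPpAih0"

# Server IP, which serves DHCP and HTTP; must be a fixed internal IP address
server: 192.168.1.1

# next_server is the IP of the TFTP service; it usually needs to match server
next_server: 192.168.1.1

# Let Cobbler manage the related services automatically (config changes, start/stop, etc.)
manage_dhcp: 1
manage_tftpd: 1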

Modify the configuration of the dependent components:

sed -i '/disable/c\\tdisable\t\t\t= no' /etc/xinetd.d/tftp
service xinetd restart
Edit the DHCP subnet: vi /etc/cobbler/dhcp.template
subnet 192.168.1.0 netmask 255.255.255.0 {
     option routers             192.168.1.1;
     option domain-name-servers 192.168.1.1;
     range dynamic-bootp        192.168.1.100 192.168.1.200;
     option subnet-mask         255.255.255.0;
     filename                   "/pxelinux.0";
     default-lease-time         21600;
     max-lease-time             43200;
     next-server                $next_server;
}
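
A mistyped address in the subnet block is easy to miss until clients fail to get a lease. The following is a minimal Python sketch, not part of Cobbler, that checks the values used above with the standard ipaddress module. The function name check_dhcp_range and the reserved parameter are hypothetical helpers introduced only for this illustration.

```python
import ipaddress


def check_dhcp_range(subnet, netmask, start, end, reserved=()):
    """Return a list of problems found in a dhcp.template subnet block."""
    net = ipaddress.ip_network(f"{subnet}/{netmask}")
    lo = ipaddress.ip_address(start)
    hi = ipaddress.ip_address(end)
    problems = []
    if lo > hi:
        problems.append("range start is above range end")
    for addr in (lo, hi):
        if addr not in net:
            problems.append(f"{addr} is outside {net}")
    # Fixed addresses (for example the Cobbler server) should not be handed out dynamically.
    for name, ip in reserved:
        ip = ipaddress.ip_address(ip)
        if lo <= ip <= hi:
            problems.append(f"{name} ({ip}) falls inside the dynamic range")
    return problems


# Values copied from the dhcp.template excerpt above.
print(check_dhcp_range("192.168.1.0", "255.255.255.0",
                       "192.168.1.100", "192.168.1.200",
                       reserved=[("server", "192.168.1.1")]))  # prints []
```

An empty list means the range sits inside the subnet and does not overlap the server address.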

Start the services:

systemctl start httpd
systemctl start cobblerd

systemctl enable httpd
systemctl enable cobblerd
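Service check: Cobbler provides a check command that verifies whether the configuration meets its requirements:

cobbler check
# The first run usually prompts you to download the loaders
cobbler get-loaders
# Restart the cobbler service after changing the Cobbler configuration
systemctl restart cobblerd
# Re-sync the configuration after changing DHCP, TFTP, or related settings
cobbler sync
# While we are at it, set the access password for the web management page
htdigest /etc/cobbler/users.digest "Cobbler" cobbler

Run check repeatedly to verify that the environment is deployed correctly, adjusting the configuration files as needed, until the check result meets the requirements. This completes the installation and configuration of Cobbler. The web tool is available at: https://192.168.1.1/cobbler_web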

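Preparing the system image:

Next we need to import the system image into Cobbler and customize the kickstart configuration used for installation. The system we will deploy to the nodes is CentOS 7. Note that if you want kickstart to install a customized set of base packages, you need the DVD ISO, which carries a more complete package set; the minimal ISO provides only a limited set of packages.

# Mount the ISO on a local directory
mount -o auto CentOS-7-x86_64-DVD-1611.iso /mnt/
# Import it into Cobbler
cobbler import --name=centos7 --arch=x86_64 --path=/mnt
# List the imported distro and profile
cobbler distro list
cobbler distro report --name=centos7-x86_64
cobbler profile list
# Unmount the ISO mount point
umount /mnt/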

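In the steps above we imported the CentOS 7 image into Cobbler. There are a few core concepts to understand:

distro - the OS distribution. Each imported image corresponds to its own distro, such as centos7-x86_64, and different distros correspond to different boot images;

profile - a configuration for a distro. One distro can have multiple profiles, and an import automatically generates a default profile. Different profiles can define different kernel options and use different kickstart configurations;

system - an instance of a profile used by a specific machine, bound to the machine's MAC address. It allows the installation to be customized per machine; if all machines are installed identically, no system configuration is needed.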
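
The relationship between the three objects can be pictured with a small Python data model. This is only an illustrative sketch under assumed field names; the classes, the profile_for helper, and the file paths are hypothetical and do not reflect Cobbler's internal implementation.

```python
from dataclasses import dataclass


@dataclass
class Distro:
    name: str      # e.g. "centos7-x86_64"
    kernel: str    # boot kernel taken from the imported image
    initrd: str


@dataclass
class Profile:
    name: str
    distro: Distro
    kickstart: str             # path of the ks template this profile uses
    kernel_options: str = ""


@dataclass
class System:
    name: str
    mac: str
    profile: Profile


def profile_for(mac, systems, default):
    """Return the profile bound to this MAC address, else the default profile."""
    system = systems.get(mac.lower())
    return system.profile if system else default


# Hypothetical example data.
distro = Distro("centos7-x86_64", "/example/vmlinuz", "/example/initrd.img")
generic = Profile("centos7-x86_64", distro, "/example/custom.ks")
db_node = Profile("centos7-db", distro, "/example/db.ks", "net.ifnames=0")
systems = {"52:54:00:aa:bb:cc": System("db01", "52:54:00:aa:bb:cc", db_node)}

print(profile_for("52:54:00:AA:BB:CC", systems, generic).name)  # prints centos7-db
print(profile_for("52:54:00:00:00:01", systems, generic).name)  # prints centos7-x86_64
```

One distro feeds several profiles, and a system record only overrides the choice for a machine whose MAC address is registered.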

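Next we need to understand how Cobbler manages kickstart files. The ks file is the intermediate artifact we care about most, because it determines how the automated deployment runs and what the end result looks like. A ks file is bound to a profile. The default profile points to a default ks file, which we usually customize to meet different deployment requirements. Once a machine has PXE-booted to the profile selection menu and a system has been chosen, the installation proceeds according to the ks file of that profile.

In Cobbler, the concrete ks file is generated dynamically through CGI from ks templates and snippets. Cobbler uses templates to capture the main flow of a ks file and snippets to manage fragments that can be shared across different ks templates.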
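
To make the template and snippet idea concrete, here is a deliberately simplified Python sketch. Cobbler renders templates with its own template engine; this code only mimics the $SNIPPET('name') and $variable substitution seen in the template, and the function render_kickstart, the snippet body, and the URL are assumptions made for illustration.

```python
import re
from string import Template

# Matches calls of the form $SNIPPET('snippet_name')
SNIPPET_CALL = re.compile(r"\$SNIPPET\('([^']+)'\)")


def render_kickstart(template_text, snippets, variables):
    """Expand snippet calls first, then fill in $variables."""
    def expand(match):
        # Replace the call with the snippet body (without its trailing newline).
        return snippets[match.group(1)].rstrip("\n")

    expanded = SNIPPET_CALL.sub(expand, template_text)
    # safe_substitute leaves unknown $names untouched instead of raising.
    return Template(expanded).safe_substitute(variables)


template_text = "url --url=$tree\n$SNIPPET('main_partition_select')\n"
snippets = {"main_partition_select": "%include /tmp/partinfo\n"}  # assumed body
variables = {"tree": "http://192.168.1.1/example-tree"}           # hypothetical URL

print(render_kickstart(template_text, snippets, variables))
```

The same snippet dictionary can serve any number of templates, which is the reuse that snippets are meant to provide.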

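Our requirements are as follows:

Install a minimal CentOS 7 system; install some necessary packages by default; on first installation, partition and format only the system disk and leave other disks untouched; for ease of management, rename the network interfaces to ethX and disable IPv6 by default; to make it easy to test the whole installation flow in a virtual machine, adapt automatically to the disk name (vda or sda) when partitioning; initialize some basic configuration after installation.

First, copy Cobbler's default template to create a custom ks template: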

# kickstart template
# (includes %end blocks)
# do not use with earlier distros

#platform=x86, AMD64, or Intel EM64T
# System authorization information
auth --useshadow --enablemd5
# System bootloader configuration
#bootloader --location=mbr
# Partition clearing information
clearpart --all --initlabel
# Use text mode install
text
# Firewall configuration
firewall --disabled
# Run the Setup Agent on first boot
firstboot --disable
# System keyboard
keyboard us
# System language
lang en_US
# Use network installation
url --url=$tree
# If any cobbler repo definitions were referenced in the kickstart profile, include them here.
$yum_repo_stanza
# Network information
$SNIPPET('network_config')
# Reboot after installation
reboot

#Root password
rootpw --iscrypted $default_password_crypted
# SELinux configuration
selinux --disabled
# Do not configure the X Window System
skipx
# System timezone
timezone Asia/Shanghai

# Install OS instead of upgrade
install
# Clear the Master Boot Record
zerombr
# Allow anaconda to partition the system as needed
#autopart
$SNIPPET('main_partition_select')

%pre
$SNIPPET('log_ks_pre')
$SNIPPET('kickstart_start')
$SNIPPET('pre_install_network_config')
$SNIPPET('pre_partition_select_custom')
# Enable installation monitoring
$SNIPPET('pre_anamon')
%end

%packages
@^minimal
@core
chrony
wget
net-tools
python-setuptools
rsync
lrzsz
expect
tcl
ntpdate
-selinux-policy*
-NetworkManager*
-kexec-tools
-snappy
-wpa_supplicant
-ppp
%end

%addon com_redhat_kdump --disable --reserve-mb='auto'

%end

%post --nochroot
$SNIPPET('log_ks_post_nochroot')
%end

%post
$SNIPPET('log_ks_post')
# Start yum configuration
$yum_config_stanza
# End yum configuration
$SNIPPET('post_install_kernel_options')
$SNIPPET('post_install_network_config')
$SNIPPET('download_config_files')
$SNIPPET('cobbler_register')
# Enable post-install boot notification
$SNIPPET('post_anamon')
$SNIPPET('post_install_custom_sys')
# Start final steps
$SNIPPET('kickstart_done')
# End final steps

%end

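Note the custom snippets we added to the ks template. The first, pre_partition_select_custom, generates the partitioning and formatting options automatically based on the disk type, so it works on both virtual and physical machines. Its content is as follows: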

# Pick the partition layout based on which disk device exists (vda or sda)
if [ -b /dev/vda ]; then
  cat >/tmp/partinfo << EOF
clearpart --initlabel --all
ignoredisk --only-use=vda
bootloader --location=mbr --boot-drive=vda --driveorder=vda
part /boot --fstype=ext3 --ondisk=vda --size=500
part / --fstype=xfs --size=1024 --grow --ondisk=vda --asprimary
EOF
elif [ -b /dev/sda ]; then
  cat >/tmp/partinfo << EOF
clearpart --initlabel --all
ignoredisk --only-use=sda
bootloader --location=mbr --boot-drive=sda --driveorder=sda
part /boot --fstype=ext3 --ondisk=sda --size=500
part / --fstype=xfs --size=100000 --ondisk=sda --asprimary
part /data --fstype=xfs --grow --ondisk=sda
EOF
fi

The second, post_install_custom_sys, changes some necessary settings in the final stage of installation. It runs as a shell script, with the following content:

# cat snippets/post_install_custom_sys
if ! grep -q 'custom_sysctl' /etc/sysctl.conf; then
  cat >>/etc/sysctl.conf<<EOF
## custom_sysctl
fs.file-max = 262144
net.core.somaxconn = 10240
vm.swappiness = 0
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 1048576
net.core.wmem_default = 524288
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.netdev_max_backlog = 2500
net.ipv4.tcp_max_syn_backlog = 40960
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 30
EOF
fi

chmod +x /etc/rc.d/rc.local

if grep -q '^UseDNS' /etc/ssh/sshd_config; then
  sed -i 's/^UseDNS .*/UseDNS no/' /etc/ssh/sshd_config
else
  sed -i 's/^#UseDNS .*/UseDNS no/' /etc/ssh/sshd_config
fi

Next, the kernel boot parameters also need to be modified to rename the network interfaces and disable IPv6:

sed -i -e 's|^GRUB_CMDLINE_LINUX=\"|GRUB_CMDLINE_LINUX=\"net.ifnames=0 biosdevname=0 |g' /etc/default/grub
sed -i -e 's|^GRUB_CMDLINE_LINUX=\"|GRUB_CMDLINE_LINUX=\"ipv6.disable=1 |g' /etc/default/grub
grub2-mkconfig -o /boot/grub2/grub.cfg
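
The sed commands prepend arguments every time they run, so running them twice duplicates the arguments. As a hedged illustration of the same edit in an idempotent form, the Python sketch below rewrites the GRUB_CMDLINE_LINUX line only for arguments that are not present yet. The function name prepend_kernel_args and the sample file content are assumptions for this example; the order of the arguments differs from the sed result, which does not matter to the kernel.

```python
import re


def prepend_kernel_args(grub_text, args):
    """Prepend kernel arguments to GRUB_CMDLINE_LINUX, skipping ones already present."""
    def patch(match):
        current = match.group(2).split()
        missing = [arg for arg in args if arg not in current]
        new_value = " ".join(missing + current)
        return match.group(1) + new_value + '"'

    return re.sub(r'^(GRUB_CMDLINE_LINUX=")([^"]*)"', patch, grub_text, flags=re.M)


# Sample content standing in for /etc/default/grub (assumed, not taken from the post).
sample = 'GRUB_TIMEOUT=5\nGRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet"\n'
wanted = ["net.ifnames=0", "biosdevname=0", "ipv6.disable=1"]

once = prepend_kernel_args(sample, wanted)
twice = prepend_kernel_args(once, wanted)
print(once)
print(once == twice)  # prints True: a second run changes nothing
```

After changing the file, grub2-mkconfig still has to be run as shown above for the new arguments to reach grub.cfg.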

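Combining these parts produces a complete, usable ks file. Next I will describe how to test the installation with a virtual machine.

Testing PXE with a virtual machine

Install the virtualization packages and use KVM virtual machines, along with the graphical VM manager virt-manager to make installation easier. Choose bridge mode for networking, click to create a new virtual machine, and select PXE in the installation options. Note that memory must be set to more than 1 GB, otherwise booting into the system via PXE is likely to fail.

Installing Kylin V10 ARM on a CVM from an ISO image

December 16, 2021 · 3 min read

Background: the cloud platform has no Kylin ARM image, so we need to build one ourselves

1 Preparation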

iso: Kylin-Server-10-SP2-aarch64-Release-Build09-20210524.iso

One ARM CVM and one data disk

scp Kylin-Server-10-SP2-aarch64-Release-Build09-20210524.iso x.x.x.x:/kylin.iso

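2 Configure GRUB

Modify the GRUB configuration to add an entry that boots from the ISO, so that the machine enters the installer from the ISO on reboot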

# cat /etc/grub.d/40_custom
#!/bin/sh
exec tail -n +3 $0
# This file provides an easy way to add custom menu entries.  Simply type the
# menu entries you want to add after this comment.  Be careful not to change
# the 'exec tail' line above.

menuentry 'Install Kylin Linux Advanced Server V10' --class red --class gnu-linux --class gnu --class os {
    set isolabel="Kylin-Server-10"
    set isofile="/kylin.iso"
    insmod iso9660
    loopback loop $isofile
    linux (loop)/images/pxeboot/vmlinuz inst.stage2=hd:LABEL=Kylin-Server-10 ro iso-scan/filename=$isofile console=tty0 video=efifb:off video=VGA-1:640x480-32@60me
    initrd (loop)/images/pxeboot/initrd.img
}

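Where do the parameters above come from?

mount /kylin.iso /mnt
find /mnt -name grub.cfg

Use what you find there as the reference for the linux line.

blkid /kylin.iso

This gives you the isolabel information.

Next step:

vi /etc/default/grub
# Set GRUB_TIMEOUT=60; the longer timeout makes it easier to log in and operate through the web VNC
grub2-mkconfig --output=/boot/grub2/grub.cfg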
sync
reboot

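3 Install the system

The system will be installed to the data disk, because the system disk is occupied by the ISO and cannot be used while mounted. A separate data disk is therefore required for the installation. Remember to install the cloud-init package.

4 Build the cloud image

Reboot back into the original system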

yum -y install qemu-img
qemu-img convert -f raw -O qcow2 /dev/vdb /kylin.qcow2

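5 A case of boot failure after creating a CVM from the image

The error message is as follows:

Warning: /dev/disk/by-uuid/bed44859-b637-4490-b7f9-f62f952f6hfa does not exist

Generating "/run/initramfs/rdsosreport.txt"

Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot
after mounting them and attach it to a bug report.

Root cause analysis:

1. The virtio drivers are installed incorrectly or are abnormal

2. A defect in the kernel itself

Solution:

1. Fixing the virtio drivers

Check whether the kernel supports virtio:
grep -i virtio /boot/config-$(uname -r)
Check whether the drivers are included in the initramfs:
lsinitrd /boot/initramfs-$(uname -r).img | grep virtio
Fix the initramfs (dracut expects the driver list to be padded with spaces):
vim /etc/dracut.conf
add_drivers+=" virtio_blk virtio_scsi virtio_net virtio_pci virtio_ring virtio "
dracut -f

2. Working around the kernel defect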

echo 'add_drivers+="xen-netfront xen-blkfront "' > /etc/dracut.conf.d/xen.conf
KERNEL_VERSION=$(rpm -q kernel --qf '%{V}-%{R}.%{arch}\n'|head -n1)
dracut -f /boot/initramfs-$KERNEL_VERSION.img $KERNEL_VERSION

Building the percona-xtrabackup-80 rpm package on a Kylin V10 aarch64 machine

July 21, 2021 · 1 min read

1 Environment preparation

yum install cmake3 openssl-devel libaio libaio-devel automake autoconf bison libtool ncurses-devel \
    libgcrypt-devel libev-devel libcurl-devel zlib-devel vim-common readline-devel python-sphinx rpm-build

2 Get the latest SRPM package

# Browse the versions available for download
https://repo.percona.com/yum/release/8/SRPMS/
# For example:
wget https://repo.percona.com/yum/release/8/SRPMS/percona-xtrabackup-80-8.0.25-17.1.generic.src.rpm

3 BUILD RPM

rpm -ivh percona-xtrabackup-80-8.0.25-17.1.generic.src.rpm
cd ~/rpmbuild
rpmbuild -bb --nodebuginfo SPECS/percona-xtrabackup.spec

OVER

Building the percona-xtrabackup-80 deb package on a UOS arm64 machine

July 21, 2021 · 1 min read

1 System environment

root@VM-0-14-linux:~# cat /etc/os-release
PRETTY_NAME="uos 20"
NAME="uos"
VERSION_ID="20"
VERSION="20"
ID=uos
HOME_URL="https://www.chinauos.com/"
BUG_REPORT_URL="http://bbs.chinauos.com"

root@VM-0-14-linux:~# uname -a
Linux VM-0-14-linux 4.19.0-arm64-server #1635 SMP Mon Jan 13 16:07:12 CST 2020 aarch64 GNU/Linux

root@VM-0-14-linux:~# cat /etc/debian_version
10.1
root@VM-0-14-linux:~#

2 Configure the official Percona apt repository

wget https://repo.percona.com/apt/percona-release_latest.buster_all.deb
dpkg -i percona-release_latest.buster_all.deb
# Change two variables in the script
vi /usr/bin/percona-release
CODENAME=buster
OS_VER=buster
# Enable the Percona repositories
percona-release enable-only tools release

3 BUILD

# Install the build dependencies
apt-get build-dep percona-xtrabackup-80
# Build
apt-get source --compile percona-xtrabackup-80

OVER

How to build a static tmux binary

June 30, 2021 · 1 min read

build-tmux-static.sh

#!/bin/bash
TARGETDIR=$1
if [ -z "$TARGETDIR" ]; then
  # Default to ./local; the print() form works with both Python 2 and Python 3
  TARGETDIR=$(python -c 'import os; print(os.path.realpath("local"))')
fi
mkdir -p "$TARGETDIR"

libevent() {
  curl -LO https://github.com/libevent/libevent/releases/download/release-2.0.22-stable/libevent-2.0.22-stable.tar.gz
  tar -zxvf libevent-2.0.22-stable.tar.gz
  cd libevent-2.0.22-stable
  ./configure --prefix=$TARGETDIR && make && make install
  cd ..
}

ncurses() {
  curl -LO https://ftp.gnu.org/pub/gnu/ncurses/ncurses-6.0.tar.gz
  tar zxvf ncurses-6.0.tar.gz
  cd ncurses-6.0

  ./configure --with-termlib --prefix $TARGETDIR \
              --with-default-terminfo-dir=/usr/share/terminfo \
              --with-terminfo-dirs="/etc/terminfo:/lib/terminfo:/usr/share/terminfo" \
              --enable-pc-files \
              --with-pkg-config-libdir=$TARGETDIR/lib/pkgconfig \
  && make && make install
  cd ..
}

tmux() {
  curl -LO https://github.com/tmux/tmux/releases/download/3.2a/tmux-3.2a.tar.gz
  tar zxvf tmux-3.2a.tar.gz
  cd tmux-3.2a
  PKG_CONFIG_PATH=$TARGETDIR/lib/pkgconfig ./configure --enable-static --prefix=$TARGETDIR && make && make install
  cd ..
  cp $TARGETDIR/bin/tmux .
}

libevent
ncurses
tmux

Viewing Docker logs in real time through a web UI with dozzle

June 8, 2021 · 1 min read

1 Run dozzle

docker run --detach --volume=/var/run/docker.sock:/var/run/docker.sock --net host  amir20/dozzle --addr 127.0.0.1:8080  --base /dockerlogs

2 Reverse proxy

server {
    listen 80;
    server_name xxx;
    client_max_body_size 1G;
    add_header  Access-Control-Allow-Origin "https://xxx";
    add_header  Access-Control-Allow-Methods "GET, POST, OPTIONS";
    add_header  Access-Control-Allow-Headers "Origin, Authorization, Accept";
    add_header  Access-Control-Allow-Credentials true;

    location ^~ /dockerlogs {
        proxy_pass http://localhost:8080;
    }
}

3 Access

http://x.x.x.x/dockerlogs

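Analysis of an Ansible yum module error on NeoKylin

March 22, 2021 · 7 min read

While using NeoKylin V7Update6, I ran into an error when running Ansible.

Symptoms

While deploying a service on a NeoKylin system with Ansible, a task using the yum module failed with the following error: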

TASK [common : Install basic rpms] **************************************************************************
fatal: [node01]: FAILED! => {"changed": false, "msg": ["Could not detect which major revision of yum is in use, which is required to determine module backend.", "You can manually specify use_backend to tell the module whether to use the yum (yum3) or dnf (yum4) backend})"]}

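The error says that the yum module could not determine the system's yum version and suggests specifying yum's use_backend parameter manually. The same task runs without any problem on stock RHEL 7, so it looks like we have stepped into a NeoKylin pitfall.

Analysis

According to the error, the cause is clearly that Ansible cannot automatically determine which yum version the system uses. When the yum module is not given a use_backend parameter, it tries to detect the backend automatically. Ansible's setup module collects the necessary information, including the variable ansible_pkg_mgr, which corresponds to the yum backend module. Let's run the setup module and print ansible_pkg_mgr to verify this: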

# ansible -i hosts node01 -m setup -a "filter=ansible_pkg_mgr"
node01 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false
}

Sure enough, ansible_pkg_mgr cannot be obtained. First, check the system version information:

~]# cat /etc/neokylin-release
NeoKylin Linux Advanced Server release V7Update6 (Chromium)

Next, use the error message to locate the relevant Ansible code, which is in yum.py: ansible/plugins/action/yum.py

        if module not in ["yum", "yum4", "dnf"]:
            facts = self._execute_module(module_name="setup", module_args=dict(filter="ansible_pkg_mgr", gather_subset="!all"), task_vars=task_vars)
            display.debug("Facts %s" % facts)
            module = facts.get("ansible_facts", {}).get("ansible_pkg_mgr", "auto")
            if (not self._task.delegate_to or self._task.delegate_facts) and module != 'auto':
                result['ansible_facts'] = {'pkg_mgr': module}

        if module != "auto":

            if module == "yum4":
                module = "dnf"

            if module not in self._shared_loader_obj.module_loader:
                result.update({'failed': True, 'msg': "Could not find a yum module backend for %s." % module})
            else:
                # run either the yum (yum3) or dnf (yum4) backend module
                new_module_args = self._task.args.copy()
                if 'use_backend' in new_module_args:
                    del new_module_args['use_backend']

                display.vvvv("Running %s as the backend for the yum action plugin" % module)
                result.update(self._execute_module(module_name=module, module_args=new_module_args, task_vars=task_vars, wrap_async=self._task.async_val))
                # Now fall through to cleanup
        else:
            result.update(
                {
                    'failed': True,
                    'msg': ("Could not detect which major revision of yum is in use, which is required to determine module backend.",
                            "You can manually specify use_backend to tell the module whether to use the yum (yum3) or dnf (yum4) backend})"),
                }
            )
            # Now fall through to cleanup

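As the code shows, when yum is run without use_backend, Ansible runs the setup module and uses ansible_pkg_mgr to determine the yum version automatically, and reports an error if the value cannot be obtained. Let's follow how that fact is collected. The key code is in pkg_mgr.py: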

ansible/module_utils/facts/system/pkg_mgr.py

    def collect(self, module=None, collected_facts=None):
        facts_dict = {}
        collected_facts = collected_facts or {}

        pkg_mgr_name = 'unknown'
        for pkg in PKG_MGRS:
            if os.path.exists(pkg['path']):
                pkg_mgr_name = pkg['name']

        # Handle distro family defaults when more than one package manager is
        # installed or available to the distro, the ansible_fact entry should be
        # the default package manager officially supported by the distro.
        if collected_facts['ansible_os_family'] == "RedHat":
            pkg_mgr_name = self._check_rh_versions(pkg_mgr_name, collected_facts)
... ...

    def _check_rh_versions(self, pkg_mgr_name, collected_facts):
        if collected_facts['ansible_distribution'] == 'Fedora':
            if os.path.exists('/run/ostree-booted'):
                return "atomic_container"
            try:
                if int(collected_facts['ansible_distribution_major_version']) < 23:
                    for yum in [pkg_mgr for pkg_mgr in PKG_MGRS if pkg_mgr['name'] == 'yum']:
                        if os.path.exists(yum['path']):
                            pkg_mgr_name = 'yum'
                            break
                else:
                    for dnf in [pkg_mgr for pkg_mgr in PKG_MGRS if pkg_mgr['name'] == 'dnf']:
                        if os.path.exists(dnf['path']):
                            pkg_mgr_name = 'dnf'
                            break
            except ValueError:
                # If there's some new magical Fedora version in the future,
                # just default to dnf
                pkg_mgr_name = 'dnf'
        elif collected_facts['ansible_distribution'] == 'Amazon':
            pkg_mgr_name = 'yum'
        else:
            # If it's not one of the above and it's Red Hat family of distros, assume
            # RHEL or a clone. For versions of RHEL < 8 that Ansible supports, the
            # vendor supported official package manager is 'yum' and in RHEL 8+
            # (as far as we know at the time of this writing) it is 'dnf'.
            # If anyone wants to force a non-official package manager then they
            # can define a provider to either the package or yum action plugins.
            if int(collected_facts['ansible_distribution_major_version']) < 8:
                pkg_mgr_name = 'yum'
            else:
                pkg_mgr_name = 'dnf'
        return pkg_mgr_name

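The code above shows that when the system is identified as a Red Hat family distribution, Ansible goes on to check the version: if the major version is less than 8 it uses yum, otherwise dnf. Our initial guess is that Kylin modified the system in a way that prevents the major version from being obtained. First run setup to get the distribution facts and check whether the logic above was executed: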

# ansible -i hosts node01 -m setup -a "filter=ansible_distribution"
node01 | SUCCESS => {
    "ansible_facts": {
        "ansible_distribution": "RedHat",
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false
}

# ansible -i hosts node01 -m setup -a "filter=ansible_distribution_major_version"
node01 | SUCCESS => {
    "ansible_facts": {
        "ansible_distribution_major_version": "V7Update6",
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false
}

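The setup output shows that the system is identified as a RedHat distribution, but the major version obtained through ansible_distribution_major_version is V7Update6. Put this together with the yum detection code above and the problem becomes clear: in int(collected_facts['ansible_distribution_major_version']) < 8, the variable ansible_distribution_major_version is initialized from distribution_version.split('.')[:2][0], and when the value obtained from the system is V7Update6, it obviously cannot be converted to int. Next, let's find where the V7Update6 string is defined. From experience, system version information should be in /etc/os-release: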

~]# cat /etc/os-release
NAME="NeoKylin Linux Advanced Server"
VERSION="V7Update6 (Chromium)"
ID="neokylin"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="V7Update6"
PRETTY_NAME="NeoKylin Linux Advanced Server V7Update6 (Chromium)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:neokylin:enterprise_linux:V7Update6:GA:server"
HOME_URL="https://www.cs2c.com.cn/"
BUG_REPORT_URL="https://bugzilla.cs2c.com.cn/"

NEOKYLIN_BUGZILLA_PRODUCT="NeoKylin Linux Advanced Server 7"
NEOKYLIN_BUGZILLA_PRODUCT_VERSION=V7Update6
NEOKYLIN_SUPPORT_PRODUCT="NeoKylin Linux Advanced Server"
NEOKYLIN_SUPPORT_PRODUCT_VERSION="V7Update6"

Sure enough, VERSION_ID is defined as V7Update6 here, whereas in the upstream distribution the value is 7. Let's look at how the os-release documentation describes VERSION_ID:

man os-release
... ...

       VERSION_ID=
           A lower-case string (mostly numeric, no spaces or other characters outside of 0-9, a-z, ".",
           "_" and "-") identifying the operating system version, excluding any OS name information or
           release code name, and suitable for processing by scripts or usage in generated filenames. This
           field is optional. Example: "VERSION_ID=17" or "VERSION_ID=11.04".
... ...

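According to the man page, VERSION_ID is a lower-case string, mostly numeric, with no spaces or other special characters; the allowed characters are 0-9, a-z, ".", "_" and "-". This reveals two problems. First, Kylin's VERSION_ID does not conform to this description because it contains upper-case characters. Second, VERSION_ID may legitimately contain the letters a-z, even though it is usually numeric, such as 17 or 11.04. Because common distributions all use numeric values here, Ansible built its version detection around that convention and tries to convert the string to int, which fails when VERSION_ID contains letters.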
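
The failure mode can be reproduced in a few lines of Python. This is a sketch, not Ansible's code: major_version_strict only mirrors the split-then-int pattern quoted above, and major_version_tolerant is a hypothetical alternative shown for contrast.

```python
import re


def major_version_strict(version_id):
    """First dotted component converted with int(), as in the pattern quoted above."""
    return int(version_id.split(".")[0])


def major_version_tolerant(version_id):
    """Return the first run of digits as an int, or None if there are no digits."""
    match = re.search(r"\d+", version_id)
    return int(match.group()) if match else None


for value in ("7.6", "V7Update6"):
    try:
        print(value, "->", major_version_strict(value))
    except ValueError as exc:
        print(value, "-> ValueError:", exc)

print(major_version_tolerant("V7Update6"))  # prints 7
```

The strict form handles "7.6" but raises ValueError for "V7Update6", while a digit-extracting parser would still recover the major version 7.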

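Verifying the conclusion

The analysis above shows that VERSION_ID is the key to this problem, so we can try changing its value and run setup again to see whether it works:

# grep VERSION_ID /etc/os-release
VERSION_ID="7"

Here I changed VERSION_ID to the number 7. Run setup again and check whether ansible_pkg_mgr can be obtained: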

# ansible -i hosts node01 -m setup -a "filter=ansible_pkg_mgr"
node01 | SUCCESS => {
    "ansible_facts": {
        "ansible_pkg_mgr": "yum",
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false
}

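As we can see, after changing VERSION_ID in os-release to a purely numeric value, setup detects the system version correctly and therefore determines the right yum version. This shows that even inconspicuous details of an operating system can trigger a chain reaction when handled carelessly.

Load order of bashrc and profile

February 24, 2021 · 3 min read

When setting environment variables in bashrc and profile, if the same variable is set in several places, you need to pay attention to the order in which the configuration files are loaded.

Background

If the load order is unclear, you can run into all sorts of confusion, for example: why does a variable set in profile not take effect? Why does the variable have a different value over ssh? Below, using CentOS 7 as an example, we run a small experiment to observe the order in which bash loads its configuration files.

The files commonly used to set environment variables are: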

/etc/profile
/etc/profile.d/*.sh
/etc/bashrc
~/.bash_profile
~/.bashrc

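When each file is loaded depends on whether the shell is a login shell or a non-login shell. The two cases must be treated separately; that is, each file takes effect only in the corresponding scenario. Suppose the same variable is set in every file: by giving it a different value in each file, we can observe the load priority and which settings take effect.

Experiment

First write the following into each configuration file: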

# tail -n1 /etc/profile /etc/bashrc /etc/profile.d/well.sh ~/.bash_profile ~/.bashrc
==> /etc/profile <==
export WELL=etc-profile

==> /etc/bashrc <==
export WELL=etc-bashrc

==> /etc/profile.d/well.sh <==
export WELL=etc-profile-d

==> /root/.bash_profile <==
export WELL=home-bash-profile

==> /root/.bashrc <==
export WELL=home-bashrc

Now start observing. Note that after each configuration change you need to open a new shell to reload the environment:

[root@localhost ~]# echo $WELL
home-bash-profile
[root@localhost ~]# ssh localhost 'echo $WELL'
home-bashrc
[root@localhost ~]#


[root@localhost ~]# sed -i '$d' ~/.bashrc
[root@localhost ~]# sed -i '$d' ~/.bash_profile
[root@localhost ~]#


[root@localhost ~]# echo $WELL
etc-bashrc
[root@localhost ~]# ssh localhost 'echo $WELL'
etc-bashrc
[root@localhost ~]#


[root@localhost ~]# sed -i '$d' /etc/bashrc


[root@localhost ~]# echo $WELL
etc-profile
[root@localhost ~]# ssh localhost 'echo $WELL'
etc-profile-d
[root@localhost ~]#

# After writing the setting back to ~/.bashrc
[root@localhost ~]# echo $WELL
home-bashrc
[root@localhost ~]# ssh localhost 'echo $WELL'
etc-profile-d
[root@localhost ~]#


# After writing the setting back to ~/.bash_profile and removing it from ~/.bashrc
[root@localhost ~]# echo $WELL
home-bash-profile
[root@localhost ~]# ssh localhost 'echo $WELL'
etc-profile-d
[root@localhost ~]#

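Note that the tests above put the variable on the last line of each file. Because the configuration files source one another, placing it elsewhere would give different results.

Conclusion

From the results above, we can draw the following conclusions:

1 A login shell loads all of the configuration files; the priority, highest first, is ~/.bash_profile ~/.bashrc /etc/bashrc /etc/profile /etc/profile.d

2 For a non-login shell the priority, highest first, is ~/.bashrc /etc/bashrc /etc/profile.d

3 Files that a non-login shell does not load: ~/.bash_profile /etc/profile

4 Files loaded in both cases: ~/.bashrc /etc/bashrc /etc/profile.d

So if we need to set an environment variable system-wide and make login and non-login shells behave consistently, how should we do it?

Because ~/.bashrc is a per-user file that does not affect the whole system, and /etc/bashrc is a built-in system file that is best left unmodified, global environment variables are best placed in /etc/profile.d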

over.

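Automatically generating a changelog HTML page for a git repository with conventional-changelog and Strapdown.js

January 15, 2021 · 3 min read

A project's changelog may not be something users need to focus on, but it is important.

Basic idea

When a software product is released, we usually need to provide a changelog that tells users what has changed in the new version. There are two ways to produce it: compile and write it by hand, or automate it with tools. This post describes a way to generate it automatically by combining open source tools.

During development, every change is reflected in the git commit messages, and the git history captures almost all changes to the software. We can therefore use a tool to convert the git history directly into a changelog, and after some simple processing publish it as an html page.

Standardizing commits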

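This requires our commit messages to follow a standard. One widely accepted standard is Conventional Commits; see the Conventional Commits specification for details. A simple reference of commit types:

build: changes that only affect packaging tools, the build environment, or other external dependencies
ci: changes to the CI configuration
docs: documentation-only changes
feat: a new feature
fix: a bug fix
perf: a performance improvement with no bug fix or new feature
refactor: a refactoring with no bug fix, new feature, or performance improvement
style: code style changes only
test: test code changes only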
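
A commit header in this style can be checked mechanically, for example in a commit hook. The Python sketch below is a minimal illustration that accepts only the types listed above; the regular expression and the parse_header function are assumptions written for this example and cover just the header line of the form type(scope)!: subject.

```python
import re

COMMIT_TYPES = ("build", "ci", "docs", "feat", "fix", "perf", "refactor", "style", "test")

# type, optional (scope), optional "!" for a breaking change, then ": subject"
HEADER = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[^)]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<subject>\S.*)$"
)


def parse_header(line):
    """Return the parsed parts of a commit header, or None if it does not conform."""
    match = HEADER.match(line)
    return match.groupdict() if match else None


print(parse_header("fix(parser): handle empty input"))
print(parse_header("update stuff"))  # prints None
```

Headers that return None are the ones a changelog generator would not be able to classify.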

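Converting commits to markdown

With standardized commit records in place, a tool can convert them to markdown. The tool used here is conventional-changelog; its command-line version is used as follows: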

# install
npm install -g conventional-changelog-cli
# generate changelog markdown file
cd your-git-repo-project-home
conventional-changelog -p angular -i CHANGELOG.md -s -r 0

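Parameters used in the example:

-i : read an existing changelog file
-p : preset, one of angular/atom/codemirror/ember/eslint/express/jquery/jscs/jshint
-s : write the output to the same file given by -i
-r : number of releases to generate; 0 regenerates all of them

Run conventional-changelog --help for more options.

Converting markdown to html

We now have a change history file named CHANGELOG.md in markdown format. Next we use another tool, strapdown.js, to generate the html automatically.

strapdown.js is a single js file. Unlike the markdown generation above, nothing needs to be generated on the server side: simply including the js file in an html page renders the markdown as html. See strapdown.js for details.

Usage: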

cat >changelog.html <<"EOF"
<!DOCTYPE html>
<html>
<title>XXX Changelog</title>
<meta charset="utf-8">
<xmp theme="darkly" style="display:none;">
EOF

cat CHANGELOG.md >>changelog.html
cat >>changelog.html <<"EOF"
</xmp>
<script src="http://strapdownjs.com/v/0.2/strapdown.js"></script>
</html>
EOF

In this way we generate changelog.html by concatenation. Note that the changelog content must not contain the string </xmp>.
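
The same concatenation can be scripted so that the xmp restriction is enforced rather than remembered. This is a minimal Python sketch of that idea; the function name build_changelog_page and the sample markdown are hypothetical, while the page skeleton is copied from the heredocs above.

```python
PAGE_HEAD = """<!DOCTYPE html>
<html>
<title>{title}</title>
<meta charset="utf-8">
<xmp theme="darkly" style="display:none;">
"""

PAGE_TAIL = """</xmp>
<script src="http://strapdownjs.com/v/0.2/strapdown.js"></script>
</html>
"""


def build_changelog_page(markdown, title="XXX Changelog"):
    """Wrap markdown in the strapdown.js page skeleton."""
    # A literal closing tag inside the markdown would end the xmp block early.
    if "</xmp>" in markdown.lower():
        raise ValueError("changelog content must not contain </xmp>")
    if not markdown.endswith("\n"):
        markdown += "\n"
    return PAGE_HEAD.format(title=title) + markdown + PAGE_TAIL


# Sample markdown standing in for the contents of CHANGELOG.md.
page = build_changelog_page("# 1.0.0\n\n* feat: first release")
print(page)
```

To use it on a real repository, read CHANGELOG.md into a string and write the returned page to changelog.html.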

over.

Modifying the contents of an rpm package with rpmrebuild

April 7, 2020 · 1 min read


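In some urgent situations there is no time to rebuild a package from source. All you have is a built rpm, but its contents must be changed before installation, for example the contents of a file. This is where the rpmrebuild command comes in handy. When it runs, rpmrebuild unpacks the rpm into a temporary directory. To modify files inside the package, use the -m option to specify a command to run, such as /bin/bash, which gives you an interactive shell. With an interactive shell the possibilities are wide open: you can modify the unpacked files in any way you like, and when you exit the shell, rpmrebuild packs the changes into a new rpm. For example:

rpmrebuild -m /bin/bash -np rpm/xxx.rpm
# We now have an interactive shell.
# For example, if the file to change is named aaa, find it like this:
find / -name aaa
# Make whatever changes you need, then exit the shell
# (press Ctrl+D)

You now have a new rpm with the modified contents.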
