Tools

See Also: https://westurner.github.io/wiki/projects#tools

Packages

A software package is an archive of files with a manifest that lists the files included. Often, the manifest contains file checksums and a signature.
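
As a minimal sketch of the manifest idea above (not the format of any particular packaging tool), the following builds a small in-memory tar archive and derives a manifest mapping each file to a SHA-256 checksum. The file names and the `build_manifest` / `make_example_archive` helpers are hypothetical names chosen for illustration.

```python
import hashlib
import io
import tarfile


def make_example_archive(files):
    """Build an in-memory .tar.gz from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def build_manifest(archive_bytes):
    """Return {member name: SHA-256 hex digest} for each regular file."""
    manifest = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        for member in tar.getmembers():
            if member.isfile():
                data = tar.extractfile(member).read()
                manifest[member.name] = hashlib.sha256(data).hexdigest()
    return manifest


# Hypothetical package contents, for illustration only.
example_files = {
    "pkg/README": b"hello\n",
    "pkg/bin/tool": b"#!/bin/sh\necho hi\n",
}
manifest = build_manifest(make_example_archive(example_files))
for name, digest in sorted(manifest.items()):
    print(digest, name)
```

A real package format would typically also sign the manifest so that the checksums themselves can be trusted.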

Many packaging tools distinguish between source packages and binary packages.

Some packaging tools provide configuration options for:

  • Scripts to run when packaging

  • Scripts to run at install time

  • Scripts to run at uninstall time

  • Patches to apply to the “vanilla” source tree, as might be obtained from a version control repository

There is a package maintainer whose responsibilities include:

  • Testing new upstream releases

  • Vetting changes from release to release

  • Repackaging upstream releases

  • Signing new package releases

Packaging lag refers to how long it takes a package maintainer to repackage upstream releases for the target platform(s).

Anaconda

Anaconda is a maintained distribution of Conda packages for many languages, especially Python.

Note

Anaconda (1999; https://en.wikipedia.org/wiki/Anaconda_(installer)) is also the name of the installer for RPM-based Linux distributions, which is likewise written in Python (and C).

APT

APT (“Advanced Packaging Tool”) is the core of Debian package management.

  • An APT package repository serves DEB packages created with Dpkg.

  • An APT package repository can be accessed from a local filesystem or over a network protocol (“apt transports”) like HTTP, HTTPS, RSYNC, FTP, and BitTorrent (debtorrent).

    An example of APT usage (e.g. to maintain an updated Ubuntu Linux system):

apt-get update
apt-get upgrade
apt-get dist-upgrade

apt-cache show bash
apt-get install bash

apt-get --help
man apt-get
man sources.list
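
APT reads its repositories from sources.list entries (see man sources.list). As a rough sketch, assuming the common one-line style `type [options] uri suite component...`, an entry could be split into its parts like this. The `parse_sources_line` helper and the example entry are illustrative assumptions, not part of APT.

```python
def parse_sources_line(line):
    """Parse a one-line-style sources.list entry; return None for blank/comment lines."""
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    fields = line.split()
    entry = {"type": fields[0], "options": {}}
    rest = fields[1:]
    # Optional "[key=value ...]" block directly after the type.
    if rest and rest[0].startswith("["):
        opts = []
        while rest:
            token = rest.pop(0)
            opts.append(token.strip("[]"))
            if token.endswith("]"):
                break
        entry["options"] = dict(opt.split("=", 1) for opt in opts if opt)
    entry["uri"], entry["suite"] = rest[0], rest[1]
    entry["components"] = rest[2:]
    return entry


# Illustrative entry; adjust the mirror, suite, and components for a real system.
example = "deb [arch=amd64] http://deb.debian.org/debian stable main contrib"
entry = parse_sources_line(example)
print(entry)
```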

AUR

AUR (Arch User Repository) contains PKGBUILD packages which can be installed by pacman.

Bower

Bower is “a package manager for the web” (Javascript packages) built on NPM.

Conda

Conda is a package build, environment, and distribution system written in Python to install packages written in any language.

  • Conda was originally created for the Anaconda Python Distribution, which installs packages written in Python, R, Javascript, Ruby, C, Fortran

  • Conda packages are basically tar archives with build, and optional link/install and uninstall scripts.

  • conda-build generates conda packages from conda recipes with a meta.yaml, a build.sh, and/or a build.bat.

  • Conda recipes reference and build from a source package URI OR a VCS URI and revision; and/or custom build.sh or build.bat scripts.

  • conda skeleton can automatically create conda recipes from PyPI (Python), CRAN (R), and CPAN (Perl)

  • conda skeleton-generated recipes can be updated with additional metadata, scripts, and source URIs (as separate patches or consecutive branch commits of e.g. a conda-recipes repository in order to get a diff of the skeleton recipe and the current recipe).

  • Conda (and Anaconda) packages are hosted by https://binstar.org, which hosts free public and paid private Conda packages.

    • Anaconda Server is an internal “Private, Secure Package Repository” that “supports over 100 different repositories, including PyPI, CRAN, conda, and the Anaconda repository.”

To create a fresh conda env:

# Python 2.7
conda create -n science --yes python readline conda-env

# Python 3.X
conda create -n science3 --yes python=3 readline conda-env

Work on a conda env:

source activate science
conda list
source deactivate

conda-env writes to and creates environments from environment.yml files which list conda and Pip packages.

Work with conda envs and environment.yml files:

# Install conda-env globally (in the "root" conda environment)
conda install -n root conda-env

# Create a conda environment with ``conda create`` and install conda-env
conda create -n science python=3 readline conda-env pip

# Install some things with conda (and envs/science/bin/pip)
# https://github.com/westurner/notebooks/blob/gh-pages/install.sh
conda search pandas; conda info pandas
conda install blaze dask bokeh odo \
              sqlalchemy hdf5 h5py \
              scikit-learn statsmodels \
              beautiful-soup lxml html5lib pandas qgrid \
              ipython-notebook
pip install -e git+https://github.com/rdflib/rdflib@master#egg=rdflib
pip install arrow sarge structlog

# Export an environment.yml
#source deactivate
conda env export -n science | tee environment.yml

# Create an environment from an environment.yml
conda env create -n projectname -f ./environment.yml

To install a conda package from a custom channel:

conda install -c pydanny cookiecutter   # OR pip install cookiecutter

The conda-forge custom channel packages are built with continuous integration on multiple platforms.

See also: Anaconda, conda-forge (conda-smithy)

conda-forge

# create a conda package recipe from a pypi package
cd $VIRTUAL_ENV/src
conda skeleton pypi jupyterthemes
ls -ld jupyterthemes/
edit jupyterthemes/meta.yaml
# - git repo tags || pypi releases

# create a conda-forge feedstock from a conda recipe
## https://github.com/conda-forge/conda-smithy#making-a-new-feedstock
cd $VIRTUAL_ENV/src
ls -ld jupyterthemes
conda-smithy init jupyterthemes
ls jupyterthemes-feedstock/

# build a conda-forge feedstock with docker
# FROM condaforge/linux-anvil
cat ./scripts/run_docker_build.sh
./scripts/run_docker_build.sh
./ci_support/run_docker_build.sh

DEB

DEB is the Debian software package format.

DEB packages are built with Dpkg and often hosted in an APT package repository.

dnf

dnf is an open source package manager written in Python.

  • dnf was introduced in Fedora 18.

  • dnf is the default package manager as of Fedora 22, replacing Yum.

    • [ ] yum errors if TODO package is installed (* Salt provider)

    • [ ] repoquery redirects with an error to dnf repoquery

    • See dnf help (and man dnf)

  • dnf integrates with the Anaconda system installer.

  • dnf supports Delta RPM packages (DRPM), which often significantly reduce the amount of network transfer required to regularly retrieve and upgrade to the latest repository packages.

ebuild

ebuild is a software package definition format.

  • ebuilds are like special Bash scripts.

  • ebuilds have USE flags for specifying build features.

  • Gentoo is built from ebuild package definitions stored in Gentoo Portage.

  • Portage packages are built from ebuilds.

  • The emerge Portage command installs ebuilds.

fpm

fpm (effing package management) is a tool for building many types of software packages from many other types of software packages (e.g. DEB, RPM, Python Packages), often more easily than working with the actual package manager.

  • fpm package source types include: dir rpm gem python empty tar deb cpan npm osxpkg pear pkgin virtualenv zip.

  • fpm target package types include: rpm deb solaris puppet dir osxpkg p5p sh tar zip.

Homebrew

Homebrew is a package manager (brew) for OS X.

NPM

NPM is a Javascript package manager created for Node.js.

  • an NPM package is defined by a package.json JSON file.

  • NPM packages are installed with the npm CLI utility.

  • Bower builds upon NPM.
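
To make the package.json point above concrete, here is a small sketch that reads a minimal manifest with Python's json module. The package name, versions, and the `list_dependencies` helper are hypothetical; only the `name`, `version`, `dependencies`, and `devDependencies` keys are assumed from the package.json format.

```python
import json

# A hypothetical, minimal package.json for illustration.
PACKAGE_JSON = """
{
  "name": "example-package",
  "version": "0.1.0",
  "main": "index.js",
  "dependencies": {"left-pad": "^1.3.0"},
  "devDependencies": {"mocha": "^10.0.0"}
}
"""

package = json.loads(PACKAGE_JSON)


def list_dependencies(pkg, include_dev=False):
    """Return {package name: version range} declared by a package.json dict."""
    deps = dict(pkg.get("dependencies", {}))
    if include_dev:
        deps.update(pkg.get("devDependencies", {}))
    return deps


print(package["name"], package["version"])
print(list_dependencies(package, include_dev=True))
```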

NuGet

NuGet is an open source package manager for Windows.

pacman

Pacman is an open source package manager which installs .pkg.tar.xz files for Arch Linux.

PEX

PEX (Python Executable) is a ZIP-based software package archive format with an executable header.
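
The "ZIP archive with an executable header" idea can be illustrated with the standard library's zipapp module. This is a sketch of the general technique, not of the PEX tool itself: zipapp writes a `#!` interpreter line followed by ordinary ZIP data. The directory name, file contents, and interpreter line are assumptions chosen for the example.

```python
import pathlib
import tempfile
import zipapp
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # A tiny application directory with the entry point zipapp expects.
    src = pathlib.Path(tmp) / "app"
    src.mkdir()
    (src / "__main__.py").write_text('print("hello from inside the archive")\n')

    # Write app.pyz: a shebang line followed by a ZIP archive of the directory.
    target = pathlib.Path(tmp) / "app.pyz"
    zipapp.create_archive(src, target, interpreter="/usr/bin/env python3")

    # The first line is the executable header; the file is still a valid ZIP.
    header = target.read_bytes().split(b"\n", 1)[0]
    is_zip = zipfile.is_zipfile(target)
    with zipfile.ZipFile(target) as zf:
        members = zf.namelist()

print(header, is_zip, members)
```

Because ZIP readers locate the archive index from the end of the file, the leading header line does not prevent the file from being opened as an archive.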

Ports

A Ports collection contains Sources (e.g. archived releases and patch sets) and Makefiles designed to compile software Packages for a particular operating system distribution’s kernel and standard libraries, usually for a particular platform.

RPM

RPM (RPM Package Manager, RedHat Package Manager) is a package format and a set of commandline utilities written in C and Perl.

  • RPM packages can be installed with rpm, Yum, dnf.

  • RPM packages can be built with tools like rpmbuild and fpm

  • Python packages can be built into RPM packages with setuptools’ bdist_rpm, fpm

  • List contents of RPM packages (archives) with e.g. less and lesspipe:

    less ~/path/to/local.rpm   # requires lesspipe to be configured
    
  • RPM Packages are served by and retrieved from repositories by tools like Yum and dnf:

    • Local: directories of RPM packages and metadata

    • Network: HTTP, HTTPS, rsync, FTP

    • dnf supports Delta RPM packages (DRPM), which often significantly reduce the amount of network transfer required to regularly retrieve and upgrade to the latest repository packages.

Note

There’s not yet a debtorrent for RPM, Yum, dnf.

Python Packages

A Python Package is a collection of source code and package data files.

  • Python packages have dependencies: they depend on other packages

  • Python packages can be served from a package index

  • PyPI is the community Python Package Index

  • A Python package is an archive of files (.zip (.egg, .whl), .tar, .tar.gz,) containing a setup.py file containing a version string and metadata that is meant for distribution.

  • A source dist (sdist) package contains source code (every file listed in or matching a pattern in a MANIFEST.in text file).

  • A binary dist (bdist, bdist_egg, bdist_wheel) is derived from an sdist and may be compiled and named for a specific platform.

  • sdists and bdists are defined by a setup.py file which contains a call to a distutils.setup() or setuptools.setup() function.

  • The arguments to the setup() function are things like version, author, author_email, and url, in addition to package dependency strings required for the package to work (install_requires), for tests to run (tests_require), and for optional things to work (extras_require).

  • A package dependency string can specify an exact version (==) or a greater-than (>=) or less-than (<=) requirement for each package.

  • Package names are looked up from an index server (--index-url), such as PyPI, and/or an HTML page (--find-links) containing URLs containing package names, version strings, and platform strings.

  • easy_install (Setuptools) and Pip can install packages from: the local filesystem, a remote index server, or a local index server.

  • easy_install and pip read the install_requires (and extras_require) attributes of setup.py files contained in packages in order to resolve a dependency graph (which can contain cycles) and install necessary packages.
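
The last point above, resolving a dependency graph that may contain cycles, can be sketched as a depth-first walk that installs dependencies before dependents and simply skips anything already visited. This is an illustrative simplification, not the algorithm easy_install or pip actually implements; the package names and the `install_order` helper are hypothetical.

```python
def install_order(root, requires):
    """Return a dependencies-first install order, tolerating cycles."""
    order, seen = [], set()

    def visit(name):
        if name in seen:  # already handled, or currently being visited (a cycle)
            return
        seen.add(name)
        for dep in requires.get(name, []):
            visit(dep)
        order.append(name)

    visit(root)
    return order


# Hypothetical install_requires data; "web" and "db" depend on each other.
REQUIRES = {
    "app": ["web", "db"],
    "web": ["templates", "db"],
    "db": ["web"],
    "templates": [],
}

print(install_order("app", REQUIRES))
```

Marking a package as seen before descending into its dependencies is what keeps the walk from looping forever on the web/db cycle.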

Note

JSON-LD for package metadata and environment build metadata could be helpful.

Distutils

Distutils is a collection of tools for common packaging needs.

  • Distutils is included in the Python standard library.

Setuptools

Setuptools is a Python package for working with other Python Packages.

  • Setuptools builds upon Distutils

  • Setuptools is widely used

  • Most Python packages are installed by setuptools (by Pip)

  • Setuptools can be installed by downloading ez_setup.py and then running python ez_setup.py; or, setuptools can be installed with a system package manager (apt, yum)

  • Setuptools installs a script called easy_install which can be used to install packages from the local filesystem, a remote index server, a local index server, or an HTML page

  • easy_install pip installs Pip from PyPI

  • Like easy_install, Pip installs python packages, with a number of additional configuration options

  • Setuptools can build RPM and DEB packages from python packages, with some extra configuration:

    python setup.py bdist_rpm --help
    python setup.py --command-packages=stdeb.command bdist_deb --help
    

Pip

Pip is a tool for installing, upgrading, and uninstalling Python packages.

pip help
pip help install
pip --version

sudo apt-get install python-pip
pip install --upgrade pip

pip install libcloud
pip install -r requirements.txt
pip uninstall libcloud

  • Pip builds upon Distutils and Setuptools.

  • Pip retrieves, installs, upgrades, and uninstalls packages.

  • Pip can list installed packages with pip freeze (and pip list).

  • Pip can install packages as ‘editable’ packages (pip install -e) from version control repository URLs which must begin with vcs+, end with #egg=<usuallythepackagename>, and may contain an @vcstag tag (such as a branch name or a version tag).

  • Pip installs packages as editable by first cloning (or checking out) the code to ./src (or ${VIRTUAL_ENV}/src if working in a Virtualenv) and then running setup.py develop.

  • Pip configuration is in ${HOME}/.pip/pip.conf.

  • Pip can maintain a local cache of downloaded packages, which can lessen the load on package servers during testing.

  • Pip skips reinstallation if a package requirement is already satisfied.

  • Pip requires the --upgrade and/or --force-reinstall options to be added to the pip install command in order to upgrade or reinstall.

  • At the time of this writing, the latest stable pip version is 1.5.6.
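
The editable requirement URL shape described above (a `vcs+` prefix, an optional `@` revision, and a `#egg=` fragment) can be taken apart with the standard library. This is a sketch under that assumed shape only; it is not pip's own parser, and `parse_vcs_requirement` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlsplit


def parse_vcs_requirement(url):
    """Split 'vcs+scheme://host/path@rev#egg=name' into its parts."""
    vcs, plus, rest = url.partition("+")
    if not plus:
        raise ValueError("expected a URL beginning with 'vcs+'")
    parts = urlsplit(rest)
    # The revision, if any, follows the last path component after '@'.
    path, _, revision = parts.path.partition("@")
    egg = parse_qs(parts.fragment).get("egg", [None])[0]
    return {
        "vcs": vcs,
        "repo": f"{parts.scheme}://{parts.netloc}{path}",
        "revision": revision or None,
        "egg": egg,
    }


req = parse_vcs_requirement("git+https://github.com/rdflib/rdflib@master#egg=rdflib")
print(req)
```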

Warning

With Python 2, pip is preferable to Setuptools’s easy_install because pip installs backports.ssl_match_hostname in order to validate HTTPS certificates (by making sure that the certificate hostname matches the hostname that was requested).

Cloning packages from source repositories over ssh:// or https://, either manually or with pip install -e avoids this concern.

There is also a tool called Peep which requires considered-good SHA256 checksums to be specified for every dependency listed in a requirements.txt file.

For more information, see: https://legacy.python.org/dev/peps/pep-0476/#python-versions

Pip Requirements File

Plaintext list of packages and package URIs to install.

Requirements files may contain version specifiers (pip >= 1.5)
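
As a rough illustration of how a specifier such as `pip >= 1.5` can be evaluated, the sketch below handles only the `==`, `>=`, and `<=` operators and purely numeric dotted versions. It is a simplification for illustration, not pip's version-comparison logic; `satisfies` and `version_key` are hypothetical helpers.

```python
import re

# Only the three operators discussed in this document are handled here.
OPERATORS = {
    "==": lambda installed, wanted: installed == wanted,
    ">=": lambda installed, wanted: installed >= wanted,
    "<=": lambda installed, wanted: installed <= wanted,
}

REQUIREMENT = re.compile(
    r"\s*(?P<name>[A-Za-z0-9._-]+)"
    r"(?:\s*(?P<op>==|>=|<=)\s*(?P<version>[0-9]+(?:\.[0-9]+)*))?\s*"
)


def version_key(text):
    """Turn '1.5.0' into (1, 5): numeric parts with trailing zeros dropped."""
    parts = [int(part) for part in text.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def satisfies(installed_version, requirement_line):
    """Return True if installed_version meets the requirement line."""
    match = REQUIREMENT.fullmatch(requirement_line)
    if match is None:
        raise ValueError(f"unsupported requirement: {requirement_line!r}")
    if match.group("op") is None:  # bare package name: any version will do
        return True
    compare = OPERATORS[match.group("op")]
    return compare(version_key(installed_version), version_key(match.group("version")))


print(satisfies("1.5.6", "pip >= 1.5"))
```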

Pip installs Pip Requirement Files:

pip install -r requirements.txt
pip install --upgrade -r requirements.txt
pip install --upgrade --user --force-reinstall -r requirements.txt

An example requirements.txt file:

# install pip from the default index (PyPI)
--index-url https://pypi.python.org/simple
pip

# Install pip 1.5 or greater from PyPI
pip >= 1.5

# Git clone and install pip as an editable develop egg
-e git+https://github.com/pypa/pip@1.5.X#egg=pip

# Install a source distribution release from PyPI
# and check the MD5 checksum in the URL
https://pypi.python.org/packages/source/p/pip/pip-1.5.5.tar.gz#md5=7520581ba0687dec1ce85bd15496537b

# Install a source distribution release from Warehouse
https://warehouse.python.org/packages/source/p/pip/pip-1.5.5.tar.gz

# Install an additional requirements.txt file
-r requirements/more-requirements.txt

Peep

Peep works just like Pip, but requires SHA256 checksum hashes to be specified for each package in a requirements.txt file.
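
The underlying check, comparing a downloaded artifact against a pinned SHA256 digest before installing it, might look roughly like this. The sketch does not reproduce how Peep records hashes in a requirements file; `verify_sha256` and the example bytes are illustrative assumptions.

```python
import hashlib
import hmac


def verify_sha256(data, expected_hex):
    """Return True if the SHA-256 digest of data matches the pinned hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_hex.lower())


# Stand-in for a downloaded package archive.
artifact = b"example package contents\n"
pinned = hashlib.sha256(artifact).hexdigest()  # recorded when the package was vetted

print(verify_sha256(artifact, pinned))
print(verify_sha256(artifact + b"tampered", pinned))
```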

Warehouse

Warehouse is the “Next Generation Python Package Repository”.

All packages uploaded to PyPI are also available from Warehouse.

Wheel

  • Wheel is a newer, PEP-based standard (.whl) with a different metadata format, the ability to specify (JSON) digital signatures for a package within the package, and a number of additional speed and platform-consistency advantages.

  • Wheels can be uploaded to PyPI.

  • Wheels are generally faster than traditional Python packages.

Packages available as wheels are listed at https://pythonwheels.com/.
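
Part of the platform consistency comes from the wheel file name itself, which (per PEP 427) encodes name, version, an optional build tag, and Python, ABI, and platform tags. A minimal sketch of splitting such a name follows; `parse_wheel_filename` is a hypothetical helper and the second file name is made up for illustration.

```python
def parse_wheel_filename(filename):
    """Split 'name-version[-build]-python-abi-platform.whl' into a dict."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of fields in wheel file name")
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
        # Pure-Python wheels carry no ABI and run on any platform.
        "pure": abi_tag == "none" and platform_tag == "any",
    }


pure = parse_wheel_filename("pip-1.5.6-py2.py3-none-any.whl")
binary = parse_wheel_filename("example_pkg-0.1.0-cp39-cp39-manylinux1_x86_64.whl")
print(pure)
print(binary)
```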

RubyGems

RubyGems is a package manager for Ruby packages (“Gems”).

Yum

Yum is a tool for installing, upgrading, and uninstalling RPM packages.

Version Control Systems

Version Control Systems (VCS) — or Revision Control Systems (RCS) — are designed to solve various problems in change management.

  • VCS store code in a repository.

  • Changes to one or more files are called changesets, commits, or revisions

  • Changesets are committed or checked in to a repository.

  • Changesets are checked out from a repository

  • Many/most VCS differentiate between the repository and a working directory, which is currently checked out to a specific changeset identified by a revision identifier; possibly with uncommitted local changes.

  • A branch is forked from a line of development and then merged back in.

  • Most projects designate a main line of development referred to as a trunk, master, or default branch.

  • Many projects work with feature and release branches, which, ideally, eventually converge by being merged back into trunk. (see: HubFlow for an excellent example of branching)

  • Traditional VCS are centralized on a single point-of-failure.

  • Some VCS have a concept of locking to prevent multiple people’s changes from colliding

  • Distributed Version Control Systems (DVCS) (can) clone all revisions of every branch of a repository every time.

  • DVCS changesets are pushed to a different repository

  • DVCS changesets are pulled from another repository into a local clone or copy of a repository

  • Teams working with DVCS often designate a central repository hosted by a project forge service like SourceForge, GNU Savannah, GitHub, or BitBucket.

  • Contributors send patches which build upon a specific revision, which can be applied by a maintainer with commit access permissions.

  • Contributors fork a new branch from a specific revision, commit changes, and then send a pull request, which can be applied by a maintainer with commit access permissions.

CVS

CVS (cvs) is a centralized version control system (VCS) written in C.

CVS predates most/many other VCS.

Subversion

Apache Subversion (svn) is a centralized revision control system (VCS) written in C.

To checkout a revision of a repository with svn:

svn co https://svn.apache.org/repos/asf/subversion/trunk subversion

Bazaar

GNU Bazaar (bzr) is a distributed revision control system (DVCS, RCS, VCS) written in Python and C.

https://launchpad.net hosts Bazaar repositories; with special support from the bzr tool in the form of lp: URIs like lp:bzr.

To clone a repository with bzr:

bzr branch lp:bzr

Git

Git (git) is a distributed version control system (DVCS, VCS, RCS), written in C, for tracking a branching and merging repository of file revisions.

To clone a repository with git:

git clone https://github.com/git/git

GitFlow

GitFlow is a named branch workflow for Git with master, develop, feature, release, hotfix, and support branches (git flow).

Gitflow branch names and prefixes are configured in .git/config; the defaults are:

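GitFlow Branch Names

==============  ==============================================================
Branch Name     Description (and Code Labels)
==============  ==============================================================
master          Stable trunk (latest release)
develop         Development main line
feature/<name>  New features for the next release (e.g. ENH, PRF)
release/<name>  In-progress release branches (e.g. RLS)
hotfix/<name>   Fixes to merge to both master and develop (e.g. BUG, TST, DOC)
support/<name>  “What is the ‘support’ branch?” https://github.com/nvie/gitflow/wiki/FAQ
==============  ==============================================================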

Creating a new release with Git and GitFlow:

git clone ssh://git@github.com/westurner/dotfiles
# git checkout master
# git checkout -h
# git help checkout (man git-checkout)
# git flow [<cmd> -h]
# git-flow [<cmd> -h]

git flow init
## Update versiontag in .git/config to prefix release tags with 'v'
git config --replace-all gitflow.prefix.versiontag v
cat ./.git/config
# [gitflow "prefix"]
# feature = feature/
# release = release/
# hotfix = hotfix/
# support = support/
# versiontag = v
#

## feature/ENH_print_hello_world
git flow feature start ENH_print_hello_world
#git commit, commit, commit
git flow feature
git flow feature finish ENH_print_hello_world   # ENH<TAB>

## release/0.1.0
git flow release start 0.1.0
#git commit (e.g. update __version__, setup.py, release notes)
git flow release finish 0.1.0
git tag | grep 'v0.1.0'
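
The `[gitflow "prefix"]` section shown in the comments above determines how branch and tag names are composed. As a sketch, the same section can be read with Python's configparser to preview the resulting names. configparser is not a full git-config parser (`git config --get` is the authoritative way to read these values), and the `branch_name` and `release_tag` helpers are hypothetical.

```python
import configparser

# The gitflow prefix section, as it might appear in .git/config.
GIT_CONFIG = """
[gitflow "prefix"]
feature = feature/
release = release/
hotfix = hotfix/
support = support/
versiontag = v
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(GIT_CONFIG)
prefixes = dict(parser['gitflow "prefix"'])


def branch_name(kind, name):
    """Compose a branch name such as 'feature/<name>' from the configured prefix."""
    return prefixes[kind] + name


def release_tag(version):
    """Compose the tag that finishing a release would create."""
    return prefixes["versiontag"] + version


print(branch_name("feature", "ENH_print_hello_world"))
print(release_tag("0.1.0"))
```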

HubFlow

GitFlow is a named branch workflow for Git with master, develop, feature, release, hotfix, and support branches (git flow).

HubFlow is a fork of GitFlow that adds useful commands for working with Git and GitHub pull requests.

HubFlow branch names and prefixes are configured in .git/config; the defaults are:

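HubFlow Branch Names

==============  ==============================================================
Branch Name     Description (and Code Labels)
==============  ==============================================================
master          Stable trunk (latest release)
develop         Development main line
feature/<name>  New features for the next release (e.g. ENH, PRF)
release/<name>  In-progress release branches (e.g. RLS)
hotfix/<name>   Fixes to merge to both master and develop (e.g. BUG, TST, DOC)
==============  ==============================================================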

Creating a new release with Git and HubFlow:

git clone ssh://git@github.com/westurner/dotfiles
# git checkout master
# git checkout -h
# git help checkout (man git-checkout)
# git hf [<cmd> -h]
# git-hf [<cmd> -h]

git hf init
## Update versiontag in .git/config to prefix release tags with 'v'
git config --replace-all hubflow.prefix.versiontag v
#cat .git/config # ...
# [hubflow "prefix"]
# feature = feature/
# release = release/
# hotfix = hotfix/
# support = support/
# versiontag = v
#
git hf update
git hf pull
git hf pull -h

## feature/ENH_print_hello_world
git hf feature start ENH_print_hello_world
#git commit, commit
git hf pull
git hf push
#git commit, commit
git hf feature finish ENH_print_hello_world   # ENH<TAB>

## release/0.1.0
git hf release start 0.1.0
## commit (e.g. update __version__, setup.py, release notes)
git hf release finish 0.1.0
git tag | grep 'v0.1.0'

The GitFlow and HubFlow illustrations are very helpful for visualizing and understanding any DVCS workflow: https://datasift.github.io/gitflow/IntroducingGitFlow.html.

[Figure: GitFlow Release / Master Branch Merge Diagram]
[Figure: GitFlow Hotfix to Master and Develop Branches Merge Diagram]
[Figure: Numbered GitFlow Workflow Diagram]

Mercurial

Mercurial (hg) is a distributed revision control system written in Python and C (DVCS, VCS, RCS).

To clone a repository with hg:

hg clone https://www.mercurial-scm.org/repo/hg

Project Templates

cookiecutter

Cookiecutter creates projects (files and directories) from project templates written in Jinja2 for projects written in Python and other languages.

Languages

Lightweight Markup Language

RDoc

RDoc is a tool and a Lightweight Markup Language for generating HTML and command-line documentation for Ruby projects.

To not build RDoc docs when installing a Gem:

gem install --no-rdoc --no-ri
gem install --no-document
gem install -N

ReStructuredText

ReStructuredText (ReST, RST) is a Lightweight Markup Language commonly used for narrative documentation and inline Python, C, Java, etc. docstrings which can be parsed, transformed, and published to valid HTML, ePub, LaTeX, PDF.

Sphinx is built on Docutils, the primary implementation of ReStructuredText.

Pandoc also supports a form of ReStructuredText.

ReStructuredText Directive

Actionable blocks of ReStructuredText

include, contents, and index are all ReStructuredText Directives:

   .. include:: goals.rst

   .. contents:: Table of Contents
    :depth: 3

    .. index:: Example 1
    .. index:: Sphinx +
    .. _example-1:

    Sphinx +1
    ==========
    This refs :ref:`example 1 <example-1>`.

    Similarly, an explicit link to this anchor `<#example-1>`__

    And an explicit link to this section `<#sphinx-1>`__
    (which is otherwise not found in the source text).


    .. index:: Example 2
    .. _example 2:

    Example 2
    ==========

    This links to :ref:`example-1` and :ref:`example 2`.

    (`<#example-1>`__, `<#example-2>`__)

    And this also links to `Example 2`_.

   .. include:: LICENSE

.. note:: ``index`` is a :ref:`Sphinx` Directive,
    which will print an error to the console when building
    but will otherwise be silently dropped
    by non-Sphinx ReStructuredText parsers
    like :ref:`Docutils` (GitHub) and :ref:`Pandoc`.

ReStructuredText Role

RestructuredText role extensions

:ref: is a Sphinx RestructuredText Role:

A (between files) link to :ref:`example 2`.

C

C is a third-generation programming language which affords relatively low-level machine access while providing helpful abstractions.

Every Windows kernel is written in C.

The GNU/Linux kernel is written in C and often compiled by GCC or Clang for a particular architecture (see: man uname)

The OS X kernel is written in C.

Libc libraries are written in C.

Almost all of the projects linked here, at some point, utilize code written in C.

C++

C++ is a free and open source third-generation programming language which adds object orientation and a standard library to C.

Standard Template Library

libc++

libc++ (libcxx) is the free and open source LLVM C++ Standard Template Library.

  • Clang (clang++) typically builds with libc++ (libcxx).

Microsoft STL

Microsoft STL is Microsoft’s free and open source implementation of the C++ Standard Template Library.

  • Microsoft Visual C++ typically builds with the Microsoft STL.

Fortran

Fortran (or FORTRAN) is a third-generation programming language frequently used for mathematical and scientific computing.

Some of the SciPy libraries build optimized mathematical Fortran routines.

Haskell

Haskell is a free and open source strongly statically typed purely functional programming language.

Cabal is the Haskell package manager.

Pandoc is written in Haskell.

Go

Go is a free and open source statically-typed C-based third generation language.

Java

Java is a third-generation programming language which is compiled into code that runs in a virtual machine (JVM) written in C for many different operating systems.

JVM

A JVM (“Java Virtual Machine”) runs Java code (classes and JARs).

Javascript

Javascript (JS) is a free and open source third-generation programming language designed to run in an interpreter; now specified as ECMAScript.

All major web browsers support Javascript.

Client-side (web) applications can be written in Javascript.

Server-side (web) applications can be written in Javascript, often with Node.js, NPM, and Bower packages.

Note

Java and JavaScript are two distinctly different languages and developer ecosystems.

ECMAScript

ECMAScript (ES) is an evolving, formally-specified, weakly-typed scripting language from which Javascript and ActionScript are derived.

  • There are multiple versions of ECMAScript (ES):

    • ES1 – ES1997

    • ES2 – ES1998

    • ES3 – ES1999

    • ES5 – ES2009

    • ES6 – ES2015

    • ES7 – ES2016

    • ES8 – ES2017

    • ES9 – ES2018

    • ES10 – ES2019

    • ES.Next

  • Babel compiles ECMAScript (ES6+) to Javascript.

  • Some browsers support various versions (ES7) of ECMAScript.

  • Firefox is built upon the SpiderMonkey ECMAScript engine.

  • Google Chrome, Node.js, and the latest Microsoft Edge are built upon the V8 ECMAScript engine.
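
The version list above follows a simple pattern from ES6 onward: the year is the edition number plus 2009. A small sketch of that arithmetic, with the irregular early editions taken from the list above, is shown below; `es_year` is a hypothetical helper and the yearly pattern is assumed to continue for later editions.

```python
# Early editions from the list above; there is no ES4 entry.
EARLY_EDITIONS = {1: 1997, 2: 1998, 3: 1999, 5: 2009}


def es_year(edition):
    """Map an ECMAScript edition number to its year (ES6 -> 2015)."""
    if edition in EARLY_EDITIONS:
        return EARLY_EDITIONS[edition]
    if edition >= 6:
        return 2009 + edition  # yearly releases from ES6 (ES2015) onward
    raise ValueError(f"no listed ECMAScript edition {edition}")


for edition in (5, 6, 7, 10):
    print(f"ES{edition} - ES{es_year(edition)}")
```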

Babel

Babel is a Javascript (ECMAScript) compiler that transforms ES6 (ES2015) and beyond into browser-compatible JS.

  • ReactJS developers commonly compile ES6+ and JSX to JS with Babel.

Node.js

Node.js is a free and open source framework, written in C, C++, and Javascript, for Javascript applications.

Jinja2

Jinja2 is a free and open source templating engine written in Python.

Sphinx and Salt are two projects that utilize Jinja2.

Perl

Src: git git://perl5.git.perl.org/perl.git

Perl is a free and open source, dynamically typed, C-based third-generation programming language.

Many of the Debian system management tools are or were originally written in Perl.

Python

Python is a free and open source dynamically-typed, C-based third-generation programming language.

As a multi-paradigm language with support for functional and object-oriented code, Python is often utilized for system administration and scientific software development.

The Python community is generously supported by a number of sponsors and the Python Infrastructure Team.

Cython

Cython is a superset of Python which adds static type declarations, making Python code faster in many cases.

SciPy Stack

Python Distributions

PyPy

PyPy is a Python implementation with a JIT compiler, written in RPython (a restricted subset of Python) which is translated to C; it is often faster than CPython for many workloads.

Python 3

Python 3 made a number of incompatible changes, requiring developers to update and review their Python 2 code in order to “port to” Python 3.

Python 2 will be supported in “no-new-features” status for quite some time.

Python 3 Wall of Superpowers tracks which popular packages have been ported to support Python 3: https://python3wos.appspot.com/

There are a number of projects which help bridge the gap between the two language versions.

See also: Anaconda

Tox

Tox is a build automation tool designed to build and test Python projects with multiple language versions and environments in separate virtualenvs.

Run the py27 environment:

tox -v -e py27
tox --help

Ruby

Ruby is a free and open source dynamically-typed programming language.

Vagrant is written in Ruby.

Rust

Rust is a free and open source strongly typed multi-paradigm programming language.

WebAssembly

WebAssembly (wasm) is a safe (sandboxed), efficient low-level programming language (abstract syntax tree) and binary format for the web.

  • WebAssembly was initially derived from asm.js and PNaCl.

  • WebAssembly is an industry-wide effort.

  • LLVM can generate WebAssembly from e.g. C and C++ code.

YAML

YAML (“YAML Ain’t Markup Language”) is a concise data serialization format.

Most Salt states and pillar data are written in YAML. Here’s an example top.sls file:

base:
 '*':
   - openssh
 '*-webserver':
   - webserver
 '*-workstation':
   - gnome
   - i3
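
To show how the target patterns in this top.sls select states, here is a sketch that treats the file as a nested mapping and matches a minion ID against each glob pattern. Salt performs this matching itself; the sketch only assumes glob-style targets, and the minion IDs and the `states_for` helper are hypothetical.

```python
from fnmatch import fnmatchcase

# The top.sls above, written as the equivalent Python data structure.
TOP = {
    "base": {
        "*": ["openssh"],
        "*-webserver": ["webserver"],
        "*-workstation": ["gnome", "i3"],
    }
}


def states_for(minion_id, top=TOP, env="base"):
    """Collect the states of every target pattern that matches minion_id."""
    states = []
    for pattern, names in top[env].items():
        if fnmatchcase(minion_id, pattern):
            states.extend(names)
    return states


for minion in ("alpha-webserver", "beta-workstation", "db1"):
    print(minion, states_for(minion))
```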

Compilers

Interpreter

Binutils

GNU Binutils are a set of utilities for working with assembly and binary.

GCC utilizes GNU Binutils to compile the GNU/Linux kernel and userspace.

GAS, the GNU Assembler (as) assembles ASM code for linking by the GNU linker (ld).

Clang

Clang is a compiler front end for C, C++, Objective-C, and Objective-C++. Clang is part of the LLVM project.

GCC

The GNU Compiler Collection started as a Free and Open Source compiler for C.

  • There are now GCC frontends for many languages, including C++, Fortran, Java, and Go.

  • The C++ GCC frontend binary is called g++.

GNU Linker

The GNU Linker is the GNU implementation of the ld command for linking object files and libraries.

LLVM

LLVM (“Low Level Virtual Machine”) is a reusable compiler infrastructure with frontends for many languages.

  • Clang is an LLVM frontend for C-based languages like C, C++, CUDA, and OpenCL.

  • There is a WASM LLVM backend: LLVM can produce WebAssembly binaries.

  • The C++ LLVM frontend binary is called clang++.

Operating Systems

POSIX

POSIX (“Portable Operating System Interface”) is a set of standards for Shells, Operating Systems, and APIs.

Linux

GNU/Linux (“Linux”) is a free and open source operating system kernel written in C.

uname -a; echo "Linux"
uname -o; echo "GNU/Linux"

Linux Distributions

A Linux Distribution is a collection of Packages compiled to work with a GNU/Linux kernel and a Libc.

RedHat

RedHat Enterprise Linux (“RHEL”) is a Linux Distribution that is built from RPM packages.

CentOS

CentOS is a Linux Distribution built from RPM packages; it is derived from RHEL.

Scientific Linux

Scientific Linux is a Linux Distribution built from RPM packages; it is derived from CentOS, which is derived from RHEL.

Oracle

Oracle Linux is a Linux Distribution built from RPM packages; it is derived from RHEL.

ChromiumOS

ChromiumOS is a Linux Distribution built on Portage.

Crouton

Crouton (“Chromium OS Universal Chroot Environment”) installs and debootstraps a Linux Distribution (i.e. Debian or Ubuntu) within a ChromiumOS or ChromeOS chroot.

ChromeOS

ChromeOS is a Linux Distribution built on ChromiumOS and Portage.

CoreOS

CoreOS is a Linux Distribution for highly available distributed computing.

CoreOS schedules redundant Docker images with fleet and systemd according to configuration stored in etcd, a key-value store with an HTTP API.

  • CoreOS runs on very many platforms

  • CoreOS does not provide a package manager

  • CoreOS schedules Docker

  • CoreOS – Operating System

  • etcd – Consensus and Discovery

  • rkt – Container Runtime

  • fleet – Distributed init system (etcd, systemd)

  • flannel – Networking

Linux Notes

Linux Dual Boot
  • [ ] GRUB chainloader to partition boot record

    • Ubuntu and Fedora GRUB try to autodiscover Windows partitions

OS X

OS X is a UNIX operating system based upon the Mach kernel from NeXTSTEP, which was partially derived from NetBSD and FreeBSD.

The native OS X GUI is built on Quartz; X11 applications are supported through XQuartz, which is derived from the XFree86/X.org X11 server.

OS X maintains forks of many POSIX BSD and GNU tools like bash, readlink, and find.

Homebrew installs and maintains packages for OS X.

uname; echo "Darwin"

iOS

iOS is a closed source UNIX operating system based upon many components of OS X adapted for phones and then tablets.

  • iOS powers iPhones and iPads

  • You must have a Mac with OS X and Xcode to develop and compile for iOS.

OSX Notes

OSX Reinstall
  • [ ] Generate installation media

  • [ ] Reboot to recovery partition

  • [ ] Adjust partitions

  • [ ] Format?

  • [ ] Install OS

  • [ ] (wait)

  • [ ] Manual time/date/language config

  • [ ] Run workstation provisioning scripts

OSX Fresh Install
  • [ ] Generate / obtain installation media

  • [ ] Boot from installation media

  • [ ] Manual time/date/language config

  • [ ] Run workstation provisioning scripts

Windows

Microsoft Windows is an NT-kernel-based operating system.

  • There used to be a POSIX compatibility mode.

  • Chocolatey maintains a set of NuGet packages for Windows.

Windows Subsystem for Linux

Windows Subsystem for Linux (WSL) is a binary compatibility layer which allows many Linux programs to be run on Windows 10+.

The Windows Subsystem for Linux lets developers run a GNU/Linux environment – including most command-line tools, utilities, and applications – directly on Windows, unmodified, without the overhead of a virtual machine.

  • Windows Subsystem for Linux is not a complete Virtualization solution; but it does allow you to run e.g. Ubuntu or Fedora (and thus e.g. Bash) on a Windows machine.

  • Docker for Windows is one alternative to Windows Subsystem for Linux.

Windows Sysinternals

Windows Sysinternals is a group of tools for working with Windows.

WSUS Offline Update

WSUS Offline Update is a free and open source software tool for generating offline Windows upgrade CDs / DVDs containing the latest upgrades for Windows, Office, and .NET.

  • Bandwidth costs: Windows Updates (WSUS) in GB * n_machines (see also: Debtorrent, Packages)

  • “Slipstreaming” an installation ISO is one alternative way to avoid having to spend hours upgrading a factory reinstalled (“reformatted”) Windows installation
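The bandwidth estimate above is simple multiplication. A rough sketch with made-up numbers (the update size and machine count are illustrative assumptions, not measurements, and total_download_gb is a hypothetical helper):

```python
def total_download_gb(update_size_gb, n_machines, shared_offline_media=False):
    """Estimate total update download volume in GB.

    With shared offline media (e.g. one WSUS Offline Update image),
    the updates are downloaded once instead of once per machine.
    """
    downloads = 1 if shared_offline_media else n_machines
    return update_size_gb * downloads


if __name__ == "__main__":
    # Hypothetical figures: 4 GB of updates, 25 machines.
    print(total_download_gb(4, 25))                             # per-machine downloads
    print(total_download_gb(4, 25, shared_offline_media=True))  # one shared download
```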

Windows Notes

A few annotated excerpts from this Chocolatey NuGet PowerShell script https://gist.github.com/westurner/10950476#file-cinst_workstation_minimal-ps1

cinst GnuWin
cinst sysinternals      # Process Explorer XP
cinst 7zip
cinst curl

Windows Dual Boot
  • [ ] Windows MBR chain loads to partition GRUB (Linux)

  • [ ] Ubuntu WUBI .exe Linux Installer (XP, 7, 8*)

    • It’s now better to install to a separate partition from a bootable ISO

Configuration Management

Cobbler

Cobbler is a machine image configuration, repository mirroring, and networked booting server with support for DNS, DHCP, TFTP, and PXE.

  • Cobbler can template kickstart files for the RedHat Anaconda installer

  • Cobbler can template Debian preseed files

  • Cobbler can PXE boot an ISO over TFTP (and unattended install)

  • Cobbler can manage a set of DNS and DHCP entries for physical systems

  • Cobbler can batch mirror RPM and DEB repositories (see also: apt-cacher-ng, Nginx)

  • Cobbler-web is a Django WSGI application; usually configured with Apache HTTPD and mod_wsgi.

    • Cobbler-web delegates very many infrastructure privileges

See also: crowbar, OpenStack Ironic bare-metal deployment

Juju

Juju is a Configuration Management tool written in Go (earlier releases were written in Python) which runs Juju Charms, often written in Python, on one or more systems over SSH, for managing one or more physical and virtual machines running Ubuntu.

Puppet

Puppet is a Configuration Management system written in Ruby which runs Puppet Modules written in Puppet DSL or Ruby for managing one or more physical and virtual machines running various operating systems.

Build Automation Tools

GNU Autotools

GNU Autotools (GNU Build System) are a set of tools for software build automation: autoconf, automake, libtool, and gnulib.

  • The traditional ./configure --help; make; make install build workflow comes from the GNU Build System.

  • Autoconf generates a configure script from configure.ac. The configure script checks for platform and software dependencies and records the results in a config.status script, which in turn generates a config.h C header file.

  • Automake generates a Makefile.in template from Makefile.am; the configure script then generates a GNU Make Makefile from Makefile.in.

  • GNU Coding Standards define a number of standard configuration variables: CC, CFLAGS, CXX, CXXFLAGS, LDFLAGS, CPPFLAGS which tools such as GNU Make automatically add to e.g. GCC (and GNU Linker) build program arguments

$ # autoconf  # configure.ac -> ./configure
$ ./configure --help
$ ./configure --prefix=/usr/local --with-this-or-that
$ make

Bake

Bake is a free and open source software build automation tool similar in form and function to Make.

  • Bake uses Bakefile files to describe builds.

  • Bakefiles can contain e.g. Bash and Python scripts (instead of Make syntax)

BUILD

A number of tools use (incompatible) BUILD files to describe software builds:

Blaze

Blaze is an internal software build automation tool developed by Google.

  • Blaze was the first build tool to use BUILD files.

  • Bazel is an open source rewrite of Blaze.

Bazel

Bazel is a free and open source software build automation tool developed as a rewrite of Google Blaze.

  • A WORKSPACE or WORKSPACE.bazel file indicates the root of a Bazel workspace.

  • Bazel uses BUILD (or BUILD.bazel) files to describe builds.

  • Buck was released before Bazel was open-sourced.

CMake

CMake is a free and open source software build automation tool.

  • CMake generates build configurations for a number of tools: Unix Makefiles, Ninja, Visual Studio

Grunt

Grunt is a build tool written in Javascript which builds a directed acyclic graph (DAG).

Jake

Jake is a Javascript build tool written in Javascript (for Node.js) similar to Make or Rake.

Make

GNU Make is a classic, ubiquitous software build automation tool designed for file-based source code compilation which builds a directed acyclic graph (DAG).

Bash, Python, and the Linux kernel are all built with Make.

Make build task chains are represented in a Makefile.
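The idea of ordering build tasks as a directed acyclic graph can be sketched with Python's standard library. This is only an illustration of dependency ordering, not how Make is implemented (Make also compares file modification times); the target names are hypothetical:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical targets: each key depends on the files in its set,
# much like "target: prerequisites" rules in a Makefile.
RULES = {
    "app": {"main.o", "util.o"},
    "main.o": {"main.c"},
    "util.o": {"util.c"},
}


def build_order(rules):
    """Return an order in which every prerequisite precedes its target."""
    return list(TopologicalSorter(rules).static_order())


if __name__ == "__main__":
    print(build_order(RULES))
```

The relative order of independent targets (such as main.o and util.o) is not significant; only the prerequisite-before-target constraint is.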

Pros

  • Simple, easy to read syntax

  • Designed to build files on disk (see: .PHONY)

  • Nesting: make -C <path> <taskname>

  • Variable Syntax: $(VARIABLE_NAME) or ${VARIABLE_NAME}

  • Bash completion: make <tab>

  • Python: Initially parseable with distutils.text_file

  • Logging: command names and values print to stdout (unless prefixed with @)

Cons

  • Platform Portability: make is not installed everywhere

  • Global Variables: parametrization with shell scripts

VARIABLE_NAME="value" make test
make test VARIABLE_NAME="value"

# ...
export VARIABLE_NAME="value"
make test

Pants Build

Pants Build is a build tool for JVM [Java, Scala, Android], C++, Go, Haskell, Node, and Python [CPython] software projects.

Virtualization

libcontainer

libcontainer is a library built by Docker to replace LXC.

Libcontainer provides a native Go implementation for creating containers with namespaces, Cgroups, capabilities, and filesystem access controls.

https://github.com/opencontainers/runc/tree/master/libcontainer

Open Container Initiative

The Open Container Initiative (OCI) is a Linux Foundation collaborative project dedicated to developing a working, portable software container specification.

runC

runC is a container abstraction

runc is a CLI tool for spawning and running containers according to the Open Container Initiative specification.

Docker

Docker is an OS virtualization project written in Go which utilizes Linux containers – first LXC now libcontainer / runC – to partition process workloads across one or more host systems.

Dockerfile

A Dockerfile contains the instructions needed to create a docker image.

Docker container

A Docker container is an instance of a Docker Image with configuration.

Docker API

The Docker API is an interface of management commands for provisioning and managing containers.

Docker Machine, Docker Swarm, and Docker Universal Control Plane all implement the Docker API; so the docker client works equally well with each implementation.

Docker Machine

Docker Machine is the container management application which implements the Docker API.

Docker Swarm

Docker Swarm is a cluster management system for Docker containers hosted on one or more Docker Machines

Docker Universal Control Plane

Docker Universal Control Plane is an enterprise-grade cluster management solution with a web dashboard and external authentication which implements the Docker API.

Docker Compose

Docker Compose is a Python application for defining and managing services (Docker containers) and networks with a docker-compose.yml YAML configuration file.

Docker Image

A Docker Image is an archived container filesystem with configuration which is usually defined by a Dockerfile.

Docker Hub

Docker Hub is a cloud-based registry service for Docker Images.

Docker Cloud

Docker Cloud is the hosting service offered by Docker.

  • Docker images build from a Dockerfile

  • A Dockerfile can subclass another Dockerfile (to add, remove, or change configuration)

  • Dockerfiles support a limited number of instructions

  • Docker is not intended to be a complete configuration management system

  • Ideally, a Docker image requires minimal configuration once built

  • Docker images can be hosted by https://hub.docker.com/

  • docker run -it ubuntu:16.04 downloads the image from https://hub.docker.com/_/ubuntu/, creates a new container (see docker ps), and spawns a root Shell in a container with a randomly generated name (by default).

  • There are a number of ways to “Schedule” [redundant] persistent containers that launch on boot with Docker

    • Docker Swarm is the Docker-native way to run a cluster of containers. To a client app, Docker Swarm looks just like Docker Machine because it implements the Docker API.

    • Kubernetes is one project which uses Docker (or another container runtime) to schedule redundant, optionally geodistributed, containers (in “Pods”).

Salt can install and manage docker, docker images and containers.

Cloud Native Computing Foundation

The Cloud Native Computing Foundation (CNCF) is a foundation for cloud and container industry collaboration.

k3s

k3s is a lightweight Kubernetes distribution which runs on x86-64, ARM6, and ARM7; only requires 512 MB of RAM; and is distributed as a single Go binary.

  • You’ve heard of k8s? This is k8s - 5.

Kubernetes-Mesos

kubernetes-mesos integrates Kubernetes Docker Pod scheduling with Mesos.

Kubernetes and Mesos are a match made in heaven.

Kubernetes enables the Pod, an abstraction that represents a group of co-located containers, along with Labels for service discovery, load-balancing, and replication control.

Mesos provides the fine-grained resource allocations for pods across nodes in a cluster, and facilitates resource sharing among Kubernetes and other frameworks running on the same cluster.

KVM

KVM is a full virtualization platform with support for Intel VT and AMD-V; which supports running various guest operating systems, each with their own kernel, on a given host machine.

Libcloud

Apache libcloud is a Python library which abstracts and unifies a large number of Cloud APIs for Compute Resources, Object Storage, Load Balancing, and DNS.

Salt Cloud (salt-cloud) depends upon libcloud.

Libvirt

Libvirt is a system for platform virtualization with various Linux hypervisors.

LXC

LXC (“Linux Containers”), written in C, builds upon Linux Cgroups to provide containerized OS chroots (all running under the host kernel).

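The kernel features that LXC builds upon (such as Cgroups and namespaces) are included in recent Linux kernels; LXC itself is a set of userspace tools.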

LXD

LXD, written in Go, builds upon LXC to provide a system-wide daemon and an OpenStack Nova hypervisor plugin.

Mesos

Apache Mesos is a highly-available distributed datacenter operating system, for which there are many different task/process/service schedulers.

Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively.

OpenStack

OpenStack is a platform of infrastructure services for running a cloud datacenter (a private or a public cloud).

OpenStack makes it possible for end-users to create a new virtual machine from the available pool of resources.

rdfs:seeAlso: OpenStack DevStack, Libcloud

Packer

Packer generates machine images for multiple platforms, clouds, and hypervisors from a parameterizable template.

Packer Artifact

Build products: machine image and manifest

Packer Template

JSON build definitions with optional variables and templating

Packer Build

Task defined by a JSON file containing build steps which produce a machine image

Packer Builder

Packer components which produce machine images for one of many platforms.

Packer Provisioner

Packer components for provisioning machine images at build time

  • Shell scripts

  • File uploads

  • ansible

  • chef solo

  • puppet

  • salt

Packer Post-Processor

Packer components for compressing and uploading built machine images
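How the pieces above fit together in one JSON template can be sketched by building the template as a Python dict and serializing it. This is a hedged illustration only: the variable name, builder settings, and inline command are placeholders and are not sufficient for a real build.

```python
import json

# Hypothetical legacy-style JSON template; the builder settings are
# placeholders and are not sufficient for a real build.
template = {
    "variables": {"hostname": "example"},
    "builders": [
        {"type": "virtualbox-iso", "vm_name": "{{user `hostname`}}"}
    ],
    "provisioners": [
        {"type": "shell", "inline": ["echo provisioning"]}
    ],
    "post-processors": [
        {"type": "vagrant"}
    ],
}


def render_template(tpl):
    """Serialize the template dict as JSON text."""
    return json.dumps(tpl, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render_template(template))
```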

Vagrant

Vagrant is a tool written in Ruby for creating and managing virtual machine instances with CPU, RAM, Storage, and Networking.

vagrant help
vagrant status
vagrant init ubuntu/trusty64
vagrant up
vagrant ssh
$EDITOR Vagrantfile
vagrant provision
vagrant halt
vagrant destroy

Vagrantfile

A Vagrantfile is a Vagrant script defining a team of one or more virtual machines and networks.

Create a Vagrantfile:

vagrant init [basebox]
cat Vagrantfile

Start virtual machines and networks defined in the Vagrantfile:

vagrant status
vagrant up

Vagrant Box

A Vagrant Box is a base virtual machine image for Vagrant.

There are many baseboxes for various operating systems.

Essentially a virtual disk plus CPU, RAM, Storage, and Networking metadata.

Locally-stored and cached vagrant boxes can be listed with:

vagrant help box
vagrant box list

A running vagrant environment can be packaged into a new box with:

vagrant package

Packer generates VirtualBox Vagrant Boxes with a Post-Processor.

Vagrant Cloud

Vagrant-hosted public Vagrant Box storage.

Install a box from Vagrant cloud:

vagrant init ubuntu/trusty64
vagrant up
vagrant ssh

Vagrant Provider

A driver for running Vagrant Boxes with a hypervisor or in a cloud.

The Vagrant VirtualBox Provider is well-supported.

With Plugins: https://github.com/mitchellh/vagrant/wiki/Available-Vagrant-Plugins

See also: Libcloud.

Vagrant Provisioner

Set of hooks to install and run shell scripts and configuration management tools over vagrant ssh.

vagrant up runs the provisioners on its first invocation; vagrant provision runs them again on a running machine.

vagrant provision

Note

Vagrant configures a default synced folder which shares the project directory at /vagrant (with VirtualBox shared folders by default; NFS is an option).

Note

Vagrant adds a default NAT Adapter as eth0; presumably for DNS, the default route, and to ensure vagrant ssh connectivity.

VirtualBox

Src: svn svn://www.virtualbox.org/svn/vbox/trunk

Oracle VirtualBox is a platform virtualization package for running one or more guest VMs (virtual machines) within a host system.

VirtualBox:

  • runs on many platforms: Linux, OSX, Windows

  • has support for full platform virtualization with Intel VT-x / AMD-V

  • requires matching kernel modules

Vagrant scripts VirtualBox.

Shells

Bash

GNU Bash, the Bourne-again shell, is an open source command-line program written in C for running commands in a text-based terminal.

A few commands to try when learning to shell with Bash:

echo $SHELL; echo "$SHELL"; echo "${SHELL}"
type bash
bash --help
help help
help type
apropos bash
info bash
man bash

man man
info info  # [down arrow] and then [enter] to select, or 'n' for next

  • Bash works with unix command outputs and return codes: a program returns nonzero when there is an error:

    true;  echo $?  # 0
    false; echo $?  # 1
    echo -n "Hello" && echo " World!"  # Hello World!
    false || echo "World!"          # World!
    
  • Functions: Bash supports functions with arguments that can print to standard out and/or return an integer return code:

    function add_a {
       echo "$1 + $2 = $(( $1 + $2 ))"
    }
    add_b () {
       echo "$1 + $2 = $(( $1 + $2 ))"
    }
    add_xy () {
       echo "$x + $y = $(( $x + $y ))"
    }
    add_a 3 5       # "3 + 5 = 8"
    add_b 3 5       # "3 + 5 = 8"
    
    x=3 y=5 add_xy  # "3 + 5 = 8"
    x=3; y=5;
    add_xy          # "3 + 5 = 8"
    
    output=$(add_a 3 5)
    echo "${output}"
    
    help test
    help [
    help [[
    help return
    
    test "$(add_a 3 5)" == "3 + 5 = 8" && echo 'OK'
    
    test_add_a () {
       if [[ "$(add_a 3 5)" == "3 + 5 = 8" ]]; then
           echo 'OK'
           return 0
       else
           echo 'Test failed'
           return 1
       fi
    }
    test_add_a
    
    help trap
    help exit
    
  • Portability: sh (sh, bash, dash, zsh) shell scripts are mostly compatible; though bash supports some features that other shells do not.

  • Logging: You can configure bash to print commands and arguments as bash executes scripts:

    set -x  # print commands and arguments
    set -v  # print source
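The return-code convention described in this list (zero for success, nonzero for an error) can also be observed from Python. A minimal sketch, assuming only the standard library; the exit_code helper is hypothetical and uses the current Python interpreter as the child program so that it does not depend on which shell utilities are installed:

```python
import subprocess
import sys


def exit_code(python_snippet):
    """Run a tiny Python program in a child process and return its exit code."""
    result = subprocess.run([sys.executable, "-c", python_snippet])
    return result.returncode


if __name__ == "__main__":
    print(exit_code("raise SystemExit(0)"))  # like `true; echo $?`
    print(exit_code("raise SystemExit(1)"))  # like `false; echo $?`
```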
    

Bash reads various configuration files at startup time:

/etc/profile
/etc/bash.bashrc
/etc/profile.d/*.sh
${HOME}/.profile        /etc/skel/.profile   # PATH=+$HOME/bin  # umask
${HOME}/.bash_profile   # empty. preempts .profile
${HOME}/.bashrc

Bash and various Operating Systems:

  • Linux: Bash is almost always installed as the default shell on Linux boxes.

  • Mac:

    • MacOS includes Bash 3.2.

    • You can brew install bash to get a more recent version. (Homebrew)

  • Windows:

    • Windows Subsystem for Linux (WSL) installs Linux distributions which include bash.

    • You can also install bash on Windows by installing git with choco install git -y (Chocolatey)

    • You can also install bash on Windows by installing MSYS2 (Mingw) or Cygwin with choco install msys2 or choco install cygwin

While Bash is ubiquitous, shell scripts are loose with quoting; which makes shell scripts flexible but dangerous and thus often avoided in favor of other languages:

## Shell script quoting example 1:

# This prints "a" and "b" on separate lines:
# the unquoted expansion is split into words, so echo sees -e as an option
echo $(echo "-e a\nb")

# This prints "-e a\nb"
echo "$(echo "-e a\nb")"

This isn’t an issue with e.g. Python (a popular language that’s also useful for system administration).

import subprocess
print(subprocess.check_output(['echo', "-e a\nb"]))
print(subprocess.check_output('echo "-e a\nb"', shell=True))

# Though, note that Python subprocess shell=True is a security risk:
# - avoid shell=True
# - pass the command as a list of already-tokenized arguments
# - use something like sarge (or ansible) instead of shell=True
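When a command string for a shell cannot be avoided, the standard library's shlex module can quote an argument so that a POSIX shell treats it as a single word. A small sketch of that idea (the echo command is only a placeholder; nothing is executed here):

```python
import shlex

arg = "-e a\nb"  # one argument containing a space and a newline

# shlex.quote() wraps the value so that a POSIX shell sees a single word.
quoted = shlex.quote(arg)
command = "echo " + quoted

# Splitting the command the way a POSIX shell would recovers the argument intact.
assert shlex.split(command) == ["echo", arg]
print(command)
```

Passing a list of arguments without shell=True remains the simpler option; quoting is a fallback.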

IPython is one of many alternatives to Bash.

IPython

IPython is an interactive REPL and distributed computation framework written in Python.

## Formatting expression output with the Python interpreter
1 + 1
x = 1+1
print("1 + 1 = 2")
print('1 + 1 = %d' % (x))
print('1 + 1 = {0}'.format(x))   # Python 2.6+
print('1 + 1 = {x}'.format(x=x)) # Python 2.6+
print(f'1 + 1 = {x}')            # Python 3.6+
print(f'{1 + 1 = }')             # Python 3.8+

## IPython
!ipython --help                # run `$SHELL -c 'ipython --help'`
!python -m IPython --help      # run `ipython --help`

?                              # print IPython help within IPython

%lsmagic
%<tab>                         # list magic commands and aliases
%paste?                        # help for the %paste magic command
%logstart?                     # help for the %logstart magic command
%logstart -o logoutput.log.py  # log input and output to a file

import json
json?                          # print(json.__doc__)
json??                         # print(inspect.getsource(json))

## IPython shell
!cat ./README.rst; echo $PWD   # run shell commands
lines = !ls -al                # capture shell command output
print(lines[0:])
%run -i -t example.py          # run a script with timing info,
                               # in the local namespace
%run -d example.py             # run a script with pdb
%pdb on                        # automatically run pdb on Exception

  • If a kernel is not specified, IPython uses the ipykernel Jupyter kernel.

  • To use other kernels with IPython, you must install jupyter_console and a kernel:

    pip install jupyter_console  # conda install -y jupyter_console
    
    ipython console --kernel python  # ipykernel
    jupyter console --kernel python  # ipykernel
    # <Ctrl-D> | <Ctrl-C> | "exit()"
    
    pip install bash_kernel  # conda install -y bash_kernel
    jupyter console --kernel bash
    # "exit"
    
    conda install -y -c conda-forge xeus-cling
    jupyter console --kernel xcpp11
    # <Ctrl-D> (<Ctrl-Z> on Windows)
    
    conda install -y nodejs; npm install -g ijavascript; ijsinstall
    jupyter console --kernel javascript
    # <Ctrl-D>
    
    conda install -y nodejs; npm install -g jp-babel; jp-babel-install
    jupyter console --kernel babel
    # <Ctrl-D>
    
    conda install -y nodejs; npm install -g itypescript; its --install=local
    jupyter console --kernel typescript
    # <Ctrl-D>
    
    jupyter kernelspec list
    
  • There are very many Jupyter kernels: https://github.com/jupyter/jupyter/wiki/Jupyter-kernels

  • Jupyter Notebook and Jupyter Lab are built atop IPython: a Jupyter notebook file is a JSON file with an .ipynb extension which contains inputs and text and binary outputs.

PowerShell

Windows PowerShell is a shell for Windows.

Shell Utilities

Awk

AWK is a pattern programming language for matching and transforming text.

Grep

Grep is a commandline utility for pattern-based text matching.

Htop

Htop is a commandline task manager; like top extended.

Pyline

Pyline is an open source POSIX command-line utility for streaming line-based processing in Python with regex and output transform features similar to Grep, Sed, and Awk.

  • Pyline can generate quoted CSV, JSON, HTML, etc.

Pyrpo

Pyrpo is an open source POSIX command-line utility for locating and generating reports from Git, Mercurial, Bazaar, and Subversion repositories.

Sed

GNU Sed is an open source POSIX command-line utility for transforming text.

Note

BSD Sed

Use <Ctrl-V><tab> for explicit tabs (as \t does not work)

Use \\\n or $'\n' for newlines (as \n does not work)

sed -E should provide consistent extended regular expression syntax across GNU Sed (e.g. Linux) and BSD Sed (FreeBSD, OSX).

OR: brew install gnu-sed

See: https://unix.stackexchange.com/questions/101059/sed-behaves-different-on-freebsd-and-on-linux

See: https://superuser.com/questions/307165/newlines-in-sed-on-mac-os-x
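Where GNU and BSD escape handling differ, one portable option is to do the substitution in Python instead, where \t and \n behave the same on every platform. A minimal sketch (the function names are hypothetical):

```python
import re


def commas_to_tabs(line):
    """Replace each comma (and any whitespace after it) with a tab character."""
    return re.sub(r",\s*", "\t", line)


def semicolons_to_newlines(line):
    """Replace each semicolon with a newline character."""
    return line.replace(";", "\n")


if __name__ == "__main__":
    print(commas_to_tabs("a, b,c"))
    print(semicolons_to_newlines("x;y"))
```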

Web Shells

IPython Notebook

IPython Notebook (now Jupyter Notebook) is an open source web-based shell written in Python and Javascript for interactive and literate computing with IPython notebooks composed of raw, markdown, or code input and plaintext- or rich- output cells.

  • An IPython notebook (.ipynb) is a JSON document containing input and output for a linear sequence of cells; which can be exported to many output formats (e.g. HTML, RST, LaTeX, PDF); and edited through the web with IPython Notebook.

  • IPython Notebook is a webapp written on tornado, an asynchronous web application framework for Python.

    • seeAlso: westurner/brw (2007-)

  • IPython Notebook supports Markdown syntax for comment cells.

  • IPython Notebook supports more than 40 different IPython kernels for other languages:

    https://github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages

  • IPython Notebook development has now moved to Jupyter Notebook; which supports IPython kernels (and defaults to the IPython CPython 2 or 3 kernel).

To start IPython Notebook (assuming the _SRC variable as defined in Venv):

pip install ipython[notebook]
# pip install -e git+https://github.com/ipython/ipython@rel-3.2.1#egg=ipython
# https://github.com/ipython/ipython/releases

mkdir $_SRC/notebooks; cd $_SRC/notebooks
ipython notebook

ipython notebook --notebook-dir="${_SRC}/notebooks"

# With HTTPS (TLS/SSL)
ipython notebook \
 --ip=127.0.0.1 \
 --certfile=mycert.pem \
 --keyfile=privkey.pem \
 --port=8888 \
 --browser=web  # (optional) westurner/dotfiles/scripts/web

 # List supported options
 ipython notebook --help

Warning

IPython Notebook runs code and shell commands as the user the process is running as, on a remote or local machine.

Reproducible SciPy Stack IPython Notebook / Jupyter Notebook servers implement best practices like process isolation and privilege separation with e.g. Docker and/or Jupyter Hub.

Note

IPython Notebook is now Jupyter Notebook.

Jupyter Notebook runs Python notebooks with ipykernel, the IPython Python kernel from IPython Notebook

ipython_nose

ipython_nose is an extension for IPython Notebook (and likely Jupyter Notebook) for discovering and running test functions starting with test_ (and unittest.TestCase test classes with names containing Test) with Nose.

  • ipython_nose is not (yet?) uploaded to PyPI

  • to install ipython_nose from GitHub (with Pip and Git):

pip install -e git+https://github.com/taavi/ipython_nose#egg=ipython_nose

nosebook

nosebook is a tool for finding and running tests in nbformat IPython Notebooks and Jupyter Notebooks with nose.

Jupyter

Project Jupyter expands upon components like IPython and IPython Notebook to provide a multi-user web-based shell for many languages (Python, Ruby, Java, Haskell, Julia, R).

IPython Jupyter comparison (adapted from http://jupyter.org)

IPython:

  • Interactive Python shell

  • Python kernel for Jupyter

  • Interactive Parallel Python

Jupyter:

  • Rich REPL Protocol

  • Jupyter Notebook (format, environment, conversion)

  • JupyterHub (multi-user notebook server)

  • JupyterHub authenticators (MediaWiki OAuth, GitHub OAuth)

  • JupyterHub spawners (Docker, Sudo, Remote, Docker Swarm)

Jupyter Notebook

Jupyter Notebook is an open source shell webapp written in Python and Javascript for interactive and literate computing with Jupyter notebooks composed of raw, markdown, or code input and plaintext- or rich- output cells.

  • .ipynb files are Jupyter Notebooks saved as JSON documents.

  • A Jupyter notebook is a document containing {meta, input, and output} records for a linear sequence of cells; which can be exported to many output formats (e.g. HTML, RST, LaTeX, PDF, Python, MyST Markdown); and edited through the web with Jupyter Notebook.

  • Jupyter Notebook is a webapp written on tornado, an asynchronous web application framework for Python.

    • seeAlso: westurner/brw (2007-)

  • Jupyter Notebook supports Markdown syntax for comment cells.

  • Jupyter Notebook supports more than 40 different Jupyter kernels for other languages:

    https://github.com/jupyter/jupyter/wiki/Jupyter-kernels

To start Jupyter Notebook (assuming the _SRC variable as defined in Venv):

pip install notebook
# https://github.com/jupyter/notebook/releases

mkdir $_SRC/notebooks; cd $_SRC/notebooks
jupyter notebook

jupyter notebook --notebook-dir="${_SRC}/notebooks"

# With HTTPS (TLS/SSL)
jupyter notebook \
 --ip=127.0.0.1 \
 --certfile=mycert.pem \
 --keyfile=privkey.pem \
 --port=8888 \
 --browser=web  # (optional) westurner/dotfiles/scripts/web

# List supported options
jupyter notebook --help

Warning

Jupyter Notebook runs code and shell commands as the user the process is running as, on a remote or local machine.

Reproducible SciPy Stack IPython Notebook / Jupyter Notebook servers implement best practices like process isolation and privilege separation with e.g. Docker and/or Jupyter Hub.

Note

JupyterLab (a mostly-rewrite) adds e.g. tabs and undo (and a new extension API) to Jupyter Notebook.

JupyterLab

JupyterLab is an open-source web-based tabbed IDE written in Python, Javascript, and TypeScript for working with Jupyter notebooks, terminals, and text files, with support for undo and extensions.

  • You can edit Jupyter notebooks with JupyterLab.

  • Installing JupyterLab also installs Jupyter Notebook (which doesn’t support tabs or the new extension API).

  • A few UI differences between JupyterLab and Jupyter Notebook:

    • JupyterLab has tabbed editing: you can open files, notebooks, and terminals in tabs

    • JupyterLab has a sidebar with a file selector pane

  • Installing JupyterLab does not install any SciPy Stack or other packages.

Install JupyterLab

Install JupyterLab With Pip:

python -m pip install jupyterlab

Install JupyterLab with Conda:

conda install -c conda-forge -y jupyterlab

Hosting JupyterLab

You can host JupyterLab yourself:

Hosted JupyterLab

There are many providers of hosted JupyterLab and Jupyter Notebook; where they run Jupyter in a shell or a VM on their servers for you and you connect over your internet connection.

JupyterHub

JupyterHub makes it easy to serve Jupyter Notebook and/or Jupyter Lab for multiple users on one or more servers.

  • JupyterHub spawns individual Jupyter Notebook / JupyterLab server instances for logged-in users.

  • JupyterHub enables users to log-in with Authenticator backends: system users, LDAP, SSO, OAuth (e.g. Google accounts)

  • If so configured, JupyterHub can launch additional servers to serve one or more Notebook/Lab Docker containers and then shut those down when they’re idle or, for example, when a course session is complete.

nbconvert

nbconvert is the code that converts (transforms) an .ipynb notebook (nbformat JSON) file into an output representation (e.g. HTML, HTML slides (reveal.js), LaTeX, PDF, ePub, Mobi).

  • nbconvert is included with Jupyter Notebook and JupyterLab

    pip install nbconvert
    # pip install -e git+https://github.com/jupyter/nbconvert@master#egg=nbconvert
    
    jupyter nbconvert --to html mynotebook.ipynb
    
reveal.js

reveal.js is a Javascript and HTML library for slide presentations served from an HTML file.

  • Reveal.js slides can be in a 1-dimensional or a 2-dimensional arrangement.

  • You can generate reveal.js slides from Jupyter notebooks in two ways: with nbconvert --to slides or with the GUI: “File” > “Export Notebook As…” > “Export Notebook to reveal.js slides”

    jupyter nbconvert --to slides mynotebook.ipynb
    

    Note

    Presentation content that doesn’t fit on a slide is hidden and unscrollable: only put a slide worth of data in each cell for a Jupyter reveal.js presentation.

    Alternatives to presenting notebooks as reveal.js slides:

    • Increase the browser font size (Jupyter Notebook)

    • “View” > “Presentation Mode” (JupyterLab)

    • Select a keyboard shortcut set and use the “Select Cell Below” / “Select Cell Above” keyboard shortcuts to highlight cells and scroll them into view

      • Press “<Escape>”

      • Press “j” to “Select Cell Below”

      • Press “k” to “Select Cell Above”

  • The RISE extension also generates reveal.js slides.

RISE

RISE is a Jupyter Notebook and JupyterLab extension that generates live reveal.js presentations from Jupyter notebooks.

  • Install the RISE extension

  • Click the RISE button to generate a live reveal.js slide presentation wherein you can execute cells on the slides with “Ctrl-Enter” and “Shift-Enter” just like you can in the Notebook interface.

nbformat

The Jupyter Notebook (.ipynb) format is a versioned JSON format for storing metadata and input/output sequences.

Usually, when the nbformat changes, notebooks are silently upgraded to the new version on the next save.

Note

nbformat v3 and above add a kernelspec attribute to the nbformat JSON, because .ipynb files can now contain code for languages other than Python.

  • nbformat does not specify any schema for the user-supplied metadata dict (TODO: nbmeta), so JSON that conforms to an externally managed JSON-LD @context would work.
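As an illustration of the format described above, here is a minimal sketch that builds an nbformat v4-style notebook by hand with only the Python standard library. The metadata key `example_meta` and its JSON-LD-style value are hypothetical; they only show that the metadata dict accepts arbitrary JSON.

```python
import json

# A minimal nbformat v4-style notebook, built by hand (no nbformat library).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {
        # kernelspec names the kernel that should run the code cells
        "kernelspec": {
            "name": "python3",
            "display_name": "Python 3",
            "language": "python",
        },
        # hypothetical user-supplied metadata (an assumption for illustration)
        "example_meta": {"@context": "https://schema.org/", "@type": "CreativeWork"},
    },
    "cells": [
        {"cell_type": "markdown", "id": "cell-1", "metadata": {}, "source": "# Title"},
        {
            "cell_type": "code",
            "id": "cell-2",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": "print('hello')",
        },
    ],
}

# Serialize to the JSON text that would be stored in an .ipynb file
notebook_json = json.dumps(notebook, indent=1)
roundtrip = json.loads(notebook_json)
```

To validate a real notebook against the schema, the nbformat package is the better tool; this sketch only shows the overall shape of the JSON.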

nbviewer

nbviewer is an application for serving read-only versions of Jupyter notebooks from HTTP URLs.

  • When you enter a URL, GitHub username, GitHub username/repo, or Gist ID into the text box at https://nbviewer.jupyter.org/ and click ‘Go!’ (or press Enter), nbviewer nbconverts the notebook to HTML or shows a file browser and branch/tag selector for the git repo.

  • You do not need to look up the raw GitHub URL for the notebook, because nbviewer automatically rewrites the GitHub /blob/ file URL to a raw.githubusercontent.com URL.

  • GitHub now also renders static .ipynb files, CSV, SVG, and PDF. However, GitHub does not execute any JS in the notebook due to security concerns (XSS)

  • GitLab renders Jupyter notebooks with JS.
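The /blob/ to raw URL rewrite mentioned above can be sketched as follows. This is not nbviewer's actual code; `github_blob_to_raw` is a hypothetical helper that assumes the usual github.com/<user>/<repo>/blob/<ref>/<path> URL shape.

```python
from urllib.parse import urlsplit


def github_blob_to_raw(url):
    """Rewrite a github.com /blob/ file URL to a raw.githubusercontent.com URL."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # expected shape: <user>/<repo>/blob/<ref>/<path...>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        return url  # not a GitHub blob URL: leave it unchanged
    user, repo, _blob, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([user, repo] + rest)


raw_url = github_blob_to_raw("https://github.com/jupyter/nbviewer/blob/main/README.md")
print(raw_url)
```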

runipy

runipy runs Jupyter notebooks from a Shell commandline, generates HTML reports, and can write errors to stderr.

Jupyter notebook manual test review process:

# - run Jupyter Notebook server
!jupyter notebook
# - Browser
#     - navigate to / upload / drag and drop the notebook
        !web http://localhost:8888   # or https://
#     - (optional) click 'TODO Restart Kernel'
#     - (optional) click 'Cell' > 'All Output' > 'Clear'
#     - click 'Cell' > 'Run All'
#     - [wait] <Jupyter Kernel runs notebook>
#     - visually seek for the first ERRoring cell (scroll)
#     - review the notebook
        for (i, o) in notebook_cells:
            human.manually_review((i, o))
# - Compare the files on disk with the most recent commit (HEAD)
!git status && git diff
!git diff mynotebook.ipynb
# - Commit the changes
!git-add-commit "TST: mynotebook: tests for #123" ./mynotebook.ipynb

Jupyter notebook runipy review process:

# - runipy the Jupyter notebook
!runipy mynotebook.ipynb
# - review stdout and stderr from runipy
# - review in browser (optional; recommended)
#     - navigate to the converted HTML
        !web ./mynotebook.ipynb.html
#     - visually seek for the first ERRoring cell (scroll)
#     - review the notebook
        for (i, o) in notebook_cells:
            human.manually_review((i, o))
# - Compare the files on disk with the most recent commit (HEAD)
!git status && git diff
!git diff mynotebook.ipynb*
# - Commit the changes
!git-add-commit "TST: mynotebook: tests for #123" ./mynotebook.ipynb*
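The "seek for the first ERRoring cell" step can also be scripted. A minimal sketch, assuming an executed nbformat v4 notebook in which failed cells carry an output with output_type "error"; `first_error_cell` is a hypothetical helper, and the example notebook dict is made up for illustration.

```python
import json


def first_error_cell(notebook):
    """Return (index, ename, evalue) for the first code cell with an error output."""
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            if output.get("output_type") == "error":
                return index, output.get("ename"), output.get("evalue")
    return None  # no erroring cell found


# Made-up example; for a real file: notebook = json.load(open("mynotebook.ipynb"))
example = {
    "cells": [
        {"cell_type": "code", "source": "1 + 1", "outputs": []},
        {
            "cell_type": "code",
            "source": "1 / 0",
            "outputs": [
                {
                    "output_type": "error",
                    "ename": "ZeroDivisionError",
                    "evalue": "division by zero",
                    "traceback": [],
                }
            ],
        },
    ]
}
result = first_error_cell(example)
print(result)
```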

Google Colab

Google Colab is a hosted Jupyter Notebook system.

  • Colab has a number of packages installed in the default environment. If you want additional packages, you need to !pip install them once when you first open the notebook.

  • Colab is forked from a previous version of Jupyter Notebook, and so lacks some newer Jupyter Notebook features and all JupyterLab features.

  • ipywidgets are not yet implemented on Colab.

  • Colab saves to Google Drive.

  • Colab instances are free and can use some GPU time if needed.

  • There is a Colab Pro.

  • Google AI Platform Notebooks hosts JupyterLab notebooks: https://cloud.google.com/ai-platform-notebooks

Dotfiles

Dotfiles are user-level configuration files whose names often begin with a dot (e.g. ~/.bashrc for Bash).

Venv

Venv is a tool that makes it easier to work with Virtualenv, Virtualenvwrapper, Bash, ZSH, Vim, and IPython within a project context.

Venv defines standard Filesystem Hierarchy Standard (FHS) and Python paths, environment variables, and aliases for routinizing workflow.

Venv paths and cdaliases

Help for cdaliases: cdhelp (Bash), %cdhelp (IPython), :Cdhelp (Vim)

===============  ==============================  ========================  ==================================
var name         description                     cdaliases                 example path
===============  ==============================  ========================  ==================================
HOME             user home directory             Bash/ZSH: cdh, cdhome     ~/
                                                 IPython: %cdh, %cdhome
                                                 Vim: :Cdh, :Cdhome
__WRK            workspace root                  cdwrk (ibid.)             ~/-wrk
WORKON_HOME      virtualenvs root                cdwh, cdworkonhome, cdve  ~/-wrk/-ve27
CONDA_ENVS_PATH  condaenvs root                  cdch, cdcondahome         ~/-wrk/-ce27
VIRTUAL_ENV      virtualenv root                 cdv, cdvirtualenv         ~/-wrk/-ve27/dotfiles
_BIN             virtualenv executables          cdb, cdbin                ~/-wrk/-ve27/dotfiles/bin
_ETC             virtualenv configuration        cde, cdetc                ~/-wrk/-ve27/dotfiles/etc
_LIB             virtualenv lib directory        cdl, cdlib                ~/-wrk/-ve27/dotfiles/lib
_LOG             virtualenv log directory        cdlog                     ~/-wrk/-ve27/dotfiles/var/log
_SRC             virtualenv source repositories  cds, cdsrc                ~/-wrk/-ve27/dotfiles/src
_WRD             virtualenv working directory    cdw, cdwrd                ~/-wrk/-ve27/dotfiles/src/dotfiles
===============  ==============================  ========================  ==================================

To generate this venv config:

python -m dotfiles.venv.ipython_config --print-bash dotfiles
venv.py --print-bash dotfiles
venv --print-bash dotfiles docs
venv --print-bash dotfiles ~/path
venv --print-bash ~/-wrk/-ve27/dotfiles ~/path

To generate a default venv config with a prefix of /:

venv --print-bash --prefix=/

To launch an interactive shell within a venv:

venv --run-bash dotfiles
venv -xb dotfiles
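The path layout in the “Venv paths and cdaliases” listing above can be sketched in a few lines of Python. This is not the dotfiles venv implementation; `venv_paths` is a hypothetical helper that only reproduces the example paths from that listing.

```python
import posixpath


def venv_paths(workon_home, appname):
    """Derive the per-virtualenv paths listed above (illustrative only)."""
    virtual_env = posixpath.join(workon_home, appname)
    paths = {
        "VIRTUAL_ENV": virtual_env,
        "_BIN": posixpath.join(virtual_env, "bin"),
        "_ETC": posixpath.join(virtual_env, "etc"),
        "_LIB": posixpath.join(virtual_env, "lib"),
        "_LOG": posixpath.join(virtual_env, "var", "log"),
        "_SRC": posixpath.join(virtual_env, "src"),
    }
    # the working directory is the project's source checkout under _SRC
    paths["_WRD"] = posixpath.join(paths["_SRC"], appname)
    return paths


paths = venv_paths("~/-wrk/-ve27", "dotfiles")
for name, path in paths.items():
    print(f"{name}={path}")
```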

Note

pyvenv is the Virtualenv-like functionality included in Python >= 3.3 (python3 -m venv).

Python pyvenv docs: https://docs.python.org/3/library/venv.html

Virtualenv

Virtualenv is a tool for creating reproducible Python environments.

Virtualenv sets the shell environment variable $VIRTUAL_ENV when active.

Virtualenv installs a copy of Python, Setuptools, and Pip when a new virtualenv is created.

A virtualenv is activated by source-ing ${VIRTUAL_ENV}/bin/activate.

Paths within a virtualenv are more-or-less FHS standard paths, which makes virtualenv structure very useful for building chroot and container overlays.

A standard virtual environment:

bin/           # pip, easy_install, console_scripts
bin/activate   # source bin/activate to work on a virtualenv
include/       # (symlinks to) dev headers (python-dev/python-devel)
lib/           # libraries
lib/python2.7/distutils/
lib/python2.7/site-packages/  # pip and easy_installed packages
local/         # symlinks to bin, include, and lib
src/           # editable requirements (source repositories)

# also useful
etc/           # configuration
var/log        # logs
var/run        # sockets, PID files
tmp/           # mkstemp temporary files with permission bits
srv/           # local data

Virtualenvwrapper wraps virtualenv.

echo $PATH; echo $VIRTUAL_ENV
python -m site; pip list

virtualenv example               # mkvirtualenv example
source ./example/bin/activate    # workon example

echo $PATH; echo $VIRTUAL_ENV
python -m site; pip list

ls -altr $VIRTUAL_ENV/lib/python*/site-packages/**  # lssitepackages -altr

Note

Venv extends Virtualenv and Virtualenvwrapper.

Note

Python 3.3+ also includes a standard library module called venv (python3 -m venv), which works similarly to virtualenv: https://docs.python.org/3/library/venv.html.
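The standard library venv module can also be called from Python. A minimal sketch, assuming a writable temporary directory; `create_env` is a hypothetical helper name, and pip is deliberately not installed so that the example runs offline.

```python
import tempfile
import venv
from pathlib import Path


def create_env(parent_dir, name="example"):
    """Create a virtual environment with the standard library venv module."""
    env_dir = Path(parent_dir) / name
    # with_pip=False skips bootstrapping pip (faster, and works offline)
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_env(tmp)
    created = sorted(p.name for p in env_dir.iterdir())
    has_cfg = (env_dir / "pyvenv.cfg").is_file()
    print(created)
```

The listing printed here depends on the platform (e.g. bin/ on Linux and OS X, Scripts/ on Windows).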

Virtualenvwrapper

Virtualenvwrapper is a tool which extends Virtualenv.

Virtualenvwrapper provides a number of useful shell commands and python functions for working with and within virtualenvs, as well as project event scripts (e.g. postactivate, postmkvirtualenv) and two filesystem configuration variables useful for structuring development projects of any language within virtualenvs: $PROJECT_HOME and $WORKON_HOME.

Virtualenvwrapper is sourced into the shell:

# pip install --user --upgrade virtualenvwrapper
source ~/.local/bin/virtualenvwrapper.sh

# sudo apt-get install virtualenvwrapper
source /etc/bash_completion.d/virtualenvwrapper

Note

Venv extends Virtualenv and Virtualenvwrapper.

echo $PROJECT_HOME; echo ~/workspace             # venv: ~/-wrk
cd $PROJECT_HOME                                 # venv: cdp; cdph
echo $WORKON_HOME;  echo ~/.virtualenvs          # venv: ~/-wrk/-ve27
cd $WORKON_HOME                                  # venv: cdwh; cdwrk

mkvirtualenv example
workon example                                   # venv: we example

cdvirtualenv; cd $VIRTUAL_ENV                    # venv: cdv
echo $VIRTUAL_ENV; echo ~/.virtualenvs/example   # venv: ~/-wrk/-ve27/example

mkdir src ; cd src/                              # venv: cds; cd $_SRC

pip install -e git+https://github.com/westurner/dotfiles#egg=dotfiles

cd src/dotfiles; cd $VIRTUAL_ENV/src/dotfiles    # venv: cdw; cds dotfiles
head README.rst

                                                 # venv: cdpylib
cdsitepackages                                   # venv: cdpysite
lssitepackages

deactivate
rmvirtualenv example

lsvirtualenvs; ls -d $WORKON_HOME                # venv: lsve; lsve 'ls -d'

Window Managers

Compiz

Compiz is a window compositing layer for X11 which adds lots of cool and productivity-enhancing visual capabilities.

Compiz works with Gnome, KDE, and Qt applications.

f.lux

f.lux is a userspace utility for gradually adjusting the blue color channel throughout the day; or as needed.

  • A similar effect can be accomplished with the X11 xgamma command (e.g. for Linux platforms where the latest f.lux is not yet available). A few keybindings from an i3wm configuration here:

    # [...] #L105
    set $xgamma_reset    xgamma -gamma 1.0
    set $xgamma_soft     xgamma -bgamma 0.6 -ggamma 0.9 -rgamma 0.9
    set $xgamma_soft_red xgamma -bgamma 0.4 -ggamma 0.6 -rgamma 0.9
    # [...] #L200
    ## Start, stop, and reset xflux
    #  <alt> [         -- start xflux
    bindsym $mod+bracketleft    exec --no-startup-id $xflux_start
    #  <alt> ]         -- stop xflux
    bindsym $mod+bracketright   exec --no-startup-id $xflux_stop
    #  <alt><shift> ]  -- reset gamma to 1.0
    bindsym $mod+Shift+bracketright  exec --no-startup-id $xgamma_reset
    #  <alt><shift> [  -- xgamma -bgamma 0.6 -ggamma 0.9 -rgamma 0.9
    bindsym $mod+Shift+bracketleft exec --no-startup-id $xgamma_soft
    #  <alt><shift> p  -- xgamma -bgamma 0.4 -ggamma 0.6 -rgamma 0.9
    bindsym $mod+Shift+p exec --no-startup-id $xgamma_soft_red
    

i3wm

i3wm is a tiling window manager for X11 (Linux) with extremely-configurable Vim-like keyboard shortcuts.

i3wm works with Gnome, KDE, and Qt applications.

Qt

Qt is a Graphical User Interface toolkit for developing applications for Android, iOS, OS X, Windows, Embedded Linux, and X11.

Wayland

Wayland is a display server protocol for GUI window management.

Wayland is an alternative to X11 servers like XFree86 and X.org.

The reference Wayland implementation, Weston, is written in C.

X11

Src: git git://anongit.freedesktop.org/git/xorg/

X Window System (X, X11) is a display server protocol for window management (drawing windows on the screen).

Most UNIX and Linux systems use XFree86 or the newer X.org X11 server.

Gnome, KDE, i3wm, OS X, and Compiz build upon X11.

Browsers

Chromium

The Chromium Projects include the Chromium Browser and ChromiumOS.

Chrome DevTools

How to open Chrome (and Firefox) DevTools:

  • Right-click > “Inspect Element”

  • Linux: <ctrl><shift>i

  • OSX: <option><command>i

DevTools Emulation

pbm

  • backup and organize { Chrome , Chromium } Bookmarks JSON in an offline batch

  • date-based transforms

  • quicklinks

  • starred bookmarks (with trailing ##)

Firefox Android

Firefox Android Extensions

Internet Explorer

Internet Explorer is the web browser included with Windows.

See also: Microsoft Edge

Microsoft Edge

Microsoft Edge replaced Internet Explorer as the default web browser included with Windows.

Opera

Opera is a multi-platform web browser written in C++.

  • Opera is now based on Blink.

  • Opera was based on WebKit.

  • Opera developed and open sourced Celery: a distributed task queue and workflow API written in Python, with support for many message brokers: https://github.com/celery

Safari iOS

Browser Extensions

Accessibility Extensions

Tiësto

The Tiësto Chrome Theme is a Dark Theme for Chrome.

Safety Extensions

uBlock

_repo="chrisaljoudi/ublock"
curl -Ls "https://api.github.com/repos/${_repo}/releases" > ./releases.json
cat releases.json \
    | grep browser_download_url \
    | pyline 'w and w[1][1:-1]' \
    | pyline --regex \
        '.*download/(.*)/(uBlock.(firefox.xpi|chromium.zip))$' \
        'rgx and rgx.group(1,2)'
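The same filtering can be sketched in Python without pyline. This assumes the GitHub releases API layout (a list of releases, each with an "assets" list whose items have a "browser_download_url"); the sample data and the v0.0.0 tag are made up for illustration, and `release_assets` is a hypothetical helper.

```python
import json
import re

# Same pattern as in the shell pipeline above, with the dots escaped
ASSET_RGX = re.compile(r".*download/(.*)/(uBlock\.(firefox\.xpi|chromium\.zip))$")


def release_assets(releases):
    """Yield (tag, filename) for each matching browser_download_url."""
    for release in releases:
        for asset in release.get("assets", []):
            match = ASSET_RGX.match(asset.get("browser_download_url", ""))
            if match:
                yield match.group(1, 2)


# Made-up sample; for real data: releases = json.load(open("releases.json"))
sample = json.loads("""
[{"tag_name": "v0.0.0", "assets": [
  {"browser_download_url":
   "https://github.com/chrisaljoudi/ublock/releases/download/v0.0.0/uBlock.firefox.xpi"},
  {"browser_download_url":
   "https://github.com/chrisaljoudi/ublock/releases/download/v0.0.0/source.tar.gz"}
]}]
""")
found = list(release_assets(sample))
print(found)
```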

Content Extensions

Hypothesis

Hypothesis can also be included as a sidebar on a site:

<script async defer src="//hypothes.is/embed.js"></script>

Zotero

Zotero archives and tags resources with bibliographic metadata.

  • Zotero is really helpful for research.

  • Browsers other than Firefox connect to Zotero Standalone

  • Zotero can store a full-page archive of a given resource (e.g. HTML, PDF)

  • Zotero can store and synchronize data on Zotero’s servers with Zotero File Storage

  • Zotero can store and synchronize data over WebDAV

  • Zotero can export a collection of resources’ bibliographic metadata in one of many citation styles (“CSL”) (e.g. MLA, APA, [Journal XYZ])

  • Zotero can export a collection of resources’ bibliographic metadata as RDF

  • There are a number of plugins and integrations with Zotero:

    https://www.zotero.org/support/plugins

[ ] Zotero and Schema.org RDFa

> How would I go about adding HTML + RDFa [1] and/or HTML + Microdata [2] export templates with Schema.org classes and properties to Zotero?

Development Extensions

Requirify

Requirify adds NPM modules to the local namespace (e.g. from Chrome DevTools JS console).

> require() npm modules in the browser console

Local-requirify

Require local NPM modules with Requirify

Vim Extensions

Vimium

Vimium is a Chrome Extension which adds Vim-like functionality.

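Vimium shortcuts

======================================  ===============
function                                vimium shortcut
======================================  ===============
help                                    ?
jump to link in current/New tab         f / F
copy link to clipboard                  yf
open clipboard link in current/New tab  p / P
======================================  ===============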

Vimperator

Vimperator connects a JS shell with VIM command interpretation to the Firefox API, with Vim-like functionality.

  • vimperatorrc can configure settings in about:config

  • Vimperator stopped working after Firefox 57

Web Servers

Apache HTTPD

Apache HTTPD is a scriptable, industry-mainstay HTTP server written in C and C++.

BusyBox HTTPD

busybox httpd --help
busybox httpd -p 8082

See also: Python http.server

Netcat web server

# Serve a file over HTTP then close
{ printf 'HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n' "$(wc -c < some.file)"; cat some.file; } | nc -l 8082

# Serve the date over HTTP then start another server
while true ; do nc -l -p 8082 -c 'echo -e "HTTP/1.1 200 OK\n\n $(date -Is)"'; done &

# Make an HTTP request with netcat
printf "GET / HTTP/1.0\r\nHost: localhost\r\n\r\n" | nc localhost 8082

# Make an HTTP request with curl
curl localhost:8082
curl -v localhost:8082

# Make an HTTP request with wget
wget -O - localhost:8082
wget -d -O - localhost:8082

# Make an HTTP request with Python
from urllib.request import urlopen
resp = urlopen("http://localhost:8082")
assert resp.code == 200
assert resp.headers.get_content_type() == 'text/plain'
body = resp.read()
print(body)

ncat

nc --help
ncat --help

Nginx

Nginx is a scriptable, lightweight HTTP server written in C.

Python http.server

python -m http.server --help
python -m http.server --directory . 8082
python -m http.server --directory . --cgi 8082
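The http.server module can also be used programmatically. A minimal sketch that starts a server on an OS-assigned port, makes one request, and shuts down; `HelloHandler` and the "hello" body are illustrative assumptions.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener


class HelloHandler(BaseHTTPRequestHandler):
    """Answer every GET with a short plain-text body."""

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


# Port 0 asks the OS for any free port (instead of hard-coding 8082)
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
port = server.server_address[1]
thread = threading.Thread(target=server.serve_forever, daemon=True)
thread.start()

# An opener with no proxies, so that the request always goes to localhost
opener = build_opener(ProxyHandler({}))
with opener.open(f"http://127.0.0.1:{port}/") as resp:
    status = resp.status
    content_type = resp.headers.get_content_type()
    body = resp.read()

server.shutdown()
server.server_close()
print(status, content_type, body)
```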

See also: Pgs

Documentation Tools

Docutils

Docutils is a Python library which parses ReStructuredText lightweight markup language into a doctree (~DOM) which can be serialized into HTML, ePub, MOBI, LaTeX, man pages, Open Document files, XML, JSON, and a number of other formats.

Pandoc

Pandoc is a “universal” markup converter written in Haskell which can convert between HTML, BBCode, Markdown, MediaWiki Markup, ReStructuredText, and a number of other formats.

Pgs

pgs is an open source web application written in Python for serving static files from a Git branch, or from the local filesystem.

pgs -p "${_WRD}/_build/html" -r gh-pages -H localhost -P 8082

  • pgs is written with the one-file Bottle web framework

  • Compared to python -m SimpleHTTPServer 8000 / python3 -m http.server 8000, pgs has WSGI, the ability to read from a Git branch without real Git bindings, and caching HTTP headers based on Git or filesystem mtimes.

  • pgs does something like Nginx try_files $.html

Sphinx

Sphinx is a tool for working with ReStructuredText documentation trees and rendering them into HTML, PDF, LaTeX, ePub, and a number of other formats.

Sphinx extends Docutils with a number of useful markup behaviors which are not supported by other ReStructuredText parsers.

Most other ReStructuredText parsers do not support Sphinx directives.

Sphinx Builder

A Sphinx Builder transforms ReStructuredText into various output forms:

  • HTML

  • LaTeX

  • PDF

  • ePub

  • MOBI

  • JSON

  • OpenDocument (OpenOffice)

  • Office Open XML (MS Word)

See: Sphinx Builders

Sphinx ReStructuredText

Sphinx extends ReStructuredText with roles and directives which only work with Sphinx.

Sphinx Directive

Sphinx extensions of Docutils ReStructuredText directives.

Most other ReStructuredText parsers do not support Sphinx directives.

.. toctree::

   readme
   installation
   usage

See: Sphinx Directives

Sphinx Role

Sphinx extensions of Docutils ReStructuredText roles.

Most other ReStructuredText parsers do not support Sphinx roles.

.. _anchor-name:

A link to :ref:`anchor <anchor-name>`.

Tinkerer

Tinkerer is a very simple static blogging website generation tool written in Python which extends Sphinx and generates HTML from ReStructuredText.

Static HTML pages generated with Tinkerer do not require a serverside application, and can be easily hosted with GitHub Pages or any other web hosting service.

Backup Tools

Backup Ninja

Backup Ninja is an open source backup utility, written in Bash, that is configured with files in /etc/backup.d.

  • BackupNinja supports rdiff-backup, Duplicity, and rsync.

  • BackupNinja can create and burn CD/DVD images.

  • BackupNinja can backup a number of relational databases (MySQL, PostgreSQL), maildirs, SVN repositories, Trac instances, and LDAP.

bup

Bup (backup) is a backup system based on Git packfiles and rolling checksums.

[Bup is a very] efficient backup system based on the git packfile format, providing fast incremental saves and global deduplication (among and within files, including virtual machine images).

  • AFAIU, like Git, Bup does not preserve file permissions, Access Control Lists, or extended attributes (though some archive formats and snapshot images do).

Clonezilla

Clonezilla is an open source Linux distribution which is bootable from a CD/DVD/USB (a LiveCD, LiveDVD, LiveUSB) or PXE which contains a number of tools for disk imaging, disk cloning, filesystem backup and recovery; and a server Linux distribution for serving disk images to one or more computers over a LAN.

  • Clonezilla contains FSArchiver, partclone, partimage, and rsync.

  • Clonezilla can backup and restore very many (if not most) filesystems.

  • Clonezilla supports MBR, GPT, and UEFI.

  • Clonezilla can restore a networked multicast group (e.g. lab) of machines to a system image (saving TCP overhead when sharing the same multi-gigabyte / terabyte image to zero or more machines); and boot them with PXE and/or Wake-on-Lan.

    • bup, debtorrent

  • Clonezilla can backup to disk, ssh, samba, NFS, WebDAV

  • drbl-winroll helps with restoring Windows images

  • SystemRescueCD also contains partimage.

  • Cobbler also supports PXE boot from images.

Duplicity

Duplicity is an open source incremental file directory backup utility with GnuPG encryption, signatures, versions, and a number of actions for redundantly storing backups.

  • Duplicity can push offsite backups to/over a number of protocols and services (e.g. SSH/SCP/SFTP, S3, Google Cloud Storage, Rackspace Cloudfiles (OpenStack Swift)).

  • Duplicity stores data with tar archives and rdiff

  • rdiff-backup is similar to Duplicity.

FSArchiver

FSArchiver is an open source filesystem backup (disk cloning) utility which can preserve file permissions, labels, and extended attributes.

  • FSArchiver can back up a filesystem to a new filesystem or to a file within an existing filesystem.

  • FSArchiver has special support for LVM.

  • FSArchiver supports password-based encryption.

partclone

partclone is an open source utility for making compressed backups of the used blocks of partitions with each specific filesystem driver.

partimage

Partimage is an open source utility for making complete sector-for-sector compressed backups of partitions over the network or to a local device.

rclone

Rclone is an open source utility for managing files on local disk and on remote and cloud storage like SFTP, WebDAV, Dropbox, and Google Drive.

  • Rclone supports many cloud storage providers.

rsync

rsync is an open-source file backup utility which can be used to make incremental backups using file deltas over the network or the local system.

  • rsync may appear to be stalled when it is actually calculating the full set of initial relative differences in order to minimize the amount of data transfer.

Note

rsync does not preserve file permissions by default.

To preserve file permissions with rsync:

man rsync

rsync -a    # rsync -rlptgoD
  rsync -r  # recursive (traverse into directories)
  rsync -l  # copy symlinks as links
  rsync -p  # preserve file permissions
  rsync -t  # preserve modification times
  rsync -g  # preserve group
  rsync -o  # preserve owner (requires superuser)
  rsync -D  # rsync --devices --specials
    rsync --devices   # preserve device files (requires superuser)
    rsync --specials  # preserve special files
rsync -A  # preserve file ACLs
rsync -X  # preserve file extended attributes

rsync -aAX  # rsync -a -A -X

rsync -v  # verbose
rsync -P  # rsync --partial --progress
  rsync --partial     # keep partially downloaded files
  rsync --progress    # show *per-file* progress and xfer speed

Note

rsync is picky about paths and trailing slashes.

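# setUp
mkdir -p A B
echo 'A' > A/one; echo 'B' > B/one
# tests (-r recurses into directories; without it rsync skips them)
rsync -r A B     # --> B/A/one  (no trailing slash: copies the directory A itself)
rsync -r A B/    # --> B/A/one
rsync -r A/ B    # --> B/one    (trailing slash: copies the contents of A)
rsync -r A/ B/   # --> B/one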

rdiff

rdiff is an open source command-line tool, part of librsync, that computes and applies binary deltas with the rsync rolling-checksum algorithm.
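The rolling checksum at the heart of rsync-style deltas can be sketched as follows. This is a simplified illustration of the weak checksum described in the rsync technical report, not the code used by rsync, rdiff, or librsync; the function names and the 16-byte window are assumptions.

```python
M = 1 << 16  # modulus used for both halves of the weak checksum


def weak_checksum(block):
    """Compute the (a, b) pair of an rsync-style weak checksum for a block."""
    n = len(block)
    a = sum(block) % M
    # earlier bytes get larger weights: n for the first byte, 1 for the last
    b = sum((n - i) * byte for i, byte in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, n):
    """Slide a window of length n one byte forward in constant time."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


data = bytes(range(256)) * 2
n = 16
a, b = weak_checksum(data[:n])
matches = True
for k in range(1, len(data) - n + 1):
    # drop data[k - 1] from the window and take in data[k + n - 1]
    a, b = roll(a, b, data[k - 1], data[k + n - 1], n)
    matches = matches and (a, b) == weak_checksum(data[k:k + n])
print(matches)
```

Because each step is constant time, the checksum can be evaluated at every byte offset of a file, which is what makes it practical to find matching blocks at arbitrary positions.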

rdiff-backup

rdiff-backup is an open source incremental file directory backup utility.

  • Like rsync, rdiff-backup transmits file deltas instead of entire files.

  • Unlike rsync, rdiff-backup manages reverting to previous revisions.

SystemRescueCD

SystemRescueCD is a Linux distribution which is bootable from a CD/DVD/USB (a LiveCD) which contains a number of helpful utilities for system maintenance.

Standards

CSS

CSS (Cascading Style Sheets) define the presentational aspects of HTML and a number of mobile and desktop web frameworks.

  • CSS is designed to ensure separation of data and presentation. With javascript, the separation is then data, code, and presentation.

Filesystem Hierarchy Standard

The Filesystem Hierarchy Standard (FHS) is a long-established, widely supported standard for the directory layout of UNIX-like systems (e.g. /bin, /etc, /var).

JSON

JSON is an object representation in Javascript syntax which is now supported by libraries written in many languages.

A list of objects with key and value attributes in JSON syntax:

[
{ "key": "language", "value": "Javascript" },
{ "key": "version", "value": 1 },
{ "key": "example", "value": true }
]

Machine-generated JSON is often not very readable, because it doesn’t contain extra spaces or newlines. The Python JSON library contains a utility for parsing and indenting (“prettifying”) JSON from the commandline:

cat example.json | python -m json.tool
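The same prettifying can be done from Python code with the json module. A minimal sketch; the compact input string is an example based on the JSON list above.

```python
import json

compact = '[{"key":"language","value":"Javascript"},{"key":"version","value":1}]'

# Parse the compact text, then serialize it again with indentation
data = json.loads(compact)
pretty = json.dumps(data, indent=4)
print(pretty)
```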

JSON5

JSON5 is JSON extended with support for a number of additional features: comments, trailing commas, IEEE 754 +/- infinity and NaN, hexadecimal numbers, leading and trailing decimal points, single-quoted strings, multiline strings, and escaped characters.

  • Regular JSON libraries do not support JSON5.

{
// comment
key:   [0, +1, 2., .3, NaN, +Infinity, -Infinity, 0xF, 'thing1', "thing2"],
"str": "this is a \
multi-line string", // trailing comma
}

JSON-lines

JSON-lines (newline-delimited JSON) is an informal spec for line-based processing of JSON e.g. for streaming records and unix pipes.

{"key": "red", "value": 1}
{"key": "green", "value": 2}
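Reading and writing JSON-lines takes only the standard json module, one document per line. A minimal sketch; `read_jsonl` and `write_jsonl` are hypothetical helper names.

```python
import io
import json


def read_jsonl(stream):
    """Yield one parsed JSON document per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def write_jsonl(records, stream):
    """Write each record as one line of JSON."""
    for record in records:
        stream.write(json.dumps(record) + "\n")


text = '{"key": "red", "value": 1}\n{"key": "green", "value": 2}\n'
records = list(read_jsonl(io.StringIO(text)))

out = io.StringIO()
write_jsonl(records, out)
print(records)
```

Because each line is a complete document, records can be streamed through a pipe and processed one at a time without loading the whole file.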

JSON-LD

JSON-LD is a web standard for Linked Data in JSON.

An example from the JSON-LD Playground (https://goo.gl/xxZ410):

{
   "@context": {
    "gr": "http://purl.org/goodrelations/v1#",
    "pto": "http://www.productontology.org/id/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "foaf:page": {
      "@type": "@id"
    },
    "gr:acceptedPaymentMethods": {
      "@type": "@id"
    },
    "gr:hasBusinessFunction": {
      "@type": "@id"
    },
    "gr:hasCurrencyValue": {
      "@type": "xsd:float"
    }
   },
   "@id": "http://example.org/cars/for-sale#tesla",
   "@type": "gr:Offering",
   "gr:name": "Used Tesla Roadster",
   "gr:description": "Need to sell fast and furiously",
   "gr:hasBusinessFunction": "gr:Sell",
   "gr:acceptedPaymentMethods": "gr:Cash",
   "gr:hasPriceSpecification": {
    "gr:hasCurrencyValue": "85000",
    "gr:hasCurrency": "USD"
   },
   "gr:includes": {
    "@type": [
      "gr:Individual",
      "pto:Vehicle"
    ],
    "gr:name": "Tesla Roadster",
    "foaf:page": "http://www.teslamotors.com/roadster"
   }
}
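In the example above, names like gr:Offering are compact IRIs that expand against the prefixes defined in @context. The sketch below shows only that one step; it is not a JSON-LD processor (a library such as PyLD implements the full expansion algorithm), and `expand_iri` is a hypothetical helper.

```python
def expand_iri(value, context):
    """Expand a compact IRI such as 'gr:Offering' using prefixes from context."""
    prefix, sep, suffix = value.partition(":")
    base = context.get(prefix)
    # Only string-valued context entries are treated as prefixes here;
    # absolute IRIs such as 'http://...' are returned unchanged.
    if sep and isinstance(base, str) and not suffix.startswith("//"):
        return base + suffix
    return value


context = {
    "gr": "http://purl.org/goodrelations/v1#",
    "pto": "http://www.productontology.org/id/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "gr:hasBusinessFunction": {"@type": "@id"},
}
expanded_type = expand_iri("gr:Offering", context)
expanded_page = expand_iri("foaf:page", context)
print(expanded_type)
print(expanded_page)
```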

MessagePack

MessagePack (msgpack) is a data interchange format with implementations in many languages.

Text Editors

Gedit

Gedit is an open source text editor written in C and Python (GTK, GtkSourceView, and Gnome) that’s available for Linux, OS X, and Windows.

  • Gedit supports tabbed editing.

  • Gedit plugins are written in Python.

  • Gedit is the default Gnome text editor; where it’s called “Text Editor”.

Notepad++

Notepad++ is an open source text editor written in C++ for Windows which has tabbed editing.

IDEs

An IDE (Integrated Development Environment) is a software tool for developing software.

  • Most IDEs are source code Text Editors.

  • Some IDEs are visual development tools for various types of non-code trees and graphs.

  • IDEs have a concept of a project, which may be defined in a config file in the current working directory or otherwise selected through the GUI.

  • An IDE has some sort of language server that understands the source code at a deeper level than syntax in order to do cool things like code completion and code refactorings like renaming a method in every file in the project.

  • https://en.wikipedia.org/wiki/Comparison_of_integrated_development_environments

Emacs

GNU Emacs is an open source text editor written in Emacs Lisp and C that’s available for Linux, OS X, and Windows.

org-mode

Org-mode is an open source document editing mode originally written in Emacs Lisp for Emacs that’s now available in some form for a number of editors including Vim.

  • Org-mode makes it really easy to work with outlines in plain text documents.

  • The org-mode wikipedia page lists a number of org-mode implementations for other editors.

org-babel

  • Babel makes it possible to execute source code in org-mode

  • Babel is also the name of an ECMAScript compiler

  • Jupyter Notebook with Jupytext and/or emacs and vim plugins for working with Jupyter are similar to Babel org-mode.

VSCode

VSCode (Visual Studio Code) is an open source programmer’s text editor written in TypeScript, Javascript, and CSS that’s available for Windows, Mac, and Linux.

  • VSCode extensions are written in Javascript.

  • VSCode has collaborative editing features with multiple cursors.

  • VSCode and MS Visual Studio are different projects.

  • VSCode supports many of the Visual Studio keyboard shortcuts.

  • There is an official Vim extension for VSCode.

  • In VSCode, Ctrl+Space opens the context-sensitive IntelliSense code completion

  • In VSCode, Ctrl+P opens the Quick Open dialog

  • In VSCode, Ctrl+Shift+P opens the Command Palette (which lists “all available commands based on your current context”)

You can install VSCode by downloading from the Download page or with Chocolatey:

choco install vscode

Vim

Vim (Vi IMproved) is an open source text editor written in C that’s available on very many platforms.

  • Vim help can be accessed with :help and :help help (Press <esc>, Type :help help, Press Enter)

  • Vi is almost always installed on Linux and BSD boxes.

  • Vi is often included with Busybox.

  • Vi and Vim are installed with OS X.

  • Vi and Vim are installed by default with many Linux Distributions

  • Vim runs in a terminal, over SSH, and with a GUI window manager (Gvim, Macvim)

  • Vim configuration is written in the vim language.

  • Vim reads a few vimrc configuration files in sequence (:help vimrc)

  • GVim is Vim with a graphical (GTK) interface, e.g. for Gnome

  • GVim also reads a few gvimrc configuration files in sequence (:help gvimrc)

  • MacVim is Vim for OS X

  • One way to write changes and exit vim: :wq! (Press <esc>, Type :wq!, Press Enter)

  • There are many plugins for vim.

  • NERDTree is an example of a vim plugin: https://github.com/scrooloose/nerdtree (:help nerdtree)

  • SpaceVim and westurner/dotvim include the NERDtree plugin

  • Vim keyboard shortcuts are called mappings.

  • Vim mappings are defined in a vimrc file.

  • Examples of vim mappings: \e opens NERDTree, \E opens NERDTree to the current file

  • Vim mappings can be defined for different vim modes: :map \e (command mode), :imap \e (insert mode) (:help modes)

  • Press i or a while in command mode to enter insert or append mode (:help vim-modes)

  • Press <Esc> to return to command mode

Browser Extensions with vim-style keyboard shortcuts: Vimium, Vimperator

A number of web apps support vim-style keyboard shortcuts like j and k for up and down:

  • GMail (? for help)

  • Facebook (? for help)

  • Twitter (? for help)

SpaceVim

SpaceVim is a set of plugins, configuration defaults, and keybindings for Vim.