Tools

See Also: https://westurner.github.io/wiki/projects#tools

Packages

A software package is an archive of files with a manifest that lists the files included. Often, the manifest contains file checksums and a signature.
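As a rough illustration of the archive-plus-manifest idea, here is a minimal Python sketch. The function names (build_package, verify_package) are hypothetical and not part of any packaging tool; signing is left out, and the manifest is simply a mapping of path to SHA-256 checksum.

```python
import hashlib
import io
import tarfile


def build_package(files):
    """Return (archive_bytes, manifest) for a dict of {path: bytes}."""
    manifest = {}
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for path, data in sorted(files.items()):
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
            # The manifest lists every file together with its checksum.
            manifest[path] = hashlib.sha256(data).hexdigest()
    return buffer.getvalue(), manifest


def verify_package(archive_bytes, manifest):
    """Return True if the archive holds exactly the manifest's files, unmodified."""
    seen = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            data = archive.extractfile(member).read()
            seen[member.name] = hashlib.sha256(data).hexdigest()
    return seen == manifest
```

A real package format would also sign the manifest, so that the checksums themselves can be trusted.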

Many packaging tools distinguish between source packages and binary packages.

Some packaging tools provide configuration options for:

  • Scripts to run when packaging
  • Scripts to run at install time
  • Scripts to run at uninstall time
  • Patches to apply to the “vanilla” source tree, as might be obtained from a version control repository

There is a package maintainer whose responsibilities include:

  • Testing new upstream releases
  • Vetting changes from release to release
  • Repackaging upstream releases
  • Signing new package releases

Packaging lag refers to how long it takes a package maintainer to repackage upstream releases for the target platform(s).

Anaconda

Anaconda is a maintained distribution of Conda packages for many languages, especially Python.

Note

Anaconda (installer) (1999, https://en.wikipedia.org/wiki/Anaconda_(installer)) is a separate project: the system installer for RPM-based Linux distributions, which is also written in Python (and C).

APT

APT (“Advanced Packaging Tool”) is the core of Debian package management.

  • An APT package repository serves DEB packages created with Dpkg.

  • An APT package repository can be accessed from a local filesystem or over a network protocol (“apt transports”) like HTTP, HTTPS, RSYNC, FTP, and BitTorrent (debtorrent).

    An example of APT usage (e.g. to maintain an updated Ubuntu Linux system):

apt-get update
apt-get upgrade
apt-get dist-upgrade

apt-cache show bash
apt-get install bash

apt-get --help
man apt-get
man sources.list

AUR

AUR (Arch User Repository) contains PKGBUILD package build scripts; the packages they describe are built with makepkg and installed with pacman.

Bower

Bower is “a package manager for the web” (JavaScript packages) built on NPM.

Conda

Conda is a package build, environment, and distribution system written in Python to install packages written in any language.

  • Conda was originally created for the Anaconda Python Distribution, which installs packages written in Python, R, JavaScript, Ruby, C, and Fortran.
  • Conda packages are basically tar archives with build, and optional link/install and uninstall scripts.
  • conda-build generates conda packages from conda recipes with a meta.yaml, a build.sh, and/or a build.bat.
  • Conda recipes reference and build from a source package URI OR a VCS URI and revision; and/or custom build.sh or build.bat scripts.
  • conda skeleton can automatically create conda recipes from PyPI (Python), CRAN (R), and CPAN (Perl)
  • conda skeleton-generated recipes can be updated with additional metadata, scripts, and source URIs (as separate patches or consecutive branch commits of e.g. a conda-recipes repository in order to get a diff of the skeleton recipe and the current recipe).
  • Conda (and Anaconda) packages are hosted by https://binstar.org, which hosts free public and paid private Conda packages.
    • Anaconda Server is an internal “Private, Secure Package Repository” that “supports over 100 different repositories, including PyPI, CRAN, conda, and the Anaconda repository.”

To create a fresh conda env:

# Python 2.7
conda create -n science --yes python=2.7 readline conda-env

# Python 3.X
conda create -n science3 --yes python=3 readline conda-env

Work on a conda env:

source activate science
conda list
source deactivate

conda-env writes to and creates environments from environment.yml files which list conda and Pip packages.

Work with conda envs and environment.yml files:

# Install conda-env globally (in the "root" conda environment)
conda install -n root conda-env

# Create a conda environment with ``conda create`` and install conda-env
conda create -n science python=3 readline conda-env pip

# Install some things with conda (and envs/science/bin/pip)
# https://github.com/westurner/notebooks/blob/gh-pages/install.sh
conda search pandas; conda info pandas
conda install blaze dask bokeh odo \
              sqlalchemy hdf5 h5py \
              scikit-learn statsmodels \
              beautiful-soup lxml html5lib pandas qgrid \
              ipython-notebook
pip install -e git+https://github.com/rdflib/rdflib@master#egg=rdflib
pip install arrow sarge structlog

# Export an environment.yml
#source deactivate
conda env export -n science | tee environment.yml

# Create an environment from an environment.yml
conda env create -n projectname -f ./environment.yml

To install a conda package from a custom channel:

conda install -c pydanny cookiecutter   # OR pip install cookiecutter

The conda-forge custom channel packages are built with continuous integration on multiple platforms.

See also: Anaconda, conda-forge (conda-smithy)

conda-forge

# create a conda package recipe from a pypi package
cd $VIRTUAL_ENV/src
conda skeleton pypi jupyterthemes
ls -ld jupyterthemes/
edit jupyterthemes/meta.yaml
# - git repo tags || pypi releases

# create a conda-forge feedstock from a conda recipe
## https://github.com/conda-forge/conda-smithy#making-a-new-feedstock
cd $VIRTUAL_ENV/src
ls -ld jupyterthemes
conda-smithy init jupyterthemes
ls jupyterthemes-feedstock/

# build a conda-forge feedstock with docker
# FROM condaforge/linux-anvil
cat ./scripts/run_docker_build.sh
./scripts/run_docker_build.sh
./ci_support/run_docker_build.sh

DEB

DEB is the Debian software package format.

DEB packages are built with Dpkg and often hosted in an APT package repository.

dnf

dnf is an open source package manager written in Python.

  • dnf was introduced in Fedora 18.
  • dnf is the default package manager in Fedora 22; replacing Yum.
    • [ ] yum errors if TODO package is installed (* Salt provider)
    • [ ] repoquery redirects with an error to dnf repoquery
    • See dnf help (and man dnf)
  • dnf integrates with the Anaconda system installer.
  • dnf supports Delta RPM packages (DRPM), which often significantly reduce the amount of network transfer required to regularly retrieve and upgrade to the latest repository packages.

ebuild

ebuild is a software package definition format.

  • ebuilds are like special Bash scripts.
  • ebuilds have USE flags for specifying build features.
  • Gentoo is built from ebuild package definitions stored in Gentoo Portage.
  • Portage packages are built from ebuilds.
  • The emerge Portage command installs ebuilds.

fpm

fpm (effing package management) is a tool for building many types of software packages from many other types of software packages (e.g. DEB, RPM, Python Packages), often more easily than working with the actual package manager.

  • fpm package source types include: dir rpm gem python empty tar deb cpan npm osxpkg pear pkgin virtualenv zip.
  • fpm target package types include: rpm deb solaris puppet dir osxpkg p5p sh tar zip.

Homebrew

Homebrew is a package manager (brew) for OS X.

NPM

NPM is a JavaScript package manager created for Node.js.

  • An NPM package is defined by a package.json JSON file.
  • NPM packages are installed with the npm CLI utility.
  • Bower builds upon NPM.
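Since package.json is plain JSON, its metadata can be read with any JSON parser. A minimal Python sketch, assuming a made-up package ("example-package") and a made-up dependency name; only the name, version, and dependencies keys are read:

```python
import json

# A hypothetical package.json; the package and dependency names are invented.
PACKAGE_JSON = """
{
  "name": "example-package",
  "version": "0.1.0",
  "dependencies": {"some-dependency": "^1.3.0"}
}
"""


def summarize(package_json_text):
    """Return (name, version, sorted dependency pairs) from package.json text."""
    meta = json.loads(package_json_text)
    dependencies = sorted(meta.get("dependencies", {}).items())
    return meta["name"], meta["version"], dependencies
```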

NuGet

NuGet is an open source package manager for Windows.

pacman

Pacman is an open source package manager which installs .pkg.tar.xz files for Arch Linux.

Pants Build

Pants Build is a build tool for JVM [Java, Scala, Android], C++, Go, Haskell, and Python [CPython] software projects.

PEX

PEX (Python Executable) is a ZIP-based software package archive format with an executable header.
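The general shape of such a file (an interpreter line followed by a ZIP archive) can be reproduced with the Python standard library's zipapp module. This sketch does not use PEX itself, and the helper names and the app.pyz file name are only illustrative:

```python
import tempfile
import zipapp
import zipfile
from pathlib import Path


def build_executable_zip(workdir):
    """Build app.pyz: a '#!' interpreter line followed by a ZIP archive."""
    source = Path(workdir) / "app"
    source.mkdir()
    # __main__.py is the entry point that runs when the archive is executed.
    (source / "__main__.py").write_text('print("hello from inside a zip")\n')
    target = Path(workdir) / "app.pyz"
    zipapp.create_archive(source, target, interpreter="/usr/bin/env python3")
    return target


def describe(path):
    """Return (first line of the file, whether it is still a readable ZIP)."""
    with open(path, "rb") as archive:
        first_line = archive.readline()
    return first_line, zipfile.is_zipfile(str(path))
```

ZIP readers locate the archive index from the end of the file, which is why a header can be prepended without breaking the archive.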

Ports

A Ports collection contains Sources (e.g. archived releases and patch sets) and Makefiles designed to compile software Packages for a particular operating system distribution’s kernel and standard libraries, usually for a particular platform.

RPM

RPM (RPM Package Manager, RedHat Package Manager) is a package format and a set of commandline utilities written in C and Perl.

  • RPM packages can be installed with rpm, Yum, or dnf.

  • RPM packages can be built with tools like rpmbuild and fpm.

  • Python packages can be built into RPM packages with setuptools’ bdist_rpm or with fpm.

  • List contents of RPM packages (archives) with e.g. less and lesspipe:

    less ~/path/to/local.rpm   # requires lesspipe to be configured
    
  • RPM Packages are served by and retrieved from repositories by tools like Yum and dnf:

    • Local: directories of RPM packages and metadata
    • Network: HTTP, HTTPS, rsync, FTP
    • dnf supports Delta RPM packages (DRPM), which often significantly reduce the amount of network transfer required to regularly retrieve and upgrade to the latest repository packages.

Note

There’s not yet a debtorrent for RPM, Yum, dnf.

Python Packages

A Python Package is a collection of source code and package data files.

  • Python packages have dependencies: they depend on other packages
  • Python packages can be served from a package index
  • PyPI is the community Python Package Index
  • A Python package is an archive of files (.zip (.egg, .whl), .tar, .tar.gz) that is meant for distribution and is built from a setup.py file containing a version string and metadata.
  • A source dist (sdist) package contains source code (every file listed in or matching a pattern in a MANIFEST.in text file).
  • A binary dist (bdist, bdist_egg, bdist_wheel) is derived from an sdist and may be compiled and named for a specific platform.
  • sdists and bdists are defined by a setup.py file which contains a call to a distutils.core.setup() or setuptools.setup() function.
  • The arguments to the setup() function are things like version, author, author_email, and url; in addition to package dependency strings required for the package to work (install_requires), for tests to run (tests_require), and for optional things to work (extras_require).
  • A package dependency string can specify an exact version (==) or a greater-than (>=) or less-than (<=) requirement for each package.
  • Package names are looked up from an index server (--index-url), such as PyPI, and/or an HTML page (--find-links) containing URLs containing package names, version strings, and platform strings.
  • easy_install (Setuptools) and Pip can install packages from: the local filesystem, a remote index server, or a local index server.
  • easy_install and pip read the install_requires (and extras_require) attributes of setup.py files contained in packages in order to resolve a dependency graph (which can contain cycles) and install necessary packages.
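To make the last point concrete, here is a minimal sketch of walking a dependency graph that contains a cycle. It is not how easy_install or pip resolve dependencies; the package names and the install_order function are hypothetical, and version constraints are ignored entirely.

```python
def install_order(requirements, root):
    """Depth-first walk: dependencies come first, each package appears once."""
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            # Already installed, or already being visited (a cycle): skip it.
            return
        seen.add(name)
        for dependency in requirements.get(name, []):
            visit(dependency)
        order.append(name)

    visit(root)
    return order


# Hypothetical install_requires data; "web" and "templates" depend on each other.
REQUIREMENTS = {
    "app": ["web", "db"],
    "web": ["templates"],
    "templates": ["web"],
    "db": [],
}
```

Marking a package as seen before visiting its dependencies is what keeps the walk from looping forever on the cycle.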

Note

JSON-LD for package metadata and environment build metadata could be helpful.

Distutils

Distutils is a collection of tools for common packaging needs.

  • Distutils is included in the Python standard library.

Setuptools

Setuptools is a Python package for working with other Python Packages.

  • Setuptools builds upon Distutils

  • Setuptools is widely used

  • Most Python packages are installed by setuptools (by Pip)

  • Setuptools can be installed by downloading ez_setup.py and then running python ez_setup.py; or, setuptools can be installed with a system package manager (apt, yum)

  • Setuptools installs a script called easy_install which can be used to install packages from the local filesystem, a remote index server, a local index server, or an HTML page

  • easy_install pip installs Pip from PyPI

  • Like easy_install, Pip installs python packages, with a number of additional configuration options

  • Setuptools can build RPM and DEB packages from python packages, with some extra configuration:

    python setup.py bdist_rpm --help
    python setup.py --command-packages=stdeb.command bdist_deb --help
    

Pip

Pip is a tool for installing, upgrading, and uninstalling Python packages.

pip help
pip help install
pip --version

sudo apt-get install python-pip
pip install --upgrade pip

pip install libcloud
pip install -r requirements.txt
pip uninstall libcloud

  • Pip stands upon Distutils and Setuptools.
  • Pip retrieves, installs, upgrades, and uninstalls packages.
  • Pip can list installed packages with pip freeze (and pip list).
  • Pip can install packages as ‘editable’ packages (pip install -e) from version control repository URLs which must begin with vcs+, end with #egg=<usuallythepackagename>, and may contain an @vcstag tag (such as a branch name or a version tag).
  • Pip installs packages as editable by first cloning (or checking out) the code to ./src (or ${VIRTUAL_ENV}/src if working in a Virtualenv) and then running setup.py develop.
  • Pip configuration is in ${HOME}/.pip/pip.conf.
  • Pip can maintain a local cache of downloaded packages, which can lessen the load on package servers during testing.
  • Pip skips reinstallation if a package requirement is already satisfied.
  • Pip requires the --upgrade and/or --force-reinstall options to be added to the pip install command in order to upgrade or reinstall.
  • At the time of this writing, the latest stable pip version is 1.5.6.
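The editable URL layout described above (vcs+ prefix, optional @tag, #egg= suffix) can be taken apart with the standard library. This is only a sketch of the URL shape; parse_editable_url is a hypothetical helper, not part of pip.

```python
from urllib.parse import urlsplit


def parse_editable_url(requirement):
    """Split 'vcs+scheme://host/path@tag#egg=name' into its parts."""
    vcs, plus, rest = requirement.partition("+")
    if not plus:
        raise ValueError("expected a 'vcs+' prefix")
    parts = urlsplit(rest)
    # The optional @tag is part of the path, not of any user@host prefix.
    path, _, revision = parts.path.partition("@")
    egg = None
    if parts.fragment.startswith("egg="):
        egg = parts.fragment[len("egg="):]
    url = f"{parts.scheme}://{parts.netloc}{path}"
    return {"vcs": vcs, "url": url, "revision": revision or None, "egg": egg}
```

For example, git+https://github.com/rdflib/rdflib@master#egg=rdflib splits into the git VCS, the repository URL, the master revision, and the rdflib egg name.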

Warning

With Python 2, pip is preferable to Setuptools’s easy_install because pip installs backports.ssl_match_hostname in order to validate HTTPS certificates (by making sure that the certificate hostname matches the hostname that was requested).

Cloning packages from source repositories over ssh:// or https://, either manually or with pip install -e, avoids this concern.

There is also a tool called Peep which requires considered-good SHA256 checksums to be specified for every dependency listed in a requirements.txt file.

For more information, see: http://legacy.python.org/dev/peps/pep-0476/#python-versions

Pip Requirements File

Plaintext list of packages and package URIs to install.

Requirements files may contain version specifiers (pip >= 1.5).
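A simplified sketch of how a specifier such as pip >= 1.5 might be checked against an installed version. It handles only ==, >=, and <= with purely numeric dotted versions; pip's real version handling is considerably richer, and the function names here are hypothetical.

```python
import operator
import re

OPERATORS = {"==": operator.eq, ">=": operator.ge, "<=": operator.le}
REQUIREMENT = re.compile(
    r"\s*([A-Za-z0-9_.-]+)\s*(?:(==|>=|<=)\s*([0-9]+(?:\.[0-9]+)*))?\s*"
)


def parse_requirement(line):
    """Return (name, operator or None, version or None)."""
    match = REQUIREMENT.fullmatch(line)
    if match is None:
        raise ValueError(f"unsupported requirement: {line!r}")
    return match.groups()


def version_tuple(version):
    """'1.5.6' -> (1, 5, 6), so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def satisfies(installed, line):
    """Return True if the installed version meets the requirement line."""
    _, op, wanted = parse_requirement(line)
    if op is None:
        return True  # no specifier: any version will do
    return OPERATORS[op](version_tuple(installed), version_tuple(wanted))
```

Note that this simplification treats 1.5 and 1.5.0 as different versions under ==.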

Pip installs Pip Requirement Files:

pip install -r requirements.txt
pip install --upgrade -r requirements.txt
pip install --upgrade --user --force-reinstall -r requirements.txt

An example requirements.txt file:

# install pip from the default index (PyPI)
--index-url https://pypi.python.org/simple
pip

# Install pip 1.5 or greater from PyPI
pip >= 1.5

# Git clone and install pip as an editable develop egg
-e git+https://github.com/pypa/pip@1.5.X#egg=pip

# Install a source distribution release from PyPI
# and check the MD5 checksum in the URL
https://pypi.python.org/packages/source/p/pip/pip-1.5.5.tar.gz#md5=7520581ba0687dec1ce85bd15496537b

# Install a source distribution release from Warehouse
https://warehouse.python.org/packages/source/p/pip/pip-1.5.5.tar.gz

# Install an additional requirements.txt file
-r requirements/more-requirements.txt

Peep

Peep works just like Pip, but requires SHA256 checksum hashes to be specified for each package in a requirements.txt file.
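The underlying check is small enough to sketch in a few lines of Python. This is not Peep's code or its hash format: the hex encoding, the pinned mapping, and the function names are assumptions made for illustration.

```python
import hashlib


def sha256_hex(data):
    """Hex-encoded SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def require_checksum(filename, data, pinned):
    """Return data only if its SHA-256 matches the pinned value for filename."""
    expected = pinned.get(filename)
    if expected is None:
        raise ValueError(f"no pinned checksum for {filename}")
    if sha256_hex(data) != expected:
        raise ValueError(f"checksum mismatch for {filename}")
    return data
```

Refusing archives that have no pinned checksum at all is what makes the checksums a requirement rather than an optional extra.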

Warehouse

Warehouse is the “Next Generation Python Package Repository”.

All packages uploaded to PyPI are also available from Warehouse.

Wheel

  • Wheel is a newer, PEP-based standard (.whl) with a different metadata format, the ability to specify (JSON) digital signatures for a package within the package, and a number of additional speed and platform-consistency advantages.
  • Wheels can be uploaded to PyPI.
  • Wheels are generally faster than traditional Python packages.

Packages available as wheels are listed at http://pythonwheels.com/.

RubyGems

RubyGems is a package manager for Ruby packages (“Gems”).

Yum

Yum is a tool for installing, upgrading, and uninstalling RPM packages.

Version Control Systems

Version Control Systems (VCS) — or Revision Control Systems (RCS) — are designed to solve various problems in change management.

  • VCS store code in a repository.
  • Changes to one or more files are called changesets, commits, or revisions
  • Changesets are committed or checked in to a repository.
  • Changesets are checked out from a repository
  • Many/most VCS differentiate between the repository and a working directory, which is currently checked out to a specific changeset identified by a revision identifier; possibly with uncommitted local changes.
  • A branch is forked from a line of development and then merged back in.
  • Most projects designate a main line of development referred to as a trunk, master, or default branch.
  • Many projects work with feature and release branches, which, ideally, eventually converge by being merged back into trunk. (see: HubFlow for an excellent example of branching)
  • Traditional VCS are centralized, which makes the central server a single point of failure.
  • Some VCS have a concept of locking to prevent multiple peoples’ changes from colliding
  • Distributed Version Control Systems (DVCS) (can) clone all revisions of every branch of a repository every time.
  • DVCS changesets are pushed to a different repository
  • DVCS changesets are pulled from another repository into a local clone or copy of a repository
  • Teams working with DVCS often designate a central repository hosted by a project forge service like SourceForge, GNU Savannah, GitHub, or BitBucket.
  • Contributors send patches which build upon a specific revision, which can be applied by a maintainer with commit access permissions.
  • Contributors fork a new branch from a specific revision, commit changes, and then send a pull request, which can be applied by a maintainer with commit access permissions.

CVS

CVS (cvs) is a centralized version control system (VCS) written in C.

CVS predates most/many other VCS.

Subversion

Apache Subversion (svn) is a centralized revision control system (VCS) written in C.

To checkout a revision of a repository with svn:

svn co http://svn.apache.org/repos/asf/subversion/trunk subversion

Bazaar

GNU Bazaar (bzr) is a distributed revision control system (DVCS, RCS, VCS) written in Python and C.

http://launchpad.net hosts Bazaar repositories; with special support from the bzr tool in the form of lp: URIs like lp:bzr.

To clone a repository with bzr:

bzr branch lp:bzr

Git

Git (git) is a distributed version control system (DVCS, VCS, RCS), written in C, for tracking a branching and merging repository of file revisions.

To clone a repository with git:

git clone https://github.com/git/git

GitFlow

GitFlow is a named branch workflow for Git with master, develop, feature, release, hotfix, and support branches (git flow).

GitFlow branch names and prefixes are configured in .git/config; the defaults are:

GitFlow Branch Names

Branch Name       Description (and Code Labels)
master            Stable trunk (latest release)
develop           Development main line
feature/<name>    New features for the next release (e.g. ENH, PRF)
release/<name>    In-progress release branches (e.g. RLS)
hotfix/<name>     Fixes to merge to both master and develop (e.g. BUG, TST, DOC)
support/<name>

“What is the ‘support’ branch?”

https://github.com/nvie/gitflow/wiki/FAQ
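The default names and prefixes listed above amount to a simple naming convention. A minimal Python sketch of classifying a branch name by that convention follows; classify_branch is a hypothetical helper and not part of git flow.

```python
# Default GitFlow prefixes, as listed in the branch name table.
PREFIXES = {
    "feature": "feature/",
    "release": "release/",
    "hotfix": "hotfix/",
    "support": "support/",
}


def classify_branch(branch, prefixes=PREFIXES):
    """Return (kind, name); name is None for the two long-lived branches."""
    if branch in ("master", "develop"):
        return branch, None
    for kind, prefix in prefixes.items():
        if branch.startswith(prefix):
            return kind, branch[len(prefix):]
    return "other", branch
```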

Creating a new release with Git and GitFlow:

git clone ssh://git@github.com/westurner/dotfiles
# git checkout master
# git checkout -h
# git help checkout (man git-checkout)
# git flow [<cmd> -h]
# git-flow [<cmd> -h]

git flow init
## Update versiontag in .git/config to prefix release tags with 'v'
git config --replace-all gitflow.prefix.versiontag v
cat ./.git/config
# [gitflow "prefix"]
# feature = feature/
# release = release/
# hotfix = hotfix/
# support = support/
# versiontag = v
#

## feature/ENH_print_hello_world
git flow feature start ENH_print_hello_world
#git commit, commit, commit
git flow feature
git flow feature finish ENH_print_hello_world   # ENH<TAB>

## release/0.1.0
git flow release start 0.1.0
#git commit (e.g. update __version__, setup.py, release notes)
git flow release finish 0.1.0
git tag | grep 'v0.1.0'

HubFlow

HubFlow is a fork of GitFlow that adds extremely useful commands for working with Git and GitHub pull requests.

HubFlow branch names and prefixes are configured in .git/config; the defaults are as follows:

HubFlow Branch Names

Branch Name       Description (and Code Labels)
master            Stable trunk (latest release)
develop           Development main line
feature/<name>    New features for the next release (e.g. ENH, PRF)
release/<name>    In-progress release branches (e.g. RLS)
hotfix/<name>     Fixes to merge to both master and develop (e.g. BUG, TST, DOC)

Creating a new release with Git and HubFlow:

git clone ssh://git@github.com/westurner/dotfiles
# git checkout master
# git checkout -h
# git help checkout (man git-checkout)
# git hf [<cmd> -h]
# git-hf [<cmd> -h]

git hf init
## Update versiontag in .git/config to prefix release tags with 'v'
git config --replace-all hubflow.prefix.versiontag v
#cat .git/config # ...
# [hubflow "prefix"]
# feature = feature/
# release = release/
# hotfix = hotfix/
# support = support/
# versiontag = v
#
git hf update
git hf pull
git hf pull -h

## feature/ENH_print_hello_world
git hf feature start ENH_print_hello_world
#git commit, commit
git hf pull
git hf push
#git commit, commit
git hf feature finish ENH_print_hello_world   # ENH<TAB>

## release/0.1.0
git hf release start 0.1.0
## commit (e.g. update __version__, setup.py, release notes)
git hf release finish 0.1.0
git tag | grep 'v0.1.0'

The GitFlow and HubFlow illustrations are very helpful for visualizing and understanding any DVCS workflow: https://datasift.github.io/gitflow/IntroducingGitFlow.html

GitFlow Release / Master Branch Merge Diagram
GitFlow Hotfix to Master and Develop Branches Merge Diagram
Numbered GitFlow Workflow Diagram

Mercurial

Mercurial (hg) is a distributed revision control system written in Python and C (DVCS, VCS, RCS).

To clone a repository with hg:

hg clone http://selenic.com/hg

Project Templates

cookiecutter

Cookiecutter creates projects (files and directories) from project templates written in Jinja2 for projects written in Python and other languages.

Languages

Lightweight Markup Language

CommonMark

CommonMark is one effort to standardize Markdown.

RDoc

RDoc is a tool and a Lightweight Markup Language for generating HTML and command-line documentation for Ruby projects.

To not build RDoc docs when installing a Gem:

gem install --no-rdoc --no-ri
gem install --no-document
gem install -N

ReStructuredText

ReStructuredText (ReST, RST) is a Lightweight Markup Language commonly used for narrative documentation and inline Python, C, Java, etc. docstrings which can be parsed, transformed, and published to valid HTML, ePub, LaTeX, PDF.

Sphinx is built on Docutils, the primary implementation of ReStructuredText.

Pandoc also supports a form of ReStructuredText.

ReStructuredText Directive

Actionable blocks of ReStructuredText

include, contents, and index are all ReStructuredText Directives:

   .. include:: goals.rst

   .. contents:: Table of Contents
    :depth: 3

    .. index:: Example 1
    .. index:: Sphinx +
    .. _example-1:

    Sphinx +1
    ==========
    This refs :ref:`example 1 <example-1>`.

    Similarly, an explicit link to this anchor `<#example-1>`__

    And an explicit link to this section `<#sphinx-1>`__
    (which is otherwise not found in the source text).


    .. index:: Example 2
    .. _example 2:

    Example 2
    ==========

    This links to :ref:`example-1` and :ref:`example 2`.

    (`<#example-1>`__, `<#example-2>`__)

    And this also links to `Example 2`_.

   .. include:: LICENSE

.. note:: ``index`` is a :ref:`Sphinx` Directive,
    which will print an error to the console when building
    but will otherwise be silently dropped
    by non-Sphinx ReStructuredText parsers
    like :ref:`Docutils` (GitHub) and :ref:`Pandoc`.

ReStructuredText Role

ReStructuredText role extensions

:ref: is a Sphinx ReStructuredText Role:

A (between files) link to :ref:`example 2`.

C

C is a third-generation programming language which affords relatively low-level machine access while providing helpful abstractions.

The Windows NT kernel is written mostly in C.

The GNU/Linux kernel is written in C and often compiled by GCC or Clang for a particular architecture (see: man uname).

The OS X kernel (XNU) is written mostly in C, with some C++.

Libc libraries are written in C.

Almost all of the projects linked here, at some point, utilize code written in C.

C++

C++ is a free and open source third-generation programming language which adds object orientation and a standard library to C.

  • C++ is an ISO specification: C++98, C++03, C++11 (C++0x), C++14, [ C++17 ]

Fortran

Fortran (or FORTRAN) is a third-generation programming language frequently used for mathematical and scientific computing.

Some of the SciPy libraries build optimized mathematical Fortran routines.

Haskell

Haskell is a free and open source strongly statically typed purely functional programming language.

Cabal is the Haskell package manager.

Pandoc is written in Haskell.

Go

Go is a free and open source statically-typed C-based third generation language.

Java

Java is a third-generation programming language which is compiled into code that runs in a virtual machine (JVM) written in C for many different operating systems.

JVM

A JVM (“Java Virtual Machine”) runs Java code (classes and JARs).

JavaScript

JavaScript is a free and open source third-generation programming language designed to run in an interpreter; now specified as ECMAScript.

All major web browsers support JavaScript.

Client-side (web) applications can be written in JavaScript.

Server-side (web) applications can be written in JavaScript, often with Node.js, NPM, and Bower packages.

Note

Java and JavaScript are two distinctly different languages and developer ecosystems.

Node.js

Node.js is a free and open source framework for JavaScript applications written in C, C++, and JavaScript.

Jinja2

Jinja2 is a free and open source templating engine written in Python.

Sphinx and Salt are two projects that utilize Jinja2.

Perl

Src: git git://perl5.git.perl.org/perl.git

Perl is a free and open source, dynamically typed, C-based third-generation programming language.

Many of the Debian system management tools are or were originally written in Perl.

Python

Python is a free and open source dynamically-typed, C-based third-generation programming language.

As a multi-paradigm language with support for functional and object-oriented code, Python is often utilized for system administration and scientific software development.

The Python community is generously supported by a number of sponsors and the Python Infrastructure Team.

Cython

Cython is a superset of the Python language which adds optional static type declarations and compiles to C extension modules for CPython, making Python code faster in many cases.

SciPy Stack

Python Distributions

PyPy

PyPy is a Python implementation with a JIT compiler. PyPy is written in RPython, a restricted subset of Python which is translated to C, and is often faster than CPython for many types of workloads.

Python 3

Python 3 made a number of incompatible changes, requiring developers to update and review their Python 2 code in order to “port to” Python 3.

Python 2 will be supported in “no-new-features” status for quite some time.

Python 3 Wall of Superpowers tracks which popular packages have been ported to support Python 3: https://python3wos.appspot.com/

There are a number of projects which help bridge the gap between the two language versions:

See also: Anaconda

Tox

Tox is a build automation tool designed to build and test Python projects with multiple language versions and environments in separate virtualenvs.

Run the py27 environment:

tox -v -e py27
tox --help

Ruby

Ruby is a free and open source dynamically-typed programming language.

Vagrant is written in Ruby.

Rust

Rust is a free and open source strongly typed multi-paradigm programming language.

WebAssembly

WebAssembly (wasm) is a safe (sandboxed), efficient low-level programming language (abstract syntax tree) and binary format for the web.

  • WebAssembly is initially derived from asm.js and PNaCl.
  • WebAssembly is an industry-wide effort.

YAML

YAML (“YAML Ain’t Markup Language”) is a concise data serialization format.

Most Salt states and pillar data are written in YAML. Here’s an example top.sls file:

base:
 '*':
   - openssh
 '*-webserver':
   - webserver
 '*-workstation':
   - gnome
   - i3
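The keys under base in this top.sls are glob patterns matched against minion IDs. A minimal Python sketch of that matching, using the same data as a plain dict (so no YAML parser is needed); this is not Salt's implementation, and the states_for function and minion IDs are hypothetical.

```python
from fnmatch import fnmatchcase

# The top.sls above, written out as the data structure it describes.
TOP = {
    "base": {
        "*": ["openssh"],
        "*-webserver": ["webserver"],
        "*-workstation": ["gnome", "i3"],
    }
}


def states_for(minion_id, top=TOP, environment="base"):
    """Collect the states of every glob pattern that matches the minion ID."""
    states = []
    for pattern, names in top[environment].items():
        if fnmatchcase(minion_id, pattern):
            states.extend(names)
    return states
```

A minion named alpha-webserver would match both '*' and '*-webserver', so it would receive openssh as well as webserver.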

Compilers

Interpreter

Binutils

GNU Binutils are a set of utilities for working with assembly and binary.

GCC utilizes GNU Binutils to compile the GNU/Linux kernel and userspace.

GAS, the GNU Assembler (as) assembles ASM code for linking by the GNU linker (ld).

Clang

Clang is a compiler front end for C, C++, and Objective-C/C++. Clang is part of the LLVM project.

GCC

The GNU Compiler Collection started as a Free and Open Source compiler for C.

There are now GCC frontends for many languages, including C++, Fortran, Java, and Go.

LLVM

LLVM (“Low Level Virtual Machine”) is a reusable compiler infrastructure with frontends for many languages.

Operating Systems

POSIX

POSIX (“Portable Operating System Interface”) is a set of standards for Shells, Operating Systems, and APIs.

Linux

GNU/Linux (“Linux”) is a free and open source operating system kernel written in C.

uname -s    # prints "Linux" (the kernel name)
uname -o    # prints "GNU/Linux" (the operating system)

Linux Distributions

A Linux Distribution is a collection of Packages compiled to work with a GNU/Linux kernel and a Libc.

ChromiumOS

ChromiumOS is a Linux Distribution built on Portage.

Crouton

Crouton (“Chromium OS Universal Chroot Environment”) installs and debootstraps a Linux Distribution (e.g. Debian or Ubuntu) within a ChromiumOS or ChromeOS chroot.

ChromeOS

ChromeOS is a Linux Distribution built on ChromiumOS and Portage.

CoreOS

CoreOS is a Linux Distribution for highly available distributed computing.

CoreOS schedules redundant Docker containers with fleet and systemd according to configuration stored in etcd, a distributed key-value store with an HTTP API.

  • CoreOS runs on very many platforms
  • CoreOS does not provide a package manager
  • CoreOS schedules Docker
  • CoreOS – Operating System
  • etcd – Consensus and Discovery
  • rkt – Container Runtime
  • fleet – Distributed init system (etcd, systemd)
  • flannel – Networking

Linux Notes

Linux Dual Boot

  • [ ] GRUB chainloader to partition boot record
    • Ubuntu and Fedora GRUB try to autodiscover Windows partitions

OS X

OS X is a UNIX operating system based upon the Mach kernel from NeXTSTEP, which was partially derived from NetBSD and FreeBSD.

The native OS X GUI is not built on X11; X11 applications are supported through XQuartz, which is built from X.org X11 (originally XFree86).

OS X maintains forks of many POSIX BSD and GNU tools like bash, readlink, and find.

Homebrew installs and maintains packages for OS X.

uname; echo "Darwin"

iOS

iOS is a closed source UNIX operating system based upon many components of OS X adapted for phones and then tablets.

  • iOS powers iPhones and iPads
  • You must have a Mac with OS X and Xcode to develop and compile for iOS.

OSX Notes

OSX Reinstall

  • [ ] Generate installation media
  • [ ] Reboot to recovery partition
  • [ ] Adjust partitions
  • [ ] Format?
  • [ ] Install OS
  • [ ] (wait)
  • [ ] Manual time/date/language config
  • [ ] Run workstation provisioning scripts

OSX Fresh Install

  • [ ] Generate / obtain installation media
  • [ ] Boot from installation media
  • [ ] Manual time/date/language config
  • [ ] Run workstation provisioning scripts

Windows

Microsoft Windows is an NT-kernel-based operating system.

  • There used to be a POSIX compatibility mode.
  • Chocolatey maintains a set of NuGet packages for Windows.

Windows Sysinternals

Windows Sysinternals is a group of tools for working with Windows.

WSUS Offline Update

WSUS Offline Update is a free and open source software tool for generating offline Windows upgrade CDs / DVDs containing the latest upgrades for Windows, Office, and .NET.

  • Bandwidth costs: Windows Updates (WSUS) in GB * n_machines (see also: Debtorrent, Packages)
  • “Slipstreaming” an installation ISO is one alternative way to avoid having to spend hours upgrading a factory reinstalled (“reformatted”) Windows installation

Windows Notes

A few annotated excerpts from this Chocolatey NuGet PowerShell script: https://gist.github.com/westurner/10950476#file-cinst_workstation_minimal-ps1

cinst GnuWin
cinst sysinternals      # Process Explorer XP
cinst 7zip
cinst curl
Windows Dual Boot
  • [ ] Windows MBR chain loads to partition GRUB (Linux)
  • [ ] Ubuntu WUBI .exe Linux Installer (XP, 7, 8*)
    • It’s now better to install to a separate partition from a bootable ISO

Configuration Management

Cobbler

Cobbler is a machine image configuration, repository mirroring, and networked booting server with support for DNS, DHCP, TFTP, and PXE.

  • Cobbler can template kickstart files for the RedHat Anaconda installer
  • Cobbler can template Debian preseed files
  • Cobbler can PXE boot an ISO over TFTP (and unattended install)
  • Cobbler can manage a set of DNS and DHCP entries for physical systems
  • Cobbler can batch mirror RPM and DEB repositories (see also: apt-cacher-ng, Nginx)
  • Cobbler-web is a Django WSGI application; usually configured with Apache HTTPD and mod_wsgi.
    • Cobbler-web delegates very many infrastructure privileges

See also: crowbar, OpenStack Ironic bare-metal deployment

Grunt

Grunt is a build tool written in JavaScript which builds a directed acyclic graph (DAG).

Jake

Jake is a JavaScript build tool written in JavaScript (for Node.js) similar to Make or Rake.

Juju

Juju is a Configuration Management tool written in Python which runs Juju Charms written in Python on one or more systems over SSH, for managing one or more physical and virtual machines running Ubuntu.

Make

GNU Make is a classic, ubiquitous software build tool designed for file-based source code compilation which builds a directed acyclic graph (DAG).

Bash, Python, and the GNU/Linux kernel are all built with Make.

Make build task chains are represented in a Makefile.
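
To illustrate the directed acyclic graph that a build tool like Make works from, here is a minimal sketch in Python. The target names are hypothetical, and this is not how Make is implemented; it only shows that prerequisites are ordered before the targets which depend on them:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical targets mapped to their prerequisites,
# as in a Makefile rule "target: prerequisite ..."
rules = {
    "app": {"main.o", "util.o"},
    "main.o": {"main.c", "util.h"},
    "util.o": {"util.c", "util.h"},
}

# static_order() yields each node only after all of its prerequisites
build_order = list(TopologicalSorter(rules).static_order())
print(build_order)
```

Make additionally compares file modification times, so that only out-of-date targets are rebuilt; the sketch above leaves that step out.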

Pros

  • Simple, easy to read syntax
  • Designed to build files on disk (see: .PHONY)
  • Nesting: make -C <path> <taskname>
  • Variable Syntax: $(VARIABLE_NAME) or ${VARIABLE_NAME}
  • Bash completion: make <tab>
  • Python: Initially parseable with distutils.text_file
  • Logging: command names and values print to stdout (unless prefixed with @)

Cons

  • Platform Portability: make is not installed everywhere
  • Global Variables: parametrization with shell scripts
VARIABLE_NAME="value" make test
make test VARIABLE_NAME="value"

# ...
export VARIABLE_NAME="value"
make test

Pants

See: Pants Build

Puppet

Puppet is a Configuration Management system written in Ruby which runs Puppet Modules written in Puppet DSL or Ruby for managing one or more physical and virtual machines running various operating systems.

Salt

Salt is a Configuration Management system written in Python which runs Salt Formulas written in YAML, Jinja2, Python for managing one or more physical and virtual machines running various operating systems.

Salt runs modules defined by states over a transport. Salt transports include:

  • ZeroMQ Transport (TCP, msgpack) (libzmq) (default)
  • TCP Transport
  • RAET: Reliable Asynchronous Event Transport (UDP, msgpack) (libsodium, libnacl)
  • salt-ssh runs salt states over SSH
Salt Top File

A Salt Top File (top.sls) defines the Root of a Salt Environment.

The Top File contains:

Salt Environment

A Salt Environment is a folder of Salt States with a top.sls Salt Top File.

A Salt Master and/or a (standalone) Salt Minion maintain a Salt Environment.
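
As a rough illustration of how a top file maps minions to states within an environment, the following sketch matches minion IDs against glob targets. The `top` dict stands in for a parsed top.sls; its contents and the `states_for` helper are hypothetical, and Salt supports more target types than the glob matching shown here:

```python
from fnmatch import fnmatch

# Hypothetical top file contents, as they might look after YAML parsing:
# environment -> target (minion ID glob) -> list of state names
top = {
    "base": {
        "*": ["common"],
        "web*": ["nginx", "app"],
    },
}


def states_for(minion_id, top, saltenv="base"):
    """Return the state names whose target glob matches minion_id."""
    matched = []
    for target, states in top[saltenv].items():
        if fnmatch(minion_id, target):
            matched.extend(states)
    return matched


print(states_for("web1.example.org", top))
print(states_for("db1.example.org", top))
```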

Salt Bootstrap

The Salt Bootstrap script (bootstrap-salt.sh) is a shell script installer for a salt master and/or salt minion.

Salt Bootstrap can install from source (git), from (mostly) Python packages served from e.g. PyPI with Pip, or from OS packages (e.g. DEB, RPM).

Salt Minion

A Salt Minion is a daemon process which executes the Salt Modules defined by Salt States on the local machine.

A minion can run as a background daemon, and can retrieve and execute states from a Salt Master.

Can execute local states in a standalone minion setup:

salt-call --local grains.items
Salt Minion ID

Machine ID value uniquely identifying a minion instance to a Salt Master.

By default the minion ID is set to the FQDN

python -c 'import socket; print(socket.getfqdn())'

The minion ID can be set explicitly in two ways:

  • /etc/salt/minion:

    id: devserver-123.example.org
    
  • /etc/salt/minion_id:

    $ hostname -f > /etc/salt/minion_id
    $ cat /etc/salt/minion_id
    devserver-123.example.org
    
Salt Master

Server daemon which compiles pillar data for and executes commands on Salt Minions:

salt '*' grains.items
Salt SSH

Execute salt commands and states over SSH without a minion process:

salt-ssh '*' grains.items
Salt Grains

Static system information keys and values

  • hostname
  • operating system
  • ip address
  • interfaces

Show grains on the local system:

salt-call --local grains.items
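
For a sense of what such static key/value facts look like, here is a small standard-library sketch. The `collect_grains` function and its key names are illustrative assumptions and are not guaranteed to match the grain names Salt itself reports:

```python
import platform
import socket


def collect_grains():
    """Collect a few static facts about the local system."""
    return {
        "host": socket.gethostname(),
        "fqdn": socket.getfqdn(),
        "kernel": platform.system(),    # e.g. "Linux", "Darwin"
        "cpuarch": platform.machine(),  # e.g. "x86_64"
    }


grains = collect_grains()
for key, value in sorted(grains.items()):
    print(key, value)
```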
Salt Modules

Remote execution functions for files, packages, services, commands.

Can be called with salt-call

Salt States

Salt states are graphs of nodes, edges, and attributes which are templated and compiled into ordered sequences of system configuration steps.

  • Salt states can be expressed as .sls YAML files (transformed by the sls Salt Renderer) parsed by salt.states.<state>.py.

Salt States files are processed as Jinja2 templates (by default); they can access system-specific grains and pillar data at compile time.
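
The compile-time templating step can be sketched with the standard library. This example uses string.Template as a stand-in for Jinja2, and the state source, package names, and grains are hypothetical; it only shows the idea that system-specific values are filled in before the YAML is parsed:

```python
from string import Template  # stdlib stand-in for Jinja2

# Hypothetical state source; Salt would use Jinja2 syntax such as
# {{ grains['os_family'] }} rather than $-placeholders
state_source = Template(
    "install_editor:\n"
    "  pkg.installed:\n"
    "    - name: $pkg_name\n"
)

# Hypothetical mapping from the os_family grain to a package name
pkg_names = {"Debian": "vim", "RedHat": "vim-enhanced"}


def render(grains):
    """Fill in the template from system-specific grains before parsing."""
    return state_source.substitute(pkg_name=pkg_names[grains["os_family"]])


print(render({"os_family": "Debian"}))
print(render({"os_family": "RedHat"}))
```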

Salt Formulas

Salt Formulas are reusable packages of salt states and example pillar configuration data.

Salt Renderers

A Salt Renderer is a transformation function (e.g. a templating engine (default: Jinja2)) for transforming / preprocessing Salt States, Salt Pillar files, and really any text document.

Salt Pillar

A Salt Pillar is an interface composed of nested key-value pairs for storing and making available global and host-specific values for minions: values like hostnames, usernames, and keys.

  • Pillar configuration must be kept separate from states (e.g. users, keys) but works the same way.
  • In a master/minion configuration, minions do not have access to the whole pillar.
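
A minimal sketch of both points, nested key-value data and per-minion visibility. The data layout, `compile_pillar`, and `pillar_get` are hypothetical illustrations and not Salt's implementation or on-disk format:

```python
from fnmatch import fnmatch

# Hypothetical pillar data keyed by target glob
pillar_by_target = {
    "*": {"users": {"deploy": {"shell": "/bin/bash"}}},
    "db*": {"postgres": {"password": "hypothetical-secret"}},
}


def compile_pillar(minion_id):
    """Merge only the pillar entries whose target matches minion_id."""
    compiled = {}
    for target, data in pillar_by_target.items():
        if fnmatch(minion_id, target):
            compiled.update(data)
    return compiled


def pillar_get(pillar, path, default=None):
    """Look up a nested value with a colon-delimited path."""
    value = pillar
    for key in path.split(":"):
        if not isinstance(value, dict) or key not in value:
            return default
        value = value[key]
    return value


web = compile_pillar("web1")
db = compile_pillar("db1")
print(pillar_get(web, "users:deploy:shell"))
print(pillar_get(web, "postgres:password"))  # not visible to web1
print(pillar_get(db, "postgres:password"))
```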
Salt Cloud

Salt Cloud can provision cloud image, instance, and networking services with various cloud providers (Libcloud):

Virtualization

libcontainer

libcontainer is a library built by Docker to replace LXC.

Libcontainer provides a native Go implementation for creating containers with namespaces, Cgroups, capabilities, and filesystem access controls.

https://github.com/opencontainers/runc/tree/master/libcontainer

Open Container Initiative

The Open Container Initiative (OCI) is a Linux Foundation collaborative project dedicated to developing a working, portable software container specification.

runC

runC is a container abstraction

runc is a CLI tool for spawning and running containers according to the Open Container Initiative specification.

Docker

Docker is an OS virtualization project written in Go which utilizes Linux containers – first LXC now libcontainer / runC – to partition process workloads across one or more host systems.

Dockerfile
A Dockerfile contains the instructions needed to create a docker image.
Docker container
A Docker container is an instance of a Docker Image with configuration.
Docker API

The Docker API is an interface of management commands for provisioning and managing containers.

Docker Machine, Docker Swarm, and Docker Universal Control Plane all implement the Docker API; so the docker client works equally well with each implementation.

Docker Machine
Docker Machine is the container management application which implements the Docker API.
Docker Swarm
Docker Swarm is a cluster management system for Docker containers hosted on one or more Docker Machines
Docker Universal Control Plane
Docker Universal Control Plane is an enterprise-grade cluster management solution with a web dashboard and external authentication which implements the Docker API.
Docker Compose
Docker Compose is a Python application for defining and managing services (Docker containers) and networks with a docker-compose.yml YAML configuration file.
Docker Image
A Docker Image is an archived container filesystem with configuration which is usually defined by a Dockerfile.
Docker Hub
Docker Hub is a cloud-based registry service for Docker Images.
Docker Cloud
Docker Cloud is the hosting service offered by Docker.
  • Docker images build from a Dockerfile
  • A Dockerfile can subclass another Dockerfile (to add, remove, or change configuration)
  • Dockerfiles support a limited number of instructions
  • Docker is not intended to be a complete configuration management system
  • Ideally, a Docker image requires minimal configuration once built
  • Docker images can be hosted by https://hub.docker.com/
  • docker run -it ubuntu:16.04 downloads the image from https://hub.docker.com/_/ubuntu/, creates a new instance (docker ps), and spawns a root Shell with a UUID name (by default).
  • There are a number of ways to “Schedule” [redundant] persistent containers that launch on boot with Docker
    • Docker Swarm is the Docker-native way to run a cluster of containers. To a client app, Docker Swarm looks just like Docker Machine because it implements the Docker API.
    • Kubernetes is one project which uses Docker to schedule redundant, optionally geodistributed, LXC containers (in “Pods”).

Salt can install and manage docker, docker images and containers:

Cloud Native Computing Foundation

The Cloud Native Computing Foundation (CNCF) is a foundation for cloud and container industry collaboration.

Kubernetes-Mesos

kubernetes-mesos integrates Kubernetes Docker Pod scheduling with Mesos.

Kubernetes and Mesos are a match made in heaven.

Kubernetes enables the Pod, an abstraction that represents a group of co-located containers, along with Labels for service discovery, load-balancing, and replication control.

Mesos provides the fine-grained resource allocations for pods across nodes in a cluster, and facilitates resource sharing among Kubernetes and other frameworks running on the same cluster.

KVM

KVM is a full virtualization platform with support for Intel VT and AMD-V; which supports running various guest operating systems, each with their own kernel, on a given host machine.

Libcloud

Apache libcloud is a Python library which abstracts and unifies a large number of Cloud APIs for Compute Resources, Object Storage, Load Balancing, and DNS.

Salt Cloud depends upon Libcloud.

Libvirt

Libvirt is a system for platform virtualization with various Linux hypervisors.

LXC

LXC (“Linux Containers”), written in C, builds upon Linux Cgroups to provide containerized OS chroots (all running under the host kernel).

The kernel features that LXC builds upon (Cgroups and namespaces) are included in recent Linux kernels.

LXD

LXD, written in Go, builds upon LXC to provide a system-wide daemon and an OpenStack Nova hypervisor plugin.

Mesos

Apache Mesos is a highly-available distributed datacenter operating system, for which there are many different task/process/service schedulers.

Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively.

OpenStack

OpenStack is a platform of infrastructure services for running a cloud datacenter (a private or a public cloud).

OpenStack makes it possible for end-users to create a new virtual machine from the available pool of resources.

rdfs:seeAlso: OpenStack DevStack, Libcloud

Packer

Packer generates machine images for multiple platforms, clouds, and hypervisors from a parameterizable template.

Packer Artifact
Build products: machine image and manifest
Packer Template
JSON build definitions with optional variables and templating
Packer Build
Task defined by a JSON file containing build steps which produce a machine image
Packer Builder

Packer components which produce machine images for one of many platforms:

Packer Provisioner

Packer components for provisioning machine images at build time

  • Shell scripts
  • File uploads
  • ansible
  • chef
  • solo
  • puppet
  • salt
Packer Post-Processor
Packer components for compressing and uploading built machine images
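
To show how these pieces fit together in one JSON template, here is a sketch that builds a template as a Python dict and serializes it. Every value is a hypothetical placeholder, and the builder entry is deliberately incomplete (a real builder needs more settings than shown):

```python
import json

# A hypothetical, incomplete Packer JSON template
template = {
    "variables": {"hostname": "devserver"},
    "builders": [
        {"type": "virtualbox-iso", "vm_name": "{{user `hostname`}}"},
    ],
    "provisioners": [
        {"type": "shell", "inline": ["echo provisioning"]},
    ],
    "post-processors": [
        {"type": "vagrant"},
    ],
}

template_json = json.dumps(template, indent=2)
print(template_json)
```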

Vagrant

Vagrant is a tool written in Ruby for creating and managing virtual machine instances with CPU, RAM, Storage, and Networking.

vagrant help
vagrant status
vagrant init ubuntu/trusty64
vagrant up
vagrant ssh
$EDITOR Vagrantfile
vagrant provision
vagrant halt
vagrant destroy
Vagrantfile

Vagrant script defining a team of one or more virtual machines and networks.

Create a Vagrantfile:

vagrant init [basebox]
cat Vagrantfile

Start virtual machines and networks defined in the Vagrantfile:

vagrant status
vagrant up
Vagrant Box

Vagrant base virtual machine image.

There are many baseboxes for various operating systems.

Essentially a virtual disk plus CPU, RAM, Storage, and Networking metadata.

Locally-stored and cached vagrant boxes can be listed with:

vagrant help box
vagrant box list

A running vagrant environment can be packaged into a new box with:

vagrant package

Packer generates VirtualBox Vagrant Boxes with a Post-Processor.

Vagrant Cloud

Vagrant-hosted public Vagrant Box storage.

Install a box from Vagrant cloud:

vagrant init ubuntu/trusty64
vagrant up
vagrant ssh
Vagrant Provider

A driver for running Vagrant Boxes with a hypervisor or in a cloud.

The Vagrant VirtualBox Provider is well-supported.

With Plugins: https://github.com/mitchellh/vagrant/wiki/Available-Vagrant-Plugins

See also: Libcloud.

Vagrant Provisioner

Set of hooks to install and run shell scripts and configuration management tools over vagrant ssh.

vagrant up runs vagrant provision on its first invocation.

vagrant provision

Note

Vagrant configures a default synced folder mounted at /vagrant.

Note

Vagrant adds a default NAT Adapter as eth0; presumably for DNS, the default route, and to ensure vagrant ssh connectivity.

VirtualBox

Src: svn svn://www.virtualbox.org/svn/vbox/trunk

Oracle VirtualBox is a platform virtualization package for running one or more guest VMs (virtual machines) within a host system.

VirtualBox:

  • runs on many platforms: Linux, OSX, Windows
  • has support for full platform virtualization with Intel VT-x / AMD-V
  • requires matching kernel modules

Vagrant scripts VirtualBox.

Shells

Bash

GNU Bash, the Bourne-again shell.

type bash
bash --help
help help
help type
apropos bash
info bash
man bash
  • Designed to work with unix command outputs and return codes

  • Functions

  • Portability: sh (sh, bash, dash, zsh) shell scripts are mostly compatible

  • Logging:

    set -x  # print commands and arguments
    set -v  # print source
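
Because shell commands communicate through output streams and return codes, other programs can drive them in the same way. A small sketch from Python, assuming a POSIX sh is available on the PATH:

```python
import subprocess

# Run a small shell script and capture its output and return code
result = subprocess.run(
    ["sh", "-c", "echo hello; exit 3"],
    capture_output=True,
    text=True,
)
print(result.stdout.strip())  # text written to stdout
print(result.returncode)      # 0 means success; non-zero means failure
```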
    

Bash Configuration:

/etc/profile
/etc/bash.bashrc
/etc/profile.d/*.sh
${HOME}/.profile        /etc/skel/.profile   # PATH=+$HOME/bin  # umask
${HOME}/.bash_profile   # empty. preempts .profile

Availability: Linux (almost always installed), Mac (Bash 3.2), Windows (Cygwin / MinGW).

IPython

IPython is an interactive REPL and distributed computation framework written in Python.

1 + 1
x = 1+1
print("1 + 1 = %d" % (x))

# IPython
?                              # help
%lsmagic
%<tab>                         # list magic commands and aliases
%logstart?                     # help for the %logstart magic command
%logstart -o logoutput.log.py  # log input and output to a file
import json
json?                          # print(json.__doc__)
json??                         # print(inspect.getsource(json))

# IPython shell
!cat ./README.rst; echo $PWD   # run shell commands
lines = !ls -al                # capture shell command output
print(lines[0:])
%run -i -t example.py          # run a script with timing info,
                               # in the local namespace
%run -d example.py             # run a script with pdb
%pdb on                        # automatically run pdb on Exception

PowerShell

Windows PowerShell is a shell for Windows.

Shell Utilities

Awk

AWK is a pattern programming language for matching and transforming text.

Grep

Grep is a commandline utility for pattern-based text matching.

Htop

Htop is a commandline task manager; like top extended.

Pyline

Pyline is an open source POSIX command-line utility for streaming line-based processing in Python with regex and output transform features similar to Grep, Sed, and Awk.

  • Pyline can generate quoted CSV, JSON, HTML, etc.
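
The general idea of streaming, regex-based line processing can be sketched in a few lines of Python. This is not Pyline's implementation; `process_lines` and the sample input are hypothetical:

```python
import re


def process_lines(lines, pattern, transform):
    """Yield transform(match) for each line that matches pattern.

    Filtering is what Grep does; the transform step is closer to
    Sed or Awk.
    """
    rgx = re.compile(pattern)
    for line in lines:
        match = rgx.search(line)
        if match:
            yield transform(match)


# Hypothetical input lines
log = ["GET /index.html 200", "GET /missing 404", "POST /form 200"]
ok_paths = list(process_lines(log, r"^(\w+) (\S+) 200$", lambda m: m.group(2)))
print(ok_paths)
```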

Pyrpo

Pyrpo is an open source POSIX command-line utility for locating and generating reports from Git, Mercurial, Bazaar, and Subversion repositories.

Sed

GNU Sed is an open source POSIX command-line utility for transforming text.

Note

BSD Sed

Use <Ctrl-V><tab> for explicit tabs (as \t does not work)

Use \\\n or $'\n' for newlines (as \n does not work)

sed -E should provide consistent extended regular expressions between GNU Sed (e.g. Linux) and BSD Sed (FreeBSD, OSX).

OR: brew install gnu-sed

See: https://unix.stackexchange.com/questions/101059/sed-behaves-different-on-freebsd-and-on-linux

See: https://superuser.com/questions/307165/newlines-in-sed-on-mac-os-x

Web Shells

IPython Notebook

IPython Notebook is a web-based shell for interactive and literate computing with IPython notebooks.

  • An IPython notebook (.ipynb) is a JSON document containing input and output for a linear sequence of cells; which can be exported to many output formats (e.g. HTML, RST, LaTeX, PDF); and edited through the web with IPython Notebook.

  • IPython Notebook supports Markdown syntax for comment cells.

  • IPython Notebook supports more than 40 different IPython kernels for other languages:

    https://github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages

  • IPython Notebook development has now moved to Jupyter Notebook; which supports IPython kernels (and defaults to the IPython CPython 2 or 3 kernel).
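
To make the "JSON document containing a linear sequence of cells" concrete, here is a minimal sketch of a notebook in nbformat 4 built with the standard library only. The cell contents are hypothetical:

```python
import json

# A minimal notebook in nbformat 4 with one Markdown cell and one code cell
notebook = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print(1 + 1)"],
        },
    ],
}

# Round-trip through JSON, as when saving and loading an .ipynb file
loaded = json.loads(json.dumps(notebook))
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```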

To start IPython Notebook (assuming the _SRC variable as defined in Venv):

pip install ipython[notebook]
# pip install -e git+https://github.com/ipython/ipython@rel-3.2.1#egg=ipython
# https://github.com/ipython/ipython/releases

mkdir $_SRC/notebooks; cd $_SRC/notebooks
ipython notebook

ipython notebook --notebook-dir="${_SRC}/notebooks"

# With HTTPS (TLS/SSL)
ipython notebook \
 --ip=127.0.0.1 \
 --certfile=mycert.pem \
 --keyfile=privkey.pem \
 --port=8888 \
 --browser=web  # (optional) westurner/dotfiles/scripts/web

 # List supported options
 ipython notebook --help

Warning

IPython Notebook runs code and shell commands as the user the process is running as, on a remote or local machine.

Reproducible SciPy Stack IPython Notebook / Jupyter Notebook servers implement best practices like process isolation and privilege separation with e.g. Docker and/or Jupyter Hub.

Note

IPython Notebook is now Jupyter Notebook.

Jupyter Notebook runs Python notebooks with the IPython CPython kernel (from IPython Notebook).

ipython_nose

ipython_nose is an extension for IPython Notebook (and likely Jupyter Notebook) for discovering and running test functions starting with test_ (and unittest.TestCase test classes with names containing Test) with Nose.

  • ipython_nose is not (yet?) uploaded to PyPI
  • to install ipython_nose from GitHub (with Pip and Git):
pip install -e git+https://github.com/taavi/ipython_nose#egg=ipython_nose
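
The discovery convention (callables whose names start with test_) can be sketched as follows. This is not how ipython_nose or Nose is implemented; the `run_tests` helper and the namespace dict, which stands in for a notebook's user namespace, are hypothetical:

```python
def run_tests(namespace):
    """Run every callable whose name starts with test_; map name to result."""
    results = {}
    for name, obj in sorted(namespace.items()):
        if name.startswith("test_") and callable(obj):
            try:
                obj()
            except AssertionError:
                results[name] = "FAIL"
            else:
                results[name] = "ok"
    return results


def _passes():
    assert 1 + 1 == 2


def _fails():
    assert 1 + 1 == 3


# "helper" is skipped because its name does not start with test_
namespace = {"test_addition": _passes, "test_broken": _fails, "helper": _passes}
results = run_tests(namespace)
print(results)
```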

See also:

nosebook

nosebook is a tool for finding and running tests in nbformat IPython Notebooks and Jupyter Notebooks with nose.

See also:

Jupyter

Project Jupyter expands upon components like IPython and IPython Notebook to provide a multi-user web-based shell for many languages (Python, Ruby, Java, Haskell, Julia, R).

IPython Jupyter comparison (adapted from http://jupyter.org)

IPython:

  • Interactive Python shell
  • Python kernel for Jupyter
  • Interactive Parallel Python

Jupyter:

  • Rich REPL Protocol
  • Jupyter Notebook (format, environment, conversion)
  • JupyterHub (multi-user notebook server)
  • JupyterHub authenticators (MediaWiki OAuth, GitHub OAuth)
  • JupyterHub spawners (Docker, Sudo, Remote, Docker Swarm)

Jupyter Notebook

Jupyter Notebook is the latest IPython Notebook.

The Jupyter HTML Notebook is a web-based notebook environment for interactive computing.

Warning

Jupyter Notebook runs code and shell commands as the user the process is running as, on a remote or local machine.

Reproducible SciPy Stack IPython Notebook / Jupyter Notebook servers implement best practices like process isolation and privilege separation with e.g. Docker and/or Jupyter Hub.

Jupyter Drive

Jupyter Drive adds support to Jupyter Notebook for reading and writing nbformat notebook .ipynb files to and from Google Drive.

nbconvert

nbconvert is the code that converts (transforms) an .ipynb notebook (nbformat JSON) file into an output representation (e.g. HTML, slides (reveal.js), LaTeX, PDF, ePub, Mobi).

  • nbconvert is included with IPython

  • nbconvert is part of Project Jupyter

    pip install nbconvert
    # pip install -e git+https://github.com/jupyter/nbconvert@master#egg=nbconvert
    
    ipython nbconvert --to html mynotebook.ipynb
    jupyter nbconvert --to html mynotebook.ipynb
    
  • reveal.js is an HTML presentation framework for slides in a 1D or 2D arrangement.

    • Presentation content that doesn’t fit on the slide is hidden and unscrollable (only put a slide worth of data in each cell for a Jupyter reveal.js presentation).

      jupyter nbconvert --to slides mynotebook.ipynb
      
  • RISE does live reveal.js notebook presentations

    https://github.com/damianavila/RISE

nbformat

The Jupyter Notebook (.ipynb) format is a versioned JSON format for storing metadata and input/output sequences.

Usually, when the nbformat changes, notebooks are silently upgraded to the new version on the next save.

Note

nbformat v3 and above add a kernelspec attribute to the nbformat JSON, because .ipynb files can now contain code for languages other than Python.

  • nbformat does not specify any schema for the user-editable metadata dict (TODO), so JSON that conforms to an externally managed JSON-LD @context would work.
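
A short sketch of reading the format version and the kernelspec from notebook JSON. The notebook content is hypothetical, and the "@context" and "name" metadata keys illustrate user-supplied metadata; they are not part of nbformat:

```python
import json

# Hypothetical notebook JSON with user-supplied metadata
raw = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {
        "kernelspec": {"name": "python3", "display_name": "Python 3",
                       "language": "python"},
        "@context": {"name": "http://schema.org/name"},
        "name": "Example notebook",
    },
    "cells": [],
})

nb = json.loads(raw)
version = (nb["nbformat"], nb["nbformat_minor"])
kernel_name = nb["metadata"].get("kernelspec", {}).get("name")
print(version, kernel_name)
```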

nbviewer

Jupyter Notebook Viewer (nbviewer) is an application for serving read-only versions of .ipynb files which have HTTP URLs.

GitHub now also renders static .ipynb files, CSV, SVG, and PDF.

runipy

runipy runs Jupyter notebooks from a Shell commandline, generates HTML reports, and can write errors to stderr.

Jupyter notebook manual test review process:

# - run Jupyter Notebook server
!jupyter notebook
# - Browser
#     - navigate to / upload / drag and drop the notebook
        !web http://localhost:8888   # or https://
#     - (optional) click 'TODO Restart Kernel'
#     - (optional) click 'Cell' > 'All Output' > 'Clear'
#     - click 'Cell' > 'Run All'
#     - [wait] <Jupyter Kernel runs notebook>
#     - visually seek for the first ERRoring cell (scroll)
#     - review the notebook
        for (i, o) in notebook_cells:
            human.manually_review((i, o))
# - Compare the files on disk with the most recent commit (HEAD)
!git status && git diff
!git diff mynotebook.ipynb
# - Commit the changes
!git-add-commit "TST: mynotebook: tests for #123" ./mynotebook.ipynb

Jupyter notebook runipy review process:

# - runipy the Jupyter notebook
!runipy mynotebook.ipynb
# - review stdout and stderr from runipy
# - review in browser (optional; recommended)
#     - navigate to the converted HTML
        !web ./mynotebook.ipynb.html
#     - visually seek for the first ERRoring cell (scroll)
#     - review the notebook
        for (i, o) in notebook_cells:
            human.manually_review((i, o))
# - Compare the files on disk with the most recent commit (HEAD)
!git status && git diff
!git diff mynotebook.ipynb*
# - Commit the changes
!git-add-commit "TST: mynotebook: tests for #123" ./mynotebook.ipynb*
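
The "seek for the first erroring cell" step can be partly automated by reading the executed notebook JSON. A minimal sketch, assuming nbformat 4 output records; `first_error_cell` and the sample notebook are hypothetical:

```python
def first_error_cell(notebook):
    """Return (index, ename) of the first code cell with an error output."""
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            if output.get("output_type") == "error":
                return index, output.get("ename")
    return None


# Hypothetical executed notebook with a failing second cell
executed = {
    "cells": [
        {"cell_type": "code", "outputs": [
            {"output_type": "stream", "name": "stdout", "text": ["2\n"]}]},
        {"cell_type": "code", "outputs": [
            {"output_type": "error", "ename": "NameError",
             "evalue": "name 'x' is not defined", "traceback": []}]},
    ],
}
print(first_error_cell(executed))
```

This does not replace the manual review; it only points at where to start reading.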

RISE

Reveal.js - Jupyter/IPython Slideshow Extension, also known as live_reveal

Dotfiles

Dotfiles are userspace shell configuration in files that are often prefixed with “dot” (e.g. ~/.bashrc for Bash)

Venv

Venv is a tool for making working with Virtualenv, Virtualenvwrapper, Bash, ZSH, Vim, and IPython within a project context very easy.

Venv defines standard Filesystem Hierarchy Standard and Python paths, environment variables, and aliases for routinizing workflow.

cdaliases help: Bash: cdhelp; IPython: %cdhelp; Vim: :Cdhelp

var name         description                     cdaliases                 example path
HOME             user home directory             Bash/ZSH: cdh, cdhome     ~/
                                                 IPython: %cdh, %cdhome
                                                 Vim: :Cdh, :Cdhome
__WRK            workspace root                  cdwrk (ibid.)             ~/-wrk
WORKON_HOME      virtualenvs root                cdwh, cdworkonhome, cdve  ~/-wrk/-ve27
CONDA_ENVS_PATH  condaenvs root                  cdch, cdcondahome         ~/-wrk/-ce27
VIRTUAL_ENV      virtualenv root                 cdv, cdvirtualenv         ~/-wrk/-ve27/dotfiles
_BIN             virtualenv executables          cdb, cdbin                ~/-wrk/-ve27/dotfiles/bin
_ETC             virtualenv configuration        cd, cdetc                 ~/-wrk/-ve27/dotfiles/etc
_LIB             virtualenv lib directory        cdl, cdlib                ~/-wrk/-ve27/dotfiles/lib
_LOG             virtualenv log directory        cdlog                     ~/-wrk/-ve27/dotfiles/var/log
_SRC             virtualenv source repositories  cds, cdsrc                ~/-wrk/-ve27/dotfiles/src
_WRD             virtualenv working directory    cdw, cdwrd                ~/-wrk/-ve27/dotfiles/src/dotfiles
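
The underscore-prefixed variables are all derived from the virtualenv root. A minimal sketch of that derivation, which mirrors the example paths listed here but is not the actual venv.py implementation (`venv_paths` is a hypothetical helper):

```python
import posixpath


def venv_paths(virtual_env, appname):
    """Derive the Venv path variables from a virtualenv root and a name."""
    return {
        "VIRTUAL_ENV": virtual_env,
        "_BIN": posixpath.join(virtual_env, "bin"),
        "_ETC": posixpath.join(virtual_env, "etc"),
        "_LIB": posixpath.join(virtual_env, "lib"),
        "_LOG": posixpath.join(virtual_env, "var", "log"),
        "_SRC": posixpath.join(virtual_env, "src"),
        "_WRD": posixpath.join(virtual_env, "src", appname),
    }


paths = venv_paths("~/-wrk/-ve27/dotfiles", "dotfiles")
for name, path in paths.items():
    print(name, path)
```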

To generate this venv config:

python -m dotfiles.venv.ipython_config --print-bash dotfiles
venv.py --print-bash dotfiles
venv --print-bash dotfiles docs
venv --print-bash dotfiles ~/path
venv --print-bash ~/-wrk/-ve27/dotfiles ~/path

To generate a default venv config with a prefix of /:

venv --print-bash --prefix=/

To launch an interactive shell within a venv:

venv --run-bash dotfiles
venv -xb dotfiles

Note

pyvenv is the Virtualenv-like functionality now included in Python >= 3.3 (python3 -m venv)

Python pyvenv docs: https://docs.python.org/3/library/venv.html

Virtualenv

Virtualenv is a tool for creating reproducible Python environments.

Virtualenv sets the shell environment variable $VIRTUAL_ENV when active.

Virtualenv installs a copy of Python, Setuptools, and Pip when a new virtualenv is created.

A virtualenv is activated by source-ing ${VIRTUAL_ENV}/bin/activate.

Paths within a virtualenv are more-or-less FHS standard paths, which makes virtualenv structure very useful for building chroot and container overlays.

A standard virtual environment:

bin/           # pip, easy_install, console_scripts
bin/activate   # source bin/activate to work on a virtualenv
include/       # (symlinks to) dev headers (python-dev/python-devel)
lib/           # libraries
lib/python2.7/distutils/
lib/python2.7/site-packages/  # pip and easy_installed packages
local/         # symlinks to bin, include, and lib
src/           # editable requirements (source repositories)

# also useful
etc/           # configuration
var/log        # logs
var/run        # sockets, PID files
tmp/           # mkstemp temporary files with permission bits
srv/           # local data

Virtualenvwrapper wraps virtualenv.

echo $PATH; echo $VIRTUAL_ENV
python -m site; pip list

virtualenv example               # mkvirtualenv example
source ./example/bin/activate    # workon example

echo $PATH; echo $VIRTUAL_ENV
python -m site; pip list

ls -altr $VIRTUAL_ENV/lib/python*/site-packages/**  # lssitepackages -altr

Note

Venv extends Virtualenv and Virtualenvwrapper.

Note

Python 3.3+ now also includes a module called venv (python3 -m venv), which performs the same functions and works similarly to virtualenv: https://docs.python.org/3/library/venv.html.
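
The standard-library venv can also be driven from Python. A minimal sketch that creates a throwaway environment in a temporary directory and lists its top-level entries; the directory name "example" is arbitrary, and Pip is skipped only to keep the example fast:

```python
import os
import tempfile
import venv

# Create a throwaway virtual environment with the stdlib venv module
with tempfile.TemporaryDirectory() as tmpdir:
    env_dir = os.path.join(tmpdir, "example")
    venv.create(env_dir, with_pip=False)
    entries = sorted(os.listdir(env_dir))

print(entries)
```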

Virtualenvwrapper

Virtualenvwrapper is a tool which extends Virtualenv.

Virtualenvwrapper provides a number of useful shell commands and python functions for working with and within virtualenvs, as well as project event scripts (e.g. postactivate, postmkvirtualenv) and two filesystem configuration variables useful for structuring development projects of any language within virtualenvs: $PROJECT_HOME and $WORKON_HOME.

Virtualenvwrapper is sourced into the shell:

# pip install --user --upgrade virtualenvwrapper
source ~/.local/bin/virtualenvwrapper.sh

# sudo apt-get install virtualenvwrapper
source /etc/bash_completion.d/virtualenvwrapper

Note

Venv extends Virtualenv and Virtualenvwrapper.

echo $PROJECT_HOME; echo ~/workspace             # venv: ~/-wrk
cd $PROJECT_HOME                                 # venv: cdp; cdph
echo $WORKON_HOME;  echo ~/.virtualenvs          # venv: ~/-wrk/-ve27
cd $WORKON_HOME                                  # venv: cdwh; cdwrk

mkvirtualenv example
workon example                                   # venv: we example

cdvirtualenv; cd $VIRTUAL_ENV                    # venv: cdv
echo $VIRTUAL_ENV; echo ~/.virtualenvs/example   # venv: ~/-wrk/-ve27/example

mkdir src ; cd src/                              # venv: cds; cd $_SRC

pip install -e git+https://github.com/westurner/dotfiles#egg=dotfiles

cd src/dotfiles; cd $VIRTUAL_ENV/src/dotfiles    # venv: cdw; cds dotfiles
head README.rst

                                                 # venv: cdpylib
cdsitepackages                                   # venv: cdpysite
lssitepackages

deactivate
rmvirtualenv example

lsvirtualenvs; ls -d $WORKON_HOME                # venv: lsve; lsve 'ls -d'

Window Managers

Compiz

Compiz is a window compositing layer for X11 which adds lots of cool and productivity-enhancing visual capabilities.

Compiz works with Gnome, KDE, and Qt applications.

f.lux

f.lux is a userspace utility for gradually adjusting the blue color channel throughout the day; or as needed.

  • A similar effect can be accomplished with the X11 xgamma command (e.g. for Linux platforms where the latest f.lux is not yet available). A few keybindings from an I3wm configuration here:

    # [...] #L105
    set $xgamma_reset    xgamma -gamma 1.0
    set $xgamma_soft     xgamma -bgamma 0.6 -ggamma 0.9 -rgamma 0.9
    set $xgamma_soft_red xgamma -bgamma 0.4 -ggamma 0.6 -rgamma 0.9
    # [...] #L200
    ## Start, stop, and reset xflux
    #  <alt> [         -- start xflux
    bindsym $mod+bracketleft    exec --no-startup-id $xflux_start
    #  <alt> ]         -- stop xflux
    bindsym $mod+bracketright   exec --no-startup-id $xflux_stop
    #  <alt><shift> ]  -- reset gamma to 1.0
    bindsym $mod+Shift+bracketright  exec --no-startup-id $xgamma_reset
    #  <alt><shift> [  -- xgamma -bgamma 0.6 -ggamma 0.9 -rgamma 0.9
    bindsym $mod+Shift+bracketleft exec --no-startup-id $xgamma_soft
    #  <alt><shift> p  -- xgamma -bgamma 0.4 -ggamma 0.6 -rgamma 0.9
    bindsym $mod+Shift+p exec --no-startup-id $xgamma_soft_red
    

i3wm

i3wm is a tiling window manager for X11 (Linux) with extremely-configurable Vim-like keyboard shortcuts.

i3wm works with Gnome, KDE, and Qt applications.

Qt

Qt is a Graphical User Interface toolkit for developing applications for Android, iOS, OS X, Windows, Embedded Linux, and X11.

Wayland

Wayland is a display server protocol for GUI window management.

Wayland is an alternative to X11 servers like XFree86 and X.org.

The reference Wayland implementation, Weston, is written in C.

X11

Src: git git://anongit.freedesktop.org/git/xorg/

X Window System (X, X11) is a display server protocol for window management (drawing windows on the screen).

Most UNIX and Linux systems utilize XFree86 or the newer X.org X11 display servers.

Gnome, KDE, I3wm, OS X, and Compiz build upon X11.

Browsers

Chromium

The Chromium Projects include the Chromium Browser and ChromiumOS.

Chrome DevTools

  • Right-click > “Inspect Element”
  • OSX: <option> + <command> + i

DevTools Emulation

pbm

  • backup and organize { Chrome , Chromium } Bookmarks JSON in an offline batch
  • date-based transforms
  • quicklinks
  • starred bookmarks (with trailing ##)

Firefox Android

Firefox Android Extensions

Internet Explorer

Internet Explorer is the web browser included with Windows.

See also: Microsoft Edge

Microsoft Edge

Microsoft Edge replaces Internet Explorer as the default web browser in Windows 10.

Safari iOS

Browser Extensions

Accessibility Extensions

Tiësto

The Tiësto Chrome Theme is a Dark Theme for Chrome.

Safety Extensions

uBlock

# List the release tags and asset filenames of the uBlock downloads
_repo="chrisaljoudi/ublock"
curl -s "https://api.github.com/repos/${_repo}/releases" > ./releases.json
grep browser_download_url ./releases.json \
    | pyline 'w and w[1][1:-1]' \
    | pyline --regex \
        '.*download/(.*)/(uBlock\.(firefox\.xpi|chromium\.zip))$' \
        'rgx and rgx.group(1,2)'
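
The same extraction can be sketched in Python by parsing the JSON instead of grepping it. This is a minimal sketch rather than part of the pipeline above: the function name release_assets and the sample data (including the example.org URLs) are hypothetical, and it assumes the "assets" and "browser_download_url" fields of a GitHub releases API response.

```python
import json
import re

# Hypothetical sample, shaped like a GitHub "list releases" response.
SAMPLE = json.dumps([
    {"tag_name": "v1.0", "assets": [
        {"browser_download_url":
            "https://example.org/download/v1.0/uBlock.chromium.zip"},
        {"browser_download_url":
            "https://example.org/download/v1.0/uBlock.firefox.xpi"},
        {"browser_download_url":
            "https://example.org/download/v1.0/uBlock.safari.safariextz"},
    ]},
])

# Same pattern as the shell pipeline, with the literal dots escaped.
ASSET_RGX = re.compile(
    r".*download/(.*)/(uBlock\.(?:firefox\.xpi|chromium\.zip))$")


def release_assets(releases_json):
    """Return (version, filename) pairs for the matching download URLs."""
    pairs = []
    for release in json.loads(releases_json):
        for asset in release.get("assets", []):
            match = ASSET_RGX.match(asset.get("browser_download_url", ""))
            if match:
                pairs.append(match.group(1, 2))
    return pairs


if __name__ == "__main__":
    for version, filename in release_assets(SAMPLE):
        print(version, filename)
```

Parsing the JSON avoids depending on how the API response happens to be split across lines.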

Content Extensions

Hypothesis

Hypothesis can also be included as a sidebar on a site:

<script async defer src="//hypothes.is/embed.js"></script>

Zotero

Zotero archives and tags resources with bibliographic metadata.

  • Zotero is really helpful for research.

  • Browsers other than Firefox connect to Zotero Standalone

  • Zotero can store a full-page archive of a given resource (e.g. HTML, PDF)

  • Zotero can store and synchronize data on Zotero’s servers with Zotero File Storage

  • Zotero can store and synchronize data over WebDAV

  • Zotero can export a collection of resources’ bibliographic metadata in one of many citation styles (“CSL”) (e.g. MLA, APA, [Journal XYZ])

  • Zotero can export a collection of resources’ bibliographic metadata as RDF

  • There are a number of plugins and integrations with Zotero:

    https://www.zotero.org/support/plugins

[ ] Zotero and Schema.org RDFa

> How would I go about adding HTML + RDFa [1] and/or HTML + Microdata [2] export templates with Schema.org classes and properties to Zotero?

Development Extensions

Requirify

Requirify adds NPM modules to the local namespace (e.g. from Chrome DevTools JS console).

> require() npm modules in the browser console

Local-requirify

Require local NPM modules with Requirify

Vim Extensions

Vimium

Vimium is a Chrome Extension which adds Vim-like functionality.

function                                  vimium shortcut
help                                      ?
jump to link in current/New tab           f / F
copy link to clipboard                    yf
open clipboard link in current/New tab    p / P
 

Vimperator

Vimperator is a Firefox extension which connects a JS shell with Vim-style command interpretation to the Firefox API.

  • vimperatorrc can configure settings in about:config

Documentation Tools

Docutils

Docutils is a Python library which parses the ReStructuredText lightweight markup language into a doctree (~DOM) which can be serialized into HTML, ePub, MOBI, LaTeX, man pages, Open Document files, XML, JSON, and a number of other formats.

Pandoc

Pandoc is a “universal” markup converter written in Haskell which can convert between HTML, BBCode, Markdown, MediaWiki Markup, ReStructuredText, and a number of other formats.

Pgs

pgs is an open source web application written in Python for serving static files from a Git branch, or from the local filesystem.

pgs -p "${_WRD}/_build/html" -r gh-pages -H localhost -P 8082

  • pgs is written with the one-file Bottle web framework

  • compared to python -m SimpleHTTPServer 8000 / python3 -m http.server --bind localhost 8000, pgs has WSGI, the ability to read from a Git branch without real Git bindings, and caching HTTP headers based on Git or filesystem mtimes.

  • pgs does something like Nginx try_files $.html
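
For reference, Nginx's try_files directive serves the first candidate path that exists. A rough Python sketch of that kind of fallback lookup follows. It is an illustration only, not pgs's actual implementation; the function name try_files and the candidate order (exact path, then path + ".html", then an index.html inside the path) are assumptions.

```python
import os
import tempfile


def try_files(root, url_path):
    """Return the first existing file under root for url_path, else None."""
    root = os.path.abspath(root)
    relative = url_path.strip("/")
    # Assumed candidate order: exact file, file + ".html", directory index.
    candidates = [os.path.join(relative, "index.html")]
    if relative:
        candidates = [relative, relative + ".html"] + candidates
    for name in candidates:
        path = os.path.normpath(os.path.join(root, name))
        # Refuse paths which escape the document root (e.g. "/../secret").
        if os.path.commonpath([root, path]) != root:
            continue
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as docroot:
        with open(os.path.join(docroot, "about.html"), "w") as f:
            f.write("<h1>About</h1>")
        print(try_files(docroot, "/about"))    # resolves to about.html
        print(try_files(docroot, "/missing"))  # no candidate exists
```

With a lookup like this, a request for /about can be answered from about.html without the extension appearing in the URL.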

Sphinx

Sphinx is a tool for working with ReStructuredText documentation trees and rendering them into HTML, PDF, LaTeX, ePub, and a number of other formats.

Sphinx extends Docutils with a number of useful markup behaviors which are not supported by other ReStructuredText parsers.

Most other ReStructuredText parsers do not support Sphinx directives; so, for example, plain Docutils reports a toctree directive as an unknown directive type.

Sphinx Builder

A Sphinx Builder transforms ReStructuredText into various output forms:

  • HTML
  • LaTeX
  • PDF
  • ePub
  • MOBI
  • JSON
  • OpenDocument (OpenOffice)
  • Office Open XML (MS Word)

See: Sphinx Builders

Sphinx ReStructuredText

Sphinx extends ReStructuredText with roles and directives which only work with Sphinx.

Sphinx Directive

Sphinx extensions of Docutils ReStructuredText directives.

Most other ReStructuredText parsers do not support Sphinx directives.

.. toctree::

   readme
   installation
   usage

See: Sphinx Directives

Sphinx Role

Sphinx extensions of Docutils ReStructuredText roles.

Most other ReStructuredText parsers do not support Sphinx roles.

.. _anchor-name:

A link to :ref:`anchor <anchor-name>`.

Tinkerer

Tinkerer is a simple static blog generator written in Python which extends Sphinx and generates HTML from ReStructuredText.

Static HTML pages generated with Tinkerer do not require a serverside application, and can be easily hosted with GitHub Pages or any other web hosting service.

Backup Tools

Backup Ninja

Backup Ninja is an open source backup utility, written in Bash, which is configured with files in /etc/backup.d

  • Backup Ninja supports rdiff-backup, Duplicity, and rsync.
  • Backup Ninja can create and burn CD/DVD images.
  • Backup Ninja can back up a number of relational databases (MySQL, PostgreSQL), maildirs, SVN repositories, Trac instances, and LDAP.

bup

Bup (backup) is a backup system based on Git packfiles and rolling checksums.

> [Bup is a very] efficient backup system based on the git packfile format, providing fast incremental saves and global deduplication (among and within files, including virtual machine images).

  • AFAIU, like Git, Bup does not preserve file permissions, Access Control Lists, or extended attributes (though some archive formats and snapshot images do).

Clonezilla

Clonezilla is an open source Linux distribution, bootable from a CD/DVD/USB (a LiveCD, LiveDVD, LiveUSB) or over PXE, which contains a number of tools for disk imaging, disk cloning, and filesystem backup and recovery. There is also a server edition for serving disk images to one or more computers over a LAN.

  • Clonezilla contains FSArchiver, partclone, partimage, and rsync.
  • Clonezilla can back up and restore very many (if not most) filesystems.
  • Clonezilla supports MBR, GPT, and UEFI.
  • Clonezilla can restore a networked multicast group (e.g. lab) of machines to a system image (saving TCP overhead when sending the same multi-gigabyte / terabyte image to one or more machines); and boot them with PXE and/or Wake-on-LAN.
    • bup, debtorrent
  • Clonezilla can back up to disk, SSH, Samba, NFS, and WebDAV.
  • drbl-winroll helps with restoring Windows images
  • SystemRescueCD also contains partimage.
  • Cobbler also supports PXE boot from images.

Duplicity

Duplicity is an open source incremental file directory backup utility with GnuPG encryption, signatures, versions, and a number of actions for redundantly storing backups.

  • Duplicity can push offsite backups to/over a number of protocols and services (e.g. SSH/SCP/SFTP, S3, Google Cloud Storage, Rackspace Cloudfiles (OpenStack Swift)).
  • Duplicity stores data with tar archives and rdiff
  • rdiff-backup is similar to Duplicity.

FSArchiver

FSArchiver is an open source filesystem backup (disk cloning) utility which can preserve file permissions, labels, and extended attributes.

  • FSArchiver can back up a filesystem to a new filesystem or within an existing filesystem.
  • FSArchiver has special support for LVM.
  • FSArchiver supports password-based encryption.

partclone

partclone is an open source utility for making compressed backups of the used blocks of partitions with each specific filesystem driver.

partimage

Partimage is an open source utility for making complete sector-for-sector compressed backups of partitions over the network or to a local device.

rsync

rsync is an open-source file backup utility which can be used to make incremental backups using file deltas over the network or the local system.

  • rsync may appear to be stalled when it is actually calculating the full set of initial relative differences in order to minimize the amount of data transfer.

Note

rsync does not preserve file permissions by default.

To preserve file permissions with rsync:

man rsync

rsync -a    # rsync -rlptgoD
  rsync -r  # recursive (traverse into directories)
  rsync -l  # copy symlinks as links
  rsync -p  # preserve file permissions
  rsync -t  # preserve modification times
  rsync -g  # preserve group
  rsync -o  # preserve owner (requires superuser)
  rsync -D  # rsync --devices --specials
    rsync --devices   # preserve device files (requires superuser)
    rsync --specials  # preserve special files
rsync -A  # preserve file ACLs
rsync -X  # preserve file extended attributes

rsync -aAX  # rsync -a -A -X

rsync -v  # verbose
rsync -P  # rsync --partial --progress
  rsync --partial     # keep partially downloaded files
  rsync --progress    # show *per-file* progress and xfer speed

Note

rsync is picky about paths and trailing slashes.

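# setUp
mkdir -p A B
echo 'A' > A/one; echo 'B' > B/one
# tests (-a: recurse into directories)
rsync -a A B     # --> B/A/one  (copies the directory A into B)
rsync -a A B/    # --> B/A/one
rsync -a A/ B    # --> B/one    (copies the contents of A into B)
rsync -a A/ B/   # --> B/one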
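
The trailing-slash rule can be modeled as a small path computation: a trailing slash on the source means "the contents of this directory", while a trailing slash on the destination directory makes no difference. The sketch below is only a model of that rule for a recursive copy (e.g. rsync -a); the function name destination_path is hypothetical and the code does not call rsync.

```python
import posixpath


def destination_path(source_arg, dest_dir, relative_file):
    """Model where a recursive rsync places source_arg/relative_file."""
    if source_arg.endswith("/"):
        # "A/": copy the contents of A directly into dest_dir
        return posixpath.join(dest_dir, relative_file)
    # "A": copy the directory itself, creating dest_dir/A
    return posixpath.join(dest_dir, posixpath.basename(source_arg),
                          relative_file)


if __name__ == "__main__":
    for src, dst in [("A", "B"), ("A", "B/"), ("A/", "B"), ("A/", "B/")]:
        print("rsync -a", src, dst, "-->", destination_path(src, dst, "one"))
```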

rdiff

rdiff is an open source command-line tool (from librsync) which implements the relative delta algorithm of rsync.
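
The delta algorithm depends on a rolling checksum: a weak checksum of a fixed-size window that can be updated in constant time as the window slides one byte forward. Below is a simplified sketch of that idea in the style of the rsync weak checksum. It is not librsync's actual code, and the names weak_checksum, roll, and rolling_checksums are hypothetical.

```python
MOD = 1 << 16  # keep both running sums within 16 bits


def weak_checksum(block):
    """Compute the (a, b) checksum pair of a block of bytes from scratch."""
    n = len(block)
    a = sum(block) % MOD
    b = sum((n - i) * byte for i, byte in enumerate(block)) % MOD
    return a, b


def roll(a, b, outgoing, incoming, n):
    """Slide an n-byte window one byte to the right in O(1)."""
    a = (a - outgoing + incoming) % MOD
    b = (b - n * outgoing + a) % MOD
    return a, b


def rolling_checksums(data, n):
    """Yield (offset, (a, b)) for every n-byte window of data."""
    if len(data) < n:
        return
    a, b = weak_checksum(data[:n])
    yield 0, (a, b)
    for offset in range(1, len(data) - n + 1):
        a, b = roll(a, b, data[offset - 1], data[offset + n - 1], n)
        yield offset, (a, b)


if __name__ == "__main__":
    data = b"the quick brown fox jumps over the lazy dog"
    for offset, pair in rolling_checksums(data, 8):
        # The rolled value matches a from-scratch computation.
        assert pair == weak_checksum(data[offset:offset + 8])
    print("rolled checksums match recomputed checksums")
```

Because each step is O(1), a receiver can test every byte offset of a file for blocks it already has without recomputing a checksum per offset.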

rdiff-backup

rdiff-backup is an open source incremental file directory backup utility.

  • Like rsync, rdiff-backup transmits file deltas instead of entire files.
  • Unlike rsync, rdiff-backup manages reverting to previous revisions.

SystemRescueCD

SystemRescueCD is a Linux distribution which is bootable from a CD/DVD/USB (a LiveCD) which contains a number of helpful utilities for system maintenance.

Standards

CSS

CSS (Cascading Style Sheets) defines the presentational aspects of HTML and of a number of mobile and desktop web frameworks.

  • CSS is designed to ensure separation of data and presentation. With JavaScript, the separation is then data, code, and presentation.

Filesystem Hierarchy Standard

The Filesystem Hierarchy Standard (FHS) is a well-established, industry-supported standard for the directory layout and file naming structure of UNIX-like systems.

JSON

JSON is an object representation in JavaScript syntax which is now supported by libraries for many languages.

A list of objects with key and value attributes in JSON syntax:

[
{ "key": "language", "value": "JavaScript" },
{ "key": "version", "value": 1 },
{ "key": "example", "value": true }
]

Machine-generated JSON is often not very readable, because it doesn’t contain extra spaces or newlines. The Python JSON library contains a utility for parsing and indenting (“prettifying”) JSON from the commandline:

cat example.json | python -m json.tool
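
The same prettifying can be done from within Python with the standard library json module. A minimal sketch follows; the function name prettify is hypothetical.

```python
import json


def prettify(text, indent=4):
    """Parse a JSON string and return it indented, with sorted keys."""
    return json.dumps(json.loads(text), indent=indent, sort_keys=True)


if __name__ == "__main__":
    compact = '[{"key":"language","value":"JavaScript"},{"key":"version","value":1}]'
    print(prettify(compact))
```

json.loads raises an error on input which is not valid JSON, so this also serves as a quick syntax check.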

JSON-LD

JSON-LD is a web standard for Linked Data in JSON.

An example from the JSON-LD Playground (http://goo.gl/xxZ410):

{
   "@context": {
    "gr": "http://purl.org/goodrelations/v1#",
    "pto": "http://www.productontology.org/id/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "foaf:page": {
      "@type": "@id"
    },
    "gr:acceptedPaymentMethods": {
      "@type": "@id"
    },
    "gr:hasBusinessFunction": {
      "@type": "@id"
    },
    "gr:hasCurrencyValue": {
      "@type": "xsd:float"
    }
   },
   "@id": "http://example.org/cars/for-sale#tesla",
   "@type": "gr:Offering",
   "gr:name": "Used Tesla Roadster",
   "gr:description": "Need to sell fast and furiously",
   "gr:hasBusinessFunction": "gr:Sell",
   "gr:acceptedPaymentMethods": "gr:Cash",
   "gr:hasPriceSpecification": {
    "gr:hasCurrencyValue": "85000",
    "gr:hasCurrency": "USD"
   },
   "gr:includes": {
    "@type": [
      "gr:Individual",
      "pto:Vehicle"
    ],
    "gr:name": "Tesla Roadster",
    "foaf:page": "http://www.teslamotors.com/roadster"
   }
}
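
In the example above, terms such as gr:Offering are compact IRIs: the prefix before the colon is looked up in @context and replaced with the full namespace IRI. The sketch below illustrates only that prefix substitution. It is a simplified assumption-laden model, not the full JSON-LD expansion algorithm, and the function name expand_compact_iri is hypothetical.

```python
CONTEXT = {
    "gr": "http://purl.org/goodrelations/v1#",
    "pto": "http://www.productontology.org/id/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand_compact_iri(value, context):
    """Expand "prefix:suffix" with the context; return other values as-is."""
    prefix, sep, suffix = value.partition(":")
    base = context.get(prefix)
    # Values like "http://..." are already absolute and are left alone.
    if sep and isinstance(base, str) and not suffix.startswith("//"):
        return base + suffix
    return value


if __name__ == "__main__":
    for term in ("gr:Offering", "pto:Vehicle", "foaf:page"):
        print(term, "-->", expand_compact_iri(term, CONTEXT))
```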

MessagePack

MessagePack (msgpack) is a data interchange format with implementations in many languages.