what to automatize?

everything

more precisely, repetitive tasks | scripting: writing relatively short and simple code to automatize an otherwise manual process

in a software development context

style guide compliance
code smell finding
code quality measurement
review
building
testing
deployment

linting

a linter in modern editors behaves like a spell checker in a word processor
- gives immediate feedback on syntax errors, styling issues or bad practices
can detect some code smells
traditionally linters were developed for languages, then linter plugins for editors
- so language support in editors was not a given
- Language Server Protocol (LSP) was developed (originally at Microsoft), providing a common interface for linters
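To make the spell-checker analogy concrete, here is a minimal sketch of what a linter does, using only Python's standard `ast` module. The two rules and the `MAX_PARAMS` limit are assumptions picked for illustration; real linters ship hundreds of configurable rules.

```python
import ast

MAX_PARAMS = 4  # hypothetical style-guide limit, chosen for illustration


def lint(source: str) -> list[str]:
    """Return human-readable findings for the given Python source."""
    try:
        tree = ast.parse(source)
    except SyntaxError as error:
        # a syntax error prevents any further analysis
        return [f"line {error.lineno}: syntax error: {error.msg}"]

    findings = []
    for node in ast.walk(tree):
        # bad practice: a bare "except:" swallows every exception
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            findings.append((node.lineno, "bare 'except:' clause"))
        # code smell: long parameter list
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            count = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
            if count > MAX_PARAMS:
                findings.append(
                    (node.lineno, f"function '{node.name}' has {count} parameters")
                )
    # report the findings in source order
    return [f"line {line}: {message}" for line, message in sorted(findings)]
```

An editor plugin would call something like `lint()` on every change and underline the reported lines.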

auto formatting

there are automatic code formatters for more and more languages
- that can reformat the source code to align with the style guide
usually triggered by saving the file
usually configurable to align with organization / project specific rules
some examples:
- Python: ruff (among others)
- Ruby: rubocop
- Go: gofmt
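As a rough illustration of the idea (not of how ruff, rubocop or gofmt work internally), the sketch below enforces two assumed style rules: no trailing whitespace, and exactly one newline at the end of the file. Real formatters parse the code first, so they do not touch the contents of multi-line strings the way this line-based version would.

```python
def format_source(source: str) -> str:
    """Apply two hypothetical style rules to the given source text."""
    # rule 1: strip trailing whitespace from every line
    lines = [line.rstrip() for line in source.splitlines()]
    # rule 2: drop empty lines at the end of the file ...
    while lines and lines[-1] == "":
        lines.pop()
    # ... and finish with exactly one newline (an empty file stays empty)
    return "\n".join(lines) + "\n" if lines else ""
```

A useful property of a formatter is idempotence: formatting already formatted code changes nothing, so it is safe to trigger it on every save.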

well configured editor

can help keep the feedback loop fast
- auto formatter
- linter
- running tests
decrease cost
- immediate feedback
- less refactoring (later), fewer review findings
- developers can focus on non-automatable tasks

importance of personal preferences!

that is why I said earlier: trust your IDE

version control systems

scrum development team
multiple tasks on the sprint backlog
developers start to work on different tasks
- the time to complete a task varies
work items need to be merged to the common code base / repository
version control systems are used to solve this issue

the figures are based on Figure 2.2 of the TortoiseSVN documentation

lock-modify-unlock solution

only one developer can edit a file at a time – not very effective
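The lock-modify-unlock idea can be sketched as a tiny in-memory repository. The class and method names are hypothetical and not taken from any real version control system.

```python
class LockingRepository:
    """Toy repository where a file must be locked before it is edited."""

    def __init__(self):
        self._files = {}  # path -> content
        self._locks = {}  # path -> user holding the lock

    def lock(self, path: str, user: str) -> str:
        """Lock the file for the user and return its current content."""
        holder = self._locks.get(path)
        if holder is not None and holder != user:
            raise PermissionError(f"{path} is locked by {holder}")
        self._locks[path] = user
        return self._files.get(path, "")

    def write(self, path: str, user: str, content: str) -> None:
        """Store new content; only the lock holder may do this."""
        if self._locks.get(path) != user:
            raise PermissionError(f"{user} does not hold the lock on {path}")
        self._files[path] = content

    def unlock(self, path: str, user: str) -> None:
        """Release the lock so that others can edit the file."""
        if self._locks.get(path) == user:
            del self._locks[path]
```

While one user holds the lock, every other `lock()` call on the same file fails, which is exactly why the approach serializes work.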

the figures are based on Figure 2.3 of the TortoiseSVN documentation

copy-modify-merge solution

the figures are based on Figure 2.4 of the TortoiseSVN documentation

copy-modify-merge solution #2

example: git, mercurial

the figures are based on Figure 2.5 of the TortoiseSVN documentation
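The merge step of copy-modify-merge can be sketched as a line-by-line three-way merge. This is a deliberately simplified version: it assumes that both developers only edited lines in place (no insertions or deletions), which real tools do not require.

```python
def three_way_merge(base, mine, theirs):
    """Merge two edited copies of the same base, line by line.

    Returns the merged lines and the (1-based) numbers of conflicting lines.
    Conflicting lines are left as they were in the base version.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")

    merged, conflicts = [], []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:        # both sides agree (changed or not)
            merged.append(m)
        elif m == b:      # only "theirs" changed this line
            merged.append(t)
        elif t == b:      # only "mine" changed this line
            merged.append(m)
        else:             # both changed it differently: a human has to decide
            merged.append(b)
            conflicts.append(number)
    return merged, conflicts
```

Changes to different lines are combined automatically; only lines changed by both sides need human attention.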

centralized vs. distributed version control system

example: subversion

example: git, mercurial

the figures are based on Version control concepts and best practices - by Michael Ernst (Ernst, 2012)

feature branching

copy-modify-merge version tracking gave a viable solution for parallel development
- but separating the “workspace” is still beneficial
each developed feature has its own branch, which is merged to the mainline after completion

when to make a commit?

when you completed a unit of work
when you have changes you may want to undo

source: When to make a Git Commit (McCreary, 2017)

branching strategies

branching is more than just separating work-in-progress code from released code
- it is also used for managing stable (released) versions
- and for fixing bugs across multiple versions

A successful Git branching model (Driessen, 2010)
- introduced in 2010
- a criticism: A succesful Git branching model considered harmful (Judin, 2016)
a comparison of branching strategies (Shipp, 2019)

continuous integration (CI)

Continuous Integration is a software development practice where each member of a team merges their changes into a codebase together with their colleagues changes at least daily.

– Martin Fowler (Fowler, 2024)

emerged from extreme programming
considered an agile approach
gives immediate feedback
- the integration (merging) fails if two branches are not compatible
- the integrated software is also built
also gives opportunity to do testing on the built software…

continuous integration environment

When a developer pushes a change to the version control server, the push can act as a trigger (depending on the configuration). On a trigger, the CI environment starts a runner, for example a Linux instance running in a virtual machine or in a container. The changes are then fetched into the freshly started environment. With git as the version control system, fetching means clone and checkout operations: the CI environment starts out “empty”, in the sense that it contains none of the software’s code, so the repository has to be cloned every time. Then the right version, as defined by the trigger, is selected with the git checkout command. Once the version to be tested is available, the build script is executed, optionally together with static code analysis, tests, and so on, and finally the reports are generated.

If the build or the tests did not succeed, a notification is sent to the developer(s) and to any other stakeholders who are subscribed. If the build (and the tests) succeeded, a notification is usually sent only if the previous build failed, which prevents unnecessary noise.
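The notification rule described above fits in a few lines. The function name and the use of `None` for "there was no previous build" are assumptions of this sketch.

```python
from typing import Optional


def should_notify(current_ok: bool, previous_ok: Optional[bool]) -> bool:
    """Decide whether the subscribers should be notified about a build.

    previous_ok is None when there was no previous build.
    """
    if not current_ok:
        # a failure is always reported
        return True
    # a success is only reported when it fixes a previously failing build
    return previous_ok is False
```

A series of green builds therefore stays silent, while both a breakage and its fix are announced.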

build script

traditionally called build script
responsible not only for building the software
but also for running tests, generating reports
- code coverage
and even for packaging the software
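A build script is essentially an ordered list of stages that stops at the first failure. The sketch below models the stages as plain Python callables; in a real script each stage would typically invoke an external tool (compiler, test runner, packager), which is left out here. All names are hypothetical.

```python
from typing import Callable

# a stage is a name plus an action that reports success or failure
Stage = tuple[str, Callable[[], bool]]


def run_build(stages: list[Stage]) -> tuple[bool, list[str]]:
    """Run the stages in order and stop at the first failing one."""
    log = []
    for name, action in stages:
        ok = action()
        log.append(f"{name}: {'ok' if ok else 'FAILED'}")
        if not ok:
            # later stages (e.g. packaging) make no sense after a failure
            return False, log
    return True, log
```

For example, with the stages build, test and package, a failing test stage means that the package stage never runs.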

example: build script of the course website

scheduled build

The trigger can also be a scheduler, so a build / test run can be executed without a new push.

nightly build

scheduled build during night time
typically includes a smoke test
builds the latest version of the software on a daily basis
originally scheduled to night time because for large software a full build (with all tests) could take hours

also known as daily build

continuous deployment

continuous integration ensures everyone integrates their code to the mainline (Fowler, 2024)
“Continuous Deployment means the product is automatically released to production whenever it passes all the automated tests in the deployment pipeline.” – Martin Fowler (Fowler, 2024)

continuous deployment environment

extension of a continuous integration environment
deployment is another stage in the build script
same triggers as in a CI environment (not just the scheduler)

blue–green deployment (Wikipedia contributors, 2024a)

two servers are maintained (“blue” and “green”)
- expensive
at a given time, only one server is handling public requests
the other can be accessed only from a private network
changes are applied to the non-live server and verified
when verified, the non-live server is swapped with the live server
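The swap at the heart of blue–green deployment can be sketched with a small router object. The class, its methods and the version strings are hypothetical; in practice the swap is usually done at the load balancer or DNS level.

```python
class BlueGreenRouter:
    """Toy model: two servers, only one of them serves public requests."""

    def __init__(self, initial_version: str):
        self.versions = {"blue": initial_version, "green": initial_version}
        self.live = "blue"

    @property
    def idle(self) -> str:
        """Name of the server that is not handling public requests."""
        return "green" if self.live == "blue" else "blue"

    def deploy(self, version: str) -> None:
        """Apply changes to the non-live server only."""
        self.versions[self.idle] = version

    def swap(self, verified: bool) -> None:
        """Make the non-live server live, but only after verification."""
        if not verified:
            raise RuntimeError("refusing to swap to an unverified release")
        self.live = self.idle

    def handle_public_request(self) -> str:
        """Return the version that answers a public request."""
        return self.versions[self.live]
```

After the swap, the old live server still runs the previous version, which makes a rollback just another swap.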

shadow deployment

two servers are maintained (“live” and “shadow”)
for testing the performance and stability requirements
- on success, the release can be deployed to the live server as well
specialized strategy, complex and (relatively) expensive to set up

canary deployment

deployment in an incremental fashion
starts with a small number of users
and continues until 100% is reached
allows testing updates in a live environment
- on small groups of users
- before deploying to many users
- may involve telemetry
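One common way to implement an incremental rollout is to assign every user to a stable bucket and compare it with the current rollout percentage. The sketch below assumes string user IDs and uses a hash for the bucketing; the function names are hypothetical.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user to a stable bucket between 0 and 99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def gets_new_version(user_id: str, rollout_percent: int) -> bool:
    """Decide whether this user is served the new release."""
    return bucket(user_id) < rollout_percent
```

Because the bucket of a user never changes, raising the percentage only adds users: nobody who already got the new version is moved back to the old one.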

A/B testing is more of a testing approach than a deployment technique, but it works similarly to canary deployment. It involves reviewing two versions of an update with a small set of users to identify which version performs better. (Kazim, 2023)

devops

software development + IT operations
- collaboration
agile mindset, set of principles (GitLab, 2022)
- automation of the SDLC
- collaboration and communication
- continuous improvement
- focus on user needs with short feedback loops
relies on automatization, CI and CD
to build, test and release better software
- frequently, reliably, rapidly

further reading: 11 DevOps Principles and Practices to Master: Pro Advice - by Fernando Doglio

what tools to use?

CI and CD became a fundamental part of software development
- got integrated into services like GitHub, GitLab, Bitbucket, JetBrains Space
some solutions:
- Jenkins
  - open source, self hosted
- GitHub Actions
  - integrated to code hosting, free options
  - example: course repository
- GitLab Pipelines
  - integrated to code hosting, free options
- Travis CI
  - free for open source projects
- CircleCI
  - free options

Jenkins logo – The Jenkins project http://jenkins-ci.org/ | CC BY-SA
GitHub icon – Font Awesome Free 5.4.1 by @fontawesome - https://fontawesome.com | CC BY
GitLab icon by 292Jacob | CC BY-SA

automatized review

using CI environment
do static code analysis
- analyzing the code without execution
- searching for syntax errors, styling issues, bad practices or code smells
run test suite
generate review report from the findings

should not replace human reviewing

just decrease the work by automatizing trivial tasks

automatized review – example

CI services integrated into the code hosting / developer platforms
code changes can be annotated with automatized review findings
- usually at a pull request level
a bit slower feedback than running static analysis or tests locally

source: go-gitea/gitea at GitHub, pull request #31768

vulnerability alerts

Common Vulnerabilities and Exposures (CVE)
- a dictionary of common names (i.e., CVE Identifiers) for publicly known information security vulnerabilities (Wikipedia contributors, 2024b)
- Apple’s “goto fail” issue is officially called CVE-2014-1266
GitHub Dependabot
- uses package manager
- e.g., cargo (Rust), npm (JS), nuget (C#), maven (Java), poetry (Python)
- checks dependencies for vulnerabilities
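The core of such a check is a comparison between the installed versions and a list of known advisories. The following is a sketch of the idea only, not of how Dependabot is implemented; the package names, the advisory ID and the advisory format are all invented for illustration.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.2' or '1.2.3' into a comparable tuple, padded to three parts."""
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def find_vulnerable(installed: dict[str, str], advisories: list[dict]) -> list[str]:
    """Return an alert for every installed package hit by an advisory."""
    alerts = []
    for advisory in advisories:
        version = installed.get(advisory["package"])
        if version is None:
            continue  # the project does not depend on this package
        introduced = parse_version(advisory["introduced"])
        fixed = parse_version(advisory["fixed"])
        if introduced <= parse_version(version) < fixed:
            alerts.append(
                f"{advisory['package']} {version}: {advisory['id']}, "
                f"upgrade to {advisory['fixed']} or later"
            )
    return alerts


# hypothetical data, not a real package or advisory
installed = {"examplelib": "1.4.2", "otherlib": "2.0"}
advisories = [
    {"id": "EXAMPLE-0001", "package": "examplelib", "introduced": "1.0", "fixed": "1.5.0"},
]
```

With the data above, `find_vulnerable(installed, advisories)` reports `examplelib`, because 1.4.2 lies inside the affected range.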

[tool.poetry.dependencies]
python = "^3.12"
numpy = "^1.26.3"
pandas = "^2.2"
geopandas = "^1.0"
networkx = "^3.2.1"
osmnx = "^1.6.0"
matplotlib = "^3.8.2"
seaborn = "^0.13.0"
contextily = "^1.3.0"
opencv-python = "^4.9.0"
pyaml = "^23.9.7"
pyogrio = "^0.7"
pyarrow = "^15.0.0"
scipy = "^1.12.0"
haversine = "^2.8.1"
mapclassify = "^2.6.1"
openpyxl = "^3.1.2"
ecomplexity = "^0.5.2"
structlog = "^24.1.0"
h3 = "^3.7.7"
pandarallel = "^1.6.5"
jinja2 = "^3.1.4"
tabulate = "^0.9.0"

Python dependencies managed by poetry

GitLab also has a similar solution

dependabot example

remote: Resolving deltas: 100% (5/5), completed with 4 local objects.
remote: 
remote: GitHub found 1 vulnerability on pintergreg/software-engineering's default branch (1 high). To find out more, visit:
remote:      https://github.com/pintergreg/software-engineering/security/dependabot/1

command line warning after git push

dependencies

choose carefully the software packages / components your software will depend on
use well maintained software modules
unmaintained modules have potential vulnerabilities
aim for loose coupling regarding the dependency
- makes it easier to replace if needed

Dependency by Randall Munroe | CC BY-NC 2.5

I always advocate not reinventing the wheel. If there is a common task, it is probably better to use a library that already implements the function. On the other hand, always be careful about which packages you depend on. Unmaintained modules have potential vulnerabilities. Aim for loose coupling with the dependency, which makes it easier to replace if needed. Also, think twice before depending on a huge package (with multiple dependencies of its own) for just a single function.
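One way to keep a dependency loosely coupled is to hide it behind a small interface of your own, so the rest of the code never imports the library directly. In the sketch below the standard `json` module stands in for a third-party dependency; `Serializer`, `JsonSerializer` and `ReportWriter` are hypothetical names.

```python
import json
from typing import Protocol


class Serializer(Protocol):
    """The only thing the application knows about the dependency."""

    def dumps(self, data: dict) -> str: ...


class JsonSerializer:
    """Thin adapter: the single place where the library is used."""

    def dumps(self, data: dict) -> str:
        return json.dumps(data, sort_keys=True)


class ReportWriter:
    """Application code that depends on the interface, not on the library."""

    def __init__(self, serializer: Serializer):
        self._serializer = serializer

    def render(self, findings: dict) -> str:
        return self._serializer.dumps(findings)
```

Replacing the library later means writing one new adapter class; `ReportWriter` and its callers stay untouched.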

daily work

select a task from backlog
read and understand it
create a feature branch

write code, possibly using TDD
local unit testing, checks
open a pull request (initiate merge)

pull requests are used to manage review
trigger for CI to run automated tests, static code analysis, generate reports, etc.
while your work is reviewed, start working on another task / review other’s work

if everything is fine, the task is done
eventually the change will be in production at the end users
as a part of a release
- a release can group together multiple changes
- then, scrum starts a new iteration, or kanban continues as always
using some kind of deployment strategy

release versioning

a software release is identified by a version number
often seen as an arbitrary number

pre-releases

alpha: incomplete feature-wise, external release is uncommon for proprietary software
- whitebox testing
beta: the software is feature-complete but contains several known or unknown bugs
- blackbox testing
rc: release candidate, final touches
- highest level testing

odd number for development (4.1), even for stable (4.2)
Chromium: 131.0.6778.69
after GNOME 3.38, the “3.” was dropped and GNOME 40 was released
- Java 1.6, 1.7, ~~1.8~~, 8, 9, 10…
Linux 5.19, 6.0
- “So, as is hopefully clear to everybody, the major version number change is more about me running out of fingers and toes than it is about any big fundamental changes.” – Linus Torvalds
since version 3, TeX has used an idiosyncratic version numbering system (Wikipedia contributors, 2024c)
- where updates have been indicated by adding an extra digit at the end of the decimal, so that the version number asymptotically approaches π
- last version is 3.141592653 (released in 2021)

semantic versioning

major version when you make incompatible API changes
- a way of communicating changes
minor version when you add functionality in a backward compatible manner
patch version when you make backward compatible bug fixes

additional labels for pre-release and build metadata are available as extensions to the major.minor.patch format | from semver.org
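The three rules can be sketched as a small version type. The `Version` class and the change labels (`breaking`, `feature`, `fix`) are hypothetical names for this example, and the pre-release and build metadata extensions are ignored.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        """Parse a plain 'MAJOR.MINOR.PATCH' string."""
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, change: str) -> "Version":
        """Return the next version for the given kind of change."""
        if change == "breaking":   # incompatible API change
            return Version(self.major + 1, 0, 0)
        if change == "feature":    # backward compatible functionality
            return Version(self.major, self.minor + 1, 0)
        if change == "fix":        # backward compatible bug fix
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown kind of change: {change}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"
```

Because the parts are stored as integers, versions compare numerically: 1.10.0 is newer than 1.9.3, which a plain string comparison would get wrong.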

optional reading: Why I don’t like SemVer anymore

calendar versioning

format examples:

YYYY.MINOR.PATCH
- micro is used instead of patch
YYYY.MM.MINOR.PATCH

“CalVer is a versioning convention based on your project’s release calendar, instead of arbitrary numbers.” | calver.org

ZeroVer: 0-based versioning

“Your software’s major version should never exceed the first and most important number in computing: zero.” | 0ver.org

e.g.: 0.4.1
popular among open source software projects
- some may reach 1.0.0 eventually

semver: “If your software is being used in production, it should probably already be 1.0.0.”

Fibonacci releases

KDE / Plasma 6.1 series release schedule
version	type	release date	delta (weeks since previous release)
6.1.0	Release	Tue 2024-06-18	0
6.1.1	Bugfix Release	Tue 2024-06-25	1
6.1.2	Bugfix Release	Tue 2024-07-02	1
6.1.3	Bugfix Release	Tue 2024-07-16	2
6.1.4	Bugfix Release	Tue 2024-08-06	3
6.1.5	Bugfix Release	Tue 2024-09-10	5
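The schedule in the table can be reproduced with a few lines of code, assuming that the delta column counts the weeks since the previous release (which matches the dates shown). The function name is hypothetical.

```python
from datetime import date, timedelta


def fibonacci_schedule(first_release: date, bugfix_count: int) -> list[tuple[date, int]]:
    """Return (release date, delta in weeks) pairs for a release series."""
    schedule = [(first_release, 0)]
    day = first_release
    delta, next_delta = 1, 1  # Fibonacci sequence: 1, 1, 2, 3, 5, ...
    for _ in range(bugfix_count):
        day += timedelta(weeks=delta)
        schedule.append((day, delta))
        delta, next_delta = next_delta, delta + next_delta
    return schedule


# the Plasma 6.1 series from the table: one release and five bugfix releases
plasma_6_1 = fibonacci_schedule(date(2024, 6, 18), bugfix_count=5)
```

The growing gaps reflect the expectation that most bugs are found shortly after a release, so fixes are needed less and less often.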

interruption

the greatest “enemy” of a developer is interruption
the code is one thing, the logic behind it is another
- takes time to understand
context switching is costly
- switching between tasks
that is why it is advisable to define small tasks during the sprint planning
- 1–4 hours, but ideally closer to 1
- preventing interruption
- a programmer is likely to get just one uninterrupted 2-hour session in a day (Parnin, 2013)

the cost of interruption

according to a study, the average lost time per major interruption is 23 minutes (Mark et al., 2008)
- for developers, it could be worse
- according to another study it is at least 15 minutes (Parnin, 2013)
“getting back to the exact state of mind you were at right before an interruption is nearly impossible” (To, 2018)
interruptions can be planned and unplanned

© Ash Lamb, used with the author’s permission

source: The Cost of Interruption for Software Developers – by Steven To (To, 2018)

planned and unplanned interruptions

unplanned

someone asks about something or to do something
- usually a small task
  - informal review, advice, etc.
mitigation
- wear headphones (in open offices)
- notify in advance

planned

meetings, including standup
standup is usually the first thing in a workday, so that it does not split up the work time before lunch
a wrongly placed meeting can be even worse than an unplanned interruption
- you have to keep in mind that you have a meeting, so you cannot start anything serious
mitigation
- schedule small, easy tasks before meeting

source: The Cost of Interruption for Software Developers – by Steven To (To, 2018)

techniques to minimize context switching

time blocking
- divide workday into blocks
time batching
- do similar tasks in a batch
prioritize tasks
tackle the biggest task first in the morning
turn off notifications
adopt asynchronous communication
- e-mail, documentation, ADR

source: The high price of context switching for developers & ways to avoid it (Pande, 2021)

references

Driessen, V. (2010). A successful git branching model. https://nvie.com/posts/a-successful-git-branching-model .

Ernst, M. (2012). Version control concepts and best practices. https://homes.cs.washington.edu/~mernst/advice/version-control.html .

Fowler, M. (2024). Continuous integration. https://martinfowler.com/articles/continuousIntegration.html .

GitLab. (2022). 4 must-know DevOps principles. https://about.gitlab.com/blog/2022/02/11/4-must-know-devops-principles .

Judin, J. (2016). A succesful git branching model considered harmful. https://barro.github.io/2016/02/a-succesful-git-branching-model-considered-harmful .

Kazim, W. (2023). What is software deployment? Process and best practices. https://learn.g2.com/software-deployment .

Mark, G., Gudith, D., & Klocke, U. (2008). The cost of interrupted work: More speed and stress. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 107–110.

McCreary, J. (2017). When to make a git commit. https://dev.to/gonedark/when-to-make-a-git-commit .

Pande, N. (2021). The high price of context switching for developers & ways to avoid it. https://pacohq.com/blog/guide/the-high-price-of-context-switching-for-developers/ .

Parnin, C. (2013). Programmer, interrupted. 2013 IEEE Symposium on Visual Languages and Human Centric Computing, 171–172.

Shipp, S. (2019). War of the git flows. https://dev.to/scottshipp/war-of-the-git-flows-3ec2 .

To, S. (2018). The cost of interruption for software developers. https://www.brightdevelopers.com/the-cost-of-interruption-for-software-developers .

Wikipedia contributors. (2024a). Blue–green deployment — Wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=Blue%E2%80%93green_deployment&oldid=1249842339.

Wikipedia contributors. (2024b). Common vulnerabilities and exposures — Wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=Common_Vulnerabilities_and_Exposures&oldid=1256072917.

Wikipedia contributors. (2024c). TeX — Wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=TeX&oldid=1253226188.