software design and architecture stack
hierarchy in style guides
- language level:
- Python: PEP 8 or pep8.org
- Ruby: Ruby Style Guide
- Rust: The Rust Style Guide
- etc.
- organization level:
not just style guides, also best practices
write idiomatic code
- a prog. language implements a prog. paradigm
- a paradigm defines a certain “way” of writing code
- using different abstractions / building blocks
- promoting a given concept
- some languages implement multiple paradigms
- and languages have their own way of doing things
- languages have pros and cons for a given problem
just as in the case of natural languages, you ought to use a language properly
write idiomatic code
for (i = 0; i < 10; i++) {
  console.log(i);
}
[...Array(10).keys()].forEach(i => {
  console.log(i);
});
i = 0
while i < 10:
    print(i)
    i += 1
for i in range(10):
    print(i)
for i in 0..9 do
  puts i
end
(0..9).each do |i|
  puts i
end
(0..9).each {|i| puts i}
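The same contrast appears when a loop needs both a counter and the item. A minimal sketch in Python; the `lessons` list is made-up illustration data, not part of the course material:

```python
lessons = ["variables", "loops", "functions"]  # hypothetical sample data

# non-idiomatic: index-based loop carried over from C-style languages
numbered = []
for i in range(len(lessons)):
    numbered.append(f"{i + 1}. {lessons[i]}")

# idiomatic: enumerate yields the counter and the item together
numbered_idiomatic = [
    f"{number}. {lesson}" for number, lesson in enumerate(lessons, start=1)
]

print(numbered_idiomatic)
```

Both variants build the same list; the second one states the intent (number the items) without manual index arithmetic.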
clean code
Clean Code: A Handbook of Agile Software Craftsmanship
by Robert C. Martin (2009) (Martin, 2009)
meaningful names
this section is based on the book Clean Code (chapter 2) by Robert C. Martin (Martin, 2009)
with own examples
use intention-revealing names
int d; // elapsed time in days
the definition is only available at the declaration
int elapsedTimeInDays;
the definition is available at every usage
multi-word names
camelCase
int elapsedTimeInDays;
- C (local variable)
- Java (variable, method)
UpperCamelCase (PascalCase)
public class DataCollector {}
- Java (class)
- Rust (Type, Enum)
snake_case
elapsed_time_in_days = 17
- Python
- Rust (variable, function)
a study found that camelCase is faster to type, but snake_case is faster to read (Sharif & Maletic, 2010)
read the style guide
avoid disinformation
Do not refer to a grouping of accounts as an accountList unless it’s actually a List (Martin, 2009).
better to use accounts: the name does not depend on the collection type
inconsistent spelling is also disinformation
disinformative names would be the use of lower-case L or uppercase O (Martin, 2009)
- they can look almost like the one and zero, respectively – use the right font
- PEP 8 (the Python style guide) forbids using them as single-character variable names
make meaningful distinctions
It is not sufficient to add number series or noise words, even though the compiler is satisfied. If names must be different, then they should also mean something different (Martin, 2009).
import pandas as pd
def calculate_distance(data: pd.DataFrame) -> pd.Series:
    ...  # do something
def calculate_distance2(data: pd.DataFrame) -> pd.Series:
    ...  # do something else
def calculate_euclidean_distance(data: pd.DataFrame) -> pd.Series:
    ...
def calculate_levenshtein_distance(data: pd.DataFrame) -> pd.Series:
    ...
make meaningful distinctions / noise words
Noise words are another meaningless distinction. Imagine that you have a Product class. If you have another called ProductInfo or ProductData, you have made the names different without making them mean anything different (Martin, 2009).
use pronounceable names
If you can’t pronounce it, you can’t discuss it without sounding like an idiot (Martin, 2009).
- Should etid be an integer?
- Should elapsed_time_in_days be an integer?
could be especially important for non-native speakers as some words are more difficult to pronounce
use searchable names
Single-letter names can ONLY be used as local variables inside short methods. The length of a name should correspond to the size of its scope (Martin, 2009).
it’s OK to do this:
for i in range(10):
    print(i)
it’s NOT OK in a large scope:
int d; // elapsed time in days
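Searchability matters for constants as well: a bare number matches almost everything in a search, while a named constant can be found directly. A small sketch under that assumption; `DAYS_PER_WEEK` and `count_elapsed_weeks` are names invented for this illustration:

```python
DAYS_PER_WEEK = 7  # easy to search for; a bare 7 appears all over a code base


def count_elapsed_weeks(elapsed_time_in_days: int) -> int:
    """Return the number of full weeks contained in the elapsed days."""
    return elapsed_time_in_days // DAYS_PER_WEEK


print(count_elapsed_weeks(17))  # 2
```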
names for classes, functions
- a class is a model / blueprint of something
  - the name should be a noun
  - e.g., User, Activity
- an object is an instance of a class
  - still a noun
  - e.g., user = User()
- a function does something
  - the name should contain a verb
  - in imperative
  - e.g., aggregate_activity rather than activity_aggregation
avoid encodings
with modern IDEs it is pointless to put type or role markers into names
Hungarian notation
- invented by Charles Simonyi at Microsoft
- adding a prefix to a name that gives information about type, length, or scope
def fnFactorial(iNum):
    if iNum == 1:
        return iNum
    else:
        return iNum * fnFactorial(iNum - 1)
source: (Bhargav, 2024)
interface IShapeArea // I is also a prefix
{
    void area();
}
interface ShapeArea
{
    void area();
}
avoid mental mapping
Readers shouldn’t have to mentally translate your names into other names they already know (Martin, 2009).
don’t pun or use humor
- no inside jokes
- no colloquialisms or slang
- be objective and professional
Say what you mean. Mean what you say (Martin, 2009).
pick one word per concept
it’s confusing to have fetch, retrieve, and get as equivalent methods of different classes (Martin, 2009)
it also helps to search for the term
add meaningful context
Imagine that you have variables named firstName, lastName, street, houseNumber, city, state, and zipcode. Taken together it’s pretty clear that they form an address. But what if you just saw the state variable being used alone in a method? (Martin, 2009)
- adding a prefix?
  - e.g., addrCity, addrStreet, addrState
- as notations are discouraged, use an Address class instead to add context
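A minimal sketch of the Address class idea in Python, assuming a plain dataclass is enough; the field names follow the variables in the quote, while `format_city_line` and the sample values are made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class Address:
    street: str
    house_number: str
    city: str
    state: str
    zipcode: str


def format_city_line(address: Address) -> str:
    # `state` is always read through an Address, so its meaning is unambiguous
    return f"{address.city}, {address.state} {address.zipcode}"


address = Address("Main Street", "12", "Springfield", "IL", "62701")  # made-up values
print(format_city_line(address))  # Springfield, IL 62701
```

The class supplies the context that a prefix such as addr would otherwise have to carry in every variable name.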
functions
this section is based on the book Clean Code (chapter 3) by Robert C. Martin (Martin, 2009)
with own examples
functions should be as small as possible
Functions should hardly ever be 20 lines long (Martin, 2009)
- shorter functions are easier to understand
do one thing (single responsibility principle)
import sqlite3
import pandas as pd
con = sqlite3.connect("data.db")
data = pd.read_sql(activity_query, con)
records = []
for woy in range(36, 40):
    for dow in range(1, 8):
        records.append([woy, dow, 0])
empty = pd.DataFrame.from_records(
    records, columns=["week_of_year", "day_of_week", "count"]
)
data = (
    pd.concat([data, empty])
    .drop_duplicates(subset=["week_of_year", "day_of_week"], keep="first")
    .sort_values(["week_of_year", "day_of_week"])
    .reset_index(drop=True)
)
activity = pd.pivot(
    data, index=["week_of_year"], columns=["day_of_week"], values=["count"]
).values
res = con.execute(progress_query)
progress = res.fetchone()[0]
SELECT
    CAST(
        strftime('%W', timestamp)
        AS INTEGER
    ) AS week_of_year,
    CAST(
        strftime('%u', timestamp)
        AS INTEGER
    ) AS day_of_week,
    count(*) AS count
FROM activity
WHERE
    user_id = 42 AND
    week_of_year > 35 AND
    week_of_year < 40
GROUP BY
    week_of_year,
    day_of_week;
SELECT
    lesson / 50.0 AS progress
FROM activity
WHERE
    user_id = 42 AND
    result = 'success'
ORDER BY lesson DESC
LIMIT 1;
debug tables
week_of_year | day_of_week | count |
---|---|---|
36 | 2 | 1 |
38 | 5 | 1 |
39 | 6 | 2 |
queried user activity
week_of_year (rows) / day_of_week (columns) | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
36 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
37 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
38 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
39 | 0 | 0 | 0 | 0 | 0 | 2 | 0 |
pivoted user activity table
week_of_year | day_of_week | count |
---|---|---|
36 | 1 | 0 |
36 | 2 | 0 |
… | … | … |
36 | 7 | 0 |
37 | 1 | 0 |
… | … | … |
37 | 7 | 0 |
38 | 1 | 0 |
… | … | … |
38 | 5 | 0 |
… | … | … |
39 | 6 | 0 |
39 | 7 | 0 |
empty activity table
the inverse scope law of function names
The longer the scope of a function, the shorter its name should be. Functions that are called locally from a few nearby places should have long descriptive names, and the longest function names should be given to those functions that are called from just one place.
“longer scope”: a more general part of the code
function arguments
- do not use more than three (Martin, 2009)
  - what if you’d need more?
  - wrap it into an object
- do not use flags
  - “Flag arguments are ugly […] loudly proclaiming that this function does more than one thing (Martin, 2009).”
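One way to "wrap it into an object" is a small parameter object. The sketch below is only an assumption about how this could look; `WeekRange` and `build_empty_records` are hypothetical names, not taken from the book:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeekRange:
    """Groups arguments that always travel together."""

    start_week: int
    end_week: int  # inclusive
    days_per_week: int = 7


def build_empty_records(weeks: WeekRange) -> list[list[int]]:
    """Return one [week_of_year, day_of_week, 0] record per day in the range."""
    return [
        [woy, dow, 0]
        for woy in range(weeks.start_week, weeks.end_week + 1)
        for dow in range(1, weeks.days_per_week + 1)
    ]


records = build_empty_records(WeekRange(36, 39))
print(len(records))  # 28
```

The function now takes a single argument, and the grouped values get a name of their own.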
def build_empty_dataframe(start, end, cols):
    records = []
    for woy in range(start, end + 1):
        for dow in range(1, 8):
            records.append([woy, dow, 0])
    return pd.DataFrame.from_records(
        records, columns=cols
    )
def query_progress(as_percentage: bool):
    res = con.execute(progress_query)
    progress = res.fetchone()[0]
    if as_percentage:
        return progress * 100
    else:
        return progress
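A possible way to remove the as_percentage flag is to split the function in two, each doing one thing. This is a sketch, not the lecture's own solution: `query_progress_percentage` is a hypothetical name, and the in-memory table with its rows stands in for data.db:

```python
import sqlite3

# in-memory stand-in for data.db; the rows are made up for illustration
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE activity (user_id INTEGER, lesson INTEGER, result TEXT)")
con.executemany(
    "INSERT INTO activity VALUES (?, ?, ?)",
    [(42, 10, "success"), (42, 25, "success"), (42, 30, "failure")],
)

progress_query = """
    SELECT lesson / 50.0 AS progress
    FROM activity
    WHERE user_id = 42 AND result = 'success'
    ORDER BY lesson DESC
    LIMIT 1;
"""


def query_progress() -> float:
    """Return the progress as a ratio between 0 and 1."""
    return con.execute(progress_query).fetchone()[0]


def query_progress_percentage() -> float:
    """Return the progress as a percentage between 0 and 100."""
    return query_progress() * 100


print(query_progress())             # 0.5
print(query_progress_percentage())  # 50.0
```

The call site now says which variant it wants through the function name instead of a bare True or False.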
function as interface
DataFrame.to_csv(
    path_or_buf=None, *,
    sep=',',
    na_rep='',
    float_format=None,
    columns=None,
    header=True,
    index=True,
    index_label=None,
    mode='w',
    encoding=None,
    compression='infer',
    quoting=None,
    quotechar='"',
    lineterminator=None,
    chunksize=None,
    date_format=None,
    doublequote=True,
    escapechar=None,
    decimal='.',
    errors='strict',
    storage_options=None
)
no side effects
Side effects are lies. Your function promises to do one thing, but it also does other hidden things (Martin, 2009).
– Robert C. Martin
an operation, function or expression is said to have a side effect if it modifies some state variable value(s) outside its local environment, that is to say has an observable effect besides returning a value (the main effect) to the invoker of the operation (Wikipedia contributors, 2022).
side effect example
class Something:
    foo = 0
    def increase(self, by):
        self.foo += by
    def decrease(self, by):
        self.foo -= by
something = Something()
print(something.foo) # 0
something.increase(2)
print(something.foo) # 2
smth = {"foo": 0}
def increase(what, by):
    return what + by
def decrease(what, by):
    return what - by
print(smth["foo"]) # 0
increase(smth["foo"], 2) # 2
print(smth["foo"]) # 0
smth["foo"] = increase(smth["foo"], 2)
print(smth["foo"]) # 2
prefer exceptions to returning error codes
- in Unix-like systems, processes still return 0 if the execution was successful
- but returning error codes from functions is discouraged
FileNotFoundException is better than ERRCODE_26375
- meaningful name
- no mental mapping
- exception handling is syntactically more readable
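The difference can be sketched in Python, where the built-in counterpart of FileNotFoundException is called FileNotFoundError. The function names, the file name and the numeric value of `ERRCODE_26375` below are assumptions made for this illustration:

```python
from pathlib import Path

ERRCODE_26375 = -1  # hypothetical error code; its meaning has to be looked up


def read_config_with_error_code(path: str):
    """Error-code style: the caller must remember to check the return value."""
    if not Path(path).is_file():
        return ERRCODE_26375
    return Path(path).read_text()


def read_config(path: str) -> str:
    """Exception style: a missing file raises FileNotFoundError."""
    return Path(path).read_text()


try:
    read_config("does_not_exist.cfg")
except FileNotFoundError as error:
    print(f"configuration file is missing: {error.filename}")
```

With the error-code version, forgetting the check lets the code continue with -1 in place of the file content; with the exception version, the failure cannot be ignored silently.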
comments
this section is based on the book Clean Code (chapter 4) by Robert C. Martin (Martin, 2009)
with own examples
separating comments
# connect to the database
con = sqlite3.connect("data.db")
# query activity data
data = pd.read_sql(activity_query, con)
# create empty dataframe
records = []
for woy in range(36, 40):
    for dow in range(1, 8):
        records.append([woy, dow, 0])
empty = pd.DataFrame.from_records(records, columns=["week_of_year", "day_of_week", "count"])
# combine empty and sparse dataframe
data = (
    pd.concat([data, empty])
    .drop_duplicates(subset=["week_of_year", "day_of_week"], keep="first")
    .sort_values(["week_of_year", "day_of_week"])
    .reset_index(drop=True)
)
# pivot dataframe
activity = pd.pivot(
    data, index=["week_of_year"], columns=["day_of_week"], values=["count"]
).values
separated functions
def create_empty_dataframe(start_week, end_week):
    records = []
    for woy in range(start_week, end_week + 1):
        for dow in range(1, 8):
            records.append([woy, dow, 0])
    return pd.DataFrame.from_records(
        records, columns=["week_of_year", "day_of_week", "count"]
    )
def fill_empty_with_activities(empty, activities):
    return (
        pd.concat([activities, empty])
        .drop_duplicates(subset=["week_of_year", "day_of_week"], keep="first")
        .sort_values(["week_of_year", "day_of_week"])
        .reset_index(drop=True)
    )
def pivot_dataframe(data):
    return pd.pivot(
        data, index=["week_of_year"], columns=["day_of_week"], values=["count"]
    ).values
these functions do one thing
separated functions - usage
con = sqlite3.connect("data.db")
activities = pd.read_sql(activity_query, con)
empty = create_empty_dataframe(36, 39)
data = fill_empty_with_activities(empty, activities)
activities_matrix = pivot_dataframe(data)
the separating comments have become function names, so the code itself reads like prose
more bad comments
journal comment
# 2024-10-17 -- Add idiomatic coding examples
# 2024-10-18 -- Add meaningful names section
the version control system keeps a better journal
noise comments
# creates an empty dataframe
def create_empty_dataframe(start_week, end_week):
    ...
don’t write something that is already in the code
closing brace comments
for (i = 0; i < 10; i++) {
  console.log(i);
} // for
modern editors can find (and display) the block endings
Apollo 11 - Colossus 2A
P21VSAVE DLOAD # SAVE CURRENT BASE VECTOR
TAT
STOVL P21TIME # ..TIME
RATT1
STOVL P21BASER # ..POS B-29 OR B-27
VATT1
STORE P21BASEV # ..VEL B-7 OR B-5
ABVAL SL*
0,2
STOVL P21VEL # /VEL/ FOR N73 DSP
RATT
UNIT DOT
VATT # U(R).(V)
DDV ASIN # U(R).U(V)
P21VEL
STORE P21GAM # SIN-1 U(R).U(V), -90 TO +90
SXA,2 SET
P21ORIG # 0 = EARTH 2 = MOON
P21FLAG
source, GitHub repository, more about the Apollo Guidance Computer: (Slavin, 2015)
good comments
legal comments
some open-source licences have to be included at the beginning of each source file
informative comments
import re
timestamp = "2024-10-22 09:30:42"
# matches timestamps in the format YYYY-MM-DD HH:MM:SS
re.match(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", timestamp)
TODOs – good or bad?
# TODO: this allows invalid month, day, hour, minute and second values
re.match(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", timestamp)
editors can collect TODO (and FIXME) annotations and warn about them
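One possible way to resolve the TODO above is to let the standard library parse the value, since datetime.strptime rejects out-of-range months, days, hours, minutes and seconds. A sketch; `is_valid_timestamp` is a hypothetical helper name:

```python
from datetime import datetime


def is_valid_timestamp(timestamp: str) -> bool:
    """Return True if the text is a real date and time in YYYY-MM-DD HH:MM:SS."""
    try:
        datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return False
    return True


print(is_valid_timestamp("2024-10-22 09:30:42"))  # True
print(is_valid_timestamp("2024-13-45 25:61:61"))  # False
```

Note that strptime is more lenient than the regular expression about zero padding, so the two checks can be combined if the exact layout matters.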
documentation
def fizzbuzz(i: int) -> str:
    """Fizzbuzz is a game for children to teach them about division.

    It is also a common coding practice.

    Parameters
    ----------
    i : int
        Input number tested against division by 3, 5 and 15.

    Returns
    -------
    str
        `Fizz` if input divisible by 3, `Buzz` if divisible by 5 and `FizzBuzz` if both.
    """
    result = ""
    if i % 15 == 0:
        result += "FizzBuzz"
    elif i % 3 == 0:
        result += "Fizz"
    elif i % 5 == 0:
        result += "Buzz"
    else:
        result = str(i)
    return result
doctest
def fizzbuzz(i: int) -> str:
    """
    >>> fizzbuzz(3)
    'Fizz'
    >>> fizzbuzz(5)
    'Buzz'
    >>> fizzbuzz(12)
    'Fizz'
    >>> fizzbuzz(15)
    'FizzBuzz'
    >>> fizzbuzz(17)
    '17'
    """
    result = ""
    if i % 15 == 0:
        result += "FizzBuzz"
    elif i % 3 == 0:
        result += "Fizz"
    elif i % 5 == 0:
        result += "Buzz"
    else:
        result = str(i)
    return result
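The examples in such a docstring are executable. From the command line they are usually run with python -m doctest on the file; the sketch below does the same programmatically with the standard doctest module, using a shortened fizzbuzz so that the block is self-contained:

```python
import doctest


def fizzbuzz(i: int) -> str:
    """
    >>> fizzbuzz(3)
    'Fizz'
    >>> fizzbuzz(15)
    'FizzBuzz'
    >>> fizzbuzz(17)
    '17'
    """
    if i % 15 == 0:
        return "FizzBuzz"
    if i % 3 == 0:
        return "Fizz"
    if i % 5 == 0:
        return "Buzz"
    return str(i)


# collect the examples from the docstring and execute them
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(fizzbuzz, "fizzbuzz", globs={"fizzbuzz": fizzbuzz}):
    runner.run(test)

results = runner.summarize(verbose=False)
print(results.failed, results.attempted)
```

If the code and the documented output drift apart, the runner reports the failing example, so the documentation cannot silently go stale.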
references
Bhargav, N. (2024). Hungarian notation. https://www.baeldung.com/cs/hungarian-notation.
Martin, R. C. (2009). Clean code: A handbook of agile software craftsmanship. Pearson Education.
Sharif, B., & Maletic, J. I. (2010). An eye tracking study on camelCase and under_score identifier styles. 2010 IEEE 18th International Conference on Program Comprehension, 196–205.
Slavin, T. (2015). Coding the Apollo guidance computer (AGC). https://kidscodecs.com/coding-the-apollo-guidance-computer-agc/.
Stemmler, K. (2019). How to learn software design and architecture. https://khalilstemmler.com/articles/software-design-architecture/full-stack-software-design.
Wikipedia contributors. (2022). Side effect (computer science) — Wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=Side_effect_(computer_science)&oldid=1063806709.