Gergő Pintér, PhD
gergo.pinter@uni-corvinus.hu
not just style guides, also best practices
just as in the case of natural languages, you ought to use a language properly
Clean Code: A Handbook of Agile Software Craftsmanship
by Robert C. Martin (2009) [2]
this section is based on the book Clean Code (chapter 2) by Robert C. Martin [2]
with own examples
one study found that camelCase is faster to type, but snake_case is faster to read [3]
read the style guide
Do not refer to a grouping of accounts as an accountList unless it’s actually a List [2].
better to use accounts, which does not depend on the collection type
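A minimal sketch of this advice. The Account class and the total_balance helper are hypothetical names introduced only for illustration. Because the variable is called accounts, the container can be a list or a set without the name becoming misleading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    """Hypothetical account record, used only for this illustration."""

    owner: str
    balance: int


def total_balance(accounts) -> int:
    """Sum the balances of any iterable of accounts."""
    return sum(account.balance for account in accounts)


# The name `accounts` stays truthful for a list ...
accounts = [Account("Ann", 100), Account("Bob", 50)]
list_total = total_balance(accounts)

# ... and for a set, where a name like `account_list` would be disinformation.
accounts = {Account("Ann", 100), Account("Bob", 50)}
set_total = total_balance(accounts)
```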
inconsistent spelling is also disinformation
disinformative names would be the use of lower-case l or uppercase O as variable names, since they look almost like the digits 1 and 0 [2]
It is not sufficient to add number series or noise words, even though the compiler is satisfied. If names must be different, then they should also mean something different [2].
Noise words are another meaningless distinction. Imagine that you have a Product class. If you have another called ProductInfo or ProductData, you have made the names different without making them mean anything different [2].
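The number-series problem can be sketched as follows. The function names are hypothetical; the first version satisfies the interpreter but hides which argument is the source, while the second states it in the names.

```python
def copy_items_numbered(a1: list, a2: list) -> None:
    """Number-series names: the reader must guess which list is overwritten."""
    for i in range(len(a1)):
        a2[i] = a1[i]


def copy_items(source: list, destination: list) -> None:
    """Meaningful distinction: the names say which list is read and which is written."""
    for i, item in enumerate(source):
        destination[i] = item


destination = [None, None, None]
copy_items(["a", "b", "c"], destination)
```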
If you can’t pronounce it, you can’t discuss it without sounding like an idiot [2].
etid be an integer?
elapsed_time_in_days be an integer?
this could be especially important for non-native speakers, as some words are more difficult to pronounce
User, Activity
user = User()
aggregate_activity
activity_aggregation
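One reading of the names above, as a hedged sketch: nouns for classes and values, a verb phrase for the function. The fields of Activity and the sample data are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """Noun: a hypothetical record of one user activity."""

    user_name: str
    minutes: int


def aggregate_activity(activities: list) -> dict:
    """Verb phrase: sum the minutes of activity per user."""
    totals = {}
    for activity in activities:
        totals[activity.user_name] = totals.get(activity.user_name, 0) + activity.minutes
    return totals


activities = [Activity("ann", 30), Activity("bob", 10), Activity("ann", 15)]
# Noun: the value that the function produced.
activity_aggregation = aggregate_activity(activities)
```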
with modern IDEs it is pointless to put type or role markers into names
Hungarian notation
source: [4]
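A short sketch of the alternative to type markers in names, assuming Python type hints are available. The variable names and the greet function are hypothetical.

```python
# Hungarian-style names encode the type in the identifier.
str_name = "Ada"
i_age = 36

# Type hints carry the same information, and an IDE or type checker can read them.
name: str = "Ada"
age: int = 36


def greet(name: str, age: int) -> str:
    """Build a greeting; the signature documents the expected types."""
    return f"{name} is {age} years old"


greeting = greet(name, age)
```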
Readers shouldn’t have to mentally translate your names into other names they already know [2].
Say what you mean. Mean what you say [2].
it’s confusing to have fetch, retrieve, and get as equivalent methods of different classes [2]
it also helps to search for the term
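A minimal sketch of one word per concept. Both repository classes and their contents are hypothetical; the point is only that the lookup method has the same name in each class.

```python
class UserRepository:
    """Hypothetical in-memory store of user names."""

    def __init__(self) -> None:
        self._users = {1: "ann", 2: "bob"}

    def get(self, user_id: int) -> str:
        return self._users[user_id]


class ActivityRepository:
    """Hypothetical in-memory store of activities per user."""

    def __init__(self) -> None:
        self._activities = {1: ["login", "commit"], 2: ["login"]}

    # Also named `get`, not `fetch` or `retrieve`: same concept, same word.
    def get(self, user_id: int) -> list:
        return self._activities[user_id]


user_name = UserRepository().get(1)
user_activities = ActivityRepository().get(1)
```

Searching the code base for `def get(` then finds every lookup method at once.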
Imagine that you have variables named firstName, lastName, street, houseNumber, city, state, and zipcode. Taken together it’s pretty clear that they form an address. But what if you just saw the state variable being used alone in a method? [2]
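One way to give the state variable from the quote its context is a small class. This is a sketch; the field names and the format_city_line helper are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Address:
    """Groups the address fields so each one carries its context."""

    street: str
    house_number: str
    city: str
    state: str
    zipcode: str


def format_city_line(address: Address) -> str:
    """Here `address.state` is clearly part of an address, not some program state."""
    return f"{address.city}, {address.state} {address.zipcode}"


address = Address("Main Street", "12", "Springfield", "IL", "62701")
city_line = format_city_line(address)
```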
addrCity, addrStreet, addrState
use an Address class instead to add context
this section is based on the book Clean Code (chapter 3) by Robert C. Martin [2]
with own examples
Functions should hardly ever be 20 lines long [2]
import sqlite3
import pandas as pd
con = sqlite3.connect("data.db")
data = pd.read_sql(activity_query, con)
records = []
for woy in range(36, 40):
    for dow in range(1, 8):
        records.append([woy, dow, 0])
empty = pd.DataFrame.from_records(
    records, columns=["week_of_year", "day_of_week", "count"]
)
data = (
    pd.concat([data, empty])
    .drop_duplicates(subset=["week_of_year", "day_of_week"], keep="first")
    .sort_values(["week_of_year", "day_of_week"])
    .reset_index(drop=True)
)
activity = pd.pivot(
    data, index=["week_of_year"], columns=["day_of_week"], values=["count"]
).values
res = con.execute(progress_query)
progress = res.fetchone()[0]
week_of_year | day_of_week | count |
---|---|---|
36 | 2 | 1 |
38 | 5 | 1 |
39 | 6 | 2 |
week_of_year \ day_of_week | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
36 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
37 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
38 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
39 | 0 | 0 | 0 | 0 | 0 | 2 | 0 |
week_of_year | day_of_week | count |
---|---|---|
36 | 1 | 0 |
36 | 2 | 0 |
… | … | … |
36 | 7 | 0 |
37 | 1 | 0 |
… | … | … |
37 | 7 | 0 |
38 | 1 | 0 |
… | … | … |
38 | 5 | 0 |
… | … | … |
39 | 6 | 0 |
39 | 7 | 0 |
The longer the scope of a function, the shorter its name should be. Functions that are called locally from a few nearby places should have long descriptive names, and the longest function names should be given to those functions that are called from just one place.
“longer scope”: a more general part of the code
DataFrame.to_csv(
    path_or_buf=None, *,
    sep=',',
    na_rep='',
    float_format=None,
    columns=None,
    header=True,
    index=True,
    index_label=None,
    mode='w',
    encoding=None,
    compression='infer',
    quoting=None,
    quotechar='"',
    lineterminator=None,
    chunksize=None,
    date_format=None,
    doublequote=True,
    escapechar=None,
    decimal='.',
    errors='strict',
    storage_options=None
)
Side effects are lies. Your function promises to do one thing, but it also does other hidden things [2].
– Robert C. Martin
an operation, function or expression is said to have a side effect if it modifies some state variable value(s) outside its local environment, that is to say has an observable effect besides returning a value (the main effect) to the invoker of the operation [5].
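The definition can be illustrated with a small sketch. The names audit_log, normalize_name_with_side_effect and normalize_name are hypothetical. The first function promises only to normalize a name but also modifies state outside its local environment; the second has no observable effect besides its return value.

```python
audit_log = []


def normalize_name_with_side_effect(name: str) -> str:
    """Returns the normalized name, but also silently modifies `audit_log`."""
    audit_log.append(name)  # hidden side effect: changes state outside the function
    return name.strip().lower()


def normalize_name(name: str) -> str:
    """No side effect: the only observable result is the return value."""
    return name.strip().lower()


clean = normalize_name("  Ann ")
also_clean = normalize_name_with_side_effect("  Ann ")
# Both calls return the same value, but only the second one changed `audit_log`.
```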
FileNotFoundException is better than ERRCODE_26375
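A hedged sketch of the same contrast in Python, where the built-in FileNotFoundError plays the role of FileNotFoundException. The function names are hypothetical, and the numeric value assigned to ERRCODE_26375 is an assumption for illustration.

```python
ERRCODE_26375 = 26375  # hypothetical numeric code; its meaning is not self-evident


def read_text_with_error_code(path: str) -> tuple:
    """Error-code style: the caller must remember to check the first element."""
    try:
        with open(path, encoding="utf-8") as file:
            return 0, file.read()
    except FileNotFoundError:
        return ERRCODE_26375, ""


def read_text(path: str) -> str:
    """Exception style: a missing file raises FileNotFoundError, which names the problem."""
    with open(path, encoding="utf-8") as file:
        return file.read()


code, text = read_text_with_error_code("no_such_file_for_this_example.txt")

try:
    read_text("no_such_file_for_this_example.txt")
    error_name = None
except FileNotFoundError as error:
    error_name = type(error).__name__
```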
# connect to the database
con = sqlite3.connect("data.db")
# query activity data
data = pd.read_sql(activity_query, con)
# create empty dataframe
records = []
for woy in range(36, 40):
    for dow in range(1, 8):
        records.append([woy, dow, 0])
empty = pd.DataFrame.from_records(records, columns=["week_of_year", "day_of_week", "count"])
# combine empty and sparse dataframe
data = (
    pd.concat([data, empty])
    .drop_duplicates(subset=["week_of_year", "day_of_week"], keep="first")
    .sort_values(["week_of_year", "day_of_week"])
    .reset_index(drop=True)
)
# pivot dataframe
activity = pd.pivot(
    data, index=["week_of_year"], columns=["day_of_week"], values=["count"]
).values
def create_empty_dataframe(start_week, end_week):
    records = []
    for woy in range(start_week, end_week + 1):
        for dow in range(1, 8):
            records.append([woy, dow, 0])
    return pd.DataFrame.from_records(
        records, columns=["week_of_year", "day_of_week", "count"]
    )


def fill_empty_with_activities(empty, activities):
    return (
        pd.concat([activities, empty])
        .drop_duplicates(subset=["week_of_year", "day_of_week"], keep="first")
        .sort_values(["week_of_year", "day_of_week"])
        .reset_index(drop=True)
    )


def pivot_dataframe(data):
    return pd.pivot(
        data, index=["week_of_year"], columns=["day_of_week"], values=["count"]
    ).values
these functions do one thing
con = sqlite3.connect("data.db")
activities = pd.read_sql(activity_query, con)
empty = create_empty_dataframe(36, 39)
data = fill_empty_with_activities(empty, activities)
activities_matrix = pivot_dataframe(data)
only the comments remained, now as function names, and they can be read as prose
journal comment
the version control system keeps a better journal
noise comments
don’t write something that is already in the code
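A small sketch of a noise comment and one way to remove the need for it. The function names and the SECONDS_PER_DAY constant are hypothetical.

```python
def days_to_seconds_noisy(days: int) -> int:
    # multiply days by 86400 and return the result  (noise: repeats the code)
    return days * 86400


SECONDS_PER_DAY = 24 * 60 * 60


def days_to_seconds(days: int) -> int:
    """The named constant explains the number, so no comment is needed."""
    return days * SECONDS_PER_DAY


two_days = days_to_seconds(2)
```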
P21VSAVE DLOAD # SAVE CURRENT BASE VECTOR
TAT
STOVL P21TIME # ..TIME
RATT1
STOVL P21BASER # ..POS B-29 OR B-27
VATT1
STORE P21BASEV # ..VEL B-7 OR B-5
ABVAL SL*
0,2
STOVL P21VEL # /VEL/ FOR N73 DSP
RATT
UNIT DOT
VATT # U(R).(V)
DDV ASIN # U(R).U(V)
P21VEL
STORE P21GAM # SIN-1 U(R).U(V), -90 TO +90
SXA,2 SET
P21ORIG # 0 = EARTH 2 = MOON
P21FLAG
source, GitHub repository, more about the Apollo Guidance Computer: [6]
legal comments
some open source licences must be included at the beginning of each file
informative comments
def fizzbuzz(i: int) -> str:
    """Fizzbuzz is a game for children to teach them about division.

    It is also a common coding practice.

    Parameters
    ----------
    i : int
        Input number tested against division by 3, 5 and 15.

    Returns
    -------
    str
        `Fizz` if input divisible by 3, `Buzz` if divisible by 5 and `FizzBuzz` if both.
    """
    result = ""
    if i % 15 == 0:
        result += "FizzBuzz"
    elif i % 3 == 0:
        result += "Fizz"
    elif i % 5 == 0:
        result += "Buzz"
    else:
        result = str(i)
    return result
def fizzbuzz(i: int) -> str:
    """
    >>> fizzbuzz(3)
    'Fizz'
    >>> fizzbuzz(5)
    'Buzz'
    >>> fizzbuzz(12)
    'Fizz'
    >>> fizzbuzz(15)
    'FizzBuzz'
    >>> fizzbuzz(17)
    '17'
    """
    result = ""
    if i % 15 == 0:
        result += "FizzBuzz"
    elif i % 3 == 0:
        result += "Fizz"
    elif i % 5 == 0:
        result += "Buzz"
    else:
        result = str(i)
    return result
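Examples written in this docstring style can be executed with the standard library doctest module, so the comment is checked against the code. The sketch below repeats a condensed fizzbuzz and runs its examples through doctest.DocTestFinder and doctest.DocTestRunner; the run_doctests helper is a hypothetical name introduced for the example.

```python
import doctest


def fizzbuzz(i: int) -> str:
    """
    >>> fizzbuzz(3)
    'Fizz'
    >>> fizzbuzz(5)
    'Buzz'
    >>> fizzbuzz(15)
    'FizzBuzz'
    >>> fizzbuzz(17)
    '17'
    """
    if i % 15 == 0:
        return "FizzBuzz"
    if i % 3 == 0:
        return "Fizz"
    if i % 5 == 0:
        return "Buzz"
    return str(i)


def run_doctests(func):
    """Run the docstring examples of one function and return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # module=False: do not look up the defining module; the examples only need
    # the function itself, which is passed in through globs.
    tests = finder.find(func, name=func.__name__, module=False,
                        globs={func.__name__: func})
    for test in tests:
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_doctests(fizzbuzz)
```

When the examples live in a module file, running `python -m doctest the_module.py` from the command line performs the same check without a helper.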
comments
this section is based on the book Clean Code (chapter 4) by Robert C. Martin [2]
with own examples