Gergő Pintér, PhD
gergo.pinter@uni-corvinus.hu
the turtle and rabbit figures by Delapouite under CC BY 3.0 via game-icons.net
def fizzbuzz(i: int) -> str:
    """
    >>> fizzbuzz(3)
    'Fizz'
    >>> fizzbuzz(5)
    'Buzz'
    >>> fizzbuzz(15)
    'FizzBuzz'
    >>> fizzbuzz(17)
    '17'
    """
    result = ""
    if i % 15 == 0:
        result += "FizzBuzz"
    elif i % 3 == 0:
        result += "Fizz"
    elif i % 5 == 0:
        result += "Buzz"
    else:
        result = str(i)
    return result
doctest in Python
source: [3]
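The docstring examples above can be executed as tests. From the command line this is usually done with `python -m doctest -v fizzbuzz.py`; the sketch below does the same programmatically. The helper name `run_examples` is hypothetical, and the function body is a shortened but equivalent variant of the one above.

```python
import doctest


def fizzbuzz(i: int) -> str:
    """
    >>> fizzbuzz(3)
    'Fizz'
    >>> fizzbuzz(5)
    'Buzz'
    >>> fizzbuzz(15)
    'FizzBuzz'
    >>> fizzbuzz(17)
    '17'
    """
    if i % 15 == 0:
        return "FizzBuzz"
    if i % 3 == 0:
        return "Fizz"
    if i % 5 == 0:
        return "Buzz"
    return str(i)


def run_examples() -> tuple[int, int]:
    """Run the docstring examples of fizzbuzz; return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # module=False: do not try to look up the module the function lives in
    for test in finder.find(
        fizzbuzz, "fizzbuzz", module=False, globs={"fizzbuzz": fizzbuzz}
    ):
        runner.run(test)
    return runner.failures, runner.tries
```

A failing example (for instance expecting `'Fizz'` for `fizzbuzz(5)`) would increase the first number of the returned pair.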
The terms ‘unit test’ and ‘integration test’ have always been rather murky, even by the slippery standards of most software terminology.
– Martin Fowler [4]
in most of my examples a unit will be represented by a method
code/fizzbuzz.py
parts of a unit test
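The parts are commonly named arrange, act and assert; assuming that is the breakdown meant here, a minimal sketch looks like this. `calculate_progress` mirrors the function of the same name used elsewhere in these slides; the test name is hypothetical.

```python
def calculate_progress(finished: int, total: int) -> float:
    return finished / total


def test_calculate_progress_halfway() -> None:
    # arrange: prepare the input of the unit under test
    finished = 25
    total = 50

    # act: call the unit under test exactly once
    actual = calculate_progress(finished, total)

    # assert: compare the actual result with the expected one
    assert actual == 0.5
```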
import sqlite3


def query_progress(user_id: int) -> float:
    # establish database connection
    con = sqlite3.connect("data.db")
    # build query
    progress_query = f"""
    SELECT
        lesson / 50.0 AS progress
    FROM activity
    WHERE
        user_id = {user_id} AND
        result = 'success'
    ORDER BY
        lesson DESC
    LIMIT 1
    ;
    """
    # execute query
    res = con.execute(progress_query)
    progress = res.fetchone()[0]
    return progress
architectural styles provide patterns to separate the business logic from the persistence layer
unit testing usually targets the business logic
import sqlite3


def query_last_finished_lesson(
    user_id: int
) -> int:
    # establish database connection
    con = sqlite3.connect("data.db")
    # build query
    query = f"""
    SELECT lesson
    FROM activity
    WHERE
        user_id = {user_id} AND
        result = 'success'
    ORDER BY lesson DESC
    LIMIT 1;
    """
    # execute query
    res = con.execute(query)
    return res.fetchone()[0]


def calculate_progress(
    finished: int, total: int
) -> float:
    return finished / total


def calculate_user_progress(
    user_id: int, total: int
) -> float:
    f = query_last_finished_lesson(user_id)
    return calculate_progress(f, total)


def establish_database_connection(
    path: str = "data.db"
) -> sqlite3.Connection:
    return sqlite3.connect(path)
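Once the connection is created separately, the query can be exercised against a throwaway database. The sketch below is one possible variant, not the slides' own code: it assumes that `query_last_finished_lesson` takes the connection as a parameter and that the `activity` table has the three columns shown; both are assumptions made for illustration.

```python
import sqlite3


def query_last_finished_lesson(con: sqlite3.Connection, user_id: int) -> int:
    # hypothetical variant: the connection is passed in instead of created here
    row = con.execute(
        """
        SELECT lesson
        FROM activity
        WHERE user_id = ? AND result = 'success'
        ORDER BY lesson DESC
        LIMIT 1;
        """,
        (user_id,),
    ).fetchone()
    return row[0]


def calculate_progress(finished: int, total: int) -> float:
    return finished / total


def test_progress_with_in_memory_database() -> None:
    # ":memory:" creates a database that lives only as long as the connection
    con = sqlite3.connect(":memory:")
    # assumed schema, chosen to match the columns used by the query
    con.execute(
        "CREATE TABLE activity (user_id INTEGER, lesson INTEGER, result TEXT)"
    )
    con.executemany(
        "INSERT INTO activity VALUES (?, ?, ?)",
        [(42, 24, "success"), (42, 25, "success"), (42, 26, "failure")],
    )

    finished = query_last_finished_lesson(con, 42)

    assert finished == 25
    assert calculate_progress(finished, 50) == 0.5
    con.close()
```

The business logic (`calculate_progress`) needs no database at all, which is why it is the usual target of unit tests.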
there is no open standard for the categories of test doubles
these are from the book xUnit test patterns: Refactoring test code – by Gerard Meszaros [6]
The simplest, most primitive type of test double. Dummies contain no implementation and are mostly used when required as parameter values, but not otherwise utilized. Nulls can be considered dummies, but real dummies are derivations of interfaces or base classes without any implementation at all.
– Mark Seemann [5]
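A minimal sketch of a dummy, using the learning-app example. `Notifier`, `DummyNotifier` and `ProgressService` are hypothetical names introduced only for this illustration.

```python
class Notifier:
    """Hypothetical interface for sending messages to a user."""

    def send(self, user_id: int, message: str) -> None:
        raise NotImplementedError


class DummyNotifier(Notifier):
    """Dummy: adds no implementation, it only satisfies the parameter."""


class ProgressService:
    def __init__(self, notifier: Notifier) -> None:
        # required by the constructor, but unused by calculate_progress
        self.notifier = notifier

    def calculate_progress(self, finished: int, total: int) -> float:
        return finished / total


def test_calculate_progress_with_dummy() -> None:
    service = ProgressService(DummyNotifier())
    assert service.calculate_progress(25, 50) == 0.5
```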
provides static input
A step up from dummies, stubs are minimal implementations of interfaces or base classes. Methods returning void will typically contain no implementation at all, while methods returning values will typically return hard-coded values.
– Mark Seemann [5]
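A stub for the learning-app example might look like the sketch below. `LessonRepository` and `StubLessonRepository` are hypothetical names, and this `calculate_user_progress` is a variant that receives the repository as a parameter.

```python
class LessonRepository:
    """Hypothetical interface in front of the persistence layer."""

    def last_finished_lesson(self, user_id: int) -> int:
        raise NotImplementedError


class StubLessonRepository(LessonRepository):
    """Stub: returns a hard-coded value, whatever the input is."""

    def last_finished_lesson(self, user_id: int) -> int:
        return 25


def calculate_user_progress(
    repository: LessonRepository, user_id: int, total: int
) -> float:
    return repository.last_finished_lesson(user_id) / total


def test_user_progress_with_stub() -> None:
    progress = calculate_user_progress(StubLessonRepository(), 42, 50)
    assert progress == 0.5
```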
A test spy is similar to a stub, but besides giving clients an instance on which to invoke members, a spy will also record which members were invoked so that unit tests can verify that members were invoked as expected.
– Mark Seemann [5]
One form of this might be an email service that records how many messages it was sent.
– Martin Fowler [7]
or keeping track of the test user (of the learning app) and giving back values according to the input parameter
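Following the email-service example, a spy could be sketched like this. `SpyEmailService` and `congratulate` are hypothetical names used only for illustration.

```python
class SpyEmailService:
    """Spy: sends nothing, but records every message it was asked to send."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, address: str, body: str) -> None:
        self.sent.append((address, body))


def congratulate(email_service, address: str, progress: float) -> None:
    # unit under test: notify the user only when the course is finished
    if progress >= 1.0:
        email_service.send(address, "Course completed!")


def test_congratulation_is_sent_once() -> None:
    spy = SpyEmailService()

    congratulate(spy, "marvin@example.com", 0.5)
    congratulate(spy, "marvin@example.com", 1.0)

    # the spy lets the test verify how the collaborator was used
    assert spy.sent == [("marvin@example.com", "Course completed!")]
```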
A fake contains more complex implementations, typically handling interactions between different members of the type it’s inheriting. While not a complete production implementation, a fake may resemble a production implementation, albeit with some shortcuts.
– Mark Seemann [5]
when you add logic to a test double, that logic may need to be tested as well
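A fake for the persistence layer could be an in-memory repository, as in the sketch below. `FakeActivityRepository` and its methods are hypothetical; the point is that `record` and `last_finished_lesson` interact, so the fake carries logic of its own.

```python
class FakeActivityRepository:
    """Fake: a working, in-memory shortcut instead of a real database."""

    def __init__(self) -> None:
        # user_id -> list of (lesson, result) pairs
        self._activity: dict[int, list[tuple[int, str]]] = {}

    def record(self, user_id: int, lesson: int, result: str) -> None:
        self._activity.setdefault(user_id, []).append((lesson, result))

    def last_finished_lesson(self, user_id: int) -> int:
        finished = [
            lesson
            for lesson, result in self._activity.get(user_id, [])
            if result == "success"
        ]
        # assumption: a user without finished lessons is at lesson 0
        return max(finished, default=0)


def test_fake_repository() -> None:
    # the fake has logic of its own, so it gets a test as well
    repository = FakeActivityRepository()
    repository.record(42, 24, "success")
    repository.record(42, 25, "success")
    repository.record(42, 26, "failure")

    assert repository.last_finished_lesson(42) == 25
    assert repository.last_finished_lesson(7) == 0
```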
require 'sinatra'
require 'json' # provides Hash#to_json

# random progress between 0.0 and 1.0, rounded to two decimals
def generate_progress
  rand.round(2)
end

# 4 weeks x 7 days of random activity counts (0..9)
def generate_activity_matrix
  result = []
  (1..4).each do |_w|
    daily = []
    (1..7).each { |_d| daily.push rand(10) }
    result.push daily
  end
  result
end

get '/user-statistics' do
  data = {}
  data['name'] = 'Marvin'
  data['id'] = 42
  data['registration'] = '2019-10-02'
  data['progress'] = generate_progress
  data['activity'] = generate_activity_matrix
  return data.to_json
end
A mock is dynamically created by a mock library (the others are typically produced by a test developer using code). The test developer never sees the actual code implementing the interface or base class, but can configure the mock to provide return values, expect particular members to be invoked, and so on. Depending on its configuration, a mock can behave like a dummy, a stub, or a spy.
– Mark Seemann [5]
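In Python, the standard library's `unittest.mock` can create such an object. The sketch below configures a `Mock` to act as a stub (return value) and as a spy (call verification); this `calculate_user_progress` is a hypothetical variant that receives the repository as a parameter.

```python
from unittest.mock import Mock


def calculate_user_progress(repository, user_id: int, total: int) -> float:
    return repository.last_finished_lesson(user_id) / total


def test_user_progress_with_mock() -> None:
    # no hand-written class: the mock library creates the object dynamically
    repository = Mock()
    # configured like a stub: provide a return value
    repository.last_finished_lesson.return_value = 25

    progress = calculate_user_progress(repository, 42, 50)

    assert progress == 0.5
    # used like a spy: verify how the member was invoked
    repository.last_finished_lesson.assert_called_once_with(42)
```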
Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior.
– Martin Fowler [8]
fizzbuzz.py
test_fizzbuzz.py
NameError: name 'fizzbuzz' is not defined
TypeError: fizzbuzz() takes 0 positional arguments but 1 was given
AssertionError: assert None == 'Fizz'
passed
AssertionError: assert 'Fizz' == 'Buzz' (5)
passed
AssertionError: assert 'Fizz' == 'FizzBuzz' (15)
passed
AssertionError: assert None == '17' (17)
passed
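One sequence of implementations that would produce the messages above is sketched below. It is a reconstruction, not the original slides' code: the step functions and their names are assumptions, and the very first steps (no function at all, then a function without a parameter) are only noted in a comment.

```python
# before step 1: no fizzbuzz at all (NameError),
# then "def fizzbuzz(): pass" (TypeError when called with an argument)


def fizzbuzz_step_1(i):
    pass  # returns None, so the test for 3 fails: None == 'Fizz'


def fizzbuzz_step_2(i):
    return "Fizz"  # simplest code that passes for 3; fails for 5


def fizzbuzz_step_3(i):
    if i % 3 == 0:
        return "Fizz"
    elif i % 5 == 0:
        return "Buzz"  # passes for 5; 15 still gives 'Fizz'


def fizzbuzz_step_4(i):
    if i % 15 == 0:
        return "FizzBuzz"  # passes for 15; 17 falls through and gives None
    elif i % 3 == 0:
        return "Fizz"
    elif i % 5 == 0:
        return "Buzz"


def fizzbuzz(i):
    if i % 15 == 0:
        return "FizzBuzz"
    elif i % 3 == 0:
        return "Fizz"
    elif i % 5 == 0:
        return "Buzz"
    else:
        return str(i)  # passes for 17 as well


def test_fizzbuzz():
    assert fizzbuzz(3) == "Fizz"
    assert fizzbuzz(5) == "Buzz"
    assert fizzbuzz(15) == "FizzBuzz"
    assert fizzbuzz(17) == "17"
```

Each step adds only as much code as the currently failing test requires.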
there is not much to improve in the code,
except that the PEP 8 Python style guide says wildcard ('star') imports should be avoided;
the test module should import the name explicitly, e.g. from fizzbuzz import fizzbuzz
As the tests get more specific, the code gets more generic.
– Robert C. Martin, The Cycles of TDD [9]
source: Robert C. Martin, The Transformation Priority Premise [10]
experiment-driven testing
task: get day from a date string like Nov 08, 13:11
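A sketch for this task, assuming the idea is to try expressions interactively first and then keep the successful experiment as a test. The function name `get_day` and the choice to return the day as an integer are assumptions.

```python
def get_day(date_string: str) -> int:
    # experiment 1: "Nov 08, 13:11".split(",")[0] gives "Nov 08"
    month_and_day = date_string.split(",")[0]
    # experiment 2: "Nov 08".split()[1] gives "08"
    day = month_and_day.split()[1]
    return int(day)


def test_get_day() -> None:
    # the successful experiments, kept as a regression test
    assert get_day("Nov 08, 13:11") == 8
    assert get_day("Jan 31, 00:00") == 31
```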
Title (one line describing the story)
Narrative:
As a [role]
I want [feature]
So that [benefit]
Acceptance Criteria: (presented as Scenarios)
Scenario 1: Title
Given [context]
And [some more context]...
When [event]
Then [outcome]
And [another outcome]...
Scenario 2: ...
taken from [11] by Daniel Terhorst-North | CC-BY 4.0
test format like BDD, example from [12]:
Given Book that has not been checked out
And User who is registered on the system
When User checks out a book
Then Book is marked as checked out
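The same structure can be kept in a unit test by marking the three phases with comments. The `Book` and `User` classes below are hypothetical, minimal stand-ins written only for this sketch.

```python
class Book:
    def __init__(self, title: str) -> None:
        self.title = title
        self.checked_out = False


class User:
    def __init__(self, name: str, registered: bool) -> None:
        self.name = name
        self.registered = registered

    def check_out(self, book: Book) -> None:
        if not self.registered:
            raise PermissionError("only registered users can check out books")
        book.checked_out = True


def test_registered_user_checks_out_available_book() -> None:
    # Given a book that has not been checked out
    book = Book("Working Effectively with Legacy Code")
    # And a user who is registered on the system
    user = User("Marvin", registered=True)

    # When the user checks out the book
    user.check_out(book)

    # Then the book is marked as checked out
    assert book.checked_out
```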
beautifully crafted library with no documentation is damn near worthless […]
So how do we solve this problem? Write your Readme first.
– by Tom Preston-Werner [13]
readme ~ user manual, but brief, concise
source: Readme Driven Development – by Tom Preston-Werner [13]
code/fizzbuzz.py
code/test_fizzbuzz.py
test coverage: 70%
test coverage: 90%
test coverage: 100%
four control flow branches; all of them need to be tested
it is hard to objectively measure the quality of code
from progress import calculate_progress


def test_progress():
    total = 50
    for i in range(total + 1):
        expected = i / total
        actual = calculate_progress(i, total, False)
        assert actual == expected


def test_progress_percentage():
    total = 50
    for i in range(total + 1):
        expected = i / total * 100
        actual = calculate_progress(i, total, True)
        assert actual == expected
test coverage: 100%, achievement obtained, but this is completely stupid
def calculate_progress(
    finished: int,
    total: int,
    as_percentage: bool,
) -> float:
    progress = finished / total
    if as_percentage:
        return progress * 100
    else:
        return progress
this function needs some value checking
test coverage only measures whether every control flow branch is executed by a test
the point of testing is testing for the edge cases
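A sketch of what value checking and edge-case tests could look like for `calculate_progress`. Which inputs count as invalid, and that they raise `ValueError`, are assumptions made for this illustration.

```python
def calculate_progress(
    finished: int,
    total: int,
    as_percentage: bool,
) -> float:
    # assumed rules: total must be positive, finished must be within 0..total
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= finished <= total:
        raise ValueError("finished must be between 0 and total")
    progress = finished / total
    return progress * 100 if as_percentage else progress


def test_progress_edge_cases() -> None:
    # boundaries of the valid range
    assert calculate_progress(0, 50, False) == 0.0
    assert calculate_progress(50, 50, True) == 100.0

    # invalid inputs: zero total, negative finished, finished above total
    for finished, total in [(1, 0), (-1, 50), (51, 50)]:
        try:
            calculate_progress(finished, total, False)
        except ValueError:
            pass
        else:
            raise AssertionError(f"no error for {finished}/{total}")
```

The loop-based tests above reach 100% coverage of the original function without ever trying any of these inputs.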
Story: Account Holder withdraws cash
As an Account Holder
I want to withdraw cash from an ATM
So that I can get money when the bank is closed
story example taken from What’s in a Story? [11] by Daniel Terhorst-North | CC-BY 4.0
an acceptance criterion:
Scenario 1: Account has sufficient funds
Given the account balance is $100
And the card is valid
And the machine contains enough money
When the Account Holder requests $20
Then the ATM should dispense $20
And the account balance should be $80
And the card should be returned
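The scenario translates almost line by line into an automated acceptance test. The `Account`, `Card`, `Withdrawal` and `ATM` classes below are hypothetical stand-ins that implement only this one scenario; other scenarios are deliberately left unimplemented.

```python
from dataclasses import dataclass


@dataclass
class Account:
    balance: int


@dataclass
class Card:
    valid: bool


@dataclass
class Withdrawal:
    dispensed: int
    card_returned: bool


class ATM:
    def __init__(self, cash: int) -> None:
        self.cash = cash

    def withdraw(self, account: Account, card: Card, amount: int) -> Withdrawal:
        enough = account.balance >= amount and self.cash >= amount
        if card.valid and enough:
            account.balance -= amount
            self.cash -= amount
            return Withdrawal(dispensed=amount, card_returned=True)
        # the remaining scenarios are not part of this sketch
        raise NotImplementedError("only Scenario 1 is implemented")


def test_account_has_sufficient_funds() -> None:
    # Given the account balance is $100
    account = Account(balance=100)
    # And the card is valid
    card = Card(valid=True)
    # And the machine contains enough money
    atm = ATM(cash=1000)

    # When the Account Holder requests $20
    withdrawal = atm.withdraw(account, card, 20)

    # Then the ATM should dispense $20
    assert withdrawal.dispensed == 20
    # And the account balance should be $80
    assert account.balance == 80
    # And the card should be returned
    assert withdrawal.card_returned
```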
technical debt
implied cost of future rework, incurred because a solution prioritized short-term expediency over long-term design [17] [18]
Code without tests is bad code. It doesn’t matter how well written it is; it doesn’t matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don’t know if our code is getting better or worse.
– Michael Feathers, Working Effectively with Legacy Code: Preface [19]
the footprint, the compass and the flag figures by Lorc under CC BY 3.0 via game-icons.net
When we change code, we should have tests in place. To put tests in place, we often have to change code.
– Michael Feathers, Working Effectively with Legacy Code [19]
(Part I / Chapter 2)
source: Working Effectively with Legacy Code by Michael Feathers [19]
We break dependencies to sense when we can’t access values our code computes.
– Michael Feathers, Working Effectively with Legacy Code [19]
e.g., misspelled function name
We break dependencies to separate when we can’t even get a piece of code into a test harness to run.
– Michael Feathers, Working Effectively with Legacy Code [19]
A seam is a place where you can alter behavior in your program without editing in that place.
– Michael Feathers, Working Effectively with Legacy Code: Part I / chp. 4 [19]
A seam is a place in the code that you can insert a modification in behavior. […] One way to take advantage of a seam is to insert some sort of fake.
– tallseth via Stackoverflow | CC BY-SA 3.0
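A sketch of an object seam in Python, using the learning-app example. The class names are hypothetical. The call to `self.last_finished_lesson` inside `progress` is the seam: its behavior can be changed by choosing which class to instantiate, without editing `progress` itself.

```python
import sqlite3


class ProgressReport:
    def last_finished_lesson(self, user_id: int) -> int:
        # production behavior: depends on a real database file
        con = sqlite3.connect("data.db")
        row = con.execute(
            "SELECT lesson FROM activity "
            "WHERE user_id = ? AND result = 'success' "
            "ORDER BY lesson DESC LIMIT 1;",
            (user_id,),
        ).fetchone()
        return row[0]

    def progress(self, user_id: int, total: int) -> float:
        # seam: this call can be redirected by a subclass
        return self.last_finished_lesson(user_id) / total


class ProgressReportForTest(ProgressReport):
    def last_finished_lesson(self, user_id: int) -> int:
        # fake inserted at the seam: no database needed
        return 25


def test_progress_through_seam() -> None:
    report = ProgressReportForTest()
    assert report.progress(42, 50) == 0.5
```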
| | add feature | fix a bug | refactor | optimize |
|---|---|---|---|---|
| structure | changes | changes | changes | |
| new functionality | changes | | | |
| functionality | | changes | | |
| resource usage | | | | changes |
Michael Feathers, Working Effectively with Legacy Code: Part I, p. 6 [19]
black box
white box
source: Smoke testing (software), Wikipedia [20]
“The phrase smoke test comes from electronic hardware testing. You plug in a new board and turn on the power. If you see smoke coming from the board, turn off the power. You don’t have to do any more testing.” [21]