The Pragmatic Programmer: Chapter 4

posted on December 13, 2023

Pragmatic Paranoia

Nobody writes perfect code. Just as we’ve all been taught to be defensive drivers, so we should be defensive coders.

Design By Contract

Design By Contract was first developed by Bertrand Meyer for the language Eiffel. A correct program is one that does no more and no less than it claims to do. Documenting and verifying those claims is the heart of Design By Contract (DBC). Expectations and claims are described as follows:

preconditions: what must be true in order for a routine to be called
without DBC, this might look like conditionally parsing input before deciding whether to call a function
postconditions: the state of the data after the routine is done (this requires that the routine concludes, so no infinite loops)
without DBC, this might look like parsing output and returning either data or an error
class invariants: conditions the class ensures are true from the perspective of the caller once the routine is finished (not necessarily while the routine is running)
without DBC, this might look like an assertion about the output

Summarized as: If all the routine’s preconditions are met by the caller, the routine shall guarantee that all postconditions and invariants will be true when it completes. If the contract is broken, the “remedy” is invoked, which may be an exception or program termination. This shouldn’t happen; it’s a bug.

Some languages have better support for these concepts than others. Clojure has pre-conditions and post-conditions. Elixir has guard clauses. Even in languages that don’t support these concepts, you can honor the principles (Zod, a TypeScript schema validation library, is one example).

If orthogonal (decoupled) code is “shy” (so that its concerns are its own), DBC code is “lazy”: be strict in what you will accept before you begin, and promise as little as possible in return. This may seem to contradict Postel’s Law / the Robustness Principle:

Be liberal in what you accept, and conservative in what you send.

But a series of “lazy” functions can sum to a liberal / robust acceptance.
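As a minimal sketch of how preconditions, postconditions, and a class invariant can be honored in a language without built-in contract support, the Python below uses plain assert statements. The BoundedCounter class, its limit, and its messages are hypothetical and exist only for illustration; they are not from the book.

```python
class BoundedCounter:
    """Hypothetical example class; the name and limits are illustrative only."""

    def __init__(self, limit: int) -> None:
        assert limit > 0, "precondition: limit must be positive"
        self.limit = limit
        self.value = 0
        self._check_invariant()

    def _check_invariant(self) -> None:
        # Class invariant: must hold whenever control returns to the caller.
        assert 0 <= self.value <= self.limit, "invariant: 0 <= value <= limit"

    def increment(self, step: int = 1) -> int:
        # Preconditions: the caller's obligations.
        assert step > 0, "precondition: step must be positive"
        assert self.value + step <= self.limit, "precondition: would exceed limit"

        old_value = self.value  # saved so the postcondition can refer to it
        self.value += step

        # Postcondition: the routine's promise.
        assert self.value == old_value + step, "postcondition: value grew by step"
        self._check_invariant()
        return self.value
```

With a limit of 3, calling increment(2) returns 2, and a second increment(2) fails its precondition with an AssertionError instead of silently leaving the counter in an invalid state.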

DBC differs from Test-Driven Development and Defensive Programming in the following:

DBC requires no mocking or setup.
DBC defines the parameters for success or failure in all cases, whereas testing can only target one specific case at a time.
TDD happens only during the build cycle; DBC and assertions are runtime, so they exist through all cycles.
TDD does not generally focus on checking internal invariants.
DBC is more efficient and DRY-er than defensive programming, where no one is designated to validate the data, so everyone ends up validating it.

Implementing DBC

Simply enumerating the input domain range, boundary conditions, and what the routine promises to deliver (and therefore what it doesn’t promise to deliver) is a huge leap forward in writing better software. Most languages don’t support DBC natively, so you implement it as best you can.

Assertions: runtime checks of logical conditions
assertions are not automatically inherited: if a class extends a parent / superclass, its assertions must be manually called or recreated
if you tie assertions to a log level, they may be turned off
there’s no concept of “old” values (values as they existed at the entry to a method), so you have to save / assign any data you want to check in the postcondition
the runtime doesn’t support checking contracts, so you’re left with bolting it on (like throwing an error)
Crashing Early
validate your input and crash early so that, for example, you’re not passing a NaN value down the line to a sqrt function
Semantic Invariants: a kind of “philosophical contract”
semantic invariants are central to the very meaning of the thing; they are not changeable business logic
when you find one, state it clearly and concisely
e.g., if building a debit transaction system: “Err in favor of the consumer.”
Dynamic Contracts and Agents
e.g., “I can’t provide that, but if you give me this, then I might provide something else.”
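The “bolting it on” and “crash early” points above can be sketched as follows. This is a minimal illustration, assuming a hypothetical ContractError exception and require helper (neither is a standard library name); it raises an error rather than using assert so the checks cannot be switched off.

```python
import math


class ContractError(Exception):
    """Hypothetical exception type used to signal a broken contract."""


def require(condition: bool, message: str) -> None:
    # Unlike `assert`, this check is not removed when Python runs with -O.
    if not condition:
        raise ContractError(message)


def checked_sqrt(x: float) -> float:
    # Preconditions: fail here, at the boundary, instead of passing a bad
    # value down the line.
    require(isinstance(x, (int, float)), "x must be a number")
    require(not math.isnan(x), "x must not be NaN")
    require(x >= 0, "x must be non-negative")

    result = math.sqrt(x)

    # Postcondition: squaring the result gets us back to the input.
    require(
        math.isclose(result * result, x, rel_tol=1e-9, abs_tol=1e-12),
        "result squared must equal x",
    )
    return result
```

Calling checked_sqrt(float("nan")) raises ContractError immediately, whereas math.sqrt would quietly return NaN and let the problem surface somewhere far from its cause.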

Section Challenges

[ ] Points to ponder: If DBC is so powerful, why isn’t it used more widely? Is it hard to come up with the contract? Does it make you think about issues you’d rather ignore for now? Does it force you to THINK!? Clearly, this is a dangerous tool!
[ ] Exercise 14 Design an interface to a kitchen blender. It will eventually be a web-based, IoT-enabled blender, but for now we just need the interface to control it. It has ten speed settings (0 means off). You can’t operate it empty, and you can change the speed only one unit at a time (that is, from 0 to 1, and from 1 to 2, not from 0 to 2). Here are the methods. Add appropriate pre- and postconditions and an invariant.

int getSpeed()
void setSpeed(int x)
boolean isFull()
void fill()
void empty()

[ ] Exercise 15 How many numbers are in the series 0, 5, 10, 15, …, 100?

Dead Programs Tell No Lies

It’s easy to fall into the “that can’t happen” mentality: “Does my switch statement really need a default case?!” But we’re coding defensively; we make sure the data is what we think it is, the code in production is the code we think it is, the correct dependency versions were loaded, etc.

The application code shouldn’t be eclipsed by the error handling code. If the caller has to catch every form of exception and raise the appropriate error, the code is coupled: if the author of the called function adds another exception, the caller is subtly out-of-date.

The Erlang and Elixir languages embrace a “Crash Early” (crash, don’t trash) philosophy.

Defensive programming is a waste of time. Let it crash!—Joe Armstrong

In these environments, crashes are managed by supervisors, which are responsible for cleaning up after a crashed process, restarting it, etc. Supervisors are themselves supervised, forming supervisor trees. This technique creates high-availability, fault-tolerant systems. Simply crashing might not always be appropriate, though: you may have allocated resources that need freeing, messages that need logging, transactions that need finishing, etc.

Still, if the “impossible” happens, your program is no longer viable, so terminate it as soon as possible. A dead program normally does a lot less damage than a broken one.
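To make the supervisor idea concrete, here is a toy, in-process analogy in Python. It is a sketch only and is not how Erlang/OTP supervisors are implemented; the supervise function, its max_restarts parameter, and flaky_worker are all hypothetical names.

```python
from typing import Callable


def supervise(worker: Callable[[], str], max_restarts: int = 3) -> str:
    """Run `worker`, restarting it after a crash, up to `max_restarts` times."""
    crashes = 0
    while True:
        try:
            return worker()
        except Exception as error:
            crashes += 1
            if crashes > max_restarts:
                # Escalate: in a supervisor tree, the parent decides what to do.
                raise RuntimeError(f"gave up after {crashes} crashes") from error
            # Otherwise loop around and restart the worker from a clean state.


attempts = {"count": 0}


def flaky_worker() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        # The worker crashes early rather than limping on with bad state.
        raise ValueError("impossible state detected")
    return "done"
```

Here supervise(flaky_worker) returns "done" on the third attempt. The worker itself contains no recovery logic; deciding whether to restart or escalate lives entirely in the supervisor.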

Assertive Programming

We deceive ourselves when we say “This can never happen…”. Use assertions to prevent the impossible. Whenever you find yourself thinking “but of course this could never happen”, add code to check it. Assertions, however, do not replace error handling.

Be careful of side effects when making assertions, like calling .next() on an iterator.
Leave assertions on in prod!
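The side-effect warning can be illustrated with a short Python sketch. The function names are hypothetical; the point is that the first version changes program state inside the assertion itself.

```python
def process_bad(items):
    it = iter(items)
    # Side effect inside the assertion: this consumes the first element.
    assert next(it, None) is not None, "expected at least one item"
    return list(it)


def process_good(items):
    it = iter(items)
    first = next(it, None)  # do the work outside the assertion
    assert first is not None, "expected at least one item"
    return [first, *it]
```

For the input [1, 2, 3], process_bad returns [2, 3] while process_good returns [1, 2, 3]. In Python specifically, running with the -O flag strips assert statements, so process_bad would also behave differently depending on whether assertions are enabled. That is one more reason to keep the condition free of side effects and to leave assertions on.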

Section Challenges

[ ] Exercise 16 A quick reality check. Which of these “impossible” things can happen?
A month with fewer than 28 days
Error code from a system call: can’t access the current directory
In C++: a = 2; b = 3; but (a + b) does not equal 5
A triangle with an interior angle sum ≠ 180°
A minute that doesn’t have 60 seconds
(a + 1) <= a

How to Balance Resources

In short, be careful when allocating resources (like opening a file). Don’t couple functions tightly together by sharing a file resource that one opens and the other closes. Instead, act locally. Some languages have fail-safes for closing filesystem resource references automatically, like Java’s try-with-resources statements. General advice:

Deallocate resources in the reverse order of their allocation so you don’t orphan resources if one contains a reference to another.
When allocating the same set of resources throughout your codebase, use the same ordering. This reduces the possibility of deadlock. I.e., process A claims resource 1 and wants resource 2, but process B claims resource 2 and wants resource 1, causing both to hang.
The resource could be transactions, network connections, memory, files, threads, windows, etc.
Consider balancing over time. Applied to log files, you might ask
Do you rotate the logs and clean them up?
How do you handle the finite space you have for logs?
What are you doing with your unofficial debug logs?
If using a DB, do you expire the records?
Object oriented languages can wrap the resource usage in a class. When the class representing the resource is constructed, you allocate the resource; when destructed (and garbage collected, maybe) you deallocate. This can really help when the language you’re using allows exceptions to interfere with resource deallocation.
How do you ensure that you deallocate resources if there’s an exception? Generally there are two choices:
Use variable scope.
Use finally clause (of try...catch...finally).
Sometimes you cannot balance resources through the resource allocation pattern. Try to establish a semantic invariant for memory allocation. Who is responsible for the data in an aggregate data structure? Three main options:
Top-level structure is responsible for freeing any substructures it contains. These structures recursively delete data they contain.
Top-level structure is deallocated and structures that it points to are orphaned.
Top-level structure refuses to deallocate itself if it contains any substructures.
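Two pieces of the advice above, releasing in reverse order of allocation and releasing even when an exception occurs, can be sketched in Python with context managers. The resource names and the events log are hypothetical stand-ins for real files, connections, or transactions.

```python
from contextlib import ExitStack, contextmanager

events: list[str] = []  # records what happened, for illustration only


@contextmanager
def resource(name: str):
    events.append(f"open {name}")
    try:
        yield name
    finally:
        # Runs on normal exit and when an exception propagates.
        events.append(f"close {name}")


def use_resources(fail: bool = False) -> None:
    # ExitStack closes whatever it entered in reverse (last in, first out) order.
    with ExitStack() as stack:
        stack.enter_context(resource("database"))
        stack.enter_context(resource("transaction"))
        if fail:
            raise RuntimeError("simulated failure")
```

Whether or not use_resources fails, the log ends with the transaction closed before the database, so the routine that allocates the resources is also the one that releases them.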

Section Challenges

[ ] Although there are no guaranteed ways of ensuring that you always free resources, certain design techniques, when applied consistently, will help. In the text we discussed how establishing a semantic invariant for major data structures could direct memory deallocation decisions. Consider how Topic 23, Design by Contract, could help refine this idea.
[ ] Exercise 17 Some C and C++ developers make a point of setting a pointer to NULL after they deallocate the memory it references. Why is this a good idea?
[ ] Exercise 18 Some Java developers make a point of setting an object variable to NULL after they have finished using the object. Why is this a good idea?

Don’t Outrun Your Headlights

Take small steps, always. The rate of feedback you can receive is your speed limit. Feedback is what independently confirms or disproves your action. Steps that are too large are those that require any “fortune telling”. Fortune telling looks like:

Estimating completion dates months in the future.
Planning a design for future maintenance or extensibility.
Guessing users’ future needs.
Guessing future tech availability.

Designing for future maintenance only works up to a point—only as far ahead as you can see. If you’re aiming further than that, instead, design code that’s easy to change. Make it easy to delete.

The Pragmatic Programmer: Chapter 3

posted on November 30, 2023

The Basic Tools

Invest in your own basic toolbox.

The Power of Plain Text

Plain text is the medium of our craft and lends itself well to storing knowledge. You don’t need an application to interpret a binary format—it’s immediately human-parsable, human-readable, and human-understandable. Plain text can be structured (HTML, Markdown, JSON, YAML), which can help with contextually parsing the meaning of the text.

many useful tools have been built to leverage the advantages of text, like diff, grep, etc.
plays well with the Unix philosophy

Section Challenges

[ ] Design a small address book database (name, phone number, and so on) using a straightforward binary representation in your language of choice. Do this before reading the rest of this challenge.
Translate that format into a plain-text format using XML or JSON.
For each version, add a new, variable-length field called directions in which you might enter directions to each person’s house.
What issues come up regarding versioning and extensibility? Which form was easier to modify? What about converting existing data?

Shell Games

GUIs are good, but they are WYSIWYG (what you see is what you get) and also WYSIAYG (what you see is all you get). Take time to configure your shell environment, because it’s a powerful tool.

change your color themes
configure your prompt: there’s a lot of useful information (cwd, git status info, etc)
set up aliases and shell functions, like dcu expanding into docker compose up
make sure your command completion is working: it saves so much time

Section Challenges

[ ] Are there things that you’re currently doing manually in a GUI? Do you ever pass instructions to colleagues that involve a number of individual “click this button”, “select this item” steps? Could these be automated?
[ ] Whenever you move to a new environment, make a point of finding out what shells are available. See if you can bring your current shell with you.
[ ] Investigate alternatives to your current shell. If you come across a problem your shell can’t address, see if an alternative shell would cope better.

Power Editing

Work at gaining skill in manipulating text efficiently by achieving editor fluency. What does “fluent” mean? How much of the following could you accomplish without using a mouse / trackpad?

When editing text, move and make selections by character, word, line, and paragraph.
When editing code, move by various syntactic units (matching delimiters, functions, modules, etc.).
Re-indent code following changes.
Comment and uncomment blocks of code with a single command.
Undo and redo changes.
Split the editor window into multiple panels, and navigate between them.
Navigate to a particular line number.
Sort selected lines.
Search for both strings and regular expressions, and repeat previous searches.
Temporarily create multiple cursors based on a selection or on a pattern match, and edit the text at each in parallel.
Display compilation errors in the current project.

Fluency is something you move towards, not something you arrive at (nobody knows the whole of their text editor). If you find yourself doing something repetitive, see if you can find a better way through your editor. Repeat that new skill a lot so that you can ingrain the more efficient way into your development workflow. When you find editor limitations, see if a plugin can help. Learning to write plugins for your editor can also be a great solution—if you needed that feature, other people probably will, too!

Section Challenges

[ ] No more autorepeat. Everyone does it: you need to delete the last word you typed, so you press down on backspace and wait for autorepeat to kick in. In fact, we bet that your brain has done this so much that you can judge pretty much exactly when to release the key. So turn off autorepeat, and instead learn the key sequences to move, select, and delete by characters, words, lines, and blocks.
[ ] This one is going to hurt. Lose the mouse/trackpad. For one whole week, edit using just the keyboard. You’ll discover a bunch of stuff that you can’t do without pointing and clicking, so now’s the time to learn. Keep notes (we recommend going old-school and using pencil and paper) of the key sequences you learn. You’ll take a productivity hit for a few days. But, as you learn to do stuff without moving your hands away from the home position, you’ll find that your editing becomes faster and more fluent than it ever was in the past.
[ ] Look for integrations. While writing this chapter, Dave wondered if he could preview the final layout (a PDF file) in an editor buffer. One download later, the layout is sitting alongside the original text, all in the editor. Keep a list of things you’d like to bring into your editor, then look for them.
[ ] Somewhat more ambitiously, if you can’t find a plugin or extension that does what you want, write one. Andy is fond of making custom, local file-based Wiki plugins for his favorite editors. If you can’t find it, build it!

Version Control

In short, always use it! A helpful thought experiment is to imagine your computer breaking—how long would it take you to get your environment back up and running? Now how about if you stored your dotfiles in a git repository?

Version control can also be the heart of a good DevOps plan. Did it break? Roll it back.

[ ] What version control systems have you used, and do you have any preferences for one over the other?
[ ] Do you have a preferred git strategy? I.e., Trunk-Based, Git Flow, Trunkless, etc.
[ ] Have you heard of Conventional Commits?

Section Challenges

[ ] Knowing you can roll back to any previous state using the VCS is one thing, but can you actually do it? Do you know the commands to do it properly? Learn them now, not when disaster strikes and you’re under pressure.
[ ] Spend some time thinking about recovering your own laptop environment in case of a disaster. What would you need to recover? Many of the things you need are just text files. If they’re not in a VCS (hosted off your laptop), find a way to add them. Then think about the other stuff: installed applications, system configuration, and so on. How can you express all that stuff in text files so it, too, can be saved? An interesting experiment, once you’ve made some progress, is to find an old computer you no longer use and see if your new system can be used to set it up.
[ ] Consciously explore the features of your current VCS and hosting provider that you’re not using. If your team isn’t using feature branches, experiment with introducing them. The same with pull/merge requests. Continuous integration. Build pipelines. Even continuous deployment. Look into the team communication tools, too: wikis, Kanban boards, and the like. You don’t have to use any of it. But you do need to know what it does so you can make that decision.
[ ] Use version control for non-project things, too.

Debugging

Separate debugging from blame, and it’s just problem solving. Fix the problem, not the blame. Consider using your compiler’s strictest warnings so that you’re not working on a problem that it could have found for you.

Don’t panic.
Yes, it can happen (even though you don’t know how yet).
Gather all relevant data.
You may need to interview the user who reported the bug to gather more relevant data.
Brutally test boundary conditions and realistic end-user conditions.

Strategies

Make the bug reproducible.
Create a failing test before fixing code, or you’ll end up back here.
Read the error message!
If you have a bad result (not an application crash), a debugger can help you trace the value.
Jotting notes (your expectations, what’s happening now) during the process can keep you from covering the same ground twice.
If a particular dataset causes a bug, reproduce it locally. Binary chop (binary search) the dataset until you find the input causing the bug.
Regression after release is a good time for binary chop, too.
Binary chop / search works by splitting the problematic dataset in half and feeding each half back into the routine to see which one still fails.
git bisect is binary search for version control.
Logging and tracing can help you gather historical data (as opposed to debugging, which focuses on current state).
Rubber ducking can help you better understand your own problem, since you have to shift gears mentally and explain the problem thoroughly to another person.
Process of elimination helps: it’s probably not the OS; it’s probably that code you just changed, even though you don’t yet know how.
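The binary chop strategy above can be sketched in a few lines of Python. This is a minimal version under stated assumptions: exactly one record triggers the bug, and the hypothetical fails callback returns True whenever the subset it is given contains that record.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def find_culprit(records: Sequence[T], fails: Callable[[Sequence[T]], bool]) -> T:
    """Return the single record that makes `fails` return True."""
    if not fails(records):
        raise ValueError("the full dataset does not reproduce the bug")
    candidates = records
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        # Keep whichever half still reproduces the failure.
        candidates = first_half if fails(first_half) else candidates[middle:]
    return candidates[0]
```

For a dataset of 100 records this needs about seven runs of the failing routine rather than up to 100. When several records interact to cause the bug, the single-culprit assumption no longer holds and the search needs a different approach.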

Debugging Checklist

[ ] Is the problem being reported a direct result of the underlying bug, or merely a symptom?
[ ] Is the bug really in the framework you’re using? Is it in the OS? Or is it in your code?
[ ] If you explained this problem in detail to a coworker, what would you say?
[ ] If the suspect code passes its unit tests, are the tests complete enough? What happens if you run the tests with this data?
[ ] Do the conditions that caused this bug exist anywhere else in the system? Are there other bugs still in the larval stage, just waiting to hatch?

Text Manipulation

awk and sed are text manipulation tools. Perl has excellent text manipulation (Ruby and Python as well). Bottom line, learn a tool that lets you do this. It’s tremendously useful and lets you hack together utilities and prototypes—something that would take much longer using conventional languages.

Section Challenges

[ ] Exercise 11 You’re rewriting an application that used to use YAML as a configuration language. Your company has now standardized on JSON, so you have a bunch of .yaml files that need to be turned into .json. Write a script that takes a directory and converts each .yaml file into a corresponding .json file (so database.yaml becomes database.json, and the contents are valid JSON).
[ ] Exercise 12 Your team initially chose to use camelCase names for variables, but then changed their collective mind and switched to snake_case. Write a script that scans all the source files for camelCase names and reports on them.
[ ] Exercise 13 Following on from the previous exercise, add the ability to change those variable names automatically in one or more files. Remember to keep a backup of the originals in case something goes horribly, horribly wrong.

Engineering Daybooks

The authors recommend taking daily notes in a paper notebook about meetings, what you’re working on, debugging output, ideas, etc.

More reliable than memory.
Gives you a place to store ideas not immediately relevant to the task at hand.
Allows you to rubber duck a bit by switching gears to write.

Elasticsearch

posted on November 27, 2023

Elasticsearch: This Could be a Career

Implementing a Custom Analyzer

An analyzer processes indexed data so that it is relevant to text searches. It is made up of three parts: zero or more character filters, one tokenizer, and zero or more token filters. The character filters preprocess the stream of characters before passing it to the tokenizer. The tokenized stream is then passed through the token filters.
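As an illustration of the three parts, the Python below builds the kind of index settings body that defines a custom analyzer. It is a sketch, not the analyzer from this post: the name my_custom_analyzer is hypothetical, and the particular built-in components chosen (html_strip, standard, lowercase, asciifolding) are assumptions picked to show one of each stage.

```python
import json

index_settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_custom_analyzer": {  # hypothetical analyzer name
                    "type": "custom",
                    # Stage 1: character filters run on the raw character stream.
                    "char_filter": ["html_strip"],
                    # Stage 2: exactly one tokenizer splits the stream into tokens.
                    "tokenizer": "standard",
                    # Stage 3: token filters run on the tokens, in listed order.
                    "filter": ["lowercase", "asciifolding"],
                }
            }
        }
    }
}

# The JSON form is what would be sent as the body when creating an index.
print(json.dumps(index_settings, indent=2))
```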
