This time, it is more of a "Monthly bits" piece because I didn’t have time to publish regular weekly updates in April.
One of our on-premise customers reported a problem with redirects when running CodeScene behind a proxy server terminating HTTPS. In such cases, we had always recommended that clients configure their proxy server to rewrite redirect URLs.
However, my colleague wasn’t satisfied with this solution and looked at the HTTP routing machinery we use. They found that friend, a Clojure authentication library, enforced absolute URLs for redirects even though we configured it to use relative URLs.
I found that ring does the same thing, but its behavior can be customized.
A long time ago, the old HTTP spec mandated using absolute URLs for redirects. This is no longer valid and it’s completely fine to use relative URLs for redirects. See https://en.wikipedia.org/wiki/HTTP_location.
I submitted a pull request to friend’s repo to get rid of enforcing absolute URLs for redirects and configured ring-defaults in our app properly:
(require '[ring.middleware.defaults :refer [wrap-defaults site-defaults]])

(defn not-enforce-absolute-redirects [defaults]
  (assoc-in defaults [:responses :absolute-redirects] false))

(def my-app
  (-> handler
      (wrap-defaults (-> site-defaults not-enforce-absolute-redirects))))
This fixed the customer’s problem.
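To illustrate why enforced absolute redirects misbehave behind an HTTPS-terminating proxy, here is a minimal sketch that uses only `java.net.URI`. It is not how ring or friend implement this; the `absolute-location` helper and the host name are hypothetical and only show the general idea of resolving a relative `Location` against the URL as the application sees it.

```clojure
(import 'java.net.URI)

(defn absolute-location
  "Resolves a relative redirect target against the URL the app believes it was requested at.
  Hypothetical helper, for illustration only."
  [request-url location]
  (str (.resolve (URI. request-url) ^String location)))

;; The proxy terminates HTTPS and forwards plain HTTP,
;; so the app only ever sees an internal http:// URL (made-up host name).
(def seen-by-app "http://codescene.internal:3003/login")

(absolute-location seen-by-app "/projects")
;; => "http://codescene.internal:3003/projects"
```

The browser would be sent to an internal `http://` address, whereas a relative `/projects` is resolved by the browser against the public `https://` URL it actually used.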
A programmer shared a nice experience report on Clojurians slack about using the REPL when talking to a business person:
Their eyes widened as I was able to use Calva’s Shift + Option + Enter keyboard shortcut to evaluate each “step” of the threading macro to see the result. Creating “verbs” as functions and stringing them together with threading macros—evaluating each “step” in the REPL—is a powerful “scratchpad/whiteboard” for showing and explaining things to non-technical people.
Had they asked a few months ago, I would have used Python in Jupyter Notebook. And they were in awe of Clojure/VS Code/Calva, saying things like: “What is that tool you’re using?! That’s amazing. I need something like that.”
Reminder to self: Show, don’t tell. And don’t assume that non-technical people won’t understand what’s going on.
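A rough sketch of what such a "verbs strung together" pipeline could look like. The data and the function names are made up for illustration; in the REPL you would evaluate each step of the threading form separately to show the intermediate result.

```clojure
;; Hypothetical sample data
(def orders
  [{:customer "Ada" :total 120 :paid? true}
   {:customer "Bob" :total 80  :paid? false}
   {:customer "Cyd" :total 50  :paid? true}])

;; Each "verb" is a small function that can be evaluated on its own.
(defn paid-only [orders]  (filter :paid? orders))
(defn totals    [orders]  (map :total orders))
(defn sum       [numbers] (reduce + 0 numbers))

(-> orders
    paid-only   ; two orders remain
    totals      ; (120 50)
    sum)
;; => 170
```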
didibus published clj-ddd-example:
An example implementation of Domain Driven Design in Clojure with lots of information about DDD and its concepts and how it’s implemented in Clojure along the way.
discussion on clojureverse: Domain Driven Design (DDD) in Clojure, an example implementation
I had trouble with the me.raynes.fs/rename function not moving files across a file system boundary and simply returning false.
This is actually a problem with the underlying java.io.File method renameTo.
It’s one of the many shortcomings of the legacy Java file I/O.
The solution was to use the Java NIO.2 API:
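(import '(java.nio.file CopyOption Files Path StandardCopyOption))
(require '[clojure.java.io :as io])

(defn- ->path
  "Coerces a string, java.io.File or java.nio.file.Path to a Path.
  Path is passed through as-is because io/file may not coerce it in every Clojure version."
  [x]
  (if (instance? Path x) x (.toPath (io/file x))))

(defn rename-file
  "Moves source file to the target file while respecting `replace-existing?`, by default true.
  `source` and `target` are expected to be a string, java.io.File or java.nio.file.Path.
  This uses java.nio.file.Files/move to overcome shortcomings of old `java.io.File/renameTo` method,
  notably when moving files between different file systems:
  - https://docs.oracle.com/javase/tutorial/essential/io/legacy.html
  - https://www.baeldung.com/java-path-vs-file"
  ([source target]
   (rename-file source target true))
  ([source target replace-existing?]
   (Files/move (->path source)
               (->path target)
               ;; Explicit element type: an empty untyped array would be Object[], not CopyOption[].
               (into-array CopyOption
                           (if replace-existing?
                             [StandardCopyOption/REPLACE_EXISTING]
                             [])))))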
The new function moves the file properly (and if it does not, it will throw an exception).
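A small self-contained sketch of that behavior, using temporary files. The `move!` and `demo` names are hypothetical and are not the function from my code base; the sketch only assumes the standard `java.nio.file.Files` API.

```clojure
(import '(java.nio.file Files Path CopyOption StandardCopyOption LinkOption OpenOption)
        '(java.nio.file.attribute FileAttribute))

(defn move!
  "Moves `source` to `target`; replaces an existing target only when `replace?` is true."
  [^Path source ^Path target replace?]
  (Files/move source target
              (into-array CopyOption
                          (if replace? [StandardCopyOption/REPLACE_EXISTING] []))))

(defn demo
  "Creates two temp files and tries to move the first over the second, without and with replacing."
  []
  (let [dir (Files/createTempDirectory "rename-demo" (make-array FileAttribute 0))
        a   (Files/write (.resolve dir "a.txt") (.getBytes "a") (make-array OpenOption 0))
        b   (Files/write (.resolve dir "b.txt") (.getBytes "b") (make-array OpenOption 0))
        ;; The target exists and we did not ask to replace it: the failure is an exception.
        without-replace (try (move! a b false)
                             :moved
                             (catch java.io.IOException _ :threw))
        ;; With REPLACE_EXISTING the move goes through.
        _ (move! a b true)]
    {:without-replace without-replace
     :target-content  (String. (Files/readAllBytes b))
     :source-exists?  (Files/exists a (make-array LinkOption 0))}))

(demo)
```

Unlike `renameTo`, a failed move cannot be silently ignored: you either get the target path back or an exception to handle.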
A 2008 discussion with a couple of comments from Rich Hickey helps to understand the difference between print-method and print-dup:
Rich: Yes, please consider print/read. It is readable text, works with a lot of data structures, and is extensible.
As part of AOT I needed to enhance print/read to store constants of many kinds, and restore faithfully - This led to a new multimethod - print-dup, for high-fidelity printing.
You can get print-dup behavior by binding *print-dup*.
I have a full example here: https://github.com/jumarko/clojure-experiments/blob/master/src/clojure_experiments/compiler/reader-and-printer.clj#L34-L91. Here’s a snippet of it:
(binding [*print-dup* true]
  (dorun
   (map prn
        [[1 2 3]
         {4 5 6 7}
         (java.util.ArrayList. [8 9])
         ,,,])))
;; prints this:
;; [1 2 3]
;; #=(clojure.lang.PersistentArrayMap/create {4 5, 6 7})
;; #=(java.util.ArrayList. [8 9])
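The "restore faithfully" part can be sketched with a print/read round trip. This is a minimal example, assuming the default `*read-eval*` setting (the `#=` forms are evaluated by the reader, so only read input you trust); the `plain` and `dup` names are just for illustration.

```clojure
;; Regular printing loses the concrete Java type ...
(def plain (pr-str (java.util.ArrayList. [8 9])))

;; ... while print-dup emits a constructor call the reader can evaluate.
(def dup (binding [*print-dup* true]
           (pr-str (java.util.ArrayList. [8 9]))))

dup
;; => "#=(java.util.ArrayList. [8 9])"

;; Reading the print-dup output gives back a java.util.ArrayList.
(instance? java.util.ArrayList (read-string dup))
;; => true

;; Reading the plain output gives back an ordinary Clojure collection instead.
(instance? java.util.ArrayList (read-string plain))
;; => false
```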
Eric Dallo announced a new lib lsp4clj that should help you create any LSP for any language in Clojure.
Chris Nuernberger introduced Charred, a brand new library for parsing JSON and CSV. It’s very fast and, importantly, doesn’t have any dependency on Jackson.
A useful tip from Clojurians slack shows how to reload and print all dependencies of a namespace:
(require (ns-name *ns*) :reload :verbose)
Pipes are the most common way to read the output of a sub-process. It is important to read the sub-process’s output promptly, though. If you fail to do so, the sub-process will block once its output exceeds the maximum pipe buffer size, which is typically 64 KB.
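A minimal sketch of the "read first, wait later" order in Clojure. It assumes a Unix-like system with the `seq` command on the PATH; `run-and-capture` is a hypothetical helper, not a library function.

```clojure
(require '[clojure.string :as str])

(defn run-and-capture
  "Starts `command` (a vector of strings), drains its output, then waits for it to exit."
  [command]
  (let [^Process process (-> (ProcessBuilder. ^java.util.List command)
                             (.redirectErrorStream true) ; merge stderr into stdout
                             (.start))
        ;; Drain the pipe BEFORE calling waitFor. With the order reversed, a child that
        ;; writes more than the pipe buffer holds would block and waitFor would never return.
        output    (slurp (.getInputStream process))
        exit-code (.waitFor process)]
    {:exit       exit-code
     :line-count (count (str/split-lines output))}))

;; `seq 1 50000` prints well over 64 KB of text.
(run-and-capture ["seq" "1" "50000"])
;; => {:exit 0, :line-count 50000}
```

If you don't need the output at all, redirecting it (for example with `ProcessBuilder`'s `redirectOutput`) avoids the problem as well.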
Sometimes, you may experience a rather odd "OutOfMemoryError" with a detailed error message saying something like this:
Exception in thread "async-dispatch-4" java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
This is probably a symptom of something else. It could be insufficient system memory, but it’s more likely that you are hitting the 'max number of open files' limit.
Here are some things to try (see also How to solve java.lang.OutOfMemoryError: unable to create new native thread):
check threads-max (system-wide limit on max number of threads): sysctl kernel.threads-max
check ulimit -n (number of open files)
count processes & threads: ps -elfT | wc -l
count threads for given process: ps -p $(pgrep java) -lfT | wc -l
check PID limit: sysctl kernel.pid_max
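The checks above look at the process from the outside. As a complement, here is a small sketch for looking at thread counts from inside the JVM (for example from a REPL connected to the running application), using the standard `ThreadMXBean`. The `thread-stats` name is made up.

```clojure
(import '(java.lang.management ManagementFactory ThreadMXBean))

(defn thread-stats
  "Returns the current, peak and daemon thread counts of this JVM."
  []
  (let [^ThreadMXBean mx (ManagementFactory/getThreadMXBean)]
    {:live   (.getThreadCount mx)
     :peak   (.getPeakThreadCount mx)
     :daemon (.getDaemonThreadCount mx)}))

(thread-stats)
```

A steadily growing `:live` count is a hint that the limit being hit is about threads rather than memory.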
As a follow-up to the previous OOM topic, I came up with a basic Java/Linux monitoring script.
Run it like this:
nohup watch -n 20 './watch.sh >> monitor.log' &> /dev/null &
I answered this StackOverflow question: How to prevent a Java application from executing processes on GNU/Linux?
In short, I don’t think this is easily achievable. Security Manager used to be a tool to solve such problems but it’s deprecated for removal. If you try to limit the number of processes executed by the JVM process you will run into serious issues because threads themselves are counted as processes.
[9816123.415s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 2048k, guardsize: 0k, detached.
java.lang.Shutdown class is a useful place to learn about what happens when the JVM process exits.
An orderly shutdown sequence will call Shutdown#exit whose body is straightforward:
beforeHalt();
runHooks();
halt(status);
For the application programmer, the important method is runHooks: it calls all the registered shutdown hooks.
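For completeness, a minimal sketch of registering such a hook from Clojure via `Runtime/addShutdownHook`. The `cleanup-hook` name and the printed message are just for illustration.

```clojure
;; A shutdown hook is an unstarted Thread; the JVM starts it during an orderly shutdown.
(def cleanup-hook
  (Thread. ^Runnable (fn [] (println "Shutting down, releasing resources..."))))

(.addShutdownHook (Runtime/getRuntime) cleanup-hook)
```

Hooks should be quick and must not rely on a particular execution order, since the JVM runs the registered hooks concurrently.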
I’ve had the book for some time but didn’t find time to read it. This month, I listened to a Thoughtworks' podcast about the book where they give a useful summary of it:
It’s all about trade off analysis
Two (three) main things in distributed analysis
sizing (granularity of) services
communication/wiring
And then data management (you cannot treat sw and data architecture separately these days)
Book What every programmer should know about object oriented design is useful for understanding various types of cohesion
especially the distinction between the static and dynamic coupling
Architecture kata: SysOps squad stories
Rebecca: Architecture gives us (is?) a framework for thinking about and working with complexity
Data architecture
Historically, there was a sharp distinction between analytics (data warehouses) and operational data
Today, we see the need for bringing analytical capabilities back to applications (in real time)
We have three main types: data warehouses, data lakes, data meshes
There’s still a key distinction between them:
operational workloads - the code defines a procedure and stores its state (code-first)
analytics - data drives the behavior, like in Machine learning (data-first)
I also checked another podcast’s episode about NoSQL with Martin Fowler. He offered a good piece of advice:
Using Database as an integration point is horrible
Amazon did it right by integrating via external APIs
Amazon CloudFront now supports Server Timing headers
Server Timing headers provide detailed performance information, such as whether the content was served from the cache when the request was received, how the request was routed to the CloudFront edge location, and how much time elapsed during each stage of the connection and response process.
Example:
Server-Timing: cdn-upstream-layer;desc="REC",cdn-upstream-dns;dur=0,cdn-upstream-connect;dur=195,cdn-upstream-fbl;dur=366,cdn-cache-miss,cdn-pop;desc="IAD89-C3",cdn-rid;desc="bjEUzYyv7e3FyYoK93Tw0MNYhNV2zVTMbjFO8g-Tr5aEW108VkzM-w=="
This can help you diagnose CloudFront errors such as mysterious OriginCommError.
To enable the Server-Timing header, create (or edit) a response headers policy.
Another useful article/guide from Cloudonaut. They demonstrate how CloudFront can be used to restrict parts of your website only to authenticated users.
A nasty vulnerability was disclosed by ForgeRock (see https://neilmadden.blog/2022/04/19/psychic-signatures-in-java/): you could produce a fake signature which Java would accept as a valid signature for any message and for any public key!
Action:
If you are running one of the vulnerable versions then an attacker can easily forge some types of SSL certificates and handshakes (allowing interception and modification of communications), signed JWTs, SAML assertions or OIDC id tokens, and even WebAuthn authentication messages
If you have deployed Java 15, Java 16, Java 17, or Java 18 in production then you should stop what you are doing and immediately update to install the fixes in the April 2022 Critical Patch Update.
The very first check in the ECDSA verification algorithm is to ensure that r and s are both >= 1.
Guess which check Java forgot?
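The missing check itself is tiny. Here is a sketch of the range check only (not a full ECDSA verification): both r and s must lie in [1, n-1], where n is the order of the curve's group. The `signature-in-range?` name is hypothetical and `toy-order` is a deliberately tiny made-up value, not a real curve parameter.

```clojure
(defn signature-in-range?
  "True when both r and s are within [1, n-1]; n is the group order."
  [r s n]
  (and (<= 1 r (dec n))
       (<= 1 s (dec n))))

;; Toy value for illustration only; real curves use orders of roughly 256 bits or more.
(def toy-order 23N)

;; The forged "psychic" signature uses r = s = 0 and must be rejected up front.
(signature-in-range? 0N 0N toy-order)
;; => false

(signature-in-range? 5N 7N toy-order)
;; => true
```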
I got an interesting question about the usage of anti-virus software on cloud servers, so I asked about it on the DevOps Engineers Slack.
The end result is that not many people use antivirus software on their server infrastructure, and some consider it even riskier than not running an antivirus at all (it’s a complicated low-level program continuously scanning all the system binaries).
A few people mentioned solutions like ClamAV and Falcon platform
A solid book by Gerald Weinberg.
A lot of it is about busting various myths about software testing (such as the idea that you can find all the bugs via a perfect testing process). He keeps saying that testing is about providing information to managers so they can make decisions to mitigate risk. It also offers advice on how to give such information and how to receive feedback.
Recommended!
Chris Houser is looking for people to join his book club reading the Lisp in Small Pieces book: https://chouser.us/lisp2022/
I decided to order The Art of War: Complete Texts and Commentaries. I’ve stumbled on it every now and then in the past years - for instance, I liked the selection of quotes in The Art of Scalability.
It’s time to give it a shot.
This short talk offers good advice on how to write good technical documentation. My summary:
Start with what the reader needs (don’t include stuff just because it’s "related" or "you know about it")
Write less
Write the outline first - helps you to write less too and focus on what the reader needs
"Rubber ducking" - talking helps to clarify thoughts, debug your understanding
you can write it down to be captured in the docs later
yields friendly conversational stuff
read it out loud
write readably - not one big pile of text
use headings, lists, short paragraphs (one idea)
put most important bits first
it’s not just about documentation - think how to make the underlying thing better (fix the thing instead of "writing around the problem")
I started doing some typing exercises. My goal is to achieve 120 wpm at 99% accuracy but so far it’s been challenging :). I think I’ll need at least a few months to achieve that.
The websites I’m using:
10fastfingers.com
Are you struggling with many tabs and finding the right one quickly?
Just use Cmd + Shift + A (or Ctrl + Shift + A).
It’s powerful!
I watched this talk. It was fun and refreshing. As always, Bryan’s energy is incredible.
The key quotation for me:
We wanna solve hard commercially relevant problem with people that inspire us achieving a mission that we believe in.
The word "We" here refers to the mythical 10x/top engineers.
A very good article on the topic of "low-code" tools and techniques. One of the key ideas, for me, was how to measure the amount of code: we shouldn’t count the amount of code in lines but in cognitive load.
Therefore, when minimizing code we should strive to reduce the cognitive load primarily, not LOC.
We also need to reduce maintenance and operations (like most SaaS products do).
The Pudding makes data fun. As one example, check their Film Dialogue.
difftastic is a cool diffing tool and it supports Clojure!
You can use it with git log easily:
GIT_EXTERNAL_DIFF=difft git log -p --ext-diff
Until now, I somehow managed to not know about this technique (commonly known as carrier-grade NAT) used by some ISPs (Internet Service Providers). It is often used for mitigating IPv4 address exhaustion. The idea is to translate end users' IP addresses to a single shared IP address. Hundreds of users may end up with the same public IP address!
This can cause problems with services blocking/whitelisting access by IP address:
People that shouldn’t be blocked are blocked because a client sharing the same public IP is misbehaving and thus gets blocked.
People that should be blocked gain access to an internal service of a company because they share an IP address of the company’s employee.
I learned about this "magic" facility of Linux OS when reading about Processes in an Uninterruptible Sleep (D) State. It can list all the processes in this 'D' state with associated kernel stack traces. It might be very useful for debugging tricky issues.
You trigger it simply:
echo w > /proc/sysrq-trigger
Then check the system’s logs:
dmesg -T
...
[Fri Apr 22 11:11:40 2022] sysrq: Show Blocked State
[Fri Apr 22 11:11:40 2022] task PC stack pid father
[Fri Apr 22 11:11:40 2022] analysis-schedu D 0 4460 2556 0x00000320
[Fri Apr 22 11:11:40 2022] Call Trace:
[Fri Apr 22 11:11:40 2022] __schedule+0x2e3/0x740
...
Note: The whole The Linux kernel user’s and administrator’s guide might be quite useful!
Namespaces are a fundamental building block for Linux containers. They give us an illusion of:
superuser inside the container (User namespace),
isolated file system (Mount namespace),
process isolation (PID namespace),
separate network (Network namespace)
I think it’s worth getting familiar with them. For that, Michael Kerrisk’s talk Containers unplugged: Linux namespaces is an excellent starting point.
Another great talk by Michael Kerrisk, and a relatively new one (2018), is about strace. It’s an impressive and very useful tool. I used it numerous times to debug tricky issues. It’s also useful for learning about the Linux operating system.
The talk gives you a solid base to start using the tool on your own. You can find slides here and all Michael’s presentations at https://man7.org/conf/index.html.
I talked about VisiData before and I think it’s a great tool. I still struggle with it a lot, mostly because I use it only every now and then. Accidentally, I found out how to open CSV files with non-standard delimiters like semicolons:
vd --csv-delimiter ";" my-ebanking-export.csv
Flamescope is like an interactive heatmap: you can select an arbitrary continuous time-slice of the captured profile and visualize it as a flame graph.
Xv6, a simple Unix-like teaching operating system, is recommended for teaching/learning operating systems in the book Operating Systems: Three Easy Pieces.
I needed this recently in our Dockerfile to avoid duplication when using tini:
mycmd=(java)
if test "$$" = "1"; then
  mycmd=(tini -- java)
fi
exec "${mycmd[@]}" -XshowSettings:vm ...
This is also a good reminder to not run your application as PID 1 inside the container: https://gds-way.cloudapps.digital/manuals/programming-languages/docker.html#running-programs-as-process-id-pid-1
A quick recap of some of the links mentioned in this post:
obsolete version of HTTP spec mandated using absolute URLs for redirects: see https://en.wikipedia.org/wiki/HTTP_location
Don’t run your application as PID 1 inside the container: https://gds-way.cloudapps.digital/manuals/programming-languages/docker.html#running-programs-as-process-id-pid-1
Amazon CloudFront now supports Server Timing headers
Java Security Manager is deprecated for removal.
The Quest for Low-Code: 9 paths, some of which actually work (Gregor Hohpe)
lsp4clj - create any LSP for any language in Clojure
Charred - new JSON & CSV parsing library with zero dependencies and very fast
Film Dialogue - one of the Pudding’s pages.
Psychic signatures in Java: you could produce a fake signature which Java would accept as a valid signature for any message and for any public key!
Processes in an Uninterruptible Sleep (D) State - use Linux SysRq to get details about such processes
Michael Kerrisk’s talk Containers unplugged: Linux namespaces
Flamescope - Flamegraph on steroids
Xv6, a simple Unix-like teaching operating system - recommended in Operating Systems: Three Easy Pieces
Chris Houser started a book club reading Lisp in Small Pieces: https://chouser.us/lisp2022/
Serving content only to logged-in users with CloudFront Signed Cookies
Leadership Without Management: Scaling Organizations by Scaling Engineers (Bryan Cantrill)