> Some routers and firewalls silently kill idle TCP connections without telling the application. Some code (like HTTP client libraries, database clients) keeps a pool of TCP connections for reuse, which can be silently invalidated. To solve it you can configure system TCP keepalive. For HTTP you can use the Connection: keep-alive and Keep-Alive: timeout=30, max=1000 headers.
Once a TCP connection has been established there is no state on routers in between the 2 ends of the connection. The issue here is firewalls / NAT entries timing out. And indeed, no RSTs are sent.
We had the issue in K8s with the conntrack module set too low.
Now, you can try to put in an HTTP Keep-Alive, but that will not help you. The HTTP Keep-Alive is merely for connection re-use at the HTTP level, i.e. it doesn't close the connection: https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/...
An HTTP Keep-Alive does not generate any packages, it merely postpones the close.
A TCP Keep-Alive generates packages which resets the timers.
s/packages/packets/
Thanks I will correct that
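For anyone who wants to see what "configure TCP keepalive" looks like from application code, here is a minimal sketch in Python. The helper name enable_tcp_keepalive and the timing values are made up for illustration. The per-socket tuning options (TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT) are platform-specific, so the sketch only sets the ones the local Python build exposes.

```python
import socket

def enable_tcp_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive probes for a socket (hypothetical helper)."""
    # This is the portable part: ask the kernel to send keepalive probes.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning knobs below are not available everywhere, so each one is
    # set only if this platform's socket module defines it.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_tcp_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

The idle time should be shorter than the shortest NAT or conntrack timeout on the path, otherwise the entry expires before the first probe is sent.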
> A method that returns Optional<T> may return null.
projects that do this drive me bananas
If I had the emotional energy, I'd open a JEP for a new @java.lang.NonNullReference and any type annotated with it would be a compiler error to assign null to it
public interface Alpha {}
@java.lang.NonNullReference
public interface Beta {}
Alpha a = null; // ok
Beta b = null; // compiler error
javac will tolerate this:
Beta b;
if (Random.randBoolean()) {
    b = getBeta();
} else {
    b = newBeta();
}
but I would need to squint at the language specification to see if dead code elimination is a nicety or a formality:
Beta b;
if (true) {
    b = getBeta();
} else {
    b = null; // I believe this will be elided and thus technically legal
}
I question the wisdom of even having Optional<T> in a language with nulls. It would raise some eyebrows if a function in Python returned an Optional type object rather than T | None. You have to do a check either way unless you're doing some cute monad-y stuff.
You're far from alone, it does make it a tiny bit easier to see which functions are expected to return null, but that's about it and messing around with it always feels like wasted effort.
It works quite well in Scala, which still tolerates nulls due to being on the JVM and having Java interop. Realistically nothing in the language is going to return null, so the only time you might have to care is when you call Java classes, and all of the Java standard library comes scalaified into having no nulls. And yes, there is enough monadic behavior in the standard library to make Option and Either quite useful, instead of just sum types.
Java really suffers with optional because the language has such love for backwards compatibility that it's extremely unlikely that nulls would even be removed from the standard library in the first place. The fact that the ecosystem relies on ugly auto wiring hacks instead of mandating explicit constructors doesn't help either.
> because the language has such love for backwards compatibility
I still remember when Java 9 introduced modules. And I’m currently pulling my hair because Java 21 renamed all javax.* into jakarta.* because Javax was a trademark of Oracle, and all libs now require a “-jakartax” version for JDK 21.
But somehow I still have to deal with nulls everywhere and erased-at-runtime generics because Java loves backwards compatibility so much. The simple fact all libs released a “-jakartax” proves the entire ecosystem is fully maintained (plus CVEs means unmaintained libs aren’t allowed in production), so they could very well release a -jdk25 version with non-null types.
Maybe this is cute monady stuff, but there isn't an equivalent to Optional<Optional<T>> with only null/None. You usually don't directly write that, but you might incidentally instantiate that type when composing generic code, or a container/function won't allow nulls.
In what context would you not want to treat Optional.of(null) and null as the same? It shouldn't be a big deal.
The None branch of each level of a nested Optional has a different meaning.
But typically it boils down to either you have the data or you don't. It's a subtle difference which I argue you can live without.
Often people use optional or nullable types as a convenient approximation to an Either type.
I still don't see why it would be a problem merging them down even when used like an either. If there is no value then there is no value.
In JSON/REST API bindings, where a deserializer maps JSON to language-native object/struct type, I'll often need to know the difference between:
{}
and { "foo": null }
and { "foo": 42 }
So I'll represent that (in e.g. Rust) as:
struct Whatever {
    foo: Option<Option<u32>>,
}
None means not present, Some(None) means present but null, and Some(Some(42)) means present with a value. I'll often use this in PATCH endpoints, where not-present means to leave the current value alone, null means to unset it, and a value means to set to that value.
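The same three-way distinction can be sketched in a language without nested optionals by using a sentinel for "not present". This is only an illustration in Python: MISSING and apply_patch are hypothetical names, and "foo" is the field from the example above.

```python
import json

MISSING = object()  # sentinel: the key was not in the request body at all

def apply_patch(current, body):
    """Return a copy of `current` with PATCH semantics applied to "foo"."""
    updated = dict(current)
    value = body.get("foo", MISSING)
    if value is MISSING:
        pass                      # {}            -> leave the current value alone
    elif value is None:
        updated.pop("foo", None)  # {"foo": null} -> unset it
    else:
        updated["foo"] = value    # {"foo": 42}   -> set it to that value
    return updated

stored = {"foo": 1}
untouched = apply_patch(stored, json.loads('{}'))
cleared = apply_patch(stored, json.loads('{"foo": null}'))
replaced = apply_patch(stored, json.loads('{"foo": 42}'))
```

If "not present" and "null" were merged, the first two requests would be indistinguishable and one of the two behaviours would be lost.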
How about a situation where the inner Optional<T> is acquired from another system or database, and the outer Optional<Optional<T>> is a local cache of the value. If the outer Optional is null, then you need to query the other system. If the outer Optional is filled and the inner Optional is null, then you know that the other system explicitly has no value for the data item, and can skip the query. Seems like using nested optionals would be natural here, although of course alternative representations are possible.
There's a lot of quality-of-life stuff enabled by it in Java, since the base language's equivalents to Optional.empty(), Optional.ofNullable(...).orElse(...), etc are painfully verbose by comparison.
In Kotlin this would already be a compile error, no need for another annotation.
> Java, C# and JS use UTF-16-like encoding for in-memory string
That’s incorrect for Java, possibly also for C# and JS.
In any language where strings are opaque enough types [1], the in-memory representation is an implementation detail. Java has been such a language since release 9 (https://openjdk.org/jeps/254)
[1] The ‘enough’ is because some languages have fully opaque types, but specify efficiency of some operations and through it, effectively prescribe implementation details. Having a foreign function interface also often means implementation details cannot be changed because doing that would break backwards compatibility.
> JS use floating point for all numbers. The max accurate integer is 2⁵³−1
That is incorrect. Much larger integers can be represented exactly, for example 2¹⁰⁰.
What is true is that 2⁵³−1 is the largest integer n such that n-1, n, and n+1 can be represented exactly in an IEEE double. That, in turn, means n == n-1 and n == n+1 both will evaluate to false, as expected in ‘normal’ arithmetic.
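A quick way to check this is with any language that uses IEEE doubles. The sketch below uses Python floats (which are doubles) and Python's arbitrary-precision ints as the exact reference; the variable names are just for illustration.

```python
MAX_SAFE = 2**53 - 1

# 2**100 is a power of two, so it converts to a double with no rounding.
big_is_exact = int(float(2**100)) == 2**100

# Just past 2**53 the spacing between doubles is 2, so 2**53 + 1
# cannot be represented and rounds to 2**53.
collides = float(2**53) == float(2**53 + 1)

# At MAX_SAFE both neighbours are still representable and distinct.
n = float(MAX_SAFE)
neighbours_distinct = (n != n - 1) and (n != n + 1)
```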
> possibly also for C# and JS
The representation for C# is very much fixed, as it allows, and very commonly uses, direct access into the string buffer as a ReadOnlySpan<char> or a raw char pointer, where char is the type of UTF-16 code units.
JS could maybe get away with it.
When you have code that works a lot with strings the cost overhead of building an app on iso-latin-1 but encoding as utf-16 can be substantial.
I think Java moved away from this back around 8, or possibly 9.
Yeah, I think they didn't mean max "accurate" integer and rather meant max "safe" integer.
Thanks I will correct that
> > Java, C# and JS use UTF-16-like encoding for in-memory string
>
> That’s incorrect for Java,
Maybe so, technically, but if you Base64 encode a string in a language that uses UTF-8 (or another UTF-16 with another endian) and decode it in Java, Java's UTF-16 representation will be the problem you will be dealing with.
That's why when you are constructing a String with a byte array, you always, always, always use the constructor that also takes a character set.
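The "always name the character set" rule is not specific to Java. As a rough sketch of the failure mode in Python (the sample text is arbitrary): decoding the same bytes with the wrong charset does not raise an error, it just quietly produces different characters.

```python
import base64

text = "naïve café"
encoded = base64.b64encode(text.encode("utf-8"))  # what the sender ships
raw = base64.b64decode(encoded)                   # what the receiver gets: bytes

round_trip = raw.decode("utf-8")    # charset matches the sender's
mojibake = raw.decode("latin-1")    # wrong charset: no error, wrong text
```

Here mojibake comes out as "naÃ¯ve cafÃ©", because each two-byte UTF-8 sequence is read as two separate Latin-1 characters.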
I started to say something about C# strings and then I remembered the clusterfuck when it came to Windows development and strings and depending on which API you call, a string is represented by one of a dozen different ways.
https://stackoverflow.com/questions/689211/interop-sending-s...
That's a nice compendium of tips and useful information.
I wonder if anyone can learn from this. I feel like I only understood what I already knew, or at least was very close to knowing. That's the same thing that happens with teaching manuals about any topic: they're organized in a way that makes sense and it's easy for people who already know the topics, but often very bad at teaching the same topics to an audience that doesn't know anything.
> with teaching manuals about any topic: they're organized in a way that makes sense and it's easy for people who already know the topic
I think that's the reason for a manual's existence. To have a written record so we don't have to trust our memory. This is what most unix manuals are. You already know what the software can do, you just need to remember the specifics of how to get something done.
> often very bad at teaching the same topics to an audience that doesn't know anything.
What you need then is a tutorial (beginner seeking to learn) or a guide (beginner/intermediate seeking to do). Manuals in this case only serve to have better questions (Now you know what you don't know).
Kind of what I noticed for myself.
When I was a kid I was trying to learn Linux and commands and it was disappointing.
Over the years of using it I don’t need to learn it but I do need to look stuff up.
This looks like not so much traps, but a list of things the author has learned.
Much of it would only apply in certain relatively narrow contexts, but the contexts aren't necessarily mentioned.
Some of it appears to be just wrong.
I guess I'm saying: I would not take this literally, but as something almost like a stream-of-consciousness.
> Python: - Default argument is a stored value that will not be re-created on every call.
PSA for anyone working with datetime variables!
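A small sketch of how this bites with datetime, assuming the usual mistake of writing datetime.now() as a default. The function names stamp_buggy and stamp_fixed are made up for the example.

```python
import time
from datetime import datetime

def stamp_buggy(when=datetime.now()):
    # The default was evaluated once, when the function was defined,
    # so every call without an argument returns that same timestamp.
    return when

def stamp_fixed(when=None):
    # Use None as the default and compute the value inside the call.
    if when is None:
        when = datetime.now()
    return when

first_buggy, first_fixed = stamp_buggy(), stamp_fixed()
time.sleep(0.05)
second_buggy, second_fixed = stamp_buggy(), stamp_fixed()
```

The buggy version returns the identical object both times, while the fixed version returns a later time on the second call.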
The first "trap" on the page says "min-width: auto makes min width determined by content", but this is false outside of flex/grid.
From MDN: "For block boxes, inline boxes, inline blocks, and all table layout boxes auto resolves to 0."
I guess the first trap should really be: "You cannot read any CSS property in isolation, as just like what the name implies, defaults and what values end up doing cascades through all the rules your document ends up using"
CSS cascade for text properties more or less makes sense.
I have been unable to comprehend CSS layout from any perspective: page designer, implementer, user, anything. It must have someone in mind but I have no idea who that is.
https://every-layout.dev has by far the best explanations and coherent usage of CSS I've encountered since I started doing webdev for a living in 1998.
Every Layout changed how I look at and do CSS. Great resource with a good philosophy behind it: CubeCSS. It really made CSS fun for me again.
Layout is more bazaar than cathedral. It has had many ideas mixed in by different contributors over decades.
Thanks I will correct that
Largely a good listicle. Some feedback:
> Unicode unification. Different characters in different language use the same code point. Different languages' font variants render the same code point differently. 語
This isn't a trap. The given example character means the same thing in Chinese and Japanese, and the Japanese version was imported from China. People from both languages recognize both font variants as the same conceptual character.
The author is making it sound like the letter 'A' in English should have a different code point than an 'A' in French. Or that a lowercase 'a' with the top tail should be a different character than a lowercase 'a' without the top tail.
Anyway, this is discussed at length in https://en.wikipedia.org/wiki/Han_unification
> There is a negative zero -0.0 which is different to normal zero. The negative zero equals zero when using floating point comparison. Normal zero is treated as "positive zero".
And there are two ways to distinguish negative zero from normal zero: By their integer bit patterns, or by the fact that 1.0/-0.0 == -Inf vs. 1.0/0.0 == +Inf.
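Here is a sketch of telling the two zeros apart in Python. One caveat: Python raises ZeroDivisionError for float division by zero, so the 1.0/-0.0 trick is not available there. The sketch uses the bit pattern and, as a stand-in for the division trick, math.copysign.

```python
import math
import struct

neg, pos = -0.0, 0.0

equal = (neg == pos)                     # comparison cannot tell them apart

# Way 1: look at the raw IEEE 754 bits (big-endian double, as hex).
bits_neg = struct.pack(">d", neg).hex()
bits_pos = struct.pack(">d", pos).hex()

# Way 2: copy the sign onto a nonzero number and inspect that.
sign_neg = math.copysign(1.0, neg)
sign_pos = math.copysign(1.0, pos)
```

The only difference in the bit patterns is the top (sign) bit.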
> It's recommended to configure the server's time zone as UTC.
Big yes. I use UTC for servers, logs, photos, and anything that is worth archiving and timestamping properly. Local time is only for colloquial use.
> For integer (low + high) / 2 may overflow. A safer way is low + (high - low) / 2
Yes, but if low and high could be negative numbers, then you've just shifted the overflow to a different range. This matters for general binary search over an integer range, as opposed to unsigned binary search over an array.
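To make that concrete, here is a sketch that simulates 32-bit signed arithmetic in Python. It assumes two's-complement wraparound on overflow, as in Java; in C and C++ signed overflow is undefined behaviour, so the simulation only shows one possible outcome. The helper names are made up, and Python's floor division stands in for integer division.

```python
def wrap_i32(x):
    """Wrap an int to signed 32 bits, like two's-complement overflow."""
    return (x + 2**31) % 2**32 - 2**31

def mid_sum(low, high):
    # (low + high) / 2, with the sum wrapped to 32 bits
    return wrap_i32(low + high) // 2

def mid_diff(low, high):
    # low + (high - low) / 2, with each step wrapped to 32 bits
    return wrap_i32(low + wrap_i32(high - low) // 2)

# Both bounds large and positive: the sum overflows, the difference does not.
a_sum = mid_sum(2_000_000_000, 2_100_000_000)
a_diff = mid_diff(2_000_000_000, 2_100_000_000)

# Bounds of opposite sign: now the difference overflows and the sum does not.
b_sum = mid_sum(-2_000_000_000, 2_000_000_000)
b_diff = mid_diff(-2_000_000_000, 2_000_000_000)
```

So the "safer" form is only safe when low and high are known to have the same sign, which is the case for array indices but not for a search over an arbitrary integer range.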
> C/C++
I'm going to throw in one of my lists of pitfalls - just using integer types and arithmetic correctly in C/C++ is a massive developer trap. That's like the most basic thing in programming. https://www.nayuki.io/page/summary-of-c-cpp-integer-rules
> Rebase can rewrite history
"Can" is a weasel word; rebase does nothing but rewrite history.
> The author is making it sound like the letter 'A' in English should have a different code point than an 'A' in French. Or that a lowercase 'a' with the top tail should be a different character than a lowercase 'a' without the top tail.
But we do have А and A. Even though they look the same. And unified Han characters are often quite distinct, it tripped me up as a learner of Chinese more than once. For example, a very common character '喝' (drink) looks quite a bit different: https://en.wiktionary.org/wiki/%E5%96%9D - they have a different number of strokes even. And I can't even copy-paste it here to demonstrate, because it changes form once I copy it from the Wikipedia article.
Han unification is a mess.
> There are subtle differences between numpy and pytorch.
This isn't really a trap, and it doesn't help anyone; it looks like "I got burned but I don't want to share the specifics".
CSS and C++ both have the “pick a subset and enforce that, or suffer” nature. On my to-do list: make a github action that requires manual override to merge any pull request with a css attribute not already present
I am unsure how this is supposed to work for CSS. To my knowledge, most CSS properties cannot be substituted for each other. If the subset to be enforced is "CSS properties already present", what is a developer supposed to do if their CSS property is not already present? Change the design?
Well, (like C++) new css attributes are constantly added. This means you constantly have to choose between the old way or the new way: either is fine, but “pick old or new at random on a per pull request basis” isn’t.
You seem to assume that old CSS properties can be substituted for new ones. But as I said, to my knowledge this isn’t possible in most cases. Can you give an example of two CSS properties where 'either is fine, but only one should be used'?
Or do you mean something else altogether by 'CSS attributes'?
The specific case that inspired this comment was a random mix of margin and gap
A recent trap for me:
Regex semantics is subtly different across languages. E.g. a{,3} matches between 0 and 3 "a" characters in Python. In JavaScript it matches the literal string "a{,3}".
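The Python half of this is easy to check with the standard re module; the JavaScript half is not shown here. A short sketch:

```python
import re

pattern = re.compile(r"a{,3}")

# In Python, {,3} means "between 0 and 3 repetitions".
as_quantifier = [bool(pattern.fullmatch(s)) for s in ("", "a", "aaa", "aaaa")]

# The literal text "a{,3}" itself is therefore not a match.
as_literal = bool(pattern.fullmatch("a{,3}"))
```

Writing the lower bound explicitly, a{0,3}, sidesteps the difference.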
Regex is more a technique than an actual specification. It would be best to find the time to go and read an introductory book about Theory of Computation where they explain the underlying mechanism.
I always use regex101 to develop my regexes. It allows you to switch between different engines.
Honorable mention to [a-z], gotta be my favorite trap
What's the trap for this one? I can't think of any engine that parses this to mean anything other than the letters a through z.
In some common implementations if $LANG is set to certain values, it will fail to match some ASCII letters. This is because not all latin character using languages put Z last in the alphabet.
Try this (you probably need to enable and generate the locale first)
echo y | LANG=lv_LV.UTF-8 grep '[a-z]'
Locales in general should be considered a "trap", just look at Windows CSV separator handling, etc.
That's wild. Thanks for explaining. I had no idea this depends on the locale. Looks like I have about a million scripts to fix...
Not in general, but using locales for something different than affecting presentation.
It depends on its use, ultimately, but if your goal is to find a string of letters (a common use IMO), you'll want to use something like \p{L} to ensure you don't miss non-ASCII characters.
eta: fixed regex, I had typed \L, shared from my faulty memory.
[A-z] is a fun one though, as it includes a few extra symbols between upper and lowercase.
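A quick check with Python's re module, where a range in a character class is defined by code point order. Other engines may define ranges differently, so treat this as one data point.

```python
import re

# The six characters that sit between 'Z' and 'a' in ASCII / Unicode order.
between = "".join(chr(c) for c in range(ord("Z") + 1, ord("a")))

# Every one of them is matched by [A-z].
extra = re.findall(r"[A-z]", between)
```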
Does it? I thought Regex are defined on character classes not on numeric ASCII values. What would a Regex do on a different encoding then?
The part about C# volatile accesses using release-acquire ordering seems to be wrong if I read the C# docs correctly.
"There is no guarantee of a single total ordering of volatile writes as seen from all threads of execution"
https://learn.microsoft.com/en-us/dotnet/csharp/language-ref...
>A volatile write operation prevents earlier memory operations on the thread from being reordered to occur after the volatile write. A volatile read operation prevents later memory operations on the thread from being reordered to occur before the volatile read
Looks like release/acquire to me? A total ordering would be sequential consistency.
I think you are quoting from https://learn.microsoft.com/en-us/dotnet/api/system.threadin...
"In C#, using the volatile modifier on a field guarantees that every access to that field is a volatile memory operation"
This makes it sound like you are right and the volatile keyword has the same behaviour as the Volatile class which explicitly says it has acquire-release ordering.
But that seems to contradict "The volatile keyword doesn't provide atomicity for operations other than assignment, doesn't prevent race conditions, and doesn't provide ordering guarantees for other memory operations." from the volatile keyword documentation?
I too interpret those docs as contradictory, and I wonder if, like how Java 5 strengthened volatile semantics, this happened at some point in C# too and the docs weren't updated? Either way the specification, which the docs say is definitive, says it's acquire/release.
https://learn.microsoft.com/en-us/dotnet/csharp/language-ref...
"When a field_declaration includes a volatile modifier, the fields introduced by that declaration are volatile fields. [...] For volatile fields, such reordering optimizations are restricted:
A read of a volatile field is called a volatile read. A volatile read has “acquire semantics”; that is, it is guaranteed to occur prior to any references to memory that occur after it in the instruction sequence.
A write of a volatile field is called a volatile write. A volatile write has “release semantics”; that is, it is guaranteed to happen after any memory references prior to the write instruction in the instruction sequence."
Acquire-release ordering provides ordering guarantees for all memory operations. If an acquire observes a release, the thread is also guaranteed to see all the previous writes done by the other thread - regardless of the atomicity of those writes. (There still can't be any other data races though.)
This volatile keyword appears to only consider that specific memory location whereas the Volatile class seem to implement acquire-release.
Somewhat off topic, but what is a realistic example of where you need atomics with sequential consistency? Like, what useful data structure or pattern requires it? I feel like I've seen every other ordering except that one (and consume) in real world code.
A mutex would be the most trivial example. I don't believe that is possible to implement, in the general case, with only acquire-release.
Sequential consistency mostly becomes relevant when you have more than two threads interacting with both reads and writes. However, if you only have single-consumer (i.e. only one thread reading) or single-producer (i.e. only one thread writing) then the acquire-release semantics ends up becoming sequential since the single-consumer/producer implicitly enforces a sequential ordering. I can potentially see some multi-producer multi-consumer lock-free queues needing sequential atomics.
I think it's rare to see atomics with sequential consistency in practice since you typically either choose (1) a mutex to simplify the code at the expense of locking or (2) acquire-release (or weaker) to minimize the synchronization.
> A mutex would be the most trivial example. I don't believe that is possible to implement, in the general case, with only acquire-release.
Wait, what? So you're saying this spinlock is buggy? What's the bug?
No, sorry. I was just remembering where I've typically seen sequential consistency being used. For instance, Peterson's algorithm was what I had in mind. Spinlock is indeed a good example (although a terrible algorithm which I hope you haven't seen used in practice) of a mutex algorithm which only requires acquire-release.
> If you already use locking, no volatile needed.
Kinda misleading. volatile is for memory mapped I/O and such. volatile means the memory access really happens
I changed the wording of it.
"Associativity law and distribution law doesn't strictly hold because of inaccuracy." should be due to precision loss not inaccuracy (they are different).
updated
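The classic demonstration, sketched in Python (whose floats are IEEE doubles): the same three numbers summed with different grouping give different results, because each intermediate sum is rounded.

```python
import math

left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

same = (left == right)                    # grouping changed the result
close_enough = math.isclose(left, right)  # compare with a tolerance instead
```

Here left is 0.6000000000000001 and right is 0.6.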
The biggest trap of all: building things that no one, including yourself, wants
I would agree on the self part. But otherwise this point of view is distinctly in contrast with the recently republished "work on things that don't scale" article from pg.
Also, as a corollary: earlier today I was thinking about games from the 90s that I spent hundreds of hours playing back then. Bolo and Escape Velocity in particular come to mind. They were "simple" games with immense depth. But after some fruitless searching, all I find are scattered questions and comments from the last few years looking for modern equivalents, and a handful of recommendations for games that are no longer developed or are defunct.
There's clear prior evidence of both success and lack of modern supply. Want to have a minimally minor successful game with established nostalgic audience? Make a new version of EV that is that game at its core. Don't be fancy, just do the thing Ambrosia did. Expand from there.
The simple model is look forward, and also look back. The first itch is small. For yourself. Find some others with a similar itch.
The hard part is pushing your idea into something more people want. That's where pg's 2013 article comes into play. That's the hard part.
But that leaves a huge space between 0 and 10 where someone can find a successful niche.
Hell I've thought about trying to make a modern EV. It's tantalizing. But I've never made a game in my life. Ok I've rewritten Game of Life a lot. Good concept to try new ideas with. But the amount of work for a solo dev trying to recapture that original magic with no background in game dev is daunting.
Does anyone truly understand all the little edge cases with CSS?
I've written tons and tons of CSS, and have done so for a decade. I don't sit and think about the exact interactions, I just know a couple things that might work if I'm getting something unexpected.
I don't really see it possible to commit that to memory, unless I literally start working on an interpreter myself.
I think there can be a different way to think about CSS that can help with that feeling of never understanding it all. Recently I’ve heard people influential in the CSS world describe it as a “suggestion” to the browser. The browser has its own styles, the user might have some custom stylesheet on top of the browser’s version, extensions, etc etc and at some point CSS is really more a long list of “suggestions” about how the site should look.
If you embrace that idea to the fullest, you can create some interesting designs/patterns that can be more resilient. The “downside” is that this way of writing css will likely make the pixel perfect head of the marketing department hate you unless they also write code.
I think it’s also okay to say that some ways of writing css just aren’t relevant anymore. A good parallel in mind is building construction and general carpentry. These days, a quick 2x4 stud wall or insulated concrete forms is fast, cheap, and standardized around the world. However, many craftspeople still exist that will create beautiful joinery for what is ultimately a simple thing, but we can appreciate that art standalone. With CSS, I don’t suspect we will ever need to go back to floats or crazy background images or whatever but it’s nice that those tools are still there for not only the sake of back compat, but also as a way to tinker and “craft” something bespoke for a special project or just because you like it. Education will eventually catch up and grid and flexbox will keep gaining popularity until we decide that it’s too complicated and come up with some new algorithm. That can all be true though and you can bring value as a developer without knowing every single aspect to the public API.
But you need to, you know, actually float something in a text. I think to do it with flexbox/grid you need JS that calculates heights and then manually splits the text into boxes with heights, so essentially you are doing rendering.
Also is there another way to position boxes side-by-side in an inline context without float?
> Unset variables. If DIR is unset, rm -rf $DIR/ becomes rm -rf /. Using set -u can make bash error when encountering unset variable.
sweet mercy :O
Someone call the Inquisition
This was a very famous Steam bug
Instead, say
rm -rf $DIR
That is, skip the trailing slash. Then if $DIR is not set, it becomes an invalid command, because no file names were supplied.
Better to make the requirement explicit, instead of relying on the argument-parsing details of rm or some other command:
# Default message
$ rm -rf "${DIR:?}"
bash: DIR: parameter null or not set
# Custom message
$ rm -rf "${DIR:?It is not set OMG}"
bash: DIR: It is not set OMG
> Golang use UTF-8 for in-memory string.
Nope. It’s just bytes with no encoding.
Corrected.
There is no such thing as "just bytes" when it comes to Unicode. UTF-8 is a way to represent Unicode codepoints in binary.
But I agree that the author's statement is wrong. Go strings are equivalent to byte slices.
Go strings are just bytes. There is no Unicode or encodings.
yaml: https://www.bram.us/2022/01/11/yaml-the-norway-problem/
bash: errexit depends on caller's context, will utterly fail you one day: https://lists.gnu.org/archive/html/bug-bash/2012-12/msg00093...
Added
LF vs CRLF
Which is incredibly painful on Windows systems doing a git clone of shell scripts, since core.autocrlf is often helpful, but not for shell scripts, since it causes the weirdest looking error messages:
MSYS$ cat build.sh
#!/bin/bash
echo "hello, world"
MSYS$ ./build.sh
: not found: 2: build.sh
e.g. https://askubuntu.com/questions/370124/not-found-error-when-...
Polite projects will have .gitattributes specifying that .sh or .bash or bin/* or whatever are to always checkout with eol=lf <https://git-scm.com/docs/gitattributes#_eol>
As best I can tell, .ps1 does tolerate Unix lf but I'd bet good money that .bat and .cmd definitely do not
Thanks for reminding. Added.
> Division is much slower than multiplication (unless using approximation). Dividing many numbers with one number can be optimized by firstly computing reciprocal then multiply by reciprocal.
Is this a general fact or specific to a language?
It's generally true even for CPU instructions: https://electronics.stackexchange.com/questions/280673/why-d...
It's hardware-level - division requires more CPU cycles than multiplication on most processor architectures, making this optimization pattern relevant across virtually all programming languages.
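The shape of the optimization, sketched in Python. An interpreter will not show the speed difference, so this only illustrates the transformation and its one caveat. The values and names are arbitrary.

```python
import math

values = [1.0, 2.5, 7.0, 10.0]
divisor = 3.0

by_division = [v / divisor for v in values]

inv = 1.0 / divisor                      # one division, done up front
by_reciprocal = [v * inv for v in values]  # then only multiplications

# The two are not guaranteed to be bit-identical: the reciprocal is itself
# rounded, so results may differ in the last bit. Compare with a tolerance.
agree = all(math.isclose(a, b) for a, b in zip(by_division, by_reciprocal))
```

That last-bit difference is why compilers generally do not apply this rewrite to floating point code on their own unless asked to relax strict IEEE semantics.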