A minor headache with dynamically typed programming languages


Let it be said, based on the observations of this novice programmer, that the ultimate pursuit of mankind seems to be the achievement of maximum convenience, which is why I believe dynamically typed languages have taken over as much as they have. On a tight deadline, or when one simply cannot be bothered, a garbage collector that spares you memory-leak woes and the freedom from declaring exact types allow for a very fast initial production cycle. On that last point, even many modern statically typed languages now infer types in simpler cases.

As I approach the end of my bachelor's degree, several humbling experiences, such as facing the unreadability of my past programming assignments, have led me to conclude that type declarations are generally worth the effort when long-term maintenance is required. Returning to a moderately sized Python script without annotations has often meant not knowing which types I'm expected to pass around, not to mention dealing with edge cases where, say, I have forgotten the implicit conversion rules or accidentally reassigned a variable.

Now, besides being a novice programmer, I'm a student biologist, and Julia has really piqued my interest, to the point that I'd like to include it in my work should I have the chance. I feel it represents an ideal middle ground, with the right mix of scripting and more "workhorse" language features:

  • expressive (Ruby-like block expressions, Lisp-like macros)

  • fast for a dynamic language (Julia is JIT compiled with built-in vectorization)

  • reproducible (package manifest similar to Cargo.toml/go.mod, artifacts)

  • the best REPL of any language aside from maybe Vimscript which has a whole editor to itself :)

Perhaps the most salient fact in this context is that Julia is fundamentally a dynamically typed language, yet it includes optional type annotations combined with multiple dispatch: think of a kind of Python where mypy has more teeth. Beyond type checking, type annotations let one provide distinct methods of the same function, each called only when the arguments match the annotated types. A more well-rounded explanation is available in the Julia manual; here, however, is a small code example inspired by the JuliaCon presentation "The Unreasonable Effectiveness of Multiple Dispatch".

Note: If you want to run any of the following code yourself, you can quickly spin up the REPL in a container via <docker or podman> run -it --rm julia:latest .

Note: As of the writing of this article, Hashnode does not support Julia syntax highlighting. Thus, these code blocks are highlighted with the "Haskell" option.

julia> struct Dog name::String end

julia> struct Cat name::String end

julia> sound() = @error "What is the animal?"
sound (generic function with 1 method)

# We can also write sound(_::Dog), we only care about the type for dispatching
julia> sound(::Dog) = "Woof!"
sound (generic function with 2 methods)

julia> sound(::Cat) = "Meuw!"
sound (generic function with 3 methods)

julia> sound()
┌ Error: What is the animal?
└ @ Main REPL[4]:1

# Also, there's piping (|>) in Julia! The dot means applying (broadcasting) the sound() function over all elements of the tuple
julia> (Dog("Toto"), Cat("Bungle")) .|> sound
("Woof!", "Meuw!")
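The example above only dispatches on a single argument, so here is a minimal sketch of the "multiple" part, where the method is chosen from the types of all arguments at once. The function name meet and its return strings are made up for illustration; Dog and Cat simply mirror the structs above.

```julia
# Hypothetical example: `meet` is not from any library, it only illustrates dispatch.
struct Dog
    name::String
end

struct Cat
    name::String
end

# Fallback method for any pair of arguments that have a `name` field
meet(a, b) = "$(a.name) ignores $(b.name)"

# More specific methods; Julia picks the most specific match across *all* arguments
meet(a::Dog, b::Cat) = "$(a.name) chases $(b.name)"
meet(a::Cat, b::Dog) = "$(a.name) hisses at $(b.name)"

toto, bungle = Dog("Toto"), Cat("Bungle")

println(meet(toto, bungle))  # Toto chases Bungle
println(meet(bungle, toto))  # Bungle hisses at Toto
println(meet(toto, toto))    # Toto ignores Toto (no Dog/Dog method, so the fallback runs)
```

Swapping the argument order selects a different method, which is something single dispatch on a receiver object cannot express directly.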

Even though annotations are optional in most cases, I find myself reflexively adding type annotations to function parameters, as it's simply good practice: when I return to my code, I can look at my function f(x::Int, y::String) and get a much better idea of what it does, and I can avoid most of the problems associated with overly generic arguments by simply not implementing a method with untyped parameters (f(x, y)).

Emphasis on the word "most": because the type system is still dynamic, type issues can arise if one is not careful. Let's look at a particular headache that sprang up following some late-night experimentation.

The standard type hierarchy designates Bool as a subtype of Integer, with true and false being numerically equal to 1 and 0, respectively:

julia> Bool <: Integer
true

julia> true == 1 && true == 1.0
true

julia> false == 0 && false == 0.0
true

This is not an uncommon design, mostly for practical reasons. C did not have a proper boolean type at all until C99, and that legacy was one of the major rationales behind having bool inherit from int in Python. Boolean algebra can also be argued to be more straightforward when one does not have to deal with two unconnected types, which is in line with one of Julia's selling points: code can stay very close to the actual mathematics (f(x) = √2x - π^2 is perfectly valid code, for example, and almost any Unicode math symbol can be used).
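To make the practical side of this a bit more concrete, here is a short sketch of what treating Bool as a number buys you. The vector passed is hypothetical data; everything else is plain Base Julia.

```julia
# Bool sits at the bottom of the integer branch of the numeric tower
@assert Bool <: Integer <: Real <: Number

# Arithmetic on Bool values promotes to Int
@assert true + true == 2
@assert typeof(true + true) == Int

# Handy consequence: summing a Bool vector counts the `true` entries
passed = [true, false, true, true]  # hypothetical data
@assert sum(passed) == 3

# `false` acts as a "strong zero" in multiplication, even against NaN
@assert false * NaN == 0.0

# `==` compares by value, whereas `===` also distinguishes the types
@assert true == 1
@assert !(true === 1)
```

The last two lines are worth keeping in mind for later: a value-based comparison cannot tell true apart from 1.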

Let's say I want to broadcast cosine over each numeric element of a vector, which I will wrap in a new function so I can reuse the name in my package.

julia> cos_broadcast(v::AbstractVector{T}) where T = cos.(v)
cos_broadcast (generic function with 1 method)

julia> cos_broadcast([1, 2, 3])
3-element Vector{Float64}:
  0.5403023058681398
 -0.4161468365471424
 -0.9899924966004454

T is generic over any type, so one might start by restricting it to subtypes of Number. The obvious cases, like a vector of strings, are rejected as expected. However, since true and false are represented numerically and Bool is itself a subtype of Number, I can accidentally (or maliciously >:3) pass in a vector of boolean values, and my function, none the wiser, will spit out the cosine of 0 and/or 1.


# Shorthand for ...{T}) where {T <: Number}
julia> cos_broadcast(v::AbstractVector{<: Number}) = cos.(v)
cos_broadcast (generic function with 2 methods)

julia> cos_broadcast(["1", "0"])
ERROR: MethodError: no method matching cos(::String)
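# Omitting candidate functions and stacktrace for the sake of brevity
# Note: the unconstrained method defined earlier still exists, so the call dispatches to it and fails inside cos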

julia> cos_broadcast([true, false])
2-element Vector{Float64}:
 0.5403023058681398
 1.0

Ideally, the developer would either prohibit mixing booleans and numbers or warn the user of such unintended behaviour. Proceeding with the former, my initial reaction was to create a type union of all numeric types except Bool, following the Julia number type hierarchy diagram below, courtesy of Wikimedia.

julia> const Num = Union{Complex, AbstractFloat, BigInt, Signed, Irrational, Rational, Unsigned}
Union{AbstractFloat, Signed, Unsigned, Complex, Irrational, Rational}

julia> cos_broadcast(v::AbstractVector{<: Num}) = cos.(v)
cos_broadcast (generic function with 1 method)

julia> cos_broadcast([true, false])
ERROR: MethodError: no method matching cos_broadcast(::Vector{Bool})
Closest candidates are:
  cos_broadcast(::AbstractVector{<:Union{AbstractFloat, Signed, Unsigned, Complex, Irrational, Rational}}) at REPL[2]:1
# ...
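Why one signature accepts a Bool vector and the other does not comes down to subtyping, which can be checked directly with the <: operator. Below is a small sketch; NonBoolNumber is a name I made up so it does not clash with Num above, but it is the same union (BigInt is already covered by Signed).

```julia
# Hypothetical alias, equivalent to the Num union in the article
const NonBoolNumber = Union{Complex, AbstractFloat, Signed, Irrational, Rational, Unsigned}

# Bool is a Number, but it is neither Signed nor Unsigned, so the union leaves it out
@assert Bool <: Number
@assert !(Bool <: NonBoolNumber)

# Dispatch on cos_broadcast boils down to these subtype checks
@assert Vector{Bool} <: AbstractVector{<:Number}
@assert !(Vector{Bool} <: AbstractVector{<:NonBoolNumber})
@assert Vector{Int} <: AbstractVector{<:NonBoolNumber}
@assert Vector{ComplexF64} <: AbstractVector{<:NonBoolNumber}

# Parametric types are invariant, hence the need for `<:` inside the braces
@assert !(Vector{Bool} <: AbstractVector{Number})

# Caveat: the imaginary unit is a Complex{Bool}, which the union still lets through
@assert typeof(im) == Complex{Bool}
@assert Complex{Bool} <: NonBoolNumber
```

The last caveat suggests that even a hand-written union like this is not airtight if the goal is to keep every boolean-flavoured value out.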

With this dispatch constraint, we can throw out a vector of only boolean values without worry. However, types are still resolved at runtime, and the array literal is constructed before cos_broadcast ever sees it: when given elements of mixed types, Julia will attempt to promote them to a single common type, so long as promotion rules are defined for them. If the elements are too dissimilar from one another, the fallback is to build a vector of Any, the supertype of every other type in Julia.

# Booleans will be promoted to their integer representation
julia> cos_broadcast([true, 2, false, 3])
4-element Vector{Float64}:
  0.5403023058681398
 -0.4161468365471424
  1.0
 -0.9899924966004454

# A more extreme example
julia> cos_broadcast([true, π, false, 1im])
4-element Vector{ComplexF64}:
 0.5403023058681398 - 0.0im
               -1.0 - 0.0im
                1.0 - 0.0im
 1.5430806348152437 - 0.0im

# Admittedly, cosine justifiably does not operate on types other than numbers.
# Also, if we start from scratch with only the T <: Number method implemented,
# cos_broadcast only accepts vectors whose elements are subtypes of Number,
# so a Vector{Any} would never be accepted.
# Let's cheat a little by assuming the developer (sleep-deprived moi) left the element
# type unconstrained and somehow typed enough coherent words to allow cosine to
# operate on a String
import Base: cos
cos(::String) = "Over 9000!"
cos_broadcast(v::AbstractVector{<: Any}) = cos.(v)
cos_broadcast([1, true, "wow", 1im])
4-element Vector{Any}:
                    0.5403023058681398
                    0.5403023058681398
                     "Over 9000!"
 1.5430806348152437 - 0.0im
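The promotions in the examples above happen when the array literal is built, not inside cos_broadcast. Here is a minimal sketch that inspects this directly with promote_type and eltype, both from Base; the variable names are arbitrary.

```julia
# The common type Julia settles on for a pair of element types
@assert promote_type(Bool, Int) == Int
@assert promote_type(Bool, Float64) == Float64
@assert promote_type(Int, String) == Any  # no promotion rule, falls back to Any

# Mixed Bool/Int literal: the Bools are converted before any function sees them
v = [true, 2, false, 3]
@assert eltype(v) == Int
@assert v == [1, 2, 0, 3]
@assert !any(x -> x isa Bool, v)  # no trace of the Bool values is left

# The "more extreme" literal from above ends up as complex floats
@assert eltype([true, π, false, 1im]) == ComplexF64

# With a String in the mix the vector becomes Vector{Any} and elements keep their types
w = [1, true, "wow"]
@assert eltype(w) == Any
@assert w[2] isa Bool

# A typed literal opts out of promotion explicitly
u = Any[true, 2]
@assert u[1] isa Bool
```

In other words, once a Bool has been promoted during construction, no check inside the function can recover the fact that it was ever there.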

To repeat, this is a consequence of Julia being dynamically typed: types are resolved at runtime. It used to be possible to perform some truly dark wizardry to avoid collapsing the vector to Vector{Any} by playing with the language's internal type promotion rules. However, I hopefully do not need to remind the reader how altering fundamental behaviour in this fashion could lead to breakage in unexpected ways (Rust prohibits implementing external traits on external types for good reason).

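To try and save face, one could outright reject any vector containing boolean values, but those pesky type promotions are still at play: by the time the check runs, a literal like [1, true] has already been promoted to a vector of integers, and since in compares elements with ==, any 1 or 0 in the vector counts as true or false.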

julia> function cos_broadcast(v::AbstractVector{<: Number})
           if true in v || false in v
               return error("Don't do that")
           end
           cos.(v)
       end
cos_broadcast (generic function with 2 methods)

julia> cos_broadcast([1, true])
ERROR: Don't do that
# ...
# OK

julia> cos_broadcast([1])
ERROR: Don't do that
# ...
# Oh noes

# Screw it, warn the user of the consequences
julia> function cos_broadcast(v::AbstractVector{<: Number})
           if true in v || false in v
               @warn "You may have added a boolean value to the vector, if you did so willingly, please consider an alternative career"
           end
           cos.(v)
       end
cos_broadcast (generic function with 1 method)

julia> cos_broadcast([1])
┌ Warning: You may have added a boolean value to the vector, if you did so willingly, please consider an alternative career
# ...
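For what it's worth, a less trigger-happy variant is to let dispatch do the rejecting with a dedicated method for Bool vectors, rather than comparing values. This is only a sketch under my own assumptions: strict_cos is a hypothetical name chosen to avoid clashing with cos_broadcast, and it cannot catch Bools that were already promoted away when the vector was built.

```julia
# Why the value-based check misfires: `in` compares with `==`
@assert true in [1]
@assert false in [0.0]

# Hypothetical alternative: a more specific method for Bool element types
strict_cos(v::AbstractVector{<:Number}) = cos.(v)
strict_cos(::AbstractVector{Bool}) =
    throw(ArgumentError("Bool vectors are not accepted"))

# Plain integers are no longer mistaken for booleans
@assert strict_cos([1]) == [cos(1)]

# A genuine Bool vector hits the more specific method and is rejected
rejected = try
    strict_cos([true, false])
    false
catch err
    err isa ArgumentError
end
@assert rejected

# Limitation: [1, true] is already a Vector{Int} here, so it slips through
@assert strict_cos([1, true]) == [cos(1), cos(1)]
```

So dispatch fixes the false positive on [1], while the mixed literal remains undetectable for the promotion reasons discussed earlier.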

Julia is still much stricter about types than Python or JavaScript, and I feel that type checking and multiple dispatch mean it hits a sweet spot for type safety between statically typed and dynamically typed languages, pulling advantages from both ends. This has merely been a random musing, and the potential start of a fruitful blogging adventure, that highlights how dreadfully distinct software development can be at times.