tags : Programming Languages, Concurrency in Golang

Compiling, Linking and OS dependencies

Syscalls

How does go make syscalls?

  • Differs by operating system.
  • On Linux, because the kernel has a stable ABI, Go makes syscalls directly, skipping libc. With Linux, the kernel-to-userspace interface (which syscalls use) is stable, but the in-kernel interfaces are not.
  • Enabling or disabling CGO has nothing to do with whether Go syscalls go via libc; the C code used via CGO of course will use libc.
  • Go does create wrappers around some syscalls (TODO: Need to dig into this)
  • Support for different syscalls across operating systems is incremental. E.g. if syscall X is available in OS A and B, Go might have support for X only on A at the moment.

Places where go handles syscalls

Portability

  • Syscalls are not portable by nature; they are specific to the system. We need to add a build tag for the ARCH and OS that the syscall invocation is valid for (see the sketch below).
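
A minimal sketch of constraining a syscall wrapper to one OS/ARCH with a build tag (the fastcopy package and copyFileRange helper are made up; golang.org/x/sys/unix provides the wrapper):

    //go:build linux && amd64

    package fastcopy

    import "golang.org/x/sys/unix"

    // copyFileRange wraps a Linux-only syscall; on other OS/ARCH
    // combinations this file is simply not compiled.
    func copyFileRange(dstFd, srcFd, n int) (int, error) {
        return unix.CopyFileRange(srcFd, nil, dstFd, nil, n, 0)
    }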

Dynamic and Static Linking

Static linking

  • musl

    • Statically compiled Go programs, always, even with cgo, using musl
    • You can also statically link with musl, but note that musl lacks features that are the reason people want non-pure Go in the first place. For example, musl doesn't support arbitrary name resolvers, e.g. no LDAP support; it only supports DNS, just like the pure Go net package.
    • On the other hand, musl does support os/user.
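
A hedged build command for the musl route (assumes the musl-gcc wrapper is installed; the flags follow the common recipe and may need adjusting):

    CGO_ENABLED=1 CC=musl-gcc go build -ldflags '-linkmode external -extldflags "-static"' .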

Dynamic linking

CGO

CGO essentially means calling C APIs in shared libraries that export a C interface. It is a tradeoff.

Using CGO

  • CGO_ENABLED=1
  • Some things are only available as C libraries, re-implementing that in Go would be costly.
  • CGO is also used in some parts of the standard library, e.g. net and os/user. It's not a strict requirement though: you can use these packages without CGO and they'll use a stripped-down version written in Go. But if you want the full thing, you have no other option than to enable CGO.
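
A minimal cgo sketch (calling C.rand is purely illustrative); building it needs CGO_ENABLED=1 and a C toolchain:

    package main

    /*
    #include <stdlib.h>
    */
    import "C"

    import "fmt"

    func main() {
        // C.rand comes from libc, called through cgo.
        fmt.Println("C says:", C.rand())
    }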

Without using CGO

Cross Compilation

See Cross Compilation

Cross compilation in Golang

  • Unless you're using a native cross compiler (e.g. Clang, the Go compiler), to cross-compile a program you need to separately build and install a complete gcc+binutils toolchain for every individual arch that you want to target.
  • With Go this is easy (cross compilation out of the box), and dependencies are ensured to support it.
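
A hedged example: pure-Go cross compilation only needs the target set via env vars (target values are illustrative):

    GOOS=linux GOARCH=arm64 go build ./...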

Cross compilation and CGO

CGO allows us to access C libraries on the system we're building on. It has no idea about the C libraries of other systems, so CGO is disabled by default when cross-compiling. However, if you need to cross-compile Go code with CGO, you need a cross-compiling C compiler for the target machine. It can be done, but it is a bit of a PITA.

  • Using Zig

    CGO_ENABLED=1 GOOS=linux GOARCH=amd64 CC="zig cc -target x86_64-linux" CXX="zig c++ -target x86_64-linux" go build --tags extended

Others

Packages and Modules

  • Read Go Modules Reference
  • module-aware mode is the way 2 go me fren. (GO111MODULE=""/auto)
  • ditch GOPATH

Meta notes

  • A package path and a module path may look similar; the difference lies in the existence of a go.mod file inside the directory, i.e. the repository root need not be the place where the module is defined.

Packages

  • package path / import path
    • Identity of a package
    • module path + subdirectory
    • Eg. golang.org/x/net/html
  • Each directory that has Go source code inside, possibly including the root directory, is a package.
  • Example: x/... matches x as well as x’s subdirectories.

Module

  • The tree (module) with branches (packages) and leaves (*.go files) growing on the branches.
  • Packages sharing the same lifecycle (version number) are bundled into a module.
  • module path
    • Defined in the go.mod file by the module directive
    • Identity of a module
    • Acts as a prefix for package import paths within the module.
    • Eg. golang.org/x/net, golang.org/x/tools/gopls
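
A hedged go.mod sketch (module path, Go version and requirements are made up):

    module github.com/example/webapp

    go 1.22

    require (
        github.com/google/uuid v1.6.0
        golang.org/x/net v0.23.0
    )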

Semantic versioning & versions from VCS

  • A version identifies an immutable snapshot of a module. Each version starts with the letter v + semantic versioning.
  • v0.0.0, v1.12.134, v8.0.5-pre, v9.2.2-beta+meta and v2.0.9+meta are valid versions.

VCS and pseudo versioning

  • We can also get modules from VCS using tags/branches/revisions/commits that don’t follow semantic versioning.
  • In these cases, the go command will replace golang.org/x/net@daa7c041 with v0.0.0-20191109021931-daa7c04131f5.
  • This is called pseudo-version. You usually won’t be typing a pseudo version by hand.

Why a separate directory for major versions

Golden rule: If an old package and a new package have the same import path ⇒ The new package must be backwards compatible with the old package.

  • v0 / pre-release suffix: Unstable, doesn't need to be backwards compatible. No major version suffix (or /vN directory) allowed. So when starting new projects, stay under v0 as long as possible.
  • v1 : Defines the compatibility/stability promise. No major version suffix (or /vN directory) allowed.
  • v2 / v2+ : Since a major version bump by definition means breaking changes, by the golden rule it needs a separate module path (major version suffix).
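
A sketch of the major version suffix convention (paths are hypothetical):

    // go.mod of the v2 line
    module github.com/example/mylib/v2

    // consumers import packages through the suffixed path,
    // so v1 and v2 can even coexist in a single build:
    //   import "github.com/example/mylib/parser"    // v1
    //   import "github.com/example/mylib/v2/parser" // v2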

Building

What happens when the go command tries to load a package?

  • When we try to load a package, we indirectly need to find the module that provides it.
  • The go command first looks in the build list; if the package is not found there, it'll try to fetch the latest version of the module from a module proxy listed in the GOPROXY env var.
  • go mod tidy / go get do this automatically.

Generating build list

  • When we run the go command, a final list of module versions is prepared from the go.mod file of the main module plus the transitively required modules, using minimal version selection. This final list of module+version pairs is used by go {build,list,test,...}. This is the build list.
  • // indirect : added to the go.mod of the main module when a module is not directly imported by the main module. So the go.mod file should list all the dependencies, direct and indirect.

Workspaces

  • New feature in Go 1.18+
  • You’re not meant to commit go.work files. They are meant for local changes only.
  • Has use and replace directives that can be useful for scratch work
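
A hedged go.work sketch (module directories and the replaced path are made up):

    go 1.22

    use (
        ./app
        ./internal-lib
    )

    // replace can point a dependency at a local checkout for scratch work
    replace github.com/example/sdk => ../sdk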

Module Proxy

  • A module proxy is an HTTP server that can respond to GET requests for certain paths
  • We don’t have a central package authority in the vein of npm or crates.io. Go modules have no names, only paths. The package management system uses the package path/module path to learn how to get the package. If it can’t find the package locally, it’ll try getting it from a module proxy.
  • module proxy related vars: GOPRIVATE, GONOPROXY, GOPROXY="https://proxy.golang.org,direct"
  • Different module proxies can have their own conventions (Eg. gopkg.in has some diff conventions)

Accessing private packages is a PITA
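
A hedged example of skipping the public proxy and checksum DB for private module paths (the path pattern is made up):

    # could also be set persistently with: go env -w GOPRIVATE='github.com/mycorp/*'
    GOPRIVATE='github.com/mycorp/*' go get github.com/mycorp/internal-lib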

Project organization and dependencies

Standard Library

Project structure

  • multi-module monorepos are unusual
  • multi-package monorepos are common

Language topics

Pointers

  • There's no pointer arithmetic in Go
  • Go guarantees that the thing being pointed to will remain valid for the lifetime of the pointer.
    func f() *int {
            i := 1
            return &i
    } // Go will arrange memory to store i after f returns.

Methods

  • In general, all methods on a given type should have either value or pointer receivers, but not a mixture of both.
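
A small sketch of keeping receivers consistent (Counter is a made-up type):

    type Counter struct {
        n int
    }

    // Both methods use pointer receivers; mixing value and pointer
    // receivers on the same type is what the guideline advises against.
    func (c *Counter) Inc() { c.n++ }

    func (c *Counter) Value() int { return c.n }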

Context

Signaling and Request cancellation

  • Example: a client times out → your request context is canceled → every I/O operation and long-running process tied to it is canceled too
  • It's not possible for a function that takes a context.Context to cancel it
    • What it can do is newCtx, cancel := context.WithCancel(origCtx) (see the sketch after this list).
    • It can listen for Done on that ctx and do something (usually cancel an ongoing task) based on it.
    • Done is triggered when cancel() on the ctx is called.
    • When a Context is canceled, all Contexts derived from it are canceled.
      • E.g. when cancel() is called on newCtx, newCtx and all Contexts derived from it are canceled. (origCtx is NOT canceled)
      • E.g. when origCtx is canceled, origCtx and all Contexts derived from it are canceled. (both origCtx and newCtx are canceled)
  • context.Background() is never canceled.
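
A minimal sketch of the cancellation behavior described above (the worker is made up):

    package main

    import (
        "context"
        "fmt"
        "time"
    )

    func worker(ctx context.Context) {
        select {
        case <-ctx.Done():
            fmt.Println("worker canceled:", ctx.Err())
        case <-time.After(5 * time.Second):
            fmt.Println("worker finished")
        }
    }

    func main() {
        origCtx := context.Background() // never canceled
        newCtx, cancel := context.WithCancel(origCtx)

        go worker(newCtx)

        time.Sleep(100 * time.Millisecond)
        cancel() // cancels newCtx and everything derived from it; origCtx is unaffected
        time.Sleep(100 * time.Millisecond)
    }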

Storing values

  • The storage of values in a context is a bit controversial; the main use case for "context" is cancellation signals.
  • In the above example, newCtx will have access to the same values as origCtx
  • context.Value() is like Thread Local Storage (see Threads, Concurrency) for goroutines but in a cheap suit.

Other notes on context

  • The idea with Context is that it should flow through your program.

    • Imagine a river or running water.
    • Do pass it from function to function down your call stack, augmented as needed (usually as the first argument).
    • Don't store it somewhere like in a struct.
    • Don't keep it around any longer than strictly needed.
  • When to create context?

    • It's good practice to add a context to anything that might block on I/O, regardless of how long you assume it might take.
    • A Context object is created with each request and expires when the request is over (request in the general sense).
    • context.Background()
      • Use pure context.Background() ONLY to handle your app lifecycle, never in an io/request function.
      • Just passing context.Background() there offers no functionality.
    • context.WithCancel
      • In io/request functions, use something like context.WithCancel(context.Background()) because that’ll allow you to cancel the context.
      • Fresh context
        • Eg. context.WithCancel(context.Background()), context.WithCancel(context.TODO())
      • Derived context
        • Eg. context.WithCancel(someExistingCtx)
    • context.TODO
      • Adding context to a program later can be problematic, so consider using context.TODO if you're unsure what context to use. It's similar to using context.Background(), but it's a clue to your future self that you are not sure about the context yet, rather than that you explicitly want a background context.
  • Separation of context

    • General rule: If the work (I/O) you're about to perform can outlive the lifetime of the outer function, you'd want to create a fresh context instead of deriving from the outer function's context (if there is one).
      • E.g. HTTP request contexts are not derived from the server context, as you still want to process on-going requests while the app shuts down.
    • Think clearly about boundaries and lifetimes; don't mix the app context into async functions, internal consumers, requests, etc.
  • Context package and HTTP package

    • You can get the context from an http.Request with .Context(). It's like this because the http package was written before context was a thing. (See the handler sketch after this list.)
    • For outgoing client requests, the context is canceled when
      • We explicitly cancel the context
    • For incoming server requests, the context is canceled when
      • The client’s connection closes
      • The request is canceled (with HTTP/2)
      • The ServeHTTP method returns
  • Context and Instrumentation

    • Instrumentation libraries generally use the context to hold the current span, to which new child spans can be attached.
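
A hedged sketch of using the request context inside a handler (route and timings are made up):

    package main

    import (
        "fmt"
        "log"
        "net/http"
        "time"
    )

    func slowHandler(w http.ResponseWriter, r *http.Request) {
        // canceled when the client disconnects, the request is canceled
        // (HTTP/2), or ServeHTTP returns
        ctx := r.Context()
        select {
        case <-time.After(2 * time.Second):
            fmt.Fprintln(w, "done")
        case <-ctx.Done():
            log.Println("request canceled:", ctx.Err())
        }
    }

    func main() {
        http.HandleFunc("/slow", slowHandler)
        log.Fatal(http.ListenAndServe(":8080", nil))
    }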

Resources on context

Maps

  • Go maps are hashmaps: O(1) average-case, O(n) worst-case lookup.
  • A map value is a pointer to a runtime.hmap structure.
  • Since it's a pointer, arguably it should be written as *map[int]int instead of map[int]int; the Go team changed this early on because the pointer syntax was confusing anyway.
  • Maps change their structure
    • When you insert or delete entries
    • The map may need to rebalance itself to retain its O(1) guarantee
  • What the compiler does when you use a map
    v := m["key"]     // → runtime.mapaccess1(m, ”key", &v)
    v, ok := m["key"] // → runtime.mapaccess2(m, ”key”, &v, &ok)
    m["key"] = 9001   // → runtime.mapinsert(m, ”key", 9001)
    delete(m, "key")  // → runtime.mapdelete(m, “key”)

Embedding interfaces & structs

Embedding Interface

// combines Reader and Writer interfaces
type ReadWriter interface {
    Reader
    Writer
}

Embedding Struct

  • Embedding directly, no additional bookkeeping

    • When invoked, the receiver of the method is the inner type, not the outer one.
    • i.e. when the Read method of a bufio.ReadWriter is invoked, the receiver is the inner Reader, not the ReadWriter.
     
    // like bufio.ReadWriter, plus an embedded *log.Logger for illustration
    type ReadWriter struct {
        *Reader // *bufio.Reader
        *Writer // *bufio.Writer
        *log.Logger
    }

    // - the type name of the field, ignoring the package
    //   qualifier, serves as the field name
    // - name conflicts are easily resolvable
    var poop ReadWriter
    poop.Reader // refers to the inner Reader
    poop.Logger // refers to the inner Logger
  • Embedding indirectly, additional bookkeeping

    type ReadWriter struct {
        reader *Reader
        writer *Writer
    }
    func (rw *ReadWriter) Read(p []byte) (n int, err error) {
        return rw.reader.Read(p)
    }

Error and panics

  • recover only makes sense inside defer
  • defer can modify named return values
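
A small sketch combining both points (safeDivide is a made-up function):

    package main

    import "fmt"

    // safeDivide turns a panic into an error by having the deferred
    // function (where recover makes sense) modify the named return value.
    func safeDivide(a, b int) (result int, err error) {
        defer func() {
            if r := recover(); r != nil {
                err = fmt.Errorf("recovered: %v", r)
            }
        }()
        return a / b, nil
    }

    func main() {
        _, err := safeDivide(1, 0)
        fmt.Println(err) // recovered: runtime error: integer divide by zero
    }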

Aliases

  • type byte = uint8
  • type rune = int32
  • type any = interface{}

Interfaces

  • Interfaces are just description of what something should resemble, by the methods.
  • The implementation of the interface can be done by a struct, an int, a func, anything. It doesn't matter. You can define a method on a func or on an int just the same way you can define a method on a struct (see the sketch below).
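
A small sketch of non-struct implementations (Greeter, GreeterFunc and Age are made up):

    package main

    import "fmt"

    type Greeter interface {
        Greet() string
    }

    // a method defined on a function type
    type GreeterFunc func() string

    func (f GreeterFunc) Greet() string { return f() }

    // a method defined on an integer type
    type Age int

    func (a Age) Greet() string { return fmt.Sprintf("I am %d years old", a) }

    func main() {
        var g Greeter = GreeterFunc(func() string { return "hello" })
        fmt.Println(g.Greet())

        g = Age(42)
        fmt.Println(g.Greet())
    }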

io stuff

Ben Johnson has a great blog post series covering these in good depth

  • io
    • Abstractions on byte-stream
    • General io utility functions that don’t fit elsewhere.
  • bufio
    • Like io but with a buffer
    • Wraps io.Reader and io.Writer and helps w automatic buffering
  • bytes
    • Represent a byte slice ([]byte) as a byte-stream (the strings package also provides this)
    • general operations on []byte.
    • bytes.Buffer implements io.Writer (useful for tests)
  • io/ioutil (deprecated)
    • Deprecated
    • functionality moved to io or os packages

io

  • Reading

    • Read
      • returns io.EOF as a normal part of usage
      • If you pass an 8-byte slice you could receive anywhere between 0 and 8 bytes back.
    • ReadFull
      • for strict reading of bytes into buffer.
    • MultiReader
      • Concat multiple readers into one
      • Things are read in sequence
      • Eg. Concatenate an in-memory header with some file reader
    • TeeReader
      • Like the tee command: specify a duplicate writer that receives everything the reader reads. Might be useful for debugging etc. (see the sketch after this list)
  • Writing

    • MultiWriter
      • Duplicate writes to multiple writers. Similar to TeeReader, but it happens when writing.
    • WriteString
      • A performance improvement over Write for writers that support it. Falls back to Write.
  • Transferring btwn Reading & Writing

    • Copy : Allocates a 32KB temp buff to copy from src:Reader to dst:Writer
    • CopyBuffer : Provide your own buffer instead on letting Copy create one
    • CopyN : Similar to Copy but you can set a limit on total bytes. Useful e.g. when the reader is continuously growing or when you want a limited read.
    • WriteTo and ReadFrom are optimized methods that are supposed to transfer data without additional allocation. If available, Copy will use these.
  • Files

    Usually you have a continuous stream of bytes, but files are an exception: you can do stuff like Seek w them.

  • Reading and Writing Bytes(uint8) & Runes(int32)

    • ByteReader
    • ByteWriter
    • ByteScanner
    • RuneReader
    • RuneScanner
    • There’s no RuneWriter btw
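
A small sketch exercising MultiReader, TeeReader and Copy together (inputs are made up):

    package main

    import (
        "bytes"
        "io"
        "os"
        "strings"
    )

    func main() {
        header := strings.NewReader("HEADER\n")
        body := strings.NewReader("body contents\n")

        // MultiReader: header is read first, then body.
        r := io.MultiReader(header, body)

        // TeeReader: everything read from r is also written to debug.
        var debug bytes.Buffer
        tee := io.TeeReader(r, &debug)

        // Copy: moves everything from the reader to the writer.
        io.Copy(os.Stdout, tee)

        os.Stdout.WriteString("debug copy: " + debug.String())
    }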

bytes and strings package

Provides a way to treat in-memory []byte and string values as io.Reader and io.Writer

  • bytes package has 2 types
    • bytes.Reader which implements io.Reader (NewReader)
    • bytes.Buffer which implements io.Writer
  • bytes.Buffer is OK for tests etc
    • Consider bufio for proper use cases w buffer-related io.
    • bytes.Buffer is a buffer with two ends
      • can only read from the start of it
      • can only write to the end of it
      • No seeking
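
A small sketch of treating in-memory data as streams (contents are made up):

    package main

    import (
        "bytes"
        "fmt"
        "io"
        "strings"
    )

    func main() {
        // bytes.Buffer: write to the end, read from the start.
        var buf bytes.Buffer
        fmt.Fprintf(&buf, "hello %s\n", "world") // buf is an io.Writer

        // strings.NewReader / bytes.NewReader expose in-memory data as an io.Reader.
        r := strings.NewReader("in-memory input")
        io.Copy(&buf, r)

        fmt.Println(buf.String())
    }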

strings, bytes, runes, characters

  • A normal for loop (indexing) iterates over the bytes of a string, but a for range loop iterates over its runes (example after this list).
  • string : Readonly slice of bytes. NOT slice of characters.
  • "poop" is an interpreted string literal. `poop` is a raw string literal.
    • Interpreted string literals can contain escape sequences, so they're not always valid UTF-8.
    • Raw string literals cannot contain escape sequences, only UTF-8, because Go source code is UTF-8 (almost always).
  • Unicode
    • See Unicode
    • code point U+2318, hex value 2318 (bytes e2 8c 98), represents the symbol ⌘.
  • character
    • May be represented by a number of different sequences of code points
      • i.e different sequences of UTF-8 bytes
    • In Go, a Unicode code point is called a rune (int32).
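
A small sketch of the byte vs rune iteration difference:

    package main

    import "fmt"

    func main() {
        s := "⌘ok" // ⌘ is 3 bytes in UTF-8, so len(s) == 5

        // index loop: iterates over bytes
        for i := 0; i < len(s); i++ {
            fmt.Printf("byte %d: %x\n", i, s[i])
        }

        // range loop: iterates over runes (code points)
        for i, r := range s {
            fmt.Printf("rune at byte %d: %q (U+%04X)\n", i, r, r)
        }
    }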

Encoding

Encoding vs Marshaling

  • Usually these mean the same thing, but Go has specific meanings.
  • x.Encoder & x.Decoder are for working w io.Writer & io.Reader (files eg.)
  • x.Marshaler & x.Unmarshaler are for working w []byte (in memory)

Encoding for Primitives vs Complex objects

  • Primitive stuff

    • bytes
      • Text encoding (base64) / binary encoding
      • encoding package
        • BinaryMarshaler, BinaryUnmarshaler, TextMarshaler, TextUnmarshaler
        • These are not used much because there isn't a single defined way to marshal an object to a binary format. Instead we have Custom Protocols, which are covered by other packages such as encoding/json etc.
      • encoding/hex, encoding/base64 etc.
    • integers
      • encoding/binary, for when we need endianness handling and variable-length encoding
      • For in-memory we have ByteOrder interface
      • For streams we have Read and Write. These also support composite types, but it's better to just use Custom Protocols.
    • string
      • ASCII, UTF8
      • unicode/utf16, encoding/ascii85, golang.org/x/text, fmt, strconv etc.
  • Complex obj stuff

    • Complex objects are where Custom Protocols come in
    • This is mostly about encoding more complex stuff like language-specific data structures etc.
    • Here we can use JSON, CSV, Protocol Buffers, MsgPack, etc.
    • In a sense, databases also encode data for us.
    • Example packages: encoding/json, encoding/xml, encoding/csv, encoding/gob. Other external stuff is always there like Protocol Buffers.

More on encoding/json

  • Encoding process
    • For primitives we have in-built mapping for json
    • For custom objects, it checks whether the type implements json.Marshaler, and if not, encoding.TextMarshaler. E.g. time.Time implements TextMarshaler, which produces an RFC 3339 string. Otherwise it builds an encoder from primitives, which is then cached for future use. (See the sketch after this list.)
  • Decoding
    • 2 parts
      • 1st parse (Scanner)
      • convert values to the appropriate data type, e.g. base-10 numbers to base-2 ints (decodeState). Uses reflect.
    • JSON is LL(1) parsable (see Context Free Grammar (CFG)), so it uses a single-byte lookahead buffer
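
A hedged sketch of hooking into that process (Event and Celsius are made-up types); time.Time goes through TextMarshaler, while Celsius provides its own MarshalJSON:

    package main

    import (
        "encoding/json"
        "fmt"
        "time"
    )

    type Event struct {
        Name string    `json:"name"`
        At   time.Time `json:"at"` // time.Time implements encoding.TextMarshaler → RFC 3339 string
    }

    // Celsius implements json.Marshaler, so encoding/json uses this
    // instead of building an encoder from primitives.
    type Celsius float64

    func (c Celsius) MarshalJSON() ([]byte, error) {
        return []byte(fmt.Sprintf(`"%.1f°C"`, float64(c))), nil
    }

    func main() {
        b, _ := json.Marshal(struct {
            Event Event   `json:"event"`
            Temp  Celsius `json:"temp"`
        }{
            Event: Event{Name: "launch", At: time.Now()},
            Temp:  Celsius(21.5),
        })
        fmt.Println(string(b))
    }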

cgo

              Go Pointer, Pass to Go    Go Pointer, Pass to C
    Go code   YES                       YES, must point to C memory
    C code    NO

  • Go's pointer type can contain C pointers as well as Go pointers
  • Go pointers passed to C may only point to data stored in C

Application architecture

  • Accept interfaces (broader types)
  • Return structs (specific types) — see the sketch below
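
A small sketch of the guideline (the report package and Report type are made up):

    package report

    import (
        "fmt"
        "io"
    )

    type Report struct {
        Lines []string
    }

    // Return a concrete struct: callers get the full type.
    func NewReport(lines ...string) *Report {
        return &Report{Lines: lines}
    }

    // Accept an interface: callers can pass *os.File, *bytes.Buffer, a net.Conn, ...
    func (r *Report) Render(w io.Writer) error {
        for _, line := range r.Lines {
            if _, err := fmt.Fprintln(w, line); err != nil {
                return err
            }
        }
        return nil
    }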

Handler vs HandlerFunc

  • See HandleFunc vs Handle : golang
  • Anything (struct, function, etc.) that implements the http.Handler interface
  • The interface has the ServeHTTP method for handling HTTP requests and generating response
  • Avoid putting business logic in handlers the same way you won’t put business logic into controllers
  • http.HandlerFunc is an example of a handler (of type function) which implements http.Handler
  • When we write a function with the signature of an http.HandlerFunc, we've written a handler function (see the sketch below).
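
A minimal sketch of both forms (routes and handlers are illustrative):

    package main

    import (
        "fmt"
        "net/http"
    )

    // a struct implementing http.Handler
    type healthHandler struct{}

    func (h healthHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintln(w, "ok")
    }

    // a plain function with the http.HandlerFunc signature
    func hello(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintln(w, "hello")
    }

    func main() {
        http.Handle("/health", healthHandler{})        // any http.Handler
        http.Handle("/hello", http.HandlerFunc(hello)) // adapt a function into a Handler
        http.ListenAndServe(":8080", nil)
    }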

Logging and Error Handling

  • In short: log the error once, at the point you handle it.
    • Only log the error where it is handled, otherwise wrap it and return it without logging.
    • At some point, you will log it as either an error if there is nothing you can do about it, or a warning if somehow you can recover from it (not panic recover).
    • However, your log record will contain the trace from the point where the error occurred, so you have all the information you need.
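
A hedged sketch of the wrap-then-log-once pattern (the config path is made up):

    package main

    import (
        "errors"
        "fmt"
        "log"
        "os"
    )

    // lower layer: wrap and return, don't log
    func loadConfig(path string) ([]byte, error) {
        b, err := os.ReadFile(path)
        if err != nil {
            return nil, fmt.Errorf("loading config %q: %w", path, err)
        }
        return b, nil
    }

    // top layer: this is where the error is handled, so log it exactly once
    func main() {
        if _, err := loadConfig("/etc/myapp/config.yaml"); err != nil {
            if errors.Is(err, os.ErrNotExist) {
                log.Printf("warn: %v (falling back to defaults)", err)
                return
            }
            log.Fatalf("error: %v", err)
        }
    }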

Go and sqlite

Go and Databases

Notes on using sqlc with golang

Postgres Gotchas

  • PostgreSQL DEFAULT is for when you don't provide a column value in an INSERT statement. If you provide NULL as the value, it'll be considered a value and DEFAULT won't apply.