Originally published on Medium: https://dariodip.medium.com/go-embed-unleashed-1eab8b4b1ba6

One of the reasons I love Go is that the Go compiler produces statically linked binaries by default. The exception is when you use cgo to call C code, in which case it produces a dynamically linked binary.

This makes it easy to ship Go programs, build very lightweight Docker images and take advantage of cross-compilation and plugins.

Starting from Go 1.16, it also gives you the opportunity to ship your own “batteries” using the package embed.


Package embed provides access to files embedded in the running Go program.

Once you import the embed package, you can use the //go:embed directive to initialize a variable of type string, []byte or embed.FS with the contents of files read from the package directory or its subdirectories at compile time. In other words, the //go:embed directive lets you include the contents of arbitrary files and directories in your Go application.

When you add a //go:embed comment to your code, the compiler knows which files to include in the static binary. The basic form of the comment is the following:

//go:embed FILENAMES  
var <VARIABLE NAME> <string, []byte or embed.FS> 

So, your “embedded” variable can be of type string, []byte or embed.FS for a group of files.

The go:embed directive understands Go file globs, so patterns like files/*.txt will also work (but not recursive globbing such as **/*.txt).

For a complete technical explanation, you can read the official documentation.
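To get a feeling for which paths a pattern selects, you can try the rules out with path.Match, whose syntax the embed patterns follow. The sketch below only approximates what the compiler does (the real directive also embeds named directories recursively and skips some hidden files), and the file names in it are hypothetical.

```go
package main

import (
	"fmt"
	"path"
)

// matches reports whether name is selected by pattern under path.Match rules.
// A malformed pattern is treated as "no match" to keep the sketch short.
func matches(pattern, name string) bool {
	ok, err := path.Match(pattern, name)
	return err == nil && ok
}

func main() {
	// Hypothetical file names, relative to the package directory.
	names := []string{"files/a.txt", "files/sub/b.txt", "files/c.md"}

	for _, pattern := range []string{"files/*.txt", "**/*.txt"} {
		for _, name := range names {
			fmt.Printf("%-12s %-16s %v\n", pattern, name, matches(pattern, name))
		}
	}
}
```

Here ** behaves like a single *: it matches exactly one path element, so files/sub/b.txt is selected by neither pattern.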

Let’s see a simple example using //go:embed to embed a file into a variable.
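The file hello_embedded.go could look like the following minimal sketch. It assumes that hello.txt lives in the same directory as the source file; the variable name hello is just an illustrative choice.

```go
package main

import (
	_ "embed" // blank import: required to enable //go:embed for string and []byte variables
	"fmt"
)

// The compiler copies the contents of hello.txt into this variable at build time.
//
//go:embed hello.txt
var hello string

func main() {
	// The file already ends with a newline, so Print is enough.
	fmt.Print(hello)
}
```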

➜  ls         
hello.txt         hello_embedded.go  
➜  cat hello.txt   
Hello from embedded!  
➜  go run hello_embedded.go  
Hello from embedded!

The same result can be achieved using []byte:
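A minimal sketch of the []byte variant, under the same assumption that hello.txt sits next to the source file:

```go
package main

import (
	_ "embed" // blank import: enables //go:embed without using the package directly
	"os"
)

//go:embed hello.txt
var hello []byte // raw bytes of hello.txt, fixed at build time

func main() {
	// Write the bytes to standard output unchanged.
	if _, err := os.Stdout.Write(hello); err != nil {
		os.Exit(1)
	}
}
```

A []byte is handy when the embedded file is not text, or when the next step expects bytes anyway (a decoder, for example).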

We can also embed an entire directory by declaring the variable with type embed.FS and naming the directory in the directive.

Having done that, we can also iterate over the embedded filesystem. This lets us include several files and folders, creating a kind of read-only overlay filesystem.

The method ReadDir returns a list of DirEntry, an interface that declares the following methods:

  • Name(): returns the name of the file (or subdirectory) described by the entry;
  • IsDir(): reports whether the entry describes a directory;
  • Type(): returns the type bits for the entry;
  • Info(): returns the FileInfo for the file or subdirectory described by the entry.
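Here is a minimal sketch that walks an embed.FS and calls the four methods above on every entry. To stay self-contained it embeds the Go source files of its own package (pattern *.go), which is my assumption for the sake of the example; in a real project the directive would rather name a directory. The names sources and describe are illustrative.

```go
package main

import (
	"embed"
	"fmt"
	"log"
)

// Assumption: to stay self-contained, the pattern embeds the Go source files
// of this package. A real project would rather name a directory here.
//
//go:embed *.go
var sources embed.FS

// describe lists the entries of dir inside the embedded filesystem,
// one line per entry, using the four DirEntry methods.
func describe(dir string) ([]string, error) {
	entries, err := sources.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	lines := make([]string, 0, len(entries))
	for _, e := range entries {
		info, err := e.Info()
		if err != nil {
			return nil, err
		}
		lines = append(lines, fmt.Sprintf("name=%s dir=%v type=%v size=%d",
			e.Name(), e.IsDir(), e.Type(), info.Size()))
	}
	return lines, nil
}

func main() {
	// "." is the root of the embedded filesystem.
	lines, err := describe(".")
	if err != nil {
		log.Fatal(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

When an entry is a directory, calling describe again with its path would list its content, which is how a recursive walk can be built.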

A practical example: caching complex structures

A very useful application of //go:embed is caching complex structures.

Imagine we have a program that handles structures that are expensive to create but are kept in memory for the whole life of the program. We can take advantage of //go:embed to ship them precomputed inside the binary, so they are available as soon as the program starts.

In our example, we will use a ComplexStruct structure that stores a value along with its MD5 and SHA256 digests. Here is the code:
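A minimal sketch of such a structure follows. The field names and the hex encoding of the digests are my assumptions; the String format mirrors the v=...,md5=...,sha256=... lines printed later in this article.

```go
package main

import (
	"crypto/md5"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ComplexStruct stores a value together with its MD5 and SHA256 digests.
// The fields are exported so that encoding/gob can serialize them.
type ComplexStruct struct {
	Value  string
	MD5    string // hex-encoded MD5 digest of Value
	SHA256 string // hex-encoded SHA256 digest of Value
}

// NewComplexStruct computes both digests of value.
func NewComplexStruct(value string) ComplexStruct {
	m := md5.Sum([]byte(value))
	s := sha256.Sum256([]byte(value))
	return ComplexStruct{
		Value:  value,
		MD5:    hex.EncodeToString(m[:]),
		SHA256: hex.EncodeToString(s[:]),
	}
}

// String formats the struct as v=<value>,md5=<digest>,sha256=<digest>.
func (c ComplexStruct) String() string {
	return fmt.Sprintf("v=%s,md5=%s,sha256=%s", c.Value, c.MD5, c.SHA256)
}

func main() {
	fmt.Println(NewComplexStruct("ciao"))
}
```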

It is clear that the function NewComplexStruct is expensive, since it has to compute both the MD5 and the SHA256 digest of the value.

Supposing we know in advance that some values are going to be used, we can write an encoder that stores a cached version of them, and embed that cache into our program to take advantage of the precomputed values. Here is the code for the encoder:

Running this function with a precomputed list of ComplexStruct writes a file called complex_structures.gob in the current directory. Examining that file shows that it is a binary file of 396 bytes:
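A minimal sketch of such an encoder, assuming the cache is serialized with encoding/gob (as the .gob file name suggests). The helper name encodeCache and the compact ComplexStruct definition are illustrative.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"crypto/sha256"
	"encoding/gob"
	"encoding/hex"
	"log"
	"os"
)

// ComplexStruct mirrors the structure described above (field names assumed).
type ComplexStruct struct {
	Value, MD5, SHA256 string
}

// NewComplexStruct computes the hex-encoded MD5 and SHA256 digests of value.
func NewComplexStruct(value string) ComplexStruct {
	m, s := md5.Sum([]byte(value)), sha256.Sum256([]byte(value))
	return ComplexStruct{value, hex.EncodeToString(m[:]), hex.EncodeToString(s[:])}
}

// encodeCache precomputes one ComplexStruct per value and returns the
// whole slice serialized with encoding/gob.
func encodeCache(values []string) ([]byte, error) {
	items := make([]ComplexStruct, 0, len(values))
	for _, v := range values {
		items = append(items, NewComplexStruct(v))
	}
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(items); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := encodeCache([]string{"ciao", "dario", "golang", "complex", "structure"})
	if err != nil {
		log.Fatal(err)
	}
	// The file lands in the current directory, ready to be embedded.
	if err := os.WriteFile("complex_structures.gob", data, 0o644); err != nil {
		log.Fatal(err)
	}
}
```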

➜  file complex_structures.gob   
complex_structures.gob: data  
➜  wc -c complex_structures.gob   
      396 complex_structures.gob

Now let’s write our piece of code to decode it into our program:
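A minimal sketch of the decoding side. In the embedded setup the bytes would come from a []byte variable declared right below a go:embed directive that names complex_structures.gob; so that the sketch runs on its own, a hypothetical sampleCache function builds those bytes in memory instead, with placeholder digests.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// ComplexStruct must match the type that was encoded (field names assumed).
type ComplexStruct struct {
	Value, MD5, SHA256 string
}

// decodeCache turns gob-encoded bytes back into a slice of ComplexStruct.
// In the embedded setup, data would be a []byte variable declared right
// below a go:embed directive that names complex_structures.gob.
func decodeCache(data []byte) ([]ComplexStruct, error) {
	var items []ComplexStruct
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&items); err != nil {
		return nil, err
	}
	return items, nil
}

// sampleCache stands in for the embedded file so the sketch runs on its own.
func sampleCache() []byte {
	var buf bytes.Buffer
	items := []ComplexStruct{{Value: "example", MD5: "md5-placeholder", SHA256: "sha256-placeholder"}}
	if err := gob.NewEncoder(&buf).Encode(items); err != nil {
		log.Fatal(err)
	}
	return buf.Bytes()
}

func main() {
	items, err := decodeCache(sampleCache())
	if err != nil {
		log.Fatal(err)
	}
	for _, c := range items {
		fmt.Printf("v=%s,md5=%s,sha256=%s\n", c.Value, c.MD5, c.SHA256)
	}
}
```

Decoding happens once at startup, so no digest has to be recomputed while the program runs.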

Running the program will give us the following result:

➜ go run hello_embedded.go      
v=ciao,md5=6e6bc4e49dd477ebc98ef4046c067b5f,sha256=b133a0c0e9bee3be20163d2ad31d6248db292aa6dcb1ee087a2aa50e0fc75ae2  
v=dario,md5=8a49317e060e23bb86f9225ca581e0a9,sha256=2dbe33913ae5d6b16a801119f5fa1c419620c26d1a456e01490d697eb9b12589  
v=golang,md5=21cc28409729565fc1a4d2dd92db269f,sha256=d754ed9f64ac293b10268157f283ee23256fb32a4f8dedb25c8446ca5bcb0bb3  
v=complex,md5=4b8bafdec076f25030c303049f4e6586,sha256=ea4b35e8f83279eab1e670e389d71201b360f291a0dc30c659ed708ac9c67d76  
v=structure,md5=07414f4e15ca943e6cde032dec85d92f,sha256=520cdb563bf80b193aab6aad62781a9647c75dbf76748117299c7dac0ae63a87  
...

Not that easy to read but efficient 😄.

Let’s see how the size of our Go binary changes when it includes an embedded file:

➜  go build -o embedded .  
➜  wc -c embedded   
 2468752 embedded  
➜  go build -o notembedded .  
➜  wc -c notembedded          
 2464624 notembedded

So our binary with an embedded file of 396 bytes is 4128 bytes bigger. Trying with a bigger file, the result is the following:

➜  wc -c complex_structures_bigger.gob   
    3422 complex_structures_bigger.gob  
➜  wc -c embedded   
 2464592 embedded

And it is 4160 bytes bigger. So don’t panic, embedding won’t bloat your binary.

After this analysis, I tried to understand where and how the embedded file is actually stored. I inspected the binary from the first example, the one with the text file containing “Hello from embedded!”, because a text value is easier to find in a binary.

➜  strings embedded_test | grep -i hello  
 to unallocated span/usr/share/zoneinfo/37252902984619140625EMULTIHOP (Reserved)Egyptian_HieroglyphsIDS_Trinary_OperatorMeroitic_HieroglyphsSIGALRM: alarm clockSIGTERM: terminationSTREAM ioctl timeoutSeek: invalid offsetSeek: invalid whenceTerminal_Punctuationauthentication errorbad defer size classbad system page sizebad use of bucket.bpbad use of bucket.mpchan send (nil chan)close of nil channeldodeltimer0: wrong Pfloating point errorforcegc: phase errorgc_trigger underflowgo of nil func valuegopark: bad g statusinconsistent lockedminvalid write resultmalloc during signalnotetsleep not on g0p mcache not flushedpacer: assist ratio=preempt off reason: reflect.makeFuncStubruntime: double waitruntime: pipe failedruntime: unknown pc semaRoot rotateRighttime: invalid numbertrace: out of memorywirep: already in goworkbuf is not emptywrite of Go pointer  of unexported method previous allocCount=, levelBits[level] = 186264514923095703125931322574615478515625Anatolian_HieroglyphsHello from embedded!

There it is. Let’s understand where it is stored inside the binary file:

➜  xxd embedded_test | grep -B 2 -A 2 "from"  
--  
000c9d10  35 34 37 38 35 31 35 36  32 35 41 6e 61 74 6f 6c  |5478515625Anatol|  
000c9d20  69 61 6e 5f 48 69 65 72  6f 67 6c 79 70 68 73 48  |ian_HieroglyphsH|  
000c9d30  65 6c 6c 6f 20 66 72 6f  6d 20 65 6d 62 65 64 64  |ello from embedd|  
000c9d40  65 64 21 0a 49 6e 73 63  72 69 70 74 69 6f 6e 61  |ed!.Inscriptiona|  
000c9d50  6c 5f 50 61 68 6c 61 76  69 4f 74 68 65 72 5f 47  |l_PahlaviOther_G|  
--  
...

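Finally, we know where our string is. Its first byte, the “H”, is the last byte of the row starting at 000c9d20, so the string begins at offset 0xc9d2f; its trailing newline sits at offset 0xc9d43. In total it takes 21 bytes of our binary.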
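The same lookup can be done in Go. The sketch below is only an illustration: locate is a hypothetical helper, and the data slice is a short stand-in for the binary, built from the neighbouring strings visible in the dump above.

```go
package main

import (
	"bytes"
	"fmt"
)

// locate returns the offsets of the first and last byte of needle in data,
// or (-1, -1) if needle does not occur.
func locate(data, needle []byte) (first, last int) {
	i := bytes.Index(data, needle)
	if i < 0 {
		return -1, -1
	}
	return i, i + len(needle) - 1
}

func main() {
	// Stand-in for the binary: a few surrounding bytes around the embedded text.
	data := []byte("Anatolian_HieroglyphsHello from embedded!\nInscriptional_Pahlavi")
	first, last := locate(data, []byte("Hello from embedded!\n"))
	fmt.Printf("first=%#x last=%#x length=%d\n", first, last, last-first+1)
}
```

With a real binary, data would come from os.ReadFile on the executable, and the offsets would be relative to the start of that file.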

Warning

Never forget to import embed in your code (a blank import, import _ "embed", is enough when you only use string or []byte variables) to avoid compiler errors like:

➜  go build -o notembedded .     
./hello_embedded.go:42:3: //go:embed only allowed in Go files that import "embed"

Have a nice embedding.