From 567175f833b13d7c4f9363504616d17b1a3efa2c Mon Sep 17 00:00:00 2001
From: Andrew Kelley
Date: Mon, 18 Mar 2019 21:40:24 -0400
Subject: [PATCH] add documentation for Memory

closes #1904
---
 doc/langref.html.in | 262 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 255 insertions(+), 7 deletions(-)

diff --git a/doc/langref.html.in b/doc/langref.html.in
index b8ee292462..8e6f22a049 100644
--- a/doc/langref.html.in
+++ b/doc/langref.html.in
@@ -7928,13 +7928,261 @@ pub fn main() void {
       {#header_close#}

       {#header_open|Memory#}
-

TODO: explain no default allocator in zig

-

TODO: show how to use the allocator interface

-

TODO: mention debug allocator

-

TODO: importance of checking for allocation failure

-

TODO: mention overcommit and the OOM Killer

-

TODO: mention recursion

- {#see_also|Pointers#} +

+ The Zig language performs no memory management on behalf of the programmer. This is + why Zig has no runtime, and why Zig code works seamlessly in so many environments, + including real-time software, operating system kernels, embedded devices, and + low latency servers. As a consequence, Zig programmers must always be able to answer + the question: +

+

{#link|Where are the bytes?#}

+

+ Like Zig, the C programming language has manual memory management. However, unlike Zig, + C has a default allocator - malloc, realloc, and free. + When linking against libc, Zig exposes this allocator with {#syntax#}std.heap.c_allocator{#endsyntax#}. + However, by convention, there is no default allocator in Zig. Instead, functions which need to + allocate accept an {#syntax#}*Allocator{#endsyntax#} parameter. Likewise, data structures such as + {#syntax#}std.ArrayList{#endsyntax#} accept an {#syntax#}*Allocator{#endsyntax#} parameter in + their initialization functions: +

+ {#code_begin|test|allocator#}
+const std = @import("std");
+const Allocator = std.mem.Allocator;
+const assert = std.debug.assert;
+
+test "using an allocator" {
+    var buffer: [100]u8 = undefined;
+    const allocator = &std.heap.FixedBufferAllocator.init(&buffer).allocator;
+    const result = try concat(allocator, "foo", "bar");
+    assert(std.mem.eql(u8, "foobar", result));
+}
+
+fn concat(allocator: *Allocator, a: []const u8, b: []const u8) ![]u8 {
+    const result = try allocator.alloc(u8, a.len + b.len);
+    std.mem.copy(u8, result, a);
+    std.mem.copy(u8, result[a.len..], b);
+    return result;
+}
+ {#code_end#}
+

+ In the above example, 100 bytes of stack memory are used to initialize a + {#syntax#}FixedBufferAllocator{#endsyntax#}, which is then passed to a function. + As a convenience, there is a global {#syntax#}FixedBufferAllocator{#endsyntax#} + available for quick tests at {#syntax#}std.debug.global_allocator{#endsyntax#}. + However, it is deprecated and should be avoided in favor of directly using a + {#syntax#}FixedBufferAllocator{#endsyntax#} as in the example above. +

+

+ Currently Zig has no general purpose allocator, but there is + one under active development. + Once it is merged into the Zig standard library it will become available to import + with {#syntax#}std.heap.default_allocator{#endsyntax#}. However, it will still be recommended to + follow the {#link|Choosing an Allocator#} guide. +

+ + {#header_open|Choosing an Allocator#} +

Which allocator to use depends on a number of factors. Here is a flow chart to help you decide: +

+
    +
  1. + Are you making a library? In this case, it is best to accept an {#syntax#}*Allocator{#endsyntax#} + as a parameter and allow your library's users to decide what allocator to use. +
  2. Are you linking libc? In this case, {#syntax#}std.heap.c_allocator{#endsyntax#} is likely + the right choice, at least for your main allocator.
  3. + Is the maximum number of bytes that you will need bounded by a number known at + {#link|comptime#}? In this case, use {#syntax#}std.heap.FixedBufferAllocator{#endsyntax#} or + {#syntax#}std.heap.ThreadSafeFixedBufferAllocator{#endsyntax#} depending on whether you need + thread-safety or not. +
  4. + Is your program a command line application which runs from start to end without any fundamental + cyclical pattern (such as a video game main loop, or a web server request handler), + such that it would make sense to free everything at once at the end? + In this case, it is recommended to follow this pattern:
+ {#code_begin|exe|cli_allocation#}
+const std = @import("std");
+
+pub fn main() !void {
+    var direct_allocator = std.heap.DirectAllocator.init();
+    defer direct_allocator.deinit();
+
+    var arena = std.heap.ArenaAllocator.init(&direct_allocator.allocator);
+    defer arena.deinit();
+
+    const allocator = &arena.allocator;
+
+    const ptr = try allocator.create(i32);
+    std.debug.warn("ptr={*}\n", ptr);
+}
+ {#code_end#}
+ When using this kind of allocator, there is no need to free anything manually. Everything + gets freed at once with the call to {#syntax#}arena.deinit(){#endsyntax#}. +
  5. + Are the allocations part of a cyclical pattern such as a video game main loop, or a web + server request handler? If the allocations can all be freed at once, at the end of the cycle, + for example once the video game frame has been fully rendered, or the web server request has + been served, then {#syntax#}std.heap.ArenaAllocator{#endsyntax#} is a great candidate. As + demonstrated in the previous bullet point, this allows you to free entire arenas at once. + Note also that if an upper bound of memory can be established, then + {#syntax#}std.heap.FixedBufferAllocator{#endsyntax#} can be used as a further optimization. +
  6. + Are you writing a test, and you want to make sure {#syntax#}error.OutOfMemory{#endsyntax#} + is handled correctly? In this case, use {#syntax#}std.debug.FailingAllocator{#endsyntax#}. +
  7. + Finally, if none of the above apply, you need a general purpose allocator. Zig does not + yet have a general purpose allocator in the standard library, + but one is being actively developed. + You can also consider {#link|Implementing an Allocator#}. +
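The cyclical case from the list above can be sketched as follows. This is only an illustration, assuming the standard library API as of this patch. `handleRequest` and the loop of three "requests" are hypothetical stand-ins for a real frame or request handler.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

// Hypothetical per-cycle work. Everything it allocates is scratch data that
// is not needed once the cycle ends, so it never frees anything itself.
fn handleRequest(allocator: *Allocator, request_id: usize) !usize {
    const scratch = try allocator.alloc(u8, 16 + request_id);
    return scratch.len;
}

test "one arena per cycle" {
    var direct_allocator = std.heap.DirectAllocator.init();
    defer direct_allocator.deinit();

    var request_id: usize = 0;
    while (request_id < 3) : (request_id += 1) {
        // A fresh arena for each cycle, backed by the long-lived allocator.
        var arena = std.heap.ArenaAllocator.init(&direct_allocator.allocator);
        // Runs at the end of every loop iteration and frees the whole arena.
        defer arena.deinit();

        const len = try handleRequest(&arena.allocator, request_id);
        assert(len == 16 + request_id);
    }
}
```

Because the arena is created and destroyed inside the loop body, memory use does not grow from one cycle to the next.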
+ {#header_close#} + + {#header_open|Where are the bytes?#} +

String literals such as {#syntax#}"foo"{#endsyntax#} are in the global constant data section. + This is why it is an error to pass a string literal to a mutable slice, like this: +

+ {#code_begin|test_err|expected type '[]u8'#}
+fn foo(s: []u8) void {}
+
+test "string literal to mutable slice" {
+    foo("hello");
+}
+ {#code_end#}
+

However, if you make the slice constant, then it works:

+ {#code_begin|test|strlit#}
+fn foo(s: []const u8) void {}
+
+test "string literal to constant slice" {
+    foo("hello");
+}
+ {#code_end#}
+

+ Just like string literals, {#syntax#}const{#endsyntax#} declarations, when the value is known at {#link|comptime#}, + are stored in the global constant data section. Also {#link|Compile Time Variables#} are stored + in the global constant data section. +

+

+ {#syntax#}var{#endsyntax#} declarations inside functions are stored in the function's stack frame. Once a function returns, + any {#link|Pointers#} to variables in the function's stack frame become invalid references, and + dereferencing them becomes unchecked {#link|Undefined Behavior#}. +

+

+ {#syntax#}var{#endsyntax#} declarations at the top level or in {#link|struct#} declarations are stored in the global + data section. +

+

+ The location of memory allocated with {#syntax#}allocator.alloc{#endsyntax#} or + {#syntax#}allocator.create{#endsyntax#} is determined by the allocator's implementation. +
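The storage locations described in this section can be put side by side in one small test. This is a sketch, assuming the standard library API as of this patch. The names `greeting`, `counter`, and `local` are made up for illustration.

```zig
const std = @import("std");
const assert = std.debug.assert;

// Value known at comptime: stored in the global constant data section.
const greeting = "hello";

// Top level var: stored in the global data section.
var counter: u32 = 0;

test "where are the bytes" {
    // var inside a function: stored in this function's stack frame.
    var local: u32 = 1234;

    // Allocated memory lives wherever the allocator puts it. A
    // FixedBufferAllocator hands out pieces of the buffer it was given,
    // which in this case is itself on the stack.
    var buffer: [64]u8 = undefined;
    const allocator = &std.heap.FixedBufferAllocator.init(&buffer).allocator;
    const ptr = try allocator.create(u32);
    ptr.* = local;

    counter += 1;
    assert(greeting.len == 5);
    assert(counter == 1);
    assert(ptr.* == 1234);

    // The allocation's address falls inside `buffer`.
    const addr = @ptrToInt(ptr);
    const base = @ptrToInt(&buffer);
    assert(addr >= base and addr < base + buffer.len);
}
```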

+

TODO: thread local variables

+ {#header_close#} + + {#header_open|Implementing an Allocator#} +

Zig programmers can implement their own allocators by fulfilling the Allocator interface. + In order to do this, one must carefully read the documentation comments in std/mem.zig and + then supply a {#syntax#}reallocFn{#endsyntax#} and a {#syntax#}shrinkFn{#endsyntax#}. +
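As a rough sketch of what supplying the two functions can look like, the hypothetical `CountingAllocator` below forwards every request to a backing allocator and counts calls to `reallocFn`. The parameter lists are an assumption based on the interface in std/mem.zig at the time of this patch. The documentation comments there remain the authority on what each function must guarantee.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

// Hypothetical example allocator, not part of the standard library.
const CountingAllocator = struct {
    allocator: Allocator,
    backing: *Allocator,
    realloc_calls: usize,

    fn init(backing: *Allocator) CountingAllocator {
        return CountingAllocator{
            .allocator = Allocator{
                .reallocFn = realloc,
                .shrinkFn = shrink,
            },
            .backing = backing,
            .realloc_calls = 0,
        };
    }

    // Assumed signature: see the doc comments in std/mem.zig.
    fn realloc(allocator: *Allocator, old_mem: []u8, old_align: u29, new_size: usize, new_align: u29) Allocator.Error![]u8 {
        // Recover the enclosing struct from the pointer to its interface field.
        const self = @fieldParentPtr(CountingAllocator, "allocator", allocator);
        self.realloc_calls += 1;
        return self.backing.reallocFn(self.backing, old_mem, old_align, new_size, new_align);
    }

    // Shrinking and freeing must not fail, so there is no error in the return type.
    fn shrink(allocator: *Allocator, old_mem: []u8, old_align: u29, new_size: usize, new_align: u29) []u8 {
        const self = @fieldParentPtr(CountingAllocator, "allocator", allocator);
        return self.backing.shrinkFn(self.backing, old_mem, old_align, new_size, new_align);
    }
};

test "counting allocator" {
    var buffer: [64]u8 = undefined;
    var fixed = std.heap.FixedBufferAllocator.init(&buffer);
    var counting = CountingAllocator.init(&fixed.allocator);
    const allocator = &counting.allocator;

    const bytes = try allocator.alloc(u8, 8);
    allocator.free(bytes);
    assert(bytes.len == 8);
    assert(counting.realloc_calls >= 1);
}
```

Code that uses the allocator only sees a `*Allocator`, so the wrapper can be swapped in without changing any caller.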

+

+ There are many example allocators to look at for inspiration. Look at std/heap.zig and + at this + work-in-progress general purpose allocator. + TODO: once #21 is done, link to the docs + here. +

+ {#header_close#} + + {#header_open|Heap Allocation Failure#} +

+ Many programming languages choose to handle the possibility of heap allocation failure by + unconditionally crashing. By convention, Zig programmers do not consider this to be a + satisfactory solution. Instead, {#syntax#}error.OutOfMemory{#endsyntax#} represents + heap allocation failure, and Zig libraries return this error code whenever heap allocation + failure prevented an operation from completing successfully. +
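A small sketch of this convention, assuming the standard library API as of this patch: the hypothetical function `duplicate` passes allocation failure up to its caller, and the test forces a failure by using a deliberately tiny `FixedBufferAllocator`.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

// Hypothetical library function. It reports allocation failure to its caller
// with `try` instead of crashing.
fn duplicate(allocator: *Allocator, bytes: []const u8) ![]u8 {
    const copy = try allocator.alloc(u8, bytes.len);
    std.mem.copy(u8, copy, bytes);
    return copy;
}

test "allocation failure is an ordinary error" {
    // Only 4 bytes are available to this allocator.
    var buffer: [4]u8 = undefined;
    const allocator = &std.heap.FixedBufferAllocator.init(&buffer).allocator;

    // A request that does not fit fails with error.OutOfMemory.
    if (duplicate(allocator, "too long to fit")) |_| {
        unreachable;
    } else |err| {
        assert(err == error.OutOfMemory);
    }

    // The program is still running and the allocator is still usable.
    const copy = try duplicate(allocator, "abcd");
    assert(std.mem.eql(u8, copy, "abcd"));
}
```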

+

+ Some have argued that because some operating systems such as Linux have memory overcommit enabled by + default, it is pointless to handle heap allocation failure. There are many problems with this reasoning: +

+ + {#header_close#} + + {#header_open|Recursion#} +

+ Recursion is a fundamental tool in modeling software. However, it has an often-overlooked problem: + unbounded memory allocation. +

+

+ Recursion is an area of active experimentation in Zig and so the documentation here is not final. + You can read a + summary of recursion status in the 0.3.0 release notes. +

+

+ The short summary is that currently recursion works normally as you would expect. Although Zig code + is not yet protected from stack overflow, it is planned that a future version of Zig will provide + such protection, with some degree of cooperation from Zig code required. +

+ {#header_close#} + + {#header_open|Lifetime and Ownership#} +

+ It is the Zig programmer's responsibility to ensure that a {#link|pointer|Pointers#} is not + accessed when the memory pointed to is no longer available. Note that a {#link|slice|Slices#} + is a form of pointer, in that it references other memory. +

+

+ In order to prevent bugs, there are some helpful conventions to follow when dealing with pointers. + In general, when a function returns a pointer, the documentation for the function should explain + who "owns" the pointer. This concept helps the programmer decide when it is appropriate, if ever, + to free the pointer. +

+

+ For example, the function's documentation may say "caller owns the returned memory", in which case + the code that calls the function must have a plan for when to free that memory. In this situation, + the function will probably accept an {#syntax#}*Allocator{#endsyntax#} parameter. +
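The "caller owns the returned memory" convention might look like this in practice. This is a sketch, assuming the standard library API as of this patch. It reuses the `concat` function from the earlier example and adds a documentation comment stating the ownership rule.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

/// Returns a newly allocated slice containing `a` followed by `b`.
/// Caller owns the returned memory and must free it with `allocator`.
fn concat(allocator: *Allocator, a: []const u8, b: []const u8) ![]u8 {
    const result = try allocator.alloc(u8, a.len + b.len);
    std.mem.copy(u8, result, a);
    std.mem.copy(u8, result[a.len..], b);
    return result;
}

test "caller frees what it owns" {
    var direct_allocator = std.heap.DirectAllocator.init();
    defer direct_allocator.deinit();
    const allocator = &direct_allocator.allocator;

    const result = try concat(allocator, "foo", "bar");
    // The caller's plan for the memory: free it when this scope exits.
    defer allocator.free(result);

    assert(std.mem.eql(u8, "foobar", result));
}
```

Placing the `defer` immediately after the allocation keeps the plan for freeing the memory next to the call that created the obligation.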

+

+ Sometimes the lifetime of a pointer may be more complicated. For example, when using + {#syntax#}std.ArrayList(T).toSlice(){#endsyntax#}, the returned slice has a lifetime that remains + valid until the next time the list is resized, such as by appending new elements. +
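The `toSlice()` case can be illustrated as follows. This is a sketch, assuming the `std.ArrayList` API as of this patch. It shows the safe pattern of fetching the slice again after any operation that may resize the list.

```zig
const std = @import("std");
const assert = std.debug.assert;

test "fetch the slice again after the list may have been resized" {
    var direct_allocator = std.heap.DirectAllocator.init();
    defer direct_allocator.deinit();

    var list = std.ArrayList(i32).init(&direct_allocator.allocator);
    defer list.deinit();

    try list.append(1);
    const before = list.toSlice();
    assert(before.len == 1);
    assert(before[0] == 1);

    // Appending may resize the list and move its elements. From here on,
    // `before` must be treated as invalid, even if it happens to still work.
    try list.append(2);

    // Ask the list for a fresh slice instead of reusing `before`.
    const after = list.toSlice();
    assert(after.len == 2);
    assert(after[1] == 2);
}
```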

+

+ The API documentation for functions and data structures should take great care to explain + the ownership and lifetime semantics of pointers. Ownership determines whose responsibility it + is to free the memory referenced by the pointer, and lifetime determines the point at which + the memory becomes inaccessible (lest {#link|Undefined Behavior#} occur). +

+ {#header_close#} {#header_close#} {#header_open|Compile Variables#}