Update Docs

Adrien Bouvais 2024-11-26 21:31:53 +01:00
parent e41f53b692
commit 3c01150e33
7 changed files with 180 additions and 134 deletions

@ -1,37 +1,25 @@
# Data types
There are 8 data types:
ZipponDB has a small set of types. This is on purpose, to keep the database simple and fast. More types may be added in the future.
- `int`: 32 bit integer
- `float`: 64-bit float. It must contain a dot: `1.` is a float, `1` is an integer
- `bool`: Boolean, can be `true` or `false`
- `string`: Character array between `''`
- `UUID`: Id in the UUID format, used for relationships, etc. All structs have an id member
- `date`: A date in yyyy/mm/dd
- `time`: A time in hh:mm:ss.mmmm
- `datetime`: A date time in yyyy/mm/dd-hh:mm:ss.mmmm
## Primary Data Types
Any data type can be made into an array by putting `[]` in front of it. So `[]int` is an array of integers.
ZipponDB supports 8 primary data types:
## Date and time
| Type | Description | Example |
|------|-------------|---------|
| int | 32-bit integer | 42 |
| float | 64-bit float (must include a decimal point) | 3.14 |
| bool | Boolean value | true or false |
| string | Character array enclosed in single quotes | 'Hello, World!' |
| UUID | Universally Unique Identifier | 123e4567-e89b-12d3-a456-426614174000 |
| date | Date in yyyy/mm/dd format | 2024/10/19 |
| time | Time in hh:mm:ss.mmmm format | 12:45:00.0000 |
| datetime | Combined date and time | 2024/10/19-12:45:00.0000 |
ZipponDB uses 3 different date and time data types. They are used like any other type, such as `int` or `float`.
## Array Types
### Date
Any of these data types can be used as an array by prefixing it with `[]`. For example:
Data type `date` represents a single day. To write a date, use this format: `yyyy/mm/dd`.
For example: `2024/10/19`.
### Time
Data type `time` represents a time of the day. To write a time, use this format: `hh:mm:ss.mmmm`.
For example: `12:45:00.0000`.
Seconds and milliseconds are optional, so these work too: `12:45:00` and `12:45`
### Datetime
Data type `datetime` is a mix of both. It uses this format: `yyyy/mm/dd-hh:mm:ss.mmmm`.
For example: `2024/10/19-12:45:00.0000`.
Seconds and milliseconds are optional, so these work too: `2024/10/19-12:45:00` and `2024/10/19-12:45`
- `[]int`: An array of integers
- `[]string`: An array of strings
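To make the `date` format concrete, here is a minimal sketch of how a `yyyy/mm/dd` literal could be split into its parts. The `Date` struct and the `parseDate` function are hypothetical names chosen for this illustration; they are not taken from the ZipponDB codebase, and only the overall shape and basic ranges are checked.

```zig
const std = @import("std");

// Hypothetical type for illustration; not ZipponDB's internal representation.
const Date = struct {
    year: u16,
    month: u8,
    day: u8,
};

// Parses a `yyyy/mm/dd` literal such as "2024/10/19".
// Only the separators and basic ranges are checked (no per-month day limits).
fn parseDate(text: []const u8) !Date {
    if (text.len != 10 or text[4] != '/' or text[7] != '/') return error.InvalidFormat;
    const year = try std.fmt.parseInt(u16, text[0..4], 10);
    const month = try std.fmt.parseInt(u8, text[5..7], 10);
    const day = try std.fmt.parseInt(u8, text[8..10], 10);
    if (month < 1 or month > 12 or day < 1 or day > 31) return error.InvalidFormat;
    return Date{ .year = year, .month = month, .day = day };
}

pub fn main() !void {
    const date = try parseDate("2024/10/19");
    std.debug.print("{d}/{d}/{d}\n", .{ date.year, date.month, date.day });
}
```

The same fixed-position approach would extend to `time` and `datetime`, with extra length checks for the optional seconds and milliseconds.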

@ -1,12 +1,56 @@
TODO: Update this part
# Quickstart
1. **Get a binary:** You can build the binary directly from the source code for any architecture (tutorial is coming), or using the binary in the release (coming too).
2. **Create a database:** You can then run the binary, this will start a Command Line Interface. The first thing to do is to create a new database. For that, run the command `db new path/to/directory`,
it will create a ZipponDB directory. Then `db metrics` to see if it worked.
3. **Select a database:** You can select a database by using `db use path/to/ZipponDB`. You can also set the environment variable ZIPPONDB_PATH, and it will use this path,
this needs to be the path to a directory with proper DATA, BACKUP, and LOG directories.
4. **Attach a schema:** Once the database is created, you need to attach a schema to it (see next section for how to define a schema). For that, you can run `schema init path/to/schema.txt`.
This will create new directories and empty files used to store data. You can test the current db schema by running `schema describe`.
5. **Use the database:** You can now start using the database by sending queries like this: `run "ADD User (name = 'Bob')"`.
This guide will help you set up and start using ZipponDB quickly.
## Step 1: Get a Binary
Obtain a binary for your architecture by:
- Building from source code (tutorial coming soon)
- Downloading a pre-built binary from the releases page (coming soon)
## Step 2: Create a Database
Run the binary to start the Command Line Interface. Create a new database by running:
```bash
db new path/to/directory
```
This will create a new ZipponDB directory. Verify the creation by running:
```bash
db metrics
```
## Step 3: Select a Database
Select a database by running:
```bash
db use path/to/ZipponDB
```
Alternatively, set the `ZIPPONDB_PATH` environment variable to the path of a valid ZipponDB directory (containing DATA, BACKUP, and LOG directories).
## Step 4: Attach a Schema
Define a schema (see the next section for details) and attach it to the database by running:
```bash
schema init path/to/schema.txt
```
This will create the necessary directories and empty files for data storage. Test the current database schema by running:
```bash
schema describe
```
## Step 5: Use the Database
Start using the database by sending queries, such as:
```bash
run "ADD User (name = 'Bob')"
```
You're now ready to explore the features of ZipponDB!

@ -1,12 +1,10 @@
# Schema
In ZipponDB, you use structures, or structs for short, and not tables to organize how your data is stored and manipulated. A struct has a name like `User` and members like `name` and `age`.
In ZipponDB, data is organized and manipulated using structures, or structs, rather than traditional tables. A struct is defined by a name, such as `User`, and members, such as `name` and `age`.
## Create a Schema
## Defining a Schema
ZipponDB uses a separate file to declare all the structs used in the database.
Here is an example of a file:
To declare structs for use in your database, create a separate file containing the schema definitions. Below is an example of a simple schema file:
```lua
User (
name: str,
@ -15,9 +13,9 @@ User (
)
```
Note that `best_friend` is a link to another `User`.
In this example, the `best_friend` member is a reference to another `User` struct, demonstrating how relationships between structs can be established.
Here is a more advanced example with multiple structs:
Here's a more complex example featuring multiple structs:
```lua
User (
name: str,
@ -42,14 +40,16 @@ Comment (
)
```
***Note: `[]` before the type means an array of this type.***
*Note: The `[]` symbol preceding a type indicates an array of that type. For example, `[]User` represents an array of User structs.*
## Migration to a new schema - Not yet implemented
## Schema Migration (Coming Soon)
In the future, you will be able to update the schema, such as adding a new member to a struct, and update the database. For the moment, you can't change the schema once it's initialized.
In future releases, ZipponDB will support schema updates, allowing you to modify existing structs or add new ones, and then apply these changes to your database. Currently, schema modifications are not possible once the database has been initialized.
## Commands
### Planned Migration Features
`schema init path/to/schema.file`: Initialize the database using a schema file.
`schema describe`: Print the schema used by the currently selected database.
- Add new members to existing structs
- Modify or remove existing members
- Rename structs or members
- Update relationships between structs
- More...

@ -2,26 +2,22 @@
TODO
***Note: Code snippets do not necessarily represent the actual codebase but are used to explain principles.***
***Note: Code snippets in this documentation are simplified examples and may not represent the actual codebase.***
# Tokenizers
## Tokenizers
All `Tokenizer` work similarly and are based on the [zig tokenizer.](https://github.com/ziglang/zig/blob/master/lib/std/zig/tokenizer.zig)
Tokenizers are responsible for converting a buffer string into a list of tokens. Each token has a `Tag` enum that represents its type, such as `equal` for the `=` symbol, and a `Loc` struct with start and end indices that represent its position in the buffer.
The `Tokenizer` role is to take a buffer string and convert it into a list of `Token`. A token has an enum `Tag` that represents what the token is, for example `=` is the tag `equal`, and a `Loc` with a `start` and `end` usize that represent its position in the buffer.
All tokenizers work similarly and are based on the [zig tokenizer.](https://github.com/ziglang/zig/blob/master/lib/std/zig/tokenizer.zig) They have two main methods: `next`, which returns the next token, and `getTokenSlice`, which returns the slice of the buffer that represents the token.
The `Tokenizer` itself has 2 methods: `next`, which returns the next `Token`, and `getTokenSlice`, which returns the slice of the buffer that represents the `Token`, using its `Loc`.
This is how to use it:
Here's an example of how to use a tokenizer:
```zig
var toker = Tokenizer.init(buff);
const token = toker.next();
std.debug.print("{s}", .{toker.getTokenSlice(token)});
```
I usually use a `Tokenizer` in a loop until the `Tag` is `end`. In each iteration I take the next token and use a switch on the `Tag` to decide what to do.
Here is a simple example:
Tokenizers are often used in a loop until the `end` tag is reached. In each iteration, the next token is retrieved and processed based on its tag. Here's a simple example:
```zig
var toker = Tokenizer.init(buff);
var token = toker.next();
@ -31,31 +27,23 @@ while (token.tag != .end) : (token = toker.next()) switch (token.tag) {
}
```
### All tokenizers
### Available Tokenizers
There are four different tokenizers in ZipponDB:
There are 4 different tokenizers in ZipponDB. I know, that's a lot. Here is the list:
- **ZiQL:** Tokenizer for the query language.
- **cli:** Tokenizer for the commands.
- **schema:** Tokenizer for the schema file.
- **data:** Tokenizer for CSV files.
They all have different `Tag` values and ways to parse the array of bytes, but overall they are very similar. The only noticeable difference is that some use a null-terminated string (based on the zig tokenizer) and others do not.
This is mostly because I would need to use `dupeZ` to get a new null-terminated array, which is not necessary.
Each tokenizer has its own set of tags and parsing rules, but they all work similarly.
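As a rough illustration of the `Tag`, `Loc`, `next` and `getTokenSlice` pieces described in this section, here is a self-contained sketch of a very small tokenizer. The tag set and the scanning rules are assumptions made up for this example; the real ZipponDB tokenizers define many more tags and follow the zig tokenizer's structure more closely.

```zig
const std = @import("std");

// Hypothetical, reduced token definition for illustration only.
const Token = struct {
    tag: Tag,
    loc: Loc,

    const Tag = enum { identifier, equal, l_paren, r_paren, invalid, end };
    const Loc = struct { start: usize, end: usize };
};

const Tokenizer = struct {
    buffer: []const u8,
    index: usize = 0,

    fn init(buffer: []const u8) Tokenizer {
        return .{ .buffer = buffer };
    }

    // Returns the next token. Once the buffer is exhausted, every call returns `.end`.
    fn next(self: *Tokenizer) Token {
        // Skip whitespace between tokens.
        while (self.index < self.buffer.len and std.ascii.isWhitespace(self.buffer[self.index])) {
            self.index += 1;
        }
        const start = self.index;
        if (start >= self.buffer.len) {
            return .{ .tag = .end, .loc = .{ .start = start, .end = start } };
        }
        const c = self.buffer[start];
        self.index += 1;
        var tag: Token.Tag = .invalid;
        switch (c) {
            '=' => tag = .equal,
            '(' => tag = .l_paren,
            ')' => tag = .r_paren,
            else => if (std.ascii.isAlphabetic(c) or c == '_') {
                tag = .identifier;
                // Consume the rest of the identifier.
                while (self.index < self.buffer.len and
                    (std.ascii.isAlphanumeric(self.buffer[self.index]) or self.buffer[self.index] == '_'))
                {
                    self.index += 1;
                }
            },
        }
        return .{ .tag = tag, .loc = .{ .start = start, .end = self.index } };
    }

    // Returns the part of the buffer covered by the token's `Loc`.
    fn getTokenSlice(self: *const Tokenizer, token: Token) []const u8 {
        return self.buffer[token.loc.start..token.loc.end];
    }
};

pub fn main() void {
    var toker = Tokenizer.init("name = (Bob)");
    var token = toker.next();
    while (token.tag != .end) : (token = toker.next()) {
        std.debug.print("{s}: '{s}'\n", .{ @tagName(token.tag), toker.getTokenSlice(token) });
    }
}
```

Note that the tokenizer is declared with `var`, since `next` advances the internal index and therefore needs a mutable receiver.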
# Parser
## Parser
`Parser` is the next step after the tokenizer. Its role is to take `Token`s and perform actions or raise errors. There are 3 `Parser`s: the main one is for ZiQL, one for the schema and one for the cli.
Note that the cli one is just the `main` function in `main.zig` and not its own struct, but overall it does the same thing.
Parsers are the next step after tokenization. They take tokens and perform actions or raise errors. There are three parsers in ZipponDB: one for ZiQL, one for schema files, and one for CLI commands.
A `Parser` has a `State` and a `Tokenizer` as members and a `parse` method. Similarly to `Tokenizer`, it enters a while loop. This loop continues until the `State` is `end`.
A parser has a `State` enum and a `Tokenizer` instance as members, and a `parse` method that processes tokens until the `end` state is reached.
Let's take as an example the schema parser, which needs to parse this file:
```
User (name: str)
```
When I run the `parse` method, it initializes the `State` as `start`. When in `start`, I check if the `Token` is an identifier (a variable name). If it is one, I add it to the list of structs in the current schema; if not, I raise an error pointing to this token.
Here is the idea for a `parse` method:
Here's an example of how a parser works:
```zig
var state: State = .start;
var token = self.toker.next();
@ -68,54 +56,59 @@ while (state != .end) : (token = self.toker.next()) switch (state) {
}
```
The issue here is obviously that we are in an infinite loop that is just going to add structs or print errors. I need to change the `state` based on the combination of the current `state` and `token.tag`. For that I usually use very explicit names for `State`.
For example in this situation, after a struct name, I expect `(` so I will call it something like `expect_l_paren`. Here is the idea:
```zig
var state: State = .start;
var token = self.toker.next();
while (state != .end) : (token = self.toker.next()) switch (state) {
    .start => switch (token.tag) {
        .identifier => {
            self.addStruct(token);
            state = .expect_l_paren;
        },
        else => printError("Error: Expected a struct name.", token),
    },
    .expect_l_paren => switch (token.tag) {
        // The next state (e.g. expecting a member name) would be set here.
        .l_paren => {},
        else => printError("Error: Expected (.", token),
    },
    else => {},
}
```
The parser's state is updated based on the combination of the current state and token tag. This process continues until the `end` state is reached.
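To show the state transitions end to end, here is a self-contained sketch that walks the token tags of `User (name: str)` through a small state machine. Everything in it is an assumption made for illustration: the `Tag` and `State` values, the error names, and the fact that it works on a plain slice of tags rather than on a `Tokenizer`. It handles a single member only and is not the actual schema parser.

```zig
const std = @import("std");

// Hypothetical tags and states, reduced to what `User (name: str)` needs.
const Tag = enum { identifier, l_paren, r_paren, colon, end };
const State = enum {
    start,
    expect_l_paren,
    expect_member_name,
    expect_colon,
    expect_type,
    expect_r_paren,
    end,
};

// Walks the token tags and returns an error as soon as a tag does not match
// what the current state expects.
fn parse(tags: []const Tag) !void {
    var state: State = .start;
    var i: usize = 0;
    while (state != .end) : (i += 1) {
        // Past the end of the input, behave as if `.end` tokens keep coming.
        const tag: Tag = if (i < tags.len) tags[i] else .end;
        // The next state depends on the current state and the current tag.
        state = switch (state) {
            .start => if (tag == .identifier) State.expect_l_paren else return error.ExpectedStructName,
            .expect_l_paren => if (tag == .l_paren) State.expect_member_name else return error.ExpectedLParen,
            .expect_member_name => if (tag == .identifier) State.expect_colon else return error.ExpectedMemberName,
            .expect_colon => if (tag == .colon) State.expect_type else return error.ExpectedColon,
            .expect_type => if (tag == .identifier) State.expect_r_paren else return error.ExpectedType,
            .expect_r_paren => if (tag == .r_paren) State.end else return error.ExpectedRParen,
            .end => unreachable, // excluded by the loop condition
        };
    }
}

pub fn main() !void {
    // Token tags for the schema `User (name: str)`.
    const tags = [_]Tag{ .identifier, .l_paren, .identifier, .colon, .identifier, .r_paren };
    try parse(&tags);
    std.debug.print("schema parsed\n", .{});
}
```

Naming each state after what it expects keeps the transitions readable, and printing `state` and the tag at each iteration makes the path through the parser easy to follow when debugging.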
And that's basically it, the entire `Parser` works like that. It is fairly easy to debug as I can print the `state` and `token.tag` at each iteration and follow the path of the `Parser`.
The ZiQL parser uses different methods for parsing:
Note that the `ZiQLParser` uses different methods for parsing:
- **parse:** The main one, which then uses the others.
- **parseFilter:** Creates a `Filter`, a tree that contains all the conditions in the query, i.e. what is between `{}`.
- **parseCondition:** Creates a `Condition` based on a part of what is between `{}`. E.g. `name = 'Bob'`.
- **parseAdditionalData:** Populates the `AdditionalData` that represents what is between `[]`.
- **parseNewData:** Returns a string map with member names as keys and, as values, the values given between `()`. E.g. `(name = 'Bob')` will return a map with one key `name` with the value `Bob`.
- **parseOption:** Not done yet. Parses what is between `||`.
- `parse`: The main parsing method that calls other methods.
- `parseFilter`: Creates a filter tree from the query.
- `parseCondition`: Creates a condition from a part of the query.
- `parseAdditionalData`: Populates additional data from the query.
- `parseNewData`: Returns a string map with key-value pairs from the query.
- `parseOption`: Not implemented yet.
# FileEngine
## File parsing
The `FileEngine` is what manages files; everything that needs to read or write files is here.
TODO: Explain ZipponData and how it works.
I am not going into too much detail here as I think this will change in the future.
## Engines
# Multi-threading
TODO: Explain
How do I do multi-threading? Basically, all structs are saved in multiple `.zid` files. Each file has
a size limit defined in the config, and a new one is created when no existing file has space left.
### DBEngine
When I run a GRAB query, parse all files and evaluate each struct, I use a thread pool and give a file
to each thread. Each thread has its own buffered writer and, once all are finished, I concatenate all the writers
and send the result.
TODO: Explain
The only atomic values shared by all threads are the number of structs found (to stop threads if enough are found when
[10] is used) and the number of finished threads, so I know when I can concatenate and send the results.
### FileEngine
This keeps things simple and easy to implement. I don't have parallel threads running different
queries that need to access the same file.
The file engine is responsible for managing files, including reading and writing. This section is not detailed, as it is expected to change in the future.
### SchemaEngine
TODO: Explain
### ThreadEngine
TODO: Explain
## Multi-threading
ZipponDB uses multi-threading to improve performance. The data of each struct is saved across multiple `.zid` files, and a thread pool is used to process files concurrently. Each thread has its own buffered writer, and the results are concatenated and sent once all threads finish.
The only shared atomic values between threads are the number of found structs and the number of finished threads. This approach keeps things simple and easy to implement, avoiding parallel threads accessing the same file.
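The pattern of one worker per file, a private result per worker, and two shared atomic counters can be sketched as follows. This is a simplified illustration, not ZipponDB's implementation: it assumes Zig 0.12 or later (for `std.atomic.Value` and lowercase memory orderings), spawns plain threads instead of using a thread pool, replaces `.zid` files with in-memory strings, and replaces each thread's buffered writer with a single `usize` slot. The names `Shared`, `worker`, `files` and `countAcrossFiles` are hypothetical.

```zig
const std = @import("std");

// Hypothetical shared state: the two atomic counters described above.
const Shared = struct {
    found: std.atomic.Value(usize) = std.atomic.Value(usize).init(0),
    finished: std.atomic.Value(usize) = std.atomic.Value(usize).init(0),
};

// In-memory stand-ins for data files; each one is handled by exactly one thread.
const files = [_][]const u8{ "abca", "bbb", "aaxa" };

// Stand-in for "parse one file": counts matching bytes into a private slot
// (playing the role of the per-thread writer), then updates the shared counters.
fn worker(file: []const u8, needle: u8, out: *usize, shared: *Shared) void {
    var count: usize = 0;
    for (file) |byte| {
        if (byte == needle) count += 1;
    }
    out.* = count;
    _ = shared.found.fetchAdd(count, .monotonic);
    _ = shared.finished.fetchAdd(1, .monotonic);
}

fn countAcrossFiles(needle: u8) !usize {
    var results = [_]usize{0} ** files.len;
    var threads: [files.len]std.Thread = undefined;
    var shared = Shared{};

    // One thread per file, so no two threads ever touch the same file.
    for (files, 0..) |file, i| {
        threads[i] = try std.Thread.spawn(.{}, worker, .{ file, needle, &results[i], &shared });
    }
    for (threads) |thread| thread.join();

    // After joining, every worker has reported in.
    std.debug.assert(shared.finished.load(.monotonic) == files.len);

    // "Concatenate" the per-thread results; here that is just a sum.
    var total: usize = 0;
    for (results) |r| total += r;
    std.debug.assert(total == shared.found.load(.monotonic));
    return total;
}

pub fn main() !void {
    const total = try countAcrossFiles('a');
    std.debug.print("matches found: {d}\n", .{total});
}
```

Because each worker writes only to its own slot and the shared state is limited to two counters, no locking is needed around the per-file work.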
## Filters
TODO: Explain the data structure and how it works.
## AdditionalData
TODO: Explain the data structure and how it works.
## Condition
TODO: Explain the data structure and how it works.
## NewData
TODO: Explain the data structure and how it works.

@ -1,5 +1,7 @@
# Command Line Interface
ZipponDB uses a CLI for interaction. Only a few commands are available for now, as the focus has been on ZiQL, but more commands will be added in the future.
## run
Run a ZiQL query on the selected database.
@ -12,12 +14,6 @@ run QUERY
## db
**Usage:**
```
db COMMAND
```
### db metrics
Print some metrics from the db, including: Size on disk and number of entities stored.
@ -94,6 +90,16 @@ Name | Type | Description | Default
---- | ---- | ------------------- | ----
TODO | TODO | TODO | TODO
### schema describe
Print the schema used by the selected database.
**Usage:**
```
schema describe
```
## quit
Quit the CLI.

@ -1,11 +1,25 @@
# ZipponDB Intro
# ZipponDB: A Lightweight Relational Database in Zig
ZipponDB is a relational database written entirely in Zig from scratch with 0 dependencies.
ZipponDB is a relational database built from the ground up in Zig, with zero external dependencies. Designed for simplicity, performance, and portability, it's ideal for small to medium applications that prioritize reliability and ease of use over complex features.
ZipponDB's goal is to be ACID, light, simple, and high-performance. It aims at small to medium applications that don't need fancy features but a simple and reliable database.
## Key Features
### Why Zippon ?
- **Relational Model (Coming Soon):** ZipponDB is being developed to support relationships between structs.
- **Simple Query Language:** A straightforward and minimalist query language makes interacting with the database easy.
- **Lightweight and Fast:** ZipponDB's small footprint and efficient design ensure quick performance.
- **Portable:** Built with Zig, ZipponDB can be easily compiled and deployed across various platforms.
- Relational database (Soon)
- Simple and minimal query language
- Small, light, fast, and implementable everywhere
## Why Choose ZipponDB?
If you need a database that is:
- **Easy to integrate:** ZipponDB's minimal design and simple query language make integration into your projects straightforward.
- **Resource-efficient:** Its lightweight nature minimizes resource consumption, making it suitable for resource-constrained environments.
- **Reliable:** Built with a focus on correctness and stability.
- **Cross-platform:** Deployable wherever Zig is supported.
Then ZipponDB might be the right choice for you.
## Current Status and Future Plans
While still under development, ZipponDB is actively being improved. Relational features are currently in progress and will be available soon. Stay tuned for updates and new releases! We encourage feedback from the community.

@ -44,6 +44,7 @@ nav:
- Schema: Schema.md
- ZipponQL: ZiQL.md
- Data types: Data type.md
- Commands: Commands.md
- Command Line Interface: cli.md
- Benchmark: Benchmark.md
- Technical: Technical docs.md
- Roadmap: Roadmap.md