Soliciting feedback on the SQLite pack API

Hi all,

I'd like to solicit your input on one of the biggest new features of the SQLite plugin (nvgt_sqlite).

For those who are unaware, the SQLite plugin is gaining a new pack format for data files, game assets and more, supporting transparent encryption without the need for file-level encryption (that is, the pack object takes care of the encryption for you automatically so you don't need to encrypt your files yourself). This feature is known as sqlite_pack. It mirrors the official pack object's API, minus a few differences, and packs are backed by an SQLite database. The only remaining caveat is that it is not possible to embed these pack files at this time.

It is worth noting that, unlike a normal file system, all of the pack file names support arbitrary strings and are not limited to any subset of characters that a typical on-disk file system might support. This means that a file name like sounds^^data#ogg is equally as valid as any other.

The full API for the SQLite pack object (sqlite_pack) is as follows (in angelscript pseudocode):

enum pack_open_mode {
	SQLITE_PACK_OPEN_MODE_READ_ONLY,
	SQLITE_PACK_OPEN_MODE_READ_WRITE,
	SQLITE_PACK_OPEN_MODE_CREATE,
	SQLITE_PACK_OPEN_MODE_URI,
	SQLITE_PACK_OPEN_MODE_MEMORY,
	SQLITE_PACK_OPEN_MODE_NO_MUTEX,
	SQLITE_PACK_OPEN_MODE_FULL_MUTEX,
	SQLITE_PACK_OPEN_MODE_SHARED_CACHE,
	SQLITE_PACK_OPEN_MODE_PRIVATE_CACHE,
	SQLITE_PACK_OPEN_MODE_NO_FOLLOW
}

class sqlite_pack {
	sqlite_pack();
	~sqlite_pack();
	bool open(const string &in filename, const int mode = SQLITE_PACK_OPEN_MODE_READ_ONLY, const string& key = "");
	bool rekey(const string& key);
	bool close();
	bool add_file(const string &in disc_filename, const string& in pack_filename, bool allow_replace = false);
	bool add_directory(const string &in dir, const bool allow_replace = false);
	bool add_memory(const string &in pack_filename, const string& in data, bool allow_replace = false);
	bool delete_file(const string &in pack_filename);
	bool file_exists(const string &in pack_filename) const;
	string get_file_name(int64 index) const;
	string[]@ list_files() const;
	uint get_file_size(const string &in pack_filename) const;
	string read_file(const string &in pack_filename, uint offset_in_file, uint read_byte_count) const;
	bool get_active() const property;
	uint get_size() const property;
	datastream@ open_file(const string& file_name, const bool rw);
	void allocate_file(const string& file_name, const int64 size, const bool allow_replace = false);
	bool rename_file(const string& old, const string& new_);
	void clear();
	sqlite3statement@ prepare(const string& statement, const bool persistent = false);
	string[]@ find(const string& what, const sqlite_pack_find_mode mode = SQLITE_PACK_FIND_MODE_LIKE);
	dictionary@[]@ exec(const string& sql);
}

Although this API nearly entirely mirrors the pack API, it comes with some strings attached:

  • It is not possible to add a file larger than 2 GB.
  • In the event of an error occurring when adding a file, the entire operation is rolled back as though the addition never occurred.

The second caveat is particularly important, since this also applies to add_directory. In the event of a failure to add a file in any directory or sub-directory of a top-level directory you specify via add_directory, the entire operation, including all files previously added in that top-level directory during the directory addition operation, are also undone. To put it another way, if you add a directory "sounds", and "sounds/step1.ogg", "sounds/step2.ogg", and "sounds/step3.ogg" successfully add but "sounds/step4.ogg" fails to add for some reason, it will be as though no files in "sounds" were ever added to begin with. This is called atomicity, and it exists to ensure that the pack file always is consistant and that you don't have half-added files or something equally as bad.

I'll skip documenting the already documented pack API methods, and will just briefly cover the new ones:

  • The rekey method allows for the encryption and decryption of a pack file on the fly. If this method is called on a decrypted pack file, it will be encrypted; if it is called with a blank string on an encrypted pack file, it will be decrypted. This can also be used to rotate the encryption key used to encrypt a pack file, but this is a bit more involved and currently not automatically done for you (and may never automatically be done).
  • The add_directory method recursively adds all (regular) files within the directory specified. However, it comes with the caveat that you don't have any control over any of the names used, so their relative file names will be used.
  • The open_file method opens a file for either reading only or reading and writing, depending on whether the rw parameter is true (for read-write mode) or false (for read-only mode). The file MUST always exist. It is not possible to resize a file using this method. That is, you cannot seek to the end of a file and write to it. Instead, the file must be re-added.
  • The allocate_file method is used for file preallocation. This allows you to allocate space in the pack for a file if you know it's size and then incrementally write to it with open_file as you receive data from somewhere else. This allows you to, e.g., stream writes into a pack from the internet.
  • The rename_file method allows you to rename files. The old file MUST exist, or false will be returned.
  • The clear method erases all files in the pack (but touches nothing else).
  • The prepare statement allows you to execute arbitrary SQL on a pack file. It uses the familiar parameterized SQL syntax that readers should be familiar with as placeholders (i.e., ?, ?NNN, etc.). Applications MUST NOT modify the pack_files table in any way; doing so is explicitly undefined behavior and will cause the pack API to break in an unexpected manner, and no support will be provided if applications do so.
  • The find method searches for files given a pattern. If mode is SQLITE_PACK_FIND_MODE_LIKE, the infix LIKE operator will be used; if mode is SQLITE_PACK_FIND_MODE_GLOB, the infix GLOB operator will be used; and if mode is SQLITE_PACK_FIND_MODE_REGEXP, the infix REGEXP operator will be used. The operators have the following definitions and descriptions:
    • The LIKE operator SHALL be used to perform pattern matching on text strings. It SHALL support two wildcard characters: the percent sign %, which SHALL match zero or more characters, and the underscore _, which SHALL match exactly one character. When employing the LIKE operator, the pattern SHALL be case-sensitive.
    • The GLOB operator is similar to the LIKEoperator in it's behavior but uses Unix-style wildcard matching. In this syntax, the asterisk*SHALL signify zero or more characters, while the question mark?` SHALL signify exactly one character. This operator is inherently case-sensitive.
    • The REGEXP operator provides the most flexibility and allows the use of PCRE2-compatible regular expressions to search for files. The case-sensitivity and other exact functionality SHALL depend on the expression used. The behavior is undefined if the regular expression is too large or if it is recursive.
  • The exec method allows the one-step execution of SQL statements. It MUST NOT be a replacement for prepare, and prepare MUST be used on untrusted user input.

Other than these new functions, the only other change is the mode parameter to open. There is no append flag, since this isn't necessary. The mode parameter must always at least be SQLITE_PACK_OPEN_MODE_READ_ONLY, SQLITE_PACK_OPEN_MODE_READ_WRITE, or a combination of SQLITE_PACK_OPEN_MODE_READ_WRITE and SQLITE_PACK_OPEN_MODE_CREATE. An exception SHALL be generated upon use of the pack object after opening if the SQLITE_PACK_OPEN_MODE_CREATE flag is specified on it's own. Other than that, the remaining flags have the following meanings:

Flag Description
SQLITE_PACK_OPEN_MODE_URI The file name is a URI, as defined by RFC3986. The interpretation of this URI is VFS-defined.
SQLITE_PACK_OPEN_MODE_MEMORY This pack should be opened in memory and never be written to disk. Transparent encryption is currently not available in this mode.
SQLITE_PACK_OPEN_MODE_NO_MUTEX Enable multi-threaded operation. Access to the pack is thread-safe as long as separate threads use different pack objects (but those other pack objects are allowed to access the same pack file).
SQLITE_PACK_OPEN_MODE_FULL_MUTEX Use serialized accesses to the pack file and the pack object. Any number of threads may use the same pack object, but actual concurrency is not possible. This mode is enabled by default internally for safety reasons.
SQLITE_PACK_OPEN_MODE_SHARED_CACHE Share a cache between processes if multiple processes access the pack file simultaneously.
SQLITE_PACK_OPEN_MODE_PRIVATE_CACHE If the pack is accessed by multiple processes, each process gets it's own private cache.
SQLITE_PACK_OPEN_MODE_NO_FOLLOW Do not follow symbolic links.

I'd appreciate your feedback about this API. If there are any features you'd like in the API, don't hesitate to ask! In future I may re-document this API in the articles category (or this topic will be moved there), but I felt that soliciting feedback felt like a wise choice first.

You are free to download and try the plugin even now. As always, this is an extremely early version of the new API, so expect some functions to break or to crash the engine entirely.

Downloads

nvgt_sqlite.dll (2.6 MB)

3 Likes

I have added the rename_file and clear functions to the pack API, and have also uploaded the file here for people to try. I hope you enjoy!

I have added the prepare function. It is now possible to execute arbitrary SQL on pack files!

1 Like