Rust Time in SQLite - Text vs. Integer/UnixTime

Discord Thread about this post

When working with any database, there will always be a time when it comes to storing some date and time information. In most databases, there is typically a datetime type that can be used for this purpose.

However, SQLite has a relatively simple type system and does not support any native datetime type. Fortunately, the SQLite type system has good support for a set of primitive types, which provides a few options for storing date and time information effectively.

Here are the two top decisions to make when storing date and time in SQLite:

Regardless of the types and format, a typical good practice is to always store time in UTC to avoid any confusion or overhead.

Now, among all those options, there are two top choices that work quite well: the TEXT/ISO 8601 option and the Unix UnixTime as Integer option. Regardless of which one we choose, it is usually a good idea to normalize all the time storage types/formats across the application code.

Below, we will go over these two options and explain our rationale for choosing Option 2) for our codebase, concluding with some Rust code examples for that choice.

Option 1) Type: TEXT, Format: IS0 8601

One common option is to store the date in TEXT type as a ISO 8601 profile such as RFC 3339 (e.g., 2024-02-10T19:26:35.884943Z).

Pros:

Cons:

Option 2) Type: INTEGER, Format: UnixTime µs

Another common approach is to store the date as an INTEGER representing Unix Time (Epoch time starts on January 1, 1970, 00:00:00 UTC) and decide on a precision.

Most databases do not really support u64 in their numeric types, including SQLite, which means we are limited to i64.

Fortunately, i64, even at microsecond precision, is more than sufficient to store time. The i64::MAX value will be approximately 292,277-01-09... UTC, which is practically infinite for our timescale. And since we also have negative numbers, we can go back in time quite far. But assuming we don't need to keep track of dinosaurs, we should be fine.

Pros:

Cons:

Other types / formats

While the float with Julian Date and the BLOB custom format can be considered and might be valid in some cases, they tend to be a bit more specialized and even less common than the two options mentioned above, without providing significant additional value.

Feel free to discuss in the Discord link at the top of the post if you have a different perspective.

Conclusion and Decision

The decision to choose one of the options above can be considered subjective and more of a preference, as both options will work well regardless of the choice.

That being said, for our coding approach, we think that the Option 2), (i.e. INTEGER / UnixTime µs) is the one that have the highest net value.

Here are the main reasons:

Due to the reasons mentioned above, Option 2 shifts our decision reasonning from "Why?" to "Why not?". This means we would need to identify compelling reasons not to use it, rather than justifying its use over other options.

That being said, different architectural and code design approaches necessitate different decisions. Therefore, we are not suggesting that this is the universally best choice for all scenarios. Follow what you and your team deem appropriate for your project, as both options are valid and can scale relatively well.

Side Note:

In our Rust code, we like to wrap those primitive type i64 into a single tuple struct to give them a little bit more sementic in our application code. So, we have a type UnixTimeUs (Us for micro seconds) as

#[derive(Clone, Copy, PartialEq, Eq, Debug, Deref, From, Into
          Field, FromSqliteValue,Serialize, Deserialize)]
#[cfg_attr(feature = "for-ts", derive(schemars::JsonSchema))]
pub struct UnixTimeUs(i64);

And this is allowing us to have:

pub struct MyData {
    // ...

    pub ctime: UnixTimeUs, // creation time.
	pub mtime: UnixTimeUs, // last modification time.
}

What about PostgreSQL?

PostgreSQL does have a datetime type, so we typically use this type (except when we have a sync from an SQLite database ongoing).

Personal Note: That being said, the end-to-end simplicity of the UnixTime µs / i64 format is quite appealing to use even in the case of PostgreSQL. And while it might seem "wrong" to not use the native PostgreSQL datetime type, I am not 100% sure it would be a bad design choice.

Note: There is a slight nuance in PostgreSQL and Rust SQLx where if we do not define the time type to include a timezone, we might encounter serialization/deserialization issues from the database, even though PostgreSQL recommends having datetime without a timestamp. This is a minor issue and does not affect the decision to use the native PostgreSQL datetime type.

Rust Code Snippets for generating/parsing UnixTime µs

Here are some code snippets we are using to generate our UnixTime µs.

Note: The RFC 3339 code example is just that—an example. Typically, we stream the i64 Unix Time format directly to the client and handle user time formatting on the client side (e.g., with JS/TS in the case of a web browser).

now_unix_time_us - Current time in microsecond precision

use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Returns the Unix Time in microseconds.
///
/// Note 1: If there is any error with `duration_since UNIX_EPOCH` (which should almost never happen),
///         it returns the start of the EPOCH.
/// Note 2: The maximum UTC epoch date that can be stored in i64 with microseconds precision
///         would be approximately `292277-01-09 ... UTC`.
///         Thus, for all practical purposes, it is sufficiently distant to be of no concern.
pub fn now_unix_time_us() -> i64 {
    let now = SystemTime::now();
    let since_the_epoch = now.duration_since(UNIX_EPOCH).unwrap_or(Duration::new(0, 0));

    since_the_epoch.as_micros().min(i64::MAX as u128) as i64
}

unix_time_us_to_rfc3339 - UnixTime microseconds to Rfc3339 String

/// Returns the RFC3339 format of an Unix Time in microseconds.
///
/// Note 1: If anything fails, it will return the start of the EPOCH formatted as RFC3339 "1970-01-01T00:00:00Z".
///         Other strategies can be valid.
pub fn unix_time_us_to_rfc3339(unix_time_us: i64) -> String {
	let unix_time_us: i128 = unix_time_us as i128 * 1000;

	// In the unlikely event of a failure, return the start of the EPOCH.
	let time = OffsetDateTime::from_unix_timestamp_nanos(unix_time_us).unwrap_or(OffsetDateTime::UNIX_EPOCH);

	// This should probably never fail.
	time.format(&Rfc3339).unwrap_or_else(|_| "1970-01-01T00:00:00Z".to_string())
}

rfc3339_to_unix_time_us - Rfc3339 String to UnixTime microseconds

pub type Result<T> = core::result::Result<T, Error>;
pub type Error = Box<dyn std::error::Error>; // For early dev.

use time::format_description::well_known::Rfc3339;
use time::OffsetDateTime;

/// Returns the Unix Time in microseconds for a date-time string formatted in RFC3339.
///
/// Note: Given the higher probability of error with this function,
///       it will return a `Result<T>`.
///       Other strategies may also be valid.
pub fn rfc3339_to_unix_time_us(time_rfc3339: &str) -> Result<i64> {
    let time =
        OffsetDateTime::parse(time_rfc3339, &Rfc3339).map_err(|_| format!("Fail to parse: {}", time_rfc3339))?;
    let time_ms = time.unix_timestamp_nanos() / 1000;
    let time_ms = time_ms.min(i64::MAX as i128) as i64;
    Ok(time_ms)
}

Discord Thread about this post