Exactly what is "less ideal" about it? SQLite has distinct advantages over n-tie...

beoberha · on Jan 19, 2023

We’re not talking about latency though. The argument is around the HA story. Totally agree that if your app and DB are on the same machine and if HA isn’t a hard requirement, SQLite is probably what you should go to before PG.

However, if you can’t have any downtime, you’d have to build some bespoke solution to make sure you have standby replicas ready to go if there’s an issue on the primary instance. For a lot of enterprises, that’s a total nonstarter.

tptacek · on Jan 19, 2023

We're going around in circles. You can replicate SQLite. What makes replicated SQLite "less ideal" than n-tier databases?

kthejoker2 · on Jan 19, 2023

Just playing devil's advocate (I don't have much of a dog in the fight):

* SQLite's replication isn't built in, you have to use another library (LiteStream, LiteFS, etc) to achieve it. That in itself is an indication it's not inherently designed for such, and at a minimum will require a lot more testing to ensure your HA/DR failover scenarios work like you're envisioning. Perception matters.

* Litestream, LiteFS today are in "beta mode", fairly DIY, mostly CLI and YAML configs, making it comparatively complex and error-prone to configure vs. most n-tier databases which offer OOTB replication via GUI, and it's usually just a checkbox with cloud PaaS databases.

* "No one ever got fired for choosing IBM" there are tons of blogs and walkthroughs of setting up HA/DR for all the major RDBMSes, it's a safe choice if you run into any issues, whereas you might find yourself deep in an obscure dev forum debugging Docker dependencies or networking protocols. It's a risk.

* Needless to say, commercial support also matters.

* On LiteStream's own tips page ( https://litestream.io/tips/ ) there's a subheader called "Data Loss Window", if I were competing in any way with this solution, I think I'd bring that up pretty early.

Broadly, I think it'd be fine for a lightly written, mostly read OLTP system that's relatively stable. Anything else and I might get itchy about relying on just the WAL for replication.

SQLite fills a great ethos of "no muss, no fuss" but replication a lot times is 100% fuss.

mbreese · on Jan 19, 2023

It seems like you keep missing the SQLite caveat — it’s the same process as your application. HA doesn’t make sense here in the context of SQLite by itself.

Here’s how I think about it, but I don’t do replicates with SQLite often, so I might be wrong.

You really can’t compare SQLite to a Postgres or MySQL or Oracle server. With a database server you worry about replication/HA etc for the server. If the server is up, it can be queried, independent of the application.

With SQLite, you worry about replicating the data. If your application is running, it can access the data. You just need to make sure the data is consistent between nodes (if you use multiple nodes).

The data can be replicated, but if your SQLite node goes down, there’s no need to worry about the application because that means the application is also down. Similarly, if the application is up, your database is also up!

beoberha · on Jan 19, 2023

I mean this is our entire point. SQLite isn’t built for HA. You’re stuck replicating it yourself or using non-enterprise-ready solutions. It’s not SQLite’s fault, it’s just not the tool for the job if you need HA.

tptacek · on Jan 19, 2023

SQLite is one of the most reliable codebases in the industry. You're ignoring what the comment says. In an n-tier architecture, both your app server and your database server needs to be HA, because they're servers. With SQLite, the app server is the database server. It's still HA, there's simply one fewer thing to HA-ify.

I don't know what "non-enterprise-ready" means, but it sounds like a selling point to me. The XML can come later.

vidarh · on Jan 19, 2023

When the app server is stateless, HA for the app server is usually a near trivially easy problem to solve though, and a whole lot of web devs are used to treating the database as a black box that they'll treat as always up so I get the fear and uncertainty of suddenly having the HA setup interfere with "their" domain.

Coming at it from a devops angle, I'm used to being the one to have to deal with the database HA myself, and then having one tier less becomes a lot more appealing, and not really more scary than ensuring any other database setup is resilient.

tptacek · on Jan 19, 2023

The comment I replied to upthread was also talking about n-tier replication features that weren't built in.

LiteFS isn't the only way to replicate a SQLite database.

Do you want Oracle servers? Because "Nobody got fired for choosing IBM" is how you get Oracle servers.

If you read just one additional sentence in to the Litestream "Data Loss Window" section, you'll find: "This is how many replication tools work including PostgreSQL’s log-shipping replication".

I don't know what you mean by "relying on just the WAL". The WAL is the database. What else would you replicate? How are you contrasting this against Postgres replication?

chillfox · on Jan 19, 2023

Sometimes I honestly think that the problem is that the SQLite ecosystem makes it easy to understand how things work, so people get scared. Whereas other things comes off as magic and magic is comforting, you don't need to know how it works, you can just trust it.

layla5alive · on Jan 19, 2023

100%

On so many topics, this is the answer. People want a pleasant fiction of clean abstracted magic.

MobiusHorizons · on Jan 19, 2023

It is so hard for me to imagine ever trusting magic, but I think you are very right that people seem to prefer it. I think this discrepancy explains so much of the conflict I get into in my career.

ngrilly · on Jan 19, 2023

> LiteFS isn't the only way to replicate a SQLite database.

What are other ways to replicate that you're aware of?

Another approach I like is to use a replicated block storage like GCP Regional Persistent Disks. This is making failover trivial. But then I'm not sure the replica can be used as a read-only copy.

vidarh · on Jan 19, 2023

> What are other ways to replicate that you're aware of?

Litestream (wal -> object storage), Rqlite (wrap in http layer + raft + streaming changes between nodes; rqlite kinda defeats much of the purpose for me), Dqlite (in-process server-thread + raft + streaming changes; dqlite obscures that there are server threads plus network communication involved, but it's there), LiteReplica (VFS extension streaming to a replica), Verneull (VFS extension -> object storage), Sqlite Ceph VFS (VFS extension storing to RADOS)

Very different sets of tradeoffs. E.g. several of these requires you to run a separate tool to do the replication. Several depends on object storage either for the primary storage (the Ceph extension) or as the replication target / to restore from, which may be fine if you already have a compatible object store. Some can use e.g. Consul to control which server is the writer.

> Another approach I like is to use a replicated block storage like GCP Regional Persistent Disks. This is making failover trivial. But then I'm not sure the replica can be used as a read-only copy.

Block storage that can guarantee in-order read/writes is fine. Network filesystems that can guarantee in-order writes and support proper locking can also allow writes (with the usual caveats) from multiple nodes. The problem here is that you really must be sure, and often - e.g. for NFS - it can appear to work just fine but be just broken enough you'll run into it at just the wrong moment.

beoberha · on Jan 19, 2023

I will throw my weight behind this reply as what I would have written if I had the time and energy :)

throwaway892238 · on Jan 19, 2023

Abstractions. Sometimes you need them. Different operating systems, CPU architectures, applications, clients, libraries, versions, hell different companies running proprietary apps, licensing, regulations... A database with a dumb network socket on a random host is fairly well abstracted.

As far as HA goes, that abstraction can then go toward allowing your apps and DB to be built for different reliability/durability/etc requirements, potentially saving money, gaining better performance or reliability, meeting weird compliance requirements, contractual requirements, vendor or software limitations, etc.

It's also easier to troubleshoot and maintain the database separate from the application in some circumstances. If SQLite is bundled with your application, it might be harder to deal with a problem affecting one or the other. In traditional organizations, you can hire people to manage databases and people to manage applications, rather than one person who has to deal with both, or even weirder, two people that would be working on one db-in-your-app.

tptacek · on Jan 19, 2023

Sometimes this is true, and sometimes it's what CORBA advocates say you to get you to adopt object brokers. Sometimes your app benefits from an explicit networked database tier, and sometimes it doesn't. I'm fine if you think most of the time it does. I'm not fine if you think it rarely does: that's not true. There are big advantages to not having a separate database tier, and the industry is sleeping on them.

(To be clear: the company I work for sells a Postgres service, and we benefit from n-tier databases! I have strong opinions here because I think this is an interesting controversy.)

chillfox · on Jan 19, 2023

"you’d have to build some bespoke solution"

Or you could use one of the many off the shelf solutions that has sprung up over the last few years.